Professional Reading (and Listening) Shelf
Data Mining
Source: The Kojo Nnamdi Show/WAMU
Radio Program Discusses Data Mining (RealAudio)
From the description, “Data mining searches large databases for unexpected patterns of data – and it’s used by everyone from Amazon to political campaigns to government and law resources. But some say the technology crosses the lines of public versus private information.” Guests:
+ Usama Fayyad, Founder and president, DMX Group, a business and technology consulting group
+ Lee Strickland, Visiting Professor and Director, Center for Information Policy, University of Maryland, College Park
+ Nick Gillespie, Editor-in-Chief, Reason magazine
+ Angelique Waller, artist, author “Data Mining the Amazon”
The program was broadcast on WAMU, a public radio station in Washington, DC, and runs about one hour.
–
Scholarly Communication
Source: C&RL News
New Article, Information Access Alliance: Challenging anticompetitive behavior in academic publishing
This intro to the IAA was written by Mary M. Case, Director of the Office of Scholarly Communication at the Association of Research Libraries.
Archive for May, 2004
Information Access Alliance: Challenging anticompetitive behavior in academic publishing
Wednesday, May 26th, 2004
More Historical Material from the U.S. Census Available Online
Wednesday, May 26th, 2004
Resources, Reports, Tools, Lists, and Full-Text Documents
Women–United States
Source: Harvard University Library Open Collections Program
Women Working, 1870-1930
“The Open Collections Program has chosen the subject Women Working from 1870 to 1930 as its first topic to demonstrate the feasibility of bringing together books, manuscripts, and images from across the Harvard Libraries and Museums and integrating them into a digital collection using the Web as a primary access tool.”
+ Browse by topic.
+ Browse by dates and events
+ Search full text.
–
U.S. Military
Source: U.S. DoD Office of Force Transformation
New, Full-Text Report, National Military Strategy of the United States of America 2004 (PDF; 460 KB)
From Richard B. Myers, Chairman of the Joint Chiefs of Staff: “The ‘National Military Strategy’ conveys my message to the Joint Force on the strategic direction the Armed Forces of the United States should follow to support the National Security and Defense Strategies in this time of war. This document describes the ways and means to protect the United States, prevent conflict and surprise attack and prevail against adversaries who threaten our homeland, deployed forces, allies and friends.” Focuses on three priorities: Winning the war on terrorism, enhancing joint warfighting and transforming the armed forces for the future.
–
United States Census
+ Census of Population and Housing (1790-2000)
See Also: Selected Editions of the Statistical Abstract (1878-2001)
See Also: Mini Historical Statistics
Files available in PDF or XLS formats.
–
Environment–United States–Database
Source: EPA
Updated, UV Index
Search by zip code or city name.
–
United States–History
Presidents–United States
Source: National Archives and Records Administration (U.S.)
Released Today, National Archives Releases New Materials Related to the Nixon Presidency
From the overview, “The National Archives and Records Administration will release approximately 20,000 pages of transcripts of Dr. Henry Kissinger’s telephone conversations during his tenure as Assistant to the President for National Security Affairs (1969-74) and Secretary of State (1973-74) during the Nixon Administration. These telephone calls, which took place at various locations, were recorded between January 21, 1969-August 8, 1974. The National Archives will also release approximately 7,000 pages of materials from the ‘White House Central Files: Subject Files,’ including Pardon Files from 1973; and 1,600 pages of ‘White House Central Files: Name Files,’ including a small amount of material relating to John Kerry and Roger Ailes.”
See Also: Additional Info in this AP Report
LexisNexis Expands Presence in China
Wednesday, May 26th, 2004
Briefly
+ Solcara and LexisNexis Launch News Monitoring Tool (via ManagingInformation.com)
& LexisNexis Expands Presence in China
–
+ Vivisimo Continues Rolling Along with Six New Biopharma Clients
The company launched their ClusterMed product at the end of March.
–
+ 9/11 Panel Chooses Publisher for Report (via NY Times)
“[W.W] Norton has announced plans to publish 500,000 copies of the report and sell them for $10 each, which competitors acknowledge is a relatively low retail price for a book expected to be hundreds of pages long. The federal government’s printing agency, the Government Printing Office, is expected to sell its own version within several days of the report’s release.”
Why You Can’t Sue Google
Wednesday, May 26th, 2004
Search Engines–Legal Issues
Source: Findlaw
Why You Can’t Sue Google
From the column by Julie Hilden: “As Google prepares for its Initial Public Offering, it’s worth reflecting on a special advantage the law gives to it, and to other, similar search sites: Such sites are, in effect, immune from much of the liability risk a traditional publisher of news and other factual information faces. For publishers of books, magazines, newspapers and the like, publishing, or even re-publishing, a false statement can trigger defamation liability. But, for reasons I will explain, the same is not true for search sites like Google. Search sites can provide access to information that may be false, without worrying about the risk of a defamation suit. (No wonder, then, that Google’s stock may turn out to be valuable; some of the value it will have doubtless comes from this special legal bonus.)”
Article: From IR to Search and Beyond
Tuesday, May 25th, 2004
Professional Reading Shelf
Source: ACM Queue
From IR to Search and Beyond
This article was written by Ramana Rao, CTO at Inxight Software. He writes, “searching has come a long way since the 60s, but have we only just begun?”
See Also: More Search Articles from ACM Queue
–
Enterprise Search
Source: Intelligent Enterprise/Intelligent Portals
Consumer and Enterprise Search: Not an Exact Match
Two Recently Released CRS Reports
Tuesday, May 25th, 2004
Resources, Reports, Tools, Lists, and Full-Text Documents
Privacy–United States
Source: The Technology and Privacy Advisory Committee
Full Text Report, Recently Released, Safeguarding Privacy in the Fight Against Terrorism
From an FCW article, “The 140-page document resulted from Congress canceling the Defense Advanced Research Project Agency’s controversial Terrorism Information Awareness program in 2003. TAPAC officials concluded that TIA represented ‘a flawed effort to achieve worthwhile ends.’”
–
Congressional Research Service
Source: FPC/CRS
Two Recently Released CRS Reports
+ Greece: Threat of Terrorism and Security of the Olympics
+ Terrorist Identification, Screening, and Tracking Under Homeland Security Presidential Directive 6
–
Housing–United States–Statistics
Source: U.S. Census
New, Housing Data Between the Censuses: The American Housing Survey
–
Taxes–Australia
Source: Parliamentary Library, Australia
New Full Text Report, Less tax or more social spending: twenty years of opinion polling
–
e-Government–United States
Source: Pew Internet & American Life Project
Use of E-government Increases 50% from 2002 to 2003, but Citizens Want Multiple Channels Available to Contact Government
“New research by the Pew Internet & American Life Project shows that 97 million adult Americans, or 77% of Internet users, took advantage of e-gov in 2003, whether that meant going to government Web sites or emailing government officials. This represented a growth of 50% from 2002. At the same time, citizens who contact government said they are more likely to turn to traditional means either the telephone or in-person visits rather than the Web or email to deal with government.” You can download the full report, individual sections, or just a summary of the findings.
Tuesday, May 25th, 2004
Web Search Update
Web Search–Yahoo
Source: San Jose Mercury News
+ Plaxo, Yahoo make deal on search
From the article, “Today Plaxo integrates Yahoo’s search engine directly into the Outlook e-mail program. Under the deal with Yahoo, Plaxo will get paid for channeling people to Yahoo’s search engine. The search box will be placed beside a Plaxo icon that sits atop Outlook. Plaxo will eventually make Web searching possible from individual e-mails, according to Masonis. Ultimately, he wants Plaxo to search individual words within the e-mails. You would click on the word and Plaxo would do a Web search through Yahoo.” Thanks to Searchblog for the tip. J.B. also comments on the deal.
See Also: Direct to Plaxo
–
Web Search
Source: SearchDay
Search Engine User Attitudes
D.S. and C.S. offer a thorough overview of some recently released numbers from iProspect. Here are a few points that caught my eye.
+ “What do people do when they can’t find the information they’re looking for? 26 percent said they’d give up on a search and try again if they didn’t find a match in the first two pages of results, more than any other choice. This was followed by 23 percent who said they’d review only the first few matches on the first page, then 19 percent who said they’d review only the entire first page of results. 15 percent said they’d give up after reviewing three pages. The remainder said they’d look at more than three pages.”
+ “Nearly half of those surveyed — 49 percent — said they use one or more search toolbars. This is based on the fact that respondents were asked to answer which toolbar they had installed: Google, Yahoo, MSN or None Of The Above. Yahoo was ranked top among those choices, at 22 percent, followed by Google at 20 percent and MSN at 17 percent.”
+ “In other findings, both men (65%) and women (57%) preferred natural results over paid listings, though the 43% of women who said they favored the paid listings suggest that the preference for organic results is not as strong in women as men.”
–
Web Search
Source: News.com
Study questions Google’s long-term dominance
A couple of comments:
+ The study points out that, “Google’s results vary little from those found on other search sites.” However, SearchDay recently noted that, “A new comparison tool shows that the major search engines have surprisingly little overlap, even for popular search terms. Search engine guru Greg Notess has long studied search engine overlap — the number of pages found by more than one search engine. Greg’s findings have consistently shown that there is very little overlap in the web page databases of the major search engines, meaning you’ll likely get very different results depending on the engine.” The article also says, “Google users searching for the leading cause of death for people between the ages of 25 and 34, found the information they were looking for 55 percent of the time. The company’s rivals fell close behind with between 52 percent and 54 percent success rates, Vividence observed.” So I guess the question is, what did the study participants consider a good result? Were those surveyed satisfied with whatever they found? Did time constraints come into play? What were the search terms? How many search terms were used? Those of us who use specialized info databases (free and fee-based) — along with things called books (no kidding) — realize that the web/web engines are just one of many research tools. However, I think the general public has little or no idea about “other” existing resources that could not only be helpful, but also SAVE them aggravation and effort. We also know that with a little effort, general web search tools like Google and Yahoo can become much more powerful and precise. This will become even more noticeable as these resources grow in size. I guess the most interesting news is that more and more users are realizing that general web search tools (other than Google) are useful.
+ “The company found that Google clearly remains consumers’ favorite, largely because of the search engine’s less-cluttered interface.” I can’t figure out why Yahoo doesn’t spend some effort promoting the search.yahoo.com interface? Heck, you can even customize the tabs! I also think Teoma.com is far from cluttered, and it also gives refinement options not available at Google. Again, creating and purchasing an info resource is one thing, but getting people to use it is something else. Google does it very well (better than just about anyone); others, including traditional vendors and libraries, need to do better.
+ “Watkins said part of the reason why Google lags behind its competitors is the company’s stringent practice of keeping ads well marked, while the other sites sometimes mix solicitations in with regular search results.” Google deserves mega kudos for their work in labeling web results and making everyone else follow. That said, everyone else has gotten better, and it’s hard to find examples of where the other engines mentioned in the article don’t clearly mark ads vs. organic results.
+ I’ll conclude with two comments that I think are relevant. The first is from our friend Tara Calishain, who said in an 8/03 AP article, “Google has a lot of smart people who have built a great search engine, but there are a lot of other smart people out there looking for ways to make search engines even better.” The other comment is found in a 5/03 Forbes article, “Even Google’s engineers admit FAST and Teoma deliver results comparable to theirs.”
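For readers who want to test the overlap question themselves, here is a minimal sketch in Python. It assumes you have already collected the top result URLs from each engine by hand. The engine names and URLs below are hypothetical placeholders, not real results, and the Jaccard measure (shared URLs divided by all distinct URLs) is just one reasonable way to turn "pages found by more than one engine" into a single number; it is not the method used by the study or by Greg Notess.

```python
from itertools import combinations


def overlap(results_a, results_b):
    """Jaccard overlap: shared URLs divided by all distinct URLs in both lists."""
    set_a, set_b = set(results_a), set(results_b)
    union = set_a | set_b
    if not union:
        # Two empty result lists share nothing; avoid dividing by zero.
        return 0.0
    return len(set_a & set_b) / len(union)


def pairwise_overlap(results):
    """Compute the overlap score for every pair of engines."""
    return {
        (a, b): overlap(results[a], results[b])
        for a, b in combinations(sorted(results), 2)
    }


# Hypothetical placeholder data: made-up engine names and URLs for one query.
results = {
    "engine_a": ["example.com/1", "example.com/2", "example.com/3", "example.com/4"],
    "engine_b": ["example.com/3", "example.com/4", "example.com/5", "example.com/6"],
    "engine_c": ["example.com/7", "example.com/8", "example.com/1", "example.com/9"],
}

for pair, score in pairwise_overlap(results).items():
    print(pair, round(score, 2))
```

A score near 0 means two engines returned almost entirely different pages for the query, while a score near 1 means their results were nearly identical.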
More on Science.gov 2.0
Monday, May 24th, 2004
Science–Specialized Search Tools
Source: Info Today NewsBreaks
More on Science.gov 2.0
On May 11th we mentioned that Science.gov 2.0 had just launched. Today, Paula Hane offers an excellent overview of the enhanced service and new technology. One weakness I’ve found while testing Science.gov 2.0, which Paula doesn’t mention in her column, is that direct links to citations found via this metasearch tool are not available. This could cause problems in trying to return to a citation or including it in a bibliography. For example, I ran a quick search for the phrase “global warming” and, as expected, got many results. When I clicked on an entry, I was unable to find a direct URL to that specific entry. However, if I went directly to one of the underlying databases I was able to find a unique URL.
And More Link Bombs
Monday, May 24th, 2004
Web Search
Source: Wired
And More Link Bombs
I agree with Danny S. and have started to call these types of things “link bombs.” The writer seems to have a problem understanding the differences between “bombs” and keyword advertising. He also doesn’t mention that AltaVista, Lycos, and Yahoo all use the same underlying database. Oh well. I do find it worth noting that once again Jeeves and Teoma seem more resistant to link bombing and manipulation than other web engines.
Librarians Are Not Search Engines
Monday, May 24th, 2004
Professional Reading Shelf
Librarians
Source: American Libraries
Librarians Are Not Search Engines
In his latest column (a must read), Dr. Joseph Janes writes, “Maybe it’s just me, but I don’t see the obvious comparisons between that [web search engines] and what a librarian does. To be sure, both are ways to get answers to questions; so in a sense both librarians and search engines are ‘answerers.’ It does seem an odd parallel, though; we never got ourselves compared (much less compared ourselves) to databases, catalogs, reference books, or the like. I think I know where this notion comes from: Some librarians, not without justification, might see search engines as competition. It’s not at all difficult to look at the rise of free and easy Internet searching and the simultaneous and sometimes precipitous drop in reference statistics and put two and two together. And that may well be a big part of what’s going on. So why not portray ourselves as the preferred alternative, in the same ballpark? Because it’s dangerous, that’s why. Sure, you can get an answer out of Vivisimo or Teoma, and you can also get an answer out of one of your local public library’s telephone reference service. The answer from Vivísimo might even be faster. (It might even be right.) But it’ll also be mindless. And unconcerned with quality, evaluation, instruction, or meeting your specific needs. There’s also a good chance it’d be a good answer to a question you weren’t really asking.” I would add to Dr. Janes’ comments that we also see the idea of the open web as the world’s largest library mentioned in many articles. I can understand where this comes from (large amounts of info in one “virtual” location), but it’s a real stretch. A library is a controlled, well-maintained, selective and organized collection of resources. We all know that the open web is not close to this idea. This doesn’t mean that some of the massive amounts of material found via web engines is not valuable — it ABSOLUTELY IS — but this alone doesn’t make a general web engine a library.
That said, I think the general web engines (Yahoo, Google, Ask, etc.) could work more with the library community to solicit our thoughts on how to make their products more valuable tools for all users (including many librarians).
Monday, May 24th, 2004
Web Browsers
Source: News.com
Start-up Looks to Add Pluck to Browsers
From the article, “Start-up Pluck on Monday launched its first product, a set of tools designed to help people add capabilities to Microsoft’s Internet Explorer Web browsing software. Pluck’s self-titled package of browser add-ons promises to affix a range of extensions to IE, including expanded Web searching capabilities, live content folder sharing and a so-called rich site summary (RSS) reader. The product also includes an online community aspect, as it lets people exchange information saved in documents or folders.”
See Also: Pluck Press Release
See Also: Netcaptor, another browser that’s been described as “IE on steroids.”
The TV Show Database
Monday, May 24th, 2004
Resources, Reports, Tools, Lists, and Full-Text Documents
Television–Database
Source: Yahoo (Info supplied by Tribune Media Services/Zap2it.com)
TV Show Database
A browsable database containing basic directory info (premiere date, description, stars, brief history) for hundreds of television programs new and old. Browsing by category is not a good idea since these categories contain only a few of the many entries available if you browse by title.
See Also: The Encyclopedia of Television (Full-Text)
From the site, “…includes more than 1,000 original essays from more than 250 contributors and examines specific programs and people, historic moments and trends, major policy disputes and such topics as violence, tabloid television and the quiz show scandal. It also includes histories of major television networks as well as broadcasting systems around the world and is complemented by resource materials, photos and bibliographical information.” The book is not searchable but does contain hyperlinked cross-references.
–
Webliographies
Source: Science, Technology, and Business Division, The Library of Congress
New Research Guides
+ Wedding Industry Research – Selected Internet Resources
+ 17-Year Periodical Cicadas (2004) – Selected Internet Resources
+ Ricin (Toxic Substance)
–
R&D–United States–Statistics
Source: NSF
New Info Brief, Largest Single-Year Decline in U.S. Industrial R&D Expenditures Reported for 2002
This InfoBrief will focus on statistics from the 2002 Survey of Industrial Research and Development. It announces the availability of survey results on the World Wide Web (WWW) and the publication of the forthcoming annual detailed statistical tables (DST) and methodology reports; presents statistics on levels and sources of industrial R&D support, sales, and employment for manufacturing and nonmanufacturing industries; highlights the funding of R&D from companies’ own resources and from the Federal government; and details R&D spending per R&D scientist and engineer by R&D-performing companies.
–
Maps
Source: U.S. Military Academy Department of History
Department Maps
“In 1938, the predecessors of what is today The Department of History at the United States Military Academy began developing a series of campaign atlases to aid in teaching cadets a course entitled, ‘History of the Military Art.’ Since then, the Department has produced six atlases and nearly one thousand maps, encompassing not only America’s wars but global conflicts as well. In keeping abreast with today’s technology, the Department of History is providing these maps on the World Wide Web. The maps were created by the United States Military Academy’s Department of History and are the digital versions from the atlases printed by the United States Defense Printing Agency.”
Direct to Atlases
–
Taxes–United States
Source: U.S. Census Bureau
New, State Government Tax Collections Up 2.4 Percent; Biggest Increase in Tobacco Taxes
“According to data from the 2003 Annual Survey of State Government Tax Collections, general sales taxes were up 2.8 percent to $185 billion and taxes on individual income declined overall by 1.5 percent to $182 billion. These taxes made up more than two-thirds of all state tax collections. Among other major taxes, levies on tobacco products increased the most (29 percent), followed by severance taxes (24 percent) and documentary and stock transfer taxes (23 percent).”
See Also: 2003 Annual Survey of State Government Tax Collections
–
Traffic Information–United Kingdom
Relaunched, Highways Agency Web Site
From Kablenet.com, “The site was launched on 24 May 2004. It has been designed to be easier to navigate than its predecessor, and incorporates four new services to help drivers plan their journeys. Visitors to the site can get real time traffic information from the National Traffic Control Centre. The site provides regional maps that include icons at points where there are roadworks and accidents, and a click of the mouse will provide details and forecasts of any delays. They will also be able to get traffic flow forecasts, information on future roadworks from an online database, and a stream of updated information on the road network.”
Monday, May 24th, 2004
Bill Gates Mentions Weblogs, RSS in Speech
Sunday, May 23rd, 2004
Professional Reading Shelf
Weblogs
Source: BBC News
Gates Backs Blogs for Businesses
“In a speech to an audience of chief executives, Mr Gates said the regularly updated journals, or blogs, could be a good way for firms to tell customers, staff and partners what they are doing.” Here’s the actual quote from Bill Gates:
“Another new phenomenon that connects into this is one that started outside of the business space, more in the corporate or technical enthusiast space, a thing called blogging. And a standard around that that notifies you that something has changed called RSS.” The full text of his speech along with his PowerPoint slides are also available online.
–
Web Design
Source: Jakob Nielsen’s Alertbox
Guidelines for Visualizing Links
“Textual links should be colored and underlined to achieve the best perceived affordance of clickability, though there are a few exceptions to these guidelines.”
–
Digital Libraries
Source: OCLC
Distinguished Seminar Series: Jim Gray on Digital Libraries
“Jim Gray’s presentation provides an overview of his work with the World-Wide Telescope effort from the perspective of a digital library, focusing on metadata, schema, curation, and preservation issues.” Slides are in PPT format. Audio is in MP3 format. The seminar took place on May 17, 2004 in Dublin, Ohio.
See Also: Materials from Other DSS Presentations
Sunday, May 23rd, 2004
Library Briefs
+ North Carolina…NC Live Databases Receive Press Attention (via Kinston Free Press)
–
+ Texas… School libraries dig in to keep providing resources (via Midland Reporter-Telegram)
–
+ Illinois…Library stands by controversial book (via UPI)
Useful Links for Exporting to Iraq
Sunday, May 23rd, 2004
Resources, Reports, Tools, Lists, and Full-Text Documents
Iraq
Source: Export-Import Bank of the United States
Useful Links for Exporting to Iraq
Collection of links organized by headings: United States Government, United States Government Commercial Information, Information on Iraqi Businesses and Organizations, Contracting Opportunities, International Organizations.
–
History of Medicine–Gateways
Source: The Wellcome Trust (UK)
MedHist
“MedHist is a gateway to evaluated, quality Internet resources relating to the history of medicine and allied sciences, covering all aspects of the history of health and development of medical knowledge.”
Workshop Presentations: Metadata Practices on the Cutting Edge
Saturday, May 22nd, 2004
Metadata
Source: NISO
Workshop Presentations: Metadata Practices on the Cutting Edge Workshop
The NISO workshop took place in DC yesterday. Here’s a list of the presentations. All presentations are in PowerPoint format.
+ Metadata Practice and Direction: a Community Perspective, Lorcan Dempsey, OCLC
+ RSS: Really Simple Syndication – A Publisher’s Perspective by Howard Ratner, Nature Publishing Group
+ New Developments Relating to Linking Metadata, Chuck Koscher, CrossRef
+ Metadata Standards for Managing and Discovering Image Collections, Oya Rieger, Cornell University Libraries
+ Addressing Metadata in the MPEG-21 and PDF-A ISO Standards, William G. LeFurgy, Library of Congress
+ Using MODS (Metadata Object Description Schema) for Rich Descriptive Data, Rebecca Guenther, Library of Congress
+ The Metadata Encoding and Transmission Standard (METS), Morgan Cundiff, Library of Congress
+ ONIX for Serials and the NISO/EDItEUR Joint Working Party for the Exchange of Serials Subscription Information, Nathan Robertson, Johns Hopkins University Libraries
+ Metadata Interaction, Integration, and Interoperability, William Moen, University of North Texas
+ DSpace SIMILE: using semantic web technology for metadata support, MacKenzie Smith, MIT Libraries
+ Beyond Parsing: Metadata Quality Management, Bruce Rosenblum, Inera, Inc.
Scopus to challenge Web of Science?
Saturday, May 22nd, 2004
Professional Reading Shelf
Citation Indexing
Information Industry–Elsevier
Source: Access
Scopus to challenge Web of Science?
From the article, “Elsevier is developing a bibliographic database called Scopus, which several industry observers believe will compete with ISI’s Web of Science for library dollars. At the heart of Scopus is the world’s largest abstracts database of over 12,900 journal titles from 4,000 publishers providing access to over 25 million abstracts going back to 1966 and 5 years of reference back years, building up to 10 years by 2005.”
Saturday, May 22nd, 2004
Resources, Reports, Tools, Lists, and Full-Text Documents
Military–Multimedia Resources
Military Clip-Art & Multimedia
Source: Dudley Knox Library, Naval Postgraduate School
Briefly annotated collection of links to graphics and audio/video resources for all branches of the military.
See Also: The Air War College Also Offers an Excellent Collection of Links
–
The following two items were culled from the Infomine What’s New Newsletter
–
Ornithology
Source: Cornell University Laboratory of Ornithology
All About Birds
Everything you wanted to know about birds and birding, including where to go, how to identify different species, choosing and using binoculars/spotting scopes, attracting birds to your yard, studying birds and conservation.
–
Politics
Source: Georgetown University
Political Database of the Americas
“The Political Database of the Americas is a non-governmental Internet-based project that provides reference materials, primary documents, comparative studies and statistical data for countries in the Western Hemisphere.” Available in English, Spanish, French and Portuguese.
Alibris Ends Plans for IPO
Saturday, May 22nd, 2004
Update
Alibris Ends Plans for IPO
Five weeks ago we mentioned that online used book marketplace Alibris had filed to go public. Well, things change. The company has withdrawn the IPO. Nevertheless, the S-1 filing offers some interesting info about the used book industry. Thanks to Tara C. for the news tip.
