Archive for August, 2003

ResourceShelf Gets A Mention in the NY Times!

Thursday, August 21st, 2003

Cool!
ResourceShelf Gets A Mention in the NY Times!
Very cool! Note: The NYT used our old url (not a problem, it still works). The official address of ResourceShelf is simply www.resourceshelf.com. We’re also happy to be sharing the paragraph with Greg, Danny, and Tara.
More Mentions…Sreenath Sreenivasan from The Poynter Institute continues to find this site useful. Sreenivasan wrote a full review of ResourceShelf a few months ago.

Searchable Databases of U.S. Media and Digital Books

Thursday, August 21st, 2003

Resources of the Week (3 Items)
1) Media–Directories
Media Post’s All Media Directory (U.S.)
Don’t let the simple, fast, painless and free registration turn you off to a potentially valuable resource. The Media Post All Media Directory offers FREE access to a database containing directory information for about “13,000 stations, 8,000 publications, 3,000 sites and networks.” Media Post also contains names, numbers, and addresses for thousands of people working in the media industry via the “Media People-Finder” database. The People-Finder focuses on individuals working in the advertising and sales end of the business. To keyword search all databases (including the People-Finder), use the Knowledge Base search box located at the top of the website. You can also begin your search by selecting a media category (radio, television, newspapers, etc.). Then, in most cases, it becomes a simple point/pull/scroll and click operation. For example, if you need a list of television stations in the Cleveland market, simply click the television link and then scroll to the appropriate market. A list of television stations with telephone numbers, addresses, and urls will appear in a matter of seconds. The magazine category, organized by subject, contains contact information and web urls for thousands of publications. This section could prove useful if you’re trying to find business and trade publications. Media Post’s All Media Directory can be of use to many types of users, from the person just needing a list of radio stations in a specific market to the media professional needing names and numbers of colleagues around the country.

2) Digital Books–Directories
Digital Book Index
A searchable database (free to use) containing information about more than 72,000 digitized books from over 1,800 publishers. More than one third of the titles are available in full-text format at no charge, while other titles are fee-based. Registration, also free, is required to access the database. You will need to give an email address but can opt out of any future mailings. The database can be searched by author and title. You can also browse by author, title, subject, and publisher. NetLibrary titles (available at no charge to many of you via your public or academic library) can be browsed using Dewey. Many thanks to D.D. for the tip.
See Also: The Online Books Page
The OBP was a Resource of the Week in November, 2002.

Digital Collection of the Week
School Books
Source: Digital Research Library, University of Pittsburgh
19th Century Textbooks
From the site, Examine digital editions of ninety-nine schoolbooks and two surveys of historic schoolbooks by John Nietz, the founder of the Nietz Old Textbook Collection. The online collection contains page images as well as searchable text.

Microsoft says investing heavily in Web search

Thursday, August 21st, 2003

Web Search–Microsoft
Source: Reuters
Microsoft says investing heavily in Web search
From the article, Microsoft Corp. is “investing heavily” in Web search as an important and potentially lucrative market, Christopher Payne, the executive in charge of search, said on Thursday. “On the information side of the house, there’s no question that search is the cornerstone of our strategy. We’re investing heavily in this space,” Payne, a vice president for Microsoft’s MSN Internet unit, said at Jupitermedia Corp.’s Search Engine Strategies Conference running this week in San Jose, California…Without giving specifics, Payne said Microsoft’s research department is devoting significant resources to the company’s search project. “We think there’s massive headroom left in this category. There are a lot of searches that can be better,” said Payne, who added that search represents great revenue opportunities for Microsoft and its advertisers. If you’re interested in reviewing a selection of patents recently awarded to MS in the search space, along with a few reports by MS engineers, please visit ResourceShelfPLUS.

Web Database Estimates: AllTheWeb Now in the Lead

Thursday, August 21st, 2003

Web Search–AllTheWeb
Search Engine Size Estimates: AllTheWeb Takes the Top Spot
For those of you who like to keep on top of these types of things, AllTheWeb is now CLAIMING a larger database than Google. Those of you who have read ResourceShelf know I’ve been writing about how useful AllTheWeb is for a long time. However, this type of claim by an engine is primarily a PR type of thing. The real question is how useful a search engine is in satisfying your information needs. Also, precisely what comprises that total size number? (Note: See this 3/02 from Search Engine Showdown). In many cases, a smaller but “focused” database can deliver results as good as, if not better than, a larger one. That said, I’m thrilled to see that ATW continues to develop into a highly useful product. If you’ve never used ATW, give it a try. I think you’ll be impressed. For the advanced searcher, ATW offers several search options not available elsewhere.
One More Thing: Web search historians might be interested to note that in June 2002, AllTheWeb claimed the largest index (2,095,568,809 pages). About 6 weeks later (8/8/02), Google returned the volley by increasing the Google total (2,469,940,685 pages). Stay tuned.
See Also: Make Sure to See Today’s ResourceShelf Post About Page Estimate Issues at Google.

Exclusive! It’s Doom for Tabloid Archives!

Thursday, August 21st, 2003

Archives
Source: The New York Times
“Exclusive! It’s Doom for Tabloid Archives!”
From the article, It may not be a collection worthy of the Smithsonian, but it is quintessential Americana, the trove of photos, notes and clippings from the spicy, arresting and often downright unbelievable issues of The National Enquirer, Star and other supermarket tabloids. Now those archives, trapped here inside the posh, abandoned former headquarters of the tabloids’ publisher, American Media Inc., or A.M.I., are destined for destruction. A.M.I. librarian Kathy Cottay is quoted (photo too!) in the article.

Remotely Accessible Databases (via a Public Library) Get Mention in California Newspaper

Thursday, August 21st, 2003

Briefly
Always Good to Read, Remotely Accessible Databases (via a Public Library) Get Mention in California Newspaper (via Napa Valley Register)
Check to see what your public library offers!

Emerging Visions for Access in the Twenty-first Century Library

Thursday, August 21st, 2003

Professional Reading Shelf
Libraries
Source: CLIR
Conference Proceedings: Emerging Visions for Access in the Twenty-first Century Library
From the abstract, “‘Emerging Visions for Access in the Twenty-first Century Library’ is the second in a series of international symposiums that are supported by a grant from Documentation Abstracts, Inc. (DAI).”
Papers include:
“Reaching across Library Boundaries”, by Robert S. Martin
“The Personal Library: Integrating the Library in the Networking Society”, by Jens Thorhauge
“Toward Supported ‘Communities of Interest’ in Digital Environments”, by Robin Stanton
“The Open Access Movement in Scholarly Communication”, by Michael Eisen
“Lessons in Deep Resource Sharing from the University of California Libraries”, by Daniel Greenstein
See Also: Direct to Full-Text

Digital Libraries: What Should We Expect From Search Engines

Wednesday, August 20th, 2003

Digital Libraries–Search Engines
Source: FAST Search and Transfer
FAST CEO Conference Presentation, Digital Libraries: What Should We Expect from Search Engines
Earlier this week ResourceShelf ran an item about FAST “collaborating” with a German academic library to promote and develop search software for the academic information market. The post mentioned that FAST’s CEO, Dr. John Lervik was presenting a keynote address titled, “Digital Libraries: What Should We Expect From Search Engines”, at the European Conference on Research and Advanced Technology for Digital Libraries taking place this week in Norway. ResourceShelf has been able to obtain a copy of the slides from the presentation and is making them available on our site. The presentation (44 slides) focuses on defining what search engines do and what the third generation engines offer in terms of architecture and relevance. Thanks to P.G. for assisting us in getting a copy of the presentation.

Google and an IPO, Google and Blogs

Wednesday, August 20th, 2003

Web Search–Google
Source: Reuters
Google IPO? Brin Says “Good Chance”
From the article, “It’s something that we debate periodically at board meetings – not every board meeting – every other or every third,” [Sergey] Brin said at Jupitermedia Corp.’s Search Engine Strategies Conference this week in San Jose…Brin, who is also president, technology, said Google is profitable and added the company right now is not pressed for the cash that initial public offerings generate. “It might be nice to have some degree of currency to do acquisitions and things like that. On the other hand, there are significant management distractions involved in being a public company, so it’s always kind of a toss-up,” he said. Among the distractions, Google executives have pointed out, are frequent meetings with Wall Street bankers, analysts and other financial industry players, as well as the requirement that a company publicly disclose financial information.

And Something Else From the Google Beat

In Some Cases “Site Search” Page Totals Not Even Close OR Blog Content Dominates Google
With so much talk about weblogs and their influence at Google, I decided to see what removing a few sites that host weblogs from the universe of potential documents would do to page estimate totals. In the process I discovered what I think is a Google inconsistency dealing with estimated page totals. Here are examples illustrating a few of the searches I’ve been running during the past few days. Results show that eliminating one or more domains from a query might not be functioning properly as it relates to page estimates. Are other page total estimates also inaccurate? Do major weblog domains hold a majority of the content in the Google database? (Very likely not the case.)
Example 1
Search “weapons of mass destruction”
Approx 1.1 million hits
Now, eliminate results from the blogspot.com domain (-site:blogspot.com)
Approx 650,000 hits.
Does this mean that nearly one-half of the web pages containing the phrase “weapons of mass destruction” come from weblogs hosted on Blogspot? I doubt it.

What happens if you eliminate more weblog domains?
Search: “weapons of mass destruction”
and eliminate the following domains: -site:userland.com -site:weblogs.com -site:blogspot.com
-site:livejournal.com -site:journalspace.com -site:greymatter.com
Approx 240,000 hits. This would indicate that an overwhelming majority of the pages in Google about WMD come from weblogs.

Example 2
Search: search engine optimization
Approx 800,000 hits
Eliminate blogspot.com pages
Approx 415,000 hits
Does this mean half of the pages in Google discussing SEO come from Blogspot? I doubt it.
Finally, search search engine optimization and eliminate the following domains: -site:userland.com
-site:weblogs.com -site:blogspot.com -site:livejournal.com -site:journalspace.com -site:greymatter.com
Approx 165,000 hits.
Bottom Lines:
* Regardless of the cause, this illustrates (again!) that page total estimates might be a less than accurate way of determining the “popularity” of someone or something.
* I’ve contacted Google about the problem and will report back.
See Also: Search Engine Showdown Maintains A List of Google Inconsistencies
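
For readers who want to check the arithmetic behind the examples above, here is a minimal sketch in Python. It only builds the query strings and computes the share of pages that the estimates would imply came from the excluded domains. It does not contact Google. The function names (build_query, implied_share) are made up for illustration; the -site: operator and the hit counts are the ones quoted in the post.

```python
# Hypothetical helpers for illustration only; nothing here queries Google.

BLOG_DOMAINS = [
    "userland.com", "weblogs.com", "blogspot.com",
    "livejournal.com", "journalspace.com", "greymatter.com",
]


def build_query(terms, excluded_domains=(), exact_phrase=True):
    """Return a query string with one -site: exclusion per domain."""
    head = f'"{terms}"' if exact_phrase else terms
    return " ".join([head] + [f"-site:{d}" for d in excluded_domains])


def implied_share(total_hits, remaining_hits):
    """Fraction of the original estimate that the exclusions appear to remove."""
    return (total_hits - remaining_hits) / total_hits


# Approximate estimates quoted in the post: (label, total, after exclusions)
ESTIMATES = [
    ("WMD, minus blogspot.com", 1_100_000, 650_000),
    ("WMD, minus six blog domains", 1_100_000, 240_000),
    ("SEO, minus blogspot.com", 800_000, 415_000),
    ("SEO, minus six blog domains", 800_000, 165_000),
]

if __name__ == "__main__":
    print(build_query("weapons of mass destruction", BLOG_DOMAINS))
    for label, total, remaining in ESTIMATES:
        print(f"{label}: {implied_share(total, remaining):.0%} implied share")
```

Taken at face value, the estimates would put roughly 41% of the WMD pages on Blogspot alone and about 78% on the six weblog domains combined (48% and 79% for the SEO search), which is why the totals look suspect.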

THIS WEEK ONLY: Free Full-Text Access to Library Management

Wednesday, August 20th, 2003

Professional Reading Shelf (3 Items)
Libraries
Source: Emerald
THIS WEEK ONLY: FREE Full-Text Access to Library Management
Access is via Emerald’s Journal of the Week program. Free full-text access is available for Vol. 15 (1994) – Vol. 24 (2003). Citations are also available (free) for Vol. 10 – Vol. 14. Restricted access will begin again next Monday.
See Also: NEXT WEEK Emerald Will Offer Free Access to the Journal of Documentation

Digital Libraries
Source: NSF
Full-Text Report, “Reference Models for Digital Libraries: Actors and Roles”

Librarianship
The September Issue of Walt Crawford’s Cites and Insights is Now Online

Wednesday, August 20th, 2003

Public Libraries
Source: St. Paul Pioneer Press
So Sad, Minneapolis Public Library Prepares for Hiatus
From the article, “Something like this hasn’t happened in recent memory,” said Public Library Director Kit Hadley. “But it has happened before – such as during the Great Depression.”
See Also: Seattle Public Library Will Also Go On Hiatus For One Week

Coming Soon: PAIS (Public Affairs Information Service) Retrospective Database

Wednesday, August 20th, 2003

Information Industry–OCLC
Two Items from the Folks in Dublin, OH
1) Coming Soon: PAIS (Public Affairs Information Service) Retrospective Database
“…electronic versions of records previously available only in the 62 annually cumulated print editions of the PAIS Bulletin, 1915-1976. The PAIS Archive will be released on FirstSearch in phases, beginning with years 1957-1976 in Spring 2004, with availability of the full file projected for mid-2004.”
-
2) E-Books from Overdrive Now Come With Full MARC Records

Wednesday, August 20th, 2003

Resources, Reports, Tools, and Full-Text Documents (3 Items)
Passports–United States
New Toll-Free Telephone Number For U.S. Passport Information
Before this number became available you had to dial a 900 number to speak to someone at the National Passport Information Center. The new number is 1-877-4USA-PPT (1-877-487-2778). More details in the news release.

Energy–Iraq
Source: EIA
Updated, Iraq Country Analysis Brief
Info about energy sector in Iraq.

Parole–United States–Statistics
Source: BJS
Full-Text Report, Probation and Parole in the United States, 2002

Overture Research Launches Web Site

Tuesday, August 19th, 2003

Web Search
Overture Research Launches Web Site
The team at Overture Research has launched a new site with background on research projects, publications/tech reports, links to news articles of interest to members of Overture Research, staff bios, and other odds and ends that search geeks might want to review. ResourceShelf will do its best to alert you to new material as it’s placed on the site. Here are a few things I’m planning on reading ASAP.
* Background and Links About Overture’s Concept Discovery Research and other research projects.
* Full-Text, “1 Billion Pages = 1 Million Dollars? Mining the Web to Play ‘Who Wants to be a Millionaire?’”, Overture Research Technical Report OR-2003-009

Happy Birthday to PubMed Central

Tuesday, August 19th, 2003

Professional Reading Shelf
Digital Archives–Journal Articles
Source: ARL Bimonthly Report
“PubMed Central–Three Years Old and Growing Stronger”
An article by Edwin Sequeira, National Center for Biotechnology Information, National Library of Medicine. From the article, PubMed Central (PMC) is the National Library of Medicine’s (NLM) digital archive of medical and life sciences journal articles. It was conceived in the spring of 1999 when Harold Varmus, then director of the National Institutes of Health (NIH) of which NLM is a part, proposed that NIH create and manage an open archive of research papers in the life sciences. Many of the early exchanges about the proposal within the publishing community made it sound as if revolution was in the air. The reality, however, is that PubMed Central represents evolution not revolution. PMC is here to stay, but it does not spell disaster for academic societies and other publishers.

Once Again the Internet Bookmobile Takes to the Streets

Tuesday, August 19th, 2003

The Internet Archive
The Internet Bookmobile From the Internet Archive Will Visit a Publishing Conference
The Internet Archive (also home to The Wayback Machine) is taking the Internet Bookmobile to the Seybold Conference in San Francisco. From the announcement, [The Internet Bookmobile’s] immediate mission [is] allowing event attendees to peruse a database of some 100,000 books and perhaps walk away minutes later with a free, bound volume. Titles currently available include “The Wizard of Oz” and a manual of Buddhist ethics from the 4th century BC. However, beyond the fun and satisfaction of getting an instant, personalized copy of “Alice in Wonderland,” the minivan [aka bookmobile] represents a global effort to use 21st century technology to help preserve ink-on-paper media, vulnerable tomes at risk of being lost forever. Ashley Rindsberg, “bookmobilist” for the Internet Archive, which created the vehicle in October 2002, said the San Francisco-based nonprofit is working in concert with the Million Book Project at Carnegie Mellon University. “We’ve (the bookmobile) handed out some 10,000 books throughout the Bay Area,” he said. “The bookmobile fits well with the Internet Archive’s motto, ‘Universal Access to All Human Knowledge.’” Since the group started in 1996, it has amassed a collection of some 10 billion Web pages of “pretty much anything digital,” he said. “As long as it has some cultural or historical value or value in knowledge, we’ll collect it.”
See Also: Learn More About the Internet Bookmobile

Oyez, A Multimedia Archive of U.S. Supreme Court Oral Arguments, Wins Praise From Washington Post

Tuesday, August 19th, 2003

United States Supreme Court–Audio
Digitization Projects
Source: The Washington Post
Oyez, A Multimedia Archive of U.S. Supreme Court Oral Arguments, Wins Praise From Washington Post
About two weeks ago the AP ran a story about Oyez (based at Northwestern University). Oyez is in the process of digitizing oral arguments from the U.S. Supreme Court. Today, The Washington Post ran an editorial about the importance of and need for such an archive. From the editorial, “…thanks to the creative and technologically innovative work of scholars at Northwestern University, the public has gained entree to a significant aspect of the Supreme Court’s workings. Oral arguments for more than 100 cases and some decisions are now available online in MP3 format, thanks to a multimedia archive called the Oyez Project. The complete historical record since 1955 should be ready by 2007, according to project director Jerry Goldman, a political science professor at Northwestern…The online archive will be a boon to historians, lawyers, students and anyone who wants to learn about the most important court in the land.”
See Also: Direct to AP Article About Oyez
See Also: Direct to Oyez Homepage

Elsevier and American Chemical Society Make Linking Agreement

Tuesday, August 19th, 2003

Briefly
New Web Page, National Library of Medicine Offers Suggestions To Assist Authors in Selecting Medical Keywords
-
Elsevier and American Chemical Society Agree to Link

Tuesday, August 19th, 2003

Resources, Reports, Tools, and Full-Text Documents
Homeland Security
Source: Congressional Research Service (via Federation of American Scientists)
Full-Text, Homeland Security: Protecting Airliners from Terrorist Missiles
This report was last updated on 3/25/03. Make sure to check for updates. The report (same date) is also available on the House.Gov site (via Rep. Chris Shays).

“Google Is Most Popular Search Site, But Others Sometimes Do It Better”

Monday, August 18th, 2003

Web Search
Source: Wall Street Journal
“Google Is Most Popular Search Site, But Others Sometimes Do It Better”
Kudos to Wall Street Journal writer Lee Gomes on writing an important article that will be seen by millions of people. The article focuses on what ResourceShelf has been writing about since we started: no search engine is perfect, not even Google. A familiarity with other general web engines is crucial, especially for info pros. This article contains a very brief but accurate history of link analysis (you’ve seen it on ResourceShelf many times) along with a discussion of what Teoma is all about. One final point not mentioned in the article: in many cases, using a specialized search tool allows the user to search a small universe of material and often gives greater utility to the database content. Look, for example, at an interface developed for the Internet Movie Database. The bottom line? A preexisting knowledge of what key resources offer can save you a great deal of time and effort. In other words, the right tool for the job. No different than “knowing” what’s in key reference books and how they’re indexed prior to using them.
See Also: Once again, a link to Prof. Jon Kleinberg and Colleagues’ Excellent Article About Web Search From 1999
The concepts employed at IBM’s Clever are very similar to what Teoma is all about.
See Also: Learn More About Dr. Kleinberg and IBM’s CLEVER.