Archive for the ‘Info Management and Retrieval’ Category

Europe: Eastern Libraries Come to Brussels; More About Europeana

Sunday, November 22nd, 2009

From the Article:

Europeana, the European Digital Library, is the aggregator of European cultural content. Interoperability is at the heart of what Europeana is doing: integrating format types across borders, across domains and between institutions. Museums, libraries, archives and audio-visual (AV) collections have different histories, user-groups and purposes. These are reflected in their diverse approaches to cataloguing and the development of varying standards. The result is that delivering all content types in the same online space requires a commitment to working collaboratively and sharing knowledge across long-established professional boundaries.

[Snip]

EuropeanaLocal aims to make digital content from regional and local museums, libraries, archives and audio-visual institutions interoperable and accessible through Europeana and other service providers. EuropeanaLocal will ensure that the approaches, standards and tools developed by Europeana are adopted widely across the cultural heritage sector, thereby supporting the interoperability of content within Europeana beyond that which is held by purely national level institutions.

EuropeanaLocal, which runs from 1 June 2008 until May 2011, is funded under the European Commission’s eContentPlus Programme and co-ordinated by MDR Partners from the UK. The project consortium consists of 32 partners from 27 countries across Europe. There will be no portal by the end of the project; all material will be made accessible through the main Europeana interface.

Access the Complete Article

Source: New Europe

See Also: Europeana: Europe’s cultural heritage online

Open Book Alliance Posts Third Party Analysis of Revised Google Books Settlement

Sunday, November 22nd, 2009

The Special Libraries Association and the New York Library Association are members of the Open Book Alliance.

From the Blog Post:

…the Open Book Alliance released an extensive compilation of third party analysis of the revised Google Books Settlement. The wide-ranging consensus from a diverse chorus of voices is that the revised Settlement is barely distinguishable from the original Settlement and far from satisfying the major concerns of the Department of Justice and other objectors.

The post is divided into seven sections:

+ Overall Analysis

+ Continuing Anti-trust Concerns

+ Fate of Unclaimed/Orphan Works

+ Copyright Infringement Concerns

+ Complete Lack of Privacy

+ Abuse of Class Action Process

+ Continuing Foreign Complaints

Source: OBA

Google Book Search: Amazon.com Files Motion Asking Judge to Reconsider His Preliminary Approval of Settlement 2.0

Saturday, November 21st, 2009

James Grimmelmann writes on The Laboratorium:

It’s a full-on attack on the settlement; Amazon’s theory is that the future-claims issue is such a fundamental flaw in the settlement that there is no way Judge Chin could ultimately approve it.

[Snip]

Thus, Amazon argues, Judge Chin should save time and resources, reject this settlement, and give the parties another 30-45 days to negotiate a settlement that includes only releases relating to past claims.

Access the Complete Post

Source: The Laboratorium

See Also: Judge Gives Preliminary Approval to Google Deal, Sets Feb. 18 for Final Hearing

Google Books Settlement 2.0: Evaluating Competition

Friday, November 20th, 2009

It’s been a week since Settlement 2.0 was released. We have a comprehensive press review along with many related documents from the past week here.

Until the next major event and our next press review, we will continue to post Settlement 2.0 news and analysis with a focus on stories, analysis, and opinion that have a library angle.

We begin with this analysis of competition by Fred von Lohmann at the Electronic Frontier Foundation. It includes an entire section dealing with institutional subscriptions titled, “Monopoly Pricing of the Institutional Subscription Database?”

One of the commercial services that Google is authorized to provide under the proposed settlement is the “Institutional Subscription Database” (aka “ISD”), which will provide “all-you-can-eat” access to the corpus of scanned books. The chief customers for the ISD are likely to be universities (the same folks who are providing Google with the books to be scanned), for whom instant digital access to every word in every book in Google’s collection is likely to be very compelling.

The big question is whether, over time, the ISD will become the one database that no university can do without, and the one database with no market substitute (again, because Google will be the only company who can provide a comprehensive corpus without fear of copyright liability, for the reasons explained above). This, of course, is a recipe for monopolistic price gouging, as a group of academic authors led by Prof. Pam Samuelson have pointed out. Over time, universities could face spiraling prices as Google and the Registry conspire to maximize their revenues on the ISD product.

Source: Electronic Frontier Foundation

Collection Development: Want a Non-Stop Stream of Recently Digitized eBooks to Choose From? Check This Out!

Friday, November 20th, 2009

A Never Ending “Virtual Stream” of Digitized Text
by Gary Price, Senior Editor

When Chris Sherman and I were writing and then giving book talks and presentations about The Invisible Web, we said John Mark Ockerbloom’s Online Books Page was an essential resource for anyone interested in digitized, full text books, now referred to by most as eBooks. More than eight years later I feel the same way about this awesome and well organized collection.

Where do you begin with a site so full of content? For me, that’s easy: monitoring the latest additions to the catalog/page. I am always blown away by the number of new listings (when does Ockerbloom sleep?) and the number of organizations digitizing books. If you think it’s only Google digitizing books, think again. Of course they are a major player, but they’re far from the only one doing this type of work. Just look for yourself. The page even has an RSS feed.

So, the Online Books Page is not only a “must have” searchable directory of ebooks but it can also be a great collection development resource to find and add digitized content to your local collection/OPAC.

But wait, we’ve got more.

The Online Books Page’s new listings include only some of the digitized text output from the Internet Archive (IA).

If you want to be able to review (at your leisure) all of the new digitized text content that the IA produces, it’s possible by subscribing to this RSS feed. Even if you’re not going to review the titles, just let it run for a few days to see the AMOUNT of text material that’s digitized in a variety of formats. It’s an understatement to say that the scanners at the IA are cranking it out on all cylinders. So, collection development types, subscribe to both RSS feeds and have a large virtual bookshelf to choose from each day. If you don’t do the collection development thing, both feeds are useful to illustrate the amount of material being digitized each day, week, month.
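
For readers who would rather script the feed review than watch it in a reader, here is a minimal Python sketch of the idea. It assumes a plain RSS 2.0 feed with `title` and `link` elements; the sample feed, the example.org links, and the `new_titles` helper are hypothetical stand-ins, not the actual Online Books Page or Internet Archive feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for a real "new listings" RSS 2.0 feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>New digitized texts (sample)</title>
    <item>
      <title>Example Title One</title>
      <link>http://example.org/details/one</link>
    </item>
    <item>
      <title>Example Title Two</title>
      <link>http://example.org/details/two</link>
    </item>
  </channel>
</rss>"""


def new_titles(feed_xml, seen_links):
    """Return (title, link) pairs for feed items whose link is not in seen_links."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        # Skip items without a link and items already reviewed.
        if link and link not in seen_links:
            fresh.append((title, link))
    return fresh


if __name__ == "__main__":
    already_reviewed = {"http://example.org/details/one"}
    for title, link in new_titles(SAMPLE_FEED, already_reviewed):
        print(title, link)
```

Keeping a set of links you have already reviewed means each run shows only what is new since the last look, which is the point of using the feeds as a daily virtual bookshelf.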

UPDATE: Not an RSS user? No problem. Just visit this Internet Archive page and refresh it a few times a day. The most recent addition is at the top.

Judge Gives Preliminary Approval to Google Deal, Sets Feb. 18 for Final Hearing

Friday, November 20th, 2009

From the Article:

Judge Denny Chin has given his preliminary approval to the Google Book Search settlement agreement and established a timeline to move the agreement toward a final resolution. A final settlement/fairness hearing has been set for February 18 at which Judge Chin will hear arguments to determine whether the agreement is “fair, reasonable, adequate;” consider whether to certify the class for purposes of the settlement; and to make a determination whether to approve the agreement.

Prior to the hearing, the judge has ordered that supplemental notices about the amended agreement be sent beginning December 14, and he set a January 28 deadline for objections to be filed with the court.

[Snip]

As part of the amended settlement, companies from outside of the U.S. were to be added as plaintiffs. The order notes that new plaintiffs include Harlequin, Melbourne University Publishing Ltd., and The Text Publishing Company.

Source: Publishers Weekly

HathiTrust Offers Full-Text Search of Millions of Digitized Books and Journals

Thursday, November 19th, 2009

From the Announcement:

A year after its launch by 25 leading U.S. research libraries, HathiTrust Digital Library announces a service that will transform how researchers use the more than 1.6 billion pages (4.6 million volumes) in its collections.

The breakthrough allows for full-text searching capabilities across the entire library. Researchers can now search public domain and in-copyright works by keyword or phrase.

Based on open source Solr/Lucene technology, the service expands on an experimental search of public domain volumes introduced in November 2008. Full-text search will continue to be supported across the repository as it grows at a rate of hundreds of thousands of volumes every month.
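
To give a rough idea of what a phrase search against a Solr index looks like from the client side, here is a small Python sketch. It only builds a query URL for Solr's standard `/select` handler. The base URL, the `ocr` field name, and the `solr_select_url` helper are assumptions made for illustration; the announcement does not describe HathiTrust's actual schema or endpoints.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def solr_select_url(base_url, field, phrase, rows=10):
    """Build a Solr /select URL that searches `field` for an exact phrase."""
    # Escape backslashes and double quotes so the phrase stays one quoted term.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": f'{field}:"{escaped}"',  # quoted value = phrase query in Lucene syntax
        "rows": rows,                 # number of hits to return
        "wt": "json",                 # ask for a JSON response
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr core; not a HathiTrust address.
    url = solr_select_url("http://localhost:8983/solr/books", "ocr", "old book smell")
    print(url)
    print(parse_qs(urlsplit(url).query)["q"])
```

Dropping the quotes around the phrase would turn this into a keyword search on the individual words, which is the other mode the announcement mentions.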

“The HathiTrust partners are pleased to offer a search service that helps mine this growing body of authoritative library materials,” said John Wilkin, HathiTrust executive director and associate university librarian at the University of Michigan. “HathiTrust continues to distinguish itself with its reliability and with its efforts to broaden the availability of digitized library collections in the flow of scholarly discourse. We see this valuable discovery service as one in a series of major steps HathiTrust is taking to shed light on this vast body of material.”

In combination with the HathiTrust Digital Library’s carefully curated bibliographic data, the new functionality allows researchers to more efficiently locate items relevant to their research. It also lays the foundation for future services such as full-text search with faceted browsing, advanced search, “more like this” options, and tools that can be used in computational research.

The effort to provide full-text searching capabilities across the repository has yielded valuable benchmarking data, methods, and code to the broader large-scale search community, said Wilkin.

The HathiTrust partners are committed to developing the repository and its services to meet the long-term needs of their academic communities, and offer a unique resource on the Web for scholarship and research.

Source: HathiTrust / University of Michigan

See Also: HathiTrust Home Page and List of Partners

See Also: Access HathiTrust Search Interfaces (Including Full Text Search)

ARL — E-Science Survey Preliminary Results and Resources Released

Thursday, November 19th, 2009

The Association of Research Libraries (ARL) E-Science Working Group surveyed ARL member libraries in the fall of 2009 to gather data on the state of engagement with e-science issues. An overview of initial survey findings was presented by E-Science Working Group Chair Wendy Lougee, University Librarian, McKnight Presidential Professor, University of Minnesota Libraries, at the October ARL Membership Meeting. Lougee’s briefing explored contrasting approaches among research institutions, particularly in regard to data management. The briefing also summarized survey findings on topics such as library services, organizational structures, staffing patterns and staff development, and involvement in research grants, along with perspectives on pressure points for service development. To better explicate the findings, Lougee reviewed specific cases of activities at six research institutions.

Audio of the briefing along with slides and a handout are available as part of the Proceedings of the 155th ARL Membership Meeting (see http://www.arl.org/resources/pubs/mmproceedings/155mm-proceedings/index.shtml#esci).

ARL has also compiled a set of resources provided by survey respondents. Examples of a range of campus and library documents, tools advancing e-science support, needs assessments, and position descriptions, among other items, are listed on ARL’s Web site at http://www.arl.org/rtl/eresearch/escien/esciensurvey/index.shtml.

Source: Association of Research Libraries

Australia: Historic Newspaper Digitisation: Early Editions of Sydney Morning Herald Now Available Online; What is Trove?

Thursday, November 19th, 2009

From an e-Mail:

The National Library’s Australian Newspapers service has recently made available early editions of The Sydney Morning Herald.

The digitisation of The Sydney Morning Herald was made possible by a $1 million contribution from the Vincent Fairfax Family Foundation. Eventually, all out-of-copyright editions of the Herald will be available, from its inception in 1831 to 1954.

It is now just over a year since Australian Newspapers was released to the public and there are 8.5 million articles available from 33 newspaper titles. A community of volunteer ‘text correctors’ has now corrected 7 million lines of the electronically translated text in 318 000 articles, enabling more accurate search results.

Access the Collection (via Trove)

Btw, what is Trove?

“one search…a wealth of information”

Trove is our new free online service that gathers information about Australia and Australians in a single search.

Discover:
+ Digitised Australian newspapers, 1803 – 1954
+ Books, magazines and articles
+ Pictures and photographs
+ Music, oral histories and videos
+ Maps
+ Archived websites
+ Biographical information

Source: National Library of Australia

Just Released: Shakespeare Quartos Archive Opens Access to Hamlet

Tuesday, November 17th, 2009

We are looking forward to spending some quality time with this very high quality resource (that’s also free). If nothing else, it really shows off the power of digital archives and digitization.

From the Announcement:

The highly-anticipated Shakespeare Quartos Archive has been officially launched today with a complete digital collection of rare early editions of Hamlet.

For the first time, all 32 existing quarto copies of the play held by participating UK and US institutions are freely available online in one place. This initiative is jointly led by the Bodleian Library, University of Oxford and the Folger Shakespeare Library, Washington DC, through a joint transatlantic grant from JISC in the UK and the National Endowment for the Humanities in the US.

[Snip]

Now scholars can explore these different quarto versions side by side for the first time on the project website. It features high-quality reproductions and searchable full text of surviving copies of Shakespeare’s Hamlet in quarto in an interactive interface. Functions and tools – such as the ability to overlay images, compare them side-by-side, and mark and tag features with user annotations – facilitate scholarly research, performance studies, and new applications for learning and teaching.

The project, which began in April 2008, reunites all 75 pre-1642 quarto editions of Shakespeare’s plays into a single online collection. The prototype interface is at present fully functional only for Hamlet, but the Shakespeare Quartos Archive plans to apply this technology to all the plays in quarto, and to seek involvement from new partner institutions.

Direct to Shakespeare Quartos Archive

Source: JISC, NEH

The Internet Time Machine from the Memento Project

Tuesday, November 17th, 2009

This is a must read from start to finish. Here are a few snippets to wet your whistle.

Access the Complete Article from New Scientist

Bookmarking a page takes you to its current version – but earlier ones are harder to find (to see an award-winning 1990s incarnation of newscientist.com, see our gallery of web pages past, right). One option is to visit a resource like the Internet Archive’s Wayback Machine. There, you key in the URL of the site you want and are confronted with a matrix of years and dates for old pages that have been cached.

It’s a lot of hassle. But it shouldn’t be, says Herbert Van de Sompel, a computer scientist at Los Alamos. “Today we treat the web like a library in which you have to know how to go and search for things. We’ve a better way.”

That “better way” is a system that gives browsers a “time-travel” mode, allowing users to find web pages from particular dates and times without having to navigate through archives.

[Snip]

“In addition to language and media type, we negotiate in time. So Memento asks the server not for today’s version of this page, but how it looked one year ago, for instance,” says Van de Sompel.
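
As a rough illustration of negotiating in time, the Python sketch below builds the kind of request header a client could send to ask for a page as it looked on a given date. The `Accept-Datetime` header name is the one later standardized for Memento in RFC 7089; the 2009 prototype described in the article may have used a different name, and the `memento_headers` helper is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_headers(when):
    """Return request headers asking for the version of a page as of `when`."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # HTTP dates are expressed in GMT, so normalize to UTC first.
    when_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}


if __name__ == "__main__":
    # Ask for the page as it looked one year before this post was written.
    headers = memento_headers(datetime(2008, 11, 22, 12, 0, tzinfo=timezone.utc))
    print(headers)
```

The header travels alongside the usual `Accept-Language` and `Accept` headers, which is what Van de Sompel means by adding time to the things a browser and server already negotiate. A server or archive that does not understand the header would simply return the current version.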

[Snip]

Jakob Voss, a developer with the Common Library Network in Göttingen, Germany, is an early Memento user – and he is already advocating use of Memento for sites with frequently updated pages like Wikipedia.

“Memento is only a proof of concept but it looks very promising and could be a great enhancement to the web. There is little support in today’s browsers for digging into archives, especially those with dynamic content management systems like wikis and weblogs,” Voss says.

You Can Try a Demo Here and Learn More Here

Source: New Scientist

Old-Book Smell, Sniffing, and Preservation

Tuesday, November 17th, 2009

An interesting, out-of-the-way question, and the NY Times goes hunting for the answer in this story.

From the Article:

If you have torn yourself away from the virtual library that is the Internet long enough to visit a real library, you know that the smell of old books — musty, slightly acidic, even grassy — is instantly recognizable. But is it quantifiable? And if so, might old-book odor prove useful to librarians and conservators charged with preserving collections?

[Snip]

Dr. Strlic said he got the idea one day at a library when he saw a conservator sniffing an old piece of paper, trying to determine what it was made of. “I thought, certainly a technique could be developed to do that more accurately,” he said. The approach is similar to breath analysis used to diagnose illness, he added.

He and his colleagues analyzed the volatiles produced by 72 samples of old paper of different types and in varying condition from the 19th and 20th centuries, using gas chromatography-mass spectrometry. They found that some compounds were reliable markers for paper with certain characteristics — high concentrations of lignin or rosin, for example, which make paper degrade relatively quickly. Their findings were published in the journal Analytical Chemistry.

Source: NY Times

The November/December 2009 Issue of D-Lib Magazine is Now Available

Tuesday, November 17th, 2009

Before we post a selection of what’s in the new issue of D-LIB, ResourceShelf would like to thank Bonita Wilson for editing a great publication. She has been the sole editor of D-LIB since July, 2001. This is her last issue as editor. She’ll now have more time to engage in the “other things” she likes doing at her home on the Chesapeake Bay in VA. She’ll continue with CNRI in a part time capacity.

Here are Some of the Articles in the November/December 2009 Issue of D-LIB:

+ Beyond 1923: Characteristics of Potentially In-copyright Print Books in Library Collections
by Brian Lavoie and Lorcan Dempsey, OCLC Online Computer Library Center

+ Service-Oriented Models for Educational Resource Federations
by Daniel R. Rehak, LSAL; and Nick Nicholas and Nigel Ward, Link Affiliates, Australia

+ From TIFF to JPEG 2000? Preservation Planning at the Bavarian State Library Using a Collection of Digitized 16th Century Printings
by Hannes Kulovits and Andreas Rauber, Vienna University of Technology; and Anna Kugler, Markus Brantl, Tobias Beinert, Astrid Schoger, Bavarian State Library

+ Measuring Citation Advantages of Open Accessibility
by Samson C. Soong, Hong Kong University of Science and Technology

+ The Importance of Digital Libraries in Joint Educational Programmes: A Case Study of a Master of Science Programme Involving Organizations in Ghana and the Netherlands
by Marga Koelen, International Institute for Geo-Information Science and Earth Observation; and Jonathan Arthur Quaye-Ballard, Kwame Nkrumah University of Science and Technology

+ The Practice and Perception of Web Archiving in Academic Libraries and Archives
by Lisa Gregory, University of North Carolina at Chapel Hill

+ Pennsylvania Literary Journal: Google Websites as an Easy Publication Route
by Anna Faktorovich, Indiana University of Pennsylvania

Access the Complete November/December 2009 Issue of D-LIB

Bookless Libraries Increase Accessibility

Monday, November 16th, 2009

From the Article:

Carnegie Mellon has for years already been building its own “bookless library”: “For the nearly 15 years I’ve been in the Carnegie Mellon Libraries, we’ve been working hard to provide the campus with what I think is a very realistic view of the library of the future. We are working towards a hybrid of an online and paper-based library,” computer science librarian Missy Harvey explained.

[Snip]

The rarity of many works in the Posner Memorial Collection highlights the advantages of the increased accessibility and longevity of electronic republishing. “Rare books [can be] scanned and delivered via the World Wide Web to scholars in places such as Argentina and Germany who could not visit the books,” Mary Catharine Johnsen, the special collections and design librarian, said. “Electronic versions save wear and tear on using the physical book, which is important if you are a book from 1755 and your leather spine is dry and cracking.”

[Snip]

Johnsen also said that even in successful scans of books to electronic formats, the many subtleties in a book’s presentation and metadata may be lost in an online medium. “For literature students, you really want to see the original format of the work as received by its first public. Was it a fancy coffee-table book? Was it a cheap paperback or flimsy pamphlet? Was it a colorful book to tempt you in a Victorian train station or an airport bookstall?”

Source: The Tartan (Carnegie Mellon)

Video: Preserving and Providing Access to Digital Info from State Legislatures

Monday, November 16th, 2009

From an Announcement:

A new video features Minnesota Speaker of the House Margaret Anderson Kelliher talking about new methods to preserve and provide access to digital records of state legislatures. The production describes the work of A Model Technological and Social Architecture for the Preservation of State Government Digital Information Project, which is supported by the Library of Congress National Digital Information Infrastructure and Preservation Program.

Direct to Video (via Minnesota Historical Society)
It runs about six minutes.

Source: National Digital Information Infrastructure and Preservation Program

Press Review+: Google Book Search Revised Settlement (2.0) Released; What About Libraries?

Friday, November 13th, 2009

We’re going to be on the lookout for news, commentary from experts, and viewpoints from various organizations and companies involved in the GBS story. We’re posting selected snippets with links to the full text. We also know that in the document filed with the court, there is one mention of libraries, public libraries to be specific.

From Google and Others Involved:

+ Modifications to the Google Books Settlement (via Google Public Policy Blog, Dan Clancy)

The changes we’ve made in our amended agreement address many of the concerns we’ve heard (particularly in limiting its international scope), while at the same time preserving the core benefits of the original agreement: opening access to millions of books while providing rightsholders with ways to sell and control their work online.

The blog post also links to a settlement modifications overview (3 pages) and a Revised Settlement FAQ (2 pages).

Are libraries mentioned in these documents? Yes. As you’ll read, not much is different in terms of access except that the amended agreement allows the Registry to increase the number of terminals in a public library.

On Page 2 of the Overview it States:

The amended settlement does not change the primary access models outlined in the original agreement, including enabling readers to preview and purchase books, selling institutional subscriptions to the whole database, and giving libraries free access at designated terminals. Under the revised agreement, possible additional access models to which Google and the Registry might agree in the future have been reduced and are now limited to: print-on-demand*, file download, and consumer subscription. The amended agreement also enables the Registry to increase the number of terminals at a public library building

* The Amended Settlement limits POD, if approved, to Books that are not Commercially Available.

There is no mention of the words library or libraries in the FAQ.

There is a third document, a Supplemental Notice (an actual court filing; 6 pages; PDF), listing all of the changes to the settlement. #17 talks about the terminals in public libraries that we mentioned a moment ago.

Here are a few more changes (via the supplemental notice) that might be of special interest:

+ #16:

The Amended Settlement provides that the Registry will facilitate Rightsholders’ wishes to allow their works to be made available through alternative licenses for Consumer Purchase, including through a Creative Commons license…The Amended Settlement also clarifies that Rightsholders are free to set the Consumer Purchase price of their Books at zero.

+ #18:

The Amended Settlement no longer includes children’s book illustrations in the definition of Inserts. (ASA Section 1.75) The Amended Settlement, however, does not change the inclusion of pictorial works, such as graphic novels and children’s picture books, in the definition of Books and provides that the Amended Settlement only authorizes Google to display the pictorial images in such Books if a U.S. copyright owner of the pictorial image also is a Rightsholder of the Book. The Amended Settlement also clarifies that comic books are considered to be Periodicals and that Periodicals (as well as compilations of Periodicals) are not included in the definition of “Books,” and thus are not in the Amended Settlement.

Finally, if you would like to read the complete Amended Settlement Agreement, here’s the 173 page PDF file.

+ Amended Google/AAP Settlement (via Coyle’s InFormation, Karen Coyle)
An excellent overview of Settlement 2.0 from librarian Karen Coyle. She brings up several library related issues including the removal of an OCLC “exception”; download formats and course packs; and much more. This is must read material.

+ Is the Google Books Settlement Worth the Wait?

The Open Book Alliance (SLA and the New York Library Association are two of its members) has posted its views after a preliminary reading of the revised settlement. Here are a few snippets.

Open Book Alliance co-chair Peter Brantley said, “Our initial review of the new proposal tells us that Google and its partners are performing a sleight of hand; fundamentally, this settlement remains a set-piece designed to serve the private commercial interests of Google and its partners. None of the proposed changes appear to address the fundamental flaws illuminated by the Department of Justice and other critics that impact public interest.

[Snip]

Most critically, the settlement proposal must not grant Google an exclusive set of rights (de facto or otherwise) or result in any one entity gaining control over access to and distribution of the world’s largest digital database of books. It is clear that Google has failed to meet these requirements.

UPDATE: 11/17 The Monopoly Continues (Source: Open Book Alliance)

UPDATE: 11/17 Proposed Changes Fail to Address Fundamental Flaws, Says Open Book Alliance Co-Chair (via Open Book Alliance)

+ Revised Google Book Settlement Filed & Live Blogging The Press Call (via Search Engine Land, Danny Sullivan)

Danny took the time to live blog the conference call that took place early Saturday morning, east coast time. On the call were:

+ Richard Sarnoff, chairman of the Association of American Publishers

+ Paul Aiken, executive director of the Authors Guild

+ Daniel Clancy, engineering director for Google Books

Here’s how they responded to the Open Book Alliance comments that are posted and linked to above this item.

So the response to that? Clancy stepped up, saying there were lots of discussions on how to change things. Adjustments were made to address class member concerns (the people involved in the lawsuit, rather than the Open Book Alliance, which is not a party to the suit). “I understand Amazon, Microsoft and the Internet Archive don’t want to increase access to these books,” he said, or very close to that. That was a zinger, stressing that the Open Book Alliance just happens to be backed by major Google competitors. Not that Google minds. Clancy said they welcome the competition and feel the settlement addresses concerns.

Aiken: “These are substantial changes.” He added that yes, the core settlement was largely protected but that it had to be, as it was in general seen correct.

Sarnoff: Said he assumed the OBA hadn’t read the settlement. That was probably true enough. The press conference itself appears to have started about 1/2 hour after the settlement was out. Some reporters on the call mentioned they hadn’t even read it.

+ The Authors Guild Has a Review of the Major Changes on their Site

+ Google Book Search Settlement Revised: No Reader Privacy Added (From the Electronic Frontier Foundation)

Unfortunately, the parties did not add any reader privacy protections. The only nominal change was that they formally confirmed a position they had long taken privately that information will not be freely shared between Google and the Registry. Our partners at the ACLU of Northern California have a blog post describing the changes we, and the authors we represent, have demanded and continuing the call for readers everywhere to let Google CEO Eric Schmidt know that reader privacy should not be left behind as books move into the digital age.

+ Amended Google Book Settlement: Doesn’t Deal with Privacy Problems (ACLU of Northern California)

One of our core privacy concerns with the Settlement has been that reading records are not properly protected from disclosure to the government and third parties. Readers should be able to use Google Book Search without worrying that the government or a third party is reading over their shoulder. No Settlement should be approved that allows reading records to be disclosed without a properly-issued warrant from law enforcement and court orders from third parties.

The Amended Settlement does not resolve this concern, with its only new privacy provision being the following:

“The revised agreement includes language that specifies that Google will not share any private information with the Registry without valid legal process.”

Much More After a Click

Digitized Historic Newspapers: Topic Guides for Chronicling America

Friday, November 13th, 2009

Chronicling America, from the NEH and the Library of Congress, is a searchable database containing more than a million digitized American newspaper pages (and growing) from 1880-1922.

Guides cover topics “widely covered” in the American press of the time. As of today (11/13/2009), there are 21 guides available with more expected soon.

Here are the titles of a few of the guides:

+ Annexation of Hawaii
+ Bloomer Girls (Women’s baseball)
+ Clara Barton
+ Comic Strips
+ Ellis Island
+ Jack Johnson vs. James J. Jeffries
+ Jack the Ripper
+ Patent Medicines
+ Presidential Election of 1896
+ Pullman Porters
+ San Francisco Earthquake, 1906

You can find the complete list and register for alerts when new guides are released here.

Hathi Trust Digital Library Publishes Update on October Activities (November, 2009)

Friday, November 13th, 2009

The update consists of a four page PDF.

Here’s a list of some of the topics covered. Access the full text to get all of the details.

Ingest

HathiTrust ingested a record 553,963 volumes in October. These included nearly 5,000 volumes from Penn State and initial loads of volumes from the University of California’s Santa Cruz and San Diego campuses. Ingest of volumes from Penn State will continue in November. Subsequent shipments of metadata for up to 600,000 additional volumes from UC campuses are expected in November. Ingest of these volumes will begin shortly thereafter.

HathiTrust participates in grant from Mellon Foundation

Google Summit and Internet Archive Ingest

Large-scale Search

Staff at the University of Michigan successfully indexed all volumes in HathiTrust using the newly acquired hardware. However, the official launch of the large-scale search application was postponed in order to acquire additional hardware to accommodate new index growth.

HathiTrust/OCLC Catalog

After finalizing metadata requirements for the version 1 catalog in September, the HathiTrust/OCLC Catalog team turned its attention in October to interface requirements. The team is currently finalizing interface requirements for version 1 of the catalog and has agreed to engage in collaborative usability testing during the first quarter of 2010. Meanwhile, OCLC’s e-content synchronization work for HathiTrust remains on schedule, and is expected to be completed by the end of the calendar year.

New Growth: Number of Volumes Added

+ Indiana University: 64,614 volumes added in October, 84,132 Total
+ Penn State University: 4,675 volumes added in October, 4,675 Total
+ University of California: 264,710 volumes added in October, 786,414 Total
+ University of Michigan: 206,283 volumes added in October, 3,417,264 Total
+ University of Wisconsin: 20,430 volumes added in October, 242,705 Total
+ Totals: 553,963 volumes added in October, 4,535,190 Total

Source: Hathi Trust

Milestones: The British Library’s Digital Library Passes 500,000 Items

Friday, November 13th, 2009

From the Announcement:

The British Library has added the 500,000th item to its long-term Digital Library System. The milestone item was a digitised copy of a newspaper originally published in 1864 and scanned as part of the Library’s 19th Century British Library Newspapers project, which recently made more than 2 million pages of historic newspapers available online. [Subscription Required].

[Snip]

Steve Green, Head of the Digital Library Programme at the British Library said: “The task of collecting, preserving and providing long-term access to the nation’s digital assets is in many ways a daunting and complex undertaking. The sheer amount of material being published digitally is challenging enough in itself, but the wide range of different formats – many of which will inevitably become obsolete – makes preservation and future accessibility far from straightforward. The Digital Library Programme has made huge progress in the past few years and we now have the foundations of a robust and fully scaleable system that can handle large quantities of digital items, ensuring their availability for future generations of researchers just as our historic print collections remain available for users today.”

Currently the Digital Library System holds:

+ 386,000 items received through the Voluntary Deposit of Electronic Publications (VDEP) scheme
+ 23,000 British Library Sound Archive master files
+ 65,000 19th century digitised books
+ 2,000 electronic journal items
+ 29,000 newspaper items

Source: British Library

Update: NARA/Footnote Holocaust Collection of Digitized Records, Materials to Remain Free Through December 31st

Thursday, November 12th, 2009

At the end of September we posted an in-depth overview about a new collection of digitized Holocaust records from the National Archives (NARA) and Footnote. Our post said that at the end of October a sizable portion of the content would only be available to Footnote.com subscribers.

Today, an update. All of the material will remain free through December 31, 2009.

From an E-Mail:

…due to the popularity of this collection, we have decided to keep the records open free to the public through the rest of this year. This will enable more people to search and explore the original records from the National Archives. On January 1, 2010 these records will become part of the paid subscription on Footnote.com. These records, however, will remain free to access through any of the National Archives physical locations.