Archive for the ‘Preservation/Conservation’ Category

The Internet Time Machine from the Memento Project

Tuesday, November 17th, 2009

This is a must-read from start to finish. Here are a few snippets to whet your appetite.

Access the Complete Article from New Scientist

Bookmarking a page takes you to its current version – but earlier ones are harder to find (to see an award-winning 1990s incarnation of newscientist.com, see our gallery of web pages past, right). One option is to visit a resource like the Internet Archive’s Wayback Machine. There, you key in the URL of the site you want and are confronted with a matrix of years and dates for old pages that have been cached.

It’s a lot of hassle. But it shouldn’t be, says Herbert Van de Sompel, a computer scientist at Los Alamos. “Today we treat the web like a library in which you have to know how to go and search for things. We’ve a better way.”

That “better way” is a system that gives browsers a “time-travel” mode, allowing users to find web pages from particular dates and times without having to navigate through archives.

[Snip]

“In addition to language and media type, we negotiate in time. So Memento asks the server not for today’s version of this page, but how it looked one year ago, for instance,” says Van de Sompel.
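To make "negotiating in time" concrete, here is a minimal sketch of the request header a Memento-style client could send. It assumes the `Accept-Datetime` header name from the later RFC 7089 standardization of Memento (the article itself does not name a header), and the helper name `accept_datetime_header` is purely illustrative.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def accept_datetime_header(when):
    """Return HTTP headers asking for a resource as it looked at `when`.

    `when` must be a timezone-aware datetime. The header name is an
    assumption taken from RFC 7089, not from the article quoted above.
    """
    # HTTP dates are always expressed in GMT, so convert before formatting.
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": http_date}


# "How it looked one year ago", counted back from this post's date.
one_year_ago = datetime(2009, 11, 17, tzinfo=timezone.utc) - timedelta(days=365)
print(accept_datetime_header(one_year_ago))
```

A server or archive that understands the header can then answer with the archived copy closest to that date rather than with today's page.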

[Snip]

Jakob Voss, a developer with the Common Library Network in Göttingen, Germany, is an early Memento user – and he is already advocating use of Memento for sites with frequently updated pages like Wikipedia.

“Memento is only a proof of concept but it looks very promising and could be a great enhancement to the web. There is little support in today’s browsers for digging into archives, especially those with dynamic content management systems like wikis and weblogs,” Voss says.

You Can Try a Demo Here and Learn More Here

Access the Complete Article from New Scientist

Source: New Scientist

Old-Book Smell, Sniffing, and Preservation

Tuesday, November 17th, 2009

An interesting, out-of-the-way question, and the NY Times goes on a hunt for the answer in this story.

From the Article:

If you have torn yourself away from the virtual library that is the Internet long enough to visit a real library, you know that the smell of old books — musty, slightly acidic, even grassy — is instantly recognizable. But is it quantifiable? And if so, might old-book odor prove useful to librarians and conservators charged with preserving collections?

[Snip]

Dr. Strlic said he got the idea one day at a library when he saw a conservator sniffing an old piece of paper, trying to determine what it was made of. “I thought, certainly a technique could be developed to do that more accurately,” he said. The approach is similar to breath analysis used to diagnose illness, he added.

He and his colleagues analyzed the volatiles produced by 72 samples of old paper of different types and in varying condition from the 19th and 20th centuries, using gas chromatography-mass spectrometry. They found that some compounds were reliable markers for paper with certain characteristics — high concentrations of lignin or rosin, for example, which make paper degrade relatively quickly. Their findings were published in the journal Analytical Chemistry.

Source: NY Times

The November/December 2009 Issue of D-Lib Magazine is Now Available

Tuesday, November 17th, 2009

Before we post a selection of what’s in the new issue of D-LIB, ResourceShelf would like to thank Bonita Wilson for editing a great publication. She has been the sole editor of D-LIB since July 2001. This is her last issue as editor. She’ll now have more time to engage in the “other things” she likes doing at her home on the Chesapeake Bay in VA. She’ll continue with CNRI in a part-time capacity.

Here are Some of the Articles in the November/December 2009 Issue of D-LIB:

+ Beyond 1923: Characteristics of Potentially In-copyright Print Books in Library Collections
by Brian Lavoie and Lorcan Dempsey, OCLC Online Computer Library Center

+ Service-Oriented Models for Educational Resource Federations
by Daniel R. Rehak, LSAL; and Nick Nicholas and Nigel Ward, Link Affiliates, Australia

+ From TIFF to JPEG 2000? Preservation Planning at the Bavarian State Library Using a Collection of Digitized 16th Century Printings
by Hannes Kulovits and Andreas Rauber, Vienna University of Technology; and Anna Kugler, Markus Brantl, Tobias Beinert, Astrid Schoger, Bavarian State Library

+ Measuring Citation Advantages of Open Accessibility
by Samson C. Soong, Hong Kong University of Science and Technology

+ The Importance of Digital Libraries in Joint Educational Programmes: A Case Study of a Master of Science Programme Involving Organizations in Ghana and the Netherlands
by Marga Koelen, International Institute for Geo-Information Science and Earth Observation; and Jonathan Arthur Quaye-Ballard, Kwame Nkrumah University of Science and Technology

+ The Practice and Perception of Web Archiving in Academic Libraries and Archives
by Lisa Gregory, University of North Carolina at Chapel Hill

+ Pennsylvania Literary Journal: Google Websites as an Easy Publication Route
by Anna Faktorovich, Indiana University of Pennsylvania

Access the Complete November/December 2009 Issue of D-LIB:

Video: Preserving and Providing Access to Digital Info from State Legislatures

Monday, November 16th, 2009

From an Announcement:

A new video features Minnesota Speaker of the House Margaret Anderson Kelliher talking about new methods to preserve and provide access to digital records of state legislatures. The production describes the work of A Model Technological and Social Architecture for the Preservation of State Government Digital Information Project, which is supported by the Library of Congress National Digital Information Infrastructure and Preservation Program.

Direct to Video (via Minnesota Historical Society)
It runs about six minutes.

Source: National Digital Information Infrastructure and Preservation Program

Milestones: The British Library’s Digital Library Passes 500,000 Items

Friday, November 13th, 2009

From the Announcement:

The British Library has added the 500,000th item to its long-term Digital Library System. The milestone item was a digitised copy of a newspaper originally published in 1864 and scanned as part of the Library’s 19th Century British Library Newspapers project, which recently made more than 2 million pages of historic newspapers available online. [Subscription Required].

[Snip]

Steve Green, Head of the Digital Library Programme at the British Library said: “The task of collecting, preserving and providing long-term access to the nation’s digital assets is in many ways a daunting and complex undertaking. The sheer amount of material being published digitally is challenging enough in itself, but the wide range of different formats – many of which will inevitably become obsolete – makes preservation and future accessibility far from straightforward. The Digital Library Programme has made huge progress in the past few years and we now have the foundations of a robust and fully scaleable system that can handle large quantities of digital items, ensuring their availability for future generations of researchers just as our historic print collections remain available for users today.”

Currently the Digital Library System holds:

+ 386,000 items received through the Voluntary Deposit of Electronic Publications (VDEP) scheme
+ 23,000 British Library Sound Archive master files
+ 65,000 19th century digitised books
+ 2,000 electronic journal items
+ 29,000 newspaper items

Source: British Library

Digital Preservation: Two New Publishers Join CLOCKSS

Monday, November 9th, 2009

From the Announcement:

CLOCKSS is pleased to announce that two new society publishers have recently joined the CLOCKSS archive. The Royal Society of Chemistry and the Royal Society have signed agreements this fall to join CLOCKSS and preserve their materials in the CLOCKSS network of geographically and geopolitically distributed archive nodes. CLOCKSS (Controlled Lots of Copies Keep Stuff Safe) is a community-governed, not-for-profit archive founded by librarians and publishers to ensure the long-term availability of scholarly digital content.

As part of joining CLOCKSS, the two societies agree to release their archived content to the world for free if a time comes when it is no longer available from any publisher (“trigger event”).

Access the Complete Announcement

Source: CLOCKSS

Digital Preservation: ACM Will Partner with Portico and CLOCKSS for Preservation of Its Digital Library Resources

Friday, November 6th, 2009

From an Announcement:

ACM (the Association for Computing Machinery) announced today that it is providing its institutional library customers with advanced electronic archiving services to preserve their valuable electronic resources. These services, provided by Portico and CLOCKSS, address the scholarly community’s critical need for long-term solutions that assure reliable, secure, deliverable access to their burgeoning digital collection of scholarly works. ACM is offering these services to protect the vast online collection of resources in its Digital Library (DL), which are used by over 1 million computing professionals and students worldwide.

“By partnering with Portico and CLOCKSS, we are able to meet a growing demand in the library community for a trusted, reliable third-party archive, and to ensure that digital collections remain accessible to future scholars, researchers, and students,” said Scott Delman, ACM Group Publisher. “Scientific discovery and the educational process are not possible without reliable access to the accumulated scholarship of the past and secure preservation of the scholarly record, and these agreements are a clear step forward with the relationship between the ACM and the library community.”

By investing in long-term digital preservation of content, ACM’s aim is to make it easier for libraries to accelerate their transition away from print and free up resources invested in print collections in favor of new and innovative electronic products and services.

Much More After a Click

New Video on Web Archiving

Friday, November 6th, 2009

From the Description:

Web content changes all the time. If we don’t save that content before it disappears, a major part of our cultural history will be lost.

The Library of Congress is working to provide permanent access to web content of historical importance. It selects websites for collection, requests permissions from the website owners, addresses the technology of collecting websites and preserves the websites and makes them available.

This video examines those four challenges.

Access the Video (embedded here)

A text transcript is also available (PDF)

Source: National Digital Information Infrastructure and Preservation Program

Webcast: Preserving OSTI’s Printed Archive

Thursday, November 5th, 2009

A three-minute video from the Office of Scientific and Technical Information at the U.S. Department of Energy.

Here’s the Blurb:

The American public has invested billions of dollars in the atomic energy and subsequent related programs. This investment has mostly been in the form of the printed page. OSTI’s historical preservation is described.

Direct to “Printed Archive” Video (via YouTube)

Direct to OSTI YouTube Channel

Direct to OSTI Home Page

While print preservation is essential, OSTI is home to many free online databases including:

+ Science Accelerator
+ Science.gov (Content from Many Government Databases, Search Technology from OSTI)
+ WorldWideScience (Global in Scope)
+ Information Bridge: DOE Scientific and Technical Information (Includes over 210K Full Text Documents)
+ DOE Data Explorer
+ Energy Citations Database
+ E-print Network
+ Several Others Linked on the OSTI Home Page

Source: OSTI

Bibliotheca Alexandrina: A Digital Revival

Tuesday, November 3rd, 2009

The Bibliotheca Alexandrina is one busy place. If you want to learn more, read on through our highlights, but make sure to read the complete article. Our highlights are just a sample of what’s going on.

From the Article:

The International School of Information Science (ISIS), a research institute affiliated with the BA [Bibliotheca Alexandrina], aims at furthering the BA’s goals of being a leading institution in knowledge dissemination and, specifically, promoting research and development related to the digital libraries. Toward that goal, ISIS has embarked on an array of ambitious projects, in partnership with world-class institutions. These include hosting a mirror site for the Internet Archive, participating in the Million Book Project, organizing the digital archive of the Gamal Abdel Nasser collection, digitizing 113 years of Al-Hilal magazine, presenting the first-ever complete digital version of Description de l’Egypte, conducting advanced research such as the Arabic component of the UN-sponsored Universal Networking Language computerized multi-language translation program, and offering the most advanced 3D virtual imaging techniques in a virtual immersive environment for science and technology applications. Thus, despite being barely seven years in existence, the BA already has a substantial record of achievements.

Among the other projects you’ll read about are:

+ The Digital Assets Repository (DAR)

+ Memory of Modern Egypt Digital Repository

+ Archive documenting the history of the Suez Canal

+ SuperCourse

To empower science educators worldwide, the BA is working with a team of specialists, in partnership with the University of Pittsburgh, to launch the first science SuperCourse, comprising thousands of PowerPoint lectures made available for free to teachers and lecturers, who can use the lectures as they see fit in their teaching of science. The SuperCourse has been effectively implemented in the area of Public Health and Epidemiology, with a network of 65,000 scientists in 174 countries, providing more than 3,500 lectures in 31 languages. The BA maintains a mirror site of SuperCourse, which receives an average of one million hits per month, and is working on setting up a similar course in all fields of science.

Much More in the Complete Article

Source: EDUCAUSE Review
Hat Tip: OAN

The World Media Has Responsibility to Save Audio-Visual Archives + Library of Congress Research Project

Tuesday, October 27th, 2009

October 27 is UNESCO Audio-visual Heritage Day.

From the Article:

Federation president Herbert Hayduck says that the world media community has a common responsibility to save audio-visual archives, many of which are on the verge of being lost.

Source: CCTV

See Also: UNESCO World Day for Audiovisual Heritage: Library of Congress Engaged in Cutting Edge Grooved Recording Imaging Research

In celebration of UNESCO’s World Day for Audiovisual Heritage, the Library of Congress Preservation Directorate is featuring information about an innovative project using imaging technology to recover ‘lost’ sound from grooved analog recordings.

+ Learn More about the IRENE Project

+ Webcast: Capturing Recorded Sound through Imaging: The I.R.E.N.E. Project and Future Prospects

See Also: UNESCO World Day for Audiovisual Heritage Web Page

See Also: Message from Director-General of UNESCO

GeoCities Says So Long as Internet Archive Works to Preserve Content

Tuesday, October 27th, 2009

In August, we first posted about the Internet Archive (IA) asking GeoCities users to make sure their content was archived by the IA. Why? As of yesterday, GeoCities is no longer online.

From the Article:

Yahoo, which acquired the site for $3.57bn (£2.17bn) in 1999 at the height of the dotcom boom, said sites would no longer be accessible from 26th October.

However, many of the pages have been archived and will still be available to view via the nonprofit Internet Archive project.

The giant digital library, which has been archiving the public web since 1996, has set up a special project to archive GeoCities before it is lost forever.

“We’ve collected a lot of GeoCities sites over the years – but might not have every site and every page,” the Internet Archive said.

Access the Complete Article

Source: BBC

See Also: Saving a Historical Record of GeoCities (via Internet Archive)

Library of Congress’ National Digital Information Infrastructure and Preservation Program Wins Government Computer News Award

Saturday, October 24th, 2009

The NDIIPP is one of 11 projects to receive a GCN [Government Computer News] Award for Agency IT Achievement.

From the Summary:

It took two centuries for the Library of Congress to acquire its 29 million books and 105 million other items. Today, it only takes 15 minutes for the world to produce an equal amount of information in digital form, creating unprecedented archiving challenges for the Library of Congress. The Library is meeting the challenge of digital preservation by developing new tools to transfer large quantities of digital content. To date, more than 3 million files have been transferred and stored using the BagIt specification. Due to the Library’s digital preservation initiatives, more than 1,000 collections of digital content have been selected, captured, preserved, and made available to the U.S. public and online visitors across the globe.

Access the Complete Article

We are warned to be careful about what we put online because data on the Internet lives forever. But keeping random copies of files on servers, routers and databases is not the same as preservation, said Martha Anderson, director of program management for the Library of Congress’ National Digital Information Infrastructure and Preservation Program. Digital data can be ephemeral. “That is the paradox,” she said.

Much More in the Summary and Complete Article

Source: GCN

See Also: Library of Congress News Release

Getting to Know the HathiTrust Digital Library

Friday, October 23rd, 2009

Barbara Quint Writes:

With all the controversy still swirling around Google Books and its post-settlement offerings, an alternative route to the millions of digitized books and journals supplied by leading Google Book Search library partners has arrived. The HathiTrust (www.hathitrust.org) is a collaboration of 25 research libraries already participating in Google Book Search to produce a shared digital repository for preservation and access to a curated collection. By mid-November, the HathiTrust Digital Library will have a full-featured, full-text search service for 4.3-5 million items. The searches will retrieve bibliographic citations and page references, including those for in-copyright books. Content will extend beyond the digitized copies of books returned to early library partners by Google. HathiTrust is pushing to acquire other digitized special collections from its members, as well as making arrangements for opening access to university press books.

[Snip]

The new launch will open indexing to nearly 1.5 billion pages from well more than 4.3 million volumes with full-text searching by keyword or phrase. (Just between us, if you simply cannot wait until mid-November, go to http://babel.hathitrust.org/cgi/ls. [John] Wilkin, [associate university librarian at the University of Michigan and executive director of the HathiTrust], tipped me off that, [our emphasis] although this “experimental search” site claims to search only 500,000 documents, it actually includes the full 4.3-5 million volumes. Feedback options appear at the top and bottom of each search results page.) The system already had the equivalent of library cataloging searching, though they expect to upgrade even that kind of searching under a cooperative program with OCLC.

Much More in the Complete Article

Source: InfoToday NewsBreaks

Article: Missing Links: The Enduring Web

Thursday, October 22nd, 2009

From the Abstract:

The Web runs at risk. Our generation has witnessed a revolution in human communications on a trajectory similar to that of the origins of the written word and language itself. Early Web pages have an historical importance comparable with prehistoric cave paintings or proto-historic pressed clay ciphers. They are just as fragile. The ease of creation, editing and revising gives content a flexible immediacy: ensuring that sources are up to date and, with appropriate concern for interoperability, content can be folded seamlessly into any number of presentation layers. How can we carve a legacy from such complexity and volatility?

Access the Complete Article (PDF)

Source: International Journal of Digital Curation (4.2)

The Leon Levy Foundation: Helping Organizations to Collect, Conserve, and Digitize Archival Collections

Tuesday, October 13th, 2009

From the Article:

The National Park Service found the original deed from 1695 for the homestead in Virginia where George Washington was born and copies of John Peter Zenger’s New-York Weekly Journal from 1735 reporting on his landmark trial affirming freedom of the press. The Center for Jewish History discovered the 1944 document in which Raphael Lemkin coined the term genocide. The Morgan Library turned up a 1913 letter from the sister of Virginia Woolf saying that “Virginia was very much depressed yesterday” and attempted suicide — three decades before she would kill herself.

Those are among the nearly two dozen institutions that have received grants from the Leon Levy Foundation since 2007 to identify, preserve and digitize their archival collections and to make them available online to scholars and to the public.

The foundation’s archives and catalogs program has awarded more than $10.3 million, including two grants this week: $3.5 million to the Institute for Advanced Study in Princeton, N.J., to collect and conserve the papers of its present and former scholars, including George F. Kennan, J. Robert Oppenheimer and Albert Einstein; and [our emphasis] $2.4 million to the New York Philharmonic, where archivists will digitize 1.3 million pages, including a 1909 Mahler score for his First Symphony originally marked up by the composer and further annotated 50 years later by Leonard Bernstein.

Much, Much More in the Complete Article

Source: NY Times

See Also: Learn More via the Leon Levy Foundation Web Site

Preserving Internet Content

Tuesday, October 13th, 2009

From the Web Site:

On October 7, 2009, the IIPC [International Internet Preservation Consortium] sponsored a free, one-day event, Active Solutions for Preserving Internet Content, following iPRES 2009, the 6th International Conference on Preservation of Digital Objects, held at the Mission Bay Conference Center, San Francisco. Slide presentations are available on the conference program page.

Presentations with Slides Include:

+ Billions and billions of objects, METS, PREMIS, oh my! (Gina Jones)

+ Preserving Access-Making more informed guesses about what works (David Pearson)

+ “Here be dragons” – Strategies for dealing with viruses in the web archive (Matt Holden)

+ Say Emulate; He Says Migrate (David Pearson)

+ Keep Websites Alive (Jeffrey van der Hoeven)

+ What do web archivers (or is it archivists) really do? (Gina Jones)

+ Web Archives Are Forever: defining a workflow for long term preservation of web archives (Maureen Pennock)

+ Square pegs? Fitting web archives into the digital preservation repository of the National Library of New Zealand (Kevin De Vorsey)

+ Continuity and Preservation: The National Archives approach to maintaining permanent access to the web presence of UK Central Government (Amanda Spencer and Alison Heatherington)

+ It’s the end of a project, as we know it: a leading discussion on experiences and issues in embedding web archiving and preservation in an organization (Marcel Ras and Hilde van Wijngaarden)

Source: netpreserve

Preservation at the Library of Congress: Stabilizing Special Collections for High-Density Storage

Friday, October 9th, 2009

From the Introduction:

In 2005, the Library of Congress opened a state-of-the-art, high-density storage facility thirty miles from Capitol Hill at Fort Meade in Maryland. The facility was constructed on a modular basis: Modules 1 and 2 were designed for traditional bound library materials; Modules 3 and 4 (on which these webpages focus) were designed to house 22 million special-format collection items.

The Conservation Division Move Project team was charged with preparing especially challenging special format collections for off-site transport and storage. The collections came from eight custodial divisions across the Library and included a variety of formats such as globes, rolled architectural drawings, ephemera, large works of art on paper, photographs, negatives, maps, manuscripts, newspapers, rare folios and a variety of three-dimensional objects.

Sections of the Report Include:

+ Introduction & Planning
+ Globes
+ Objects
+ Rolled Drawings
+ Works of Art
+ Bound Volumes
+ Standard Archival Manuscript Collections

Source: Preservation Directorate at the Library of Congress

NDIIPP Releases Web Archiving Video

Friday, October 9th, 2009

From the Story:

Web content changes all the time. If we don’t save that content before it disappears, a major part of our cultural history will be lost.

This is the message of the second video in the Library of Congress National Digital Information Infrastructure and Preservation Program’s video series. The just-released video, “Web Archiving,” discusses the Library’s approach to collecting and preserving content found on the World Wide Web.

The three-minute video is targeted to librarians, archivists, and others interested in working with digital content.

[Snip]

The “Web Archiving” production is the second in the series, following the BagIt video that was released in July 2009. The BagIt video describes a specification for securely transferring digital content.
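For readers curious what a transfer specification like this checks, here is a rough sketch of the core idea behind a BagIt-style payload manifest: a checksum is recorded for every file under a `data/` directory, and the receiver recomputes the checksums after transfer. The function names, the choice of SHA-256, and the sample file are assumptions for illustration only; this is not the Library's tooling, and a real bag also carries a `bagit.txt` declaration and writes the manifest to a file.

```python
import hashlib
import tempfile
from pathlib import Path


def payload_manifest(bag_dir):
    """Return manifest lines ("<checksum>  <relative path>") for files under data/."""
    bag = Path(bag_dir)
    lines = []
    for path in sorted((bag / "data").rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(bag).as_posix()}")
    return lines


def payload_matches(bag_dir, manifest_lines):
    """True when the payload's files and checksums match the manifest exactly."""
    return payload_manifest(bag_dir) == manifest_lines


# Hypothetical demo: build a tiny bag, then simulate corruption in transit.
with tempfile.TemporaryDirectory() as tmp:
    payload = Path(tmp) / "data"
    payload.mkdir()
    (payload / "page.html").write_text("<html>archived page</html>")

    manifest = payload_manifest(tmp)
    print(payload_matches(tmp, manifest))  # payload untouched since the manifest was made

    (payload / "page.html").write_text("<html>damaged</html>")
    print(payload_matches(tmp, manifest))  # payload changed after the manifest was made
```

Comparing the whole manifest, rather than file by file, also flags files that were added or dropped along the way.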

View the Web Archiving Video

Video Presentations Homepage

Source: National Digital Information Infrastructure and Preservation Program, Library of Congress