Archive for the ‘Preservation/Conservation’ Category

Digital Preservation: ACM Will Partner with Portico and CLOCKSS for Preservation of Its Digital Library Resources

Friday, November 6th, 2009

From an Announcement:

ACM (the Association for Computing Machinery) announced today that it is providing its institutional library customers with advanced electronic archiving services to preserve their valuable electronic resources. These services, provided by Portico and CLOCKSS, address the scholarly community’s critical need for long-term solutions that assure reliable, secure, deliverable access to their burgeoning digital collection of scholarly works. ACM is offering these services to protect the vast online collection of resources in its Digital Library (DL), which are used by over 1 million computing professionals and students worldwide.

“By partnering with Portico and CLOCKSS, we are able to meet a growing demand in the library community for a trusted, reliable third-party archive, and to ensure that digital collections remain accessible to future scholars, researchers, and students,” said Scott Delman, ACM Group Publisher. “Scientific discovery and the educational process are not possible without reliable access to the accumulated scholarship of the past and secure preservation of the scholarly record, and these agreements are a clear step forward with the relationship between the ACM and the library community.”

By investing in long-term digital preservation of content, ACM’s aim is to make it easier for libraries to accelerate their transition away from print and free up resources invested in print collections in favor of new and innovative electronic products and services.

Much More After a Click

New Video on Web Archiving

Friday, November 6th, 2009

From the Description:

Web content changes all the time. If we don’t save that content before it disappears, a major part of our cultural history will be lost.

The Library of Congress is working to provide permanent access to web content of historical importance. It selects websites for collection, requests permissions from the website owners, addresses the technology of collecting websites and preserves the websites and makes them available.

This video examines those four challenges.

Access the Video (embedded here)

A text transcript is also available (PDF)

Source: National Digital Information Infrastructure and Preservation Program

Webcast: Preserving OSTI’s Printed Archive

Thursday, November 5th, 2009

A three-minute video from the Office of Scientific and Technical Information at the U.S. Department of Energy.

Here’s the Blurb:

The American public has invested billions of dollars in the atomic energy and subsequent related programs. This investment has mostly been in the form of the printed page. OSTI’s historical preservation is described.

Direct to “Printed Archive” Video (via YouTube)

Direct to OSTI YouTube Channel

Direct to OSTI Home Page

While print preservation is essential, OSTI is home to many free online databases including:

+ Science Accelerator
+ Science.gov (Content from Many Government Databases, Search Technology from OSTI)
+ WorldWideScience (Global in Scope)
+ Information Bridge: DOE Scientific and Technical Information (Includes over 210K Full Text Documents)
+ DOE Data Explorer
+ Energy Citations Database
+ E-print Network
+ Several Others Linked on the OSTI Home Page

Source: OSTI

Bibliotheca Alexandrina: A Digital Revival

Tuesday, November 3rd, 2009

The Bibliotheca Alexandrina is one busy place. To learn more, read through our highlights, but make sure to read the complete article as well; our highlights are just a sample of what’s going on.

From the Article:

The International School of Information Science (ISIS) a research institute affiliated with the BA [Bibliotheca Alexandrina], aims at furthering the BA’s goals of being a leading institution in knowledge dissemination and, specifically, promoting research and development related to the digital libraries. Toward that goal, ISIS has embarked on an array of ambitious projects, in partnership with world-class institutions. These include hosting a mirror site for the Internet Archive, participating in the Million Book Project, organizing the digital archive of the Gamal Abdel Nasser collection, digitizing 113 years of Al-Hilal magazine, presenting the first-ever complete digital version of Description de l’Egypte, conducting advanced research such as the Arabic component of the UN-sponsored Universal Networking Language computerized multi-language translation program, and offering the most advanced 3D virtual imaging techniques in a virtual immersive environment for science and technology applications. Thus, despite being barely seven years in existence, the BA already has a substantial record of achievements.

Among the other projects you’ll read about are:

+ The Digital Assets Repository (DAR)

+ Memory of Modern Egypt Digital Repository

+ Archive documenting the history of the Suez Canal

+ SuperCourse

To empower science educators worldwide, the BA is working with a team of specialists, in partnership with the University of Pittsburgh, to launch the first science SuperCourse, comprising thousands of PowerPoint lectures made available for free to teachers and lecturers, who can use the lectures as they see fit in their teaching of science. The SuperCourse has been effectively implemented in the area of Public Health and Epidemiology, with a network of 65,000 scientists in 174 countries, providing more than 3,500 lectures in 31 languages. The BA maintains a mirror site of SuperCourse, which receives an average of one million hits per month, and is working on setting up a similar course in all fields of science.

Much More in the Complete Article

Source: EDUCAUSE Review
Hat Tip: OAN

The World Media Has Responsibility to Save Audio-Visual Archives + Library of Congress Research Project

Tuesday, October 27th, 2009

October 27 is UNESCO Audio-visual Heritage Day.

From the Article:

Federation president Herbert Hayduck says that the world media community has a common responsibility to save audio-visual archives, many of which are on the verge of being lost.

Source: CCTV

See Also: UNESCO World Day for Audiovisual Heritage: Library of Congress Engaged in Cutting Edge Grooved Recording Imaging Research

In celebration of UNESCO’s World Day for Audiovisual Heritage, the Library of Congress Preservation Directorate is featuring information about an innovative project using imaging technology to recover ‘lost’ sound from grooved analog recordings.

+ Learn More about the IRENE Project

+ Webcast: Capturing Recorded Sound through Imaging: The I.R.E.N.E. Project and Future Prospects

See Also: UNESCO World Day for Audiovisual Heritage Web Page

See Also: Message from Director-General of UNESCO

GeoCities Says So Long as Internet Archive Works to Preserve Content

Tuesday, October 27th, 2009

In August, we first posted about the Internet Archive (IA) asking GeoCities users to make sure their content was archived by the IA. Why? As of yesterday, GeoCities is no longer online.

From the Article:

Yahoo, which acquired the site for $3.57bn (£2.17bn) in 1999 at the height of the dotcom boom, said sites would no longer be accessible from 26th October.

However, many of the pages have been archived and will still be available to view via the nonprofit Internet Archive project.

The giant digital library, which has been archiving the public web since 1996, has set up a special project to archive GeoCities before it is lost forever.

“We’ve collected a lot of GeoCities sites over the years – but might not have every site and every page,” the Internet Archive said.

Access the Complete Article

Source: BBC

See Also: Saving a Historical Record of GeoCities (via Internet Archive)

Library of Congress’ National Digital Information Infrastructure and Preservation Program Wins Government Computer News Award

Saturday, October 24th, 2009

The NDIIPP is one of 11 projects to receive a GCN [Government Computer News] Award for Agency IT Achievement.

From the Summary:

It took two centuries for the Library of Congress to acquire its 29 million books and 105 million other items. Today, it only takes 15 minutes for the world to produce an equal amount of information in digital form, creating unprecedented archiving challenges for the Library of Congress. The Library is meeting the challenge of digital preservation by developing new tools to transfer large quantities of digital content. To date, more than 3 million files have been transferred and stored using the BagIt specification. Due to the Library’s digital preservation initiatives, more than 1,000 collections of digital content have been selected, captured, preserved, and made available to the U.S. public and online visitors across the globe.

Access the Complete Article

We are warned to be careful about what we put online because data on the Internet lives forever. But keeping random copies of files on servers, routers and databases is not the same as preservation, said Martha Anderson, director of program management for the Library of Congress’ National Digital Information Infrastructure and Preservation Program. Digital data can be ephemeral. “That is the paradox,” she said.

Much More in the Summary and Complete Article

Source: GCN

See Also: Library of Congress News Release

Getting to Know the HathiTrust Digital Library

Friday, October 23rd, 2009

Barbara Quint Writes:

With all the controversy still swirling around Google Books and its post-settlement offerings, an alternative route to the millions of digitized books and journals supplied by leading Google Book Search library partners has arrived. The HathiTrust (www.hathitrust.org) is a collaboration of 25 research libraries already participating in Google Book Search to produce a shared digital repository for preservation and access to a curated collection. By mid-November, the HathiTrust Digital Library will have a full-featured, full-text search service for 4.3-5 million items. The searches will retrieve bibliographic citations and page references, including those for in-copyright books. Content will extend beyond the digitized copies of books returned to early library partners by Google. HathiTrust is pushing to acquire other digitized special collections from its members, as well as making arrangements for opening access to university press books.

[Snip]

The new launch will open indexing to nearly 1.5 billion pages from well more than 4.3 million volumes with full-text searching by keyword or phrase. (Just between us, if you simply cannot wait until mid-November, go to http://babel.hathitrust.org/cgi/ls. [John] Wilkin, [associate university librarian at the University of Michigan and executive director of the HathiTrust], tipped me off that, [our emphasis] although this “experimental search” site claims to search only 500,000 documents, it actually includes the full 4.3-5 million volumes. Feedback options appear at the top and bottom of each search results page.) The system already had the equivalent of library cataloging searching, though they expect to upgrade even that kind of searching under a cooperative program with OCLC.

Much More in the Complete Article

Source: InfoToday NewsBreaks

Article: Missing Links: The Enduring Web

Thursday, October 22nd, 2009

From the Abstract:

The Web runs at risk. Our generation has witnessed a revolution in human communications on a trajectory similar to that of the origins of the written word and language itself. Early Web pages have an historical importance comparable with prehistoric cave paintings or proto-historic pressed clay ciphers. They are just as fragile. The ease of creation, editing and revising gives content a flexible immediacy: ensuring that sources are up to date and, with appropriate concern for interoperability, content can be folded seamlessly into any number of presentation layers. How can we carve a legacy from such complexity and volatility?

Access the Complete Article (PDF)

Source: International Journal of Digital Curation (4.2)

The Leon Levy Foundation: Helping Organizations to Collect, Conserve, and Digitize Archival Collections

Tuesday, October 13th, 2009

From the Article:

The National Park Service found the original deed from 1695 for the homestead in Virginia where George Washington was born and copies of John Peter Zenger’s New-York Weekly Journal from 1735 reporting on his landmark trial affirming freedom of the press. The Center for Jewish History discovered the 1944 document in which Raphael Lemkin coined the term genocide. The Morgan Library turned up a 1913 letter from the sister of Virginia Woolf saying that “Virginia was very much depressed yesterday” and attempted suicide — three decades before she would kill herself.

Those are among the nearly two dozen institutions that have received grants from the Leon Levy Foundation since 2007 to identify, preserve and digitize their archival collections and to make them available online to scholars and to the public.

The foundation’s archives and catalogs program has awarded more than $10.3 million, including two grants this week: $3.5 million to the Institute for Advanced Study in Princeton, N.J., to collect and conserve the papers of its present and former scholars, including George F. Kennan, J. Robert Oppenheimer and Albert Einstein; and [our emphasis] $2.4 million to the New York Philharmonic, where archivists will digitize 1.3 million pages, including a 1909 Mahler score for his First Symphony originally marked up by the composer and further annotated 50 years later by Leonard Bernstein.

Much Much More in the Complete Article

Source: NY Times

See Also: Learn More via the Leon Levy Foundation Web Site

Preserving Internet Content

Tuesday, October 13th, 2009

From the Web Site:

On October 7, 2009, the IIPC [International Internet Preservation Consortium] sponsored a free, one-day event, Active Solutions for Preserving Internet Content, following iPRES 2009, the 6th International Conference on Preservation of Digital Objects, held at the Mission Bay Conference Center, San Francisco. Slide presentations are available on the conference program page.

Presentations with Slides Include:

+ Billions and billions of objects, METS, PREMIS, oh my! (Gina Jones)

+ Preserving Access-Making more informed guesses about what works (David Pearson)

+ “Here be dragons” – Strategies for dealing with viruses in the web archive (Matt Holden)

+ Say Emulate; He Says Migrate (David Pearson)

+ Keep Websites Alive (Jeffrey van der Hoeven)

+ What do web archivers (or is it archivists) really do? (Gina Jones)

+ Web Archives Are Forever: defining a workflow for long term preservation of web archives (Maureen Pennock)

+ Square pegs? Fitting web archives into the digital preservation repository of the National Library of New Zealand (Kevin De Vorsey)

+ Continuity and Preservation: The National Archives approach to maintaining permanent access to the web presence of UK Central Government (Amanda Spencer and Alison Heatherington)

+ It’s the end of a project, as we know it: a leading discussion on experiences and issues in embedding web archiving and preservation in an organization (Marcel Ras and Hilde van Wijngaarden)

Source: netpreserve

Preservation at the Library of Congress: Stabilizing Special Collections for High-Density Storage

Friday, October 9th, 2009

From the Introduction:

In 2005, the Library of Congress opened a state-of-the-art, high-density storage facility thirty miles from Capitol Hill at Fort Meade in Maryland. The facility was constructed on a modular basis: Modules 1 and 2 were designed for traditional bound library materials; Modules 3 and 4 (on which these webpages focus) were designed to house 22 million special-format collection items.

The Conservation Division Move Project team was charged with preparing especially challenging special format collections for off-site transport and storage. The collections came from eight custodial divisions across the Library and included a variety of formats such as globes, rolled architectural drawings, ephemera, large works of art on paper, photographs, negatives, maps, manuscripts, newspapers, rare folios and a variety of three-dimensional objects.

Sections of the Report Include:

+ Introduction & Planning
+ Globes
+ Objects
+ Rolled Drawings
+ Works of Art
+ Bound Volumes
+ Standard Archival Manuscript Collections

Source: Preservation Directorate at the Library of Congress

NDIIPP Releases Web Archiving Video

Friday, October 9th, 2009

From the Story:

Web content changes all the time. If we don’t save that content before it disappears, a major part of our cultural history will be lost.

This is the message of the second video in the Library of Congress National Digital Information Infrastructure and Preservation Program’s video series. The just-released video, “Web Archiving,” discusses the Library’s approach to collecting and preserving content found on the World Wide Web.

The three-minute video is targeted to librarians, archivists, and others interested in working with digital content.

[Snip]

The “Web Archiving” production is the second in the series, following the BagIt video that was released in July 2009. The BagIt video describes a specification for securely transferring digital content.

View the Web Archiving Video

Video Presentations Homepage

Source: National Digital Information Infrastructure and Preservation Program, Library of Congress

A Look at the Major League Baseball Video Library Film Archive

Thursday, October 8th, 2009

If you’re a baseball fan, this is a “must read.”

From the Article:

No American sport has a past as deep and cherished as baseball’s. But precious little of the sport’s history is preserved in moving images. Much occurred before the television age, leaving only grainy, scattershot clips culled from newsreels and home movies — and rarely does it show a player of [Babe] Ruth’s stature.

The newly arrived Ruth film is part of the video collection of Major League Baseball Productions, the league’s official archivist, which spans more than 100 years and includes about 150,000 hours of moving images. Most of the collection is stored in plastic cases that line metal shelves of a room labeled “Major League Baseball Film and Video Archive.” The overflow rests in storage a few miles away, in Fort Lee, N.J.

The article goes on to describe how Frank Caputo, manager of the MLB Network video library film archive, and Joe Porciello research a newly discovered 8-millimeter clip (it was found by a New Hampshire man in his grandfather’s home movie collection).

Source: The New York Times

See Also: Just in Time for the Major League Playoffs and World Series: Baseball Resources at the Library of Congress Web Guide

On Google and Usenet

Wednesday, October 7th, 2009

The article begins with one paragraph about Google Book Search but the story actually focuses on the Usenet archive (Google Groups).

From the Article by Kevin Poulsen:

…a few geeks with long memories remember the last time Google assembled a giant library that promised to rescue orphaned content for future generations. And the tattered remnants of that online archive are a cautionary tale in what happens when Google simply loses interest.

That library is Usenet, a vast internet- and dial-up-based message board system erected in 1980. Though moribund today, for decades Usenet was the paper of record for the online world, and its hundreds of millions of “newsgroup” postings chronicle everything from the birth of the web to the rise of Microsoft, as well as more trivial matters.

In February 2001, Google rescued that history when it acquired the New York-based Deja.com, and with it a Usenet archive going back to 1995. It turned the archive into Google Groups, in a move that was cheered by net geeks who had seen Deja’s reliability declining, and were certain that the supremely competent Google would save it.

[Snip]

Flash forward nearly eight years, and visiting Google Groups is like touring ancient ruins.

[Snip]

Searching within a newsgroup, even one with thousands of posts, produces no results at all. Confining a search to a range of dates also fails silently, bulldozing the most obvious path to exploring an archive.

[Snip]

“The search results are extremely poor,” says network pioneer Brad Templeton. “Like nobody cares.”

Henry Spencer, whose Usenet archive forms much of Google Groups, is troubled by the company’s curatorship. “Google does get a lot of credit for putting it together and making it available,” Spencer says. “But search capabilities are important for such a large collection of data. The archive’s value to the community is considerably reduced if it’s not conveniently searchable.”

Source: Wired

Legal Delays Have Blown a Hole in UK’s Digital Heritage

Monday, October 5th, 2009

From the Article:

Digital literature, online scientific research and internet journalism that should have been saved in the nation’s main libraries over the past five years may have been lost because ministers have failed to give them the legal power to copy and archive websites, the Guardian has learned.

Senior executives at the British Library and the National Library of Scotland (NLS) are dismayed that legislation giving them the right to collect online and digital material is still not in force, more than six years after it was passed by parliament.

The omission has meant the libraries – which are legally required to archive books, newspapers and journals – have failed to record online coverage of major events such as the Iraq and Afghanistan wars, the release of the Lockerbie bomber and the MPs’ expenses scandal.

[Snip]

Phil Spence, head of operations at the British Library, said the failure had left a major “digital black hole” in the library’s collections, with huge gaps in the archives for researchers, scientists and historians.

It meant the British Library was unable to store the BBC’s website, the National Gallery or British Museum website, any UK newspapers’ websites, or scientific journals published online because of copyright issues. Blogs, community pages, government and business websites can only be archived after laborious voluntary agreements. The act would protect the libraries against copying defamatory material, but would also protect a publisher’s copyright.

“We’ve lost five years of digital content which is gone potentially for ever, and the ability of the nation to capitalise on that as well,” he said.

Much More in the Full Text Article Including a 3.5 Minute Audio Report

Source: The Guardian

Report From Digital Preservation Workshop Held in DC

Monday, October 5th, 2009

From the Report:

Over twenty Library of Congress staff had an opportunity to participate in a special workshop, Digital Preservation Management: Implementing Short-term Strategies for Long-term Problems, hosted by the Inter-university Consortium for Political and Social Research, held September 21-22, 2009 in Washington, DC.

Initially developed at the Cornell University Library and supported with funding from the National Endowment for the Humanities, the Digital Preservation Management workshops are structured curricula geared toward managing digital preservation planning and policies for libraries, archives, and other cultural heritage institutions. The goal of the workshop is to provide those managers and staff responsible for digital assets the practical means to exercise stewardship in an age of technological change. Many institutions struggle with the initial stages of developing digital preservation policies, and the workshop aids participants in understanding the fundamental pieces of how to think about and enact planning for organizations.

[Snip]

The next five-day workshops will be held October 11-16, 2009 at the University of Michigan – where Martha Anderson, director of program management for the National Digital Information Infrastructure and Preservation Program, will be the keynote speaker – and June 13-18, 2010 at MIT in Cambridge, Massachusetts. For more information about the workshops, please visit: www.icpsr.umich.edu/dpm/workshops/fiveday.html.

Source: National Digital Information Infrastructure and Preservation Program / Library of Congress

NDIIPP Conducts Two Day Workshop on Preserving Digital News

Friday, October 2nd, 2009

From the Post:

The Internet has impacted news and journalism more than almost any other category of information. Newspapers have always been important research resources for users of libraries, archives and historical societies. But significant events are now reported in new ways, such as through blogs, podcasts, social-networking services, online news aggregators and multimedia web content. To address this change, the National Digital Information Infrastructure and Preservation Program convened a two-day workshop to discuss a national strategy for collecting and preserving news content that is disseminated only in digital form.

The meeting on September 2-3, 2009, brought together over fifty invited specialists in the field: creators, distributors, archivists, and researchers who depend upon historical news. The topics for discussion included the following:

+ What is digital news? Who produces it? What forms does it take?
+ What is important to preserve for the nation?
+ What collaborative efforts for preservation are succeeding now?
+ What are the roles for content owners and public archives in preserving digital news?
+ What roles do “local” and “national” content and organizations serve?
+ What are some strategies and possible models for addressing the issues in a distributed way?

A number of lively conversations among the diverse participants prompted several innovative solutions, including blogs that self-archive and newspapers that opt-in to public institution web archiving. Case studies and analyses of how historical news is consumed and used, especially with regard to dynamic and multi-media content, were suggested. Local news blogs were deemed an important area to monitor as they seem particularly at-risk and ripe for a distributed solution.

A Bit More in the Complete Article

Source: National Digital Information Infrastructure and Preservation Program (NDIIPP) / Library of Congress

The October 2009 Issue of the Digital Preservation Newsletter is Now Online from the NDIIPP and Library of Congress

Friday, October 2nd, 2009

Access the Complete Issue (2 pages; PDF)

This Issue Includes:

+ News of 2009 Best Practices Exchange and the Preserving Digital News meeting

+ An article about a Digital Preservation Workshop held at the Library of Congress

+ The Netherlands Coalition for Digital Preservation sponsored a national conference and published an interim report

+ Government Computer News recognizes NDIIPP among the best of Federal information technology initiatives of 2009

+ New guidelines for content categories and digitization objectives published by the Federal Agencies Digitization Guidelines Initiative

+ An interview podcast about the DuraSpace pilot project is available from Federal News Radio

Source: National Digital Information Infrastructure and Preservation Program (NDIIPP) / Library of Congress