Archive for the ‘Preservation/Conservation’ Category

A Look at the Major League Baseball Video Library Film Archive

Thursday, October 8th, 2009

If you’re a baseball fan, this is a “must read.”

From the Article:

No American sport has a past as deep and cherished as baseball’s. But precious little of the sport’s history is preserved in moving images. Much occurred before the television age, leaving only grainy, scattershot clips culled from newsreels and home movies — and rarely does it show a player of [Babe] Ruth’s stature.

The newly arrived Ruth film is part of the video collection of Major League Baseball Productions, the league’s official archivist, which spans more than 100 years and includes about 150,000 hours of moving images. Most of the collection is stored in plastic cases that line metal shelves of a room labeled “Major League Baseball Film and Video Archive.” The overflow rests in storage a few miles away, in Fort Lee, N.J.

The article goes on to describe how Frank Caputo, manager of the MLB Network video library film archive, and Joe Porciello research a newly discovered 8-millimeter clip (it was found by a New Hampshire man in his grandfather’s home movie collection).

Source: The New York Times

See Also: Just in Time for the Major League Playoffs and World Series: Baseball Resources at the Library of Congress Web Guide

On Google and Usenet

Wednesday, October 7th, 2009

The article begins with one paragraph about Google Book Search, but the story actually focuses on the Usenet archive (Google Groups).

From the Article by Kevin Poulsen:

…a few geeks with long memories remember the last time Google assembled a giant library that promised to rescue orphaned content for future generations. And the tattered remnants of that online archive are a cautionary tale in what happens when Google simply loses interest.

That library is Usenet, a vast internet- and dial-up-based message board system erected in 1980. Though moribund today, for decades Usenet was the paper of record for the online world, and its hundreds of millions of “newsgroup” postings chronicle everything from the birth of the web to the rise of Microsoft, as well as more trivial matters.

In February 2001, Google rescued that history when it acquired the New York-based Deja.com, and with it a Usenet archive going back to 1995. It turned the archive into Google Groups, in a move that was cheered by net geeks who had seen Deja’s reliability declining, and were certain that the supremely competent Google would save it.

[Snip]

Flash forward nearly eight years, and visiting Google Groups is like touring ancient ruins.

[Snip]

Searching within a newsgroup, even one with thousands of posts, produces no results at all. Confining a search to a range of dates also fails silently, bulldozing the most obvious path to exploring an archive.

[Snip]

“The search results are extremely poor,” says network pioneer Brad Templeton. “Like nobody cares.”

Henry Spencer, whose Usenet archive forms much of Google Groups, is troubled by the company’s curatorship. “Google does get a lot of credit for putting it together and making it available,” Spencer says. “But search capabilities are important for such a large collection of data. The archive’s value to the community is considerably reduced if it’s not conveniently searchable.”

Source: Wired

Legal Delays Have Blown a Hole in UK’s Digital Heritage

Monday, October 5th, 2009

From the Article:

Digital literature, online scientific research and internet journalism that should have been saved in the nation’s main libraries over the past five years may have been lost because ministers have failed to give them the legal power to copy and archive websites, the Guardian has learned.

Senior executives at the British Library and the National Library of Scotland (NLS) are dismayed that legislation giving them the right to collect online and digital material is still not in force, more than six years after it was passed by parliament.

The omission has meant the libraries – which are legally required to archive books, newspapers and journals – have failed to record online coverage of major events such as the Iraq and Afghanistan wars, the release of the Lockerbie bomber and the MPs’ expenses scandal.

[Snip]

Phil Spence, head of operations at the British Library, said the failure had left a major “digital black hole” in the library’s collections, with huge gaps in the archives for researchers, scientists and historians.

It meant the British Library was unable to store the BBC’s website, the National Gallery or British Museum website, any UK newspapers’ websites, or scientific journals published online because of copyright issues. Blogs, community pages, government and business websites can only be archived after laborious voluntary agreements. The act would protect the libraries against copying defamatory material, but would also protect a publisher’s copyright.

“We’ve lost five years of digital content which is gone potentially for ever, and the ability of the nation to capitalise on that as well,” he said.

Much More in the Full Text Article Including a 3.5 Minute Audio Report

Source: The Guardian

Report From Digital Preservation Workshop Held in DC

Monday, October 5th, 2009

From the Report:

Over twenty Library of Congress staff had an opportunity to participate in a special workshop, Digital Preservation Management: Implementing Short-term Strategies for Long-term Problems, hosted by the Inter-university Consortium for Political and Social Research, held September 21-22, 2009 in Washington, DC.

Initially developed at the Cornell University Library and supported with funding from the National Endowment for the Humanities, the Digital Preservation Management workshops are structured curricula geared toward managing digital preservation planning and policies for libraries, archives, and other cultural heritage institutions. The goal of the workshop is to provide those managers and staff responsible for digital assets the practical means to exercise stewardship in an age of technological change. Many institutions struggle with the initial stages of developing digital preservation policies, and the workshop aids participants in understanding the fundamental pieces of how to think about and enact planning for organizations.

[Snip]

The next five-day workshops will be held October 11-16, 2009 at the University of Michigan – where Martha Anderson, director of program management for the National Digital Information Infrastructure and Preservation Program, will be the keynote speaker – and June 13-18, 2010 at MIT in Cambridge, Massachusetts. For more information about the workshops, please visit: www.icpsr.umich.edu/dpm/workshops/fiveday.html.

Source: National Digital Information Infrastructure and Preservation Program / Library of Congress

NDIIPP Conducts Two Day Workshop on Preserving Digital News

Friday, October 2nd, 2009

From the Post:

The Internet has impacted news and journalism more than almost any other category of information. Newspapers have always been important research resources for users of libraries, archives and historical societies. But significant events are now reported in new ways, such as through blogs, podcasts, social-networking services, online news aggregators and multimedia web content. To address this change, the National Digital Information Infrastructure and Preservation Program convened a two-day workshop to discuss a national strategy for collecting and preserving news content that is disseminated only in digital form.

The meeting on September 2-3, 2009, brought together over fifty invited specialists in the field: creators, distributors, archivists, and researchers who depend upon historical news. The topics for discussion included the following:

+ What is digital news? Who produces it? What forms does it take?
+ What is important to preserve for the nation?
+ What collaborative efforts for preservation are succeeding now?
+ What are the roles for content owners and public archives in preserving digital news?
+ What roles do “local” and “national” content and organizations serve?
+ What are some strategies and possible models for addressing the issues in a distributed way?

A number of lively conversations among the diverse participants prompted several innovative solutions, including blogs that self-archive and newspapers that opt-in to public institution web archiving. Case studies and analyses of how historical news is consumed and used, especially with regard to dynamic and multi-media content, were suggested. Local news blogs were deemed an important area to monitor as they seem particularly at-risk and ripe for a distributed solution.

A Bit More in the Complete Article

Source: National Digital Information Infrastructure and Preservation Program (NDIIPP) / Library of Congress

The October, 2009 Issue of the Digital Preservation Newsletter is Now Online from the NDIIPP and Library of Congress

Friday, October 2nd, 2009

Access the Complete Issue (2 pages; PDF)

This Issue Includes:

+ News of 2009 Best Practices Exchange and the Preserving Digital News meeting

+ An article about a Digital Preservation Workshop held at the Library of Congress

+ The Netherlands Coalition for Digital Preservation sponsored a national conference and published an interim report

+ Government Computer News recognizes NDIIPP among the best of Federal information technology initiatives of 2009

+ New guidelines for content categories and digitization objectives published by the Federal Agencies Digitization Guidelines Initiative

+ An interview podcast about the DuraSpace pilot project is available from Federal News Radio

Source: National Digital Information Infrastructure and Preservation Program (NDIIPP) / Library of Congress

Study of Media at Indiana University Bloomington Reveals Critical Preservation Needs

Thursday, October 1st, 2009

You have to wonder how many other universities, of all shapes and sizes, are in the same shape as IU (or worse).

From the Announcement:

Indiana University Bloomington holds more than 560,000 audio and video recordings and film reels, many of which are historically significant, all of which are actively deteriorating. And the window of time to save these materials is closing fast; most archivists agree that such audio and video materials could be lost forever in 20 years or less.

That’s the urgent conclusion of the just-released IU Bloomington Media Preservation Survey, a comprehensive study produced by a task force of archival experts drawn from around the campus.

[Snip]

The final report presents a detailed look at the characteristics and condition of audio, video, and film media on the campus, including numbers of holdings, general condition, and preservation risks. (This survey focused on one class of media and did not include photographs or other physical objects in special collections.) Among its major findings, the report reveals that IU Bloomington:

* Has media holdings dating back to wax cylinder recordings of Native Americans made in the early 1890s
* Holds an estimated 154,136 unique (one-of-a-kind) items
* Holds an estimated 94,993 rare items
* Holds a larger and more diverse film collection than almost any other U.S. university
* Has at least 180,000 items that are at high or very high risk for loss of content

“Large portions of IUB holdings are seriously endangered due to inadequate storage, degradation of media, and format obsolescence,” says [Mike] Casey [associate director for recording services at the Archives of Traditional Music] in his introduction to the survey report. “Some media preservation efforts on campus exist, but none are sustainable, and none are at a scale or pace that will allow them to preserve more than a tiny fraction of their holdings before it is too late.”

Google Book Settlement 1.0 Is History

Thursday, September 24th, 2009

A new article by Pamela Samuelson, a UC Berkeley law professor who also holds an appointment at the School of Information. She has written several articles for The Huffington Post on the Google Book Settlement; they are linked at the bottom of this post.

From the New Article:

A memorandum submitted in support of the postponement optimistically observes that the DOJ had recognized that “a properly structured settlement agreement in this case offers the potential for important societal benefits” and that DOJ was committed to “working with the parties constructively to address concerns raised by the United States.” It thus appears that DOJ will be actively participating in negotiations for a new settlement.

This is, however, a pollyannish view of the situation. The GBS deal can’t be fixed by tweaking a few details. Reading through even a sampling of the hundreds of objections to the proposed settlement, one sees an amazingly diverse configuration of opponents and a vast array of problems that cannot be remedied by minor fixes.

Much Much More in the Complete Article

Source: The Huffington Post

See Also: Other Articles by Pamela Samuelson Appearing in The Huffington Post

+ DOJ Says No to Google Book Settlement (9/20/2009)

+ Why is the Antitrust Division Investigating the Google Book Search Settlement? (8/19/2009)

+ The Audacity of the Google Book Search Settlement (8/10/2009)

See Also: Press Review: Google Book Settlement Hearing Postponed

See Also: Press Review: U.S. Department of Justice Would Like to See Changes to Google Book Settlement

See Also: Press Round-Up: UC Berkeley Conference Regarding Google Book Search Settlement

Cloud Computing and Digital Preservation on Federal News Radio

Wednesday, September 23rd, 2009

From the Text:

The Library of Congress has a mission that is very similar to several Federal agencies…they are preserving huge amounts of records. And like Federal agencies, they are looking at new technologies to meet that mission. One way they’re doing that is through a pilot project with DuraSpace, that will store some records in the cloud. Bill LeFurgy is the Digital initiative project coordinator at the Library of Congress, and he told me how the pilot project will work.

Listen Online or Download (mp3) the Audio of the Interview. It runs about 14 minutes.

Source: Federal News Radio

Newspaper Digitization: Chronicling America Keeps Growing, 192,000 Pages from AZ, OH, PA, WA Added to Collection

Friday, September 18th, 2009

From the Announcement:

On Sept. 17, the National Digital Newspaper Program added more than 192,000 historic newspaper pages to the Chronicling America Web site, hosted by the Library of Congress. The site now provides free and open access to 1,442,000 pages from 171 titles, that were published between 1880 and 1922 in 15 states and the District of Columbia. This most recent update expands date coverage for many titles already represented in the site and includes content from 4 new states–Arizona, Ohio, Pennsylvania and Washington.

In addition to new content, the site also now includes links to other ways to use the searchable newspapers available in Chronicling America, including:
- links to Topic guides for events and subjects found in Chronicling America,
- links to use of Chronicling America in the LC Flickr photostream,
- and detailed documentation of the Chronicling America API.

Chronicling America passed the one million digitized page mark in June.

See Also: Digitization: Chronicling America Illustrated Newspaper Pages from 1906 Added to LC Flickr Photostream (9/2009)

See Also: Now Available: Webcast: One Millionth Page in Chronicling America (8/2009)

Source: National Endowment for the Humanities, Library of Congress

Webcast Archive Available: The Preservation Function in Research Libraries

Wednesday, September 16th, 2009

Viewing the video is free, but you do need to register.

The event took place on September 15, 2009.

From the Announcement:

Grounded in the recently released report by Lars Meyer, “Safeguarding Collections at the Dawn of the 21st Century: Describing Roles & Measuring Contemporary Preservation Activities in ARL Libraries,” the webcast offered a brief overview of Meyer’s key findings about how research libraries are working to ensure ongoing access to collections in all formats.

In addition, the webcast showcased comments from two reactors to the report: James Neal, Vice President for Information Services and University Librarian, Columbia University Libraries, offered his perspectives on community level preservation challenges; and Deborah Jakubs, Rita DiGiallonardo Holloway University Librarian and Vice Provost for Library Affairs, Duke University Libraries, discussed aligning preservation activities with institutional and inter-institutional concerns. Participants had the opportunity to ask the panelists questions at the end of the session.

Source: Association of Research Libraries

Ohio Group to Mark 25 Years of Protecting State History

Monday, September 14th, 2009

From the Article:

As preservation manager for the College of Wooster, [Sue] Dunlap spends most of her day repairing library books to make sure they remain on the shelves for years to come.

But she doesn’t stop there. As chairwoman of the Ohio Preservation Council, she leads a group of conservationists dedicated to maintaining the state’s historical resources, including books, artifacts and documents.

On Thursday in Columbus, the council will celebrate its 25th anniversary with a symposium about paper.

[Snip]

Members of the council bring a variety of experience to the table, Hayes said. Some, such as Dunlap, work mainly with books, and others work with textiles, artwork or documents.

Regardless of expertise, all members are dedicated to preserving original artifacts, she said.

“If we can digitize something, that’s great, but if we can, we want to preserve the original,” Hayes said. “We don’t ever know how long the current digital format will last.”

[Snip]

“We have to take care of our paper records,” said Ed Vermue, a member of the council who helped plan the symposium.

Source: The Columbus Dispatch

Digitization: Chronicling America Illustrated Newspaper Pages from 1906 Added to LC Flickr Photostream and Other Chronicling America Links

Saturday, September 12th, 2009

From the Announcement:

The Library of Congress has added another year’s worth of historic illustrated newspaper pages to the LC Flickr photostream. The New-York Tribune Illustrated Supplement section of 1906, printed on Sundays, includes published images of signature events of 1906, including: construction of the Panama Canal, 3 weeks of coverage on the San Francisco Earthquake, the Chicago meat packing industry, storm devastation in Hong Kong and Alabama and more….In Flickr, you can tag it, add a note, share it….and even read more about it!

Access the Library of Congress Flickr Stream

Access the Chronicling America Database and Directory

See Also: Milestones: Library of Congress, National Endowment for the Humanities Celebrate Millionth Page in Chronicling America Program

See Also: Now Available: Webcast: One Millionth Page in Chronicling America

See Also: New from the Library of Congress: Chronicling America Topic Guides

See Also: Library of Congress Flickr Stream Adds European Images

Source: LC

Now Online: September 2009 Issue of the Library of Congress Digital Preservation Newsletter

Friday, September 11th, 2009

Access Full Issue

Articles Include:

+ Profile of Digital Preservation Pioneer David Riecks

+ An article about recently published white papers on preserving digital legislative data

+ LOCKSS Chief Scientist David Rosenthal speaks at Library of Congress

+ An article about the K-12 Web Archiving Program

+ Library of Congress digital initiatives profiled in Library Journal

+ News of the 2009 SAA annual meeting and Saving Public Policy Web Content meeting

+ Upcoming Events: iPres 2009 and the Cultural Heritage Online Conference

Source: National Digital Information Infrastructure and Preservation Program at the Library of Congress

Another New Web Archiving Service: WAX from Harvard University

Tuesday, September 8th, 2009

A few weeks ago we posted about the new California Digital Library Public Web Archive Service Collections.

Today, via DigitalKoans, we learn of another web archiving service, named WAX, at Harvard University.

From the Web Site:

The public interface for Harvard’s new Web Archive Collection Service (WAX) launched on February 4, 2009. WAX began as a pilot project in July 2006, funded by the University’s Library Digital Initiative (LDI) to address the management of web sites by collection managers for long-term archiving. It was the first LDI project specifically oriented toward preserving “born-digital” material. WAX has now transitioned to a production system supported by the University Library’s central infrastructure.

Collection managers, working in the online environment, must continue to acquire the content that they have always collected physically. With blogs supplanting diaries, e-mail supplanting traditional correspondence, and HTML materials supplanting many forms of print collateral, collection managers have grown increasingly concerned about potential gaps in the documentation of our cultural heritage.

WAX was developed as an initial–and only partial–response to these and other concerns, which range from technical feasibility to legal and financial implications. The pilot focused on harvesting content from the surface web–content that is discoverable to search engines through web crawlers, as opposed to content hidden from web crawlers in a database or restricted by password or login protection.

Review the WAX Collections

Much More about WAX from DigitalKoans

Source: WAX, DigitalKoans

Note: Of course, don’t forget about The Wayback Machine from the Internet Archive (IA). It’s now home to over 150 billion archived web pages. The IA also does “custom” web archiving via their very cool Archive-It service.

Full Text: Foundation Grants for Preservation in Libraries, Archives, and Museums, 2009 Edition

Monday, September 7th, 2009

From the Web Site:

Foundation Grants for Preservation in Libraries, Archives, and Museums, 2009 Edition is a collaborative project of the Library of Congress and the Foundation Center. This publication lists 1,944 grants of $5,000 or more awarded by 488 foundations, from 2004 through the publication date of this guide. It covers grants to public, academic, research, school, and special libraries, and to archives and museums for activities related to conservation and preservation. This publication includes:

+ an introduction that explains the book’s coverage, arrangement, entries, and how to research using the volume. Note: This PDF file contains hotlinks to free online tutorials that cover grant writing and provide an insight into the world of U.S. foundation giving offered by the Foundation Center, as well as to some other widely used non-profit guidance on preservation grants found on the Conservation Online web site.

+ a statistical analysis of grant funding in the area of preservation by foundation, recipient location, subject, recipient type (e.g., Library), grant size, and foundation generosity nationwide.

+ state-by-state descriptions of projects funded in preservation nationwide including the foundation’s name, limitations on giving, recipient(s), size of grants, and purpose of the grant described. Note: This section is hot linked in the PDF version directly to more detailed descriptions of the foundations.

+ indexes by recipient, geographic area of the recipient, and subject. Note: If you do not find what you are looking for in the indices, use the find feature to search the text for your term.

+ a list of all foundations that have donated to preservation and conservation with their contact information and limitations on giving.

Access the Complete Document (125 pages; PDF)

Source: Library of Congress, Foundation Center

Web Archiving: Administration Wants Help Archiving its Facebook, Twitter Content

Wednesday, September 2nd, 2009

And yet another role for the information professional.

From the Article:

The Executive Office of the President (EOP) plans to hire a company to help archive the ever-expanding amount of data that qualifies as presidential records that the office publishes on publicly accessible Web sites and social networking sites, according to a recently published solicitation notice.

The EOP wants a contractor to capture and store content posted on the sites that the administration is required to maintain under the Presidential Records Act (PRA), the notice said. According to the request for quote (RFQ) notice, the contractor will also be responsible for transferring the captured data to the National Archives and Records Administration (NARA) for historical preservation.

[Snip]

White House officials want the company to crawl and archive PRA content on third-party Web sites where the EOP maintains a presence, such as the White House’s Facebook page and Twitter feed, the notice said. According to the RFQ, the EOP wants data capture to be automatic rather than how it’s currently done.

EOP officials want to capture the posted content at least twice a day, the notice said. In addition, they said the vendor will have to make the data organized and searchable and provide a Web-based tool that government employees can use to manage the record-keeping.

The data will also need to be stored in a way that will let NARA ingest the records into the agency’s next-generation Electronic Records Archives system.

Access the Complete Solicitation (PDF)

Source: FCW

English Language: A Future for Our Digital Memory: Permanent Access to Information in the Netherlands

Wednesday, September 2nd, 2009

From the Announcement:

A future for our digital memory: permanent access to information in the Netherlands, English-language summary, twenty-page English-language summary of the report of the Dutch National Digital Preservation Survey; Dutch report 1 July 2009.

In order to underpin its strategy, the NCDD decided to build a detailed picture of the current situation in the public sector in the Netherlands. Can institutions or domains be identified which have successfully risen to the challenge of digital preservation and permanent access? What categories of data are in danger of being lost? How can the risks be managed? The so-called National Digital Preservation Survey was funded by the Ministry of Education, Culture and Science, and was held in the first six months of 2009.

A team of three researchers conducted some seventy interviews with stakeholders in three distinct sectors: government & archives, the research community, and cultural heritage institutions.

Access the Complete English Language Report: A future for our digital memory: permanent access to information in the Netherlands, English-language summary

Source: Netherlands Coalition for Digital Preservation

A Data Deluge Swamps Science Historians

Friday, August 28th, 2009

From the Article:

In a vault beneath the British Library here, Jeremy Leighton John grapples with a formidable challenge in digital life. Dr. John, the library’s first curator of eManuscripts, is working on ways to archive the deluge of computer data swamping scientists so that future generations can authenticate today’s discoveries and better understand the people who made them.

His task is only getting harder. Scientists who collaborate via email, Google, YouTube, Flickr and Facebook are leaving fewer paper trails, while the information technologies that do document their accomplishments can be incomprehensible to other researchers and historians trying to read them. Computer-intensive experiments and the software used to analyze their output generate millions of gigabytes of data that are stored or retrieved by electronic systems that quickly become obsolete.

Source: Wall Street Journal
Hat Tip: ACM Tech News

Experts Discuss Saving Public Policy Web Content

Thursday, August 27th, 2009

From the Post:

Curators and public policy experts representing commercial, academic and non-profit organizations convened for a two-day meeting at the Library of Congress to explore strategies for preserving public policy content that has been made available only on the web.

As more and more of existing public policy content is only available on the web, the challenge of providing enduring access to, and long-term preservation of, public policy information is increasingly complicated. The Library’s National Digital Information Infrastructure and Preservation Program is exploring ideas about how to work with others to preserve this information.

NDIIPP is interested in this area as part of its work to catalyze development of a national collection of digital content through a national network of preservation partners. To date, the Program has engaged over 130 partners from the public and private sectors to work together to develop approaches and solutions for saving America’s digital heritage.

Source: NDIIPP / Library of Congress