The Library of Congress and its 700 Terabytes of Data (and Growing Fast)

From the Article:

So far, the library has a total of 700 terabytes of data [50 million individual files]. But because of copyright issues, only 200 of those [terabytes] are available on the Web.
[Snip]
At the Library of Congress, the numbers can be mind-boggling. Experts estimate they have more than 120 million books, 36,000 feature films, hundreds of thousands of music sheets and recordings, and the large collections of manuscripts, Web sites, posters and photography. Yet only one percent of it has been digitized.
[Snip]
Most of the library’s digital collection is for preservation reasons. But it is the one percent of the collection that has been digitized for the web that serves most of its customers: 85 million a year.
[Snip]
The collection of around 65 million manuscripts holds some of the most treasured documents at the library, from presidential papers to original poems.
[Snip]
More than five million maps are being digitized.
[Snip]
…nearly one and a half million photos have been posted on the web.

Jane Mandelbaum, Thomas Youkel, James Hutson, and Colleen Cahill from LC are all quoted in the article.

Source: Voice of America
