Archive for the ‘Info Management and Retrieval’ Category

Digital Archives: Siegfried Sassoon Collection Added to First World War Poetry Digital Archive

Thursday, November 12th, 2009

We first posted about The First World War Poetry Digital Archive from Oxford University in September.

The First World War Poetry Digital Archive is an online repository of over 4000 items of text, images, audio, and video for teaching, learning, and research.

The heart of the archive consists of collections of highly valued primary material from major poets of the period, including Wilfred Owen, Isaac Rosenberg, Robert Graves, Vera Brittain, and Edward Thomas. This is supplemented by a comprehensive range of multimedia artefacts from the Imperial War Museum, a separate archive of over 6,500 items contributed by the general public, and a set of specially developed educational resources.

Yesterday, the Siegfried Sassoon Collection was added to the archive.

Jennifer Howard from The Wired Campus writes:

Although it contains photographs and other materials, the collection centers on manuscripts of Sassoon’s poems, drawn from holdings at Oxford’s Bodleian Library and at the University of Cambridge, the New York Public Library, and the Harry Ransom Center at the University of Texas at Austin. A draft of Sassoon’s poem “Standing With the Dead” turns up in a June 19, 1918, letter to his friend Robert Nichols.

A New Digital Collection: A Calm Voice in a Strident World: Senator J.W. Fulbright Speaks

Wednesday, November 11th, 2009

This new digital collection comes from the University of Arkansas Libraries.

From the About Page:

For three decades following World War II, J.W. Fulbright represented Arkansas in the Congress of the United States. His single term in the House and four terms in the Senate saw Fulbright rise to become the foremost congressional authority on American foreign policy. From the beginning, Fulbright was a voice of calmness in the halls of congress, counseling international cooperation, the exchange of information, and support for the United Nations.

This digital collection contains fifty speeches Fulbright made during his congressional career. While the speeches deal with many topics, the emphasis is given to foreign affairs.

In order to put the speeches into their historical context, a variety of resources are included—including a detailed time line, a bibliography on the senator, and a selection of photographs.

Readers are reminded that these 50 speeches and related materials included on this site represent only a tiny fraction of the J.W. Fulbright Papers, comprised of over 1400 linear feet, held by the University of Arkansas Libraries. Serious students of Senator Fulbright and his era are urged to consult the full collection at the University. The collection, and a partial guide to the collection, may be accessed [here].

Access: A Calm Voice in a Strident World: Senator J.W. Fulbright Speaks

See Also: More Digital Collections from the University of Arkansas

More in this News Release from the U. of Arkansas

Source: U. of Arkansas Libraries

UK: Making Research Easier to Find and Access

Tuesday, November 10th, 2009

From the Announcement:

A JISC funded study is making recommendations to help people find university research outputs through better integration of library catalogues, research repositories and other university systems.

The JISC-funded ‘Online catalogue and repository interoperability study’ carried out by the Centre for Digital Library Research at the University of Strathclyde suggests that [our emphasis] although there is overlap between the types of information resources recorded in library catalogues and repositories, these overlaps are rarely apparent to the information seeker. This is because both types of system need to be searched separately as there is no interlinking.

Barriers between systems arise not only for technical reasons but also because they are often based in different departments of the university.

Practical advice for universities looking to make improvements in this area includes:

+ Improving co-ordination between the departments responsible for institutional information systems to reduce duplication of effort and increase the efficiency of workflows

+ Making it clear to the information seeker what types of information the library catalogue and the digital repository each cover

+ Describing the same types of resources consistently in the library catalogue and digital repository

+ Improving the consistency and quality of subject descriptors, classification and author naming in digital repositories and using the same standards for these as the library catalogue as far as possible
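
To make the last recommendation concrete, here is a minimal sketch of how consistent author naming could let one query reach both a library catalogue and a repository. It is not taken from the JISC study; the `Record` type, the "catalogue" and "repository" labels, and the `normalize_author` and `search_all` helpers are all hypothetical names invented for illustration, and the "surname, initial" key is a deliberately crude assumption.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str  # hypothetical label: "catalogue" or "repository"
    title: str
    author: str


def normalize_author(name: str) -> str:
    """Reduce an author name to a crude 'surname, initial' comparison key."""
    # Strip accents and lowercase so "Müller" and "Muller" compare equal.
    plain = unicodedata.normalize("NFKD", name)
    plain = plain.encode("ascii", "ignore").decode("ascii").lower()
    plain = plain.replace(".", " ")
    if "," in plain:
        # "Smith, John" style: surname comes first.
        surname, _, given = plain.partition(",")
    else:
        # "John Smith" style: assume the last word is the surname.
        parts = plain.split()
        surname = parts[-1] if parts else ""
        given = " ".join(parts[:-1])
    given = given.strip()
    initial = given[0] if given else ""
    return f"{surname.strip()}, {initial}".strip(", ")


def search_all(records, author_query):
    """Return records from every source whose author key matches the query."""
    key = normalize_author(author_query)
    return [r for r in records if normalize_author(r.author) == key]
```

With records such as `Record("catalogue", "Digital Libraries", "Smith, John")` and `Record("repository", "Preprint on Metadata", "J. Smith")`, a single call to `search_all(records, "John Smith")` returns both, which is the kind of overlap the study says is rarely apparent to the information seeker. A real system would need proper authority control rather than this simple key, since different authors can share a surname and initial.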

Much More in the Complete Announcement

The complete report, Online catalogue and repository interoperability study, is available here.

Source: JISC

See Also: Personalisation Allows Researchers to Create Online Bibliographies
A new interface to Copac is available for all to use.

[It’s] a freely-available, merged online library catalogue that allows you to easily search the UK’s national library catalogues as well as many major academic and specialist libraries at the same time.

Princeton Theological Seminary Library Partnering with Internet Archive to Digitize Content

Tuesday, November 10th, 2009

From the Web Site:

The Princeton Seminary library is delighted to be working in partnership with the Internet Archive, a 501(c)3 organization in San Francisco, to digitize a selection of out-of-copyright books from its collections. Approximately 21,000 items have been scanned to date. These books include historical sources about Princeton and Princeton Seminary, early editions of John Calvin in Latin and English, illustrated works on early Protestant missions, and classic biblical commentaries, among much else. All of these books are available to alumni/ae without restriction.

The URL of the Internet Archive is http://www.archive.org/details/Princeton.

Note: The article includes a picture of a person scanning materials into the database using the IA’s scanning technology.

Source: Internet Archive / Princeton Theological Seminary Library

Digital Preservation: Two New Publishers Join CLOCKSS

Monday, November 9th, 2009

From the Announcement:

CLOCKSS is pleased to announce that two new society publishers have recently joined the CLOCKSS archive. The Royal Society of Chemistry and the Royal Society have signed agreements this fall to join CLOCKSS and preserve their materials in the CLOCKSS network of geographically and geopolitically distributed archive nodes. CLOCKSS (Controlled Lots of Copies Keep Stuff Safe) is a community-governed, not-for-profit archive founded by librarians and publishers to ensure the long-term availability of scholarly digital content.

As part of joining CLOCKSS, the two societies agree to release their archived content to the world for free if a time comes when it is no longer available from any publisher (“trigger event”).

Access the Complete Announcement

Source: CLOCKSS

Personalisation Allows Researchers to Create Online Bibliographies

Monday, November 9th, 2009

From the Article:

A new interface is now available for Copac, a freely-available, merged online library catalogue that allows you to easily search the UK’s national library catalogues as well as many major academic and specialist libraries at the same time.

Users include researchers and librarians from the UK and abroad, who use Copac as a resource discovery tool to find rare and specialist materials in all formats. Copac enables users to search for books, journals, electronic resources and multimedia materials and find out where they are held across the UK.

Copac offers search and export facilities, with tables of contents displayed for books and journals (where available). If you choose to log in to Copac, you can now enjoy a new range of personalised facilities, including ‘My References’, an online bibliography that you can develop over time.

Access Copac

Access the Complete Article

Source: Research Information

Parties Involved Ask for a Brief Extension Before Google Book Search Settlement 2.0 Released

Monday, November 9th, 2009

A new version of the Google Book Search settlement was expected to be released today. However, as the NY Times reports, those involved have asked for more time. As of today, “Settlement 2.0” will be released this Friday.

The parties to the Google book settlement, which would legalize the creation of a vast library of digital books, have asked the judge overseeing a revision of the agreement for an extension to this Friday, Nov. 13.

The parties told Judge Denny Chin of the Federal District Court of the Southern District of New York that they would submit a revised settlement for the court’s preliminary approval by Nov. 9.

But on Monday, the parties submitted a letter to the court requesting an extension to Nov. 13. In the letter, the group indicated that it had met with the Justice Department before and after the October status hearing and had met as recently as Friday, Nov. 6.

Source: Media Decoder Blog (NY Times)

See Also: Open Book Alliance Releases Baseline Requirements for Revised Google Book Settlement Proposal

An Interview with Project Gutenberg Founder Michael Hart

Sunday, November 8th, 2009

Andrea Kobeskzo recently interviewed Michael Hart and now you can read the Q & A interview on the Project Gutenberg News Portal.

Here’s an interesting passage when Kobeskzo asks Hart about how Project Gutenberg has evolved over the years (PG began in 1971). Hart says:

Believe it or not people were still saying eBooks were never going to make it just a few years ago. Look for a quote in the Wall St. Journal: “Ebooks are never going to make it.” Before that the NY Times: look for: “twitchy” screen. However now that it’s obvious they are moving eBooks on their own, but I can’t tell how serious they are. They may just be following the rule of simple reporting: “Follow The Money.” If eBooks fall flat will they all just move on and pretend there was never any interest?

The first goal of PG was just to prove eBooks feasible. My own estimations were that it would take about 10,000, and that seems to have proved correct as Google called me in to advise them ASAP after we hit 10,000, and we went to do just that on December 14, 2003: and they announced they had invented eBooks and eLibraries December 14, 2004. However, they did the opposite, or rather exact opposite of what I said they should do and look what happened. Most of the big legal fray is because they were more money oriented, and as such may have intentionally played the copyright cards that got them in the big legal hassles. If they had started out by emphasizing the public domain it probably would have worked out a lot better for them in the press as the good will they would have built up would have gone a long way.

Personally, I am OK with nearly any eBook format that is compact and search quote friendly.

Access the Complete Interview

GAO — National Archives: Progress and Risks in Implementing its Electronic Records Archive Initiative

Friday, November 6th, 2009

National Archives: Progress and Risks in Implementing its Electronic Records Archive Initiative (PDF; 154 KB)
From Highlights (PDF; 45 KB):

NARA has completed two of five planned increments of ERA, but has experienced schedule delays and cost overruns, and several functions planned for the system’s initial release were deferred. Although NARA initially planned for the system to be capable of ingesting federal and presidential records in September 2007, the two system increments to support those records did not achieve initial operating capability until June 2008 and December 2008, respectively. In addition, NARA reportedly spent about $80 million on the base increment, compared to its planned cost of about $60 million. Finally, a number of functions originally planned for the base increment were deferred to later increments, including the ability to delete records and to ingest redacted records. In fiscal year 2010, NARA plans to complete the third increment, which is to include new systems for Congressional records and public access, and begin work on the fourth.

Source: Government Accountability Office (David A. Powner, director, information technology management issues, before the Subcommittee on Information Policy, Census and National Archives, House Committee on Oversight and Government Reform)

Digital Preservation: ACM Will Partner with Portico and CLOCKSS for Preservation of Its Digital Library Resources

Friday, November 6th, 2009

From an Announcement:

ACM (the Association for Computing Machinery) announced today that it is providing its institutional library customers with advanced electronic archiving services to preserve their valuable electronic resources. These services, provided by Portico and CLOCKSS, address the scholarly community’s critical need for long-term solutions that assure reliable, secure, deliverable access to their burgeoning digital collection of scholarly works. ACM is offering these services to protect the vast online collection of resources in its Digital Library (DL), which are used by over 1 million computing professionals and students worldwide.

“By partnering with Portico and CLOCKSS, we are able to meet a growing demand in the library community for a trusted, reliable third-party archive, and to ensure that digital collections remain accessible to future scholars, researchers, and students,” said Scott Delman, ACM Group Publisher. “Scientific discovery and the educational process are not possible without reliable access to the accumulated scholarship of the past and secure preservation of the scholarly record, and these agreements are a clear step forward with the relationship between the ACM and the library community.”

By investing in long-term digital preservation of content, ACM’s aim is to make it easier for libraries to accelerate their transition away from print and free up resources invested in print collections in favor of new and innovative electronic products and services.

Much More After a Click

New Video on Web Archiving

Friday, November 6th, 2009

From the Description:

Web content changes all the time. If we don’t save that content before it disappears, a major part of our cultural history will be lost.

The Library of Congress is working to provide permanent access to web content of historical importance. It selects websites for collection, requests permissions from the website owners, addresses the technology of collecting websites and preserves the websites and makes them available.

This video examines those four challenges.

Access the Video (embedded here)

A text transcript is also available (PDF)

Source: National Digital Information Infrastructure and Preservation Program

Open Book Alliance Releases Baseline Requirements for Revised Google Book Settlement Proposal

Friday, November 6th, 2009

On Monday (November 9th), a revised proposed settlement (aka Settlement 2.0) from Google, the Authors Guild and the Association of American Publishers will be released. The Open Book Alliance (OBA) has posted on its web site what it calls “baseline requirements” for Settlement 2.0.

The Special Libraries Association and The New York Library Association are members of the OBA.

From the Blog Post:

The Open Book Alliance is issuing the following baseline requirements that the new settlement proposal must meet if it is to achieve those critical objectives. These requirements reflect the collective expression of concerns by the U.S. Department of Justice, authors, publishers, academics, libraries, foreign nations, state Attorneys General, consumer advocacy groups, and many others, and thus we think it appropriate to review the revised settlement within this framework.

[Snip]

+ The settlement must not grant Google an exclusive set of rights (de facto or otherwise) or result in any one entity gaining control over access to and distribution of the world’s largest digital database of books.

+ Authors and other rights holders must retain meaningful rights and the ability to determine the use of their works that have been scanned by Google.

+ The settlement must result in the creation of a true digital library that grants all researchers and users, commercial and non-commercial, full access that guarantees the ability to innovate on the knowledge it contains.

+ All class members must be treated equitably.

+ The settlement cannot provide for competition by making others engage in future litigation.

+ Congress must retain the exclusive authority granted by the U.S. Constitution to set copyright policy.

+ All rights holders impacted by the settlement must have a meaningful ability to receive notice, understand its terms and opt-out.

+ The parties that negotiated the settlement must live under the terms to which they seek to bind others, rather than their own separately negotiated arrangements.

Access the Complete Blog Post

Source: Open Book Alliance

See Also: Press Review: Judge Chin Sets Nov. 9 Deadline For Revised Google Book Settlement (via ResourceShelf, October 7, 2009)

New Research Paper from Stanford InfoLab: A Dynamic Navigation Guide for Webpages

Thursday, November 5th, 2009

Ed. Note: One thing that we used to do more of on ResourceShelf was to occasionally link to new and hopefully interesting research papers that we came across. Granted, the papers could sometimes get very technical (even for the editors), but those readers who could read the technical content appreciated the material, while non-techies could get a good idea about the research by reading the abstract and usually the first several paragraphs of the paper. So, let’s restart this feature with a new paper from the InfoLab at Stanford University.

A Dynamic Navigation Guide for Webpages (4 pages; PDF)
by Jawed Karim, Ioannis Antonellis, Varun Ganapathi, and Hector Garcia-Molina
Note: This version of the paper has been submitted for publication

Navigating websites is often a frustrating process: Website visitors, despite their widely varying and individual information-seeking needs, must contend with static, general-purpose link structures that have been set in place by website owners. Because many visitors tend to browse for the same content, they are individually repeating the same navigation activity. Visitors would benefit from being able to take advantage of the collective search and discovery work that has already been performed by other visitors. Although many attempts have been made to improve website navigation by tapping into the “wisdom of the crowds”, the currently available approaches suffer from maintenance, usability, and user interface integration issues. We present a navigation guide for websites that provides visitors with helpful suggestions based on their browsing activity and the browsing activity of prior, similar visitors. Our navigation guide does not require any downloads, can be easily added to websites by website owners, and automatically remains up-to-date.
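
For readers who want a feel for the general idea, here is a minimal sketch of crowd-based navigation suggestions. It is not the paper's method: it only counts which page prior visitors opened next and ignores the "similar visitors" matching the abstract mentions. The function names `build_transitions` and `suggest`, and the example paths, are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count how often visitors moved from one page directly to another."""
    transitions = defaultdict(Counter)
    for pages in sessions:
        # Pair each page with the page the same visitor opened next.
        for current, following in zip(pages, pages[1:]):
            transitions[current][following] += 1
    return transitions


def suggest(transitions, current_page, visited=(), limit=3):
    """Suggest the pages prior visitors most often opened after current_page."""
    # Do not suggest the current page or pages this visitor has already seen.
    seen = set(visited) | {current_page}
    ranked = transitions.get(current_page, Counter()).most_common()
    return [page for page, _ in ranked if page not in seen][:limit]


# Hypothetical click paths recorded from earlier visitors.
sessions = [
    ["/", "/courses", "/apply"],
    ["/", "/courses", "/fees"],
    ["/", "/news"],
    ["/", "/courses", "/apply"],
]
transitions = build_transitions(sessions)
print(suggest(transitions, "/courses"))
```

Because the counts are rebuilt from whatever sessions are recorded, suggestions of this kind track visitor behavior without a site owner editing links by hand, which is in the spirit of the "automatically remains up-to-date" claim in the abstract.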

Sections of the Paper Include:

+ Introduction
+ Current Methods
+ The Wisdom of Crowds
+ A Dynamic Navigation Guide
+ How it Works
+ Related Work
+ Conclusion and Future Work

Source: Stanford InfoLab

Webcast: Preserving OSTI’s Printed Archive

Thursday, November 5th, 2009

Webcast: Preserving OSTI’s Printed Archive
A three-minute video from the Office of Scientific and Technical Information at the U.S. Department of Energy.

Here’s the Blurb:

The American public has invested billions of dollars in the atomic energy and subsequent related programs. This investment has mostly been in the form of the printed page. OSTI’s historical preservation is described.

Direct to “Printed Archive” Video (via YouTube)

Direct to OSTI YouTube Channel

Direct to OSTI Home Page

While print preservation is essential, OSTI is home to many free online databases including:

+ Science Accelerator
+ Science.gov (Content from Many Government Databases, Search Technology from OSTI)
+ WorldWideScience (Global in Scope)
+ Information Bridge: DOE Scientific and Technical Information (Includes over 210K Full Text Documents)
+ DOE Data Explorer
+ Energy Citations Database
+ E-print Network
+ Several Others Linked on the OSTI Home Page

Source: OSTI

Internet Archive Founder Brewster Kahle Profiled in Forbes

Wednesday, November 4th, 2009

Brewster Kahle has many titles. These days he’s best known as founder of the Internet Archive (home of The Wayback Machine) and founding member of the Open Content Alliance.

From the Article:

“We have to have universal access to everything, just like a library,” he says. “Do we want that under a single corporation’s control? It is openness, not corporate control, that propels capitalism.”

[Snip]

Digital libraries will shape education, creativity and our shared intellectual heritage, Kahle declares. As founder and director of the Internet Archive, Kahle has posted online digital copies of 1.7 million books, 100,000 hours of television, 200,000 video clips, 70,000 concerts and 415,000 audio recordings. All that material can be downloaded for free from the Archive’s Web site.

[Snip]

Bookserver* uses a range of open source and proprietary electronic book standards, search algorithms, editing tools and libraries. The architecture, as Kahle calls it, potentially separates manufacturers of devices from control over much of the content inside them. It also preserves the idea of the lending library–if you “check out” a volume, others cannot access it in the time allowed to you. Publishers sell their books in the system using credit cards.

The article continues with more about Google Book Search and Kahle’s background.

We were surprised not to see The Wayback Machine mentioned in the stats about the Internet Archive listed above. At the moment (and we know of nothing coming), “Wayback” is probably the best chance a researcher has to access a page no longer on the Internet. Material in “Wayback” dates back to 1996 and, as of today, it contains more than 150 BILLION archived pages. The Internet Archive also offers a fee-based service that helps organizations organize and archive their web content. It’s called Archive-It.

* See Also: We Have an In-Depth Post About Bookserver on ResourceShelf
It includes a comprehensive press review from the day after the Bookserver announcement.

Source: Forbes

Bibliotheca Alexandrina: A Digital Revival

Tuesday, November 3rd, 2009

The Bibliotheca Alexandrina is one busy place. If you want to learn more, read on through our highlights, but make sure to read the complete article. Our highlights are just a sample of what’s going on.

From the Article:

The International School of Information Science (ISIS) a research institute affiliated with the BA [Bibliotheca Alexandrina], aims at furthering the BA’s goals of being a leading institution in knowledge dissemination and, specifically, promoting research and development related to the digital libraries. Toward that goal, ISIS has embarked on an array of ambitious projects, in partnership with world-class institutions. These include hosting a mirror site for the Internet Archive, participating in the Million Book Project, organizing the digital archive of the Gamal Abdel Nasser collection, digitizing 113 years of Al-Hilal magazine, presenting the first-ever complete digital version of Description de l’Egypte, conducting advanced research such as the Arabic component of the UN-sponsored Universal Networking Language computerized multi-language translation program, and offering the most advanced 3D virtual imaging techniques in a virtual immersive environment for science and technology applications. Thus, despite being barely seven years in existence, the BA already has a substantial record of achievements.

Among the other projects you’ll read about are:

+ The Digital Assets Repository (DAR)

+ Memory of Modern Egypt Digital Repository

+ Archive documenting the history of the Suez Canal

+ SuperCourse

To empower science educators worldwide, the BA is working with a team of specialists, in partnership with the University of Pittsburgh, to launch the first science SuperCourse, comprising thousands of PowerPoint lectures made available for free to teachers and lecturers, who can use the lectures as they see fit in their teaching of science. The SuperCourse has been effectively implemented in the area of Public Health and Epidemiology, with a network of 65,000 scientists in 174 countries, providing more than 3,500 lectures in 31 languages. The BA maintains a mirror site of SuperCourse, which receives an average of one million hits per month, and is working on setting up a similar course in all fields of science.

Much More in the Complete Article

Source: EDUCAUSE Review
Hat Tip: OAN

More Digitization Underway: This Time Footnote.com is Digitizing the U.S. Census from 1790-1930

Tuesday, November 3rd, 2009

Footnote.com is once again partnering with the National Archives and Records Administration (NARA) to digitize massive amounts of content and then make that material available online, often for a fee. Footnote is becoming, and for some has already become, an important resource for historians, genealogists, students, and others.

This time around, Footnote.com is digitizing all publicly available Census materials from 1790-1930. These dates represent the period when all materials (including names) from a given census have been made publicly available. Through its partnership with NARA, Footnote.com will add more than 9.5 million pages of content when the census database project is complete. We’ve learned that Footnote.com is digitizing all of this material on its own.

From a Footnote.com Blog Post:

With over 60 million historical records already online, Footnote.com will use the U.S. Census records to tie content together, creating a pathway to discover additional records that previously have been difficult to find.

The Interactive Census Project Home Page offers much more detail and examples. You can also create email alerts for when new states are added to the census database. On the lower-left side of the page you can track how much of each census has been digitized. As you’ll see, the 1860 census is complete and the 1930 census is just about done.

Searching is free, and Footnote provides numerous options to refine your search (here’s an example). Access to the complete record is fee-based, through either an annual or a monthly subscription to the database. You can also buy individual documents for $2.95. Btw, Footnote.com also sells institutional access to libraries through EBSCO.

Footnote looks at the census project as a “highway” to assist the researcher in finding more information in other databases.

If you’ve been reading ResourceShelf for a while, you’ve seen an increasing number of mentions of Footnote’s services. Here’s a list of a few of them:

+ In August of 2009, we posted on the release of a joint project with the National Archives (NARA) to digitize Holocaust material.

+ In December of 2008, in a partnership with NARA, Footnote released the largest interactive World War II collection online.

+ In March of 2008, we posted about Footnote.com offering an interactive version of the Vietnam Wall.

Our first post about Footnote dates back to January, 2007.

If you run this search using the ResourceShelf database, you’ll be able to see and read all of our Footnote.com posts.

But wait, there’s more. A quick review of the Footnote “press room” offers up even more projects. You can learn about them here.