Archive for the ‘Papers and Presentations’ Category
Wednesday, September 3rd, 2008
Extremely Fast Text Feature Extraction for Classification and Indexing
Most research in speeding up text mining involves algorithmic improvements to induction algorithms, and yet for many large scale applications, such as classifying or indexing large document repositories, the time spent extracting word features from texts can itself greatly exceed the initial training time. This paper describes a fast method for text feature extraction that folds together Unicode conversion, forced lowercasing, word boundary detection, and string hash computation. We show empirically that our integer hash features result in classifiers with equivalent statistical performance to those built using string word features, but require far less computation and less memory.
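As a rough illustration of what folding these steps together can look like, here is a minimal single-pass sketch. It is not the paper's implementation: the function name `hash_features`, the multiplier 31, and the 32-bit mask are assumptions chosen for illustration, and because Python strings are already decoded, the Unicode conversion step is not shown.

```python
def hash_features(text: str, num_bits: int = 32) -> list[int]:
    """Return one integer hash per word, computed in a single pass.

    Illustrative sketch only: lowercasing, word boundary detection, and
    hashing happen in the same loop, so no word strings are ever built.
    The multiplicative hash (multiplier 31) is an assumed stand-in.
    """
    mask = (1 << num_bits) - 1
    features = []
    h = 0
    in_word = False
    for ch in text:
        if ch.isalnum():
            # Fold lowercasing into the hash update; lower() may yield
            # more than one character for some code points.
            for c in ch.lower():
                h = (h * 31 + ord(c)) & mask
            in_word = True
        elif in_word:
            # A non-word character ends the current word: emit its hash.
            features.append(h)
            h = 0
            in_word = False
    if in_word:
        # Emit the final word if the text did not end on a boundary.
        features.append(h)
    return features


features = hash_features("Hello, hello WORLD")
```

Here the first two features are equal because both spellings of "hello" are lowercased before hashing, and the classifier would see only integers rather than word strings.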
+ Full Paper (PDF; 348 KB)
Source: HP Labs
Posted in Info Management and Retrieval, Papers and Presentations, Source File
Wednesday, August 27th, 2008
The Google Controversy — Two Years Later (PDF; 56 KB)
Two years have passed since Google startled the world with its free, online, high-resolution mapping products of the world. Foreign governments expressed their shock and concern about such detailed imagery in the hands of the general populace; their facilities and state secrets exposed to the world. “Today, with the advent of civilian satellites here and abroad, we have opened wide the window on places and events that, not so long ago, only spies could see,” writes Sharon Weinberger.
As the initial shock wore off, five main responses to the “Google threat” emerged from nations around the world: negotiations with Google, banning Google products, developing a similar product, taking evasive measures, and nonchalance. This report discusses foreign reporting and government response to the online mapping revolution after the initial brouhaha.
Source: Open Source Center (via Secrecy News)
Posted in Access to Information, Geographic, Open Source Intelligence, Papers and Presentations, Source File, Technology and Internet
Thursday, August 21st, 2008
Making Web 2.0 Accessibility Mainstream
Research into ‘Web 2.0 accessibility’ for people with disabilities has recently gained momentum in library and information science studies due to the unique problems disabled individuals face because they must rely on digitized formats. People with disabilities who use assistive technologies are often restricted by incompatibility issues involving software and hardware when retrieving Web content since many resources have been constructed without consideration for disabled users. The result has been a new dilemma emerging for many information centers and libraries regarding how to provide access to Web 2.0 technologies which are not designed for persons with disabilities and are incompatible with many assistive technologies. Careful consideration must be given in the development stage of web design to the layout, navigation and compatibility of different assistive technologies used to view the site.
+ Full Paper (PDF; 149 KB)
Source: Cheris Carpenter (via E-LIS)
Posted in Papers and Presentations, Source File, Web 2.0
Thursday, July 31st, 2008
From the abstract:
Biomedical text-mining has great promise to improve the usefulness of genomic research. The goal of text-mining is to analyze large collections of unstructured documents for the purposes of extracting interesting and non-trivial patterns of knowledge. The analysis of biomedical texts and available databases, such as Medline and PubMed, can help to interpret a phenomenon, to detect gene relations, or to establish comparisons among similar genes in different specific databases.
Source: Gálvez, Carmen and Moya-Anegón, Félix (2008) Text-mining research in genomics. In Guimaraes, Nuno and Isaías, Pedro, Eds. Proceedings IADIS International Conference Applied Computing 2008, pp. 277-283, Algarve (Portugal).
Posted in Papers and Presentations, Science, Source File, Technology and Internet
Tuesday, July 8th, 2008
Academic Libraries: 2006
The Academic Libraries: 2006 First Look summarizes services, staff, collections, and expenditures of academic libraries in 2- and 4-year, degree-granting postsecondary institutions in the 50 states and the District of Columbia. At the end of FY 2006, the nation’s 3,600 academic libraries held 1.0 billion books; serial backfiles; and other paper materials, including government documents, and there were 144.1 million circulation transactions from their general collections. During the same time period, academic libraries’ expenditures totaled $6.2 billion.
+ Full Report (PDF; 1.1 MB)
+ Supplemental Table (PDF; 169 KB)
Posted in Libraries and Librarianship, Papers and Presentations, Source File, Statistics
Monday, July 7th, 2008
Open Source Software in Education
Educational institutions have rushed to put their academic resources and services online, bringing the global community onto a common platform and awakening the interest of investors. Despite continuing technical challenges, online education shows great promise. Open source software offers one approach to addressing the technical problems in providing optimal delivery of online learning.
Source: Education Quarterly (EDUCAUSE)
Posted in Education, Online Education, Papers and Presentations, Source File
Wednesday, June 18th, 2008
Information Literacy from the Trenches: How Do Humanities and Social Science Majors Conduct Academic Research? (PDF; 697 KB)
This article examines the ways in which students majoring in humanities and social sciences conceptualize and operationalize course-related research. Findings are presented from an information-seeking behavior study with data collected from student discussion groups, a student survey, and a content analysis of professors’ research assignment handouts. Results indicate that students first use course readings and library resources for academic research and then rely on public Internet sites later in their research process. Students adopt a hybrid approach to course-related research. A majority of students in this study leveraged both human and computer-mediated resources to compensate for their lack of information literacy. In particular, students faced problems with determining information needs for assignments, selecting and critically evaluating resources, and gauging professors’ expectations for quality research.
Source: College & Research Libraries, forthcoming (Alison J. Head)
Posted in Education, Information Literacy, Information Seeking, Libraries and Librarianship, Papers and Presentations
Monday, June 16th, 2008
Google’s Joe Kraus on How to Make the Web More Social
Can the Internet be made more social? This is a question with which Joe Kraus, director of product management at Google, constantly has to grapple. He believes every killer app on the web — instant messaging, e-mail, blogging, photo-sharing — has succeeded because it helps people connect with one another. For Kraus, this means the Internet has an inherently social character, but it can be enhanced further — an area he continues to explore through Google initiatives such as Open Social and Friend Connect. Wharton legal studies professor Kevin Werbach spoke with Kraus recently about the increasing socialization of the Internet. Kraus will speak about social computing at the Supernova conference in San Francisco on June 16.
Audio also available.
Source: Knowledge@Wharton
Posted in Papers and Presentations, Search News, Social Media, Source File, Technology and Internet, Web 2.0, Webcasts and Podcasts
Sunday, June 15th, 2008
Government Data and the Invisible Hand
If the next Presidential administration really wants to embrace the potential of Internet-enabled government transparency, it should follow a counter-intuitive but ultimately compelling strategy: reduce the federal role in presenting important government information to citizens. Today, government bodies consider their own websites to be a higher priority than technical infrastructures that open up their data for others to use. We argue that this understanding is a mistake. It would be preferable for government to understand providing reusable data, rather than providing websites, as the core of its online publishing responsibility.
Rather than struggling, as it currently does, to design sites that meet each end-user need, we argue that the executive branch should focus on creating a simple, reliable and publicly accessible infrastructure that exposes the underlying data. Private actors, either nonprofit or commercial, are better suited to deliver government information to citizens and can constantly create and reshape the tools individuals use to find and leverage public data. The best way to ensure that the government allows private parties to compete on equal terms in the provision of government data is to require that federal websites themselves use the same open systems for accessing the underlying data as they make available to the public at large.
Several options available for retrieval of full text (PDF; 113 KB).
Source: Yale Journal of Law & Technology (via SSRN)
Posted in Access to Information, Government Documents and Political Information, Papers and Presentations, Source File, Technology and Internet
Sunday, June 15th, 2008
Journal Prices, Book Acquisitions, and Sustainable College Library Collections (PDF; 489 KB)
Library collections are economically sustainable only if the rate of increase in costs is no greater than the rate of increase in the library acquisitions budget. Because book prices increase at a much lower rate than journal prices, undergraduate libraries can achieve economic sustainability through a renewed emphasis on books rather than journals. Book-centered collections are consistent with the goals of many undergraduate colleges, and books rather than journals may provide the best teaching resources even in those fields that rely heavily on journals for the communication of original research results.
Source: College & Research Libraries, forthcoming (William H. Walters)
Posted in Education, Libraries and Librarianship, Papers and Presentations, Scholarly Publishing, Source File
Friday, June 13th, 2008
From the article:
At the suggestion of the assistant director for collections, instruction, and public service, the Ohio State University Libraries in fall 2005 initiated a program to provide grants to faculty members to enhance their courses with the library’s electronic resources. The purpose of this program was twofold: to maximize use of electronic resources for which the library was already paying and to encourage collaboration between faculty and librarians in course development.
The libraries initially set aside $50,000 to implement the program, deciding that for each accepted proposal the faculty member would get $2,000 to teach the course and another $2,000 if the course was taught a second time. In addition, the librarian associated with the project would get $1,000. The grants were considered incentives; there was no requirement that the money be used to implement the activities set forth in the proposals.
Source: C&RL News
Posted in Education, Libraries and Librarianship, Papers and Presentations
Thursday, June 5th, 2008
Internet Information and Communication Behavior during a Political Moment: The Iraq War, March 2003
This article explores the Internet as a resource for political information and communication in March 2003, when American troops were first sent to Iraq, offering us a unique setting of political context, information use, and technology. Employing a national survey conducted by the Pew Internet & American Life Project, we examine the political information behavior of the Internet respondents through an exploratory factor analysis; analyze the effects of personal demographic attributes and political attitudes, traditional and new media use, and technology on online behavior through multiple regression analysis; and assess the online political information and communication behavior of supporters and dissenters of the Iraq War. The factor analysis suggests four factors: activism, support, information seeking, and communication. The regression analysis indicates that gender, political attitudes and beliefs, motivation, traditional media consumption, perceptions of bias in the media, and computer experience and use predict online political information behavior, although the effects of these variables differ for the four factors. The information and communication behavior of supporters and dissenters of the Iraq War differed significantly. We conclude with a brief discussion of the value of “interdisciplinary poaching” for advancing the study of Internet information practices.
+ Full Paper (PDF; 352 KB)
Source: E-LIS
Posted in Government Documents and Political Information, Information Seeking, Papers and Presentations, Source File, Technology and Internet
Wednesday, June 4th, 2008
Canada’s Fastest-Growing Companies, 2008
The PROFIT 100 table ranks the 100 fastest-growing companies in Canada by percentage revenue growth from 2002-07, while the Next 100 table features companies ranked from Nos. 101 to 200.
Sort tables by:
* Rank
* Alphabetical order
* Revenue 2002
* Revenue 2007
* Growth 2002-07 (%)
* Profit margin
* Employees (number of)
* Exports as % of sales
Source: Canadian Business
Posted in Business and Economics, Lists and Rankings, Papers and Presentations, Real-Time Information, Source File
Monday, May 19th, 2008
Exploring historical trends using taxonomic name metadata
Background
Authority and year information have been attached to taxonomic names since Linnaean times. The systematic structure of taxonomic nomenclature facilitates the ability to develop tools that can be used to explore historical trends that may be associated with taxonomy.
Results
From the over 10.7 million taxonomic names that are part of the uBio system (http://www.ubio.org), approximately 3 million names were identified to have taxonomic authority information from the years 1750 to 2004. A pipe-delimited file was then generated, organized according to a Linnaean hierarchy and by years from 1750 to 2004, and imported into an Excel workbook. A series of macros were developed to create an Excel-based tool and a complementary Web site to explore the taxonomic data. A cursory and speculative analysis of the data reveals observable trends that may be attributable to significant events that are of both taxonomic (e.g., publishing of key monographs) and societal importance (e.g., world wars). The findings also help quantify the number of taxonomic descriptions that may be made available through digitization initiatives.
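A minimal sketch of the kind of per-year tally described above. The actual layout of the pipe-delimited file is not given here, so the three-column format (name|authority|year), the function name `names_per_year`, and the sample rows are all assumptions for illustration.

```python
import csv
from collections import Counter


def names_per_year(lines, first_year=1750, last_year=2004):
    """Count taxonomic names per authority year from pipe-delimited rows.

    Assumes a hypothetical name|authority|year column layout; rows that
    are too short or have a non-numeric year are skipped.
    """
    counts = Counter()
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue
        try:
            year = int(row[2])
        except ValueError:
            continue
        # Keep only years inside the range covered by the study.
        if first_year <= year <= last_year:
            counts[year] += 1
    return counts


# Sample rows for illustration only.
sample = [
    "Homo sapiens|Linnaeus|1758",
    "Canis lupus|Linnaeus|1758",
    "incomplete row",
    "Example name|Example authority|2010",
]
counts = names_per_year(sample)
```

Counts like these, grouped by year, are what a spreadsheet or Web tool would then chart to look for peaks and dips around historical events.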
Conclusions
Temporal organization of taxonomic data can be used to identify interesting biological epochs relative to historically significant events and ongoing efforts. We have developed an Excel workbook and complementary Web site that enables one to explore taxonomic trends for Linnaean taxonomic groupings, from Kingdoms to Families.
+ Full Paper (PDF; 4.2 MB)
Source: BMC Evolutionary Biology
Posted in Information Science, Papers and Presentations, Source File
Monday, May 19th, 2008
Email Information Flow in Large-Scale Enterprises
14 pages; PDF.
by Thomas Karagiannis; Milan Vojnović
From the abstract:
We present analysis results of email communications in a large-scale enterprise network. Our study first focuses on understanding the social graph induced by email communications between individual users. Specifically, we examine how email communication flows are correlated with user profiles and the organization structure, and how outside information penetrates the enterprise. We then concentrate on understanding the information processing load imposed on users and the strategies applied by users in email triage. To the best of our knowledge, this is the first measurement study of email communications of a global enterprise network comprising email data from over 100,000 employees spread across multiple continents. Our analysis results inform the design of network applications that take into account typical user behaviour in social interactions and solitary information processing. Our large-scale dataset further allows us to examine the validity of several hypotheses suggested by social network theory.
Source: Microsoft Research
Posted in Papers and Presentations, Social Media, Source File, Web 2.0