More Googler Research Papers Compiled Online; Object-Level Verticals and Phlat “Personal” Searching (Demo) from MS
First, this article/“pseudo press release” from MS Research Asia discusses “object-level vertical search.”
From the “story”:
Object-Level Vertical Search takes a refined approach that is a significant advance from traditional Web search. The latter paradigm is based on a page-level relevance ranking approach, in which pages that receive links from many other pages are adjudged to have more value by the very fact that they are popular. If more people link to a given page, it must have something to offer—that is the presumption.
In reality, we all know what happens. A search query returns a list of Web pages, some of which may have more relevance to what we are seeking, some with less. It’s up to us, then, to start clicking on likely candidates and scanning the pages for the information we want. It works, to a degree. We’re in the neighborhood, but we’re still looking for the right house. Object-Level Vertical Search is designed to put us on the doorstep.
“In Object-Level Vertical Search,” Wen says, “we want to extract and integrate information from the Web about specific objects.
“For example, in academic search for a researcher, his information may be distributed on different Web sites. We need to collect, extract, and integrate all of this information. On one Web site, we may find the e-mail address of this person. On another Web site, we can find his telephone number and his publications.
“We collect all this information and integrate it. Then, after extraction and integration, the results will be a virtual page containing all the related information about this person.”
It goes on to say:
There are a series of steps involved in the Object-Level Vertical Search process:
* Web Crawling: to collect relevant information on the Web efficiently.
* Classification: Does a page contain information on products, papers, people, or some other desired category?
* Extraction: pulling specific information about the search query from the relevant Web pages. For a product, for instance, that could mean product name, brand, image, description, and price.
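To make the quoted steps a little more concrete, here is a minimal Python sketch of the classify, extract, and integrate stages for the researcher example. It is not MS Research's implementation: the sample pages, the keyword classifier, the regular expressions, and the function names are all assumptions made up for illustration, and the "crawl" is just a hard-coded dictionary.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for crawl output: URL -> page text.
PAGES = {
    "http://example.edu/~jdoe": "Jane Doe, researcher. Email: jdoe@example.edu",
    "http://example.org/staff": "Jane Doe. Phone: 555-0100. Publications: Paper A; Paper B",
    "http://example.com/shop": "Acme Widget, price $9.99",
}

# Deliberately simple patterns; real extraction would be far more robust.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\b\d{3}-\d{4}\b")


def classify(text):
    """Crude keyword classifier: returns 'person', 'product', or 'other'."""
    lowered = text.lower()
    if "email" in lowered or "phone" in lowered or "publications" in lowered:
        return "person"
    if "price" in lowered:
        return "product"
    return "other"


def extract_person(text):
    """Pull e-mail addresses, phone numbers, and publications out of one page."""
    fields = defaultdict(set)
    fields["email"].update(EMAIL_RE.findall(text))
    fields["phone"].update(PHONE_RE.findall(text))
    match = re.search(r"Publications:\s*(.+)", text)
    if match:
        fields["publications"].update(p.strip() for p in match.group(1).split(";"))
    return fields


def build_virtual_page(name, pages):
    """Integrate what every 'person' page says about `name` into one record."""
    merged = defaultdict(set)
    sources = []
    for url, text in pages.items():
        if classify(text) != "person" or name not in text:
            continue
        sources.append(url)
        for field_name, values in extract_person(text).items():
            merged[field_name] |= values
    record = {"name": name, "sources": sorted(sources)}
    record.update({k: sorted(v) for k, v in merged.items()})
    return record


if __name__ == "__main__":
    print(build_virtual_page("Jane Doe", PAGES))
```

The e-mail address comes from one page and the phone number and publications from another, yet the result is a single record, which is roughly the "virtual page" idea described in the quote.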
In many ways, this sounds like:
+ What companies like ZoomInfo are doing: pulling disparate web resources together into one-stop results collected on a single page.
+ Taking a federated search tool and then, behind the scenes, helping the user select the databases they should query and then building a clean set of results. It also sounds similar to what well-qualified info pros have been doing for years with tools like Dialog. Yes, it’s technically possible to search every Dialog file simultaneously, but the great researcher uses the tools available and lots of skill to carry out this object-level vertical process and then produce a final report for the customer. They also have the advantage of using the structure that Dialog provides.
Next, a paper from the Text Mining Search and Navigation Research team at MS on personal search and organization:
Fast, flexible filtering with Phlat - Personal search and organization made easy.
by Edward Cutrell, Daniel Robbins, Susan Dumais, Raman Sarin.
Presented at CHI ‘06: Proceedings of the SIGCHI conference on Human Factors in computing systems
January 2006 New York, NY, USA
From the abstract:
Systems for fast search of personal information are rapidly becoming ubiquitous. Such systems promise to dramatically improve personal information management, yet most are modeled on Web search in which users know very little about the content that they are searching. We describe the design and deployment of a system called Phlat that optimizes search for personal information with an intuitive interface that merges search and browsing through a variety of associative and contextual cues. In addition, Phlat supports a unified tagging (labeling) scheme for organizing personal content across storage systems (files, email, etc.). The system has been deployed to hundreds of employees within our organization. We report on both quantitative and qualitative aspects of system use.
Want to learn more and demo Phlat? Go to http://research.microsoft.com/adapt/phlat/. It works with Windows Desktop Search. A guide to using Phlat and screen caps are here.
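For readers who want a feel for what the abstract means by a unified tagging scheme combined with filtering, here is a small Python sketch. It is only an illustration under our own assumptions, not how Phlat is built: the `Item` class, its fields, and `filter_items` are hypothetical names, and the sample data is invented.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One piece of personal content, whatever store it lives in (hypothetical model)."""
    title: str
    store: str   # e.g. "file" or "email"
    author: str
    tags: set = field(default_factory=set)


def filter_items(items, text="", store=None, author=None, tag=None):
    """Combine a text query with optional store, author, and tag filters."""
    results = []
    for item in items:
        if text and text.lower() not in item.title.lower():
            continue
        if store and item.store != store:
            continue
        if author and item.author != author:
            continue
        if tag and tag not in item.tags:
            continue
        results.append(item)
    return results


if __name__ == "__main__":
    items = [
        Item("Q3 budget draft.xls", "file", "alice"),
        Item("Re: budget numbers", "email", "bob"),
        Item("Holiday photos", "file", "alice"),
    ]
    # The same tag applies to a file and an e-mail message alike.
    items[0].tags.add("finance")
    items[1].tags.add("finance")

    for hit in filter_items(items, tag="finance"):
        print(hit.store, hit.title)
    for hit in filter_items(items, text="budget", store="email"):
        print(hit.store, hit.title)
```

The point of the sketch is that a tag is attached to the item, not to the storage system, so one filter can narrow results across files and mail at the same time.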
Finally, the Google Labs site has been updated with a large helping of new (2006) papers by Googlers. Here are just a few that we noticed. Some are available on the web while others are not:
+ A Large Scale Study of Wireless Search Behavior: Google Mobile Search
+ Names and Similarities on the Web: Fact Extraction in the Fast Lane
+ Sender Reputation in a Large Webmail Service
+ Retroactive Answering of Search Queries
+ Social- and Interactive-Television Applications Based on Real-Time Ambient-Audio Identification
+ An Experimental Study of the Skype Peer-to-Peer VoIP System
+ Large Scale Image-Based Adult-Content Filtering
+ Using Encyclopedic Knowledge for Named Entity Disambiguation
++ Many more papers available here.
