The exciting and pioneering work of Dr. C. Lee Giles and his students/colleagues (more about Dr. Giles here) at Penn State University continues. Background on his earlier work follows today’s news.
Today’s news is the release of ChemxSeer. Giles is a codirector of the project.
What is it?
From the news release:
ChemxSeer, the first publicly available search engine designed specifically for chemical formulae, can sort out when “He” refers to helium and not a person more than nine times out of 10, according to the Penn State College of Information Sciences and Technology (IST) researchers who created the tool.
With the new engine, scientists searching for research on CH4 or methane no longer have to wade through search results about Channel 4 or Chapter 4 as ChemxSeer will only return documents with references to the chemical formula. “Results from our search engine are much more relevant than results returned by popular search engines,” Giles said. “It is one of several cyber tools under development in our lab which will enable better access to and sharing of information and data among scientists and scholars.”
So, targeted web searching with a focus on chemistry.
One of the many interesting features that make up ChemxSeer is TableSeer.
From the web site:
This tool automatically identifies tables in digital documents and extracts the contents in the cells of the tables. The contents are stored in a queryable table in a database. TableSeer extracts table metadata, and uses a novel ranking function to search for tables relevant to user queries.
Cool! More about TableSeer in this 2007 paper by Ying Liu, Kun Bai, Prasenjit Mitra, and Giles.
Learn more about ChemxSeer in these papers, including:
“Extraction and Search of Chemical Formulae in Text Documents on the Web” (10 pages; PDF) by Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee Giles
As many of you know, Dr. Giles is one of the developers of ResearchIndex/CiteSeer, a targeted search engine for “scholarly” information (on the open web) and citation information in the areas of computer science, info tech, etc. It predates Google Scholar by several years and continues to offer several features not available elsewhere. Links to CiteSeer can also be seen via Microsoft’s Live Search Academic where some articles list and link to “Citeseer citations.” Here’s an example as seen in the first result.
Dr. Giles is also involved with SmealSearch (scholarly business material) and eBizSearch (no longer online, redirects to SmealSearch). It’s hard to tell whether SmealSearch is still being updated, but it’s worth a look when searching for older material.
See Also: In 2004, the technologies that power CiteSeer (autonomous citation indexing, for example) were licensed by the Institute for Scientific Information (Web of Science).
See Also: 2003 Interview with Dr. Giles
See Also: As we mentioned a couple of weeks ago in this post about science search on the web, Dr. Steve Lawrence (now at Google), Dr. Gary Flake (now at Microsoft), and Kurt Bollacker worked at NEC Research with Giles.
Thanks to Pete W. for his help with this post.
