Archive for the ‘Web Search’ Category

The September, 2009 Issue of the Internet Resources Newsletter is Now Available Online

Wednesday, September 30th, 2009

You can access Issue 176 from Roddy MacLeod and crew at Heriot-Watt University in Edinburgh here. An RSS feed is also available.

This issue includes:

+ Commentary
Random quotes and News items of interest

+ A-Z New & Notable Web Sites:
About 100 new and notable websites: new services, ejournals, directories, search engines, publishers, social networks, government sites, booksellers, calls for papers, software, news services, conferences, research groups, plus anything else of interest, etc, etc.

+ Nice Web Sites: Mendeley

+ Blogorama and Twittersphere
Selected interesting blogs, RSS feeds, Twitter items, related news items, etc

+ Get a life! Leisure Time

Source: IRN

Two News Items from the GooglePlex: New Features Added to Google Docs & Hot Trends OneBox Added to Certain Search Results

Monday, September 28th, 2009

First, in “Google Docs rolls out student-oriented features,” Harrison Hoffman discusses several new features added to Google Documents.

From the News.com Article:

Google Docs’ summer interns this summer were tasked with working on improvements and additions to the service geared toward students. The results of their work, now available to try out, include new features such as an equation editor, superscripts and subscripts, document translation, improvements to surveys, and more outlining options.

Second, Danny Sullivan reports on Search Engine Land that Google has added a new Hot Trends “OneBox” on certain search results pages.

From the SEL Article:

Starting around 3:45 Pacific Time today, those searching on topics that are spiking or “hot” in popularity should see a new Hot Trends OneBox near the bottom of the search results page and just above the related search area…

The article is illustrated with several examples and also provides a brief history of Hot Trends and its parent, Google Trends.

Direct to Google Hot Trends
Note: The number of Hot Trends visible has been reduced from 100 to 40. The Sullivan article explains why.

Direct to Google Trends

Finally, Sullivan points out that Bing has offered a service similar to Hot Trends, named xRank, dating back two years.

Direct to Bing xRank
The only ways to find xRank are to follow a link (when available, in the left rail of a results page directly above the related searches link), like this one for Michael Jackson, or to go directly to the xRank site. It will be interesting to see if Bing now begins integrating xRank results onto web results pages.

Calculating for Your Retirement in Wolfram|Alpha

Monday, September 28th, 2009

From the Blog Post:

Wolfram|Alpha’s investment-returns calculator prompts you to describe your current investment strategy. Once you submit your query, Wolfram|Alpha will provide you with a number of results such as a linear chart depicting investment value projection scenarios, pie charts of resource allocation, a bar graph that allows you to easily compare the distribution of ages at which the account balance would reach zero, and a table displaying projections of your portfolio’s value at various ages.

The blog post contains several screenshots and also links to other W|A money and finance examples.

Source: Wolfram | Alpha Blog

Google Celebrates its 11th Birthday & Some Google and Web Search History

Sunday, September 27th, 2009

Happy 11th Birthday to Google! In case you missed it, SEL has captured the special Google Doodle.

Source: Search Engine Land

See Also: Interested in some early Google history? This ResourceShelf post from 2002 captures several “early” pages, questions, and announcements.

See Also: A Collection of “Early” Search Engine Announcements
Scroll to the Saturday, December 11, 2006 posting.

Wolfram’s Search Goal: Compute All

Friday, September 25th, 2009

From the Article (Summary from ACM TechNews):

Stephen Wolfram has set the ambitious goal of converting the global corpus of knowledge into a computable format through WolframAlpha.com, a computational knowledge engine rather than a search engine. WolframAlpha.com computes data and frequently renders query results into lists, charts, and graphs. “You get to ask WolframAlpha specific questions and it provides specific answers, rather than asking about some general topic and expecting it will do what search engines do, giving you a bunch of links about that topic,” Wolfram says. He estimates that WolframAlpha can currently answer users’ questions with more than 75 percent accuracy, and the system’s linguistic comprehension capabilities are steadily improving. Wolfram says the long-term goal for WolframAlpha is to make as much globally accumulated knowledge computable as possible. One avenue being explored is the ability to upload one’s own data to WolframAlpha and have it perform analysis on that data. “Another direction we are just starting to play with … is being able to invent on the fly,” Wolfram says.

Read the Complete Article

Source: Investors Business Daily

See Also: Stephen Wolfram’s Latest Webcast Now Available Online
Recently (9/17), Stephen Wolfram Did a Live Q&A Webcast to Discuss Wolfram Alpha and Other Projects/Ideas.
You can watch an archived copy of the program here. It runs about 79 minutes.

Google Introduces “Jump To” Links Within Search Snippets

Friday, September 25th, 2009

Barry Schwartz from Search Engine Land Writes:

The first are the anchor based links and the second are the snippet based links.

Barry’s post includes screenshots to illustrate these new features.

More information and examples can be found on the Official Google Blog. It also links to this page that explains what webmasters/page designers can do to take advantage of these new features.

Sources: Search Engine Land / Official Google Blog / Official Google Webmaster Central Blog

Peter Jacso Takes on Google Scholar Finding Ghost Authors, Lost Authors, and Other Problems

Thursday, September 24th, 2009

Access the Full Text of the Entire Article

With all of the talk about Google Book Search lately, little has been written about Google Scholar. Now, in a lengthy and well-documented analysis (numerous screenshots) published in Library Journal, Dr. Peter Jacso from the University of Hawaii at Manoa, a monthly columnist for Gale/Cengage and a friend of ResourceShelf, documents some of the problems (two of them named in the title of the article) that he has found while using Google Scholar [GS] during the past several months. Actually, some of the problems go back years.

Here are just a few passages from Dr. Jacso’s article that we found to be of greatest interest:

They [the Google Scholar developers] decided—very unwisely—not to use the good metadata generously offered to them by scholarly publishers and indexing/abstracting services, but instead chose to try and figure them out through ostensibly smart crawler and parser programs.

Millions of records have erroneous metadata, as well as inflated publication and citation counts

A free tool, Google Scholar has become the most convenient resource to find a few good scholarly papers—often in free full-text format—on even the most esoteric topics. [Our emphasis] For topical keyword searches, GS is most valuable. But it cannot be used to analyze the publishing performance and impact of researchers.

Very often, the real authors are relegated to ghost authors deprived of their authorship along with publication and citation counts. [Our emphasis] In the scholarly world, this is critical, as the mantra “publish or perish” is changing to “publish, get cited or perish.”


[Our emphasis] While GS developers have fixed some of the most egregious problems that I reported in several reviews, columns and conference/workshop presentations since 2004—such as the 910,000 papers attributed to an author named “Password”—other large-scale nonsense remains and new absurdities are produced every day.

The numbers in GS are inflated for two main reasons. First, GS lumps together the number of master records (created from actual publications), and the number of citation records (distinguished by the prefix: [citation]) when reporting the total hits for author name search.

…fee-based Web of Science and Scopus have lower article and citation counts and scientometric indicators, as they have a far more selectively defined source base with fewer journals from which to gather publication and citations data. In addition, they count only the master records for the authors’ publication count (as they should), and keep the stray and orphan citations in a separate file.

Unfortunately, the bad metadata has a long reach. These numbers are taken at face value by the free utilities such as the Google Scholar Citation Count gadget by Jan Feyereisl and the sophisticated and pretty Publish or Perish (PoP) software (produced by Tarma Software).

As about 10.2 million records from GBS [Google Book Search] are incorporated now in GS, the metadata disaster likely will continue unabated. It is bad enough to have so many records with erroneous publication years, titles, authors, and journal names.

In its stupor, the parser fancies as author names (parts of) section titles, article titles, journal names, company names, and addresses, such as Methods (42,700 records), Evaluation (43,900), Population (23,300), Contents (25,200), Technique(s) (30,000), Results (17,900), Background (10,500), or—in a whopping number of records—Limited (234,000) and Ltd (452,000). The numbers kept growing by several hundred thousand hits for the cumulative total of the above “authors” during the few days this paper was being written. More screenshots are available here.

Lost Authors

These errors could be considered relatively harmless if they did not affect the contributions of genuine, real scholars. But the biggest problem is when the mess replaces real scholars with ghost authors, leaving the former as lost authors.


[Our emphasis] Certainly the entire database isn’t rotten, just a few million records. That may be a relatively small percentage—Google won’t reveal the total number of records, and these are just my few forensic search test queries—but there’s ample cause for worry.

In case of GBS [Google Book Search], Google relied on its collective Pavlovian reflex to blame the publishers and libraries (meaning the librarians, catalogers, indexers) for the wrong metadata.

In the case of Google Scholar, these same Googlish arguments will not fly, because practically all the scholarly publishers gave Google—hats in hand—their digital archive with metadata. The idea was to have Google index it and drive traffic to the publishers’ sites.

Yes, GS has fixed fairly quickly some of the major errors that I earlier used to demonstrate its illiteracy and innumeracy, but have so far left millions of others untouched.

GS designers have sent very under-trained, ignorant crawlers/parsers to recognize and fetch the metadata elements on their own. Not all of the indexing/abstracting services are perfect and consistent, but their errors are dwarfed by the types and volume of those in GS. This is the perfect example of the lethal mix of ignorance and arrogance GS developers applied to metadata and relevance ranking issues.

The parsers have not improved much in the past five years despite much criticism. GS developers corrected some errors that got negative publicity, but these were Band-Aids, where brain surgery and extensive parser training is required. Without these, GS will keep producing similar errors on a mega-scale.

Again, these highlights are only a small portion of the entire article, which also includes numerous screenshots. You can access the full text here.

Source: Library Journal

New from Google: “Place Pages”

Thursday, September 24th, 2009

Another day and more new/updated resources from Google. Yesterday, it was a video project with the Wharton School of Business and Sidewiki.

Today, Greg Sterling at Search Engine Land introduces a change to Google Maps named “Place Pages” that replaces the “info window” that previously appeared when doing some map searches. Sterling says the changes present the information “much more effectively.”

Sterling writes:

The new “Place Pages” offer a more user friendly presentation of the same information. Also launching today are Place Pages that cover cities, neighborhoods, points of interest and transit stops, in addition to business locations. (The Place Pages are accessible from the “more info” link associated with the listing or result.)

Greg’s post is full of screenshots that illustrate the changes.

The idea behind Place Pages, according to Google, is to “give you all the info about a place, in one place.”

Source: Search Engine Land

If You’re a Google Toolbar User, Learn About Sidewiki to Comment on Any Site

Wednesday, September 23rd, 2009

Danny Sullivan Writes:

Google Sidewiki is a new feature being added today to the Google Toolbar that allows anyone to leave comments about pages as they surf the web. Love something you’re reading? Hate it? You can share your views with others who visit the page and who also have Sidewiki enabled. Share, that is, if Google thinks your comment is good enough.

The post continues with a detailed explanation and screen caps.

Danny continues:

What comments are shown, and in what order? Google secret sauce time. The official line is this:

Using multiple signals based on the quality of the entry, what we know about the author, and user-contributed signals such as voting and flagging, we work hard to ensure that only the highest quality, most relevant entries appear in the sidebar. Most of the engineering work for Sidewiki was dedicated to this ranking algorithm.

When I talked with Google about Sidewiki, they gave me a few other factors, such as:
+ Use of sophisticated language: “This page sucks” isn’t sophisticated; think complex sentences and ideas. Apparently, Google has a language sophistication detector now, and one that works in the 14 different languages that Sidewiki supports.
+ User’s reputation: Are your comments being voted up or flagged down?
+ User’s history: How long have you had a Google Profile? How long have you been commenting?

Danny says that you can also share comments with people who do or do not use the toolbar.

From the Conclusion:

Sidewiki feels like another swing at something Google seems to desperately desire — a community of experts offering high quality comments. Google says that’s something that its cofounders Larry Page and Sergey Brin wanted more than a system for ranking web pages. They really wanted a system to annotate pages across the web.

Certainly Google’s goal is to be something more than another commenting system.

“I think we would have failed if people were using it to say ‘Obama sucks’,” said Sundar Pichai, vice president of product management at Google.

That’s not to say the system is meant to promote pro-Obama comments! Rather, the hope is to produce more intelligent and thoughtful comments regardless of a particular position about Obama or any other topic.

“If those are the comments we’re surfacing, [Sidewiki] wouldn’t be that much different than much of the web. What we’re really trying to do is add value from people who really know what they’re talking about,” he said.

Again, much, much more in the full post.

Source: Search Engine Land

Jan Pedersen on Search as Dialog; Danny Sullivan on Search Wars

Tuesday, September 22nd, 2009

By Jan Pedersen, Chief Scientist for Core Search at Microsoft.

From the Article:

Tomorrow, search will be the easiest way to answer a question or to complete a task. In the very near future search will have become such an essential companion that we will not understand how people survived without it — indeed that trend is already happening.

Several accelerating trends will guide this trajectory. First, the Internet will continue to grow. Devices, users, information, and services are developing at double-digit rates with few signs of slowing. Second, the power of computer systems and algorithmic ingenuity brought to bear on navigating the online landscape will continue to defy the imagination. Third and finally, users will continue to demand ever more functionality from this boundless medium.

[Snip]

One of the amazing phenomena of the last few years is the advent of large-scale, user-generated content sources, such as Wikipedia, Twitter, Flickr, and Facebook. These platforms combine the efforts of many individuals (a feat not possible at this scale prior to the Internet), but they are best experienced through a search interface that surfaces the highest-value content. Many start-up companies now offer a pre-digested or filtered version of Twitter content that attempts to extract meaning from the stream.

We can expect this concept to transform an increasing number of information sources.

[Snip]

Increasingly, however, search engines will begin to understand more of the intention behind a user’s query through the application of better web crawling and mining and natural-language-understanding algorithms. For example, search engines have historically successfully applied complex statistical analyses to the web in several languages to produce translators that handily beat traditional rule-based approaches.

Much More After the Jump

Yahoo Releases New Search Format

Tuesday, September 22nd, 2009

Greg Sterling Writes on Search Engine Land:

Yahoo has now gone live with its new search format. There’s nothing radical or “game changing.” However, there are some nice upgrades and improvements. Most prominently, it features a new left column that allows users to filter results by Search Monkey content providers or refine by related concepts. It also features more prominent placement for Search Pad and an expansion of Search Assist. Yahoo says it has also improved image and video search and says speed and performance are better across the board.

Much more including screen caps in the complete article.

Here’s an example of a new search results page.

Source: Search Engine Land

See Also: Yahoo Search Pad To Launch Tomorrow (7/2009)

See Also: Take a Tour of the New Yahoo Homepage and Dashboard (7/2009)

Forbes Cover Story: The Man Who’s Beating Google: All About Robin Li and Baidu

Tuesday, September 22nd, 2009

From a Four Page Forbes Article:
Btw, we think the headline, “The Man Who’s Beating Google,” is a bit over the top. That said, having Google in the title of the story is likely to sell more copies of the magazine.

Also, the article includes a revenue share chart (page 1) and an eye-tracking study of Baidu users vs. Google users (page 3).

Highlights from the Article:

“A lot of Chinese people have wondered if knowledge really means power in today’s market economy,” Li says during an interview with Forbes in Baidu’s no-frills Beijing conference room. (By year-end the company will move to a new headquarters designed to resemble an enormous, long rectangular search box.) “I think I’ve proven that it does.”

That proof won’t do much to hold off Li’s biggest rival. While Baidu has a 2-to-1 lead in China, Google has been steadily winning eyeballs there (see graph, right) and plans a near-doubling of its sales force, now in the hundreds, over the next 12 months in what is shaping up as an epic battle to dominate the world’s search business. “China’s going to be the largest Internet market in the world,” says Gary Rieschel, a cofounder of Qiming Ventures in Shanghai. “If Google isn’t the leader there, will it really be the leading search company in the world?”

On another front, China’s e-commerce giant, Alibaba, has declared war with Baidu over online shopping.

[Snip]

Once an investor with 2.6% of Baidu, Google sold its stake in 2006 and got a government license to operate as Google China.

It’s no longer quite that simple. According to a couple of studies with no connection to either company, Google is now demonstrably better at Chinese language search. Asked to rate each service, Li Yinan, Baidu’s chief technology officer, squirms. “I’m not in a position to compare the two results side by side. The evaluation of quality of search results is based on personal opinions,” he says.

“We have, hands down, the best Chinese language search product,” boasts Lee Kaifu, who was president of Google’s China operations until he resigned in September to start an angel investment firm. But, he concedes, “we’re learning that [market share] is about more than the product.”

[Snip]
Much More After the Jump

Liked That Special Google Logo? New Tool Lets You Make It Permanent

Friday, September 18th, 2009

Barry Schwartz Writes:

There is a new greasemonkey script made for Firefox that allows you to pick any past Google Doodle and make it permanent. The script can be located over here and like I said, requires Firefox and having Greasemonkey installed.

Barry continues his explanation and provides info about where to find the Google Logo Gallery here.

Source: Search Engine Land

Later Today: Live Webcast With Wolfram|Alpha Founder, Stephen Wolfram

Thursday, September 17th, 2009

Stephen Wolfram will be answering questions about W|A beginning at 2pm CDST/19:00 GMT.

If you have a question you’d like to ask Stephen, please send it as a comment to this blog post or tweet to @Wolfram_Alpha. We’ll also be taking questions live on the justin.tv chat during the webcast.

Streaming Video Will Be Available Here via justin.tv

Source: Wolfram|Alpha Blog

Bing 2.0 “Visual Search” (Beta) Launches, Allows Search By Pictures

Monday, September 14th, 2009

Bing 2.0 Visual Search: Their Motto? “Start with pictures to find results faster!”

Access Bing Visual Search (Beta)

From the SEL Blog Post:

Bing Visual Search lets searchers browse easily through a slick interface of “structured data sets from trusted partners” using Silverlight technology. At launch, Bing Visual Search will earn a spot on the homepage search categories, just under Travel, although depending on the homepage image of the day, those links can sometimes get lost in the background colors of the photo.

[Snip]

The concept behind Visual Search is simple: use clear imagery to help users sort through large sets of data easily. Certain categories of search lend themselves more easily to this than others, likely the reason why Bing has launched this feature in beta with a fairly limited set of visual information: cars, animals, people and products. Users must have Silverlight installed on their browsers to fully experience Visual Search.

[Snip]

Research-based topics including politicians, US States and items like the periodic table are useful applications, but perhaps only to a limited audience, such as younger students working on school projects. However, this also affords Bing the opportunity to appeal to a new generation of searchers, who are highly dependent on visual cues and ease of use, as iPods and iPhones have shown us. Of course, visualization does have additional appeal to the middle demographic using those products as well.

Much More in Elisabeth Osmeloski’s Blog Post

Source: Search Engine Land

A bit more from the ResourceShelf Team:

To limit your visual search, look for a group of narrowing limits located in the left margin of a visual search category page. Here’s an example for U.S. Politicians. You can narrow by:

+ Party
+ State Represented
+ First Term
+ Gender

In this case, you can also place your cursor on top of an image and identify the politician by name, party affiliation, and age.

Once you’ve made your selection, the politician’s name is automatically placed in the search box, ready to conduct a web search.

Compare the politicians search with this one for Billboard’s past songs. On the songs home page you can sort by song title. In the left margin you can narrow by:

+ Decade
+ Year
+ Artist
+ Genre

Google: Bigger Is Better When It Comes To Search Boxes

Thursday, September 10th, 2009

Matt McGee writes:

Fresh off of patenting its home page design, Google is throwing caution to the wind and messing with success: It’s making its search boxes bigger. (Gasp!)

The bigger search box will show both on the home page and on the top of Google’s search results pages.

McGee goes on to share his take (and that of Google’s Marissa Mayer) about the new search box.

Source: Search Engine Land

College Is Hard. Wolfram|Alpha Makes It Easier

Wednesday, September 9th, 2009

A bunch of examples of how you can use Wolfram|Alpha to make life a bit easier on campus. The company says more examples are to come.

Source: W|A Blog

Hunting Deep-Sky Objects with Wolfram|Alpha

Tuesday, September 1st, 2009

A primer (with screen caps) on how to use W|A to find deep-sky objects.

From the Blog Post:

The amount of activity that takes place here on planet Earth is at times unfathomable. But it’s the merest drop in the bucket in comparison to the boundless amounts of activity in our universe—Earth is merely one planet within the Milky Way Galaxy. Most deep-sky objects cannot be seen by the naked eye, but observers looking through a telescope are treated to views of colorful clusters of light and fuzzy clouds of gas in the sky. Here we’ll demonstrate ways Wolfram|Alpha can help you find deep-sky objects such as galaxies, nebulae, and star clusters—our universe has about 100 billion member galaxies, and with so many, it’s nice to have a place to start.

Source: Wolfram|Alpha Blog

Interview with Microsoft’s President of Online Services, Qi Lu

Monday, August 31st, 2009

From the Article:

Mr. Lu, who is 47, left Yahoo 14 months ago, but now finds himself once again leading the charge against Google. This time, he is backed by a patron that vows to spend even more than Yahoo did on the mission: Microsoft.

Source: NY Times

See Also: For More Analysis of the NY Times Article, Make Sure to Read Greg Sterling’s Post from Search Engine Land.

See Also: Microsoft’s Lu on Bing: ‘A First Step’ (BusinessWeek; May 28, 2009)

Let’s Be Careful Out There: McAfee Names Jessica Biel the Most Dangerous Celebrity in Cyberspace

Tuesday, August 25th, 2009

From the Announcement:

Jessica Biel has overtaken Brad Pitt as the most dangerous celebrity to search in cyberspace, according to Internet security company McAfee, Inc. (NYSE:MFE). For the third year in a row, McAfee researched Hollywood’s glamorous stars and pop culture’s most famous people to reveal the riskiest celebrities on the Web. McAfee’s latest report found that searches for Barack or Michelle Obama posed a lesser threat compared to others.

Fans searching for “Jessica Biel” or “Jessica Biel downloads,” “Jessica Biel wallpaper,” “Jessica Biel screen savers,” “Jessica Biel photos” and “Jessica Biel videos” have a one in five chance of landing at a Web site that’s tested positive for online threats, such as spyware, adware, spam, phishing, viruses and other malware. Searching for the latest celebrity news and downloads can cause serious damage to one’s personal computer.

The announcement continues with a listing of the Top 15 Celebrity Searches.

Source: McAfee, Inc.