The Man Behind Google's Ranking Algorithm
nbauman writes "The New York Times interviews Amit Singhal, who is in charge of Google's ranking algorithm. Google uses 200 'signals' and 'classifiers,' of which PageRank is only one. 'Freshness' determines how many recently changed pages appear in a result. The team had assumed older pages were better, but when Google Finance first launched, the algorithm couldn't find it because the page was too new. Some topics are 'hot': 'When there is a blackout in New York, the first articles appear in 15 minutes; we get queries in two seconds,' said Singhal. Classifiers infer the type of search: whether it is for a product to buy, a place, a company, or a person. One classifier identifies people who aren't famous. Another identifies brand names. A final check encourages 'diversity' in the results, for example, a manufacturer's page, a blog review, and a comparison shopping site."
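To make the "diversity" check concrete, here is a rough sketch of how a final pass like that could work. This is not Google's method; the summary gives no implementation details. The `Result` type, the `kind` labels, the `score` field, and the `penalty` value are all hypothetical, made up for illustration. The idea is a greedy re-rank: each time a result of some kind is picked, the remaining results of that same kind are discounted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    url: str
    kind: str     # hypothetical category label, e.g. "shopping", "review"
    score: float  # hypothetical combined relevance score


def diversify(results, k, penalty=0.5):
    """Greedily pick up to k results.

    A candidate's score is multiplied by `penalty` once for every
    already-picked result of the same kind, so repeated kinds sink.
    """
    remaining = list(results)
    picked = []
    seen = {}  # kind -> how many results of that kind were picked so far
    while remaining and len(picked) < k:
        best = max(
            remaining,
            key=lambda r: r.score * (penalty ** seen.get(r.kind, 0)),
        )
        picked.append(best)
        remaining.remove(best)
        seen[best.kind] = seen.get(best.kind, 0) + 1
    return picked


if __name__ == "__main__":
    # Made-up example data: two shopping sites outrank everything else.
    candidates = [
        Result("shop-a.example", "shopping", 0.90),
        Result("shop-b.example", "shopping", 0.85),
        Result("blog.example", "review", 0.60),
        Result("maker.example", "manufacturer", 0.50),
    ]
    for r in diversify(candidates, k=3):
        print(r.kind, r.url)
```

With these made-up scores, a plain sort would put both shopping sites on top; the discounted version returns one shopping site, then the review, then the manufacturer's page.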
My ongoing gripe with Google is how often the first page is filled with shopping sites, "review" pages, and click-through pages that exist only to intercept you on the way to where you really want to go.
I would love a switch, or even a subscription, that would let me filter out these usually useless kinds of pages and show me pages with real content instead.
Three Squirrels
One interesting thing about the article was how down-to-earth and concrete the problems it described were, such as the "teak patio Palo Alto" problem. Other search engines brag about their web-filtered-by-humans approach, as opposed to the "cold" algorithmic approach of Google. But it turns out Google is pretty human too, only with the higher ambition of generalizing from those human observations.