Compute Google's PageRank 5 Times Faster

Posted by timothy on Wednesday May 14, 2003 @09:57AM from the hypercustom dept.

Kimberley Burchett writes "CS researchers at Stanford University have developed three new techniques that together could speed up Google's PageRank calculations by a factor of five. An article at ScienceBlog theorizes that 'The speed-ups to Google's method may make it realistic to calculate page rankings personalized for an individual's interests or customized to a particular topic.'"
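
For readers who have not seen how PageRank is computed, here is a minimal sketch of the plain power iteration, with an optional teleport vector to illustrate what "personalized" or "topic-specific" rankings could mean. It is not one of the three Stanford speed-up techniques, and it is not Google's production code. The damping factor of 0.85, the function name `pagerank`, and the tiny four-page graph are all assumptions made for illustration.

```python
def pagerank(links, teleport=None, damping=0.85, tol=1e-10, max_iter=200):
    """Plain power iteration on a dict {page: [pages it links to]}.

    teleport: optional dict of page -> weight. Biasing it toward a set of
    pages gives a personalized ranking (assumption: this is the usual
    textbook formulation, not a description of Google's system).
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    if teleport is None:
        teleport = {p: 1.0 for p in pages}      # uniform random jump
    total = sum(teleport.get(p, 0.0) for p in pages)
    jump = {p: teleport.get(p, 0.0) / total for p in pages}

    rank = dict(jump)
    for _ in range(max_iter):
        new = {p: 0.0 for p in pages}
        dangling = 0.0                          # rank held by pages with no outlinks
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                dangling += rank[p]
        for p in pages:
            # follow a link with probability `damping`, otherwise jump
            new[p] = damping * (new[p] + dangling * jump[p]) + (1 - damping) * jump[p]
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page web: a, b and d all point at c.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
uniform = pagerank(web)
biased = pagerank(web, teleport={"d": 1.0})     # a "reader who cares about d"
```

Each personalized ranking needs its own full run of this loop over the whole link graph, which is why a faster base computation matters for the per-topic rankings the article speculates about.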

9 of 140 comments

Re:Lets see... by Anonymous Coward · 2003-05-14 10:11 · Score: 5, Informative

RTA. PageRankings are computed in advance and take several days. A 5x increase in speed means specialized rankings could be computed.
Re:Charge for it by ahaning · 2003-05-14 10:23 · Score: 4, Informative

But, didn't Google originate out of Stanford? Isn't it reasonable to think that the two are still pretty friendly?

(Don't you hate it when people speak in questions? Don't you? Huh?)

--
Withdrawal before climax is very ineffective and those who try this are usually called "parents."
Printer-Friendly by g00set · 2003-05-14 10:34 · Score: 2, Informative

Printer friendly version here

--
... and furthermore ... I don't like your trousers.
Re:Lets see... by jesser · 2003-05-14 10:40 · Score: 4, Informative

Google Search doesn't show hits exactly in the order of page rank. Relevance and other factors also affect order. My biggest page (the one that is my Slashdot URL) is PR7, but there are words on the page for which a lower-rank page beats me, because they're more relevant for that word. Relevance includes how many times the word appears on the page, the HTML context in which it is used, whether pages that link to it use the search terms in their link text, and the order and nearness of the words in a multi-word search without quotes.

--
The shareholder is always right.
Re:Charge for it by pldms · 2003-05-14 10:43 · Score: 3, Informative

But, didn't Google originate out of Stanford?

Yep. Originally called Backrub, curiously.

--
Slashdot looked deep within my soul and assigned
me a number based on the order in which I joined
Sepandar Rules! by ChadN · 2003-05-14 11:22 · Score: 3, Informative

I studied under the SCCM program at Stanford, and started the same year as Sepandar Kamvar. I remember him as a great guy, very smart, and an EXCEPTIONALLY good speaker and tutor (I was always pestering him for explanations of the week's lectures).

I'm glad to hear his research is getting attention, and I hope others who are interested in the theoretical aspects of data mining and web search engines will take a look at the SCCM and statistics programs at Stanford (shameless plug - others can post pointers to similar programs).

--
"It's overkill, of course. But you can never have too much overkill." - Anonymous Slashdot Coward
Licensed under U.S. Patent 4,558,302 by yerricde · 2003-05-14 11:55 · Score: 2, Informative

I can't remember the last time I paid Unisys for using a GIF...

When was the last time you bought a copy of GraphicConverter, Fireworks, Photoshop, Paint Shop Pro, or any other program licensed under U.S. Patent 4,558,302 and foreign counterparts? The price of each of those programs includes a royalty paid to Unisys.

--
Will I retire or break 10K?
Re:Lets see... by philipborlin · 2003-05-14 12:02 · Score: 3, Informative

Didn't read the article, did we? The page rank process is sped up 5x. All the pages are ranked ahead of time in a multi-day process, so when you do your search you are searching against those pre-calculated ranks. What this technology will do is allow Google to rank their pages every day (instead of once every couple of days) or create more special interest sites à la groups, images, news, etc. with the extra processing power.
Customized Pagerank by K-Man · 2003-05-14 12:17 · Score: 4, Informative

Sounds a lot like Kleinberg's HITS algorithm, circa 1997. Try Teoma for a real-world implementation.

For example, searching a sports-specific Google site for "Giants" would give more importance to pages about the New York or San Francisco Giants and less importance to pages about Jack and the Beanstalk.

Coincidence time: I used the same example in a presentation a couple of years ago to illustrate how subgroupings can be found for a single search term. Try it on Teoma, and see the various subtopics under "Refine". IIRC each of those is a principal eigenvector of the link matrix.

Topologically speaking, each principal eigenvector corresponds to a more or less isolated subgraph, e.g. the subgraph for "San Francisco Giants" is not much connected to the nest of links for "They Might Be Giants", and we get a nice list of subtopics.

(I once tried to explain this algorithm to my bosses at my former employer, which is why I have so much free time to type this right now.)
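
For anyone who wants to see the eigenvector idea concretely, here is a rough sketch of the HITS hub/authority iteration. With link adjacency matrix A, the authority scores converge to the principal eigenvector of AᵀA and the hub scores to that of AAᵀ. The function name `hits`, the fixed iteration count, and the toy graph are assumptions for illustration. This says nothing about how Teoma actually implements its "Refine" subtopics.

```python
import math


def hits(links, iterations=50):
    """Hub and authority scores for a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    def normalize(scores):
        # Scale to unit Euclidean length so the values do not blow up.
        norm = math.sqrt(sum(v * v for v in scores.values()))
        return {p: (v / norm if norm else 0.0) for p, v in scores.items()}

    for _ in range(iterations):
        # A page is a good authority if good hubs point to it.
        auth = {p: 0.0 for p in pages}
        for p, outs in links.items():
            for q in outs:
                auth[q] += hub[p]
        # A page is a good hub if it points to good authorities.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        auth = normalize(auth)
        hub = normalize(hub)
    return hub, auth


# Hypothetical link graph: three hub pages pointing at two authority pages.
graph = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a2"]}
hub_scores, auth_scores = hits(graph)
```

On a graph made of several nearly disconnected clusters, one way to get a ranking per cluster is to run the iteration on each cluster separately. That is my reading of the subtopic idea above, not something confirmed by the comment.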

--
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger