Compute Google's PageRank 5 Times Faster
Kimberley Burchett writes "CS researchers at Stanford University have developed three new techniques that together could speed up Google's PageRank calculations by a factor of five. An article at ScienceBlog theorizes that "The speed-ups to Google's method may make it realistic to calculate page rankings personalized for an individual's interests or customized to a particular topic.""
Who owns the software patent for this for the next 20 years?
What is 1/14th of a second divided by five?
That google hasn't already implemented something akin to quadratic extrapolation, or some orthogonal optimization technique. Google has come a long way since the published page rank papers 4 years back.
What if they combined extrapolation and blocking factors; they would focus on computing the pagerank of pages in groups that were logically "tight", or using subcomponents of URLS, as opposed just to domain sensitivity. To be more flexible, what if it computes a VQ-type data structure (like for doing paletted images from full-color) that is populated by the most popular "domains" of the internet according to the last pagerank, and then splits up its workload based on that?
What if they already figured that out?
In the abstract, they mention how the work is particular important to the linear algebra community. That is what their focus should be on; google is just an application/real-world-example of that research (but it may not be relevant today).
Or did they have access to the current page-rank algorithm?
Black holes are where the Matrix raised SIGFPE
According to the document, they reference the original 1998 paper on PageRank. I see a number of other references about improvements to the algorithm, but nothing specific to Google's own implementation. The paper mentions how the improvements help, but not if Google uses them.
Hence it is forward for the article author or one of the paper authors to assume these techniques will speed up Google- I'm confident their engineers have been following academic work in this area and perhaps they have already discovered these same (or orthogonal) techniques.
That is, not to say that google could not reimplement their algorithms to take in these improvements if they already have... but basing your speedup number on the 1998 algorithm and public domain mods is showy. Although it does help grab a readers attention when browsing abstracts. ^_^
Black holes are where the Matrix raised SIGFPE
Why are a public university's funds and time being used to benefit a private company? Last I checked, Google isn't a charity. Doesn't Google have its own programmers? Wouldn't these "CS Researchers'" time be better spent furthering science instead of being free labor for corporations, at the expense of taxpayers?
occultae nullus est respectus musicae - originally a Greek proverb
Um. This is about speed of *calculation* of PageRank, not speed of delivering the calculated result to you.
:)
The articles and earlier postings explain this a little more fully. Anyone who can't take the time to read them really needs to learn some patience
PageRanks are periodically calculated for the Web as a whole. The results are stored and served to users. (The periodic update is sometimes referred to as the GoogleDance.) PRs are not calculated on the fly.
Hence, a speed increase could reduce Google's required hardware investment and/or allow them to update more often (and hence pick up more topical items) and/or allow them to calculate a spectrum of regionally or topically themed PRs
The bit about customized rankings based on user profiling of some type.
Frequently when I want to refer someone to a topic of interest, I'll tell them to do a Google on (whatever) subject, and I like knowing they're seeing what I see.
If this is implemented, I hope there's a way to turn it off or assume a "joe user" standard profile for unbiased results actually based on rank popularity (the way it is now).
I DO like the 5x faster, but geez, the page load takes longer than the search already, who can complain?
-- You are in a maze of little, twisty passages, all different... --
Well, according to Moore's law (or rather observation), PageRank would become 5 times faster in a couple of years anyway.
The research was done partially with public funding from an NSF grant, yet the commercial applications are obvious and immediate.
So my question is, who sees the benefit of the research? The researchers? Can Google just jack the results and incorporate into their system?
It seems to me that the current system of allocation research dollars with public and private grants is very messy and needs overhaul.
... is quality.
I'm surprised how Google is choosing not to implement search features that would greatly enhance advanced queries.
How often I'd wish they allowed wildcards in their queries (where engl* would pull hits with england, english, etc).
Field searches still require you to add keywords, so I cannot just query "site:somesite.com" to get all the currently indexed pages from somesite.com
In this respect Altavista still produces better results, with an excelent range of fields to choose from.
If there is anything that Google is lacking, it's defenitely that.
Having said that, still my number one SE.