Slashdot Mirror


Compute Google's PageRank 5 Times Faster

Kimberley Burchett writes "CS researchers at Stanford University have developed three new techniques that together could speed up Google's PageRank calculations by a factor of five. An article at ScienceBlog theorizes that "The speed-ups to Google's method may make it realistic to calculate page rankings personalized for an individual's interests or customized to a particular topic.""

13 of 140 comments

  1. Re:Lets see... by deadsaijinx* · · Score: 4, Insightful

    That's exactly what I thought. But as Google is a HUGE international organization, it makes loads of sense for them. That's 5x the traffic they can feed, even though you won't see a noticeable difference.

    --
    YOU SUCK BALLS!
  2. Personalized PageRanks is from the dbpubs Abstract by malakai · · Score: 4, Insightful
    I have no idea what the hell they are talking about, but even I read this in one of the abstracts:
    The web link graph has a nested block structure: the vast majority of hyperlinks link pages on a host to other pages on the same host, and many of those that do not link pages within the same domain. We show how to exploit this structure to speed up the computation of PageRank by a 3-stage algorithm whereby (1) the local PageRanks of pages for each host are computed independently using the link structure of that host, (2) these local PageRanks are then weighted by the "importance" of the corresponding host, and (3) the standard PageRank algorithm is then run using as its starting vector the weighted aggregate of the local PageRanks. Empirically, this algorithm speeds up the computation of PageRank by a factor of 2 in realistic scenarios. Further, we develop a variant of this algorithm that efficiently computes many different "personalized" PageRanks, and a variant that efficiently recomputes PageRank after node updates.
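
    For readers who want to see the three stages quoted above in concrete form, here is a minimal Python sketch. It is not the researchers' code: the function names (pagerank, blockrank), the toy five-page graph, the damping factor of 0.85, and in particular the way host "importance" is computed (PageRank on a host-level graph whose edges are weighted by local rank) are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(out_weights, damping=0.85, tol=1e-10, max_iter=1000, start=None):
    """Power iteration on a weighted graph {node: {target: weight}}.

    Returns (rank dict, iterations used). Every target must also be a key.
    """
    nodes = sorted(out_weights)
    n = len(nodes)
    rank = dict(start) if start is not None else {u: 1.0 / n for u in nodes}
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Rank held by pages with no out-links is spread over all pages.
        dangling = sum(rank[u] for u in nodes if not out_weights[u])
        base = (1.0 - damping) / n + damping * dangling / n
        new = {u: base for u in nodes}
        for u in nodes:
            total = sum(out_weights[u].values())
            for v, w in out_weights[u].items():
                new[v] += damping * rank[u] * w / total
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank, iterations


def blockrank(links, host_of, damping=0.85, tol=1e-10):
    """Three-stage sketch: local ranks, host weighting, warm-started PageRank."""
    pages = sorted(links)
    hosts = sorted({host_of[p] for p in pages})

    # Stage 1: local PageRank per host, using only links inside that host.
    local = {}
    for h in hosts:
        members = [p for p in pages if host_of[p] == h]
        sub = {p: {q: 1.0 for q in links[p] if host_of[q] == h} for p in members}
        local.update(pagerank(sub, damping, tol)[0])

    # Stage 2 (assumed weighting): host importance is PageRank on a host graph
    # whose edge weights add up the local rank flowing along each page link.
    host_graph = {h: defaultdict(float) for h in hosts}
    for p in pages:
        for q in links[p]:
            host_graph[host_of[p]][host_of[q]] += local[p] / len(links[p])
    host_rank = pagerank(host_graph, damping, tol)[0]

    # Stage 3: standard PageRank, started from the weighted aggregate.
    start = {p: local[p] * host_rank[host_of[p]] for p in pages}
    full = {p: {q: 1.0 for q in links[p]} for p in pages}
    return pagerank(full, damping, tol, start=start)


# Hypothetical toy web: two hosts, mostly intra-host links.
links = {
    "a.com/1": ["a.com/2", "a.com/3"],
    "a.com/2": ["a.com/1"],
    "a.com/3": ["a.com/1", "b.org/1"],
    "b.org/1": ["b.org/2"],
    "b.org/2": ["b.org/1", "a.com/1"],
}
host_of = {page: page.split("/")[0] for page in links}

block_ranks, block_iters = blockrank(links, host_of)
plain_ranks, plain_iters = pagerank({p: {q: 1.0 for q in links[p]} for p in links})
print("iterations with block start:", block_iters, "vs uniform start:", plain_iters)
```

    The starting vector only changes how many iterations the final stage needs, not the answer it converges to, so both runs should agree on the final ranks. On a toy graph this small the iteration counts say nothing about the factor-of-2 figure in the abstract.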


    What they mean by 'personalized' I can't tell you, as I have not read through the entire PDF. But I wouldn't chastise the Slashdot editors over this. If there is some sort of differential algorithm that can be applied to the larger PageRank to create smaller personalized PageRanks, it might not be so far-fetched to think this could be done in real time on an as-needed basis, at some point in the future, using these algorithm improvements.

    I know that's a lot of optimism for a slashdot comment, but call me the krazy kat that I am.

    -Malakai
  3. Personal recommendations for news by costas · · Score: 4, Insightful

    In my view, personal recommendations from a search engine are mostly valuable for topical content, i.e. news items. However, the optimizations from these papers don't sound to me like they can do much for this case: news items pop up on a news site, and re-indexing the news source itself (say, the front page of CNN) won't tell you much about a particular CNN story.

    At any rate, personal news recommendations are a favorite topic of mine, which is why I built Memigo: to create a bot that finds news I am more likely to like. Memigo learns from its users collectively and from each user individually, and BTW, it predates Google News by a good 6 months, IIRC. The Memigo codebase (all in Python) is now up to the point where it can start learning what content each user likes... If you like Google News you'll love Memigo.

    And BTW, I did RTFA when it was on memigo's front page this morning :-)...

  4. Does speed matter? by zbowling · · Score: 2, Insightful

    I remember when Yahoo.com flaunted all over the place how it would load in under 3 secs on a 28.8 modem. Now you visit them and you get big images, Flash, Java, and other massive bandwidth eaters.

    Does it really matter anymore? More and more users seem to be using broadband, and those who don't have at least a 56k modem (which can only go up to 53k because the all-wonderful FCC wants to be able to decode it if they tap your line). Does it really matter, though? Google is fast and simple, so it loads on any kind of browser on the planet (even Lynx and PalmOS). Most searches for me come up in under 2.3 secs (half is spent searching and the other half downloading). Anyone who can't wait that long really needs to learn some patience. Zac

    --
    No.
    1. Re:Does speed matter? by Slurpee · · Score: 4, Insightful

      I'm sorry, but haven't you totally missed the point of the article?

      The proposed speed increase is TO THE PAGE RANKINGS, not to your searching! By the time you search, all page rankings have been done.

      This has nothing to do with the speed of your search or the weight of the web page (unless I missed something).

  5. Assumptions on PageRank by sielwolf · · Score: 3, Insightful

    I feel your assumption is wrong. It would be foolish to assume that the eigenvectors and eigenvalues they derive from one PageRank will generally hold in a space as dynamic as the World Wide Web. Sure, slashdot.org will probably maintain the same sort of authority and hub value... but what happens as terms change? A flurry of "blog" articles one month may make /. an authority... but what happens when the infatuation ends?

    We have already seen the effects of Google-bombing and Google-washing. The strength of PageRank is that it is objective in terms of the current state of the WWW. It makes no assumptions about the shape of the data. As a term takes on new meaning (see "second superpower"), PageRank stays current. A new definition may bubble up to the top for a term for a month but then disappear as the linkage structure of the web phases it out (i.e. blogs talk about it less, less interconnectivity, less appearance at "hub" nodes).

    Numerically, PageRank is an iterative search for the dominant eigenvector of the link matrix, like repeatedly updating a Markov chain. It is a nice application of linear algebra. Because it is a matrix operation, it is highly parallelizable. There are also many speedups available for matrix multiplication, such as eliminating redundant calculations and reordering operations (as anyone who has taken a CS algorithms course knows).

    But to assume a stability from one calculation to the next could lead, over time, to the very inaccuracies Google was built to overcome. There is a lot of research in mining web data. There have been several academic improvements to PageRank, along with improvements to related algorithms such as Kleinberg's and LSI. It is well within reason that these were simply applied to the Google app.
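
    The Markov-chain view of PageRank described in this comment can be sketched as plain power iteration. This is the textbook formulation, not Google's production code; the damping factor of 0.85, the function name power_iteration_pagerank, and the four-page graph are assumptions chosen for illustration.

```python
def power_iteration_pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Repeatedly apply the link matrix to a rank vector until it stops changing.

    links maps each page to the list of pages it links to; every target
    must also appear as a key. Returns (rank dict, iterations used).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Pages without out-links hand their rank to everyone equally.
        dangling = sum(rank[p] for p in pages if not links[p])
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # One step of the Markov chain: each page splits its rank over its links.
        for p in pages:
            if links[p]:
                share = damping * rank[p] / len(links[p])
                for q in links[p]:
                    new[q] += share
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:  # the vector has (numerically) reached the fixed point
            break
    return rank, iterations


# Hypothetical four-page web: "c" is linked to by every other page.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks, steps = power_iteration_pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

    Each pass is a sparse matrix-vector product, which is why the computation parallelizes well. It also shows why the result depends on the link structure at the time of the run: change the links and the fixed point moves.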

    --
    What is music when you despise all sound?
  6. Re:CmdrTaco, ScienceBlog editor? by bergeron76 · · Score: 2, Insightful

    Right, but couldn't people be stereotyped? This could be an abstraction of "individualized".

    --
    Don't think that a small group of dedicated individuals can't change the world. It's the only thing that ever has.
  7. A true test of our devotion to Google by SlashdotMirrorer · · Score: 2, Insightful

    What will be interesting to see is whether Google will implement the improvements to the algorithm. This is, of course, a given, so long as the researchers haven't gone for a patent and it really has a 5x speedup. The only questions are what additional hardware would be needed and how much development effort it will take to integrate it. I doubt Google will simply ignore the research.

    What will really be interesting to see is whether they decide to use it the way the researchers recommended, bringing the power of ranking down to individual users with preferences. On one hand, they can boost performance and cut costs and have a little more green in their pockets from ads. On the other, they can maintain the sort of "geek cred" they've had up to this point, adding interesting features here and there, and take it the next mile by really adding something nice and useful.

    Also, for bonus points, will they see personalization as a money making opportunity, selling personal information and/or aggregated preferences?

  8. Bullshit by NineNine · · Score: 4, Insightful

    These researchers are all full of shit. Why? Nobody outside of Google knows how PageRank works, exactly. And let me tell you, if anybody did, they could make themselves millionaires overnight. There are groups of people who do nothing but try to tackle Google, and very few people successfully crack the magic formulas. And those who do make a quick buck, but then Google changes it again once people catch on. They didn't improve PageRank because they don't know how it works... they're just guessing how it works.

    1. Re:Bullshit by Klaruz · · Score: 5, Insightful

      Umm... For the most part Stanford Researchers == Google Researchers.

      Google came about from a Stanford research project. There's a good chance the people who are responsible for the speedup either already knew about PageRank from working with the founders, or signed an NDA.

      I haven't read the article, but I bet it hints at that.

    2. Re:Bullshit by fallout · · Score: 3, Insightful
  9. When Will Google Become Self-Aware? by Greyfox · · Score: 0, Insightful
    Google indexes a huge amount of everything. When will it become self-aware?

    If not self-aware, couldn't it be used to calculate solutions to traditional problems, perhaps by trying to find pages in an order that works from a stated problem to a stated solution?

    I'd think such a huge index of data could be useful to the AI people...

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

  10. Personalized? Rather not! by jfreon · · Score: 2, Insightful
    I'd rather have a clean search than a prejudiced search based on my past searches. Who knows what I'm really interested in that day? Surely not Google.

    And don't call me Shirley!