Slashdot Mirror


The Math Behind PageRank

anaesthetica writes "The American Mathematical Society is featuring an article with an in-depth explanation of the type of mathematical operations that power PageRank. Because about 95% of the text on the 25 billion pages indexed by Google consists of the same 10,000 words, determining relevance requires an extremely sophisticated set of methods. And because the links constituting the web are constantly changing and updating, the relevance of pages needs to be recalculated on a continuous basis."

11 of 131 comments

  1. Pagerank is cool by pap3rw8 · · Score: 0, Interesting

    Whatever google's doing with PageRank, it seems to be doing it right. At least from my experience.

    1. Re:Pagerank is cool by silentounce · · Score: 5, Interesting

      Interestingly enough, google thinks so, too.

      Of course, yahoo has its own opinion.
       
      Although, altavista seems to almost agree. Check the second non-advertised result.
       
      I do find this amusing though. Third place, how humble.
       
      I didn't expect such interesting results. The site with the search term in its url was tops for av and yahoo, but not google. Yahoo ranked the wiki entry above google, but av reversed that decision; google, of course, thought itself more important than the wiki. Google's own reference site was number one in its own search and near the top in the other two, but pagerank.net wasn't even in the top 10 for google's search. I'm not sure what conclusions can be drawn from all that, but it is definitely food for thought.

      --
      There are many tongues to talk, and but few heads to think. -Victor Hugo
  2. Bad summary by Knights+who+say+'INT · · Score: 5, Interesting

    The article specifically says the PageRank eigenvector is only recalculated once a month, approximately. Even though Google uses some clever numerics to calculate the eigenvectors of a 25 billion by 25 billion matrix by iteration, it still takes several hours to finish.
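    For anyone wondering what "by iteration" looks like in practice, here is a minimal sketch of power iteration on a toy link graph. It is not Google's code: the four-page graph, the function name `pagerank`, the tolerance, and the damping factor of 0.85 (the value usually quoted for PageRank) are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate PageRank scores by power iteration.

    links maps each page to the list of pages it links to.
    damping=0.85 is an assumed value, not one taken from the thread.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform guess
    for _ in range(max_iter):
        # Rank sitting on pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every page passes an equal share of its rank along each outgoing link.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        change = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if change < tol:  # stop once the vector has stopped moving
            break
    return rank


# Hypothetical four-page web: page "d" links out but nothing links to it.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

    Each pass is one multiplication of the link matrix by the current rank vector, which is why the cost grows with the number of links and why a 25 billion page matrix takes hours rather than seconds.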

  3. I joke a lot on Slashdot, but serious question by CrazyJim1 · · Score: 2, Interesting

    I skimmed the article and didn't find what I wanted to find. If you make a webpage that you want ranked high, what do you do? Do you make 100 geocities accounts and provide links to your main website, or what? I'm just wondering this out of curiosity, not out of need.

    1. Re:I joke a lot on Slashdot, but serious question by TheLink · · Score: 2, Interesting

      "if you break Google's rules about displaying the same content to bots as to humans"

      I notice many sites that do that and don't get slapped down - esp subscription sites. And it seems Google doesn't cache those, so it's probably collusion.

      You see the keywords and paragraphs in the search, but when you click on it you get a login page.

      They should have to pay a special rate or be marked differently from the other search results. It's a waste of time otherwise.

    2. Re:I joke a lot on Slashdot, but serious question by oni · · Score: 4, Interesting

      I notice many sites that do that and don't get slapped down - esp subscription sites.

      I wonder, if I changed my useragent to be whatever the googlebot reports itself to be - would I get by the registration screen on websites like the NYTimes??

    3. Re:I joke a lot on Slashdot, but serious question by suggsjc · · Score: 2, Interesting
      Here is an email with associated response I received from Google on roughly this topic.

      This is a very general question. I'm creating a website. It is going to be a blogging platform. Obviously, the content of the site(s) is the most important thing. I've already started making the content of my site dynamic in the sense that I tailor it to the requesting agent (via the user-agent header). My intention for doing this is to make sure that the content renders correctly for *any* browser that accesses the site. I've built the site modularly, so tailoring the content to the requesting agent isn't a big deal. However, that leads me to my question(s) and the reason I am emailing you. FYI, I have no ulterior motives for being able to tailor my content, other than making sure that the user gets the most useful information.

      That said, when a "bot" (i.e. your crawler) accesses my site, I'm going to treat it like I would a mobile browser. I'm going to give it minimal markup, and the CSS will be very simple. I'm going to make sure that my content comes before my navigation, advertisements, etc. in the source.

      My real question is does the fact that I'm presenting you the content of my site differently from other browsers make a difference? If so, (then again my reasoning is to make sure that my users get the correct content) how do I prevent this from hurting me in your rankings? If not, then how do you protect yourself from other sites taking advantage of this "hole"? Meaning I could make my site appear legit when I knew your bots were crawling me, but give "alternate" content when real users were visiting.
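
      As a rough illustration of the setup described in this email, the sketch below picks a layout from the User-Agent header while serving the same article text to everyone. The token lists, the function names `classify_agent` and `render`, and the markup are all hypothetical; real crawler detection is more involved, and nothing here says how Google actually treats such pages.

```python
# Hypothetical substrings used to bucket a User-Agent string.
BOT_TOKENS = ("googlebot", "bingbot", "slurp")
MOBILE_TOKENS = ("mobile", "android", "iphone")


def classify_agent(user_agent):
    """Return 'bot', 'mobile', or 'desktop' for a User-Agent header value."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in BOT_TOKENS):
        return "bot"
    if any(token in ua for token in MOBILE_TOKENS):
        return "mobile"
    return "desktop"


def render(article_html, nav_html, user_agent):
    """Serve identical content to every agent; only the layout changes."""
    kind = classify_agent(user_agent)
    if kind in ("bot", "mobile"):
        # Minimal markup, with the content ahead of the navigation.
        return "<article>%s</article>\n<nav>%s</nav>" % (article_html, nav_html)
    # Full layout for desktop browsers: navigation first, then the content.
    return '<nav class="sidebar">%s</nav>\n<article class="main">%s</article>' % (
        nav_html,
        article_html,
    )


crawler_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
desktop_ua = "Mozilla/5.0 (X11; Linux x86_64) Firefox/2.0"
bot_page = render("Hello, world", "Home | About", crawler_ua)
desktop_page = render("Hello, world", "Home | About", desktop_ua)
print(bot_page)
print(desktop_page)
```

      The point of the sketch is the distinction the email is asking about: both branches of `render` contain the same article text, so only the presentation differs. The "hole" would be a branch that returned different text to the crawler.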

      Last question. Do you have any idea when your ads will work correctly with xhtml?

      Hope you weren't expecting a straightforward answer (like I was), because here is what I got back:

      Hello Jonathon,

      Thanks for your email about the website you're creating.

      First, since you asked when our program will support XHTML, I wanted to
      let you know that we're unable to say if we'll support XHTML pages in the
      future.

      While the AdSense team isn't able to answer your questions about your
      site's ranking in the Google search index, I'd recommend visiting
      http://www.google.com/support/webmasters . I also wanted to let you know
      that our advertising programs are independent of our search results.
      Participation in AdWords and AdSense doesn't affect inclusion or ranking
      in the Google search index.

      I've also included answers to some of the most common questions AdSense
      publishers have asked.

      How can I improve my site's ranking?
      Answer:
      http://www.google.com/support/webmasters/bin/answer.py?answer=34432&hl=en_US

      How do I add my site to Google's search results?
      Answer:
      http://www.google.com/support/webmasters/bin/answer.py?answer=34397&hl=en_US

      My site is no longer included in the search results. What happened?
      Answer:
      http://www.google.com/support/webmasters/bin/answer.py?answer=34443&hl=en_US

      Why doesn't my site show up for a specific keyword?
      Answer:
      http://www.google.com/support/webmasters/bin/answer.py?answer=34434&hl=en_US

      For additional questions, I'd encourage you to visit the AdSense Help
      Center (http://www.google.com/adsense_help), our complete resource center
      for all AdSense topics. Alternatively, feel free to post your question on
      the forum just for AdSense publishers: the AdSense Help Group
      (http://groups.google.com/group/adsense-help).

      Sincerely,

      Jake
      The Google AdSense Team
      --
      When I have a kid, I want to put him in one of those strollers for twins and then run around the mall looking frantic.
  4. Does PageRank count? by matr0x_x · · Score: 2, Interesting

    As a self-proclaimed SEO expert - I honestly don't believe PageRank counts nearly as much as it did a few years ago! You'll find lots of PR5 sites ahead of PR9 sites in the SERPs!

    --
    LINUX ONLINE POKER: Linux Poker
  5. Re:PageRank doesn't seem to be based on keywords by Anonymous Coward · · Score: 3, Interesting

    It's not secret.

  6. Re:The two that matter by Anonymous Coward · · Score: 1, Interesting

    There are only two that really reflect the power of Pagerank: Click here.
    About 1.2 billion pages, and surprise surprise, Acrobat Reader tops the list, followed by a who's who of internet applications and plugins. But around result #30 it gets a bit more interesting, and when you're a few dozen pages in, "new patterns begin to emerge."

    And to explain why not to use "click here", I found this buried on page 45. Thanks for the proof pudding guys, it's delicious.

  7. Pages that don't exist anymore by namco · · Score: 2, Interesting

    I've seen links in google searches to pages that don't exist anymore but were ranked highly when they DID exist, and they still appear in the top 10 for the query. What happens to those? Do they stay at their ranking till they get overtaken by other more popular pages on the same search? Or do they get their ranking slowly reduced because they don't exist?