The Math Behind PageRank
anaesthetica writes "The American Mathematical Society is featuring an article with an in-depth explanation of the type of mathematical operations that power PageRank. Because about 95% of the text on the 25 billion pages indexed by Google consist of the same 10,000 words, determining relevance requires an extremely sophisticated set of methods. And because the links constituting the web are constantly changing and updating, the relevance of pages needs to be recalculated on a continuous basis."
But 9,000 of those words are slang for parts of the human anatomy. Go figure.
I have sites with a PR of 6, and I can tell you that they got that way because of inbound links from other sites. In fact, when other sites dropped those links, my PR dropped (to 5, and even to 4). Getting more inbound links brought the PR back.
Think about those links, too. How often do you use common words in an HREF? I don't think there's a lot of weeding out of common words since the link to a site is usually either its name, or a description containing some important keywords.
I love seeing these technoscientists think they understand PageRank, but just like TimeCube, they're way, way off.
The article specifically says the PageRank eigenvector is only recalculated once a month, approximately. Even though Google uses some clever numerics to calculate the eigenvectors to a 25 billion by 25 billion matrix by iteration, it still takes several hours to finish.
It seems like it would be the nouns, pronouns, etc. that Google should be paying attention to. Who cares about all the verbs, adjectives, etc. that just muddy the indexing waters?
I skimmed the article and didn't find what I wanted to find. If you make a webpage that you want ranked high, what do you do? Do you make 100 geocities accounts and provide links to your main website, or what? I'm just wondering this out of curiosity, not out of need.
God spoke to me.
As a self proclaimed SEO expert - I honestly don't believe PageRank counts nearly as much as it did a few years ago! You'll find lots of PR5 sites ahead in the SERPS of PR9 sites!
LINUX ONLINE POKER: Linux Poker
Because about 95% of the text on the 25 billion pages indexed by Google consist of the same 10,000 words, determining relevance requires an extremely sophisticated set of methods.
They use a set of nested if-else statements
*ducks*
Interestingly enough, google thinks so, too.
Of course, yahoo has its own opinion.
Although, altavista seems to almost agree. Check the second non-advertised result.
I do find this amusing though. Third place, how humble.
I didn't expect such interesting results. The site with the search term in its url was tops for av and yahoo, but not google. Yahoo ranked the wiki entry above google, but av reversed that decision, google of course thought itself was more important than the wiki. Google's own reference site was number one in its own search and near the top in the other two, but pagerank.net wasn't even in the top 10 for google's search. I'm not sure what conclusions can be drawn from all that, but it is definitely food for thought.
There are many tongues to talk, and but few heads to think. -Victor Hugo
ORDER BY adcost DESC
bite my glorious golden ass.
I think we can get four or five tomorrow.
My grandmother used anecdotal evidence all the time, and she lived to be 120 years old.
For a different, somewhat more technical, but more succint discussion, Cleve Moler [of Matlab fame] wrote another view of this topic, about 5 years ago.
The math is the same, of course, but two points of view may provide a greater sense of perspective. So to speak. And Cleve is always worth listening to.
I've seen links on google searches that don't exist anymore but were ranked highly when they DID exist and still exist in the top 10 of the query. What happens to those? Do they stay at their ranking till they get overtaken by other more popular pages on the same search? Get their ranking slowly reduced because they don't exist?