PageRank-Type Algorithm From the 1940s Discovered
KentuckyFC writes "The PageRank algorithm (pdf) behind Google's success was developed by Sergey Brin and Larry Page in 1998. It famously judges a page to be important if it is linked to by other important pages. This circular definition is the basis of an iterative mechanism for ranking pages. Now a paper tracing the history of iterative ranking algorithms describes a number of earlier examples. It discusses the famous HITS algorithm for ranking web pages as hubs and authorities developed by Jon Kleinberg a few years before PageRank. It also discusses various approaches from the 1960s and 70s for ranking individuals and journals based on the importance of those that endorse them. But the real surprise is the discovery of a PageRank-type algorithm for ranking sectors of an economy based on the importance of the sectors that supply them, a technique that was developed by the Harvard economist Wassily Leontief in 1941."
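To make the "circular definition" in the summary concrete, here is a minimal sketch of an iterative ranking loop in the PageRank style. It is an illustration under assumptions, not Google's implementation: the four-page graph is made up, the function name `pagerank` is hypothetical, and the damping factor of 0.85 is just the commonly cited value.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively rank pages; `links` maps each page to the pages it links to.

    Assumes every link target also appears as a key in `links`.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance
    for _ in range(iterations):
        # Every page gets a small baseline, then shares from its in-links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            outgoing = links[page] or pages  # dangling page: spread evenly
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, made up for illustration.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, C ends up on top because three pages link to it, and D ends up last because nothing links to it. A page's score depends on the scores of the pages linking to it, which is why the loop has to be repeated until the numbers settle.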
Well, this is actually pretty good advice for any developer: don't reinvent the wheel. Look around, search for what's been done before, and adapt it to suit your needs. Of course, as a last resort, you can design something new once you have done your homework and made sure nothing that already exists can be reused.
Throughout my life, I have seen an amazing amount of work done in vain because it yielded poor results and something that did the same job better already existed anyway.
Don't get me wrong here: once you have made sure that nothing already existing suits your needs or can be reused, it is fine to innovate and create genuinely new stuff. Just don't get caught trying to reinvent the wheel unless you reinvent it better ;-)
Also, an exception to that principle could be allowed for trivial tasks that are really quick to implement, where searching for an existing solution might cost more than implementing it yourself. But be really careful applying that exception; it is an open door that sometimes leads to reinventing the wheel ;-))
Everything I write is lies, read between the lines.
Nil novi sub sole
If you hate Google: Yes. If you don't: No. If you want Bananas: Get them.
Have you heard about SoylentNews?
So it could be used as prior art to invalidate Google's patent?
From my reading of the linked article, it seems that Sergey and Larry cited the prior art in their publications. So it looks like there was no plagiarism, just building a new idea using the tools provided by an earlier one.
"Maybe this world is another planet's hell"
Aldous Huxley
No, since the one from 1941 didn't say "on the internet" or "with a computer".
What really shocked me when someone first described page rank to me was that it was linear. I felt that this just had to be wrong, because it didn't seem right for a *million* inbound links to have a *million* times the effect compared to a single inbound link. Maybe this is just the elitist snob in me, but I don't feel that the latest American Idol singer is really a thousand times better than Billie Holiday, just because a thousand times more people listen to him than to her. If it were me, I'd have used some kind of logarithmic scaling. I think people do usually describe page ranks in terms of their logarithms, but that's taking the log of the final outcome. I'm talking about taking logs at each step before going on to the next iteration.
To me, this has an intuitive connection to the idea that the internet used to be more interesting and quirky, and it was more about individuals expressing themselves, whereas now it's more like another form of TV.
Of course that's not to say that I want to go back to the days before page rank. God, search engine results were just horrible in those days.
From an elitist snob point of view, one good thing about page rank is that it doesn't let you just vote in a passive way, as Nielsen ratings do for TV. In order to have a vote, you have to do something active, like making a web page that links to the page you want to vote for.
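The linear-versus-logarithmic point above can be tried out on a toy example. The sketch below is purely hypothetical: the function `iterate_scores`, the page names, and the choice of `math.log1p` as the squashing step are all assumptions made for illustration, and this is not how PageRank is actually defined. Scores are kept in "vote units" (a page with no inbound links scores 1) so that the logarithm has something to compress.

```python
import math


def iterate_scores(links, squash, damping=0.85, iterations=20):
    """Toy ranking loop; `squash` is applied to each page's inbound total
    at every step, before the next iteration uses the result."""
    scores = {page: 1.0 for page in links}
    for _ in range(iterations):
        inbound = {page: 0.0 for page in links}
        for page, outgoing in links.items():
            for target in outgoing:
                # A page splits its current score among the pages it links to.
                inbound[target] += scores[page] / len(outgoing)
        scores = {page: 1.0 + damping * squash(inbound[page]) for page in links}
    return scores


# Hypothetical graph: 1000 fan pages link to "idol", one page links to "billie".
links = {f"fan{i}": ["idol"] for i in range(1000)}
links.update({"critic": ["billie"], "idol": [], "billie": []})

linear = iterate_scores(links, lambda total: total)  # plain linear counting
logged = iterate_scores(links, math.log1p)           # log applied at each step

ratio_linear = linear["idol"] / linear["billie"]
ratio_log = logged["idol"] / logged["billie"]
print(round(ratio_linear, 1), round(ratio_log, 1))
```

With linear counting the thousand-link page scores hundreds of times higher than the one-link page, while the per-step logarithm shrinks that gap to a single-digit factor. A nonlinear update like this is not guaranteed to converge on graphs with cycles, so treat it as a thought experiment rather than a drop-in replacement.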
Find free books.
Have gnu, will travel.
allowed pages to be ranked and categorized according to whether they were "insightful," "interesting," "informative," "funny," "flamebait," or "troll."
The most amazing computer book ever. It has Doug Engelbart's first description of “augmenting the human intellect” using computers. It describes what we now know as windows (generic) with pointing devices. It has an early linear document retrieval system using page ranks based on word co-occurrences, and it has an early language translation system (Russian to English, with examples of translating Soviet missile papers). What a preview of things to come.
It is worth a read just to get into the heads of some of the computing pioneers.
Another required-reading book for all aspiring CS students should be John von Neumann’s “The Computer and the Brain.” Dated, but again, this is what they were thinking.
We have a lot to be humble about given the hardware and compilers they had to work with. Not to mention primitive development environments, a.k.a. the card punch.
I'd better get a rush on a patent for "using pagerank on the internet" then. Take that, Google.
Sent from my PDP-11
The summaries were intriguing but lame. Here's the real thing (preprint):
http://arxiv.org/abs/1002.2858
Author's page is here:
http://users.dimi.uniud.it/~massimo.franceschet/
Interesting stuff.
Dude, it's just Jacobi iteration applied to a particular, reasonably intuitive model. This isn't to knock it, just to say that it was probably easier to reinvent it than to find some obscure paper, especially one which probably isn't in the electronic databases.
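For anyone who wants to see the Jacobi connection spelled out, here is a rough sketch. It assumes the usual damped formulation, where the rank vector p solves (I - d*M) p = (1 - d)/n * ones, with M the column-normalized link matrix. The four-page graph, the damping value d = 0.85, and the helper name `jacobi` are all assumptions for illustration. When no page links to itself, the diagonal of I - d*M is all ones, so each Jacobi step is the same as the familiar "baseline plus damped shares from in-links" update.

```python
def jacobi(A, b, iterations=100):
    """Solve A x = b by Jacobi iteration (assumes a nonzero diagonal)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Each component is updated from the previous iterate only.
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x


# Hypothetical four-page web: links[q] lists the pages that page q links to.
pages = ["A", "B", "C", "D"]
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
d = 0.85  # damping factor (an assumed, commonly used value)
n = len(pages)

# M[i][j] = 1/outdegree(j) if page j links to page i, else 0.
M = [[1.0 / len(links[q]) if p in links[q] else 0.0 for q in pages] for p in pages]

# PageRank as a linear system: (I - d*M) p = (1 - d)/n * ones.
A_sys = [[(1.0 if i == j else 0.0) - d * M[i][j] for j in range(n)] for i in range(n)]
b = [(1.0 - d) / n] * n

p = jacobi(A_sys, b)
print({page: round(value, 3) for page, value in zip(pages, p)})
```

The iteration converges here because every column of d*M sums to 0.85, so the error shrinks by at least that factor on each pass.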