Google's PageRank Predicts Nobel Prize Winners
KentuckyFC writes "The pattern of citations between scientific papers forms a network that has remarkable similarities to the network formed by the web. So why not use Google's PageRank, the world's most effective search algorithm, to rank these papers in the same way it ranks websites? That's exactly what a couple of US researchers have done for physics papers published by the American Physical Society since 1893 (abstract). The results make interesting reading because almost all of the top ten papers resulted in (or were linked to) Nobel Prizes for their authors. Which means that studying the up-and-coming entries on the list ought to be a good way of predicting future winners. Better get your bets in before the bookies get wind of this."
Preparing for an inundation of people citation bombing each other in 3 ... 2 ... 1 ...
They take bets about this kind of thing?
Mon chien, il n'a pas du nez. Comment scent-il? Très mauvais!
Did the star make the movie a hit, or did the movie make the star?
For 'prediction' to be valuable, it has to work with citations that were linked *before* the paper got the Nobel.
So even by this article's measure, Nicola Cabibbo has been shown to deserve the Nobel Prize:
http://en.wikipedia.org/wiki/Nicola_Cabibbo
Seriously, like this is some kind of weird correlation. No shit Nobel prize winning papers would have excellent page ranks.
Yes, it happens all the time: the Swedish Academy can change their vote any time, if it feels pressed by the media.
Plain old sigh.
They assume that interest in someone's published work is the same whether they are a Nobel prize winner or not. That is simply not true: papers written by Nobel prize winners will generate more links and have a higher rating, just because they recently won the prize.
The algorithm for Google PageRank is based on the concept of citations from academia. If I remember correctly, the software was originally meant only to index academic papers and eventually grew to index the whole internet. So it's not surprising that it predicts winners so well (depending on how much the Nobel committee weights citations in their decisions).
Why doesn't Slashdot ever get slashdotted?
I hope they never award the Nobel Prize based strictly on this. It could be a good way of pointing people in the right direction, but it will also let in a bunch of crap.
The last thing we need is scientists Googlebombing their papers (or creating junk networks to increase page ranks). I bet the Creationists would have a field day with this. "Look, our theories have scientific basis, check out our CiteRank".
Technology is a tool; it should never replace human intelligence.
Can't predict the weather? Weather forecasting is one of the few areas where computers have been an undisputed improvement. Short-term forecasts these days are pretty good ( has some information on ways the accuracy of forecasts can be measured).
Weather is the odd one out because all the other variables are influenced by the prediction made. Expectations of risk (or correlation of currency movements, or default rates on loans) affect the actions of other players in the market. But weather forecasts do not affect the weather.
-- Ed Avis ed@membled.com
Bowel movements would be pretty easy to predict tbh. You just get the Android app to track your bowel movements, it'll upload it to a google appliance gizmo that creates a trend.. maybe some input function to add in the primary sections of your diet (for instance, you ate something with a little more fat or fiber.. etc..)
----- The internet has given everyone the ability to have their voice heard equally as loud.. even if they shouldn't be
That was my reaction as well. It only works if you base it on publications prior to them winning the Nobel Prize. Of course people are going to reference the papers after the Prize. Citing a Nobel winner gives a certain boost to credibility.
There's nothing wrong with computer models; without them we'd never get any high end engineering done.
However, the model can't be better than its underlying assumptions, and here I think that they've confused the relationship.
>That's exactly what a couple of US researchers have done for physics papers published by the American Physical Society since 1893
I wasn't aware that Google's PageRank existed in 1893.
It didn't. But they don't just throw away published papers. Those papers tend to sit around on a dusty shelf, forgotten in a library (unless they're really well-cited). Or in an archive (most likely).
Disclaimer: The opinions and actions of the US Gov't are in no way representative of those held by this author or its ci
Note that they're not looking at webpage referrals, but citations in other scientific papers. Rather than simply counting citations, they're weighting the citations by the number of citations the citing papers received. Thus, if your paper is cited by a paper which is very popular, then your paper will get a boost to its citation score.
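To make the weighting concrete, here is a minimal Python sketch of a PageRank-style iteration over a toy citation graph. The paper names, the `pagerank` function, and the damping value of 0.85 (the figure usually quoted for the web) are assumptions for illustration, not the researchers' actual code or settings.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """Rank papers; `citations` maps each paper to the list of papers it cites."""
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)

    # Start with every paper equally important.
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            cited = citations.get(p, [])
            if cited:
                # A paper passes its rank on, split evenly among the papers it cites.
                share = damping * rank[p] / len(cited)
                for q in cited:
                    new_rank[q] += share
            else:
                # A paper that cites nothing spreads its rank over all papers.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: X and Y are each cited exactly once,
# but X is cited by "hub", which is itself cited three times.
toy_citations = {
    "hub": ["X"],
    "obscure": ["Y"],
    "p1": ["hub"],
    "p2": ["hub"],
    "p3": ["hub"],
}

ranks = pagerank(toy_citations)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 4))
```

In this toy graph X and Y have the same raw citation count, yet X ends up ranked above Y because the paper citing it is itself well cited.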
Not having read the actual paper, the following question comes to mind: did they include only the period of time *before* the physicists got their Nobels? Because if they included the citations after that - yeah, I imagine those authors got quite a few citations being Nobel Prize winners and all...
Let's see, so far, computer models have failed to accurately manage loan portfolios to higher risk buyers, failed to manage risk books for hedge funds, could not capture currency trading, can't predict the weather and are probably wrong about climate. Sure, let's have them predict nobel prize winners while we are at it!
Actually, using it to predict Nobel prize winners would be a silly use.
But it would be quite useful to allow scientists to focus their research, find all the tidbits, and maybe turn up some small extra bit that they may have missed otherwise.
That's true, unless this algorithm only searches through papers linked before the corresponding announcement--which is what my first thought was on seeing the summary. I did not RTFA, though.
The meek may inherit the earth, but the strong shall take the stars.
Let's see, so far, computer models *can* predict the weather and are probably *right* about climate.
But unsurprisingly they have failed to accurately manage loan portfolios to higher risk buyers, failed to manage risk books for hedge funds, and could not capture currency trading, simply because those markets are not predictable: they are traded by panic-driven idiots who are swayed by rumour, non-existent trends, and computer predictions!
The predictable but complex is predictable; the unpredictable ... is unpredictable, no matter what the overpaid consultant says!
Puteulanus fenestra mortis
See, what did I tell ya? Google lets their employees work a bit on odd experiments, and this is the kind of thing it may lead to. (Will Microsoft compete with Microsoft Bowel 2.0 ?)
Table-ized A.I.
Top 40 music singles chart predicts highest-selling singles of the week with astounding precision!
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
The next step is obviously to let PageRank select the Nobel winners and cut out the middleman.
Anyone else really get tired of the friggin tags for a lot of these stories? CorrelationIsNotCausation (this meme really needs to go; saying it doesn't make you sound smart when it makes no sense or is bleedingly obvious), and BecauseItWillGetGamed? GTFO. How the hell do you, as a scientist, game the entire spectrum of academic publishing to get yourself voted a Nobel prize winner without, you know, maybe actually doing some good science (and having it further recognized by being cited heavily by peers)? The tags are next to useless unless they are good as flamebait (yes, I am aware of the irony).
I think my joke went over everyone's head. :-\
And of course the results of their experiment are submitted in the form of a research paper. Hmm, I wonder...
The original paper doesn't really discuss the connections with Nobel prizes - it mentions as an aside that one paper was cited for a Nobel prize - as it's concerned not with predicting Nobel laureates but with evaluating the importance of papers. Therefore any conclusions about predicting Nobel winners are without merit until further analysis is performed.
>panic-driven idiots who are swayed by rumour, non-existent trends, and computer predictions!
Sounds like we should be using Macs to predict the economy -- that's their main source of operating power anyway :-p
That's not completely true. You can use all citations to create a regression model (or structural equations model or whatever other statistical method you use) that is used to "predict" past and future prize winners. It's really hard to explain in this setting but it's basically using all the aggregate data to create your regression equation, then checking to see if the regression equation was a good fit to the data. From there you should hopefully be able to predict future winners with some degree of accuracy.
I'm not sure if the authors used a method like that or not - I skimmed the original article but don't have time to spend more time on it. In any case, it's not uncommon to use "post" data to help predict "pre" data. That's how you set up a model. Further, it's helpful to be able to use all the "post" data to help you know the size of the error of your prediction. I know I wasn't terribly clear but statistical modeling isn't as straightforward as it might seem.
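As a rough illustration of "fit on all the data, then check the fit", here is a minimal Python sketch using ordinary least squares on one variable. Plain least squares stands in for whatever model one would actually use, and the `fit_line` and `r_squared` helpers and the numbers are made up for the example; they are not taken from the paper.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up data: a hypothetical citation score (xs) against some outcome measure (ys).
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
outcomes = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(scores, outcomes)
fit_quality = r_squared(scores, outcomes, slope, intercept)
print(slope, intercept, fit_quality)

# Use the fitted equation to "predict" an outcome for a new, unseen score.
predicted = slope * 6.0 + intercept
print(predicted)
```

A good in-sample fit only says the equation describes the data it was built from; the size of the residuals is what gives a handle on the error of a new prediction.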
It would be quite logical for the Nobelists to get considerably more exposure for the mere fact they won the prize. I would think merely referencing a paper from an author who'd made it up there would give your own research more attention than it would otherwise.
This would be quite obvious, but then again what is Google for anyway?
---
Have you read the Terms of Service lately?
It didn't, but it really wasn't all that funny, either.
You mean people who write good papers get Nobel prizes? Wow!
Also, I didn't know that people who won Nobel prizes for fundamental discoveries won't post facto get gratuitous citations in the first line of the introduction of every subsequent paper in the field.
Page Rank captures whatever is 'sensational', in every domain of human activity. Having RTFA, I conclude that if all that is sensational is good, then what we have here is an empirical demonstration of circular reasoning. If all that is good need not be sensational, we simply have misleading anecdotal evidence.
>Let's see, so far, computer models have failed to accurately manage loan portfolios to higher risk buyers, failed to manage risk books for hedge funds, could not capture currency trading, can't predict the weather and are probably wrong about climate. Sure, let's have them predict nobel prize winners while we are at it!
Well, it's certainly easier than trolling Nostradamus' quatrains in search of a prediction, now isn't it? ;)
My blog
The foundation for the work of Messrs. Maslov and Redner was laid by Hari Seldon, who discovered that "while one cannot foresee the actions of a particular individual, the laws of statistics as applied to large groups of people could predict the general flow of future events." The recent paper by Messrs. Maslov and Redner represents the smallest corpus to which Seldon's theory has been successfully applied to date.
Further applications of these techniques to this same corpus will likely fall afoul of Seldon's second axiom: "the population should remain in ignorance of the results of the application of" the analysis.
"We reject as false the choice between our safety and our ideals." --The American President (20.1.2009)
There's a tool that tries to create a network of reviews, rather than just citations. In this case, the reviewer actually specifies the level of endorsement, whereas citations can mean anything. One of the most common reasons to cite a paper is to say "Our idea is way better than this lame idea", or "These guys did something similar, but it comparatively sucked". Sometimes the worst implementations get cited the most because they are so easy to improve upon. Why should that build up a paper?
Sorry, but links do not make a Nobel.
Excuse me, but please get off my Pennisetum Clandestinum, eh!
If I eat at Baja Surf, predict bowel movement within 5 minutes of leaving restaurant?
>Let's see, so far, computer models *can* predict the weather and are probably *right* about climate
I made the crack about the climate because part of the long-since-resolved Mann controversy was that he used code that he also used for banking and thus couldn't share it. I don't remember the exact deal or even if it was true, but the thought inspired me to a joke, if it were true.
So, if you can put aside your feelings about gw for a second, given that the left has so much riding on it, there's a pretty funny geek joke... the same buggy program destroyed capitalism first, and then liberalism second, and in a year from now all of our stock will be worth pennies on the dollar, unemployment will be 25%, and the Thames and Delaware will be both frozen solid, all because of a missing semicolon.
This is my sig.
It talks about prediction ... for ... which correlation is enough.
If it rains, I'll stay indoors. Therefore, if I stay indoors, it'll rain!
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
I wonder how different the results are from the normal cumulative impact factor of the scientists' publications....
But I forgot. Google is the only database on the planet....
Such an algorithm may be quite good at indicating popular papers and topics. But there are ideas which are like urban legends. They spread faster than they get falsified. Just think about topics like "cold fusion" or "transmutation of matter". An idea is not good just because it is attractive.
If it's a valid predictor, it would produce those results based only on citations before the author receives a Nobel nomination. An author known to be a Nobel nominee, and especially a Nobel prize winner, will receive more citations and page reads based on their Nobel notoriety. An author who fails to cite a Nobel winning paper would be considered to have incomplete references, and the referees or thesis committee will tell them to add those missing citations.
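The check described here amounts to throwing away every citation made on or after a cutoff year before doing any ranking. A minimal Python sketch, assuming citation records of the form (cited paper, year of the citing paper); the record layout, the `citations_before` helper, and the sample data are all hypothetical:

```python
def citations_before(citation_events, paper, cutoff_year):
    """Count citations to `paper` made strictly before `cutoff_year`."""
    return sum(
        1
        for cited, year in citation_events
        if cited == paper and year < cutoff_year
    )


# Hypothetical records: (cited paper, year the citing paper was published).
events = [
    ("P", 1960),
    ("P", 1965),
    ("P", 1975),  # made after the cutoff, so it must not count
    ("Q", 1962),
]

# Suppose the author of P was first nominated in 1970 (made-up year).
early_count = citations_before(events, "P", 1970)
total_count = citations_before(events, "P", 10**6)
print(early_count, total_count)
```

Only the pre-cutoff count says anything about prediction; the gap between the two numbers is the post-nomination boost this comment is worried about.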
Actually, citation ranking came first and was developed some time in the 1970s. Google's page rank algorithm was an application of citation ranking to the web. The original Page Rank paper even cites the citation ranking papers.
(This also kind of points out a problem with citation ranking: everybody these days is going to cite page rank, even though the idea originally was developed by other people. So, citation ranking isn't going to tell you who should get the credit, only who popularized an idea.)
Google PageRank = Wisdom of Crowds
And Wisdom of Crowds != Wisdom of Intellectuals
I'd like to buy homeland for our 10 million people. http://twitter.com/mahadiga
Are you seriously proposing that PageRank will be used to predict actual Nobel Prize winners? Wow.
However, give the tech and networks behind such algorithms twenty years or so, and you'll probably find that human beings are no longer the most intelligent species on the planet.
512 MB RAM, 20 GB disk, 200 GB transfer, five datacenters. $19.95/month.