Researchers Forecast the Spread of Diseases Using Wikipedia
An anonymous reader writes: Scientists from Los Alamos National Laboratory have used Wikipedia logs as a data source for forecasting disease spread. The team was able to successfully monitor influenza in the United States, Poland, Japan, and Thailand, dengue fever in Brazil and Thailand, and tuberculosis in China and Thailand. The team was also able to forecast all but one of these, tuberculosis in China, at least 28 days in advance.
The wizard will cast a spell on your ass to make it tickle! Why the fuck did you go in the living room!?
This is really interesting stuff. I guess we have every single thing on Wikipedia.
Wondering when they'll start trying to predict diseases (or maybe PC sales) from /. posts.
Sounds familiar. Didn't someone already do that half a year or a year ago using Google search string mapping?
How did they do it? I started reading the linked paper, but my brain started hurting two sentences in. I couldn't extract any useful information on the 'how'.
We suffer more in our imagination than in reality. - Seneca
28 days later...
I predict China will be ground zero for the next big zombie pandemic. Some sort of pandemic, to make Ebola look like a Hawaiian vacation.
1) buy shares in a pharmaceutical company with a unique and unprofitable vaccine for disease X ....
2) make bots that automate Wikipedia searches for disease X, deploy
3)
4) PROFIT
Jack Bauer found out who was there, who they worked for, and where the goddamn bomb was.
...are diseases using Wikipedia? Those little rascals are getting smart.
Wait... what? Diseases now use Wikipedia?
Now that they've spread the word, will the approach start to be 'gamed' by big pharma or gov't trying to sow the seasonal flu panic?
"Consensus" in science is _always_ a political construct.
And teachers always say not to use Wikipedia for research. "Wikipedia is the devil!" When used correctly, Wikipedia is a valuable resource.
I thought Wikipedia was just spreading misinformation and biased information. Now they are spreading actual biological diseases using Wikipedia? I'm not surprised. The Internet is a lawless frontier and anything goes there.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Why not Google Trends? It's already categorized.
Any guest worker system is indistinguishable from indentured servitude.
Wikisneezia.
Political correctness is really just herd psychology pushed by insecure people who desperately seek social conformity.
Using linear models, language as a proxy for location
I'm not sure language is such a good indicator for where people are located. I usually use the English pages because the length and quality of the text tend to be better. Also quite a number of pages only exist in English. I'm quite sure this "language statistic corruption" is quite widespread and that English native speakers are unaware of the great quality difference between languages. The data is likely bogus unless this is taken into account.
Having said that, there is something odd about the article. The abstract mentions language as the indicator for where people are. However, the first figure has both language and country columns. Most match as expected (Polish for Poland, etc.), but there are exceptions. French and Haiti are in the same row, and Haiti isn't the first country I think of when people use French (that would be France). This means they are likely using IPs for geolocation too. That seems natural, but the abstract doesn't mention using anything other than language for this task.
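For anyone else stuck on the 'how': here is a minimal sketch of what "linear models" could look like in practice, assuming you regress case counts on page views from some weeks earlier. It is not the paper's actual method or data. The function names, the two-week lag, and every number below are made up for illustration.

```python
# Sketch only: ordinary least squares of case counts on lagged page views.
# All names and numbers here are hypothetical, not taken from the paper.

def fit_lagged_linear(views, cases, lag):
    """Fit cases[t] ~ slope * views[t - lag] + intercept by least squares."""
    if len(views) != len(cases):
        raise ValueError("views and cases must have the same length")
    if not 0 <= lag < len(views) - 1:
        raise ValueError("lag must leave at least two paired observations")
    x = views[:len(views) - lag]   # page views, shifted back by `lag` steps
    y = cases[lag:]                # case counts they are paired with
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("page views are constant; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(recent_views, slope, intercept):
    """Predict case counts `lag` steps ahead from the most recent page views."""
    return [slope * v + intercept for v in recent_views]


# Synthetic weekly data (invented): cases follow views from two weeks earlier.
lag = 2
views = [120, 150, 200, 260, 310, 280, 220, 170]
cases = [0, 0] + [0.5 * v + 10 for v in views[:-lag]]

slope, intercept = fit_lagged_linear(views, cases, lag)
pred = forecast(views[-lag:], slope, intercept)  # the next `lag` weeks
```

Because the last `lag` weeks of page views have no matching case counts yet, they are what gets fed to `forecast`. A real attempt would also have to deal with the noise and location issues raised in this thread.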
Google tried (is still trying?) to track the spread of influenza by watching the trends in searches for information about the disease. It's a very interesting bit of work, but as I recall, it failed to be meaningfully predictive. The trouble is, there are lots of prosaic reasons why someone might search out information about the flu (or any other disease) other than actually having it. Separating that noise (general interest in the flu) from the genuine signal (particular interest from people who are infected) is the hard part. Doesn't mean it can't work, just that it hasn't been made to work yet.
Like wow. Diseases can figure out how to use Wikipedia to spread more quickly.
I always wash my hands after using Wikipedia.
Cwm, fjord-bank glyphs vext quiz
Like others I found the headline confusing. I read it as "Researchers are predicting the use of Wikipedia as a vector for the spread of disease". This may mean that:
Bruce Perens.
who's there?
Did Wikipedia provide the data? Does Wikipedia make the data public?
Connecting page load to IP address seems like extremely sensitive information, and not something Wikipedia should record or share.
Google has been forecasting flu through search data for a while.
http://www.google.org/flutrends/us/
It doesn't work perfectly though:
http://www.nature.com/news/when-google-got-flu-wrong-1.12413