Slashdot Mirror


Researchers Forecast the Spread of Diseases Using Wikipedia

An anonymous reader writes: Scientists from Los Alamos National Laboratory have used Wikipedia logs as a data source for forecasting disease spread. The team was able to successfully monitor influenza in the United States, Poland, Japan, and Thailand, dengue fever in Brazil and Thailand, and tuberculosis in China and Thailand. The team was also able to forecast all but one of these, tuberculosis in China, at least 28 days in advance.

2 of 61 comments

  1. Re:Sounds familiar by Anonymous Coward · Score: 2, Informative

    Thought so, it was Google, and they even created a page with real-time stats.
    http://www.google.org/flutrends/us/#US

  2. Re:How? by ctrl-alt-canc · Score: 4, Informative
    They assumed that if a disease is spreading somewhere, people there start looking up information about the disease on Wikipedia.
    This implicitly makes some big assumptions, among them that people are aware of the disease and that they have internet access.
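
To make that assumption concrete, here is a toy sketch of what "page views lead case counts" would look like if it held. It is not the model from the paper: the weekly numbers are invented for illustration, and the helper names `pearson` and `lagged_correlation` are hypothetical. It simply checks which time shift makes article views line up best with reported cases.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def lagged_correlation(views, cases, lag):
    """Correlate page views at week t with case counts at week t + lag."""
    if lag == 0:
        return pearson(views, cases)
    return pearson(views[:-lag], cases[lag:])

# Invented weekly data (an assumption, not real measurements):
# the case curve roughly echoes the view curve two weeks later.
views = [10, 12, 30, 80, 60, 25, 12, 10, 11, 9]
cases = [1, 1, 1, 2, 5, 14, 10, 4, 2, 1]

# Pick the lag (0 to 3 weeks) at which views track cases most closely.
best_lag = max(range(0, 4), key=lambda k: lagged_correlation(views, cases, k))
print(best_lag, round(lagged_correlation(views, cases, best_lag), 3))
```

A strong correlation at a positive lag is the best case for the approach. If people are unaware of the disease or cannot get online, the view series carries no such signal and no choice of lag will recover one.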

    You can easily see why their approach is of very limited usefulness and scientifically questionable. I don't think it is a coincidence that their method fails on data for Uganda (where internet usage probably isn't widespread) and does not score well for China (where censorship limits both information about disease outbreaks and internet access).

    They also state in their paper: "With these constraints in mind, we used our professional judgement to select diseases and countries." That raised my eyebrows a lot...

    I would like to put their approach to the test by sifting Wikipedia access data for the keyword "Ebola" in Slovenian and then forecasting the spread of Ebola in Slovenia (nil up to now...), but I prefer to spend my time testing methods that are better posed.

    "There are three kinds of lies: lies, damned lies, and statistics."