Slashdot Mirror


Researchers Forecast the Spread of Diseases Using Wikipedia

An anonymous reader writes: Scientists from Los Alamos National Laboratory have used Wikipedia logs as a data source for forecasting disease spread. The team was able to successfully monitor influenza in the United States, Poland, Japan, and Thailand, dengue fever in Brazil and Thailand, and tuberculosis in China and Thailand. The team was also able to forecast all but one of these, tuberculosis in China, at least 28 days in advance.

34 of 61 comments

  1. Interesting by cbhattarai · · Score: 1

    This is really interesting stuff. I guess we have every single thing on the wiki.

  2. forecast using /. by Selur · · Score: 1

    Wondering when they'll start trying to predict diseases (or maybe PC sales) from /. posts.

  3. Sounds familiar by Anonymous Coward · · Score: 1

    Sounds familiar. Hasn't someone already done that, half a year or a year ago, using Google search string mapping?

    1. Re:Sounds familiar by Anonymous Coward · · Score: 2, Informative

      Thought so, it was Google, and they even created a page with real-time stats.
      http://www.google.org/flutrends/us/#US

    2. Re:Sounds familiar by umghhh · · Score: 1

      It works the way fighting evil regimes by clicking the 'like' button on FB and its ilk does.

    3. Re:Sounds familiar by sumdumass · · Score: 1

      Which is ancient magic compared to the power of hashtags. ... #duh

  4. How? by Qbertino · · Score: 1

    How did they do it? I started reading the linked paper, but my brain started hurting two sentences in. I couldn't extract any useful information on the 'how'.

    --
    We suffer more in our imagination than in reality. - Seneca
    1. Re:How? by ctrl-alt-canc · · Score: 4, Informative
      They made the assumption that if a disease is spreading somewhere, people there start looking for information about the disease on Wikipedia.
      This implicitly makes some big assumptions, among them that people are aware of the disease and that they have internet access.

      You can easily understand why their approach is of very limited usefulness, and scientifically questionable. I think it is not by chance that their method fails to work when analyzing data for Uganda (where internet usage probably isn't widespread) and does not score well for China (where censorship limits both information about disease outbreaks and internet access).

      They also state in their paper: "With these constraints in mind, we used our professional judgement to select diseases and countries.", and this raised my eyebrows a lot...

      I would like to put their approach to the test by sifting Wikipedia access data for the Ebola keyword in the Slovenian language, and then forecasting the spread of Ebola in Slovenia (nil up to now...), but I try to use my time for testing methods that are better posed.

      "There are three kinds of lies: lies, damned lies, and statistics."
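
      The assumption described above (article views rise where, and shortly before, reported cases rise) can be illustrated with a lagged correlation. This is only a minimal sketch on made-up numbers, not the method from the paper; the function names, the weekly granularity, and the toy series are all assumptions chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / sqrt(var_x * var_y)


def lagged_correlation(views, cases, lag):
    """Correlate page views in week t with case counts in week t + lag."""
    if lag == 0:
        return pearson(views, cases)
    return pearson(views[:-lag], cases[lag:])


def best_lag(views, cases, max_lag):
    """Return the lag (in weeks) at which views best track later cases."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(views, cases, lag))


# Hypothetical toy data: weekly case counts, and article views that
# happen to lead the cases by two weeks.
cases = [1, 2, 4, 8, 16, 8, 4, 2, 1, 1, 1, 1]
views = [40, 80, 160, 80, 40, 20, 10, 10, 10, 10, 10, 10]

lead = best_lag(views, cases, max_lag=4)
```

      A high correlation at some lag only shows that attention and cases moved together in the past. It does not address the objections raised in this comment: media-driven interest, limited internet access, or censorship would all distort the views series.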

    2. Re:How? by NoNonAlphaCharsHere · · Score: 1

      I don't think you're being fair. This research extends their ground-breaking study that searching Google for "Jennifer Lawrence iCloud hack" predicted fapping with 100% accuracy.

    3. Re:How? by wisnoskij · · Score: 2

      That was my thought. The only way I can think of to use Wikipedia log data to predict outbreaks would also have predicted that America was in the grip of a huge Ebola epidemic a few weeks ago. Perhaps this wiki data is just an easy way to measure media attention to a subject, which is often correlated with an epidemic? It is measuring the public's attention, not actually making a prediction.

      --
      Troll is not a replacement for I disagree.
    4. Re:How? by Rosco+P.+Coltrane · · Score: 3, Funny

      They made the assumption that if a disease is spreading somewhere, people there start looking for information about the disease on Wikipedia

      Imagine the potential: if a lot of search logs contain "EBOL-AAAARGH", they'll know a particularly fast-acting variant of the virus has emerged.

      --
      "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
    5. Re:How? by Rosco+P.+Coltrane · · Score: 1

      I think the most important piece of news of this story is that Wikipedia is no better than Google or Facebook, and exploits/sells search data too.

      --
      "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
    6. Re:How? by del_diablo · · Score: 1

      Which raises the question: if you search for symptom keywords (rash, boils, bleeding, coughing), can Wikipedia actually list diseases with those keywords?

      From experience I do know that a lot of food names can be typed in a native language and will still go to the correct page on English Wikipedia, roughly.
      But if I start searching for terms and keywords, Wikipedia tends to be worse than Google.

    7. Re:How? by radarskiy · · Score: 1

      'They made the assumption'
      They made a hypothesis, then tested that hypothesis against the null hypothesis. This is otherwise known as science. Why do you hate science?

  5. Re:Wats poppin my negroes by NoNonAlphaCharsHere · · Score: 1

    Jack Bauer found out who was there, who they worked for, and where the goddamn bomb was.

  6. "... Spread of Diseases Using Wikipedia" by garutnivore · · Score: 1

    Wait... what? Diseases now use Wikipedia?

    1. Re:"... Spread of Diseases Using Wikipedia" by NoNonAlphaCharsHere · · Score: 1

      Why not? Viruses use Outlook.

    2. Re:"... Spread of Diseases Using Wikipedia" by arth1 · · Score: 1

      No, silly - the diseases themselves are not using Wikipedia; people are going to use Wikipedia to spread diseases.

      (I rather enjoy the triple meaning ambiguity in this headline)

      Wouldn't it be nice if headlines used commas and reflexive pronouns?
      Or if there were someone who checked them over before publishing, like a proofreader?

      I too read it as using Wikipedia to spread the diseases. Which is, I guess, doable, if logging gene sequences there, which someone else can splice into harmless but compatible bacteria.
      Would publishing that kind of information be illegal?

    3. Re:"... Spread of Diseases Using Wikipedia" by sumdumass · · Score: 1

      Oh noes.. when will we be able to get wikicondoms and how would that work?

  7. Useless now that it's known? by fygment · · Score: 1

    Now that they've spread the word, will the approach start to be 'gamed' by big pharma or gov't trying to sow the seasonal flu panic?

    --
    "Consensus" in science is _always_ a political construct.
  8. Take that Educators! by gunner_von_diamond · · Score: 1

    And teachers always say not to use Wikipedia for research. "Wikipedia is the devil!" When used correctly Wikipedia is a valuable resource.

    1. Re:Take that Educators! by terbo · · Score: 1

      The teachers might not know about 'Talk Pages', 'Revisions', and 'What Links Here':
      things that make wikipedia much more advanced than traditional encyclopedias.

      --
      If you're interested in facts I'll tell you what they are and I'll give you sources - Chomsky on The Big Idea
    2. Re:Take that Educators! by tehcyder · · Score: 1

      The teachers might not know about 'Talk Pages', 'Revisions', and 'What Links Here':
      things that make wikipedia much more advanced than traditional encyclopedias.

      No, teachers know that lazy students will just blindly copy and paste stuff from wikipedia.

      --
      To have a right to do a thing is not at all the same as to be right in doing it
  9. Man! Wikipedia is mean. by 140Mandak262Jamuna · · Score: 1

    I thought Wikipedia was just spreading misinformation and biased information. Now they are spreading actual biological diseases using Wikipedia? I'm not surprised. The Internet is a lawless frontier and anything goes there.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
  10. umm by superwiz · · Score: 1

    Why not Google Trends? It's already categorized.

    --
    Any guest worker system is indistinguishable from indentured servitude.
    1. Re:umm by necro81 · · Score: 1

      Google has been working on that, it's called Flu Trends. But it hasn't really proven itself out yet. See my post below.

    2. Re:umm by superwiz · · Score: 1

      You can cross-correlate multiple medical term searches and conditions and see the search trends broken down by region. It's not limited to flu. You can do it for other (some slowly spreading) medical conditions.

      --
      Any guest worker system is indistinguishable from indentured servitude.
  11. Re: 28 days in advance of later? by Anonymous Coward · · Score: 2, Funny

    Look, we're onto your game. The suggestion that you've been living under a rock was a dead giveaway that you're a zombie...

  12. It's been done, sort of by necro81 · · Score: 1

    Google tried (is still trying?) to track the spread of influenza by watching the trends in searches for information about the disease. It's a very interesting bit of work, but as I recall, it failed to be meaningfully predictive. The trouble is, there are lots of prosaic reasons why someone might seek out information about the flu (or any other disease) other than actually having it. The hard part is separating that noise (general interest in the flu) from the genuine signal (particular interest from people who are infected). That doesn't mean it can't work, just that it hasn't been made to work yet.

  13. This is why... by CODiNE · · Score: 1

    I always wash my hands after using Wikipedia.

    --
    Cwm, fjord-bank glyphs vext quiz
  14. Wikipedia the vector by Bruce+Perens · · Score: 1

    Like others I found the headline confusing. I read it as "Researchers are predicting the use of Wikipedia as a vector for the spread of disease". This may mean that:

    • Disinformation and ignorance are diseases.
    • Memes and computer viruses are diseases.
    • Wikipedia contains information that leads to depression.
    • Instructions on Wikipedia lead to substance abuse.
    • This is getting entertaining, fill in your own reason here.
  15. Re: Wats poppin my negroes by electrosoccertux · · Score: 1

    who's there?

  16. google flu trends by Alphons+Clenin · · Score: 1

    Google has been forecasting flu through search data for a while.

    http://www.google.org/flutrends/us/

    It doesn't work perfectly though:

    http://www.nature.com/news/when-google-got-flu-wrong-1.12413

    1. Re:google flu trends by Fpdx · · Score: 1

      Yes, but Google does not share its log files!

      Google got a Nature paper out of it. AFAIK the data (Google queries) on which that research is based is kept secret. Therefore it is not possible to validate what they did. Science cannot be based on secret data, and in this case the journal Nature published an advertisement ("how awesome is Google"), not a scientific paper ("these are the data, this is our method, check out our conclusions").

      As the authors here say, relying on closed sources like Google greatly limits the efficiency of this kind of approach. So they chose a free-software way of thinking: Wikipedia, because the data is public, plus their own software is free software. Good work.