Slashdot Mirror


Microsoft Analyzes Web Searches, Finds Clues For Early Cancer Detection (computerworld.com)

An anonymous reader quotes a report from Computerworld: Analyzing online activities can provide clues as to a person's chances of having cancer, Microsoft researchers showed in a paper published this week. Specifically, the researchers demonstrated that by analyzing web query logs they were able to identify internet users who had pancreatic cancer even before they'd been diagnosed. The study suggests that "low-cost, high-coverage surveillance systems can be created to passively observe search behavior and to provide early warning for pancreatic cancer, and with extension of the methodology, for other challenging cancers," the researchers concluded. "Surveillance systems could also provide for automated capture and summarization of data and landmarks over time so as to provide patients with talking points in their discussion with medical professionals." The researchers used proprietary logs of 9.2 million web queries on Microsoft's own Bing search engine but focused exclusively on English-speaking people in the U.S. from October 2013 to May 2015. First, the team identified searchers in logs of online search activity who made "special queries" that are suggestive of a recent diagnosis of pancreatic cancer. Those queries included phrases such as "Why did I get cancer in pancreas," and "I was told I have pancreatic cancer, what to expect." The team then went back "many months" before the initial queries were made to examine patterns of symptoms as they were expressed by web searches about pancreatic cancer symptoms. "We showed specifically that we can identify 5% to 15% of cases, while preserving extremely low false-positive rates," the researchers said in their paper. The false positives ranged from one in 10,000 to one in 100,000.

73 comments

  1. Newsflash by bengoerz · · Score: 3, Insightful

    Researchers have very low false-positive rate when analyzing past data.

    1. Re:Newsflash by Anonymous Coward · · Score: 0

      And /thread.

      Apparently it's possible to construct a mathematical model that fits a given dataset - who knew?!

    2. Re:Newsflash by NotQuiteReal · · Score: 1

      "analyzing past data" - I've had great success in creating convincing horse race and stock picks using that method!

      Unfortunately, "Past performance is no guarantee of future results" is VERY true.

      --
      This issue is a bit more complicated than you think.
    3. Re:Newsflash by AK+Marc · · Score: 1

      That's simply not true. You take 100% of the data, run the analysis on 50%, then, when done, apply that analysis on the remaining 50% and measure your success. If successful, it should apply to any data source. Even future ones.
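
      A minimal sketch of that holdout check, assuming each record is a hypothetical (score, label) pair and using a toy threshold "model". None of the names or data below come from the paper; they only illustrate fitting on one half and measuring on the untouched half.

```python
import random


def holdout_split(records, train_fraction=0.5, seed=0):
    """Shuffle the records and split them into a fitting part and an untouched evaluation part."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """Toy 'model' (an assumption for illustration): the lowest score among positive training cases."""
    positives = [score for score, label in train if label]
    return min(positives) if positives else float("inf")


def false_positive_rate(threshold, data):
    """Fraction of true negatives in `data` that the threshold would flag."""
    negatives = [score for score, label in data if not label]
    if not negatives:
        return 0.0
    flagged = sum(1 for score in negatives if score >= threshold)
    return flagged / len(negatives)


# Hypothetical, cleanly separable data: scores 0.00..0.99, positives at 0.90 and above.
records = [(i / 100, i >= 90) for i in range(100)]
train, test = holdout_split(records)
threshold = fit_threshold(train)           # fitted on the first half only
held_out_fpr = false_positive_rate(threshold, test)  # measured on the second half only
```

      The point of the split is that `held_out_fpr` is computed on records the fitting step never saw, which is what makes it an estimate rather than a description of the data the model was tuned on.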

    4. Re:Newsflash by Anonymous Coward · · Score: 0

      Although that's the way it's usually done, what you're stating is in general false. It assumes in particular that past behaviour is an indicator of future behaviour. This could just as well be indicative of people watching some medical drama with an unfolding story about cancer.

    5. Re:Newsflash by ceoyoyo · · Score: 1

      Yeah, except that a lot of people don't bother doing that. Particularly physicians.

    6. Re:Newsflash by AK+Marc · · Score: 1

      The population can change, but unless the population changes, the results should remain consistent.

      Past performance is indicative of future performance, but the disclaimer is required because it isn't proof, and conditions can change.

    7. Re:Newsflash by AK+Marc · · Score: 1

      This was big-data people doing medical research, not physicians, so hopefully they did it right.

  2. Probably very easy for microsoft by NotInHere · · Score: 4, Funny

    They just scan the user agent of the browser connecting, and if it contains "Linux" it means there is a cancer infection in the eyes of Microsoft.

  3. Right. Cancer. Of course. by Anonymous Coward · · Score: 1

    THAT'S what they'd use it for.
    To protect us from Cancer.

  4. Wow!! by Anonymous Coward · · Score: 1

    They looked for people searching for "how to cope with stage 4 cancer" and sure enough, those people had it in most cases. How DO they do it?

    1. Re:Wow!! by murdocj · · Score: 4, Informative

      Actually, if you had patience to read the summary, they did something very clever. They did find the people who had been actually diagnosed with cancer. Then they went back months to their previous searches, and found that BEFORE they had any idea they had cancer, they were searching for information on their symptoms... symptoms of the cancer that would be discovered much later. How much would it be worth to you to find out you have cancer when it can be treated, rather than too late?

    2. Re:Wow!! by Dunbal · · Score: 1

      And these non medically trained researchers are sure that these symptoms are specific to pancreatic cancer because after all abdominal pain, weight loss and vomiting could hardly be anything else right?

      --
      Seven puppies were harmed during the making of this post.
    3. Re:Wow!! by Anonymous Coward · · Score: 1

      Yes, if you read the fucking summary, you will see that they state, "The false positives ranged from one in 10,000 to one in 100,000."

      So yeah, the researchers are pretty certain the symptoms and terms they're analyzing for are pretty specific to pancreatic cancer.

      What would you rather have: No warning that you may have cancer until you're in the hospital shitting your guts out a straw? Or maybe, a warning 6 months earlier to get checked out by your doctor - when the cancer may be treatable, and you can avoid a lot of that pain and suffering?

    4. Re:Wow!! by Ol+Olsoc · · Score: 1

      Actually, if you had patience to read the summary, they did something very clever. They did find the people who had been actually diagnosed with cancer. Then they went back months to their previous searches, and found that BEFORE they had any idea they had cancer, they were searching for information on their symptoms... symptoms of the cancer that would be discovered much later. How much would it be worth to you to find out you have cancer when it can be treated, rather than too late?

      If you have any symptoms from Pancreatic cancer, it's already too late.

      --
      The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
    5. Re:Wow!! by ShanghaiBill · · Score: 1

      If you have any symptoms from Pancreatic cancer, it's already too late.

      Yes, a diagnosis of pancreatic cancer is usually a death sentence, but many people would still appreciate a few extra months to wrap up life issues, reach out to old friends, maybe arrange one last family reunion, and knock a few items off their list of life goals. Earlier diagnoses are good, even if they don't provide a cure.

    6. Re:Wow!! by Anonymous Coward · · Score: 0

      Even more critical is how vague the early symptoms are and how quickly the disease progresses. Stage 3 pancreatic cancer has a 5 year survival rate of approximately 8% if I recall correctly. Stage 1 is around 40%? If we can increase chances of surviving this terrible disease 5x by correlating a specific set of typically vague symptoms - that's huge. Not as huge as the cluster of symptoms we know to be specific to a heart attack, but we're getting there.

    7. Re:Wow!! by ShanghaiBill · · Score: 1

      "The false positives ranged from one in 10,000 to one in 100,000."

      If they had a false positive rate of 1% or even 10%, that would be amazingly good. Their claimed rate of 0.01% to 0.001% is completely implausible. It may just be a case of incompetent journalism, but if the researchers actually claimed those rates, I don't believe them, and I question their integrity.

    8. Re:Wow!! by Ol+Olsoc · · Score: 1

      If you have any symptoms from Pancreatic cancer, it's already too late.

      Yes, a diagnosis of pancreatic cancer is usually a death sentence, but many people would still appreciate a few extra months to wrap up life issues, reach out to old friends, maybe arrange one last family reunion, and knock a few items off their list of life goals. Earlier diagnoses are good, even if they don't provide a cure.

      My Mother in law died from Pancreatic cancer. Funny, but she didn't feel much like doing a bucket list. Kinda laid around, then died. Here's an even better approach. We all die. Seems like a bitched up idea that you wait until you have a couple months left to do the things you suggest. Do them now, when it isn't so depressing.

      --
      The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
    9. Re:Wow!! by Anonymous Coward · · Score: 0

      It's misdiagnosed as food poisoning/gall bladder issues/acute pancreatitis often enough that the error rate can safely be ignored. An ERCP to confirm pancreatic cancer is far more likely to be performed as opposed to sending the patient home with a bottle of painkillers.

    10. Re:Wow!! by Dunbal · · Score: 1

      This only means that one in 10,000 turned out to have pancreatic cancer without querying for the symptoms. That is what false positive means. I am pretty sure they didn't follow up on every single patient and rule out say acute diarrhea.

      --
      Seven puppies were harmed during the making of this post.
    11. Re:Wow!! by Anonymous Coward · · Score: 2, Informative

      > If they had a false positive rate of 1% or even 10%, that would be amazingly good.

      If it was a sensitive test, that'd be true.

      But it's not a sensitive test. It spots a relatively small percentage of pancreatic cancer cases early, while getting a relatively small number of false positives.

      Diagnostic and screening tests trade off sensitivity (detecting cases) and specificity (positive tests indicating the specific condition). This test has a very low sensitivity in order to attain its high specificity.
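
      A rough sketch of that trade-off, assuming a classifier that outputs a score and flags anyone at or above a threshold. The scores below are made up for illustration and are not from the study.

```python
def rates(threshold, case_scores, control_scores):
    """Return (sensitivity, false_positive_rate) when flagging scores >= threshold."""
    sensitivity = sum(s >= threshold for s in case_scores) / len(case_scores)
    fpr = sum(s >= threshold for s in control_scores) / len(control_scores)
    return sensitivity, fpr


# Hypothetical scores for people with the disease (cases) and without it (controls).
cases = [0.2, 0.4, 0.6, 0.8, 0.95]
controls = [0.05, 0.1, 0.2, 0.3, 0.5, 0.7]

loose = rates(0.5, cases, controls)   # lower threshold: more cases caught, more false alarms
strict = rates(0.9, cases, controls)  # higher threshold: fewer false alarms, fewer cases caught
```

      Raising the threshold from 0.5 to 0.9 in this toy data removes every false alarm but also drops most of the true cases, which is the same direction of trade the study's numbers suggest.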

    12. Re:Wow!! by AK+Marc · · Score: 1

      Nope. The researchers don't know or care about "symptoms". If everyone with lung cancer googles for "left handed mouse" and "pineapple and red bean ice cream" 7 months before being diagnosed with cancer, it's a correlation, not a diagnosis, and not "symptoms". If a correlation is strong, it's useful. Though, as discussed, the better correlations happen among more logical things, like possible cancer symptoms.

    13. Re:Wow!! by AK+Marc · · Score: 1

      So a false negative rate of 85% to 95% is so high that you find they must be lying. You have unreasonably high standards. They tuned to minimize false positives, at the expense of false negatives. With the same data and analysis, they could accept the false positive rate you demand and greatly improve the false negative rate.

    14. Re:Wow!! by AK+Marc · · Score: 1

      When you ignore the symptoms until you are too sick to do anything, you don't do a bucket list. Because the health care system in the US is so bad, many people in the US wait until they are too sick to have any options. I know more than one person who was having deadly symptoms before the first time they sought care.

    15. Re:Wow!! by Anonymous Coward · · Score: 0

      Interestingly enough what you are describing is the false negative.

      With any test for illness there are 4 possible combinations for the patient actually having the illness tested for and the actual result of the test.

      1) Patient has illness, the test result is positive: Regular result, ok
      2) Patient has illness, the test result is negative: False Negative (Since the test turns out negative and that is false, the patient is actually sick)
      3) Patient does not have illness, the test result is positive: False Positive (Since the test turns out positive and that is false, the patient is not sick)
      4) Patient does not have illness, the test result is negative: Regular result, ok

      Having a very low False Positive rate is not really impressive, you can obtain that by doing a test that never yields positive results. In the actual situation of the false positive rate being 1 in 10.000 you can still only make sense of that if you know what the chance of actually having the illness is. It turns out that people have an actual chance of around 1 in 10.000 (in American statistics) of developing Pancreatic cancer. So if the actual test would be perfect (False Negative rate of 0%) half of the results that ended up positive would actually be false positives. If you do the test on 10.000 people one would actually be a real cancer patient and one would be without the illness but be tested positive. Basically a positive test would mean that you have a 50-50 chance of actually having the illness tested for. This result gets worse when the False Negative rate actually increases (not all people with the illness are actually diagnosed correctly by the test)
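
      The arithmetic above can be sketched with Bayes' rule. The prevalence of 1 in 10.000 is this comment's figure and is taken as an assumption here, as is the 10% sensitivity used for the second case (roughly the middle of the 5% to 15% quoted in the summary).

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability that a person who tests positive actually has the illness."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Assumed inputs: prevalence and false positive rate both 1 in 10,000.
perfect_test = positive_predictive_value(1e-4, 1.0, 1e-4)    # no false negatives at all
low_sensitivity = positive_predictive_value(1e-4, 0.10, 1e-4)  # only 10% of cases detected

print(f"perfect sensitivity: {perfect_test:.3f}")
print(f"10% sensitivity:     {low_sensitivity:.3f}")
```

      With a perfect test the result is about one half, matching the 50-50 figure above; with 10% sensitivity it falls to roughly 9%, which is the "gets worse" case.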

    16. Re:Wow!! by Dunbal · · Score: 1

      3) Patient does not have illness, the test result is positive: False Positive (Since the test turns out positive and that is false, the patient is not sick)

      Patient does not have pancreatic cancer. Algorithm thought he had it. False positive. Seriously, what the heck are you talking about? I know what a false positive is. You, apparently, don't - even if you can quote the conditions back to me you can't apply them.

      --
      Seven puppies were harmed during the making of this post.
    17. Re:Wow!! by jonbryce · · Score: 1

      But if very few people get pancreatic cancer, then with a false positive rate of that magnitude, it could still be that the majority of detections are false positives.

    18. Re:Wow!! by jonbryce · · Score: 1

      1/10000 of people who searched for those symptoms didn't have pancreatic cancer. Around 90% of people who did have pancreatic cancer didn't search for those symptoms.

    19. Re: Wow!! by Anonymous Coward · · Score: 0

      Yes, they have wrong priorities.
      No excuse to not seek medical help in a timely manner.

    20. Re: Wow!! by AK+Marc · · Score: 1

      Unless the cost of medical treatment would cause bankruptcy. And don't trust doctors, American doctors kill many people through malpractice.

    21. Re: Wow!! by Anonymous Coward · · Score: 0

      Extraordinary claims require extraordinary evidence.

      M.D., Pharm.D, Ph.D. -- because I enjoy research and saving lives.

    22. Re:Wow!! by Ol+Olsoc · · Score: 1

      When you ignore the symptoms until you are too sick to do anything, you don't do a bucket list. Because the health care system in the US is so bad, many people in the US wait until they are too sick to have any options. I know more than one person who was having deadly symptoms before the first time they sought care.

      Most times, Pancreatic cancer does not have symptoms until it is too late. That is why I already wrote that if you have symptoms, and it is pancreatic cancer, your outlook is nil.

      How many family members have you had die from it? My stepmother-in-law first showed some diabetes-like symptoms and went to the doctor within a couple of days. She was an RN.

      Pancreatic Cancer does not do its work because of the US health care system or the expense of treatment. Unless caught pretty much by luck, you are out of luck. I suppose if you were really paranoid, you could go in for weekly blood tests, maybe monthly biopsies.

      From Wikipedia:

      "There are usually no symptoms in the disease's early stages, and symptoms that are specific enough to suggest pancreatic cancer typically do not develop until the disease has reached an advanced stage. By the time of diagnosis, pancreatic cancer has often spread to other parts of the body."

      Here is the web page - better get over there and correct the misinformation they are spreading. https://en.wikipedia.org/wiki/...

      --
      The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
    23. Re:Wow!! by Anonymous Coward · · Score: 0

      People with pancreatic cancer who do not query for symptoms are not False positives since the test is 'Query for the symptoms', These are just people with undiagnosed pancreatic cancer who are searching for porn.

  5. not creepy! by Anonymous Coward · · Score: 1

    And your search engine mining your search history to figure out which diseases you might have is not creepy at all! No sir.

    And surely there is nothing else that could be done with this technique.

    I'd suggest this one instead.

  6. Microsoft (and google, apple and the rest) by ameline · · Score: 1

    STOP watching/tracking what I'm doing with my computer -- it's creepy as fuck!

    (Switched to Mac years ago, not going back to Windows and its "telemetry". Not so sure of Apple either, but at least they claim to not track users and collect data on them -- their business model (currently) is selling shiny toys -- not selling data. And they have been pushing back on surveillance in the courts and their encryption is good. FileVault should be on by default.)

    --
    Ian Ameline
    1. Re:Microsoft (and google, apple and the rest) by Anonymous Coward · · Score: 0

      Stop being retarded.

    2. Re:Microsoft (and google, apple and the rest) by Anonymous Coward · · Score: 0

      Useful idiot - you've absorbed as much marketing as humanly possible, and it has shaped every facet of how you view the world. You failed.

    3. Re:Microsoft (and google, apple and the rest) by antdude · · Score: 1

      Searches like in Spotlight, Safari, iCloud, iTunes, etc.

      --
      Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
    4. Re:Microsoft (and google, apple and the rest) by Anonymous Coward · · Score: 0

      STOP watching/tracking what I'm doing with my computer -- it's creepy as fuck!

      They're not -- at least not in anything related to this story! Learn how a computer works, would ya?

      They're tracking what you're doing to THEIR computers. Small hint -- if you need to use the internet to do whatever thing you're talking about, you're going to be accessing other people's computers.

    5. Re:Microsoft (and google, apple and the rest) by Anonymous Coward · · Score: 0

      Would you also like your credit card company to stop keeping track of where you spend your money? Perhaps you want your telephone service provider to stop tracking whom you call?

      dom

  7. That they have this data at all is frightening by Anonymous Coward · · Score: 0

    And it isn't anonymous if they can link it to the same user. What ethics board approved this?

    1. Re:That they have this data at all is frightening by AK+Marc · · Score: 1

      It's anonymous if they can't identify that "same user", they don't know it's Bob, and they can't figure it out. They have to identify that user to be able to perform the statistical analysis.

  8. Telemetry causes pancreatic cancer by Anonymous Coward · · Score: 0

    Most of the users with pancreatic cancer had recently been tricked into upgrading to Windows 10... why is Microsoft covering up this disturbing fact?

  9. Detecting early warning signs of... by axewolf · · Score: 1

    cancer

    or

    dissident ideology

    which one do you think is likely to be acted on to promote stability for the economy from which microsoft profits?

    Seriously us people who aren't herd animals need to get it together. There is clearly a massive conspiracy against us all. Are you going to wait around to see what the top of the hierarchy does with all of this information they have on us once the economy becomes majorly automated?

    1. Re:Detecting early warning signs of... by Anonymous Coward · · Score: 0

      You're more of a "herd animal" than anyone you've ever called one. You are not a brave iconoclast, and there is no possibility that you ever will be.

  10. Never thought it possible by Anonymous Coward · · Score: 0

    I never thought anyone would be able to make cancer diagnoses creepy and unnerving.. kudos Microsoft

  11. Colonoscortana by Anonymous Coward · · Score: 0

    Where would you like me to go today?

    Either way I'm going there.

  12. This is nonsense by Anonymous Coward · · Score: 0

    They can identify you are likely to have pancreatic cancer when you enter searches for "Do I have pancreatic cancer?" and "Pancreatic cancer symptoms." It's not like they're diagnosing cancer in seemingly healthy people. They're just going to be telling people what they've already figured out. At the very least, these people would already know about their symptoms.

  13. Symptoms? What Symptoms? by NotQuiteReal · · Score: 1

    Unless you have symptoms that are just in common with pancreatic cancer symptoms, in which case they are just symptoms of something else. The study has a biased sample, since they were studying the history of "known cancer" patients (or someone who lies to search engines... for example, I just now searched "I think I have pancreatic cancer"... Shit! - oh, wait, I just searched for "medical student syndrome", too! Whew! That's what I have.)

    --
    This issue is a bit more complicated than you think.
  14. Obligatory... by CheeseTroll · · Score: 1

    Clippy: It looks like you may have cancer! Would you like me to schedule an appointment with an oncologist?

    --
    A post a day keeps productivity at bay.
    1. Re:Obligatory... by Anonymous Coward · · Score: 0

      More like:

      Clippy: It looks like you may have cancer! Your health insurance provider has been notified, and, regrettably, has canceled your policy. Would you like to make out a will at this time?

    2. Re: Obligatory... by kencurry · · Score: 2

      ... It looks like you may have cancer; your insurance company & employer have been notified. Have a nice day !!!

      --
      sigs are for losers (except to point out that sigs are for losers)
    3. Re: Obligatory... by CrimsonAvenger · · Score: 1

      ... It looks like you may have cancer; your insurance company & employer have been notified.

      The ACA has some issues, but one of the things it does right is make it illegal for your insurance to dump you for things like cancer, and make it illegal for an insurance company to refuse coverage if you have a pre-existing condition.

      Okay, TWO of the things it does right.

      --

      "I do not agree with what you say, but I will defend to the death your right to say it"
  15. no, no and NO. by Anonymous Coward · · Score: 0

    nest thermostats could have sensors in them to detect 'aromas' of drug use and manufacturing....

    cameras in phones and computers to capture faces and recognize stacks of money or credit cards

    microphones in same, to record based on trigger keywords...

    sewer systems could have sensors in them to detect illicit chemicals and drugs (this already happens, btw)...

    papers being recycled could be scanned and archived for later analysis....

    where do you draw the line at "good" mass surveillance and "bad".. you don't.. it's ALL BAD.

  16. yah right. ms curing cancer by Anonymous Coward · · Score: 0

    meanwhile you pay them to spy on you.

    they sold you out.

  17. Not particularly useful for pancreatic cancer by Anonymous Coward · · Score: 0

    The reason pancreatic cancer is so deadly is because it is seldom detected in its early stages and spreads rapidly. Symptoms may not appear until the cancer is already spread through the body, where it becomes too late to do anything surgically to remove it. So there's a good chance that the people who are going online and searching about their symptoms already have the cancer in an advanced stage. Not sure what the predictive value is in that. Maybe the method would be useful for other diseases, though.

  18. aka SPYING ON YOU IS GOOD by Anonymous Coward · · Score: 0

    Spying on everything you do is really in your best interest.

    Sign below this pentagram...

  19. Great by backslashdot · · Score: 1

    Just what I need, a pop up window that says "CONGRATULATIONS YOU'RE THE 1 MILLIONTH VISITOR WITH PANCREATIC CANCER!"

  20. And by Anonymous Coward · · Score: 0

    And drug use, and sexual interests, and political affiliations...

  21. No by Anonymous Coward · · Score: 0

    Sounds like just another attempt to justify snooping. Big companies often make false research requests. I see no reason for this to be something else.

  22. Doctor, Doctor by Mike+Frett · · Score: 1

    This is where a good doctor comes in. If you are vomiting and have abdominal pains and DO NOT go to the Doctor then you should have no expectations of survival once this fast-moving Cancer is revealed. A good Doc will run tests when presented with these symptoms and hopefully catch it early. A good Doc with EXP will know what's up when presented with your symptoms.

    Do yourself a favor, stop searching on WebMD, Google and Bing. Visit your Doctor instead.

    1. Re:Doctor, Doctor by Anonymous Coward · · Score: 0

      A web search is less expensive than a medical consultation. It may be relevant that the analysis is restricted to searchers in the United States.

  23. Future conversation goes like this... by moneybabylon · · Score: 0

    "I was told I have pancreatic cancer today." "OMG! The doctor told you?" "No, NSA sent me a letter."

  24. Wishful thinking by Anonymous Coward · · Score: 0

    Perhaps this is the sort of thing that they intend to do with data collected via Win10/Visual Studio 2015 compiled object's telemetry.

  25. So much for any claims MS makes for privacy. by Anonymous Coward · · Score: 0

    You have none.

  26. Rose-tinted glasses by Donwulff · · Score: 1

    And how many of their research subjects had been diagnosed with hypochondria? Searching for symptoms and then the eventual disease isn't an unlikely pattern, whereas someone actually suffering from it would be more likely to only ask a doctor. Didn't bother to read the article, of course, but hopefully they also checked whether those users had made searches indicating a diagnosis before, and possibly for other diseases.

    I also have to join those questioning the "false positive" rate there. People are perhaps even more liable to search for other people's conditions than their own, and while showing them a banner like "Your searches indicate X" would work just as well, in the context of the study that should count as a false positive. One question on this is exactly how they're counting or reporting false positives. Approximately 5 in 100.000 will get pancreatic cancer *in their lifetime*, which comes to the neighborhood of 1 in 1.000.000 per year. If their algorithm actually tags 1 in 10.000 users as having pancreatic cancer then it is next to useless. If 1 in 10.000 tagged didn't turn out to have pancreatic cancer, then it's unbelievable.

    And indeed, assuming they were searching for identifiable symptoms, wouldn't they have discovered their cancer earlier? Is this a case of a too-slow medical system, or just a case of people who already know they have pancreatic cancer sometimes making searches that look like a recent diagnosis... the example of "Why did I get pancreatic cancer?" in the summary is pretty telling, as that would seem a quite likely search for a late-stage patient.

  27. As someone who HAS pancreatic cancer, by Anonymous Coward · · Score: 1

    I wonder if my search history would have shown I was likely to have it. My symptoms were so vague that I waited about 6 weeks before going to the doctor - general discomfort in the abdominal area (2 different spots that were separated), a little more gas and diarrhea than normal, possibly slight weight loss, maybe a bit more tired than normal. The doctor ordered tests that showed gall stones and positive for H. Pylori (main cause of ulcers). She put me on antibiotics for the H. Pylori but my impression was she didn't think that was it. It wasn't till I started throwing up, went to the E.R., turned yellow while IN E.R., that they ordered the CAT scan that revealed the tumor. It also showed a lesion in my liver and the E.R. doctors told me it had already spread (the liver lesion turned out to be benign).

    I may be one of the 'lucky' ones. Mine was operable and I got in a trial for an immuno-therapy drug. I'm now almost 3 months post-op (the Whipple procedure is NOT for wimps), there was no cancer in the margins or in the lymph nodes near it, and my tumor marker level is now at 20. I just started my 2nd round of chemo and may actually survive the cancer.

  28. Now they are trying to sell surveillance as health by Anonymous Coward · · Score: 0

    What a crock of doublespeak claptrap.

  29. Exactly this by lamer01 · · Score: 1

    Some symptoms that doctors may dismiss as other illnesses, this search capability may catch.

  30. How would this data be applied? by Anonymous Coward · · Score: 0

    I check my symptoms online, and the next day I get a note from a (for profit) medical service: Come in for a checkup.
    I research my leaky faucet, and the next day I get a visit from a friendly neighbourhood plumber.
    I research privacy, and the next day a government official calls to say how my privacy is assured, they want to take care of me.
    I can sit back and know that everyone is concerned about my welfare....

  31. Lesson by Alopex · · Score: 2

    What I took away from this is that people who use Bing get cancer.

  32. Screening tests need very low false positive rates by ceoyoyo · · Score: 1

    One in ten thousand is a lot of erroneous diagnoses when you're doing web search scraping.