Google Flu Trends Gets It Wrong Three Years Running
wabrandsma writes with this story from NewScientist: "Google may be a master at data wrangling, but one of its products has been making bogus data-driven predictions. A study of Google's much-hyped flu tracker shows that it has consistently overestimated flu cases in the US for years. It's a failure that highlights the danger of relying on big data technologies.
Evan Selinger, a technology ethicist at Rochester Institute of Technology in New York, says Google Flu's failures hint at a larger problem with the algorithmic approach taken by technology companies to deliver services we all want to use. The problem is with the assumption that either the data that is gathered about us, or the algorithms used to process it, are neutral. Google Flu Trends has been discussed at Slashdot before: When Google Got Flu Wrong."
Google Flu: That unit is defective. Its thinking is chaotic. Absorbing it unsettled me.
Google aren't "masters of data wrangling", they're masters of PR: advertisers think all that data will help them shovel their shit.
Not surprising, most analysis on huge data sets is incorrect, that's why the NSA thing is scary! They get it wrong and you end up with a missile through your window! Oops...
action into place far showing the data.
You can see a trend and make a forecast. Then take action to slow the trend based on the forecast and then the prediction will be wrong.
The Kruger Dunning explains most post on
Woohoo same results as the cdc!
Learn from nature! Google needs a genetic algorithm that modifies itself every flu season.
The fittest algorithm will survive to infect thousands.
With big data, when you actively look for patterns you always find them; this is how hedge funds have been operating for years. The purpose of the technology is not to make predictions, but rather to confirm existing trends and possibly identify new ones.
Proper way to utilize big data in this case would be:
1) to assist the CDC in confirming or refuting trends observed in the field
2) to offer additional correlations (such as: are people living closer to highways more sensitive to specific strains of flu)
3) to provide long-term indicators facilitating the assessment of medication and other flu containment factors
Big data is not a magic eight ball but it's not a piece of shit either.
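To put a rough number on the "look for patterns and you always find them" point above, here is a minimal Python sketch. It says nothing about how Google's actual model works: the `target` series and the 2,000 `candidates` are hypothetical, made of pure random noise, so any strong correlation the search turns up is spurious by construction.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
WEEKS = 10              # assumed length of each series
N_CANDIDATES = 2000     # assumed number of unrelated series searched

# Hypothetical data: the "flu" target and every candidate are independent noise.
target = [rng.gauss(0, 1) for _ in range(WEEKS)]
candidates = [[rng.gauss(0, 1) for _ in range(WEEKS)] for _ in range(N_CANDIDATES)]

# Search all candidates and keep the one that correlates most strongly.
correlations = [pearson(series, target) for series in candidates]
best = max(correlations, key=abs)
print(f"strongest correlation among {N_CANDIDATES} noise series: {best:+.2f}")
```

With short series and enough candidates, the best match looks impressive even though nothing is related to anything, which is why a pattern found this way is only a lead to check against field data, not a forecast.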
lucm, indeed.
Yes it has warmed over the last 15 years, you moron.
Your statement has been shown false many many times. Please stop.
The Kruger Dunning explains most post on
Well, except for the warming climate https://www2.ucar.edu/climate/...
It's turtles all the way down.
He's not a moron, he's probably a republican; they have figured out that if you constantly state lies as facts then many people will believe them. It is the second best thing in politics after money, which is why republicans are currently having so much success at ruining America.
http://www.ncdc.noaa.gov/sotc/...
Global Highlights
The combined average temperature over global land and ocean surfaces for January was the warmest since 2007 and the fourth warmest on record at 12.7°C (54.8°F), or 0.65°C (1.17°F) above the 20th century average of 12.0°C (53.6°F). The margin of error associated with this temperature is ± 0.08°C (± 0.14°F).
The global land temperature was the highest since 2007 and the fourth highest on record for January, at 1.17°C (2.11°F) above the 20th century average of 2.8°C (37.0°F). The margin of error is ± 0.18°C (± 0.32°F).
For the ocean, the January global sea surface temperature was 0.46°C (0.83°F) above the 20th century average of 15.8°C (60.5°F), the highest since 2010 and seventh highest on record for January. The margin of error is ± 0.04°C (± 0.07°F).
If I choose not to believe it, it cannot be true!
It's interesting how we always think we know everything, and how silly the mistakes earlier generations made are.
..well, so pretty much have all the FUD-spreaders in the CDC, government, and NGOs who've all been telling us that "any moment" we could get a "deadly flu" since the (ha ha ha) SARS "epidemic".
All I've ever gotten is the "Cry Wolf" heebie jeebies.
-Styopa
What kind of nonsense will NewScientist come up with next? The hole in the ozone layer WON'T kill everyone by 2010? Global warming WON'T burn us all to death and evaporate the oceans by 2012? When the weatherman is able to tell me what happens two weeks out accurately I might have a bit more faith in data analysts.
Is it still in Beta? They should get this "right" and maybe look at other large scale models like weather modeling and add culture (how close people tend to get to each other, how much they are inside in the immediate vicinity of other humans) to the algorithms. It took Google years to get Gmail out of beta but it was pretty good while they were calling it "beta".. Slashdot on the other hand....
I was promised a flying car. Where is my flying car?
According to the Harvard Law of Animal Behavior, "under carefully controlled experimental circumstances, an animal [or a human] will behave as it damned well pleases."
Wait, how does that differ from Democrats exactly?
It's beta, and will be discontinued momentarily.
but one of its products has been making bogus data-driven predictions. A study of Google's much-hyped flu tracker shows that it has consistently overestimated flu cases in the US for years.
Bogus? Are you sure they weren't just... wrong?
It's a prediction.
systemd is Roko's Basilisk.
In addition to "all of the above", the other contribution is that of the philosophical equivalent of Heisenberg: the predictions of outbreaks may have increased vaccination usage in the areas involved, which of course will have an effect of downplaying the outbreaks in those areas.
Not saying I have any evidence for that, (and I will wager it unlikely, considering the #s who vaccinate is still far lower than it should be), but a correlation study may be interesting to see.
If the point of knowledge of a possible outcome is to act to deter it, then shouldn't the actions that attempt to deter it be taken into account?
"But remember, most lynch mobs aren't this nice." (H.Simpson)
-- Joe
How have we decided that it was over-estimated? Do we have ANY data on how often the flu is unreported? Just about every single person I know has been sick with at least 1 kind of upper-respiratory illness this season, but none of them reported it to the CDC, or even a local hospital. How about nyquil sales numbers, for starters?
I'll respond to you and hope that your friends below manage to find it.
http://wattsupwiththat.com/201...
There is an embarrassing (for you) graph at the link in case you have trouble with numbers.
I've shown you my data now you can show me yours and we'll see who is the moron, you moron.
You'll notice in the graph at your link that the temperature trend is flat since about 2000. No warming. Thanks for proving my point.
Big picture here ...
http://www.ncdc.noaa.gov/sotc/...
In Australia recently they've been pushing people to get vaccinated against influenza for the coming winter because of the reported rise in flu cases during the recent North American winter, especially for the 18 - 60 age bracket. I hope they weren't using Google as their source.
All this time I thought the only reason "big data" mattered was to provide motivation for companies to invade my online privacy and better target advertising. I can tell you all those male enhancement product ad placements really hurt my self image.
U think big data cool? Ur algorithm is better than God's?
Think again
All it did was find people talking about the flu. To actually make forecasts you need to have medical data that Google is not allowed to have due to HIPAA. There are companies doing epidemiology work with data from hospitals and doctors.
I see you bought a nice bunch of cherries from Anthony Watts.
Sadly the only thing it proves is your (and Judith Curry's) lack of education on the subject of statistics. Note I am actually being generous here by assuming that Curry doesn't understand her mistakes.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
He's part of the Democratic Party's fan club. Naturally, his team can do no wrong, and the other team are all monsters.
Just how many cases are reported anyway? How many remain uninsured, certain that Obamacare is going to kill their grandmothers or puppy if they dare to explore signing up?
Wait, how does that differ from Democrats exactly?
They tell different lies.
It's time to die Goog; it's time to die.
So does this mean that all those shiny blue racks of gleaming hardware in the Google Cloud adverts around Slashdot don't actually work??? I really feel sorry for the guys at Google who installed it all and thought they were actually on to something. Only to find that it comes out with the wrong answer every time.
Not all flu cases are discovered, and not all persons with the flu are knocked out by it, so the missing numbers are probably mild cases where the people actually continue to go to work, or rather, study.
You don't really get the scientific model do you? You know, the one where you don't pick an outlier as a base, and then try to "prove" that a trend is occurring by picking another outlier point. The technical term for that kind of "research" would be nit-picking, and is generally frowned upon by real researchers. You know, the kind of people who actually know up from down, contrary to you.
Or maybe you just can't wrap your head around this whole thing called climate. I'll help you, climate is not weather. If you take your malformed little graph and zoom out, you would have one heck of a difficult time trying to make your model fit. That's why the real researchers can pick any range of years and get the same results as any other range, while you can only pick this one set. Isn't that just disheartening? You're trying so hard, and yet failing so badly.
But yeah I get it. You've drunk the Kool-Aid and committed to the lie, there is no going back. Facts be damned, the world will just have to conform to your belief eh? And who gives a shit, the real consequences of your kind of ignorance will only surface when you're long gone.
... whatever
Sling out services that mutate quickly and die if unsuccessful. Google traditionally relies on complex algorithms to get things "right".
The headline is that the prediction was overestimating three times in the past three years. So what?
Google's Flu Trend plots don't have uncertainties on them, so they'll never be exactly right. So they either have to be overestimates or underestimates. In any three years, you are going to get at least *two* under or over estimates. So post-hoc, saying "ZOMG! There's three overestimates in three years!! #EPICFAIL LOL!" isn't very meaningful.
Until Big Data People understand statistical uncertainty and are happy to put prediction confidence intervals on their data, this will keep happening. However, prediction confidence intervals are an admission of uncertainty, and uncertainty is weakness, right? And we won't have any of that in our corporate Big Data strategy document. Mr Statistician, you're fired, we're hiring some more Big Data Scientists.
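The counting argument above can be checked by brute force. This is a minimal Python sketch under a loudly stated assumption that is not in the original comment: a hypothetical unbiased forecaster whose yearly misses are independent and equally likely to land over or under. It is not a model of Google Flu Trends itself.

```python
from fractions import Fraction
from itertools import product

YEARS = 3

# Assumption: each year is independently "over" or "under" with equal probability,
# so all 2**3 = 8 sequences are equally likely.
outcomes = list(product(("over", "under"), repeat=YEARS))


def share(predicate):
    """Fraction of the equally likely outcomes that satisfy the predicate."""
    hits = sum(1 for outcome in outcomes if predicate(outcome))
    return Fraction(hits, len(outcomes))


# At least two misses in the same direction (guaranteed by pigeonhole).
at_least_two_same = share(lambda o: max(o.count("over"), o.count("under")) >= 2)
# All three misses in the same direction, either way.
all_three_same = share(lambda o: len(set(o)) == 1)
# All three misses are overestimates specifically.
all_three_over = share(lambda o: o.count("over") == YEARS)

print(at_least_two_same, all_three_same, all_three_over)  # 1 1/4 1/8
```

So even a forecaster with no bias at all would produce three overestimates in a row one time in eight, and a same-direction streak one time in four. The direction of three misses alone says little; the size of the misses relative to a stated prediction interval is what would.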
I think the error just shows how many take a flu-day without being actually sick.
"Flu virus predicted to take US congress in 2014 with 96.34% certainty."
"The technical term for that kind of "research" would be nit-picking, and is generally frowned upon by real researcher. You know, the kind of people who actually knows up from down, contrary to you."
Actually the term is cherry-picking. Nit-picking is focusing on trivial details.
It is really a flawed experimental design. If I have the flu, I go to the doctor or I go to bed, I don't go to Google. If I have a bad cold, and can't decide whether it's the flu or not, I google the symptoms. The sicker you are, the less need to Google. The model might be predictive for really bad colds in cities, or really mild cases of flu.
Might as well face it I'm addicted to data.
It doesn't mean they'll come true. I predict I'll win the lottery. Yay, I won $4 instead of $40M. It's a noble cause, but it just needs some tinkering to get things right.