Science and the Shortcomings of Statistics

Posted by samzenpus on Wednesday March 17, 2010 @02:30PM from the 14%-of-people-know-that-statistics-can-prove-anything dept.

Kilrah_il writes "The linked article provides a short summary of the problems scientists have with statistics. As an intern, I see it many times: Doctors do lots of research but don't have a clue when it comes to statistics — and in the social science area, it's even worse. From the article: 'Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.'"

12 of 429 comments

Lies, Damned Lies, and Statistics. by Shadow+of+Eternity · 2010-03-17 14:31 · Score: 5, Informative

In other news math may not lie but people still can, all the honesty and good statistics in the world doesn't help end-user stupidity, and there are statistically two popes per square kilometer in the Vatican.

--
A bullet may have your name on it but splash damage is addressed "To whom it may concern."
1. Re:Lies, Damned Lies, and Statistics. by Cryacin · 2010-03-17 15:08 · Score: 5, Funny
  
  Exactly. I would never believe a statistic that I did not make up myself!
  
  --
  Science advances one funeral at a time- Max Planck
2. Re:Lies, Damned Lies, and Statistics. by Chrisq · 2010-03-17 22:33 · Score: 5, Funny
  
  That since a dead clock is right twice a day, those two times cause the clock to work again?
  No, the clock is right all of the time, it just shows local sidereal time and is often in the wrong place
Personal experience by nanoakron · 2010-03-17 14:53 · Score: 5, Interesting

As a doctor myself, I feel I should add my $0.02...
Throughout med school we had the odd scattered lecture on statistics, and later when reading papers I used to skim over most of the maths just to look for the P value at the end (one representation of how statistically significant a result is).
However, I then took a formal stats course and was amazed at how little I understood - Monte Carlo techniques, Markov models, and even something as trivial yet important as the difference between a parametric and a non-parametric test.
And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the data when they should have chosen a non-parametric test instead.
So yes, stats are vitally important, badly taught, and focus too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did were in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.
-Nano.
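
To make the parametric vs. non-parametric point concrete, here is a minimal sketch using only the Python standard library. The two groups are made-up numbers (not from any study), and the helper names `welch_t` and `exact_rank_sum_p` are just illustrative. It compares a Welch t statistic, which leans on roughly normal data, with an exact rank-sum test that only uses the ordering of the values.

```python
import itertools
import math
import statistics

# Made-up data for illustration only: group_b contains one extreme outlier.
group_a = [1, 2, 3, 4, 5]
group_b = [6, 7, 8, 9, 100]


def welch_t(x, y):
    """Welch's t statistic (parametric: leans on roughly normal data)."""
    var_x = statistics.variance(x) / len(x)
    var_y = statistics.variance(y) / len(y)
    return (statistics.mean(y) - statistics.mean(x)) / math.sqrt(var_x + var_y)


def exact_rank_sum_p(x, y):
    """Exact two-sided rank-sum (Mann-Whitney style) p-value.

    Enumerates every way of splitting the pooled ranks into two groups.
    Assumes no tied values and small samples.
    """
    pooled = sorted(x + y)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    observed = sum(rank[v] for v in x)
    expected = len(x) * (len(pooled) + 1) / 2
    all_ranks = range(1, len(pooled) + 1)
    splits = list(itertools.combinations(all_ranks, len(x)))
    extreme = sum(
        1 for s in splits if abs(sum(s) - expected) >= abs(observed - expected)
    )
    return extreme / len(splits)


t_stat = welch_t(group_a, group_b)
p_rank = exact_rank_sum_p(group_a, group_b)
print(f"Welch t statistic: {t_stat:.2f}")
print(f"Exact rank-sum p-value: {p_rank:.4f}")
```

In this toy case the single outlier inflates the variance, so the t statistic comes out around 1.24, well short of the usual 5% cutoff (roughly 2.78 at about 4 degrees of freedom), even though every value in one group is below every value in the other. The rank-based test ignores how far out the outlier sits and gives p of about 0.008. Real data will rarely be this clean; the point is only that the choice of test can change the conclusion.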
1. Re:Personal experience by Frequency+Domain · 2010-03-17 15:24 · Score: 5, Insightful
  
  ...And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the data when they should have chosen a non-parametric test instead.
  So yes, stats are vitally important, badly taught, and focus too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did were in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.
  That's a good insight. I'm a statistics professor, and some of the problems I see are a) people generally get exposed to a single course in statistics; b) they're usually mathematically unprepared for it; c) so much gets squeezed into that one opportunity that heads are exploding; d) because of (a) - (c), everybody wants you to "just give 'em the formula"; e) since statistics is so widely used, there's a plethora of courses that are being taught by people who themselves are victims/products of (a) - (d), and are very happy to "just give 'em the formula"; and so f) most people plug and chug data through a stats package with no idea of the applicability, limitations, and interpretation of the results. The sheer volume of bad analyses is enough to make you weep, and contributes to the widely held perception about "lies, damned lies, and statistics". And that completely ignores the intentional falsehoods propagated by people who are trying to support various advocacy viewpoints, and will happily mislead the public with biased samples, Simpson's paradox, invalid assumptions, etc.
Re:No surprise here by Homburg · 2010-03-17 15:00 · Score: 5, Funny

I think your example would be more persuasive if it involved algebra, though.
The problem is statisticians by BrokenHalo · 2010-03-17 15:01 · Score: 5, Insightful

In other news math may not lie but people still can...

Usually (in science at least) it's not even a matter of lying. Part of the problem is that the multi-headed monster that statistics has become has a tendency to lead people to over-use numerical "answers" vomited up by stats packages, without really understanding what they are for, or how to interpret them.

Statistics are very useful for predicting certain things, but all too often they are submitted as "proof" of a given condition, which is dangerous. Sometimes we need to throw away statistics and start applying common sense.
Re:No surprise here by coolsnowmen · 2010-03-17 15:05 · Score: 5, Insightful

You are a jerk.
You are insulting your sister because she is bad at mental math? It is a skill; one not required for extensive knowledge of the social sciences. Additionally, maybe sales tax is simple in your state, like 10%, but where I live it is 4.5%, which is not always easy to get exactly right in your head.
I had a roommate who was brilliant, funny, a singer and an artist, and yet he couldn't calculate a tip to save his life, but I certainly don't hold that against him.
bad title by obliv!on · 2010-03-17 15:10 · Score: 5, Interesting

It is not a shortcoming of statistics that other people, like various scientists who aren't statisticians, don't know how to use or properly interpret statistics. It is a shortcoming of their knowledge.

It is not a shortcoming of the Copenhagen interpretation of quantum mechanics or the Chicago school of economics if I don't understand or know how to correctly interpret their results. It is my shortcoming and fault for not knowing enough to connect the dots.

I do statistical research, and some of that is through interacting with researchers in the biosciences. When I go to talk to a researcher and ask whether they could use some statistical, mathematical, or computational assistance with their research, it has almost always been a fruitful starting point for long conversations and for getting into the research. Sometimes it was simply a matter of looking at their F-test results or ANOVA scores and telling them what they meant (like with a regression model relating proportions of certain characteristics between taxa). More useful interactions for me often mean working on new algorithms or estimators, or fitting a model from their empirical data because there isn't a reliable standard model to work from (like intergenic distance between genes in an operon). That kind of challenge makes the less engaging work worth the hassle. Maybe I'm odd because I've worked hard to have a good background in both statistics and biology, but I shouldn't be.

Although here is an observation from my own experience that perhaps supports some of the intent of the article. I was speaking with a biology graduate student and it came up that they had a biostatistics course in the department. Of course, as a statistician, my mind goes towards survival functions, failure rates, life tables, censored data, bioassay, epidemiology, microarrays, clinical trials, and topics along those lines. It turned out their course focused on z tests, t tests, F tests, confidence intervals, point predictions, least squares regression, multiple regression, ANOVA, and things along these lines, just with simulated problems in a lab setting. That is not necessarily a bad thing, but much of the core math, like model assumptions, alternate formulations, or things like dummy variables, was underplayed or missing. The worst part was that even though they were doing well in the class, they had no confidence in actually using the statistics and didn't understand how to interpret something like a confidence interval: they knew how to calculate one, but it wasn't clear to them what it actually meant.
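
For what a confidence interval "actually means", a small simulation may help. This is only a sketch with invented parameters (`TRUE_MEAN`, `SIGMA`, `N`, and `TRIALS` are all hypothetical), and it assumes normally distributed data with a known standard deviation so that a plain z-interval applies. It repeats the same imaginary study many times and counts how often the computed 95% interval contains the true mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN = 10.0   # hypothetical "true" population mean
SIGMA = 2.0        # population standard deviation, assumed known
N = 25             # sample size per simulated study
TRIALS = 2000      # number of simulated studies

# Two-sided 95% z critical value (about 1.96) and the interval half-width.
z = statistics.NormalDist().inv_cdf(0.975)
half_width = z * SIGMA / N ** 0.5

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    # Does this study's interval contain the true mean?
    if mean - half_width <= TRUE_MEAN <= mean + half_width:
        covered += 1

coverage = covered / TRIALS
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The fraction should land close to 95%. The "95%" describes the procedure over many repeated studies: about 95 in 100 such intervals capture the true value. It is not a 95% probability statement about any single interval once it has been computed.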

The corollary to the notion in the summary, which I'd rant about and claim, is that scientists overall have weaker skills in mathematics, statistics, and computation than is desirable, and weaker than those who studied those disciplines principally, and that's hurting science. However, many in those three disciplines really know little beyond basic results in any of the sciences, which hurts the applicability of these mathematical fields to the sciences and likely hurts our ability to develop certain types of discipline-specific results that can be generalized from work on applied problems.

In either case, whether you're a typical scientist or a typical math/stat/comp person, becoming proficient enough in the other areas requires going an awfully long way out of your way compared to any counterpart who simply does not care and goes straight through as many before have. In some areas of research on either side it is no problem to do as has been done and not extend one's knowledge into those other areas. Increasingly, though, the results with the highest impact are coming from truly interdisciplinary research. To encourage that for those who are interested in such fields (aside from making clearer which areas in any of the fields border on such interdisciplinary work), we need more incentive to study more than one field and/or better ways of enabling fruitful cooperation between the camps.
only in medicine by rook166 · 2010-03-17 15:20 · Score: 5, Interesting

In reading a couple of these types of articles recently I've noticed that the articles always talk about this being a problem across all journals, but only seem to mention a couple of different disciplines - medicine usually chief among them. Has anyone heard/read anything naming a hard science (e.g. chemistry or physics) as full of bad stats? My hunch is that this happens most often in medicine because you have the combination of needing to control for a lot of variables and inadequate mathematics training.
What it actually said by williamhb · 2010-03-17 15:38 · Score: 5, Informative
Contrary to the parent poster's claim, the article does not focus on correlation vs causation. It focuses on people getting the correlation wrong in the first place. It lists several common mistakes scientists make when writing up research studies. (Not all scientists are very good at stats). These include:
- If you run enough studies you are almost certain to find a difference that appears statistically significant at the p<0.05 level through chance alone. (It is incredibly unlikely that you will win the lottery; but across the whole pool of tickets someone wins it most weeks.) That makes studies that bulk-analyze large amounts of data against many different factors, actively hunting for something that is significantly different, prone to reporting effects that are not real.
- "p < 0.05" does not mean there is a 95% chance of your result being "true"; it just means that, if there were no real effect, chance alone would produce a result at least this extreme less than 5% of the time.
- Tests are often combined in ways that are mathematically inconsistent
- Finding a statistical effect does not mean it is a strong effect
- You cannot simply compare effect sizes between two studies because the results of their control groups may differ ("effect size analysis" is usually wrong)
- Failing to find a significant effect does not mean there is no effect ("we found there was no significant effect on..." is misleading because "no statistical significance" is "no information" [your study didn't tell anybody anything] not "no effect" -- to prove "no effect" you need a different statistical test)
And lots of others. It then suggests Bayesian reasoning as an alternative to traditional statistical tests.
Most post-PhD scientists are aware of the common mistakes, but being aware that we make mistakes doesn't necessarily stop us from making them. If you choose a random set of conference proceedings, it is almost certain you will find at least one paper (and I suspect usually a dozen or more) that has statistical mistakes in it.
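
The first point in the list above (run enough tests and something will look significant) can be checked with a few lines of Python. This is a sketch under simplifying assumptions: the tests are independent, there is no real effect anywhere, and the function names are hypothetical. With no real effect, p-values are uniformly distributed between 0 and 1, so the simulation draws them directly instead of running actual tests.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

ALPHA = 0.05


def chance_of_false_positive(num_tests, alpha=ALPHA):
    """P(at least one p < alpha) across independent tests with no real effect."""
    return 1 - (1 - alpha) ** num_tests


def simulate(num_tests, trials=10000, alpha=ALPHA):
    """Estimate the same probability by drawing uniform p-values directly."""
    hits = 0
    for _ in range(trials):
        if any(random.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


for m in (1, 5, 20, 100):
    print(f"{m:>3} tests: formula {chance_of_false_positive(m):.3f}, "
          f"simulated {simulate(m):.3f}")
```

With 20 independent comparisons the chance of at least one spurious "significant" result is already about 64%, which is why data-dredging studies need some form of correction for multiple comparisons.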
Re:Summery? by Saroful · 2010-03-17 19:37 · Score: 5, Informative

And what's the law about spelling/grammar corrections that incorrectly correct the supposed spelling error? (Redundancy is purposefully deliberate.) "Its" is possessive. "It's" is a contraction of "it" and "is". -- This has been a message from your friendly neighborhood Spelling Nazi.