Science and the Shortcomings of Statistics
Kilrah_il writes "The linked article provides a short summary of the problems scientists have with statistics. As an intern, I see it many times: Doctors do lots of research but don't have a clue when it comes to statistics — and in the social science area, it's even worse. From the article: 'Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.'"
My doctor was explaining to me that my blood sugar readings should not have a standard deviation of more than 1/3rd of the average blood sugar reading. Just to test if he knew what it meant, I asked him what a standard deviation was. Oh the fun when he tried to bullshit his way out of that one! He eventually told me that when I plot my data in Excel I can ask it to give me statistics on the column and it would mention what the standard deviation value was. But when I pressed on and asked him what a standard deviation is, he shooed me off and told me to go look it up. Never did he confess that he had no clue.
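For anyone else who got told to "go look it up": the ratio the doctor was describing (standard deviation divided by the mean) is usually called the coefficient of variation. Below is a minimal sketch of the check in Python. The readings are invented for illustration, the function name is my own, and I'm assuming the sample standard deviation is the one meant.

```python
import statistics

def coefficient_of_variation(readings):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(readings)
    sd = statistics.stdev(readings)  # sample SD (n - 1 in the denominator)
    return sd / mean

# Hypothetical glucose readings in mg/dL, made up for this example only.
readings = [95, 110, 140, 88, 160, 102, 125]

cv = coefficient_of_variation(readings)
print(f"mean = {statistics.mean(readings):.1f}")
print(f"sd   = {statistics.stdev(readings):.1f}")
print(f"cv   = {cv:.2f}")

# The rule of thumb from the comment above: SD no more than 1/3 of the mean.
if cv <= 1 / 3:
    print("within the 1/3 rule of thumb")
else:
    print("exceeds the 1/3 rule of thumb")
```

The standard deviation itself is just the typical distance of a reading from the average, so the ratio says how large that spread is relative to the average level.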
As a doctor myself, I feel I should add my $0.02...
Throughout med school we had the odd scattered lecture on statistics, and later when reading papers I used to skim over most of the maths just to look for the P value at the end (one representation of how statistically significant a result is).
However, I then took a formal stats course and was amazed at how little I understood - Monte Carlo techniques, Markov models, and even something as trivial yet important as the difference between a parametric and a non-parametric test.
And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the results when they should have chosen a non-parametric test instead.
So yes, stats are vitally important and badly taught, and the teaching focuses too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did was in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.
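To make the distinction concrete, here is a rough sketch of one kind of non-parametric test, a permutation test on the difference in means. It is not what any particular paper used and it is only one of several distribution-free options. The function name, the sample sizes, and the log-normal data are all my own assumptions for illustration.

```python
import random
import statistics

def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    No normality assumption: the null distribution is built by shuffling
    the group labels instead of being read off a t table.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_resamples + 1)

data_rng = random.Random(42)
# Made-up, heavily right-skewed samples (log-normal), not real patient data.
# Both groups come from the same distribution, so there is no true effect.
control = [data_rng.lognormvariate(0.0, 1.0) for _ in range(30)]
treated = [data_rng.lognormvariate(0.0, 1.0) for _ in range(30)]

p = permutation_p_value(control, treated)
print(f"permutation p-value: {p:.3f}")
```

The only assumption doing the work here is that the labels are exchangeable under the null hypothesis, which is a much weaker claim than "the data fall on a normal curve".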
-Nano.
It is not a shortcoming of statistics that other people, like various scientists who aren't statisticians, don't know how to use or properly interpret statistics. It is a shortcoming of their knowledge.
It is not a shortcoming of the Copenhagen interpretation of quantum mechanics or the Chicago school of economics if I don't understand or know how to correctly interpret their results. It is my shortcoming and fault for not knowing enough to connect the dots.
I do statistical research, and some of it comes from interacting with researchers in the biosciences. When I go to a researcher and ask whether they could use some statistical, mathematical, or computational assistance with their research, it has almost always been a fruitful starting point for long conversations and for getting into the research. Sometimes it was simply a matter of looking at their F-test results or ANOVA scores and telling them what they meant (as with a regression model relating proportions of certain characteristics between taxa). The more useful interactions for me often mean working on new algorithms or estimators, or fitting a model to their empirical data because there isn't a reliable standard model to work from (as with intergenic distance between genes in an operon). That kind of challenge makes the less engaging work worth the hassle. Maybe I'm odd because I've worked hard to have a good background in both statistics and biology, but I shouldn't be.
Here is an observation from my own experience that perhaps supports some of the intent of the article. I was speaking with a biology graduate student and it came up that they had a biostatistics course in the department. Of course, as a statistician, my mind goes towards survival functions, failure rates, life tables, censored data, bioassay, epidemiology, microarrays, clinical trials, and topics along those lines. It turned out their course focused on z tests, t tests, F tests, confidence intervals, point predictions, least squares regression, multiple regression, ANOVA, and things along these lines, just with simulated problems in a lab setting. That is not necessarily a bad thing, but much of the core math was underplayed or missing, like model assumptions, alternate formulations, or things like dummy variables. The worst part was that even though they were doing well in the class, they had no confidence in actually using the statistics and didn't understand how to interpret something like a confidence interval. They knew how to calculate one, but it wasn't clear to them what it actually meant.
The corollary to the notion in the summary that I'd rant about is that scientists overall have weaker skills in mathematics, statistics, and computation than those who studied those disciplines principally, and that's hurting science. However, many people in those three disciplines know little beyond basic results in any of the sciences, which hurts the applicability of these mathematical fields to the sciences and likely hurts our ability to develop certain types of discipline-specific results that can be generalized from work on applied problems.
In either case, whether you're a typical scientist or a typical math/stat/comp person, becoming proficient enough in the other areas requires going an awfully long way out of your way compared to any counterpart who simply does not care and goes straight through as many have before. In some areas of research on either side, it is no problem to do as has been done and not pursue knowledge in those other areas. Increasingly, though, the results with the highest impact are coming from truly interdisciplinary research. To encourage that for those who are interested in such fields (aside from making clearer which areas of each field border on such interdisciplinary work), we need more incentive to study more than one field and/or better ways of enabling fruitful cooperation between the camps.
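Since the meaning of a confidence interval is exactly what trips people up, a small simulation may help. This is a sketch under my own assumptions: normally distributed data, an invented true mean of 100, and a z-based interval standing in for the t interval. The "95%" describes how often the procedure captures the true mean over many repeated samples, not the probability that any one computed interval contains it.

```python
import random
import statistics

def mean_ci_95(sample):
    """Approximate 95% confidence interval for the mean.

    Assumption: n is large enough that the normal critical value (about
    1.96) is a fair stand-in for the t critical value.
    """
    z = statistics.NormalDist().inv_cdf(0.975)
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - z * se, m + z * se

def coverage(true_mean=100.0, sd=15.0, n=50, trials=2000, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        lo, hi = mean_ci_95(sample)
        if lo <= true_mean <= hi:
            covered += 1
    return covered / trials

print(f"fraction of intervals containing the true mean: {coverage():.3f}")
```

Run it and the fraction should land close to 0.95, a little under because of the z shortcut. Each individual interval either contains 100 or it doesn't.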
In reading a couple of these types of articles recently I've noticed that the articles always talk about this being a problem across all journals, but only seem to mention a couple of different disciplines - medicine usually chief among them. Has anyone heard/read anything naming a hard science (e.g. chemistry or physics) as full of bad stats? My hunch is that this happens most often in medicine because you have the combination of needing to control for a lot of variables and inadequate mathematics training.
Actually, one of the most dangerous uses of statistics is precisely using them to make inappropriate predictions. Curve fitting is especially prone to this error: attempting to make any predictions outside of the central mass of the points used to *produce* the curve is completely bogus, and yet people do it all the time.
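A toy illustration of the extrapolation problem, with everything in it made up by me: the "true" process is sin(x), but it is only observed on 0 to 1, where it looks nearly like a straight line. A least-squares line fits well inside that range and fails badly far outside it.

```python
import math
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Observations only cover 0 <= x <= 1 (invented data, no noise added).
xs = [i / 10 for i in range(11)]
ys = [math.sin(x) for x in xs]
slope, intercept = fit_line(xs, ys)

def predict(x):
    return slope * x + intercept

# Compare the fitted line with the true curve inside and outside the data.
inside_err = abs(predict(0.5) - math.sin(0.5))
outside_err = abs(predict(5.0) - math.sin(5.0))
print(f"error at x = 0.5 (inside the data):  {inside_err:.3f}")
print(f"error at x = 5.0 (far outside it):   {outside_err:.3f}")
```

Nothing in the fit itself warns you about this. The residuals on the observed range look fine, which is exactly why people keep doing it.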
The ringing of the division bell has begun... -PF