Science and the Shortcomings of Statistics
Kilrah_il writes "The linked article provides a short summary of the problems scientists have with statistics. As an intern, I see it many times: Doctors do lots of research but don't have a clue when it comes to statistics — and in the social science area, it's even worse. From the article: 'Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.'"
In other news, math may not lie but people still can, all the honesty and good statistics in the world don't help end-user stupidity, and there are statistically two popes per square kilometer in the Vatican.
A bullet may have your name on it but splash damage is addressed "To whom it may concern."
How do you figure that? My latest calculations placed it at 70% [Note: Error +/- 10%].
It's not just statistics that people have a problem with...
-- Braden's law of data: All data spends some of its lifetime in an excel spreadsheet.
My doctor was explaining to me that my blood sugar readings should not have a standard deviation of more than 1/3rd of the average blood sugar reading. Just to test if he knew what it meant, I asked him what a standard deviation was. Oh the fun when he tried to bullshit his way out of that one! He eventually told me that when I plot my data in Excel I can ask it to give me statistics on the column and it would mention what the standard deviation value was. But when I pressed on and asked him what a standard deviation is, he shooed me off and told me to go look it up. Never did he confess that he had no clue.
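For anyone wondering what the number Excel reports actually is: the sample standard deviation is the square root of the sum of squared distances from the mean, divided by n - 1. Here is a minimal Python sketch of that, plus the doctor's one-third rule. The readings and the function name are made up for illustration and are not real patient data.

```python
import statistics

def sd_within_third_of_mean(readings):
    """Return (mean, sample SD, True if SD <= mean / 3)."""
    mean = statistics.mean(readings)
    sd = statistics.stdev(readings)  # sample SD, n - 1 in the denominator
    return mean, sd, sd <= mean / 3

# Hypothetical blood sugar readings in mg/dL, invented for illustration.
readings = [95, 110, 102, 130, 88, 121, 99]
mean, sd, ok = sd_within_third_of_mean(readings)
print(f"mean={mean:.1f}  sd={sd:.1f}  within one third of mean: {ok}")
```

The ratio sd / mean is usually called the coefficient of variation, so the rule amounts to keeping that ratio under about 0.33.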
How to Lie with Statistics by Darrell Huff. Recommended reading.
Do not mock my vision of impractical footwear
Our company's Six Sigma training included two weeks of collecting and analyzing data with a stats package. I got enough experience to actually learn how to use the program, and I can still do a few things that come up regularly. Probably the best thing to come out of Six Sigma (for me at least).
As a doctor myself, I feel I should add my $0.02...
Throughout med school we had the odd scattered lecture on statistics, and later when reading papers I used to skim over most of the maths just to look for the P value at the end (one representation of how statistically significant a result is).
However, I then took a formal stats course and was amazed at how little I understood - Monte Carlo techniques, Markov models, and even something as trivial yet important as the difference between a parametric and a non-parametric test.
And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the data when they should have chosen a non-parametric test instead.
So yes, stats are vitally important, badly taught, and focus too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did were in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.
-Nano.
that there are only 3 kinds of scientists: those that are good at math and those that aren't.
Game: Player 'Donald J Trump' now has AI skill level 'experimental'.
I think your example would be more persuasive if it involved algebra, though.
In other news math may not lie but people still can...
Usually (in science at least) it's not even a matter of lying. Part of the problem is that the multi-headed monster that statistics has become has a tendency to lead people to over-use numerical "answers" vomited up by stats packages, without really understanding what they are for, or how to interpret them.
Statistics are very useful for predicting certain things, but all too often they are submitted as "proof" of a given condition, which is dangerous. Sometimes we need to throw away statistics and start applying common sense.
It's perfectly reasonable that someone use a calculator for sales tax (if an exact answer is desired).
Also, sales tax is multiplication - not algebra.
You are a jerk.
You are insulting your sister because she is bad at mental math? It is a skill, and not one required for extensive knowledge of the social sciences. Also, maybe sales tax is simple in your state, like 10%, but where I live it is 4.5%, which is not always easy to get exactly right in your head.
I had a roommate who was brilliant, funny, a singer and an artist, and yet he couldn't calculate a tip to save his life, but I certainly don't hold that against him.
One of the best articles I've seen on stats (and their misuse). I'm taking a data analysis course at the moment and I've spent at least a dozen hours simply computing confidence intervals, testing the null hypothesis, and determining significance. It really has changed how I view statistics because it keeps pounding in these very key but oft-ignored principles.
"Everything is linear if plotted log-log with a fat magic marker."
It is not a shortcoming of statistics that other people, like various scientists who aren't statisticians, don't know how to use or properly interpret statistics. It is a shortcoming of their knowledge.
It is not a shortcoming of the Copenhagen interpretation of quantum mechanics or the Chicago school of economics if I don't understand or know how to correctly interpret their results. It is my shortcoming and fault for not knowing enough to connect the dots.
I do statistical research, some of it through interacting with researchers in the biosciences. When I go to talk to a researcher and ask whether they could use some statistical, mathematical, or computational assistance with their research, it has almost always been a fruitful starting point for long conversations and for getting into the research. Sometimes it was simply a matter of looking at their F-test results or ANOVA scores and telling them what they meant (as with a regression model relating proportions of certain characteristics between taxa). More useful interactions for me often mean working on new algorithms or estimators, or fitting a model to their empirical data because there isn't a reliable standard model to work from (like intergenic distance between genes in an operon); that kind of challenge makes the less engaging work worth the hassle. Maybe I'm odd because I've worked hard to have a good background in both statistics and biology, but I shouldn't be.
Here is an observation from my own experience that perhaps supports some of the intent of the article. I was speaking with a biology graduate student and it came up that they had a biostatistics course in the department. Of course, as a statistician, my mind goes towards survival functions, failure rates, life tables, censored data, bioassay, epidemiology, microarrays, clinical trials, topics along those lines. It turned out their course focused on z tests, t tests, F tests, confidence intervals, point predictions, least squares regression, multiple regression, ANOVA, and things along these lines, just with simulated problems in a lab setting. That is not necessarily a bad thing, but much of the core math was underplayed or missing, like model assumptions, alternate formulations, or things like dummy variables. The worst part was that even though they were doing well in the class, they had no confidence in actually using the statistics and didn't understand how to interpret something like a confidence interval: they knew how to calculate one, but it wasn't clear to them what it actually meant.
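On what a confidence interval "actually means": one way to see it is by simulation. If you repeat the same experiment many times and compute a 95% interval each time, roughly 95% of those intervals should contain the true mean. The sketch below assumes normally distributed data with a known sigma so that a plain z-interval is exact. The parameter values and the function name are illustrative assumptions, not anything from the course described above.

```python
import random
from statistics import NormalDist

def coverage(true_mean=10.0, sigma=2.0, n=30, trials=2000, level=0.95, seed=1):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    half_width = z * sigma / n ** 0.5          # known-sigma z-interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = sum(sample) / n
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / trials

print(coverage())
```

The 95% is a statement about the procedure over many repetitions, not a 95% probability that any one published interval contains the truth.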
The corollary to the notion in the summary, if I may rant, is that scientists overall have weaker skills in mathematics, statistics, and computation than those who studied those disciplines principally, and that's hurting science. However, many in those three disciplines know little beyond basic results in any of the sciences, which hurts the applicability of these mathematical fields to the sciences and likely hurts our ability to develop discipline-specific results that can be generalized from work on applied problems.
In either case, whether you're a typical scientist or a typical math/stat/comp person, becoming proficient enough in the other areas requires going an awfully long way out of your way compared to any counterpart who simply does not care and goes straight through, as many before have. In some areas of research on either side it is no problem to do as has always been done and not extend your knowledge into those other areas. Increasingly, though, the results with the highest impact are coming from truly interdisciplinary research. To encourage that for those who are interested in such fields (aside from making clearer which areas in each field border on such interdisciplinary work), we need more incentive to study more than one field and/or better ways of enabling fruitful cooperation between the camps.
I don't have to be a statistician to know that the above post is 97% bullshit.
In reading a couple of these types of articles recently I've noticed that the articles always talk about this being a problem across all journals, but only seem to mention a couple of different disciplines - medicine usually chief among them. Has anyone heard/read anything naming a hard science (e.g. chemistry or physics) as full of bad stats? My hunch is that this happens most often in medicine because you have the combination of controlling for a lot of variables as well as inadequate mathematics training.
It's a troll because it implies scientists don't know about those things.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
And 77.335% of all statistics claim more accuracy than their expected deviation warrants.
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
And lots of others. It then suggests Bayesian reasoning as an alternative to traditional statistical tests.
Most post-PhD scientists are aware of the common mistakes, but being aware that we make mistakes doesn't necessarily stop us from making them. If you choose a random set of conference proceedings, it is almost certain you will find at least one paper (and I suspect usually a dozen or more) with statistical mistakes in it.
Unfortunately, it is hard to break a viscous cycle. The high viscosity makes it easy to get stuck.
"FDA staff reviewers expressed concern about the number of patients who were left out of the study because they died."
Yes, it's rarely mentioned that causation implies correlation.
Interestingly, I have observed a correlation between people who cite that "correlation != causation" and those who ignore "causation implies correlation" in their arguments.
Lars T.
To the guy who modded me down from perfect to terrible Karma - Apple haters still suck
and IAAB (biologist) and I can tell you that most scientists don't have access to statisticians or don't have the grant money to pay for them. I also don't have time to learn SAS and code my own tests, therefore I use stuff like SPSS or Genstat (both of which do allow you to code your own tests as well). Just because they are easy to use doesn't mean I do or do not understand the tests, the assumptions or their results. I would say my grasp of stats is above average for my peer group, below where I would like it to be and obviously limited.
One thing that is interesting to me is that throughout my education and career I have been warned off using multiple means comparisons, and LSD (least significant difference) in particular. I understand why, and have avoided the former where I can and the latter always. Yet the only actual statisticians I have dealt with in recent years have recommended that I use LSD on means comparisons with tens of means. I would be hard pressed to publish those results.
In summary, whilst statisticians like to blame easy-to-use stats programs for bad stats, the reality is they are just a tool, and if statisticians can't agree on the acceptable use of the simplest procedures, I'm not sure what chance the rest of us have of getting it right.
Si hoc legere scis nimium eruditionis habes.
Statistics is changing slowly (mostly because computers and R make non-classical statistics more practical) but the way it's taught still leads to problems.
Actually the subtler issue here has nothing to do with statistics, they are implying peer-review does not work.
"Peer review" is another of the things that has been over-sold to the public. A science research group spends six months and a hundred thousand dollars conducting a research study using highly specialised equipment. They submit a paper to an academic conference or a small journal. It gets put out to review by three people who each spend about four hours reading it and reviewing it, and who usually do not have access to the equipment or the original data that was used in the study. Do you really think we're likely to catch every mistake at review? We certainly can't check the stats (except for the most egregious errors) because we don't have the full data tables they analyzed.
Scientists actually accept that inevitably some incorrect results will be published. More often in the smaller conferences than in the most prestigious journals, but even the journals have to publish a retraction every now and then. We also accept that most studies are never repeated, and so the "objective repeatable experiment" is rarely really tested for being either objective or repeatable. However, science has long had the "many eyes" effect at work. There are hundreds of thousands of scientists reading papers and using them in our own experiments. If some theorised effect out there is wrong, usually we'll find out eventually.
Peer review is not about catching mistakes, although it can on occasion. Peer review is about clear communication, such that the experiment can be repeated as identically as possible and that the readers can understand the authors' justification for their conclusions. At least that's what every journal article I've read on the topic indicated was the reason for the peer review process's creation. One of my advisors asked me about it on my written preliminary exam and I needed to do a lot of reading to be prepared for the oral exam. There were several different societies that claimed to have originated the idea, but no one claimed that the purpose was to catch mistakes, fabrications, or data manipulations.
Bureaucracy expands to meet the needs of the expanding bureaucracy.-Oscar Wilde
Interestingly, I have observed a correlation between people who cite that "correlation != causation" and those who ignore "causation implies correlation" in their arguments.
Ah yes, but can you suggest any causal relationship between those two observations?
Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke
I'm interested in learning the essentials of statistics. What would be a good book to start me out?
I got The Manga Guide to Statistics and it did introduce me to the very basics. However, there are many places where it just gives you an equation, without deriving it or even explaining it. After reading this book, I now know how to calculate standard deviation, but I'm still a bit vague on how people actually use it. I would like to see some examples of how people use statistics in (for example) science experiments.
My ideal book would explain the basics, with examples, and show how the math works. Ideally it wouldn't be a thousand pages long, either, but that's a secondary consideration.
Recommendations, please?
P.S. Those of you who know about statistics: how good are the Wikipedia pages on statistics?
steveha
lf(1): it's like ls(1) but sorts filenames by extension, tersely
.. or at least not the probability of the hypothesis. This is one of the errors that people make. Having 0.95 significance does NOT imply having a 95% chance of the hypothesis being true! The significance is the probability of the test outcome assuming the hypothesis is true (in other words, it is a likelihood value). You have to multiply it by a prior, and normalize, to obtain real probabilities.
Significance values will not even add up to 1 over the two hypotheses!
The root of the problem is that frequentists cannot use probabilities for statements -- only for events. In frequentist terms you have to have a sigma algebra over some Omega state space which is measurable. Bayesians on the other hand can talk about the probabilities of any statements using probability theory as an extension of formal logic. I really recommend reading the books of E. T. Jaynes and David MacKay.
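To put numbers on the "multiply by a prior" point, here is a minimal Python sketch of Bayes' rule applied to a significant test result. It treats "the test came out significant" as the observed event, which is a simplification. The prior, the power of 0.8, and the function name are assumptions chosen for illustration.

```python
def posterior_real_effect(prior, power, alpha):
    """P(effect is real | test came out significant), by Bayes' rule."""
    true_pos = prior * power          # effect is real and the test is significant
    false_pos = (1 - prior) * alpha   # no effect, but significant anyway
    return true_pos / (true_pos + false_pos)

for prior in (0.1, 0.5):
    post = posterior_real_effect(prior, power=0.8, alpha=0.05)
    print(f"prior={prior}  posterior={post:.3f}")
```

With a 10% prior the posterior comes out at 0.64, and even with a 50% prior it is about 0.94, so in neither case does "significant at 0.05" translate into a 95% chance that the hypothesis is true.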
Other false assumptions people make with statistics:
- Everything is normally distributed
- Everything has a variance
- Everything has an expected value
- Hypothesis testing is without bias (in fact it is equivalent to giving 50% prior probability to both hypotheses)
- Variance means average distance from mean
- Empirical variance does not have a variance
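On the "variance means average distance from mean" item in the list above, a tiny Python sketch shows that neither the variance nor the standard deviation equals the average distance from the mean (the mean absolute deviation). The data set is a toy example chosen so the arithmetic is easy to follow.

```python
from statistics import mean, pvariance, pstdev

data = [0, 0, 0, 0, 10]  # toy numbers chosen for illustration

m = mean(data)
mean_abs_dev = mean([abs(x - m) for x in data])  # average distance from the mean
variance = pvariance(data)                       # average SQUARED distance
sd = pstdev(data)                                # square root of the variance

print(f"mean={m}  mean abs deviation={mean_abs_dev}  variance={variance}  sd={sd}")
```

Here the average distance from the mean is 3.2, while the variance is 16 and the standard deviation is 4, because squaring gives the single outlier more weight.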
I'm actually at a scientific meeting and saw 7 presentations in which they "double dipped" on their statistics before we broke for lunch.
Double-dipping is bad enough, but the medical field is rife with multiple-dipping. Each dataset is plumbed to test dozens of hypotheses, without appropriately adjusting the acceptance criteria. Even with separate datasets, if you test 20 hypotheses and discover that each one is just valid at the 95% confidence level, then there is a very good chance that there are some false positives. In the medical alleged-sciences, however, all 20 would be blindly proclaimed as truth.
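The arithmetic behind the 20-hypotheses point can be sketched in a few lines of Python. This assumes the tests are independent and that every null hypothesis is actually true, which is the worst case for false positives. The function name is illustrative, and the Bonferroni correction is one common adjustment, not necessarily the one the poster had in mind.

```python
def familywise_error(alpha, m):
    """Chance of at least one false positive in m independent tests
    when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** m

m = 20
naive = familywise_error(0.05, m)           # each test run at the 0.05 level
bonferroni = familywise_error(0.05 / m, m)  # Bonferroni: each test at alpha / m
print(f"uncorrected: {naive:.3f}  Bonferroni-corrected: {bonferroni:.3f}")
```

Uncorrected, the chance of at least one false positive across 20 tests is about 64%. Dividing the threshold by the number of tests brings it back to just under 5%.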
And then there are the social nonsenses^W sciences... If practitioners of some discipline do not understand how to use quantitative methods, they should limit themselves to qualitative argument only. Unfortunately, in statistics as in other fields, those who are ignorant or incompetent are generally unaware of the extent of their ignorance and incompetence.
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
I'm sick of this bullshit. There is statistics, and there is lies. Statistical operations are mathematical procedures, which may or may not be appropriate. They are not, however, lies. They may be errors, deliberate or accidental. Lies, on the other hand, are what you introduce when the data does not fit the hypothesis you want to put forward. Blame the liar, not his smoke and mirrors.
[FUCK BETA]
I have the same problem. In school they were considering putting me in remedial classes because I had trouble doing basic arithmetic with even single-digit numbers (I still have trouble with anything above 6). I can and could do a reasonably accurate estimate, but not the real result (possibly has something to do with me also having a bad short-term memory). As soon as we got to the abstract bit (i.e. real math) I had no trouble. I can do integration with coordinate system shifts (e.g. cartesian->polar) in my head, but I will have to check my constants with a calculator.
The largest demographic in American prisons is black Americans. Real statistic, but is it true?
Given a particular sample that indicates blacks are 60% of the prison population this would appear to be true.
But what if I said: "The largest demographic in prison is minority, non-whites." Suddenly the % jumps from 60% (black) to 80% (minority). Which is more right? This is the problem with statistics. Context.
Now I can say readily that the largest demographic in prison is actually right-handed people. The % now jumps to 90%.
But wait! There is more! The largest demographic in prison is actually people who prior to arrest were below the poverty line, which jumps to 99% of the population. Again, all of the above are accurate based on a sample, but which is MORE correct? Linear Algebra is coming into play here quickly....
When that kind of issue comes into play, it is the classic "Correlation != Causation" confusion. The majority of people in prison are in there because of "Being black? Being a minority? being right handed? or being poor?" None of the above. The majority of them are in there because they were convicted of a crime and sentenced. That is the causation of their imprisonment, the rest is correlation which may have a direct causation on the conviction or sentencing, but no direct causation on being in prison. (e.g. You cannot be thrown into prison for being poor, black, minority, right handed)
Same with medical research, politics, economics, etc. Take the price of oil rising 10% and a subsequent 5% drop in shipping orders. Measuring the significance of regressors is important but, oddly, rarely reported. Many factors get masked or shadowed by higher-level regressors (e.g. being a minority masks a variety of other social and economic factors). In addition, a regressor can distort statistical work by being too broad: Asians, North American blacks, and African immigrants each have a different mix of economic and social factors.
Back to the original subject:
We can take 100 prisoners and 100 non-prisoners and figure out rather quickly if being black is statistically significant in prison population. Non-prison population blacks would account for 25%-45% of the population (Depending on location). We can see that 60% of prisoners are black. There is a 20+% deviation from the norm. We can test to see the significance of that. Same with minorities. Now we find something quickly that right handed is insignificant because it doesn't deviate from the norm. We can test left-handed and right-handed populations and rule out the handed-ness of a convict being significant.
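The "test the significance of that" step can be sketched as a pooled two-proportion z-test in Python. The counts below reuse the hypothetical sample figures from this post. The 35% baseline is an assumed midpoint of the 25%-45% range quoted, the 90% handedness rate in both groups is likewise assumed, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).
    Assumes the pooled proportion is strictly between 0 and 1."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical samples of 100 each: 60% in one group versus an assumed 35% baseline.
z_trait, p_trait = two_proportion_z(60, 100, 35, 100)
# Right-handedness at an assumed 90% in both groups: no deviation at all.
z_hand, p_hand = two_proportion_z(90, 100, 90, 100)
print(f"trait: z={z_trait:.2f} p={p_trait:.4f}   handedness: z={z_hand:.2f} p={p_hand:.2f}")
```

A significant difference here only says the proportions differ between the two samples. It says nothing about why, which is where the poverty question in the next paragraph comes in.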
We can find that economic status is considerably MORE significant than minority or black as a status. We can determine that the reason minorities or blacks are disproportionately more prevalent in prison is that blacks and minorities have higher rates of poverty. We can extract and determine the statistical weight of POVERTY in regards to imprisonment (since we find a high % of whites in prison who are poor compared to the normal population). Once we figure that out we can remove that and continue the investigation and figure out what weight minority and black have once we have removed POVERTY from the model (residual analysis).
The problem in reporting is that without providing the whole, comprehensive analysis you can miss important things. For instance, to correct the injustice in sentencing, without reporting the weight POVERTY has in contrast to BLACK or MINORITY you may lose sight of the fact that you may have better success addressing POVERTY to normalize sentencing rather than MINORITY or BLACK (or not).
The same happens in medical research. Given a cocktail of drugs, without having the whole analysis you may end up providing more of Medicine A versus B but lose sight of the fact that A & B are limited by the dosage of Medicine C.
Statistics are not bullshit, rather merely observations with no intrinsic agenda or even implication of truth. Purely amoral, like a hand gun.. useful to both the good and evil.
Statistics don't lie, nor do they tell the truth. They simply show the relationship of the data as it stands. The Truth or Truthiness of it is subjective and vulnerable to context.
-=[ Who Is John Galt? ]=-
That's exactly the point. If obtaining a degree of certainty in one measurement takes a bookload of theory to do 'properly', and is 'hard', obtaining the same degree of certainty in a space with N channels should be 'hard'^N. The OP's point was that people assume that it should be just as easy, and don't go to the trouble of learning what it takes to do it right.
That people are trying to use peer-review as a method to detect fraud does not make it a good method for doing so. I've mentioned this before on /., although not in this thread, but I have no way of telling if the numbers in a table were generated by the experiment described, some other experiment, a random number generator, or the PR department at the company whose product is being evaluated. As long as the numbers are internally consistent, I have to "trust" that what they describe happened. I can catch obvious errors, such as the SEM not supporting claims of statistical significance made by the authors. However, if during the review process they claim that the SEM was a typo (numbers were actually SD and not SEM, for example) and change it, I have no way of verifying that their explanation was true.
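To show why the SD versus SEM label matters so much to a reviewer: the standard error of the mean is the standard deviation divided by the square root of the sample size, so the same printed number supports very different claims depending on which one it is. A minimal Python sketch, with a made-up reported value and sample size:

```python
from math import sqrt

def sem_from_sd(sd, n):
    """Standard error of the mean from a sample standard deviation."""
    return sd / sqrt(n)

def ci95_half_width(sem):
    """Approximate 95% interval half-width (normal approximation)."""
    return 1.96 * sem

# Hypothetical table entry: "mean +/- 5.0" with n = 25.
reported, n = 5.0, 25
if_sem = ci95_half_width(reported)                  # the 5.0 really is the SEM
if_sd = ci95_half_width(sem_from_sd(reported, n))   # the 5.0 was actually the SD
print(f"half-width if SEM: {if_sem:.2f}   half-width if SD: {if_sd:.2f}")
```

Relabelling the number from SEM to SD shrinks the implied interval by a factor of sqrt(n), five-fold in this example, which can turn a non-significant difference into a significant one without any of the printed numbers changing.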
Also, in your quote you highlighted 2 different lines. The first has to do with the soundness of the conclusions. This is most definitely a role of peer review, but not related to accuracy. It doesn't mean that they verify that your conclusions are correct. Conclusions are not objective. The data gives you objective facts from which to draw subjective conclusions. This line indicates that your discussion will be evaluated for how well the data (yours and previous literature) supports your conclusions. If you extrapolate, or ignore important results then your paper will be rejected.
The second bolded section just indicates that if serious errors are found (using insufficiently large sample size, extrapolating results, etc.) then the paper will be rejected. That's totally understandable to reject, but obviously serious errors of this sort are uncommon. Most errors are much harder to detect, and are not picked up by the peer review process in my experience.
Bureaucracy expands to meet the needs of the expanding bureaucracy.-Oscar Wilde