Why Published Research Findings Are Often False
Hugh Pickens writes "Jonah Lehrer has an interesting article in the New Yorker reporting that all sorts of well-established, multiply confirmed findings in science have started to look increasingly uncertain as they cannot be replicated. This phenomenon doesn't yet have an official name, but it's occurring across a wide range of fields, from psychology to ecology. In medicine, the phenomenon seems extremely widespread, affecting not only anti-psychotics but also therapies ranging from cardiac stents to Vitamin E and antidepressants. 'One of my mentors told me that my real mistake was trying to replicate my work,' says researcher Jonathan Schooler. 'He told me doing that was just setting myself up for disappointment.' For many scientists, the effect is especially troubling because of what it exposes about the scientific process. 'If replication is what separates the rigor of science from the squishiness of pseudoscience, where do we put all these rigorously validated findings that can no longer be proved?' writes Lehrer. 'Which results should we believe?' Francis Bacon, the early-modern philosopher and pioneer of the scientific method, once declared that experiments were essential, because they allowed us to 'put nature to the question,' but it now appears that nature often gives us different answers. According to John Ioannidis, author of Why Most Published Research Findings Are False, the main problem is that too many researchers engage in what he calls 'significance chasing,' or finding ways to interpret the data so that it passes the statistical test of significance, the ninety-five-per-cent boundary invented by Ronald Fisher. 'The scientists are so eager to pass this magical test that they start playing around with the numbers, trying to find anything that seems worthy,'"
Even in academia, there's an establishment and people who are powerful within that establishment are rarely challenged. A new upstart in the field will be summarily ignored and dismissed for having the arrogance to challenge someone who's widely respected. Even if that respected figure is incorrect, many people will just go along to keep their careers moving forward.
I'm a scientist myself. It's quite clear from where I'm standing that to get good jobs, research grants, etc., one needs plenty of published articles. Whether the conclusions of those are true or false is not something that hiring committees will delve into too much. If you are young and have a family to support, it can be tempting to take shortcuts.
That article is as flawed as the supposed errors it reports on. The author just "discovered" that biases exist in human cognition? The "effect" he describes is quite well understood, and is the very reason behind the controls in place in science. This is why we don't, in science, just accept the first study published, why scientific consensus is slow to emerge. Scientists understand that. It's journalists who jump on the first study describing a certain effect, and who lack the honesty to review it in the light of further evidence, not scientists.
After years of speculation, a study has revealed that scientists are, in fact, human. The poor wages, long hours, and relative obscurity that most scientists dwell in have apparently caused widespread errors, making them almost pathetically human and just like every other working schmuck out there...
I'll add another cause to the list. The "publish or perish" mentality encourages researchers to rush work to print often before they are sure of it themselves. The annual review and tenure process at most mid-level research universities rewards a long list of marginal publications much more than a single good publication.
Personally, I feel that many researchers publish far too many papers with each one being an epsilon improvement on the previous. I would rather they wait and produce one good well-written paper rather than a string of ten sequential papers. In fact, I find that the sequential approach yields nearly unreadable papers after the second or third one because they assume everything that is in the previous papers. Of course, I was guilty of that myself because if you wait to produce a single good paper, then you'll lose your job or get denied tenure or promotion. So, I'm just complaining without being able to offer a good solution.
Is it possible that there has always been error, but it is just more noticeable now given that reporting is more accurate?
Precisely. As mentioned in a Scientific American blog:
"The difficulties Lehrer describes do not signal a failing of the scientific method, but a triumph: our knowledge is so good that new discoveries are increasingly hard to make, indicating that scientists really are converging on some objective truth."
That's not a given. Particularly in the soft sciences - psychology, for instance - it is extremely difficult to control for all factors (I'm more inclined to say nearly impossible) and so replication of results can be subsumed by other effects, or even simply not work at all. You know that whole generation gap thing? That's a good example of groups of people who are different enough that the reactions they will have to certain subject matter can be polar opposites. So something that was "definitively determined" in 1960 may be statistically irrelevant among the current generation.
That's just one example of how squishy this all is. Without having to bring lying into it at all. And then, there will be liars; and there will be people who draw conclusions without scientific rigor at all, simply because it's just too difficult, expensive or time-consuming to attempt to confirm the ideas at hand. And there is the outlier personality; the one who accounts for those other few percent -- all the declarations of "this is how it is" are false for them right out of the gate.
Hard sciences simply lend themselves a lot better to repeatability. Where I think we go wrong is assigning the same certainties to the claims of the soft scientists. I have personally seen psychiatrists, best intent not in doubt, completely err in characterizing a situation to the great detriment of the people involved, because the court took the psychiatrist's word as gospel truth.
All science is an exercise in metaphor, but soft science is an exercise of metaphor that is almost always far too flexible. One place you can see this happening is the trendy / cyclic adherence to Freud, Jung, Maslow, Rogers and so forth... the "correct" way to raise babies... Ferberizing, etc. This stuff isn't generally lies at all, but it also generally isn't "right." Good intentions do not automatically make good science.
Serious medicine is another good example. Something that might work very well for you might not work at all for me; get the wrong group of test subjects, and your results will skew or worse. This is an area that I think is fair to call a hard science, but where we just don't know enough about the systems involved. Generally speaking, I don't think our oncologist lies to us; further, I think he's pretty well aware of the limitations of his practice and the state of knowledge that informs it; but they just don't know enough. To which I hopefully add, "yet."
On a personal level - since that's all I can really affect - I treat soft science about the same way I do astrology. If you believe it, you'll probably attempt to modify your behavior because of the predictions, which in turn may, or may not, affect your actual outcome. If you don't, it's either irrelevant or too uncertain to trust anyway. So it's low confidence all the way.
I do, however, still place very high confidence in Boyle's law for gases. Hard science works very well. :)
It's only lying if you do it intentionally. If ten labs independently and without knowing of each other perform essentially the same experiment, and one of them has a statistically significant result, is that lying? The other nine won't get published because, unfortunately, people only rarely (and for large or controversial experiments) publish negative results, but the one anomalous study will.
The vast majority of science is performed with all the good will in the world, but it's simply impossible for scientists to not be human. That's why we do replicate experiments - hell, my wife just published a paper where she tried to replicate someone else's results and got entirely different ones, and analyzed why the first guy got it wrong.
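To put a rough number on the ten-labs scenario above, here is a minimal sketch. It assumes the labs are independent, that no real effect exists, and that each uses the usual 0.05 threshold; the function name is made up for illustration.

```python
def prob_at_least_one_false_positive(n_labs: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_labs independent experiments on a
    nonexistent effect crosses the significance threshold alpha by luck."""
    # Each lab stays below the threshold with probability (1 - alpha);
    # all of them staying below it is that probability multiplied n_labs times.
    return 1.0 - (1.0 - alpha) ** n_labs


if __name__ == "__main__":
    for n in (1, 10, 20):
        print(n, round(prob_at_least_one_false_positive(n), 3))
```

Under those assumptions, ten honest labs have roughly a 40% chance of producing at least one publishable "finding" between them, with nobody lying.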
Wow. I didn't pick up any of that at all, and I RTFA. It looked to me much more like acknowledgement of widespread difficulties with randomness, scale, and human fallibility. Exactly the kinds of things that would make someone who's a staunch defender of "science as a means to truth" disregard valuable critical information about it.
Did you even read the article?
This is basically about poorly designed clinical drug trials without sufficient controls. Sloppy work, even if it seemed rigorous enough at the time.
The sensationalistic "scientific method in question" stuff is pure BS, but after all this is New Yorker magazine we're talking about, so one wouldn't expect too much scientific literacy. It was the scientific method of "predict and test" that caught these erroneous results, so the method itself is fine. The "scientist" who designed a sloppy experiment is to blame, not the method.
However, I'm not sure that psychiatric drug trials even deserve to be called science in the first place. The principle of GIGO (Garbage In - Garbage Out) applies. This is touchy-feely soft science at best. How do you feel today on a scale of 1-10? Do the green pills make you happy?
nahh, the problem is a misunderstanding of statistics (thinking that post-hoc analysis, with this fishing for statistical significance, is as valid as proper hypothesis testing). The proper way is where the hypothesis is fully pre-formed and then tested. The numbers and statistics apply ONLY TO THE HYPOTHESIS being tested, so you cannot hunt for a statistical significance just somewhere in the data and then re-formulate your hypothesis.
The need to publish (a scientist's income relies on what he publishes in most cases) as well as funding issues force scientists to try to find some usable results from their science, and by trawling through their data they can often salvage what would otherwise have been a failed bit of research. Except this salvaging operation may actually be absolutely worthless. This is most often not done on purpose but rather due to only partly understanding what statistics and significance testing tell us.
So, a capitalistic, fully performance based (with results being the performance metric) environment does not seem to work well for science.
Surprised?
Me neither.
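The gap between testing one pre-formed hypothesis and fishing through the data can be sketched with a small simulation. Everything here is an assumption for illustration: 20 outcome measures per study, 25 subjects, pure-noise data with known unit variance (so a plain z-test applies), and made-up function names.

```python
import math
import random


def z_significant(sample, z_crit=1.96):
    """Two-sided z-test of 'mean == 0' for data with known unit variance."""
    z = (sum(sample) / len(sample)) * math.sqrt(len(sample))
    return abs(z) > z_crit


def false_positive_rates(n_studies=1000, n_outcomes=20, n_subjects=25, seed=0):
    """Return (preregistered_rate, fishing_rate) when no real effect exists."""
    rng = random.Random(seed)
    prereg_hits = 0
    fishing_hits = 0
    for _ in range(n_studies):
        # Every outcome is pure noise: the null hypothesis is true everywhere.
        outcomes = [[rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
                    for _ in range(n_outcomes)]
        flags = [z_significant(o) for o in outcomes]
        prereg_hits += flags[0]      # hypothesis fixed in advance: outcome 0 only
        fishing_hits += any(flags)   # report whichever outcome "worked"
    return prereg_hits / n_studies, fishing_hits / n_studies


if __name__ == "__main__":
    prereg, fishing = false_positive_rates()
    print(f"pre-formed hypothesis: {prereg:.2f}, fishing: {fishing:.2f}")
```

The pre-formed hypothesis should come up falsely significant about one time in twenty, as advertised, while "anything significant anywhere" should turn up in well over half of the studies. The data are the same in both cases; only the question changed.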
Very well expressed. To put this in a context which will seem bizarre to many readers of slashdot, there is a whole range of products on the market to help "scientific astrologers" search out correlations between planetary positions and life circumstances. And a legion of astrologers making use of them -- at several hundred dollars a copy -- to pore over birth charts with dozens and dozens of factors. Unless things have changed in the years since I looked into this, what's usually conveniently sidestepped is that some of those factors will indeed show up significant by chance. After all, that is the very definition of probability expressions such as "p less than .05". On replication, these findings will normally disappear, resulting in a crestfallen astrologer. (Then again, why not just expand the original dataset and check again to see if different factors come up this time :-)
But the motivation to get something out of the data is high, as the parent post points out, and researchers may be able to deceive themselves just as well as astrologers can, especially when academic careers are on the line.
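The "disappears on replication" effect can also be sketched. This is a toy model, not anything from an actual astrology package: 200 noise-only factors (an assumed number, chosen so the counts are visible), 30 charts per sample, and a z-test on each factor's mean.

```python
import math
import random


def significant_factors(rng, n_factors, n_charts, z_crit=1.96):
    """Indices of noise-only factors whose sample mean passes a two-sided z-test."""
    hits = set()
    for factor in range(n_factors):
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_charts)) / n_charts
        if abs(mean * math.sqrt(n_charts)) > z_crit:
            hits.add(factor)
    return hits


def replicate(n_factors=200, n_charts=30, seed=1):
    """Return (original_hits, surviving_hits) for two independent samples."""
    rng = random.Random(seed)
    original = significant_factors(rng, n_factors, n_charts)
    retest = significant_factors(rng, n_factors, n_charts)
    # A finding "survives" only if the same factor is significant both times.
    return original, original & retest


if __name__ == "__main__":
    original, surviving = replicate()
    print(len(original), "significant at first,", len(surviving), "after replication")
```

With these settings about ten factors should look significant in the first sample, and each of those has only about a 5% chance of showing up again in the second, so typically none or one survives.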
The problem is actually the opposite. Note the E.S.P. experiment cited in the article. Rhine's initial experiment suggested to him that E.S.P. was real. Before publishing his results he did the right thing and reran the tests and the results proving E.S.P. were not repeatable.
The next part is his absolute failure to understand the scientific method and statistics. He concluded that "extra-sensory perception ability has gone through a marked decline." In fact what he experienced was regression toward the mean.
Taking a well understood principle, renaming it with a term that suggests an action is taking place, then arguing that you have found some new phenomenon that proves science doesn't work is not critical information about anything.
It is ignorance that will be dismissed for obvious reasons. Too much time and energy is wasted repeatedly addressing these attacks on science by people who want so badly for their pseudo-science or supernatural beliefs to be true. In a perfect world, when somebody stumbles upon regression to the mean without knowing it, they would do additional research to understand what they are observing, rather than conclude that their initial experiment was correct and the supernatural ability they detected was "declining" instead of accepting the alternative: it was never there in the first place.
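Regression toward the mean is easy to reproduce with no ability involved at all. The sketch below assumes a card-guessing test of 25 cards with a 1-in-5 chance of a hit per card, 2000 subjects who are all pure guessers, and a retest of the top 5%; the function names are made up for illustration.

```python
import random


def card_guessing_scores(rng, n_subjects, n_cards=25, p_hit=0.2):
    """Scores from pure guessing: each of n_cards is a hit with probability p_hit."""
    return [sum(rng.random() < p_hit for _ in range(n_cards))
            for _ in range(n_subjects)]


def retest_top_scorers(n_subjects=2000, top_fraction=0.05, seed=2):
    """Return (mean first-round score, mean retest score) of the top scorers."""
    rng = random.Random(seed)
    first = card_guessing_scores(rng, n_subjects)
    n_top = max(1, int(n_subjects * top_fraction))
    top_first = sorted(first, reverse=True)[:n_top]
    # Nobody has any ability, so the retest is simply a fresh round of guessing.
    top_retest = card_guessing_scores(rng, n_top)
    return sum(top_first) / n_top, sum(top_retest) / n_top


if __name__ == "__main__":
    first_mean, retest_mean = retest_top_scorers()
    print(f"top scorers: {first_mean:.1f} hits at first, {retest_mean:.1f} on retest")
```

The selected "gifted" group scores well above the chance level of 5 hits the first time and falls back to about 5 on the retest. Nothing declined; the first scores were luck, and luck does not carry over.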
Start with a ridiculous premise to get people reading, then break out what's really happening
Welcome to corporate journalism. And corporate science.
If there's one useful thing that 30 years of recreational gaming has taught me, it's this: Players will find loopholes in any set of rules, and exploit them relentlessly for an advantage. Corollaries include the tendency for games to degenerate into contests between different rulebreaking strategies and the observation that if you raise the stakes to include rewards of real value (like money) then the games with loopholes attract players who are not interested in the contest, but only in winning.
This lesson applies to all aspects of life from gaming, to sports, business, and even dating.
And so it's no surprise that when the publishers set up a set of rules to validate scientific results, those engaged in the business of science will game those rules to publish their results. They're being paid to publish; if they don't publish, they've "lost" or "failed" because they will receive no further funding. So the stakes are real. And while the business of science still attracts a lot of true scientists (those interested in the process of inquiry), it now also attracts a lot of players who are only interested in the stakes. Not to mention the corporate and political interests who have predetermined results that they wish to promulgate.
What was really the point of implying that truth can change?
To game the system, of course. The aforementioned corporate and political interests will use this line of argument now, in order to discredit established scientific premises.
No, scientists in many fields (and some of which you would expect the opposite) do not understand statistics well.
If you dig through your well-gathered data you will find correlations that are purely chance, which is why you are supposed to be looking for the predetermined correlation, not just any correlation. But when you've spent a lot of time and effort gathering a set of data, digging into it to find other things seems like a reasonable plan. And it is, as long as you do another completely separate data-gathering study to check what you find (but there's great pressure to publish something now, since you just spent a huge wad of cash and your performance is measured by what you publish, not by actual scientific progress).
Scientists do this. Traders at investment banks (and elsewhere) do this. People just do this.
"Fooled by Randomness" by Taleb is a good look into this from the trading perspective. Assuming you don't mind his writing style, "ego-centric and pompous" is a common description (though I don't find it so).
I'm pretty sure investment banking is dominated by "rightoids" which nullifies your ridiculous injection of politics into the universal human bias to see patterns in randomness.
Are you serious? Many thousands of people are dead simply because a few people were trying to stay gainfully employed to support their families?
I am truly sorry if this comes off as offensive as I think it does but if you believe there would be mass suffering from unemployment if we did not bomb the shit out of Iraq and that was the basis for the lies that resulted in many thousands losing their lives then you are seriously deluded.
As a U.S. citizen I found Clinton's actions and lies embarrassing, but the lies from Bush transferred billions, if not trillions, of public funds into the hands of a few and resulted in the deaths of many thousands of people.
Comparing lies about a blow job to lies resulting in debt and death is absurdity on a grand scale.
> nahh, the problem is a misunderstanding of statistics (thinking that post-hoc analysis, with this fishing for statistical significance, is as valid as proper hypothesis testing). The proper way is where the hypothesis is fully pre-formed and then tested. The numbers and statistics apply ONLY TO THE HYPOTHESIS being tested, so you cannot hunt for a statistical significance just somewhere in the data and then re-formulate your hypothesis.
The significance of this fundamental mistake cannot be overstated. It seems to be prevalent in medical literature and there was a doctor doing the rounds lecturing about this a couple of years back. I wish I could recall exactly which podcast but he covered all sorts of common fundamental errors in medical research statistics and did it in a very accessible way. The key thing to remember is that if you have enough variables there WILL by complete coincidence be correlation between some of them in any given sample. So to test a hypothesis properly, not only must you formulate it in advance without looking for any correlation within the data, but you must look at more than one data set to verify your findings.
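The "enough variables" point gets worse faster than people expect, because the number of pairs to compare grows with the square of the number of variables. A back-of-the-envelope sketch, assuming every variable is independent noise and each pair is tested at the 0.05 level (the function name is made up for illustration):

```python
from math import comb


def expected_spurious_pairs(n_variables: int, alpha: float = 0.05) -> float:
    """Expected number of variable pairs that look 'significantly' correlated
    at level alpha when every variable is actually independent noise."""
    # comb(n, 2) is the number of distinct pairs; each one has probability
    # alpha of passing the test by chance.
    return comb(n_variables, 2) * alpha


if __name__ == "__main__":
    for n in (5, 20, 50):
        print(n, "variables:", comb(n, 2), "pairs,",
              expected_spurious_pairs(n), "expected by chance")
```

Twenty variables already give 190 pairs, so about nine or ten "significant" correlations are expected from noise alone, which is why a second data set is needed before believing any of them.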
> So, a capitalistic, fully performance based (with results being the performance metric)
> environment does not seem to work well for science. / Surprised? / Me neither.
This is a gratuitous, cheap shot. These problems appear only in scientific research that is funded, managed, or supervised by government agencies or academic review committees so that bureaucrats will grant money, or full professorships, or licenses to sell drugs. Hence the crack that if you want to study squirrels in the park, you title your grant proposal, "Global Warming and Squirrels in the Park."
There are "capitalistic... performance-based environments" in science - but they're the corporate R&D departments that are seeking marketable innovations. There isn't much intellectual corruption or fudging of study results in, say, pushing the limits of video card performance.
No, that doesn't solve the problem, it increases it.
The consistent lack of results is a result, and a very useful one too.
The logical next step is to ban marketing of humbug until and unless the snake oil sellers can show valid scientific theories and peer reviews for their remedies.
Likewise, capitalist-funded research needs to stop rewarding findings and start treating all results as equally valid science, and to stop punishing scientists who produce negative and inconclusive results. That's good science, which is what they should pay for.
Consistently publishing more results than randomness would dictate is a clear indication of bad science, and should be punished, not rewarded.
>>so you cannot hunt for a statistical significance just somewhere in the data and then re-formulate your hypothesis
Cannot? Or should not?
I work as an external evaluator on federal projects, and have been told by one group I worked with, after I delivered a negative result on their data, that "we know that the stats can say anything - why don't you take another look at the stats and find something that makes us look better?" I refused, saying it would be dishonest to change the analysis. They fired me, saying "most evaluators make us look better than the data, but you're making us look worse."
The entire point of an external evaluator is to have a third party looking at your data, so as to prevent this kind of analysis fudging, but when I reported it to the federal case officer overseeing the grant, they just shrugged and didn't care. They don't want any drama to crop up in the grants they oversee. Makes them look bad to *their* bosses.
I'm a biochemist. After earning my PhD five years ago I've been working in academia, but my funding's about to run out and I'm applying for jobs at biotech and pharmaceutical companies. Do you think I had the empathy and morality centers of my brain removed or something? Do you think that every single person working in those sectors underwent the same procedure or was blessed from birth with complete amorality? The reality is that science is hard. The reality is that science is expensive. The reality is that our knowledge is incomplete and we do the best we can with the limited resources at our disposal. If we're lucky, that means we can turn a life-destroying illness into something treatable. Take cancer, for example. There's no magic pill to take it away and probably never will be, but it's because cancer is a large family of diseases caused by different breakdowns of cellular mechanisms, many of them mechanisms that we don't understand very well and that are very hard to tease apart. That's why cancer, and diseases in general, tend to end up with treatments and not one-pill cures, not because big pharma's hiding it.
My brother went through 10 months of chemotherapy. 10 months of being nauseous, 10 months of not wanting to eat, 10 months without a sense of smell, 10 months with no sense of taste, 10 months of physical weakness, 10 months of diminished mental capacity, 10 months of needles, 10 months of IVs full of chemicals that burned when they went in, 10 months of doctors prodding and poking. He's now cancer-free and has been for 12 years. Back when he went through that his odds of surviving Hodgkin's lymphoma were about 80%. Current treatment has reached 90%, and a recent experimental treatment is at 98%. They're all still unpleasant and take months. Do you honestly think that if I had the ability to jump in with a magic pill and spare my brother those 10 months I wouldn't do it because it might hurt the corporate bottom line? Fuck the bottom line. Fuck having a job if it came to it. That's the prevailing attitude in biotech and pharmaceutical companies because they're made up of people like me, people who have seen loved ones go through horrible illness, and not the monsters your fantasy requires.