Why Published Research Findings Are Often False

Posted by samzenpus on Sunday January 2, 2011 @04:27AM from the race-to-publish dept.

Hugh Pickens writes "Jonah Lehrer has an interesting article in the New Yorker reporting that all sorts of well-established, multiply confirmed findings in science have started to look increasingly uncertain as they cannot be replicated. This phenomenon doesn't yet have an official name, but it's occurring across a wide range of fields, from psychology to ecology. In the field of medicine, the phenomenon seems extremely widespread, affecting not only anti-psychotics but also therapies ranging from cardiac stents to Vitamin E and antidepressants. 'One of my mentors told me that my real mistake was trying to replicate my work,' says researcher Jonathan Schooler. 'He told me doing that was just setting myself up for disappointment.' For many scientists, the effect is especially troubling because of what it exposes about the scientific process. 'If replication is what separates the rigor of science from the squishiness of pseudoscience, where do we put all these rigorously validated findings that can no longer be proved?' writes Lehrer. 'Which results should we believe?' Francis Bacon, the early-modern philosopher and pioneer of the scientific method, once declared that experiments were essential, because they allowed us to 'put nature to the question,' but it now appears that nature often gives us different answers. According to John Ioannidis, author of 'Why Most Published Research Findings Are False,' the main problem is that too many researchers engage in what he calls 'significance chasing,' or finding ways to interpret the data so that it passes the statistical test of significance, the ninety-five-per-cent boundary invented by Ronald Fisher. 'The scientists are so eager to pass this magical test that they start playing around with the numbers, trying to find anything that seems worthy,' says Ioannidis."

Hmmmmm by Deekin_Scalesinger · 2011-01-02 04:30 · Score: 4, Interesting

Is it possible that there has always been error, but it is just more noticeable now given that reporting is more accurate?

--
"As the intrepid kobold companion continues his journey, he begins to wonder... if priests raises dead, why anybody die?
1. Re:Hmmmmm by digsbo · 2011-01-02 05:29 · Score: 5, Interesting
  
  Wow. I didn't pick up any of that at all, and I RTFA. It looked to me much more like acknowledgement of widespread difficulties with randomness, scale, and human fallibility. Exactly the kinds of things that would make someone who's a staunch defender of "science as a means to truth" disregard valuable critical information about it.
2. Re:Hmmmmm by Anonymous Coward · 2011-01-02 06:06 · Score: 5, Interesting
  
  Very well expressed. To put this in a context which will seem bizarre to many readers of slashdot, there is a whole range of products on the market to help "scientific astrologers" search out correlations between planetary positions and life circumstances. And a legion of astrologers making use of them -- at several hundred dollars a copy -- to pore over birth charts with dozens and dozens of factors. Unless things have changed in the years since I looked into this, what's usually conveniently sidestepped is that some of those factors will indeed show up significant by chance. After all, that is the very definition of probability expressions such as "p less than .05". On replication, these findings will normally disappear, resulting in a crestfallen astrologer. (Then again, why not just expand the original dataset and check again to see if different factors come up this time :-)
  But the motivation to get something out of the data is high, as the parent post points out, and researchers may be able to deceive themselves just as well as astrologers can, especially when academic careers are on the line.
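The chance-significance point above can be put in numbers. What follows is only a rough sketch, under the assumption that the factors being tested are independent and that none of them has any real effect; the function names `chance_of_false_hit` and `simulate` are made up for illustration and come from no astrology or statistics package.

```python
import random


def chance_of_false_hit(num_factors: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_factors independent tests
    passes p < alpha when every factor is actually pure noise."""
    return 1.0 - (1.0 - alpha) ** num_factors


def simulate(num_factors: int, alpha: float = 0.05,
             trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo check of the formula above.

    Assumption: with no real effect, each test's p-value is uniform
    on [0, 1), so a single random draw stands in for one test.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One "study": test every factor, count it if anything looks significant.
        if any(rng.random() < alpha for _ in range(num_factors)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 10, 40):
        print(k, round(chance_of_false_hit(k), 3), round(simulate(k), 3))
```

Under those assumptions, ten noise factors give roughly a 40% chance of at least one "significant" hit, and forty factors give roughly 87%. Real chart factors are unlikely to be fully independent, so treat the figures as illustrative rather than exact.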
3. Re:Hmmmmm by bughunter · 2011-01-02 06:50 · Score: 5, Interesting
  
  Start with a ridiculous premise to get people reading, then break out what's really happening
  
  Welcome to corporate journalism. And corporate science.
  If there's one useful thing that 30 years of recreational gaming has taught me, it's this: Players will find loopholes in any set of rules, and exploit them relentlessly for an advantage. Corollaries include the tendency for games to degenerate into contests between different rulebreaking strategies and the observation that if you raise the stakes to include rewards of real value (like money) then the games with loopholes attract players who are not interested in the contest, but only in winning.
  This lesson applies to all aspects of life from gaming, to sports, business, and even dating.
  And so it's no surprise that when the publishers set up a set of rules to validate scientific results, those engaged in the business of science will game those rules to publish their results. They're being paid to publish; if they don't publish, they've "lost" or "failed" because they will receive no further funding. So the stakes are real. And while the business of science still attracts a lot of true scientists -those interested in the process of inquiry- it now also attracts a lot of players who are only interested in the stakes. Not to mention the corporate and political interests who have predetermined results that they wish to promulgate.
  
  What was really the point of implying that truth can change?
  To game the system, of course. The aforementioned corporate and political interests will use this line of argument now, in order to discredit established scientific premises.
  
  --
  I can see the fnords!
4. Re:Hmmmmm by JBMcB · 2011-01-02 07:18 · Score: 4, Interesting
  
  The National Center for Complementary and Alternative Medicine has received billions of dollars of public NIH funding. They study "alternative" medicine, such as chiropractic and homeopathic remedies. So far, their strongest conclusion has been that ginger has a slight positive effect on upset stomachs.
  Billions of dollars. Ginger for upset stomachs. When asked why they haven't produced many solid results, the director of NCCAM usually says that they need more funding. I'd say we need a bit more results-based funding in some areas.
  
  --
  My Other Computer Is A Data General Nova III.
5. Re:Hmmmmm by JDS13 · 2011-01-02 07:41 · Score: 5, Interesting
  
  > So, a capitalistic, fully performance based (with results being the performance metric)
  > environment does not seem to work well for science. / Surprised? / Me neither.
  This is a gratuitous, cheap shot. These problems appear only in scientific research that is funded, managed, or supervised by government agencies or academic review committees so that bureaucrats will grant money, or full professorships, or licenses to sell drugs. Hence the crack that if you want to study squirrels in the park, you title your grant proposal, "Global Warming and Squirrels in the Park."
  There are "capitalistic... performance-based environments" in science - but they're the corporate R&D departments that are seeking marketable innovations. There isn't much intellectual corruption or fudging of study results in, say, pushing the limits of video card performance.
It's simple. by Lord Kano · 2011-01-02 04:31 · Score: 5, Interesting

Even in academia, there's an establishment and people who are powerful within that establishment are rarely challenged. A new upstart in the field will be summarily ignored and dismissed for having the arrogance to challenge someone who's widely respected. Even if that respected figure is incorrect, many people will just go along to keep their careers moving forward.
LK

--
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
race to the bottom by toomanyhandles · 2011-01-02 04:38 · Score: 4, Interesting

I see this as one more planted article in mainstream press: "Science is there to mislead you, listen to fake news instead". The rising tide against education and critical thinking in the USA is reminiscent of the Cultural Revolution in China. It is even more ironic that the argument "against" metrics that usefully determine validity is couched in a pseudo-analytical format itself. At this point in the USA, most folks reading (even) the New Yorker have no idea what a p-value is, why these things matter, and they will just recall the headline "science is wrong". And then they wonder in Detroit why they can't make $100k a year anymore pushing the button on a robot that was designed overseas by someone else. You know, overseas, where engineering, science, etc. are still held in high regard.
Bogus article by Anonymous Coward · 2011-01-02 04:48 · Score: 5, Interesting

That article is as flawed as the supposed errors it reports on. The author just "discovered" that biases exist in human cognition? The "effect" he describes is quite well understood, and is the very reason behind the controls in place in science. This is why we don't, in science, just accept the first study published, why scientific consensus is slow to emerge. Scientists understand that. It's journalists who jump on the first study describing a certain effect, and who lack the honesty to review it in the light of further evidence, not scientists.
Re:News Flash: Scientists Human Too, Study Finds by onionman · 2011-01-02 04:50 · Score: 5, Interesting

After years of speculation, a study has revealed that scientists are, in fact, human. The poor wages, long hours, and relative obscurity that most scientists dwell in have apparently caused widespread errors, making them almost pathetically human and just like every other working schmuck out there...
I'll add another cause to the list. The "publish or perish" mentality encourages researchers to rush work to print often before they are sure of it themselves. The annual review and tenure process at most mid-level research universities rewards a long list of marginal publications much more than a single good publication.
Personally, I feel that many researchers publish far too many papers with each one being an epsilon improvement on the previous. I would rather they wait and produce one good well-written paper rather than a string of ten sequential papers. In fact, I find that the sequential approach yields nearly unreadable papers after the second or third one because they assume everything that is in the previous papers. Of course, I was guilty of that myself because if you wait to produce a single good paper, then you'll lose your job or get denied tenure or promotion. So, I'm just complaining without being able to offer a good solution.
logical contortions in the article by bcrowell · 2011-01-02 05:10 · Score: 4, Interesting

The article can be viewed on a single page here: http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer?currentPage=all
Not surprisingly, most of the posts so far show no signs of having actually RTFA.
Lehrer goes through all kinds of logical contortions to try to explain something that is fundamentally pretty simple: it's publication bias plus regression to the mean. He dismisses publication bias and regression to the mean as being unable to explain cases where the level of statistical significance was extremely high. Let's take the example of a published experiment where the level of statistical significance is so high that the result only had one chance in a million of occurring due to chance. One in a million is 4.9 sigma. There are two problems that you will see in virtually all experiments: (1) people always underestimate their random errors, and (2) people always miss sources of systematic error.
It's *extremely* common for people to underestimate their random errors by a factor of 2. That means that the 4.9-sigma result is really only a 2.45-sigma result. But 2.45-sigma results happen about 1.4% of the time. That means that if 71 people do experiments, typically one of them will get a 2.45-sigma result. That person then underestimates his random errors by a factor of 2, and publishes it as a result that could only have happened one time in a million by pure chance.
Missing a systematic error does pretty much the same thing.
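The sigma arithmetic in the underestimated-error example can be checked with a few lines of standard-library Python. This is a sketch that assumes a normal distribution and two-sided tail probabilities (the convention under which one in a million comes out near 4.9 sigma); the helper names are invented for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sigma_for_two_sided_p(p: float) -> float:
    """Number of standard deviations whose two-sided tail probability is p."""
    return STD_NORMAL.inv_cdf(1.0 - p / 2.0)


def two_sided_p_for_sigma(z: float) -> float:
    """Chance of a fluctuation at least z standard deviations in either direction."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(z))


if __name__ == "__main__":
    claimed = sigma_for_two_sided_p(1e-6)   # significance as published
    actual = claimed / 2.0                  # errors were really twice as large
    p_actual = two_sided_p_for_sigma(actual)
    print(f"claimed: {claimed:.2f} sigma")
    print(f"actual:  {actual:.2f} sigma, p = {p_actual:.4f}")
    print(f"about one experiment in {1.0 / p_actual:.0f} does this by chance")
```

With those assumptions the numbers land close to the ones quoted: roughly 4.9 sigma claimed, roughly 2.45 sigma after correcting the error bars, and a chance probability between 1% and 2%, so about one experiment in seventy.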
Lehrer cites an example of an ESP experiment by Rhine in which a certain subject did far better than chance at first, and later didn't do as well. Possibly this is just underestimation of errors, publication bias, and regression to the mean. There is also good evidence that a lot of Rhine's published work on ESP was tainted by his assistants' cheating: http://en.wikipedia.org/wiki/Joseph_Banks_Rhine#Criticism

--
Find free books.
Maybe it is not science by fermion · 2011-01-02 05:11 · Score: 4, Interesting

The scientific method derives from Galileo. He constructed apparatus and made observations that any trained academician and craftsperson of his day could have made, but they did not because it was not the custom. He built inclined planes and lenses, and recorded what he saw. From this he made models that included predictions. Over time those predictions were verified by others such as Newton, and the models became more mathematically complex. The math used is rigorous.
Now science uses different math, and the results are expressed differently, even probabilistically. But in real science those probabilities are not what most think of as probability. A scanning tunneling microscope, for instance, works by the probability that a particle can jump an air gap. Though this is probabilistic, it is well understood, so it allows us to map atoms. There is minimal uncertainty in the outcome of the experiment.
The research talked about in the article may or may not be science. First, anything having to do with human systems is going to be based on statistics. We cannot isolate human systems in a lab. The statistics used is very hard. From discussions with people in the field, I believe it is every bit as hard as the math used for quantum mechanics. The difference is that much of the math is codified in computer applications and researchers do not necessarily understand everything the computer is doing. In effect, everyone is using the same model to build results, but may not know if the model is valid. It is like using a constant acceleration model for a case where there is jerk. The results will be not quite right. However, if everyone uses the faulty model, the results will be reproducible.
Second, the article talks about the drug dealers. The drug dealers are like the Catholic Church of Galileo's time. The purpose is not to do science, but to keep power and sell product. Science serves as a process to develop product and minimize legal liability, not explore the nature of the universe. As such, calling what any pharmaceutical company does the 'scientific method' is at best misguided.
The scientific method works. The scientific method may not be completely applicable to fields of study that try to find things that often, but not always, work in a particular case. The scientific method is also not resistant to group illusion. This was the basis of 'The Structure of Scientific Revolutions'. The issue here, if there is one, is the lack of education about the scientific method, which tends to make people give individual results more credence than is rational, or treat them as some sort of magic.

--
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
Re:Quantity, not quality, is often prioritised. by Moof123 · 2011-01-02 05:32 · Score: 5, Interesting

Agreed. Way too many papers from academia are ZERO value added. Most are a response to "publish or perish" realities.
Case in point: One of my less favorite profs published approximately 20 papers on a single project, mostly written by his grad students. Most are redundant papers taking the most recent few months' data and producing fresh statistical numbers. He became department head, then dean of engineering.
As a design engineer I find it maddening that 95% of the journals in the areas I specialize in are:
1. Impossible to read (academia style writing and non-standard vocabulary).
2. Redundant. Substrate integrated waveguide papers for example are all rehashes of original waveguide work done in the 50's and 60's, but of generally lower value. Sadly the academics have botched a lot of it, and for example have "invented" "novel" waveguide to microstrip transitions that stink compared to well known techniques from 60's papers.
3. Useless. Most, once I decipher them, end up describing a widget that sucks at the intended purpose. New and "novel" filters should actually filter, and be in some way as good as or better than the current state of the art, or they should not be published at all.
4. Incomplete. Many interesting papers report on results, but don't describe the techniques and methods used. So while I can see that University of Dillweed has something of interest, I can't actually utilize it.
So as a result, when I try to use the vast number of published papers and journals in my field, and in niches of my field in which I am darn near an expert, I cannot separate the wheat from the chaff. Searches yield time-wasting useless results, many of which require laborious deciphering before I can figure out that they are stupid or incomplete. Maybe only 10% of the time does a day-long literature search yield something of utility. Ugh.
Re:Not that simple. by Simetrical · 2011-01-02 11:11 · Score: 4, Interesting
> Hard sciences simply lend themselves a lot better to repeatability. Where I think we go wrong is assigning the same certainties to the claims of the soft scientists.
Granted that hard sciences are probably more reliable, but unfortunately, a lot of the research even there is shaky. I overheard roughly the following conversation between a graduate student in mathematics and his thesis adviser one summer, while I was doing undergraduate summer math research at the CUNY Graduate Center on an NSF grant (RTG):
- Student: So I looked into the paper by Smith, and when I did the same computations, I got a different answer. I haven't been able to figure out what I'm doing differently. Do you think I should e-mail him?
- Adviser: No. If the results are inconsistent, pretend they don't exist. Don't use them, but don't tell anyone you got different results either. If you do, then they'll just suspect that your results are wrong.
- Student: Yeah, I suspect that too.
- Adviser: But don't contact him, because people don't like being proven wrong. You can point out errors in people's papers once you've got tenure – it's not something you want to do as a grad student. You don't want to make this guy your enemy.
- Student: Oh, okay . . .
Even if high-profile results are more reliable in the hard sciences, your average paper is still unreproducible garbage. The problem is the system, which forces everyone to publish as much as possible without heed to quality; and the journals, which publish only positive results. Researchers need to publish all their results publicly, including registering their hypotheses before they even begin the study. Universities need to take a stand by not focusing on quantity of publications. More emphasis must be placed on repeatability.
The people who treat this kind of finding as an attack on science are perpetuating the problem. We should be looking to make the scientific process ever better and more accurate as we come to understand its pitfalls better, not shrug off its inadequacies as inevitable.
--
MediaWiki developer, Total War Center sysadmin