How To Better Verify Scientific Research
Hugh Pickens DOT Com writes "Michael Hiltzik writes in the LA Times that you'd think the one place you can depend on for verifiable facts is science but a few years ago, scientists at Amgen set out to double-check the results of 53 landmark papers in their fields of cancer research and blood biology and found only six could be proved valid. 'The thing that should scare people is that so many of these important published studies turn out to be wrong when they're investigated further,' says Michael Eisen who adds that the drive to land a paper in a top journal encourages researchers to hype their results, especially in the life sciences. Peer review, in which a paper is checked out by eminent scientists before publication, isn't a safeguard because the unpaid reviewers seldom have the time or inclination to examine a study enough to unearth errors or flaws. 'The journals want the papers that make the sexiest claims,' Eisen says. 'And scientists believe that the way you succeed is having splashy papers in Science or Nature — it's not bad for them if a paper turns out to be wrong, if it's gotten a lot of attention.' That's why the National Institutes of Health has launched a project to remake its researchers' approach to publication. Its new PubMed Commons system allows qualified scientists to post ongoing comments about published papers. The goal is to wean scientists from the idea that a cursory, one-time peer review is enough to validate a research study, and substitute a process of continuing scrutiny, so that poor research can be identified quickly and good research can be picked out of the crowd and find a wider audience. 'The demand for sexy results, combined with indifferent follow-up, means that billions of dollars in worldwide resources devoted to finding and developing remedies for the diseases that afflict us all is being thrown down a rathole,' says Hiltzik. 
'NIH and the rest of the scientific community are just now waking up to the realization that science has lost its way, and it may take years to get back on the right path.'"
Science is infallible.
More like science is always wrong. Scientists always set out to be less wrong than the last guy, though.
So basically they want to introduce a Slashdot for scientists..
Prepare for a brand new style of flame-wars!
I think the real problem is that scientists aren't lending any prestige to reproducing experiments so nobody bothers. Journals want to publish new results, not confirmation. Advisors discourage students from reproducing experiments, which makes sense since they won't be published.
Follow-up studies are where the validation/replication/testing happens. This is not new. Any decent scientist knows this. Peer review is a filter, but it's a pretty basic sanity check, not a comprehensive evaluation of the work. Once published, that opens a paper and the ideas within it to critique by ALL readers, not only the reviewers. Thus, post-publication is when the real scientific review happens. Peer review merely removes the stuff that isn't formulated, measured, and organized well enough to bother reading it in the first place (i.e. it gets rejected). It's an imperfect process, so sometimes stuff slips through anyway. That's what the follow-up papers are for.
It's important to remember that in vivo biology is not all of science. It's a lot harder to know what you're doing in biology. If you want excellent reproducible science, let's just roll balls down inclines, measure that and hope we don't get sick.
Well, it's good to see a major scientific institution waking up to a phenomenon Richard Feynman warned about in the 1970s. Yet it seems to me the proposed solution is a little ad hoc. If scientists want to restore integrity to their field(s) -- and I applaud their efforts to do so -- why aren't they using an experimental approach to do so? I think they should try several things and collect data to find out what actually works.
[Sir Garlon] is the marvellest knight that is now living, for he destroyeth many good knights, for he goeth invisible.
Yes, there's no money in replication. More importantly, there is no money for replication. Who's going to fund replication studies? And it wouldn't be a small amount of money.
Maybe what is needed is a tax on research to fund replication studies. That opens up another can of worms: is one replication study enough? Who decides? Whether the Tea Baggers like it or not, this seems like an area that will require government intervention. Taxing research is unlikely to bring in enough money. Taxes will have to be raised somewhere to pay for it. (Several Tea Bagger angels were sacrificed in the writing of the previous sentence.)
Whenever one of these stories is posted about inaccurate and falsified research papers, it's always a field related to biology. This doesn't seem to be nearly as much of a problem with the hard sciences (physics, chemistry). We should avoid rhetoric like "science has lost its way" since the problem is mostly isolated to one branch of science and such statements only serve as ammo for the anti-science crowd. Disclaimer: I'm a physicist.
The *best* way would be to do a different experiment with the expectation of getting the same results if the original research was valid and the understanding of the studied phenomena was good. Then, regardless of whether the second study validates the first one or not, we would actually have more data and a better understanding of the issue and of the problems with studying it.
Invalidating shoddy research would be a bonus.
Interestingly, the Economist's article on the same points this week notes that there is a group specifically devoted to doing replication: the Reproducibility Initiative from PLOS One. They've got a $1.3 million grant from the Arnold Foundation to look at 50 high profile papers in cancer research.
"Seven Deadly Sins? I thought it was a to-do list!"
Took 29 minutes to get from the story being posted to "CLIMATE SCIENTIST ARE LIREZ!!11!!1". You know there are a lot of other branches of science, many of which are far more subjective than climate science.
There's also plenty of data and models out there if you wanted to run your own experiments to confirm or disprove a particular paper or claim. I'd be very interested in reading your counter paper.
The problem with the present method is that each paper is scrutinized before publication only by a very small, select cohort of experts. And once this decision is taken, it's "published and stays published forever" (in most cases, discounting the outright fraudulent ones that are retracted).
I am a professor of pharmacology and we do critical appraisal of scientific papers in our department all the time for symposiums. You wouldn't believe the kinds of mistakes my undergrads pick up in journal clubs, in papers published in prestigious journals.
By enough eyeballs, I do mean qualified eyeballs. Not just eyeballs.
In addition to questioning "Climate Change", one could also look at the "Science" of many other fields as suggested. Especially crucial for examination are the many psychological and sociological studies that are used to "guide" public policy. I venture that many are complete loads of crap designed specifically to influence public policy.
Physics is not immune to parasitic and mercenary research phenomena either, especially in more exotic areas with great funding potential, such as quantum computing & crypto, where exaggerations and self-puffery are common. One might say the whole field is of that kind, since their whole theorizing (which is all they've got) rests on the speculative aspects of quantum measurement theory, the foundations of which have been awaiting unambiguous experimental demonstration (such as the "loophole-free" violations of Bell inequalities) for over half a century already. Should the experimental failure to confirm the fundamental conjectures persist, the whole field will be recognized as fancily relabeled analog computing (such as the D-Wave system).
What is not discussed is that in science as in life it's all about incentives. All you have to do is look at who is paying for these studies, directly (through research grants) or indirectly (speaking or consulting fees), and things will become much clearer. The biomedical and life sciences are most vulnerable to corruption because the incentives are very high: successful drugs and treatments are worth a lot of money. Even unsuccessful ones, given the proper appearance of effectiveness, are worth money.
Other sciences are less susceptible because there is no incentive to hype the results, not because those scientists are more ethical. There are two solutions for the problem. One is to remove incentives, which would mean overhauling the whole system of scientific funding. The other is to mandate raw data sharing. This would make it easier for people to reanalyze the data without actually redoing the experimental parts.
A good example of this is the Reinhart-Rogoff controversy in economics, where they claimed one thing in their widely publicized 2010 paper (high debt levels impede growth), but their statistical analysis was shown to be riddled with errors, skewing the data to the desired conclusion. This was discovered when they shared their raw data with a University of Massachusetts grad student. While data sharing would not eliminate these issues, it would make it harder to perform "statistical" analysis that introduces biases.
I think the point is that *some* of those eyes will have the requisite expertise to catch subtle flaws. And perhaps just as valuable *lots* of those eyes will have enough expertise to catch the simplistic flaws - the sorts of things so obvious that the real experts aren't even looking at that area because they assume no expert would make such an obvious error. But since we're all human, occasionally we do.
--- Most topics have many sides worth arguing, allow me to take one opposite you.
...except about something like catastrophic anthropogenic global warming :)
Oh man, you are totally right! How could I have been so blind! We should be more skeptical about shit like evolution and gravity too! Down with closed-minded dogma!
The correct take-away from this kind of study is not that a specific field of science is "broken" (also, cancer research is not all of science), but rather that there is room for improvement.
There is no question whatsoever that cancer research has made leaps and bounds over the last few decades in terms of improving the lives of many people with cancer, both by helping them to live longer, and by helping them to live better. What this kind of study shows is that we can do even better still, if we can find ways to fix the flaws that remain in cancer research.
Completely agree. I have a friend who's getting a doctorate in Sociology with a concentration in women's studies. Some of the crap she's made me read is ridiculous. She stopped talking to me for a while after I said what she does isn't science. She can't even replicate an "experiment" from one group to another, let alone across a generational, cultural, or geographic gap, and yet some of these "studies" are used to set employment policies that discriminate against majorities and created the "we don't care if you're qualified to do this if you don't help us meet our quota" environment.
Yes, in fact, Popper was right - falsification is the bedrock of science.
Without falsification, you simply have religion, no matter how fancy the lab coat you dress up in looks like :)
Just because you use maths doesn't mean it's not religion.
He neither said that nor implied it. What he said was that any criticism of AGW is met with a defense akin to a religious fervor. This is a true statement.
As demonstrated.
It has been said in other papers too that a lot of the literature is wrong (http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124) and that this is more likely in higher impact journals and for papers with lower sample sizes (http://www.nature.com/nrn/journal/vaop/ncurrent/full/nrn3475-c6.html). The idea is that a small, low-powered study rarely reaches statistical significance for a real effect, so a significant result from such a study is more likely to be a false positive (a Type I error), and when it does detect a real effect it tends to over-estimate its size. Consequently, these smaller sample size studies find what looks like a stunning effect but what they're really seeing is an outlier. The paper looks awesome so it gets published somewhere high impact, where it is sensationalised. This effect is exacerbated by the "publish or perish" mentality, where researchers are pressured to produce many high impact papers in order to get grants. It's also a function of the fact that a lot of research is being done, so the high volume increases the odds of this shit happening. Cancer biology is particularly prone to this sort of effect because it's very competitive, there's a lot of interest in it and so it generates high impact papers, and there are a lot of big screening studies that depend heavily on statistics to confirm effects. In some branches of biology you hardly need a stats test because variables are few and significance is obvious. However, when you're screening vast numbers of drug targets then you have all sorts of problems with multiple comparisons and the like. You need elaborate stats tests and they have to be done right. Overall, however, whether the community as a whole believes something is determined by the state of the literature in general and not just a single study. What we consider true or false is influenced by the politics of science as well as the data. This is nicely reviewed in the controversial book, "The Golem", by Collins and Pinch (http://www.amazon.com/The-Golem-Should-Science-Classics/dp/1107604656).
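For anyone who wants to see the small-sample and multiple-comparison effects described above rather than take them on faith, here is a rough simulation sketch. Everything in it is an assumption made for illustration, not something taken from the linked papers: the function names, the true effect of 0.3 standard deviations, the group sizes, and the use of a simple two-sided z-test with a known standard deviation of 1.

```python
import random
import statistics
from math import sqrt
from statistics import NormalDist


def simulate_studies(n_per_group, true_effect, n_studies, alpha=0.05, seed=0):
    """Simulate many two-group studies of the same (hypothetical) true effect.

    Returns (fraction of studies reaching significance,
             mean estimated effect among the significant studies only).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    se = sqrt(2 / n_per_group)  # std. error of the mean difference (sd = 1 assumed known)
    significant = []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        if abs(diff) / se > z_crit:
            significant.append(diff)  # only these would look "publishable"
    power = len(significant) / n_studies
    mean_sig = statistics.fmean(significant) if significant else float("nan")
    return power, mean_sig


def count_false_positives(n_tests, alpha=0.05, seed=0):
    """Screen n_tests targets with NO real effect (p-values uniform on [0, 1]).

    Returns (hits at the uncorrected threshold, hits after Bonferroni correction).
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    uncorrected = sum(p < alpha for p in p_values)
    bonferroni = sum(p < alpha / n_tests for p in p_values)
    return uncorrected, bonferroni


if __name__ == "__main__":
    for n in (10, 200):
        power, mean_sig = simulate_studies(n, true_effect=0.3, n_studies=2000)
        print(f"n={n}: significant fraction={power:.2f}, "
              f"mean 'published' effect={mean_sig:.2f} (true effect 0.30)")
    print("null screen of 1000 targets (uncorrected, Bonferroni):",
          count_false_positives(1000))
```

In this toy setup the small studies only cross the significance threshold when chance pushes the estimate far from the true value, so the "published" effect looks much bigger than it is, and an uncorrected screen of a thousand null targets still throws up dozens of hits. A real analysis would use proper tests with estimated variances and a less blunt correction than Bonferroni.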
soylentnews.org
The problems that plague science today are much deeper than the simple, solvable problem of peer review. If you actually listen to the critics who have been speaking out on the issue for decades now, the problems start in grad school. See Jeff Schmidt's book, Disciplined Minds, which exposes the details of how consensus actually forms in science today. The public likes to imagine that consensus is decided by individuals who are aware of alternative options for belief. The truth is that the consensus is simply manufactured in the grad schools, through an over-reliance upon memorization (as opposed to checking for actual conceptual comprehension, like with force concept inventory tests) and the weeding out of students who stray from the technical details of the problems they are assigned to. The truth is that the features we desire in professionals -- obedient thinkers who can fit into large organizations without "getting political" -- are really quite different from the values we associate with thinking like a scientist (which necessarily include open-mindedness and skepticism). The notion of "professional scientist" is actually an idea with internal conflicts. It's a contradiction out in the open which apparently few have put any thought into. But, once you look at the way we train professionals today, it becomes apparent that we are not training them to actually think like scientists.
We actually had an incredible chance to have this debate back in May of 2000 when Noam Chomsky stood up with around 700 researchers in support of Jeff Schmidt. Schmidt even won his case against the American Institute of Physics, but the AIP's purpose has always been to obscure this debate from national discourse.
The AIP realizes that the credibility of much of science is basically on the line. If consensus is largely manufactured, then the public cannot rely upon it as a guide in the more empirically challenged domains.
It's true that the system can be gamed in the short run. And sometimes someone can game it enough to get tenure. But without follow-up and citations, they'll just end up in the academic limbo of being an associate professor with no funding.
Scientists always set out to be less wrong than the last guy, though.
No they don't. TFA lists many examples of scientists choosing to advance their careers rather than trying to be "less wrong".
The Economist Magazine had a cover story on this issue just last week, that in my opinion covers the issue better than TFA.
Basically, the current system of peer review and replication is failing. Peer reviewers actually miss many errors, rarely check statistics, and almost never re-run any software. The current publishing system has little interest in printing replication, and spending time replicating experiments is a dead end career path. The existing system doesn't work well in the era of "big science" and "big data".
We need to move to a system where all publicly funded science is required to be disclosed when it is initially funded, so negative results cannot later be buried. We should also move to online publishing, with a permanently active area for comments, so if the research is later refuted, or even questioned, that is immediately visible. A portion of public science spending should be set aside for replication. There also should be negative consequences for researchers that publish papers that cannot be replicated, whether because their results are wrong, or because they failed to disclose enough information about how the experiment was conducted. Scientists accepting public funds should be required to make their data and software available.
But the biggest obstacle to reform is researchers and publishers that have prospered under the existing system. Many of them treat the current system of peer review as some sort of holy ritual, and refuse to even admit that the system is broken.
He neither said that nor implied it. What he said was that any criticism of AGW is met with a defense akin to a religious fervor. This is a true statement.
As demonstrated.
No. Scientific criticism of AGW is fine. But coming up with inane conspiracies, casting aspersions, or character assassinations are NOT valid forms of scientific criticism. Worse, the people often spouting such nonsense have little if any knowledge of the actual science and DON'T WANT TO KNOW IT.
Don't equate denialism with legitimate skepticism. There are legitimate skeptics, but they aren't the ones claiming that the entire world's population of climate scientists is on a mission to murder Jesus and create a socialist utopia. Deniers make the real skeptics look bad, and actually serve to drown out real scientific skepticism with their idiocy.
~X~
I think the point of the PubMed Commons pilot is to experiment with providing a forum where "the kinds of mistakes my undergrads pick up in journal clubs" *do* get shared.
Basically, the current system of peer review and replication is failing. Peer reviewers actually miss many errors, rarely check statistics, and almost never re-run any software. The current publishing system has little interest in printing replication, and spending time replicating experiments is a dead end career path. The existing system doesn't work well in the era of "big science" and "big data".
No, it's not. Peer review isn't meant to catch all errors, just errors in logic. You're assuming that the review process that goes on in mathematics is the same as the review process in all other fields of science, which just isn't true.
Replication is given to grad students -- if they succeed, you know they learned the method, and if they fail, it goes into a chapter of their thesis -- if they're a crappy grad student and they can't get anything to work, they wash out, and if they're good grad students that have other good results, then you go around for the next twenty years saying "so-and-so tried to redo that guy's work and couldn't figure it out, it's probably crap, so don't work for that guy". And that's how replication is done.
Sure, I don't want to spend my time replicating experiments, but why should I? And what is the point to publishing it? If I'm any good you don't want me wasting my time with that. I'll have students and then they can do it, and in the process they'll learn, and so life goes on. Science is a community process... you can't break it down into something so cut and dried as "it must all be on paper, published and formatted nicely" -- nobody has time for that crap.