Can Bad Scientific Practice Be Fixed?
HughPickens.com writes: Richard Horton writes that a recent symposium on the reproducibility and
reliability of biomedical research discussed one of the most sensitive issues in science today: the idea that something has gone fundamentally wrong with science (PDF), one of our greatest human creations. The case against science is straightforward: much of the scientific literature, perhaps half, may simply be untrue. Afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance, science has taken a turn towards darkness. According to Horton, editor-in-chief of The Lancet, a United Kingdom-based medical journal, the apparent endemicity of bad research behavior is alarming. In their
quest for telling a compelling story, scientists too often sculpt data to fit their preferred theory of the world or retrofit hypotheses to fit their data.
Can bad scientific practices be fixed? Part of the problem is that no-one is incentivized to be right. Instead, scientists are incentivized to be productive and innovative. Tony Weidberg says that the particle physics community now invests great effort into intensive checking and rechecking of data prior to publication following several high-profile errors. By filtering results through independent working groups, physicists are encouraged to criticize. Good criticism is rewarded. The goal is a reliable result, and the incentives for scientists are aligned around this goal. "The good news is that science is beginning to take some of its worst failings very seriously," says Horton. "The bad news is that nobody is ready to take the first step to clean up the system."
Feynman's take:
We have learned a lot from experience about how to handle some of the ways we fool ourselves. One example: Millikan measured the charge on an electron by an experiment with falling oil drops, and got an answer which we now know not to be quite right. It's a little bit off because he had the incorrect value for the viscosity of air. It's interesting to look at the history of measurements of the charge of an electron, after Millikan. If you plot them as a function of time, you find that one is a little bit bigger than Millikan's, and the next one's a little bit bigger than that, and the next one's a little bit bigger than that, until finally they settle down to a number which is higher.
Two more examples from Ignition! by John Clark.
James Dewar (later Sir James, and the inventor of the Dewar flask and hence of the thermos bottle), of the Royal Institute in London, in 1897 liquefied fluorine, which had been isolated by Moissan only eleven years before, and reported that the density of the liquid was 1.108. This wildly (and inexplicably) erroneous value (the actual density is 1.50) was duly embalmed in the literature, and remained there, unquestioned, for almost sixty years, to the confusion of practically everybody.
Bill Doyle, at North American, had also fired a small fluorine motor in 1947, but in spite of these successes, the work wasn't immediately followed up. The performance was good, but the density of liquid fluorine (believed to be 1.108 at the boiling point) was well below that of oxygen, and the military (JPL was working for the Army at that time) didn't want any part of it.
This situation was soon to change. Some of the people at Aerojet simply didn't believe Dewar's 54-year-old figure on the density of liquid fluorine, and Scott Kilner of that organization set out to measure it himself. (The Office of Naval Research put up the money.) The experimental difficulties were formidable, but he kept at it, and in July, 1951, established that the density of liquid fluorine at the boiling point was not 1.108, but rather a little more than 1.54. There was something of a sensation in the propellant community, and several agencies set out to confirm his results. Kilner was right, and the position of fluorine had to be re-examined. (ONR, a paragon among sponsors, and the most sophisticated —by a margin of several parsecs — funding agency in the business, let Kilner publish his results in the open literature in 1952, but a lot of texts and references still list the old figure. And many engineers, unfortunately, tend to believe anything that is in print.)
For years people had noted that a standing drum of acid slowly built up pressure, and had to be vented periodically. But they assumed that this pressure was a by-product of drum corrosion, and didn't think much about it. But then, around the beginning of 1950, they began to get suspicious. They put WFNA in glass containers and in the dark (to prevent any photochemical reaction from complicating the results) and found, to their dismay, that the pressure buildup was even faster than in an aluminum drum. Nitric acid, or WFNA at least, was inherently unstable, and would decompose spontaneously, all by itself. This was a revolting situation.
All of this goes to show that even well-respected scientists and engineers are not immune to bad science.
That's part of it... But I believe the biggest problem is science fairs. Once heralded as a great way to get kids involved in science and the scientific method, they have been ruined by a culture of excessive safety, pandering to kids, and incompetent science teachers. First, every kids' science toy has been neutered by safety culture. I'm not saying we should have kits with mercury and radioactive materials like we did in the 50s, but "science" kits where you make kitchen goo instead of actual chemical reactions are lame and boring. Kids are not fooled.
Second, the increasing pressure to pass all kids or give them participation ribbons is very present at the science fair. Many kids are forced to participate, and in many fairs judges have to assign a minimum score of "good" or some such term. I have judged at the STATE LEVEL (as in, they had to do very well at the school and county levels) and have had to assign this minimum score even when it was still a gift. Kids come up with buzzword-laden projects and make elaborate art projects that get ooohs and ahhs from non-technical people while doing no research and offering conclusions that are demonstrably wrong. Don't believe me? Go to a science fair some time and count the number of "experiments" showing ethanol has more energy content than gasoline. There are usually a dozen at the state science fair I judge. I also wonder how many projects are done primarily by parents who don't want their kids to do poorly.
Finally, the incompetency of science teachers... This is not applicable to all teachers, but especially in poorer areas and in underperforming schools, science teachers often have no science background and don't understand the scientific method. They don't understand research, citations, hypotheses, or conclusions. They don't even take the time to verify experimental results with a quick Google search. The comforting thing I've noticed from judging student science projects is that most of the kids KNOW their teachers are incompetent and have bullshitted their way to a good score at the science fair. At the state level, they are completely unprepared for actual questions on subject matter from professionals in the various fields. I'm a civil engineer, and I've had to shake my head in disbelief at projects that are off by an order of magnitude from what they should be. It is a shock for the student to hear that, because no one has reviewed or questioned their work before the state level.
What we need is a new science fair system where teachers can mentor students on projects, but teachers don't judge projects. Projects should only be judged by people familiar with the subject matter and the scientific method. If they can't scrape together the judges, maybe the science fair needs to go away, or there needs to be an active campaign to recruit and support professionals to judge school science fairs. It should be no surprise that the science fair kids have grown up to do research that panders to public opinion, is lazy, has poor citations, and is filled with self-confirming results.
Yes it's an anecdote! Were you expecting original research in a Slashdot comment?
You are mistaking pushback against PR agencies, against politicians trying to set themselves apart from other politicians, and against medicine-show "religion" that sees science as a threat to its business model for "scientific consensus".
Banding together against the barbarians at the gate who wouldn't know the scientific method if it bit them on the arse is not "scientific consensus" - it is a defence of expertise versus wilful ignorance and deliberate lies.
I have witnessed way too many brilliant, and I mean off-the-scale brilliant, graduate students who are forced to pretty much credit their work to some 60+ year old very tenured professor because he is the only one who can get access to the money. But worse than that, I see the same off-the-scale brilliant students being told that they are wrong wrong wrong. Not because they are wrong, but because when they are shown to be correct it will upend the research and conclusions that entire careers were built upon.
I find that many senior professors/scientists never really accomplished anything and simply became experts in an established field, further establishing that field. They are threatened by anyone who comes along and shakes the tree, which might cause a few of their most rotten fruit to fall. They are also afraid that a truly great young scientist, once recognized, will come along and "steal" all the grant money that is rightfully theirs because of their seniority.
There are the rare senior scientists who encourage new and radical thinking along with making sure that credit is properly assigned (first name) but pretty much without exception these are scientists who accomplished something in their day.
I find a very common song sung by these terrible scientists is that all science is now to be done by groups. Yes, groups are often required to conclusively put something new to bed, but almost without exception great science had some key crack opened by one person (or two) thinking way outside the box, not merely going through a checklist.
I have long thought that one of the reasons so many great scientists are a bit autistic is that only this way can they ignore the continuous social pressure to conform to the groupthink that lesser scientists would prefer. Meanwhile, the more social but less capable scientists are the ones who can rise to the top on little or no accomplishment and cajole and structure the system so as to give themselves a huge cut of the grant money.
I think part of the problem is that nobody wants to publish a paper where the experiment failed--but they should.
Failures are useful; they're not wasted time. You've almost certainly learned something from a failed experiment. Maybe you learned that the setup wasn't rigorous enough, or maybe you just learned that a certain avenue of research wasn't viable for one reason or another. I get that journals are looking for breakthroughs, but it would be so useful to read a paper in your field and find out that someone already tried the thing you're attempting, and now you don't have to fail in exactly the same way.
But that requires a much more collaborative system, and one where the community is interested in finding answers, not glory.
"Publish or Perish", Degrees that require new original ideas, Strict hierarchy structure...
Academic institutions are culturally stuck in Victorian times. So if you want to work your way up and get the choice projects and research, you need to publish. The more you publish, the higher the chances you will move up. Because there is so much published material, people don't read it much, so researchers have found that they can get credit for half-assed work.
Your name becomes your brand, so when you apply for a grant, it is your name plus the institution you work for that gets you the money.
There isn't any reason why, say, the State University of New York at Buffalo can't get a grant to study seismology, but chances are it will go to the University of California, Berkeley, not because they will do a better job, but because of the name.
Finally, institutions haven't learned how to deal with today's political climate and its rush for breaking news. Every hypothesis is sold to the public as a new theory... Then, if that hypothesis is shown to be false (as is common in science), media with a political slant will say, "See, science is wrong again, just like our political stance predicted!"
Science for the most part is quiet work: collaborating with like-minded people, with checks and balances to try to filter out strong egos. But it has gone commercial, so these checks and balances are weakened and strong egos win out.
This reminded me of two things:
1- One of my favorite Roy Scheider lines from 2010: "Look, just because our governments are behaving like asses doesn't mean we have to! We're supposed to be scientists, not politicians!"
and
2- Dr. Jeff Hawkins, the inventor of the Palm Pilot and Handspring lines of devices, who is an avid researcher in the field of artificial intelligence, pointed out in his book, On Intelligence, the following about his approach to his interests and career path:
"Frequently hypotheses in the academic environments don't pan out into ground breaking research and as a result can be career enders." This is why he took his study of neurobiology, aimed at designing and building intelligent machines, to the corporate research and development environment, which tends to take more of a "back to the drawing board" approach to engineering and science programs that don't pan out into discoveries or innovation. This is a much better approach for many obvious reasons. Part of the problem is that academic research is too quick to blame the researcher rather than the questions, the research itself, or the research approach, and to blacklist the people involved, which is very much like throwing the baby out with the bath water. It is no surprise that academia has serious problems with the integrity of its publications (which is the root of the actual problem pointed out here), because it has created an environment where it is profitable or expedient to be less than honest, at least in the short term. But if there is one constant in life, it is that nothing remains a secret forever. Academia would do well to reward the actual merits of research that does not pan out into something groundbreaking because, as with Edison, it adds to the body of research that can help to define later research that does pan out into something novel. (Think of the 1000 tries at finding the appropriate material to use as a filament in the first light bulb, and the famous quote attributed to him, "I just found 999 ways not to make a light bulb," before he settled on a carbonized filament.)
There are so many talented scientists and engineers that are unable to find places to apply their talents due to the system that is in place in academia making the process work against itself in this manner. I would say this is why (coming full circle here) we did not actually end up exploring the outer solar system in the last decade. (2000 - 2010)
I agree with your general tone and statement. However, it is important to note the inherent limitations of biomedical research. Generally one CANNOT do the large-scale studies needed to get a statistically robust result. Particle physics and much of astrophysics use the 5-sigma discovery requirement, which means the probability of a chance fluctuation that large has to be about 3e-7. You cannot do this with people as subjects. It is hard to do this with ANY biological subject. Many of the issues brought up stem from this.
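To put a number on the 5-sigma figure, here is a minimal sketch (my own illustration, not something from the symposium) that converts a significance in sigma into a one-sided tail probability. It assumes the test statistic is normally distributed; the function name `one_sided_p_value` is made up for this example.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability of a standard normal beyond n_sigma.

    Assumption: the test statistic follows a normal distribution,
    which is the convention behind quoting results "in sigma".
    """
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


if __name__ == "__main__":
    # Compare a common biomedical-style threshold (about 2 sigma)
    # with the physics discovery threshold (5 sigma).
    for n in (2, 3, 5):
        print(f"{n} sigma -> one-sided p = {one_sided_p_value(n):.3g}")
```

The gap between roughly 2 sigma and 5 sigma is about five orders of magnitude in p-value, which is why the two fields cannot realistically be held to the same bar.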
I think much of the problem is exacerbated by the publish-or-perish mentality, but it is even more affected by the total lack of reporting of null results (when you DO NOT see anything). This skews your overall distribution. It is like not accounting for trials (because you aren't). In biomedical research they need to spend more time quantifying their trials and placing their results in the proper statistical context. Just saying that you are less likely to get Parkinson's disease if you drink coffee because we asked a bunch of people isn't the whole story. How many questions did you ask? Was it 100? Did you treat all of those as trials?
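The "how many questions did you ask" point can be made concrete with a small sketch. This is only an illustration under a simplifying assumption: every question is an independent test with no real effect behind it. The function names `family_wise_error_rate` and `bonferroni_threshold` are hypothetical, and Bonferroni is just one standard correction, not necessarily what any given study should use.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests.

    Assumption: the tests are independent and every null hypothesis
    is true, so each test has probability alpha of a false alarm.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the overall rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    for n in (1, 10, 100):
        fwer = family_wise_error_rate(alpha, n)
        per_test = bonferroni_threshold(alpha, n)
        print(f"{n:>3} questions: P(at least one fluke) = {fwer:.3f}, "
              f"Bonferroni per-test threshold = {per_test:.4g}")
```

With 100 questions at the usual 0.05 threshold, at least one "significant" result is close to guaranteed even if nothing is going on, which is why the number of questions asked has to be reported along with the result.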
I think there are several fundamental issues with applying the scientific method to the question of anthropogenic global warming. First, the gold standard of scientific proof is experimentation. Experiments must be independently reproducible. Given that we only have one earth, this is a problem. Another challenge is multiple variables which interact in unknown ways. The best experiments are those where one variable is changed and everything else is held constant. This is simply unattainable on a global scale. The third challenge I see is time. When the delay between cause and effect spans multiple generations of human observers, the probability of getting useful information falls dramatically. In short, the scientific method has its limitations.
First, the gold standard of scientific proof is experimentation.
Uh... there's a lot more to science than that. But even if we take your word for it, the climatologists create statistical models based on observable variables and fit those models to collected data. The better the fit, the more accurate the predictions.
Models are certainly useful for understanding small systems with few variables, interacting linearly, over relatively short time periods. The math involved in modeling truly complex systems, non-linear partial differential equations, is extremely difficult. In fact, the ones we study in school are a small subset that have closed solutions. The majority don't. I never hear about Nobel prizes in physics going to climate modelers.
Right, I'm in the humanities and there is this running joke that you only need to publish one really bad and obviously flawed paper on a really popular topic, and your career is certain. It's true: one bad paper, a followup book that is even worse published at a 'prestigious' publisher like Oxford UP*, and you will get cited everywhere and get full tenure within about 3 years after the book has been published. I swear I'm not kidding, I've seen this more than once.
So much for impact scores and citation indices ...
-----
* I mention this publisher because it is well respected and nevertheless publishes many bad or at least dubious books without proper peer review. I should know, because they once contacted me, a lowly postdoc from an unknown university, to review the latest book project by one of the most famous researchers in my area. It's obvious that they just googled me, as I'm easier to find on the net than some of my more established colleagues.
What surrogate could possibly serve as an analog to the earth? That would be like testing a drug on amino acids instead of living cells.