Results Are In From Psychology's Largest Reproducibility Test: 39/100 Reproduced
An anonymous reader writes: A crowd-sourced effort to replicate 100 psychology studies has successfully reproduced findings from 39 of them. Some psychologists say this shows the field has a replicability problem. Others say the results are "not bad at all". The results are nuanced: 24 non-replications had findings at least "moderately similar" to the original paper but which didn't quite reach statistical significance. From the article: "The results should convince everyone that psychology has a replicability problem, says Hal Pashler, a cognitive psychologist at the University of California, San Diego, and an author of one of the papers whose findings were successfully repeated. 'A lot of working scientists assume that if it’s published, it’s right,' he says. 'This makes it hard to dismiss that there are still a lot of false positives in the literature.'"
Is there a valid reason we accept studies that have not been reproduced at least one more time to truly vet them before the community?
Logistics, resources, patents, or a need to just plain bullshit people. I'm sure there are plenty of excuses for why we don't, but it doesn't sound like we have a whole lot of good reasons.
And those that are labeling a score of 39/100 "not bad at all" should have their head checked. Enjoy your legal fun from that ball of lies.
'This makes it hard to dismiss that there are still a lot of false positives in the literature.'
An even more widespread problem is that there are a lot of true negatives that aren't in the literature.
Of course, this is a problem in most scientific fields, not just the "soft sciences" like psychology. I'm occasionally impressed by a researcher who publishes descriptions of things studied and found to be not significant, but this doesn't happen very often.
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
I think it could have something to do with this XKCD:
https://xkcd.com/882/
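For anyone who hasn't seen it: as I read the comic, the joke is about multiple comparisons. Test 20 jelly bean colors at p < 0.05 and a spurious "hit" becomes more likely than not, even if no color does anything. A rough Python sketch of that arithmetic, assuming the tests are independent (the function name is made up for illustration):

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent,
    so each test stays "clean" with probability (1 - alpha).
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # One test versus twenty tests at the usual 0.05 threshold.
    for n in (1, 20):
        print(n, round(familywise_error_rate(n), 4))
```

With 20 tests at the 0.05 level this comes out to roughly 0.64, which is why one "significant" color out of twenty shouldn't impress anybody.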
Well, this is interesting news, to be sure. Gives us plenty to think about. I can't help but wonder if anyone has been able to reproduce their results.
You need to put this in perspective. Sure, psychology is a wishy-washy field filled with pseudoscience. But apparently its studies are about as reproducible as those in a bunch of hard-science fields. If there is anything that reproducibility studies have taught us, it is that if there is around a 50% chance your result is correct, then you are around the norm in a great many fields. This 39% would put psychology about on par with what I remember from medical/cancer reproducibility studies.
Troll is not a replacement for I disagree.
Just taking this quick opportunity to post a link to my favorite journal, the Journal of Articles in Support of the Null Hypothesis: http://www.jasnh.com/ .
JASNH is one of the few places where you can submit a paper that says "we tested for X effect on Y and found no evidence that X affects Y". Generally this research is unpublishable and people will tweak parameters to get something career-advancing out of their research; I like JASNH because of the reminder that "falsifiability" can really happen.
We recently had heard in the office over one of the Yellow Machine that's made by Anthology Solutions.
Great! With p=2E-65, studies in psychology aren't totally random.
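The comment above doesn't say how that p-value was worked out. One way to sanity-check the general idea is a binomial tail: if every original finding were pure noise, how likely would 39 or more significant replications out of 100 be? The sketch below assumes a 5% chance of a fluke "success" per replication and independence between studies. Both are my assumptions, not the commenter's stated method, so the number it prints need not match the figure quoted above.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p), summed term by term."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical null: each replication is "significant" by luck 5% of the time.
    print(binomial_tail(39, 100, 0.05))
```

Whatever the exact null model, the result is vanishingly small, which is all the joke needs.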
Think of it this way... psychology is the 1960s-70s equivalent of today's MBA, and the two have many similarities:
* neither has an objective means of measuring success or failure, in spite of claiming to have a wide array of methods by which to do so.
* neither the psychologist nor the MBA is held accountable for incompetence or non-criminal malice.
* sometimes either one can take on the semblance of religion, minus a deity.
* the big 'do-nothing-but-are-promised-great-riches' degree of the 60's-80's was psychology, as hordes of students took that class thinking just that. In the 90's through today it's the MBA program.
* both can stretch logic and credulity in their work to attempt things that would get an engineer either incarcerated or killed.
(...add your own here...)
(Trigger warning for the MBAs and Psych majors: this is what is known as a joke.)
How long, Nimbus, will you go on abusing our patience?
Unlike in the hard sciences, awareness of classic social science findings can loop back to affect the phenomena in question, and the phenomena themselves can change as society evolves. Take the bystander effect for example. How many thousands or millions of college students have learned about the bystander effect in Psych 101? Hypothetically, now that they're aware of it, the effect should diminish and not be quite as reproducible as it once was. Then you layer on societal changes (oblivious smartphone/iTunes users increase the effect, but ubiquitous phones may decrease barriers to reporting and responding to violent crime, etc.) and the ability to reproduce an earlier effect becomes muddled.
When a physicist announces a new particle, nothing changes. All the particles keep behaving how they were behaving before the announcement, and they don't care how society changes. The findings should be reproducible 100 years from now.
Many other comments have correctly pointed out that studies in general often focus on the new and shiny and statistically significant rather than reproducing prior results or reporting null findings, but the issue of settling on "truth" is made that much more difficult in the social sciences due to the existence of moving targets.
You can measure how many parts per million of some substance are in the air.
You can measure how many bacteria of a certain type are in your bloodstream.
How do you measure if someone is in a good or bad mood?
The tester's bedside (or couch-side) manners can be enough to tilt the result one way or the other.
And if the researcher has an idea of what he is looking to find, he can (even subconsciously) manipulate the patient into reacting one way or the other, tainting the measurement.
What do we measure, and how do we measure it? The subject could be lying. The subject could be imagining something. The tester has no way to verify.
Reproducibility is NOT the problem.
Even research that was reproduced can be wrong, for the same reasons as above.
The NATURE of the field is the problem, not the lack of reproducibility.
Lack of reproducibility is merely the proof that there are fundamental problems with measurements and conclusions.
But I agree that the conclusion we can draw is that there are a lot of false positives.
-- Another senseless waste of fine bytes.
As dismal as Psychology's record is as a science, it's still way more rigorous and evidence-based than Economics.
You are welcome on my lawn.