MRI Software Bugs Could Upend Years Of Research (theregister.co.uk)
An anonymous reader shares a report on The Register: A whole pile of "this is how your brain looks like" MRI-based science has been invalidated because someone finally got around to checking the data. The problem is simple: to get from a high-resolution magnetic resonance imaging scan of the brain to a scientific conclusion, the brain is divided into tiny "voxels". Software, rather than humans, then scans the voxels looking for clusters. When you see a claim that "scientists know when you're about to move an arm: these images prove it", they're interpreting what they're told by the statistical software. Now, boffins from Sweden and the UK have cast doubt on the quality of the science, because of problems with the statistical software: it produces way too many false positives. In this paper at PNAS, they write: "the most common software packages for fMRI analysis (SPM, FSL, AFNI) can result in false-positive rates of up to 70%. These results question the validity of some 40,000 fMRI studies and may have a large impact on the interpretation of neuroimaging results."
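For a rough sense of why false positives pile up when many voxels are tested at once, here is a minimal sketch. It assumes the tests are independent, which real voxels are not (neighbouring voxels are spatially correlated), so treat the numbers as an illustration of the multiple-comparisons problem and not as a reconstruction of the paper's 70% figure. The function name is made up for this example.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent null tests.

    Assumption: tests are independent. Real fMRI voxels are correlated,
    so this is only a back-of-the-envelope illustration.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Each test on its own has a 5% false-positive rate.
    for n_tests in (1, 10, 100, 1000):
        rate = familywise_error_rate(0.05, n_tests)
        print(f"{n_tests:>5} tests -> familywise error rate {rate:.4f}")
```

The point is only that an uncorrected per-test threshold says little about the chance of a spurious finding somewhere in the brain, which is what familywise correction is meant to control.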
The research is on fMRI - the F stands for Functional. As the summary mentions later, this is used to try to associate regions of the brain with specific functions. This is not the same as the structure of the brain itself. What we see in terms of actual brain structures - folds, regions, etc. - is still very much valid. We're just not so sure about the functional assignments that we've held on to for a while now.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
A friend of mine has worked in the social sciences (cue /. laughter, but bear with me) and she was forced by the university to use a closed source statistical package for all her data processing. So anyway, she got some really dubious results and she preferred to do her own maths, so she did, and lo! completely different results. That was the start of a research project which concluded that the closed source package contained a rounding error that basically filtered all minorities out of the data set, which is kind of sad if you're doing research on minorities.
People trust their software too much, are too lazy to do their own maths, don't really want anything to do with data processing even though that's their job, and universities force bad software on their employees. This is an institutional problem that goes way beyond MRI research.
The paper has been available as a preprint for a while now, and my lab has discussed it internally; I've also paid attention to outside coverage. The key issue the paper reports is that false positive rates are too high for most existing software WHEN using a specific type of test under a specific set of conditions. They show that voxelwise familywise error (FWE) correction actually seems to work reasonably or even conservatively. Cluster-level FWE correction (looking for groups of voxels that are active) fails when using a very liberal cluster-defining threshold, but works reasonably well when using a more stringent cluster-defining threshold. The paper also says nothing about the performance of another very common correction method that is frequently used in fMRI studies (false discovery rate, or FDR).
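To make the difference between voxelwise FWE correction and FDR concrete, here is a small sketch using two textbook procedures: Bonferroni (a simple, conservative FWE correction) and Benjamini-Hochberg (the standard FDR procedure). This is not what SPM, FSL, or AFNI do internally; those packages use more elaborate methods. The p-values are made up purely for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Voxelwise FWE control: reject only if p <= alpha / number_of_tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """FDR control: reject the k smallest p-values, where k is the largest
    rank such that the k-th smallest p-value is <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values for eight "voxels" (invented for this example).
p_values = [0.001, 0.008, 0.012, 0.03, 0.2, 0.5, 0.7, 0.9]

if __name__ == "__main__":
    print("Bonferroni:", bonferroni_reject(p_values))
    print("Benjamini-Hochberg:", benjamini_hochberg_reject(p_values))
```

On these invented numbers Bonferroni keeps one voxel and Benjamini-Hochberg keeps three, which shows the trade-off: FDR tolerates a controlled fraction of false discoveries in exchange for more power.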
I'm not really sure how extensive the group of findings affected by these issues is, but it's certainly not 40,000 as is claimed in the paper's significance section. Many of the earlier papers (and even some more recent ones) likely used uncorrected statistical tests, so they are suspect for entirely different reasons. Of the ones that use correction, the findings in this paper only call into question the results of those using FWE cluster correction with a cluster-defining threshold that is too liberal (likely p > 0.001; the paper's findings suggest that at 0.001 the familywise error rate is in the ballpark of the desired 5%). Those using a cluster-defining threshold of p=0.001 or lower are likely fine, and the status of those using a different correction method like FDR is unknown; to my knowledge there isn't currently any similar paper on that correction method.
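For anyone unfamiliar with what a cluster-defining threshold does, here is a toy sketch on a one-dimensional row of made-up z-scores. It only shows the first step, forming candidate clusters of contiguous supra-threshold voxels; it does not implement the cluster-extent significance test that the paper evaluates. The value p=0.01 is picked here only as an example of a more liberal threshold, and the helper names are invented.

```python
from statistics import NormalDist


def z_for_p(p):
    """One-sided z threshold corresponding to an uncorrected p-value."""
    return NormalDist().inv_cdf(1.0 - p)


def find_clusters(z_map, z_threshold):
    """Return (start_index, extent) for each run of contiguous values above the threshold."""
    clusters = []
    start = None
    for i, z in enumerate(z_map):
        if z > z_threshold:
            if start is None:
                start = i  # a new cluster begins here
        elif start is not None:
            clusters.append((start, i - start))
            start = None
    if start is not None:  # cluster runs to the end of the map
        clusters.append((start, len(z_map) - start))
    return clusters


# Hypothetical 1D "brain" of z-scores (invented for this example).
z_map = [0.4, 2.5, 2.8, 2.6, 2.4, 0.9, 3.3, 3.5, 3.2, 1.1, 2.4, 0.2]

if __name__ == "__main__":
    for p in (0.01, 0.001):
        threshold = z_for_p(p)
        print(f"p={p}: z>{threshold:.2f} -> clusters {find_clusters(z_map, threshold)}")
```

With the liberal threshold, broad stretches of moderately elevated values survive as candidate clusters; with the stringent one, only the strong peak remains. The subsequent extent test then has to decide whether a cluster of that size is surprising under the null, and that is the step the paper reports as unreliable at liberal thresholds.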
You can also check out this technical report by some other big names in imaging that basically says that this result is known and expected for overly liberal cluster defining thresholds:
http://www.fil.ion.ucl.ac.uk/s...
The researchers used published fMRI results, and along the way they swipe the fMRI community for their “lamentable archiving and data-sharing practices” that prevent most of the discipline's body of work being re-analysed.
So the raw data isn't being saved so that someone else can independently verify the results. No checking the computer's math, no checking the researchers' settings on the machine. Just blanket trust in the people and the machine, and purging of any way of poking holes in someone's findings. Even if this weren't caused by a software bug, the lack of archiving of the raw dataset so that it can be rerun when software improvements are made is just infuriating.