Reanalysis of Clinical Trials Finds Misleading Results
sciencehabit writes: Clinical trials rarely get a second look — and when they do, their findings are not always what the authors originally reported. That's the conclusion of a new study (abstract), which compared how 37 studies that had been reanalyzed measured up to the original. In 13 cases, the reanalysis came to a different outcome — a finding that suggests many clinical trials may not be accurately reporting the effect of a new drug or intervention. Moreover, only five of the reanalyses were by an entirely different set of authors, which means they did not get a neutral relook.
In one of the trials, which examined the efficacy of the drug methotrexate in treating systemic sclerosis—an autoimmune disease that causes scarring of the skin and internal organs—the original researchers found the drug to be not much more effective than the placebo, as they reported in a 2001 paper. However, in a 2009 reanalysis of the same trial, another group of researchers including one of the original authors used Bayesian analysis, a statistical technique to overcome the shortcomings of small data sets that plague clinical trials of rare diseases such as sclerosis. The reanalysis found that the drug was, as it turned out, more effective than the placebo and had a good chance of benefiting sclerosis patients.
Now that is an interesting observation! Mostly, in science, when someone does an experiment that supposedly proves a theory, the next step is to document and publish every detailed step. Only when a number of peers have replicated the results can they be accepted with any confidence.
Yet in clinical trials of new drugs, it seems, only a single trial is ever done. How did that ever get accepted as proper scientific evidence?
I am sure that there are many other solipsists out there.
The problem is the Bayesian analysis is far from conclusive. What it does point to is that the clinical trial needs a larger sample size. Sample sizes that are too small are useless.
They looked at reanalyses that had already been done for other reasons, rather than doing their own reanalyses on randomly selected trials. It occurs to me that these trials may have been subjected to reanalysis precisely *because* there were doubts about the initial analysis.
No, the GP is right. While BA gives you a probability distribution for the effectiveness, unless the effect is really strong (or you made a really bad choice of priors), that distribution is going to be quite wide for a small data set. Such results do not prove that what you were testing was effective, only that there is a decent probability it might be effective given the knowledge you gained from the test, and that you should pursue a larger test. I've found it to be quite rare for a BA result to strongly exclude a null hypothesis in a small-scale test without the effect having already been flagged by simpler tests (i.e. the effects were so obvious that it didn't take much effort to see them).
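To put rough numbers on how wide the posterior is for a small data set, here is a minimal sketch. It assumes a simple yes/no response outcome, a uniform Beta(1, 1) prior, and made-up trial sizes that all show the same 60% observed response rate; none of this comes from the methotrexate trial, and `beta_posterior_summary` is a hypothetical helper name.

```python
import math

def beta_posterior_summary(successes, failures, prior_a=1.0, prior_b=1.0):
    """Mean and standard deviation of the Beta posterior for a response rate.

    Assumes a Beta(prior_a, prior_b) prior and binomial data (illustrative
    assumption, not the model used in the reanalysis discussed above).
    """
    a = prior_a + successes
    b = prior_b + failures
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)

# Same observed 60% response rate at three hypothetical trial sizes.
for n in (10, 100, 1000):
    successes = 6 * n // 10
    mean, sd = beta_posterior_summary(successes, n - successes)
    print(f"n={n:5d}  posterior mean={mean:.3f}  posterior sd={sd:.3f}")
```

The posterior mean barely moves between the three sizes, but the spread around it shrinks a lot as n grows, which is the sense in which a small trial mostly tells you to run a bigger one.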
Anytime you re-analyze data you run into this.
Think about it. There are a million ways you can analyze any dataset. There are millions of datasets out there to analyze. There are millions of people who can independently decide to go back and do a re-analysis.
So, the issue is that if somebody goes back and does a re-analysis and the results are boring, nobody publishes. However, if the results are controversial, it gets published. Since there are so many permutations, you're guaranteed to find something exciting.
This is why you're supposed to establish your methods BEFORE you collect the data, and then stick to the methods you established to analyze the data. Otherwise your 95% confidence turns into a more realistic 1% confidence.
In practice, though, I'm sure the initial analyses are just as prone to this kind of problem. It just gets REALLY bad when you look backwards.
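A back-of-the-envelope version of this argument, as a sketch only: it assumes every analysis is independent, that there is no real effect at all, and that each one is tested at the usual 0.05 level. Real re-analyses of the same dataset are correlated, so treat the numbers as illustrative. The function name is made up for the example.

```python
def chance_of_false_positive(num_analyses, alpha=0.05):
    """Probability that at least one of several independent analyses
    comes out 'significant' when there is no real effect."""
    return 1.0 - (1.0 - alpha) ** num_analyses

# How quickly a spurious "finding" becomes likely as analyses pile up.
for k in (1, 5, 20, 100):
    print(f"{k:3d} analyses -> {chance_of_false_positive(k):.1%} chance of a false positive")
```

Under those assumptions, twenty different ways of slicing the data already make a spurious hit more likely than not, which is why fixing the method before collecting the data matters.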
There are many things wrong with clinical trials, but this isn't one of them. Both the original article and the reanalysis use valid statistical procedures and do not contradict each other. The original analysis didn't prove absence of an effect, it merely failed to show the existence of an effect. The new analysis shows that the drug is, in fact, more effective under some (weak, reasonable) a priori assumptions.
Whether to use statistical hypothesis testing (frequentist methods) or Bayesian analysis is a long-running debate in statistics and medicine. Both techniques are mathematically valid. Statistical hypothesis testing makes fewer a priori assumptions, which is why people have traditionally trusted it more and why it is widely taught and used in science. But over the years, people have come to realize that pessimistic assumptions can be harmful, such as when you continue clinical trials too long or reject the use of life-saving drugs. Although I personally think Bayesian methods are a better way of analyzing the data, I think the debate over which methods to use is the way scientific debate and change should happen: slowly and with careful re-analysis and re-examination of data and experimental results.
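To illustrate how both analyses can be valid on the same data without contradicting each other, here is a rough sketch with invented numbers: 8 of 15 responders on the drug versus 4 of 15 on placebo. These counts are hypothetical and are not the methotrexate trial data. The sketch runs a one-sided Fisher exact test and, separately, a Bayesian comparison with uniform Beta(1, 1) priors (also an assumption), estimated by Monte Carlo sampling with a fixed seed.

```python
import math
import random

def fisher_one_sided_p(drug_resp, drug_n, placebo_resp, placebo_n):
    """One-sided Fisher exact p-value: probability of seeing at least this
    many drug-arm responders if the drug made no difference."""
    total = drug_n + placebo_n
    responders = drug_resp + placebo_resp
    upper = min(responders, drug_n)
    tail = sum(
        math.comb(responders, k) * math.comb(total - responders, drug_n - k)
        for k in range(drug_resp, upper + 1)
    )
    return tail / math.comb(total, drug_n)

def prob_drug_better(drug_resp, drug_n, placebo_resp, placebo_n,
                     draws=20000, seed=0):
    """Monte Carlo estimate of the posterior probability that the drug's
    response rate exceeds the placebo's, under uniform Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_drug = rng.betavariate(1 + drug_resp, 1 + drug_n - drug_resp)
        p_placebo = rng.betavariate(1 + placebo_resp,
                                    1 + placebo_n - placebo_resp)
        wins += p_drug > p_placebo
    return wins / draws

# Hypothetical counts, chosen only for illustration.
p_value = fisher_one_sided_p(8, 15, 4, 15)
posterior = prob_drug_better(8, 15, 4, 15)
print(f"one-sided p-value: {p_value:.3f}")
print(f"posterior P(drug better than placebo): {posterior:.3f}")
```

With these made-up counts the p-value stays above 0.05, so the frequentist test fails to show an effect, while the posterior still puts most of its weight on the drug being better. Neither result says the other is wrong; they answer different questions.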