Why Standard Deviation Should Be Retired From Scientific Use
An anonymous reader writes "Statistician and author Nassim Taleb has a suggestion for scientific researchers: stop trying to use standard deviations in your work. He says it's misunderstood more often than not, and also not the best tool for its purpose. Taleb thinks researchers should use mean deviation instead. 'It is all due to a historical accident: in 1893, the great Karl Pearson introduced the term "standard deviation" for what had been known as "root mean square error." The confusion started then: people thought it meant mean deviation. The idea stuck: every time a newspaper has attempted to clarify the concept of market "volatility", it defined it verbally as mean deviation yet produced the numerical measure of the (higher) standard deviation. But it is not just journalists who fall for the mistake: I recall seeing official documents from the department of commerce and the Federal Reserve partaking of the conflation, even regulators in statements on market volatility. What is worse, Goldstein and I found that a high number of data scientists (many with PhDs) also get confused in real life.'"
...because people use it incorrectly in economics? Get bent. The standard deviation is a useful tool for statistical analysis of large populations.
On the other hand, you also need to use 2-pass algorithms to compute Mean Absolute Deviation, whereas STD can be easily calculated in one pass. And you still need standard deviation as it relates directly to the second moment about the mean.
Also, annoyingly, Median Absolute Deviation competes for the MAD name and is more robust against outliers.
Sanity is a sandbox. I prefer the swings.
Standard Deviation is the square root of the second moment about the mean, an important fundamental concept to probability distributions. Looking at moments of probability distributions gives us lots of tools that have been developed over the years and in many cases we can apply closed form solutions with reasonably lenient assumptions. Then we apply the square root in order to put it in the same units as the original list of observations and get some of the heuristic advantages that he attributes to the mean absolute deviation.
But it is a balance, and any data set should be looked at from multiple angles, with multiple summary statistics. To say MAD is better that standard deviation is a reasonable point (with which I would disagree), but to say we should stop using standard deviation (the point made in TFA) is totally incorrect.
We should not ask statisticians to change their terms because people are too stupid to understand them.
But doesn't that give an unfair advantage to statisticians? You have to give everyone a chance!
The problem is that peoples' attention spans are rapidly approaching that of a water-flea.
Up until the past 50 or so years, people who learned about Standard Deviation would do so in environments with far less stimulation and distraction. Their lives weren't so filled with extra-curricular activities and entertainments that they never sat for a moment from waking until sleep without some form of stimulus based pastime. When they "understood" the concept, there was time for it to ruminate and gel into a meaningful set of connections with how it is calculated and commonly applied. Today, if you can guess the right answer from a set of 4 choices often enough, you are certified expert and given a high level degree in the subject.
Not bashing modern life, it's great, but it isn't making many "great thinkers" in the mold of the 19th century mathematicians. We do more, with less understanding of how, or why.
First!
... to within 0.5 standard deviations.
Actually, the more posts this story attracts, the more accurate your statement is, and the fewer standard deviations you are away from true first. Response times not being distributed in a Gaussian curve perhaps complicates things.
Then it would be the same as pi, and that would just be silly.
I also think averages should go away. Most people think they are being reported the median (the number in the middle) when people tell them the average. It's great for real estate agents, and people trying to advocate for tax reform, but the numbers are not what people think they are.
The phrase "orbital process" means entirely different things to brain surgeons and rocket scientists.
When our name is on the back of your car, we're behind you all the way!