Why Standard Deviation Should Be Retired From Scientific Use
An anonymous reader writes "Statistician and author Nassim Taleb has a suggestion for scientific researchers: stop trying to use standard deviations in your work. He says it's misunderstood more often than not, and also not the best tool for its purpose. Taleb thinks researchers should use mean deviation instead. 'It is all due to a historical accident: in 1893, the great Karl Pearson introduced the term "standard deviation" for what had been known as "root mean square error." The confusion started then: people thought it meant mean deviation. The idea stuck: every time a newspaper has attempted to clarify the concept of market "volatility", it defined it verbally as mean deviation yet produced the numerical measure of the (higher) standard deviation. But it is not just journalists who fall for the mistake: I recall seeing official documents from the department of commerce and the Federal Reserve partaking of the conflation, even regulators in statements on market volatility. What is worse, Goldstein and I found that a high number of data scientists (many with PhDs) also get confused in real life.'"
And, in a world full of highly numerate simpletons, one must never forget this.
...because people use it incorrectly in economics? Get bent. The standard deviation is a useful tool for statistical analysis of large populations.
The meaning of standard deviation is something you learn on a basic statistics course.
We don't ask biochemists to change their terms because the electron transport chain is complicated.
We don't ask cryptographers to change their terms because the difference between extra entropy and multiplicative prediction resistance is not obvious.
We should not ask statisticians to change their terms because people are too stupid to understand them.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
On the other hand, you also need to use 2-pass algorithms to compute Mean Absolute Deviation, whereas STD can be easily calculated in one pass. And you still need standard deviation as it relates directly to the second moment about the mean.
Also, annoyingly, Median Absolute Deviation competes for the MAD name and is more robust against outliers.
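For anyone who wants the one-pass versus two-pass point spelled out, here is a minimal sketch. It assumes Welford's update for the streaming standard deviation (the comment above does not name a method), and the class and function names are made up for illustration.

```python
import math


class RunningStd:
    """One-pass (streaming) mean and standard deviation via Welford's update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the current mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def population_std(self):
        return math.sqrt(self.m2 / self.n) if self.n else 0.0


def mean_absolute_deviation(samples):
    """Two passes: the mean must be known before any |x - mean| is summed."""
    mean = sum(samples) / len(samples)                          # pass 1
    return sum(abs(x - mean) for x in samples) / len(samples)   # pass 2


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

running = RunningStd()
for value in data:          # each sample is seen once and can then be discarded
    running.add(value)

std = running.population_std()        # 2.0 for this data
mad = mean_absolute_deviation(data)   # 1.5 for this data
```

The streaming object keeps only three numbers, while the mean absolute deviation as written here needs the whole sample in memory for its second pass.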
Sanity is a sandbox. I prefer the swings.
The problem is that people think they understand statistics when all they know is how to enter numbers into a program to generate "statistics".
They mistake the tools-used-to-make-the-model for reality. Whether intentionally or not.
Standard Deviation is the square root of the second moment about the mean, an important fundamental concept to probability distributions. Looking at moments of probability distributions gives us lots of tools that have been developed over the years and in many cases we can apply closed form solutions with reasonably lenient assumptions. Then we apply the square root in order to put it in the same units as the original list of observations and get some of the heuristic advantages that he attributes to the mean absolute deviation.
But it is a balance, and any data set should be looked at from multiple angles, with multiple summary statistics. To say MAD is better than standard deviation is a reasonable point (with which I would disagree), but to say we should stop using standard deviation (the point made in TFA) is totally incorrect.
There is a great difference between a mean value and an RMS value. Scientific people can work with the appropriate version so I don't see a problem with using the correct one for the correct occasion. And certainly science should stay with the correct term as appropriate.
What I believe the person is calling for here is the most appropriate use when communicating to the non-scientific person. This is an education issue in that the communication really should not use either term as a shorthand but should explain in full the effect of the distribution. Science uses mean and standard deviation (often also requiring a named distribution) because they are shorthands that describe the random behavior and have full meaning without any other explanation needed. So I say use neither term when communicating to the non-scientific as they do not fulfill the communication role to which they are intended.
What I believe should actually be done is proper education of all so that they understand the differences between various random distributions and move totally away from an "it is cold today, so global climate change based on heating must be a lie" mindset.
If there are "data scientists" who don't understand what the standard deviation is, then they certainly shouldn't be calling themselves "data scientists," and quite possibly not scientists at all. What subjects are their PhDs in, I wonder? This doesn't do anything to reduce my skepticism that such a thing as "data science" really needs to exist.
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Yes, use the interquartile range instead https://en.wikipedia.org/wiki/Interquartile_range
Like the median, it is a very robust method, not readily influenced by outliers. https://en.wikipedia.org/wiki/Median
The median is wickedly robust, with a breakdown point at 50%, meaning that you can throw a huge amount of junk data at it and it still doesn't care.
The arithmetic mean and the standard deviation are both junk, often worse than the too-often-assumed-normal data thrown at it.
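A small sketch of the robustness claim, using made-up numbers: nine plausible readings plus four garbage values (under half the sample), compared with Python's standard `statistics` module.

```python
import statistics

# Hypothetical data: nine sensible readings around 10
clean = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8, 10.1]
# Four junk values, so 4 of 13 points (less than 50%) are garbage
junk = [1e6, 1e6, 1e6, 1e6]
contaminated = clean + junk

clean_median = statistics.median(clean)          # 10.0
dirty_median = statistics.median(contaminated)   # 10.2

clean_mean = statistics.mean(clean)              # about 10.01
dirty_mean = statistics.mean(contaminated)       # about 307,699
```

The median moves by 0.2 while the mean is dragged off by more than four orders of magnitude, which is the behaviour the breakdown point describes.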
We should not ask statisticians to change their terms because people are too stupid to understand them.
I've always wondered about this attitude.
For me, any change requires an analysis of risk/reward versus value. For example, if code contains confusing names, it might be worthwhile to refactor it.
The tradeoff is in the time spent refactoring versus the perceived value - if it's a mature product that largely works with few planned updates and few people will have to deal with the confusion, then the effort outweighs the returned value. If the code is open source, being actively developed and with many eyes looking at it, there may be a great deal of value in making it easier to understand.
The same could be said of English versus Metric measurements. Why should the US change to use the new system when everyone understands the one we have?
If the Federal Reserve sometimes gets it wrong, there may be great value in changing terms. The effort to fix the mistakes people make might be a good deal less effort than changing the terms used by a subset of mathematicians.
You can look at the big picture and see changes that would return a large overall/distributed value, or you can look at small groups and see that making those changes would cost them time and effort.
Is it too much to ask statisticians to look at the big picture?
That's a good enough replacement term.
---- The above post was generated by the Turing Institute. Maybe.
My college math teacher made a special point of warning us that journalists almost always mix up pct and pp. Sometimes they even do that on purpose!
If you don't like the term "standard deviation", use "margin of error" instead.
-Bob-
I don't know which is more foolish, thinking that saying nothing, but saying it first, is a worthwhile goal, or claiming to be first when you're not. No need for you to choose, however: you did both.
If there are "data scientists" who don't understand what the standard deviation is, then they certainly shouldn't be calling themselves "data scientists," and quite possibly not scientists at all. What subjects are their PhDs in, I wonder?
The problem isn't with highly-educated people, it's people who are not highly educated, or who are highly educated but in a different field.
If a particular intersection attracts a lot of accidents, we consider the accidents to be the fault of the drivers involved. But at the same time, we recognize that aspects of the intersection might be a contributing factor as well.
Expert drivers would never have such accidents, but if we spend some effort reblocking the intersection we could get improved safety, and sometimes there is value in doing this.
Like the roadway intersection, if a term is so confusing that average people make mistakes because of it, there may well be value in changing to easier-to-understand terms.
First!
... to within 0.5 standard deviations.
Actually, the more posts this story attracts, the more accurate your statement is, and the fewer standard deviations you are away from true first. Response times not being distributed in a Gaussian curve perhaps complicates things.
Perhaps non-mathematicians don't have a problem with this, but it rubs me the wrong way.
What makes the mean an interesting quantity is that it is the constant that best approximates the data, where the measure of goodness of the approximation is precisely the way I like it: As the sum of the squares of the differences.
I understand that not everybody is an "L2" kind of guy, like I am. "L1" people prefer to measure the distance between things as the sum of the absolute values of the differences. But in that case, what makes the mean important? The constant that minimizes the sum of absolute values of the differences is the median, not the mean.
So you either use mean and standard deviation, or you use median and mean absolute deviation. But this notion of measuring mean absolute deviation from the mean is strange.
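The L2/L1 pairing above can be checked by brute force. This is only an illustrative sketch with invented data and a coarse grid search, not a proof.

```python
import statistics

# Hypothetical data with one large value so mean and median differ clearly
data = [1.0, 2.0, 3.0, 4.0, 100.0]


def sum_squares(c):
    """L2 cost of approximating every observation by the constant c."""
    return sum((x - c) ** 2 for x in data)


def sum_abs(c):
    """L1 cost of approximating every observation by the constant c."""
    return sum(abs(x - c) for x in data)


# Candidate constants 0.0, 0.1, ..., 100.0
candidates = [i / 10 for i in range(0, 1001)]

best_l2 = min(candidates, key=sum_squares)   # lands on the mean
best_l1 = min(candidates, key=sum_abs)       # lands on the median

mean = statistics.mean(data)       # 22.0
median = statistics.median(data)   # 3.0
```

The constant that minimizes squared error is the mean (22.0) and the one that minimizes absolute error is the median (3.0), so taking absolute deviations about the mean mixes the two conventions.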
Anyway, his proposal is preposterous: I use the standard deviation daily and I don't care if others lack the sophistication to understand what it means.
I also think averages should go away. Most people think they are being reported the median (the number in the middle) when people tell them the average. It's great for real estate agents, and people trying to advocate for tax reform, but the numbers are not what people think they are.
They are vital to getting meaningful information out of a sea of data. Cancer research and particle physics use data scientists. Unfortunately so does amazon.com.
Well, given that they think it's a great idea to take two different data sets measured in the same units, but measured in completely different ways, and put them together as a comparison over time then I'd say the definition of deviation is the least of their worries.
Food for thought: "Revisiting a 90-year-old debate: the advantages of the mean deviation"
http://www.leeds.ac.uk/educol/documents/00003759.htm
Didn't Taleb warn us about the perils of modeling things with normal distributions that fail to capture outliers ("Black Swans") and yet now he advocates the use of a statistical measure that conceals^H^H^H^H^H is robust with respect to outliers?
Oh well, next year he'll probably come up with something along the lines of "Monte Carlo methods major cause of global warming, return to analytic methods and moments unavoidable truth"...
I find this article quite confusing. Is the actual suggestion that we should be going around using the mean deviation as a way of capturing the general variance of our data sets? Or to put it another way, does he want "deviation" measures not to give us a real sense of the larger deviations that might occur with some real probability? For example, with temperatures, standard deviation is more likely to suggest that we can have periods of significantly higher and lower temperatures than a simple "mean deviation".
Adding to my confusion is that there is no reference to articles, books, or other subject material that supports the general thesis. If the "mean deviation" is better than the "std deviation", give some real concrete examples and supporting mathematics.
Also, there seems to be no reference to "bell curve" distributions and "non bell curve" distributions. Standard deviation computations are built around bell curve distributions for their mathematical soundness. For example, if I were to take every number and raise it to the fourth power, standard deviation would not work so well on this new set of numbers. Is the author suggesting that typical sampling distributions of sampled events tend not to be "bell curve" like?
Standard deviation is taught in 7th grade in my local school. It shows up constantly in any standard K-12 curriculum. To challenge this, you really should bring a lot more substance to any argument that we should do things differently.
For example, I could argue that we should use 1:2 to represent 1/2 because the slash (/) should be used for logical dependency arguments instead. I could create lots of examples and go into a diatribe about how people constantly misuse fractions and ratios because they use a slash in their construction. But I would still be spouting nonsense.
Well... first of all, summary has it wrong. It's not "mean deviation", it's "mean absolute deviation", or just "absolute deviation" from the literature I've seen. (Mean deviation is actually always zero, the most useless thing you could possibly consider.)
Keep in mind that standard deviation is the provably best basis if your goal is to estimate a population *mean*, the most commonly used measure of center. Absolute deviation, on the other hand, is the best basis to use for an estimate of a population *median*, which is maybe fine for finances, which is what the linked paper seems mostly focused on. (Bayesian best estimators, if I recall correctly.)
If the main critique is that economists and social scientists don't know what the F they're doing, then I won't disagree with that. But no need to metastasize the infection to math and statistics in general.
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
The example in the article isn't even an example of a standard deviation. He may have plugged his five values into the RMS formula, but what it produced isn't an actual standard deviation because five values is too small of a sample size.
This article is really more a demonstration of why people should stop misusing the term "standard deviation" than it is an argument for why people should stop using standard deviation.
-Glires
I studied geodesy in Germany, earning a diploma at a technical university. Standard deviation has every right to exist and to be used. If this man really means what he says, he should not call for abandoning standard deviation but should write BOOKS that teach people correctly what it is and how to calculate it from the data you have. Yes, I also meet people (who call themselves scientists and researchers) who have no fucking clue how to work with data and standard deviation, but on the other hand I also meet a lot who do know and who derive the right conclusions, formulas, or algorithms from it. To me this guy sounds like a mad panda who just didn't get it right...
When I was in school, they still taught the central limit theorem which explains why so many error distributions are "normal". Our world provides us with millions of examples in everyday life where the standard deviation of our experiences is the best statistic to estimate the probability of future events.
What you do with a statistic is what counts. It's easy to look at the standard deviation and estimate the probability that the conclusion was reached by the luck of the draw, though it takes some practice to develop your intuition. It is embedded in our language when we talk of "6 sigma" reliability or "4 sigma" thinkers. Anyone who thinks he is a scientist should understand this!
Mr. Taleb may be working in a field where normal distributions are rare, but the probability is he is either lying or poorly educated.
I agree that mathematicians may become imprinted on standard deviation and forget that it is only used because it is easier to work with than average absolute deviation (ex: the derivative of x^2 is continuous, unlike abs(x)), and that less technically inclined readers might not realize there is a difference. However, they ARE usually pretty close (I don't have a reference, but I once ran a simulation comparing the 2 using random data with a Gaussian distribution and the curves matched exactly), and it's harder to find exact solutions with average absolute deviation. On the other hand, it wouldn't hurt to use "MAD" occasionally on a data set to make sure that the standard deviation gives results that are meaningful as a measure of "deviation".
That's what he concludes at the bottom of the article. He starts the article by saying that standard deviation should only be used by physicists, mathematicians, and mathematical statisticians. If I'm not mistaken, "physics" and "math" covers a whole lot of different fields, including most of the STEM fields that (largely) define the users of this site.
I know in my particular field (physics based), standard deviation is a hell of a lot more useful than mean absolute deviation. And easier to use.
Bah. I call poor summary.
It may look like I'm doing nothing, but I'm actively waiting for my problems to go away.
--Scott Adams
Cancer research and particle physics use data scientists. Unfortunately so does amazon.com.
Okay, since cancer research is a very large field, I can't say for sure one way or the other ... but I do know that working in bioinformatics at a major academic research center, I've never known a single person in medical research of any kind who called themselves a "data scientist." We have lots of computer scientists and statisticians, most of whom, fortunately, get along well enough to make use of each other's strengths. Regarding particle physics I have no idea, but yeah, I'm willing to bet Amazon or any other large corporation hires more "data scientists" than all the scientific institutions in the world put together--and gets exactly the kind of buzzword bingo they're paying for in return.
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
If I read a non-scientific article that spewed out standard deviations I would automatically disregard the numbers anyway. It is a safe assumption that a journalism major doesn't understand what they're writing about and just adding filler to boost word count.
I am becoming gerund, destroyer of verbs.
For normal densities, standard deviations and MAD are just proportional, with a factor of about 1.25, so it doesn't matter which you use.
For non-normal densities, neither of them really is universally "right" for characterizing the deviation, but it's mathematically a whole lot easier to understand how standard deviation behaves in those cases than MAD. So even there, standard deviations are usually the better choice.
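The "about 1.25" factor for normal data is sqrt(pi/2), since the expected absolute deviation of a normal variable is sigma * sqrt(2/pi). A rough simulation sketch follows; the seed, sigma, and sample size are arbitrary choices for illustration.

```python
import math
import random

random.seed(12345)          # arbitrary seed, only for repeatability
sigma = 3.0                 # arbitrary scale; the ratio should not depend on it
samples = [random.gauss(0.0, sigma) for _ in range(200_000)]

n = len(samples)
mean = sum(samples) / n
std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
mad = sum(abs(x - mean) for x in samples) / n   # mean absolute deviation

ratio = std / mad
theory = math.sqrt(math.pi / 2)   # about 1.2533
```

With a sample this large, `ratio` should sit very close to `theory`. Swapping in a heavy-tailed generator would be one way to see the proportionality break down for non-normal data.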
The celebrated Heisenberg uncertainty principle in quantum theory is based on statistical statements about the coupled standard deviations of position and momentum measurements (for example), not the mean deviation. The mean deviations are assumed to be zero since the means of the position and momentum distributions are exactly known for theoretical work. What matters are the fluctuations about the mean. In fairness, Taleb does allow physicists to keep using STD. But, quantum mechanics aside, it seems characterizing fluctuations about the mean, rather than fluctuations of the mean, is often an important measure depending on the nature of the investigation. Retiring the standard deviation seems a bit hasty.
i\hbar\dot{\psi}=\hat{H}\psi
It weights by the difference between the observation and the mean, by the variation. So large observations are not weighted any more than small ones. Two observations equally far from the mean get equal weight.
Widely varying observations do get higher weight and that is intentional. Standard deviation is that way because it is so useful in analysis of variance and measuring likelihood of statistical significance.
http://lkml.org/lkml/2005/8/20/95
...and besides... JUST THINK of all the rigorous Lean Management courses that will have to re-certify all of their "Six-Sigma Black Belts" to some kind of "Half-Dozen of the Other" degrees!
PANDEMONIUM!!!
Where does it say that?
Agreed that this is a ridiculous proposal. He probably just wants more publicity.
This was the guy who wrote the book "Anti-Fragile", which I had hoped would educate and broaden my way of thinking, in the same way that the Malcolm Gladwell books ("Tipping Point", "Blink", "Outliers") did. He ended up droning on and on without really making a worthwhile point, and I gave up after a while.
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
I know several people who have left high energy physics to become data scientists. Nobody in HEP calls themselves a "data scientist", but that's (some of) what we do anyway. It's just analysis of very large data sets. Unlike in the life sciences, both HEP and many commercial / industrial environments have sufficiently large data sets that very complex questions can be asked and answered. You can never have "enough data" -- if you think you have "enough data", then you aren't asking hard enough questions.
SIGSEGV caught, terminating
wait... not that kind of sig.
Um, yes it does.
Imagine a random draw in which you had 99 realisations of the number "1" and 1 realisation of the number "101". The mean would equal 2 while the (population) standard deviation would equal sqrt(1/100*(99+99^2)) = sqrt(99), or roughly 9.95 (which is a fairly large number).
If you removed the one observation of "101" from the sample, you would have zero standard deviation. So one additional outlier will cause the standard deviation to go from zero to about 10 without changing the mean by nearly as much (or the median at all).
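The arithmetic in this example is easy to check with Python's standard `statistics` module. The mean absolute deviation line is an addition for comparison and is not part of the example above.

```python
import statistics

with_outlier = [1] * 99 + [101]
without_outlier = [1] * 99

mean = statistics.mean(with_outlier)                 # 2
median = statistics.median(with_outlier)             # 1, same as without the outlier
std = statistics.pstdev(with_outlier)                # sqrt(99), about 9.95
std_without = statistics.pstdev(without_outlier)     # 0.0

# Mean absolute deviation about the mean, for comparison: (99*1 + 99) / 100
mad = sum(abs(x - mean) for x in with_outlier) / len(with_outlier)   # 1.98
```

One outlier takes the population standard deviation from 0 to about 9.95, while the mean absolute deviation only reaches 1.98.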
The issue of underweighting tail events is really a separate issue, and that is usually because a distribution such as the normal distribution has very thin tails, so basically anything outside about 4 standard deviations would be predicted to be extremely unlikely by the normal distribution.
Other distributions do better, but they tend to be rather less friendly from an analytical perspective. However, with the kind of computing power available nowadays, this is less of an issue now as it would have been when people first started using the normal distribution for the purposes for which it has been always known (and some not so clever bankers have recently found) to be inadequate.
I should add that it is still a bit of an issue where speed is concerned though, because you can do your calculations involving the normal distribution much faster than you can many of the other more esoteric and likely more realistic distributions.
"If people believe that they have to give up a comfortable lifestyle to reduce carbon dioxide emissions, they will look for any evidence that AGW is incorrect, no matter how flimsy it is. You can see this behavior for what it is when people cling to a mistaken idea for dear life."
The above reminded me of something from Nassim Taleb's writings. Those who have read his books may be familiar with the following Upton Sinclair quote: "It is difficult to get a man to understand something, when his salary depends upon his not understanding it." NNT applies this principle to financial 'experts' (quants, stockbrokers, advisors, etc.) who do things that are demonstrably counterproductive (applying stat methods that assume Gaussianity to non-normal distributions; disregarding the randomness inherent in stock movements) not necessarily out of ignorance, but largely because such actions serve their economic benefit. In all areas, people often disregard evidence when doing so serves what they perceive as their immediate interests.
Data science is a field that combines machine learning and statistics to derive meaning from data. Data scientists should be reasonably well-versed in classical stats, but the data sets they deal with are often huge, ill-defined, and not amenable to analysis using classical methods. To deal with such challenges, data science recruits a healthy combination of certain areas of comp-sci (databases, machine learning, NLP, AI), statistical methods, and, quite often, improvisation.
Strange that there are so many people on here that are unfamiliar with data science.
Concerning his education credentials: he's got a U. Penn. MBA and a U. of Paris doctorate, and currently teaches at NYU Polytech. If you want to know his thoughts on normal distributions, stats, epistemology, econ, and the social sciences, his books are excellent, and are well worth a read (although much of the best material is quite derivative of Mandelbrot). NNT may be called an anti-academic, anti-econ establishment crank, but it would generally be inaccurate, in accordance with your inference, to accuse him of lying.
I can really go for renaming standard deviation, but it should not be abolished.
Standard deviation is a function of the second moment of the data, and if you remember your laws for combining moments of inertia (the parallel axis theorem), then you'll understand better what you're dealing with.
2nd moments detail resistance to spin, and thus the resilience of your findings to changes and errors.
Correct Horse Battery Staple: 72 bits of entropy. Enter "Correct H" into google. When it generates the phrase, that's
Calculating the standard deviation of a data set with only one pass over the data as you initially collect it is fairly straightforward. This is ideal in situations where the data you are working with is ephemeral, or of unbounded size and impractical to store every individual sample. How do you calculate the mean deviation without having to go back and revisit all of your samples?
File under 'M' for 'Manic ranting'
Naw, you'd probably need a Poisson distribution ;-).
"[I]t is a wise man who admits the limits of his knowledge or skill, and that pretending either causes harm." --Terry Go
Perhaps we should be looking at MAD statistics more often when summarizing or describing data. However, the standard deviations are very useful in Statistical Inference. Standard deviations are always reported with the parameter estimates. Now this is really useful because the parameter estimates are assumed to be approximately normally distributed either due to the Central Limit Theorem or by assumption of iid normal disturbances. Under the standard normal distribution, two standard deviations account for 95% of the coverage probability, so just by glancing at the standard deviations you know roughly the confidence intervals and also the outcome of a simple z or t test of a hypothesis about the given estimate.
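The coverage figures for a normal distribution follow from the error function, P(|Z| < k) = erf(k / sqrt(2)). A short sketch with a hypothetical helper name:

```python
import math


def normal_coverage(k):
    """Probability that a standard normal variable falls within k sigma of its mean."""
    return math.erf(k / math.sqrt(2))


one_sigma = normal_coverage(1)       # about 0.6827
two_sigma = normal_coverage(2)       # about 0.9545
exact_95 = normal_coverage(1.96)     # about 0.9500
```

So "two standard deviations" is a rounded rule of thumb: the interval that covers 95% is closer to 1.96 sigma, and two sigma covers a little more.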
Do you take every observation: square it, average the total, then take the square root? Or do you remove the sign and calculate the average?
WTF! - he's managed to get both the definition of standard deviation and mean absolute deviation wrong.
And neither does the media-consuming public. Most would totally ignore your measure of precision regardless of whether you call it standard deviation or mean absolute deviation. For them your average is absolute and if any values aren't at all near it something is terribly wrong. They will also not rest until every school performs above average and nothing in your work will convince them otherwise. The public doesn't like uncertainty and will assume every outcome is for a special reason, and this even goes for the non-religious ones. The idea that some things aren't absolute and are actually uncertain and variable terrifies them.
Nowhere is this more apparent than in sports. Everything there is always "written in the stars" or "destiny" and if you win it always proves beyond doubt you are better than your opposition (or you were 100% cheated by the refs). Hell, journalists may have had a full article written up 2 minutes before the end of a game and then completely change everything to be about one team's dogged determination because chance would have it they scored in the last minute. I love football (soccer), but discussing it can be frustrating.
If you still believe you can convince them, use mean absolute deviation in your "executive summary" or press release and leave the standard deviation as is in your actual paper. The only ones that actually read the paper are scientists anyway. The typical journalist reading your actual paper is likely to misunderstand something in every paragraph anyway. Changing real science to pander to the masses is a fucking huge mistake.
You mean that there are scientists publishing papers out there that don't understand basic concepts of statistics? Like, the rest of the world?
I'm truly 99.9999% e+/-0.5 shocked!
"I think this line is mostly filler"
A standard deviation is something kinky everyone should try at least once.
Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.
(what's new right?)
I never even thought to conflate Std Deviation and Mean Deviation prior to reading this article/summary. I just thought of Std Deviation as that bit of the normal distribution which captures ~68.2% of the values (for +/- 1 sigma). And yes, I knew how it is calculated, my mind just didn't go that direction.
McFly777
- - -
"What do people mean when they say the computer went down on them?" -Marilyn Pittman
I worked for one of those guys once. Not long after I was hired, he eagerly explained to me that sigma is how many nines there are after the decimal point.
Using the mean deviation is kind of like kissing your sister. Nice, but it doesn't go anywhere. Significance testing is right out, for starts.
Physicist here...
I should think the travesty in this article is an economist not making a huge deal about the real issue here: measures of central tendency (any measure) only really make sense when you're looking at Gaussian-type data (don't economists have fat-tail debacles etched into them at school???). Using a mean and a standard deviation, rmse or whatever to encapsulate a power law distributed thing is dangerous when you start USING it for something (like derivatives pricing). Power law distributions are more prevalent than popularly imagined... Use care when using measures of central tendency on them.
I like that little poke at journalists:
In other words, it's not just journalists who fall for the mistake, so do educated people.
The real danger comes not from a 50% confusion between standard deviation and mean absolute deviation; but from the assumption that the statistical distribution is Gaussian.
Before the credit crunch, financiers who considered themselves "masters of the universe" believed on the basis of the Black–Scholes equation that they could hedge their risks with a mean time to failure of billions of years. The probability distributions were assumed to be Gaussian, but this bore no relation to the past performance of the stock market.
I like that little poke at journalists: ...
I assume you have met a journalist, before? 8-}