Why Standard Deviation Should Be Retired From Scientific Use
An anonymous reader writes "Statistician and author Nassim Taleb has a suggestion for scientific researchers: stop trying to use standard deviations in your work. He says it's misunderstood more often than not, and also not the best tool for its purpose. Taleb thinks researchers should use mean deviation instead. 'It is all due to a historical accident: in 1893, the great Karl Pearson introduced the term "standard deviation" for what had been known as "root mean square error." The confusion started then: people thought it meant mean deviation. The idea stuck: every time a newspaper has attempted to clarify the concept of market "volatility", it defined it verbally as mean deviation yet produced the numerical measure of the (higher) standard deviation. But it is not just journalists who fall for the mistake: I recall seeing official documents from the department of commerce and the Federal Reserve partaking of the conflation, even regulators in statements on market volatility. What is worse, Goldstein and I found that a high number of data scientists (many with PhDs) also get confused in real life.'"
...because people use it incorrectly in economics? Get bent. The standard deviation is a useful tool for statistical analysis of large populations.
The meaning of standard deviation is something you learn on a basic statistics course.
We don't ask biochemists to change their terms because the electron transport chain is complicated.
We don't ask cryptographers to change their terms because the difference between extra entropy and multiplicative prediction resistance is not obvious.
We should not ask statisticians to change their terms because people are too stupid to understand them.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
On the other hand, you also need to use 2-pass algorithms to compute Mean Absolute Deviation, whereas STD can be easily calculated in one pass. And you still need standard deviation as it relates directly to the second moment about the mean.
Also, annoyingly, Median Absolute Deviation competes for the MAD name and is more robust against outliers.
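To make the one-pass versus two-pass point concrete, here is a minimal sketch. It assumes Welford's running update for the one-pass standard deviation, which is only one of several possible methods, and the function names and sample data are made up for illustration.

```python
import math

def one_pass_std(stream):
    """Population standard deviation in a single pass (Welford's update)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return math.sqrt(m2 / n) if n else float("nan")

def two_pass_mean_abs_dev(values):
    """Mean absolute deviation about the mean; the mean must be known first."""
    values = list(values)              # data has to be kept to read it twice
    mean = sum(values) / len(values)   # pass 1: compute the mean
    return sum(abs(x - mean) for x in values) / len(values)  # pass 2

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values only
std = one_pass_std(iter(data))       # consumes the data as a stream
mad = two_pass_mean_abs_dev(data)
print(std, mad)
```

The standard deviation routine never stores the data, while the mean absolute deviation routine has to hold on to it until the mean is known.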
Sanity is a sandbox. I prefer the swings.
The problem is that people think they understand statistics when all they know is how to enter numbers into a program to generate "statistics".
They mistake the tools-used-to-make-the-model for reality. Whether intentionally or not.
Standard deviation is the square root of the second moment about the mean, a fundamental concept in probability distributions. Looking at moments of probability distributions gives us lots of tools that have been developed over the years, and in many cases we can apply closed-form solutions with reasonably lenient assumptions. Then we apply the square root in order to put it in the same units as the original list of observations and get some of the heuristic advantages that he attributes to the mean absolute deviation.
But it is a balance, and any data set should be looked at from multiple angles, with multiple summary statistics. To say MAD is better than standard deviation is a reasonable point (with which I would disagree), but to say we should stop using standard deviation (the point made in TFA) is totally incorrect.
There is a great difference between a mean value and an RMS value. Scientific people can work with the appropriate version so I don't see a problem with using the correct one for the correct occasion. And certainly science should stay with the correct term as appropriate.
What I believe the person is calling for here is the most appropriate use when communicating to the non-scientific person. This is an education issue in that the communication really should not use either term as a shorthand but should explain in full the effect of the distribution. Science uses mean and standard deviation (often also requiring a named distribution) because they are shorthands that describe the random behavior and have full meaning without any other explanation needed. So I say use neither term when communicating to the non-scientific, as they do not fulfill the communication role for which they are intended.
What I believe should actually be done is proper education of all so that they understand the differences between various random distributions and move totally away from a "it is cold today, so global climate change based on heating must be a lie".
I don't know which is more foolish, thinking that saying nothing, but saying it first, is a worthwhile goal, or claiming to be first when you're not. No need for you to choose, however: you did both.
First!
... to within 0.5 standard deviations.
Actually, the more posts this story attracts, the more accurate your statement is, and the fewer standard deviations you are away from true first. Response times not being distributed in a Gaussian curve perhaps complicates things.
Perhaps non-mathematicians don't have a problem with this, but it rubs me the wrong way.
What makes the mean an interesting quantity is that it is the constant that best approximates the data, where the measure of goodness of the approximation is precisely the way I like it: As the sum of the squares of the differences.
I understand that not everybody is an "L2" kind of guy, like I am. "L1" people prefer to measure the distance between things as the sum of the absolute values of the differences. But in that case, what makes the mean important? The constant that minimizes the sum of absolute values of the differences is the median, not the mean.
So you either use mean and standard deviation, or you use median and mean absolute deviation. But this notion of measuring mean absolute deviation from the mean is strange.
Anyway, his proposal is preposterous: I use the standard deviation daily and I don't care if others lack the sophistication to understand what it means.
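The "L2 picks the mean, L1 picks the median" claim can be checked numerically. This is a rough sketch using a brute-force grid search over candidate constants; the data set and the grid are made up for illustration.

```python
import statistics

data = [1.0, 2.0, 3.0, 4.0, 10.0]  # illustrative values only

def sum_sq(c):
    """L2 cost: sum of squared differences from the constant c."""
    return sum((x - c) ** 2 for x in data)

def sum_abs(c):
    """L1 cost: sum of absolute differences from the constant c."""
    return sum(abs(x - c) for x in data)

# Candidate constants on a 0.01 grid from 0 to 12.
grid = [i / 100 for i in range(0, 1201)]
best_l2 = min(grid, key=sum_sq)
best_l1 = min(grid, key=sum_abs)

print(best_l2, statistics.mean(data))    # L2 minimiser next to the mean
print(best_l1, statistics.median(data))  # L1 minimiser next to the median
```

With an odd number of points the L1 minimiser is unique. With an even number, any value between the two middle points would do equally well.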
I also think averages should go away. Most people think they are being reported the median (the number in the middle) when people tell them the average. It's great for real estate agents, and people trying to advocate for tax reform, but the numbers are not what people think they are.
Well, given that they think it's a great idea to take two different data sets measured in the same units, but measured in completely different ways, and put them together as a comparison over time then I'd say the definition of deviation is the least of their worries.
I often change CSensiblyNamedClassThatDescribesItsFunctionWell to bTrue throughout the code for precisely this reason and no-one ever appreciates it :(
Well... first of all, summary has it wrong. It's not "mean deviation", it's "mean absolute deviation", or just "absolute deviation" from the literature I've seen. (Mean deviation is actually always zero, the most useless thing you could possibly consider.)
Keep in mind that standard deviation is the provably best basis if your goal is to estimate a population *mean*, the most commonly used measure of center. Absolute deviation, on the other hand, is the best basis to use for an estimate of a population *median*, which is maybe fine for finances, which is what the linked paper seems mostly focused on. (Bayesian best estimators, if I recall correctly.)
If the main critique is that economists and social scientists don't know what the F they're doing, then I won't disagree with that. But no need to metastasize the infection to math and statistics in general.
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
I studied geodesy in Germany, earning a diploma at a technical university. Standard deviation has its right to exist and to be used. If this man really means what he says, he should not tell people to abandon standard deviation but write BOOKS that teach people correctly what it is and how it is calculated from the data you have. Yes, I also meet people (calling themselves scientists and researchers) who have no fucking clue how to work with data and standard deviation, but on the other hand I also meet a lot who do know and who derive the right conclusions, formulas, or algorithms from it. To me this guy sounds like a mad panda who just didn't get it right...
What other existing specialization in computer science, physics, etc., do you feel is qualified to use Hadoop to process trillions of triple stores into a network and subsequently build highly multivariate link prediction models and evaluate their output statistically with respect to ground truth, to name but one trifling task?
As it happens, one of my colleagues runs a project which, among other things, does exactly that. His PhD is in computer science. I'm a bioinformaticist with a background primarily in biostatistics; I couldn't develop a tool like that, but I can certainly see the value in it. In general, I'm not arguing that the tasks currently getting lumped together under "data science" aren't valuable. I'm just saying that I'm not convinced they fit together into a coherent field that can meaningfully be studied in a single degree program, and attempts to make them so may well run into the problem of "jack of all trades, master of none."
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Cancer research and particle physics use data scientists. Unfortunately so does amazon.com.
Okay, since cancer research is a very large field, I can't say for sure one way or the other ... but I do know that working in bioinformatics at a major academic research center, I've never known a single person in medical research of any kind who called themselves a "data scientist." We have lots of computer scientists and statisticians, most of whom, fortunately, get along well enough to make use of each other's strengths. Regarding particle physics I have no idea, but yeah, I'm willing to bet Amazon or any other large corporation hires more "data scientists" than all the scientific institutions in the world put together--and gets exactly the kind of buzzword bingo they're paying for in return.
"It is, like the median, a very robust method, not readily influenced by outliers. The median is wickedly robust, with a breakdown point at 50%, meaning that you can throw a huge amount of junk data at it and it still doesn't care. The arithmetic mean and the standard deviation are both junk, often worse than the too-often-assumed-normal data thrown at it."
That depends entirely on what you are trying to show. None of them are junk for all purposes; all of them are junk for the wrong purposes.
For example, if you're talking about salaries of employees of a corporation, the mean might not mean much: the CEO makes 30 times as much as everyone else, and other managers 20 times more, lower managers 10 times more... so the mean is thrown way off. The median is much more meaningful.
On the other hand, even the mode can be useful sometimes. Suppose the corporation has only 3 pay grades: employees grade A, managers grade B, owner and CEO grade C. In that case the mode might actually tell you something interesting. That's not the best example, but it is an example.
Hi, I'm a statistician.
It's not so simple to just say "ok, we're going to use the Mean Absolute Deviation from now on." The use of standard deviation is not quite the historical accident that Taleb makes it out to be--there are good reasons for using it. Because it is a one-to-one function of the second central moment (variance), it inherits a bunch of nice properties that the mean absolute deviation does not. There is not a one-to-one correspondence between variance and mean absolute deviation.
Taleb is correct that the mean absolute deviation is easier to explain to people, but this is not just a matter of changing units of measure (where there is a one-to-one correspondence) or changing function and variable names in code (where there is again a one-to-one correspondence). Standard deviation and mean absolute deviation have different theoretical properties. These differences have led most statisticians over the last hundred years to conclude that the standard deviation is a better measure of variability, even though it is harder to explain.
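One way to see that variance does not determine mean absolute deviation: two data sets can share a variance and still differ in mean absolute deviation. A small sketch with made-up numbers follows (the helper name is hypothetical).

```python
import statistics

def mean_abs_dev(values):
    """Mean absolute deviation about the mean."""
    mean = statistics.mean(values)
    return sum(abs(x - mean) for x in values) / len(values)

# Two illustrative samples, both with mean 0 and population variance 1.
spread_evenly = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
mostly_quiet = [-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0]

var_a = statistics.pvariance(spread_evenly)
var_b = statistics.pvariance(mostly_quiet)
mad_a = mean_abs_dev(spread_evenly)
mad_b = mean_abs_dev(mostly_quiet)

print(var_a, var_b)  # equal variances
print(mad_a, mad_b)  # different mean absolute deviations
```

So knowing the variance alone does not tell you the mean absolute deviation; you also need to know something about the shape of the distribution.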
I would have said "18 half gallon pottles to the quarter-barrel firkin."
Wolfram Alpha says 15.75 pottles to the firkin, but that's because of US/UK gallon conversions, I reckon.
352 nails in a chain - which was interesting to me, in that Google includes those units in its calculator.
I now know more about pottles, firkins, nails and chains than I did when I woke up. I shudder to think about what got pushed out of my old head to make way for these new minutiae.
I think NNT is saying that the MAD ought to be used when you are conveying a numerical representation of the "deviations" with the intent that readers use this number to imagine or intuit the size of the "deviations." His example is that of how much the temperature might change on a day-to-day basis. According to him, it's not just that the concept is easier to explain, but that it is the more accurate measure to use for this purpose.
Based on his other work I'm sure he understands that the STD is generally superior for optimization purposes, fit comparison, etc.
.: Semper Absurda
For normal densities, standard deviations and MAD are just proportional, with a factor of about 1.25, so it doesn't matter which you use.
For non-normal densities, neither of them really is universally "right" for characterizing the deviation, but it's mathematically a whole lot easier to understand how standard deviation behaves in those cases than MAD. So even there, standard deviations are usually the better choice.
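The "about 1.25" factor for the normal case is sqrt(pi/2), taking MAD to mean the mean absolute deviation about the mean. Here is a quick sketch that compares that value with a simulated ratio; the seed and sample size are arbitrary choices.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

mean = sum(draws) / len(draws)
std = math.sqrt(sum((x - mean) ** 2 for x in draws) / len(draws))
mad = sum(abs(x - mean) for x in draws) / len(draws)

theory = math.sqrt(math.pi / 2)  # std / MAD for a normal distribution
ratio = std / mad                # the same ratio estimated from the draws

print(theory, ratio)
```

For heavier-tailed data the ratio would generally come out larger, which is the case the rest of the thread is arguing about.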
...and besides... JUST THINK of all the rigorous Lean Management courses that will have to re-certify all of their "Six-Sigma Black Belts" to some kind of "Half-Dozen of the Other" degrees!
PANDEMONIUM!!!
Agreed that this is a ridiculous proposal. He probably just wants more publicity.
This was the guy who wrote the book "Antifragile", which I had hoped would educate and broaden my way of thinking, in the same way that the Malcolm Gladwell books ("Tipping Point", "Blink", "Outliers") did. He ended up droning on and on without really making a worthwhile point, and I gave up after a while.
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
I know several people who have left high energy physics to become data scientists. Nobody in HEP calls themselves a "data scientist", but that's (some of) what we do anyway. It's just analysis of very large data sets. Unlike in the life sciences, both HEP and many commercial / industrial environments have sufficiently large data sets that very complex questions can be asked and answered. You can never have "enough data" -- if you think you have "enough data", then you aren't asking hard enough questions.
SIGSEGV caught, terminating
wait... not that kind of sig.
Um, yes it does.
Imagine a random draw in which you had 99 realisations of the number "1" and 1 realisation of the number "101". The mean would equal 2, while the (population) standard deviation would equal sqrt(1/100*(99+99^2)), roughly 9.95 (which is a fairly large number).
If you removed the one observation of "101" from the sample, you would have zero standard deviation. So one additional outlier will cause the standard deviation to go from zero to 10 without changing the mean by nearly as much (or the median at all).
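The numbers in this example are easy to reproduce with Python's standard `statistics` module. The variable names below are just for illustration.

```python
import math
import statistics

with_outlier = [1] * 99 + [101]
without_outlier = [1] * 99

mean_with = statistics.mean(with_outlier)        # pulled up slightly
median_with = statistics.median(with_outlier)    # unaffected by the outlier
pstd_with = statistics.pstdev(with_outlier)      # population std, divisor n
pstd_without = statistics.pstdev(without_outlier)

print(mean_with, median_with)
print(pstd_with, math.sqrt(99))  # the two values should agree
print(pstd_without)
```

A single observation moves the standard deviation from zero to nearly 10, while the mean shifts by only 1 and the median not at all.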
The issue of underweighting tail events is really a separate issue, and that is usually because a distribution such as the normal distribution has very thin tails, so basically anything outside about 4 standard deviations would be predicted to be extremely unlikely by the normal distribution.
Other distributions do better, but they tend to be rather less friendly from an analytical perspective. However, with the kind of computing power available nowadays, this is less of an issue now than it would have been when people first started using the normal distribution for the purposes for which it has been always known (and some not so clever bankers have recently found) to be inadequate.
I should add that it is still a bit of an issue where speed is concerned though, because you can do your calculations involving the normal distribution much faster than you can many of the other more esoteric and likely more realistic distributions.
"If people believe that they have to give up a comfortable lifestyle to reduce carbon dioxide emissions, they will look for any evidence that AGW is incorrect, no matter how flimsy it is. You can see this behavior for what it is when people cling to a mistaken idea for dear life."
The above reminded me of something from Nassim Taleb's writings. Those who have read his books may be familiar with the following Upton Sinclair quote: "It is difficult to get a man to understand something, when his salary depends upon his not understanding it." NNT applies this principle to financial 'experts' (quants, stockbrokers, advisors, etc.) who do things that are demonstrably counterproductive (applying stat methods that assume Gaussianity to non-normal distributions; disregarding the randomness inherent in stock movements) not necessarily out of ignorance, but largely because such actions serve their economic benefit. In all areas, people often disregard evidence when doing so serves what they perceive as their immediate interests.
Data science is a field that combines machine learning and statistics to derive meaning from data. Data scientists should be reasonably well-versed in classical stats, but the data sets they deal with are often huge, ill-defined, and not amenable to analysis using classical methods. To deal with such challenges, data science recruits a healthy combination of certain areas of comp-sci (databases, machine learning, NLP, AI), statistical methods, and, quite often, improvisation.
Strange that there are so many people on here that are unfamiliar with data science.
I can really go for renaming standard deviation, but it should not be abolished.
Standard deviation is a function of the second moment of the data, and if you remember your laws for combining moments of inertia (the parallel axis theorem), then you'll understand better what you're dealing with.
2nd moments detail resistance to spin, and thus the resilience of your findings to changes and errors.
Correct Horse Battery Staple: 72 bits of entropy. Enter "Correct H" into google. When it generates the phrase, that's
Not within one day, but across several days.
Say someone just asked you to measure the "average daily variations" for the temperature of your town (or for the stock price of a company, or the blood pressure of your uncle) over the past five days. The five changes are: (-23, 7, -3, 20, -1). How do you do it?
The STD of the series is 15.7 and the MAD is 10.8. NNT argues that MAD is more useful in this context.
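For anyone who wants to check the two figures, here is a short sketch with Python's standard `statistics` module. One thing worth noting: the 15.7 corresponds to the sample standard deviation (divisor n-1). With divisor n the result is about 14.1. The five changes happen to average exactly zero, so deviations from the mean are just the values themselves.

```python
import statistics

changes = [-23, 7, -3, 20, -1]

mean = statistics.mean(changes)  # the five changes sum to zero
mad = sum(abs(x - mean) for x in changes) / len(changes)
sample_std = statistics.stdev(changes)       # divisor n - 1
population_std = statistics.pstdev(changes)  # divisor n

print(mean)
print(round(mad, 1))
print(round(sample_std, 1))
print(round(population_std, 1))
```

Either way the standard deviation comes out noticeably higher than the mean absolute deviation, which is the gap NNT is pointing at.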
In fact, whenever people make decisions after being supplied with the standard deviation number, they act as if it were the expected mean deviation...every time a newspaper has attempted to clarify the concept of market "volatility", it defined it verbally as mean deviation yet produced the numerical measure of the (higher) standard deviation.
He says he's seen this mistake made not just in popular articles, but also in financial publications and regulatory documents. The daily temperature is just a contrived example; he's mainly talking about financial analysis (which is his field).
Naw, you'd probably need a Poisson distribution ;-).
"[I]t is a wise man who admits the limits of his knowledge or skill, and that pretending either causes harm." --Terry Go
And neither does the media-consuming public. Most would totally ignore your measure of precision regardless of whether you call it standard deviation or mean absolute deviation. For them your average is absolute and if any values aren't at all near it something is terribly wrong. They will also not rest until every school performs above average and nothing in your work will convince them otherwise. The public doesn't like uncertainty and will assume every outcome is for a special reason, and this even goes for the non-religious ones. The idea that some things aren't absolute and are actually uncertain and variable terrifies them.
Nowhere is this more apparent than in sports. Everything there is always "written in the stars" or "destiny" and if you win it always proves beyond doubt you are better than your opposition (or you were 100% cheated by the refs). Hell, journalists may have had a full article written up 2 minutes before the end of a game and then completely change everything to be about one team's dogged determination because chance would have it they scored in the last minute. I love football (soccer), but discussing it can be frustrating.
If you still believe you can convince them, use mean absolute deviation in your "executive summary" or press release and leave the standard deviation as is in your actual paper. The only ones that actually read the paper are scientists anyway. The typical journalist reading your actual paper is likely to misunderstand something in every paragraph anyway. Changing real science to pander to the masses is a fucking huge mistake.
pnWhat vIs nWrong cWith aHungarian nNotation?
Who ordered that?
No, there are always 16 ounces in a pound regardless of what you are "weighing" (measuring the mass of). Perhaps you've conflated ounce with fluid ounce -- a distinct, though confusingly named, unit.
And yes, a mile is 5280 ft = 63360 inches. I don't know where you pulled "3mm shy of" from but if you're measuring in miles and worrying about being 3mm shy, you're doing it wrong.
BZZZT! Wrong. Gold, silver, and other precious metals are measured in Troy ounces, which are slightly heavier than regular ounces. Oddly, Troy pounds only have 12 Troy ounces in them. Thus, an ounce of gold is heavier than an ounce of lead, but a pound of gold is lighter than a pound of lead.
Similarly, long distances are measured in statute (or survey) miles, which are based on a longer standard than customary measures. The volume of a hogshead is different between ale and beer.
But thanks for correcting me. Not even Americans know their own system.
When our name is on the back of your car, we're behind you all the way!
I like that little poke at journalists:
In other words, it's not just journalists who fall for the mistake, so do educated people.