The History of 'Correlation Does Not Imply Causation'
Dr Herbert West writes "The phrase 'correlation does not imply causation' goes back to 1880 (according to Google Books). However, use of the phrase took off in the 1990s and 2000s, and is becoming a quick way to short-circuit certain kinds of arguments. In the late 19th century, British statistician Karl Pearson introduced a powerful idea in math: that a relationship between two variables could be characterized according to its strength and expressed in numbers. An exciting concept, but it raised a new issue: how to interpret the data in a way that is helpful, rather than misleading. When we mistake correlation for causation, we find a cause that isn't there, which is a problem. However, as science grows more powerful and government more technocratic, the stakes of correlation — of counterfeit relationships and bogus findings — grow larger."
http://xkcd.com/552/
However, as science grows more powerful and government more technocratic, the stakes of correlation — of counterfeit relationships and bogus findings — grow larger."
Well, is science growing more powerful and finding all these false correlations? Or did these correlations always exist, and now we know enough to say they were false? Anyway, correlation is not causation.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
In what sense, exactly, does science grow more powerful? In my experience, science grows more expensive, less funded, more hyped, less understood, and overall less heeded.
Hey mate, spare a sig?
Is this correlated with the fundamental interconnectedness of all things?
... and is becoming a quick way to short-circuit certain kinds of arguments.
... Correlation does not imply causation.
It must have been something you assimilated. . . .
Correlation doesn't PROVE causation.... ...but it bloody well DOES suggest it, at least in the course of our daily lives.
The reason this phrase is so catchy is that it's counter-intuitive, and easily proven to be true. People love to use it as a "gotcha" phrase, PRECISELY because in regular life correlation does in fact usually imply causation.
In fact, correlation is used by most scientists to begin the hypothesis process. A power plant is built on a river, and the river starts drying up - most people would begin their analysis by checking on the power plant, and not the population of honeybees.
Your kid is alone in the kitchen. The cookie jar is (now) empty. Does his presence CONCLUSIVELY PROVE that he ate the cookies? Of course not, and a wise parent would find other evidence to draw a conclusion. But the correlation of their places in time and space, as well as a known predilection for cookies means that correlation strongly suggests an avenue of investigation (you're probably not going to start figuring out what happened by pursuing some other entirely different course).
It's the sort of empty-headed 'gotcha' phrase that's so popular and so often used without real thought behind it.
-Styopa
The people who mindlessly deny the possibility of causation are worse than those who compare everything to Hitler.
My Sig: SEGV
and is becoming a quick way to short-circuit certain kinds of arguments
The real problem here comes from people using that as a "short cut" to an actual argument.
On the one hand, we've done a great job at getting them to grasp that correlation does not imply causation. Now, we need to get people to understand what does - Necessary and Sufficient.
Next time someone uses that as a catch-phrase to shoot down a correlation as meaningless, ask them:
Does B require A? Necessary.
Does A lead to B? Sufficient.
QED, A causes B (or vice-versa).
Of course, my choice of the word "meaningless" there carries its own problems - Using correlation vs causation as a rhetorical shortcut to actual logic glosses over the fact that (statistically significant) correlations can have meaning (just that they don't "mean" causation). FWIW, The vast majority of modern medicine involves dealing with correlations rather than causes - "depressed people have low serotonin, prozac increases available serotonin", "people with high cholesterol have more heart attacks; lipitor reduces cholesterol". You can often use a correlation, as long as the two sides actually do link via some unknown variables. When they don't, however - Well, pirates don't prevent global warming because adding more pirates to the world doesn't somehow put us back before the industrial revolution.
Correlation suggests only Correlation. It doesn't suggest causation, but as you noted, it does suggest areas for further investigation. The relationship may or may not turn out to be directly causal.
I had the phrase "Desired: A woman who understands that correlation does not imply causality..." in my dating profile.
I married the woman who replied. Yes, I am surprised that worked as well.
Kind thoughts do not change the world
TFA does a pretty good job of explaining why. Here's something I'd like to add: no, correlation does not imply causation, in the strict mathematical meaning of "imply"; in mathematical parlance, "A implies B" means that if A is true, B will always be true as well, and of course "X is positively correlated with Y" does not mean "an increase in X causes an increase in Y." But there's another meaning of "imply," and, like the common confusion about the meaning of the word "theory" in creationist arguments, it causes a lot of problems.
In common usage, "imply" carries a lot of ambiguity with it. In fact, it's almost never used to connote mathematical certainty. If you ask me, "Did John say he stole my money?" and I reply, "He implied that he did," that is a very different response from "Yes, that's what he said." In this usage, "A implies B" means that A is something which increases our estimate of the probability of B; if A is true, we're more likely to believe that B is true as well than if we had no information about A at all.
And in this sense, yes indeed, correlation does imply causation, and if you don't understand this then you should probably stop pretending that you understand the English language. Furthermore, it makes perfect mathematical sense. If you have data on both A and B, then if you can show a positive correlation, the hypothesis of a causal relationship will be much, much stronger than if you can't. And if you show a negative correlation, then forget about it. In other words, while "correlation implies causation" isn't true in mathematical terms, the converse statement, "causation implies correlation," is true. Correlation is necessary though not sufficient for establishing causation.
Perhaps most importantly, every author of every peer-reviewed paper published in a respectable journal knows this. Next time you read some pop-sci reporting on any study in any field, and are tempted on that basis to dismiss it with "Correlation isn't causation, don't those dumb scientists know that?" ... stop. Think. Read the abstract. And if you want to discuss the results in any detail, read the paper, and understand the methodology. If it's paywalled, find a way to get access (I guarantee you that you can). And if you're unwilling to do this, then you should probably just keep quiet, because you do not know what you need to know to form an informed opinion.
BTW, the link in the summary goes to the second page of a two-page article. Here is a link to the single-page version.
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Correlation does in fact imply causation, as any scientist who knows how to make use of their nose or their gut very early in the process of creating a hypothesis to test knows. It implies. It suggests. It's just that it may very well do so incorrectly. I've already given in to the fact that use of the phrase "correlation does not imply causation" instead of the phrase "correlation is not causation" will continue to annoy me and others who similarly like to at least try to get things right.
Here's the link to page 1 for the extremely laz—editors: [Filter error: That's an awful long string of letters there.].
People will pass up steak once a week, for crap every day.
shouldn't we thank David Hume for popularizing this idea?
How does gril get pregnat? It is obvious to even the most pedantic scientist that correlation always, always, always does in fact "imply" causation. Dr. Herbert West, your PhD should be revoked for making up an obnoxious and incorrect variant of the phrase "correlation does not equal causation."
The kid there at the time, when the cookie jar is empty implies NOTHING. Perhaps the kid is still standing around wondering what to snack on precisely BECAUSE the cookie jar is empty. You correctly identify an avenue of investigation only by pointing out that the kid has a predilection for cookies - if that were not the case, his presence in the kitchen would be irrelevant. If the other kid upstairs liked cookies and this one hates that kind of cookie, you would not say their presence together is supportive of the hypothesis that he ate them - you'd go see what the other kid is munching upstairs.
And this very nicely illustrates the point of why it's unintuitive.
I go one step further, and require a basic understanding of Lorentz transformation.
And don't all of you girls here run down my mailbox, now...
But it gets the best odds in Vegas.
Fugue for Aaron Swartz
We know fossil fuel use is on the rise. We know the earth is getting warmer. So you HAVE to see the FACT that people are using more fuel to run their air conditioners precisely BECAUSE it's hotter these days. Warming causes increased energy usage. duh.
That was a joke son, I'm not trolling....
The most recent popularity of "correlation does not indicate causality" is the result of the rise of anti-intellectualism and anti-reason. It's something that stupid people say to try to sound smart, and to deny data.
Correlation is not proof, but if you see replicable continual correlation, ignoring it is dumb.
It comes from people who try to use an 18th century view that Science "creates facts", instead of "creates models that either are supported by observation or are not supported by observation". Correlation is just another observation.
It's one of the big favorites of the anti-intellectual Right and climate change deniers.
You are welcome on my lawn.
I married the woman who replied.
Boy, that escalated quickly.
>Yes, I am surprised that worked as well.
Perhaps. Or maybe it was spurious.
Maybe it worked, maybe it didn't. Surely we can't assume that just because you had a statement about correlation/causation in your profile and you married the person who responded, that one caused the other.
I always respond to that this way: "But causation does imply correlation. Since we can't directly see causes (if we could, we wouldn't be investigating looking for them) and we need something that we can see to tell us where to start looking, correlation is as good a starting point as we're going to get.".
The argument is "Cum Hoc Ergo Propter Hoc" as described in Latin. The similar phrase "Post Hoc, Ergo Propter Hoc" or "after this therefore because of this" dates back to 1704, according to Merriam Webster. I would assume "Cum Hoc Ergo Propter Hoc" is of similar age and origin.
DMCA - Chilling free speech since 1998.
Statistics can ONLY show the degree of correlation. Statistics can never show causation. So, all you're ever going to get from statistics is correlation.
That reality escapes many.
References:
1) Although [statistical] regression cannot prove causation, no statistical method can do that,
2) Epidemiological studies can never prove causation; that is, it cannot prove that a specific risk factor actually causes the disease being studied. Epidemiological evidence can only show that this risk factor is associated (correlated) with a higher incidence of disease in the population exposed to that risk factor. The higher the correlation the more certain the association, but it cannot prove the causation.
The reality is that statistics can ONLY show you whether there is a correlation or not, and how strong it is. Then it requires other methods to suggest whether there is a causative relationship.
Causation is a pretty fuzzy philosophical topic so arguing about what it is or isn't, is not terribly useful.
...I guess. So we are telling them that Correlation and Causation have close to the same meaning, got it. Like ham isn't bacon, and yet both are strangely delicious. Mmmmmm, pig meat...sorry, go on.
....I am getting more and more confused. Goddam Irish.
.. note to lib arts folks... that superior laughing feeling is what the STEM people experience when algebra dropouts try to swim in our sea of math...
...And you win them over with your suave, charming wit. SLAMDUNK!
It seems pretty simple to me. Correlation is 'Sometimes A and B are found together'. Causation is 'A causes B'. But go on...
Maybe the best way to explain it to a liberal arts grad would be something like the journey is different than the destination
or when you come to a fork in the road and see the road less traveled correlation is how you know it's less traveled,
Wait, what?
or that it's all somehow symbolic of Hemingway's Old Man And The Sea and the act of fishing is much different than the expectations about getting a fish.
Is this stats 101, or literary criticism 204?
Either that or the point of Joyce's Ulysses wasn't a numerical analysis that people walked around a hell of a lot in Ireland a hundred years ago.
STEM == Science Engineering Technology Math, I am guessing. For just a moment as I read that, I thought you were talking about some alien genetically engineered clones grown from stem cells, bent on subverting and destroying our society from within. Too bad really, because it would have tied up the explanation nicely.
so when the smart guy says something about statistics, if the 1040EZ form baffles you and you can't find the "any" key on your keyboard, that's a good sign you should probably shut up and do what the smart guy says.
Please promise me that you will never become a school teacher
HA! I just wasted some of your bandwidth with a frivolous sig!
The number of pirates skyrocketed starting in the fourth quarter of 1999 when Napster kicked off the culture of illegal peer-to-peer file sharing, yet there hasn't been too much of a cooling effect. Correlation busted.
I would argue that correlation absolutely IMPLIES causation, but does not PROVE causation.
"Sic Semper Tyrannosaurus Rex."
Well, is science growing more powerful and finding all these false correlations?
Think of it this way: When you find a correlation, there are four possibilities: A causes B, B causes A, C causes A and B, or chance. Repeating experiments helps strengthen the correlation, which diminishes the probability of chance. Further experiments varying those parts of A and B that can be controlled help distinguish among the remaining three and help identify C.
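A rough sketch of the "C causes A and B" case, and of how controlling A exposes it. Everything here is an illustrative assumption: the Gaussian noise, the sample size, and the helper names `observe` and `intervene` are made up for the example and come from no particular study.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def observe(n, rng):
    """Observational data: a hidden C drives both A and B; A never touches B."""
    a, b = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        a.append(c + rng.gauss(0, 0.5))
        b.append(c + rng.gauss(0, 0.5))
    return a, b

def intervene(n, rng):
    """Experiment: we set A ourselves, so C no longer has any say in it."""
    a, b = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        a.append(rng.gauss(0, 1))        # A assigned by the experimenter
        b.append(c + rng.gauss(0, 0.5))  # B still depends only on C
    return a, b

rng = random.Random(0)
r_observed = pearson(*observe(5000, rng))
r_intervened = pearson(*intervene(5000, rng))
print(f"observed r = {r_observed:.2f}, r after setting A by hand = {r_intervened:.2f}")
```

In this toy setup the observed correlation is strong even though A has no effect on B, and it should fall to roughly zero once A is set independently of C. Telling "A causes B" from "B causes A" would take the mirror-image experiment on B.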
This is actually the most reasonable definition of "causation" for the following values:
"Hoc"= state of the Universe at a given instant
"Post Hoc"= state of the Universe at (given instant + infinitesimal interval)
Set your phasers on "funky"!
No, but only because of Betteridge.
Escher was the first MC and Giger invented the HR department.
That's why experiments are repeated: to make the probability of chance smaller.
So by putting that phrase in your profile you get married if a woman replies? Can you leave out words for more casual relationships???
Global warming and the decline of pirates comes to mind.
There hasn't been a "decline of pirates" since 1999.
Then Causation: If first A then B, all the time.
It is pretty clear that these two statements do not have any implication. There is nothing in Correlation's definition at all about 'first', which determines whether A causes B or B causes A. In addition, it is clearly worded to avoid the definitive 'all the time' which is necessary for causation.
Basically, Causation causes Correlation, but not the other way around. It is exactly as likely to be reversed, and also possible it is a third cause, or even random correlation. If A correlates with B, then A might cause B, but B is just as likely to cause A, or both could be caused by C, or it could be random chance.
Saying that Correlation implies causation is like saying that living in a penthouse causes you to be wealthy. Yes most people that live in penthouses are wealthy, but not all. Some penthouses are in slums. And even so, the wealth causes the penthouse, not the other way around.
excitingthingstodo.blogspot.com
It is absolutely true that correlation does not imply causation, but people seem to use it (especially on here) as if it magically refutes everything. Usually more so when they don't want it to be true, or just don't want to deal with it.
I hit you with a bat.
You are bleeding on the ground.
'But.. but.. correlation does not imply causation.. Maybe I started bleeding spontaneously...'
I will shred my adversaries. Pull their eyes out just enough to turn them towards their mewing, mutilated faces. Illyria
it could just as likely be that an unknown C is the cause of both A and B.
The trouble with that, as we hashed out the last time we argued about this, is that people say "correlation does not prove causation" to mean "you haven't already discovered C; therefore, it is futile to spend any resources to discover C."
The funniest part is where the author claims that throwing out the phrase stops arguments in their tracks. I guess he doesn't spend much time around here.
I've recently started trying to throw out "Causation doesn't always mean correlation" whenever I find a situation where it makes sense.
E.g. a recent Wired article talked about statistical analysis "proving" that the idea of a football team gaining "momentum" after an interception is a myth.
I think it's a fair assumption that sometimes a team gets momentum from an interception... but other times the team who lost the ball gets fired up in response. And lots of other times there is no clear advantage one way or the other. But the overall statistics being a wash doesn't mean there aren't specific effects going on at a finer scale that have been missed by big picture statistics.
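A toy simulation of that "wash": an event that really does move the outcome, but in opposite directions for two hidden groups, so the pooled correlation comes out near zero. The effect sizes, the 50/50 split, and the variable names are all assumptions for illustration, not football data.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(1)
rows = []
for _ in range(4000):
    event = rng.choice([0, 1])      # 1 = the event happened (hypothetical)
    response = rng.choice([-1, 1])  # hidden: does this team rally or deflate?
    outcome = response * event + rng.gauss(0, 0.3)
    rows.append((event, response, outcome))

def r_for(selected):
    """Correlation between event and outcome within the selected rows."""
    return pearson([e for e, _, _ in selected], [o for _, _, o in selected])

r_all = r_for(rows)
r_fired_up = r_for([row for row in rows if row[1] == 1])
r_deflated = r_for([row for row in rows if row[1] == -1])
print(f"pooled r = {r_all:.2f}, rallying teams r = {r_fired_up:.2f}, "
      f"deflated teams r = {r_deflated:.2f}")
```

Under these assumptions every single outcome is causally affected by the event, yet the pooled number looks like nothing is going on. Splitting on the hidden variable is what brings the effect back, which of course requires knowing or guessing that variable.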
correlation measures tendency towards a linear fit
This is true of Pearson correlation, but there exist several other correlation measures. Relying exclusively on Pearson correlation to disprove causation is itself a fallacy.
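A small sketch of why the choice of measure matters, using made-up data where y is completely determined by x. The Spearman helper below is a hand-rolled version (Pearson applied to ranks) and assumes no tied values, so it is only applied to the tie-free series.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """Rank positions 1..n; assumes no ties, which holds where it is used below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))

xs = list(range(-10, 11))
cubes = [x ** 3 for x in xs]    # monotonic in x, but not a straight line
squares = [x ** 2 for x in xs]  # determined by x, but not monotonic

pearson_cubes = pearson(xs, cubes)
spearman_cubes = spearman(xs, cubes)
pearson_squares = pearson(xs, squares)
print(f"cubes:   Pearson {pearson_cubes:.3f}, Spearman {spearman_cubes:.3f}")
print(f"squares: Pearson {pearson_squares:.3f}")
```

For the cubes, the rank-based measure reports a perfect relationship while Pearson falls short of 1. For the squares, Pearson is zero up to rounding even though y is an exact function of x, so a near-zero Pearson value on its own says little about whether a dependence exists.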
Dr. Herbert West, your PhD should be revoked for making up an obnoxious and incorrect variant of the phrase "correlation does not equal causation."
We're dealing with an equivocation here, and unrecognized equivocation stops useful debate. The English word "imply" has two meanings, one weak and one strong: "suggest" and "prove". Yes, correlation suggests causation, but it doesn't prove it. What a strong correlation does prove, however, is that a search for what causal relationship underlies this correlation is far more likely than not to promote the progress of science.
It comes from people who try to use an 18th century view that Science "creates facts", instead of "creates models that either are supported by observation or are not supported by observation".
That or "science creates models, but I can't see how these models are so useful in the daily lives of those around me, so I refuse to endorse borrowing money from China and Japan to fund creating these toy models."
So, what traits did you correlate with that trait? ;)
Funny you should say that. My recent studies have led me to the conclusion that being a vegetarian causes you to become a bad painter.
Da Vinci was a fine painter!
You think soft scientists have problems with repeatability?
It's the sweet scientists that I'm worried about.
-kgj
We can still burn witches for being left-handed, right?
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
The problem is too many studies and 'correlations' are based on statistically insignificant sample sets. People don't seem to get this little detail. Without statistically significant sample sets the correlations are virtually useless.
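To put a rough number on the sample-size point: the sketch below draws pairs of completely unrelated random samples and records the largest correlation that turns up by chance. The sample sizes (5 and 500) and the 200 trials are arbitrary choices for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def largest_chance_r(sample_size, trials, rng):
    """Largest |r| seen across many pairs of independent random samples."""
    best = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        best = max(best, abs(pearson(xs, ys)))
    return best

rng = random.Random(2)
small = largest_chance_r(5, 200, rng)
large = largest_chance_r(500, 200, rng)
print(f"largest chance |r| with n=5: {small:.2f}, with n=500: {large:.2f}")
```

With only five points per sample, looking at a couple of hundred unrelated pairs is enough to turn up an impressive-looking correlation. With five hundred points the largest fluke should be far smaller.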
I *wish* government was becoming more technocratic, given the original meaning of the term.
Our ignorance is not so vast as our failure to use what we know. - M. King Hubbert
This reply usually confuses them enough to go away.
Well, reading your comment, are you now persuaded that correlation does imply causality?
http://xkcd.com/552/
Joachim
People don't write Manifestos any more -- what's going on in this world? [Frank Zappa]
I always thought that it rained worms during a heavy shower, because they were all over the sidewalk afterward. And now I find that the worms on the sidewalk correlated with the storm, but the storm wasn't causative.
"... That's inferring a narrative explanation from a conclusion relative to your subjective perception of the likelihood of various premises that lead to that conclusion. ..."
I think you may be full of shit. Maybe you should see someone about it. And you might be massively undervaluing our mishmashes. If they didn't work as kind of OK correlators we wouldn't be here
Easy way to tell if the invoker of the phrase does not have a clue is if they state it as "correlation is not causation." This is a truism and, as such, never carries any information. Correlation does not imply causation actually states something entirely different -- that the argument is reaching (rather than that it is false).
Any guest worker system is indistinguishable from indentured servitude.
As a matter of fact, it compares two trends linearly. One could, in fact, say that it measures how aligned two trends are. Its calculation actually measures the cosine of the angle between the two vectors represented by the two data sets, once each is centered on its mean. It's just that being aligned does not mean that one of the occurrences caused the other.
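A quick check of the cosine description, on a tiny made-up data set. One detail worth spelling out: the identity holds for the vectors after each data set has had its own mean subtracted. The function names are just labels for this sketch.

```python
import math

def pearson_textbook(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return cov / (sx * sy)

def pearson_as_cosine(xs, ys):
    """Cosine of the angle between the two mean-centered data vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    u = [x - mx for x in xs]
    v = [y - my for y in ys]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_textbook = pearson_textbook(xs, ys)
r_cosine = pearson_as_cosine(xs, ys)  # dot product 8 over norms sqrt(10) * sqrt(10), about 0.8
print(r_textbook, r_cosine)
```

The two routes agree up to floating-point rounding, since the (n - 1) factors in the textbook form cancel.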
Any guest worker system is indistinguishable from indentured servitude.
"There are many results that have false causation because an ignored variable was hiding there."
I call bullshit.
You may just possibly be able to find one. I bet it takes you bloody ages to find it and is a tiny tiny fraction of the ones that didn't display this effect.
That is why you want 95% disproof of the null hypothesis: the rejection of the possibility that you just happened to pick a period that had them changing the same direction modulo some factor.
And, in climate change, the null hypothesis HAS been rejected.
The causation is there.
But deniers cry "Prove the causation", then you show the evidence of that causation and they cry "But that is in a lab, not the atmosphere", so you show a computation of it in an atmosphere, so they cry "that's a model, not a real atmosphere!", so you show them the data that shows the correlation that you expect to result from that causation and that is actually seen in the records and they cry "Correlation is not causation!".
Hence the complete bollocks on this site and every other about some dipshit claiming "Correlation =/= Causation" merely so they can shut their eyes and go "lalalala! not listening!".
That's obvious.
He wanted a wife that understood that just because he was three hours late coming home from work, smelling of beer and cheap perfume, with lipstick on his collar, does NOT imply he went to a bar and picked up. :-)
Single page instead of page 2 only
but they correlate pretty well.
The article about "Internet usage and depression" didn't just talk about correlation. In the last few paragraphs, it pretends that the features that correlate with depression are automatically "signs" and "symptoms" of depression. A symptom in medicine is something that is indeed caused by the disease, not only correlated with it. You wouldn't call a high leukocyte level a symptom of fever, even if they co-occur in an infection.
OTOH looking at the /. post they cite:
> There are so many variables here that it isn't funny. I frequently cringe when I see social science "foo linked to bar says study" headlines. There are so many ways to cut the data
Now if that kind of language doesn't imply depression...
Data arises from retrospective or prospective studies.
Retrospective data was created before a statistician could design an experiment.
Prospective data sees the statistician set various levels of a variable
to randomly selected experimental units (maybe people, maybe production machines).
In a (prospective) experiment, an observed correlation implies causation.
For example, in manufacturing plastic, keeping constant other variables (humidity and speed of production),
set the temperature sometimes at 100 degrees and sometimes at 200 degrees,
randomly choosing the order these temperatures get applied.
If the 200 degree temperature produces a stronger plastic (response or dependent variable),
then your positive correlation implies causation.
In the future, knowing that increased heat increases plastic strength, the manufacturer would raise the temperature.
But experiments consume time and money, so institutions not individuals usually perform them.
Million dollar clinical trials do determine whether a drug is effective.
While experimental economics can determine causation, most economics is retrospective, so conclusions become controversial.
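A sketch of why the random ordering in the plastic example matters. The strength model is entirely hypothetical: it assumes a true heat effect of +5 strength units between 100 and 200 degrees, plus a lurking warm-up drift over the run that the experimenter failed to hold constant. Neither number comes from the comment above.

```python
import random

def run_batch(temperature, position, rng):
    """Hypothetical strength model: heat helps, and so does machine warm-up."""
    heat_effect = 0.05 * temperature  # assumed: +5 units per extra 100 degrees
    drift = 0.1 * position            # assumed lurking effect of run order
    return heat_effect + drift + rng.gauss(0, 1)

def estimated_effect(schedule, rng):
    """Mean strength at 200 degrees minus mean strength at 100 degrees."""
    hot, cold = [], []
    for position, temperature in enumerate(schedule):
        strength = run_batch(temperature, position, rng)
        (hot if temperature == 200 else cold).append(strength)
    return sum(hot) / len(hot) - sum(cold) / len(cold)

rng = random.Random(3)
blocked = [100] * 100 + [200] * 100  # all cold runs first, then all hot runs
shuffled = blocked[:]
rng.shuffle(shuffled)                # random order, as in the comment above

biased = estimated_effect(blocked, rng)
fair = estimated_effect(shuffled, rng)
print(f"true effect 5.0, blocked order estimates {biased:.1f}, "
      f"random order estimates {fair:.1f}")
```

In the blocked schedule the warm-up drift is perfectly tangled up with temperature, so the estimate is inflated by about 10 units. Randomizing the order spreads the drift across both temperature settings, so the estimate should land much closer to the assumed true effect, with some residual noise from the particular shuffle.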
Sea pirates are clearly a much stronger source of global cooling than Napster 'pirates'.
Busted in Somalia. Pirates attacked ships off the Somali coast 151 times in 2011, once for each Pokemon in the original Game Boy games.
perhaps it's the value plundered by the pirates.
BBC estimates the 2011 plunder at $146 million.
I had the phrase "Desired: A woman who understands that correlation does not imply causality..." in my dating profile.
I married the woman who replied. Yes, I am surprised that worked as well.
But correlation does not imply causality, so you don't know for sure that it worked!
Some people are lucky.
that could test, is correlation is causation?
but the Judge ruled the co-relation was clear.
If people want to deal with social science causation, they must stop arguing and start experimenting. But how? How can we experiment in the social sciences in a way that demands consent of the human subjects at the same time as providing experimental control?
The answer is Secession from Slavery to Free Scientific Society:
Secession from Slavery to Free Scientific Society
by James Bowery
INTRODUCTION
Secession is necessary to free society. Free society starts with mutual consent. Mutual consent implies the option not to consent. "Freedom From" complements "Freedom To".
Secession is necessary to true social science: We can best discover causal laws by testing theories with controlled experiments. This is true of all science. Controlled experiments require separate experimental groups, treated according to different theories and comparing the measured results with predictions. In practice, human ecologies can form separate experimental groups only by upholding geographic boundaries that prevent cross-contamination between treatments – cross-contamination with its resulting confusion and confounding of results. We can argue how best to achieve this in practice, but the principle of giving experimental evidence priority over any amount of argument, debate, deliberation, peer review or judicial proceeding stands as more self-evident than anything in the Declaration of Independence.
In a free scientific society, an individual is subject to treatment only after giving informed consent.
These two pillars of social good -- truth and freedom -- stand upon the foundation of secession.
Tyranny of the majority, limited only by a vague laundry list of selectively enforced human rights -- the sine qua non of "liberal democracy" -- must submit to the right to secede or it violates truth and freedom, hence all social good.
SLAVERY
Getting right to the point that people need addressed whenever "secession" is uttered:
Abolition of slavery is support of individual secession.
Slaves want to secede from their "owners" just as others want to -- and do -- secede from societies they find objectionable. The difference between slavery and others turns solely on whether the individual's right to secede is realized. All who are denied secession are slaves: their consent is violated.
If men from Maine choose to support the right of secession of slaves by marching on South Carolina to kill unrepentant slave owners -- every last one of them -- those men from Maine in no way lose their own rights. Men retaining their humanity may differ over whether it is wisest to intervene in such a way – or to intervene at all. For example, should a government which is capable of raising taxes do so for the waging of war against slavery or, better for the purchase of slaves to be freed from their dependent owners? Eminent domain “taking” arguments aside, just men may, as well, differ over whether it is wisest to put down a rabid animal, or to treat it. The compromise upon which the United States was founded was flawed, perhaps fatally, by its incorporation of slave states.
Likewise, this in no way supports the 14th Amendment to the United States Constitution or The Union. It supports only the 13th Amendment. Despite Title VII of the Civil Rights Act of 1964's pretenses to the contrary, it is still a "badge of slavery" to be forced into association with others. Likewise the Immigration and Nationality Act of 1965 compounded this badge of slavery born of the so-called “Civil Rights Movement”.
"Freedo
Seastead this.