Cause and Effect: How a Revolutionary New Statistical Test Can Tease Them Apart

Posted by timothy on Thursday December 18, 2014 @06:10AM from the submission-caused-post dept.

KentuckyFC writes Statisticians have long thought it impossible to tell cause and effect apart using observational data. The problem is to take two sets of measurements that are correlated, say X and Y, and to find out if X caused Y or Y caused X. That's straightforward with a controlled experiment in which one variable can be held constant to see how this influences the other. Take, for example, a correlation between wind speed and the rotation speed of a wind turbine. Observational data gives no clue about cause and effect, but an experiment that holds the wind speed constant while measuring the speed of the turbine, and vice versa, would soon give an answer. But in the last couple of years, statisticians have developed a technique that can tease apart cause and effect from the observational data alone. It is based on the idea that any set of measurements always contains noise. However, the noise in the cause variable can influence the effect but not the other way round. So the noise in the effect dataset is always more complex than the noise in the cause dataset. The new statistical test, known as the additive noise model, is designed to find this asymmetry. Now statisticians have tested the model on 88 sets of cause-and-effect data, ranging from altitude and temperature measurements at German weather stations to the correlation between rent and apartment size in student accommodation. The results suggest that the additive noise model can tease apart cause and effect correctly in up to 80 per cent of the cases (provided there are no confounding factors or selection effects). That's a useful new trick in a statistician's armoury, particularly in areas of science where controlled experiments are expensive, unethical or practically impossible.
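
A minimal sketch of the asymmetry described in the summary, not the method from the paper. It assumes a linear relationship with uniform (non-Gaussian) noise, fits a straight line in each direction, and scores how strongly the size of the residuals depends on the predictor. The function names and the squared-value correlation score are assumptions made for illustration; as far as I know, real additive noise model implementations use nonparametric regression and a proper independence test instead.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Plain Pearson correlation coefficient."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(predictor, target):
    """Residuals of the least-squares line target = a + b * predictor."""
    mp, mt = mean(predictor), mean(target)
    slope = (sum((p - mp) * (t - mt) for p, t in zip(predictor, target))
             / sum((p - mp) ** 2 for p in predictor))
    intercept = mt - slope * mp
    return [t - (intercept + slope * p) for p, t in zip(predictor, target)]


def dependence_score(predictor, target):
    """Crude dependence proxy (an assumption of this sketch):
    |correlation| between squared residuals and squared centred predictor."""
    res = residuals(predictor, target)
    mp = mean(predictor)
    return abs(pearson([r * r for r in res],
                       [(p - mp) ** 2 for p in predictor]))


def infer_direction(x, y):
    """Prefer the direction whose residuals look less dependent on the input."""
    score_xy = dependence_score(x, y)  # model: x causes y
    score_yx = dependence_score(y, x)  # model: y causes x
    direction = "x->y" if score_xy < score_yx else "y->x"
    return direction, score_xy, score_yx


# Synthetic data in which x really is the cause: y = x + independent noise.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(2000)]
y = [xi + rng.uniform(-1, 1) for xi in x]

direction, score_xy, score_yx = infer_direction(x, y)
print(direction, round(score_xy, 3), round(score_yx, 3))
```

The direction with the smaller score is the one in which the leftover noise looks independent of the input, so it is taken as the causal one. With Gaussian noise and a linear fit both scores would sit near zero and this sketch could not decide, which is one reason the technique is not a universal answer.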

137 comments

No problem. by TechyImmigrant · 2014-12-18 06:15 · Score: 4, Insightful

>provided there are no confounding factors or selection effects
So that'll provide plenty of material for medical researchers, nutrition researchers, education researchers and economists to keep doing what they're doing.

--
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
1. Re:No problem. by digsbo · 2014-12-18 06:23 · Score: 0
  
  You left out people with an opinion on climate change.
2. Re:No problem. by Noah+Haders · 2014-12-18 06:41 · Score: 5, Funny
  
  one weird trick to separate cause and effect!
3. Re:No problem. by TechyImmigrant · 2014-12-18 06:45 · Score: 1
  
  With good reason.
  
  --
  I should use this sig to advertise my book ISBN-13 : 978-1501515132.
4. Re:No problem. by Anonymous Coward · 2014-12-18 06:50 · Score: 2, Funny
  
  Statisticians HATE him!
5. Re:No problem. by Anonymous Coward · 2014-12-18 07:30 · Score: 5, Insightful
  
  I can't think of any cases where I know there are no confounding factors but don't know which is the cause and which is the effect.
  Also, when it comes to medical stuff, or any human observational study, I can't think of any that don't have selection effects as well. It's a neat trick, but I honestly can't think of a single case where it applies in a useful way. Does anyone have an example?
  The article starts with this example of a confounding factor (which makes this test not applicable):
  
  That turned out to be an erroneous conclusion. Later studies showed that women who took hormone replacement therapy were likely to be from higher socio-economic groups with higher incomes, better diets and generally healthier outcomes. It was this that caused the correlation the earlier studies had found. By contrast, proper randomised controlled trials showed that hormone replacement therapy actually increased the risk of heart disease.
  This test may sometimes be able to provide evidence against causation in such cases (which is useful), but it can't determine causation (because there may be confounding factors). That may be newsworthy, but it deserves a more accurate headline: new statistical test can form confidence bounds for how unlikely it would be for a new parameter to be of this magnitude if there were causation: when combined with existing tests it may discredit more potential claims of causation than previously practical.
6. Re:No problem. by wiredlogic · 2014-12-18 07:38 · Score: 2
  
  At least PBS will be able to keep up their snake oil infomercials and I won't feel guilty for not supporting them.
  
  --
  I am becoming gerund, destroyer of verbs.
7. Re:No problem. by TechyImmigrant · 2014-12-18 07:53 · Score: 1
  
  > That may be newsworthy, but it deserves a more accurate headline: new statistical test can form confidence bounds for how unlikely it would be for a new parameter to be of this magnitude if there were causation: when combined with existing tests it may discredit more potential claims of causation than previously practical.
  Bingo. You have won the internets.
  
  --
  I should use this sig to advertise my book ISBN-13 : 978-1501515132.
8. Re: No problem. by tobenemo32 · 2014-12-18 07:57 · Score: 0
  
  I can differentiate the cause and effect of wind speed vs. wind turbine rotation. Wind will cause the wind turbine to turn, while the wind turbine rotating will not cause wind, unless the generator of the wind turbine doubles as a motor to turn the propellers and cause the wind. Prove me wrong.
9. Re:No problem. by Verdatum · 2014-12-18 08:55 · Score: 1
  
  God. I seriously miss the PBS before this. I guess I'm in the minority, but I'd love it if more of my tax dollars went to them once again so they didn't have to pull that shit anymore.
10. Re: No problem. by TechyImmigrant · 2014-12-18 09:07 · Score: 4, Insightful
  
  If you stop the wind all of a sudden, the turbine will continue to turn, causing wind, until the energy in the turbine is spent.
  
  --
  I should use this sig to advertise my book ISBN-13 : 978-1501515132.
11. Re:No problem. by timeOday · 2014-12-18 11:16 · Score: 1
  
  The whole question of "which direction is the causality" is misleading in the first place; pure, uni-directional causality in situations of interest to people is almost non-existent. What we should usually look for is stable configurations ("stable" not implying "good," as in poverty), and self-reinforcing cycles (whether virtuous or vicious). Even if manipulating A causes B to change, it may also be that manipulating B would cause A to change.
12. Re:No problem. by Anonymous Coward · 2014-12-18 12:02 · Score: 0
  
  I think there was a recent case of misunderstood cause and effect with some medication, big data is just getting going, there will be others.
13. Re:No problem. by Capsaicin · 2014-12-18 17:26 · Score: 1
  
  [I]t deserves a more accurate headline: new statistical test can form confidence bounds for how unlikely it would be for a new parameter to be of this magnitude if there were causation: when combined with existing tests it may discredit more potential claims of causation than previously practical.
  Did you really need to give it such an obviously click-baity title?
  
  --
  Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke
14. Re: No problem. by MichaelBrassell · 2014-12-18 17:31 · Score: 1
  
  When machine learning based on this model is applied to the improvement of this statistical model we might have a problem.
15. Re:No problem. by icebike · 2014-12-18 19:55 · Score: 1
  
  The other thing they fail to understand is that causality is patently obvious in the vast majority of cases where there are no confounding factors.
  Probably the social sciences are most in need of tests like this, as they are always trying to pin some outcome on some input in a bubbling cauldron of alternatives. But of course, the cauldron is full of confounding factors.
  
  --
  Sig Battery depleted. Reverting to safe mode.
16. Re:No problem. by soccerisgod · 2014-12-18 23:32 · Score: 1
  
  You must be new here.
  
  --
  If a train station is a place where a train stops, what's a workstation?
17. Re:No problem. by rdnetto · 2014-12-19 04:16 · Score: 2
  
  I suspect the test could be generalized to work for N variables, since the noise should increase as we move along a causal chain. The only issue is the exponential drop-off in confidence. If the accuracy could be improved, it could be quite useful for deriving or verifying Bayesian networks.
  
  --
  Most human behaviour can be explained in terms of identity.
18. Re:No problem. by gzuckier · 2014-12-19 08:26 · Score: 2
  
  Indeed. Famous, perhaps apocryphal, finding that storks bring babies, correlating postwar stork population in Europe with birth rate; confounding factor was spike in marriage rate after war, resulting in more babies, and more houses (in the chimneys of which storks nest). Can't take that apart by comparing noise rate in stork count and in baby count.
  
  --
  Star Trek transporters are just 3d printers.
19. Re:No problem. by gzuckier · 2014-12-19 08:28 · Score: 2
  
  The other thing they fail to understand is that causality is patently obvious in the vast majority of cases where there are no confounding factors.
  Probably the social sciences are most in need of tests like this, as they are always trying to pin some outcome on some input in a bubbling cauldron of alternatives. But of course, the cauldron is full of confounding factors.
  You're still going to need to elucidate a reasonably valid mechanism to convince anybody of anything.
  
  --
  Star Trek transporters are just 3d printers.
20. Re: No problem. by gzuckier · 2014-12-19 08:30 · Score: 2
  
  I still believe that the only reason fridges and freezers are cold is because you keep buying cold stuff, and putting it into them. That's why they need to be insulated. All those electric motors and stuff just keep chugging heat out.
  
  --
  Star Trek transporters are just 3d printers.
21. Re:No problem. by david_thornley · 2014-12-19 08:49 · Score: 1
  
  One thing we're very interested in is the consequences of certain actions. Suppose we were to decriminalize many drugs, so nobody would ever be sent to prison for them: what effects would that have? I personally am very interested in the influence of diet on heart disease, and while a lot of people are willing to tell me what they know, they contradict each other.
  
  --
  "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
22. Re:No problem. by digsbo · 2014-12-19 08:57 · Score: 1
  
  Is the good reason that climate change doesn't have confounding factors and can be tested via experimentation?
23. Re:No problem. by mcswell · 2014-12-20 18:42 · Score: 2
  
  It may be obvious, but that doesn't mean it isn't contested. An example (which Pearl uses in his book Causality) is lung cancer and smoking. It was obvious to most people that smoking caused lung cancer, but another possibility was that there was a genetic predisposition to lung cancer, and that genetic factor also caused people to want to smoke. The tobacco industry in fact argued this, and (IIUC) it took some time before the direction of causation could be established in the legal sense.
  An example I heard about just yesterday involved exercise and health. The question was not so much whether exercise improved health (that is obvious), but how the causation worked. The study I read about said exercise had been shown to cause methylation of DNA. Establishing that causal relationship was done experimentally by having people exercise one leg and not the other.
  Causation in economics is also hard to establish (I'm told--I'm glad not to be an economist).
  So no, I don't think causation is always obvious.
Great... by Anonymous Coward · 2014-12-18 06:16 · Score: 0

So once we start using this on everything, 1 out of every 5 times, it will lead us to bogus conclusions with false statistical confidence....
1. Re:Great... by Black+Parrot · 2014-12-18 07:18 · Score: 2
  
  The standard t-test for detecting an effect is already probabilistic. In science and medicine a 95% confidence value is commonly used, which means a 1/20 chance of detecting something that isn't there.
  
  --
  Sheesh, evil *and* a jerk. -- Jade
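The 1-in-20 figure in the comment above is the false-positive rate when there is truly no effect, and it can be checked with a small simulation. This is a sketch under stated assumptions: it uses a z-test with a known standard deviation of 1 rather than a t-test, and the sample size, seed, and number of experiments are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(1)
n_experiments = 4000   # number of simulated studies (arbitrary)
n = 30                 # observations per study (arbitrary)
critical = NormalDist().inv_cdf(0.975)  # two-sided 5% threshold, about 1.96

false_positives = 0
for _ in range(n_experiments):
    # The null hypothesis is true: every sample has mean 0 and sd 1.
    sample = [rng.gauss(0, 1) for _ in range(n)]
    z = fmean(sample) * n ** 0.5  # z statistic for a known sd of 1
    if abs(z) > critical:
        false_positives += 1

rate = false_positives / n_experiments
print(f"false positive rate: {rate:.3f}")
```

The rate hovers around 0.05 because the threshold was chosen to produce exactly that error rate under the null. It says nothing about how often a real effect is missed, which depends on the power of the test.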
2. Re:Great... by Anonymous Coward · 2014-12-18 07:52 · Score: 0
  
  The standard t-test for detecting an effect is already probabilistic. In science and medicine a 95% confidence value is commonly used, which means a 1/20 chance of detecting something that isn't there.
  More and more, confidence levels are being replaced with the actual power of the test in reporting results in scientific reports. Confidence levels are really a throwback to when you used statistical tables rather than the luxury we have today of actually computing values from the probability distribution/density functions.
3. Re:Great... by Anonymous Coward · 2014-12-18 08:01 · Score: 0
  
  So that is why crit failures happen quite often. Nice.
4. Re:Great... by Zephyn · 2014-12-18 08:33 · Score: 2
  
  So once we start using this on everything, 1 out of every 5 times, it will lead us to bogus conclusions with false statistical confidence....
  Apparently the Trident Gum people have been using this for decades.
5. Re:Great... by ConceptJunkie · 2014-12-18 09:26 · Score: 3, Funny
  
  So once we start using this on everything, 1 out of every 5 times, it will lead us to bogus conclusions with false statistical confidence....
  So, a vast improvement then? ;-)
  
  --
  You are in a maze of twisty little passages, all alike.
6. Re:Great... by Decker-Mage · 2014-12-18 15:20 · Score: 1
  
  The standard t-test for detecting an effect is already probabilistic. In science and medicine a 95% confidence value is commonly used, which means a 1/20 chance of detecting something that isn't there.
  Unless things have been radically relaxed in the last decade, the standard in hard sciences and medicine remains a 99% confidence interval. It's the social sciences that allow for a 95% confidence interval. Having worked in all the different schools out there, I think I have some confidence in my assertion.
  
  --
  "[I]t is a wise man who admits the limits of his knowledge or skill, and that pretending either causes harm." --Terry Go
7. Re:Great... by SkimTony · 2014-12-18 16:33 · Score: 1
  
  Some confidence? Would you give yourself a 99% confidence interval, or only a 95%?
8. Re:Great... by Capsaicin · 2014-12-18 17:31 · Score: 1
  
  Would you give yourself a 99% confidence interval, or only a 95%?
  The question of the different criteria used in different fields of research is itself a social science question. Using OP's own criteria, they would require only 95% confidence. Obvious ... no?
  
  --
  Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke
9. Re:Great... by tgv · 2014-12-18 19:18 · Score: 1
  
  You've got your statistics all wrong: you misrepresent significance testing, and overlook that t-tests are only suitable for a small range of problems. Plus it doesn't bear on the discussion of causality. You should have been downmodded into oblivion.
10. Re:Great... by gzuckier · 2014-12-19 08:32 · Score: 1
  
  Yeah, you can make anything significant if you use like the entire population of the US as your population; on the other hand, really obvious effects will not reach significance if you use the currently affordable study populations of like 40 people.
  
  --
  Star Trek transporters are just 3d printers.
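The sample-size point in the comment above can be made concrete with the arithmetic of a two-sample z-test. This is a sketch under assumptions: effects are expressed as standardized differences (difference in means divided by the standard deviation), the observed difference is taken to equal the true one, and the group sizes are illustrative numbers, not data from any study.

```python
from statistics import NormalDist


def two_sample_p(effect_size, n_per_group):
    """Two-sided p-value of a two-sample z-test with equal group sizes.

    effect_size is the standardized difference in means; the observed
    difference is assumed to equal it exactly (no sampling noise).
    """
    z = effect_size * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


# A negligible effect measured on a nation-sized sample (hypothetical numbers).
p_tiny_effect_huge_n = two_sample_p(0.001, 160_000_000)

# A moderate effect measured on 40 people, 20 per group.
p_real_effect_small_n = two_sample_p(0.5, 20)

print(p_tiny_effect_huge_n < 0.05, p_real_effect_small_n < 0.05)
```

Because the z statistic grows with the square root of the sample size, the p-value mostly measures how much data was collected once the sample is large enough. Effect sizes and intervals are more informative than the significance flag alone.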
11. Re:Great... by Cinnamon+Beige · 2014-12-19 08:54 · Score: 1
  
  Some confidence? Would you give yourself a 99% confidence interval, or only a 95%?
  95% confidence. It depends on what you're testing, the purpose of the test, and the design of the experiment. For example, in some cases you might go with 90% simply because you're doing a pilot study--think of it as beta testing, or perhaps the alpha testing round. These are typically small and, well, simple, and you may go with a higher alpha simply because you're doing rough measurements to see if it works at all before investing the resources into doing a larger study with a lower alpha.
  On the other hand, some large medical experiments may even go for a 99.5% confidence interval, both because a huge sample population makes it possible and because it is important to be as certain as possible.
  100% certainty basically translates as "Numbers were pulled from anus."
Always by phantomfive · 2014-12-18 06:16 · Score: 3, Interesting

So the noise in the effect dataset is always more complex than the noise in the cause dataset....... the additive noise model can tease apart cause and effect correctly in up to 80 per cent of the cases
In other words, not always.

--
"First they came for the slanderers and i said nothing."
1. Re:Always by Mr+D+from+63 · 2014-12-18 06:25 · Score: 5, Interesting
  
  This is the tricky part, and it seems to work if you know exactly the cause and effect in advance, so you know which data to look at. It is quite clever though, and would seem to have application as an indicator if nothing else.
  
  I recall some equipment monitoring techniques used in my industry. There were reams of data. If a piece of equipment failed, you could go back and look at the data and see that there were indications. But filtering those indications out as useful input was always the problem. Only the blatant, in your face indications were caught. I see a similar problem here, that you might be able to show cause and effect with this data in hindsight, but it won't be so clear when you don't know the answer already.
2. Re:Always by phantomfive · 2014-12-18 06:35 · Score: 3, Interesting
  
  Indeed, it's easy to think of situations where the opposite is true, where the noise is simpler in the 'effect' than in the 'cause,' because there is some attenuation factor in between that reduces the noise. That's more or less what a damper or shock absorber is designed to do. And a low pass filter in audio does the same thing.
  
  Now you might say, "obviously a low-pass filter is in the way, and that's causing the difference" but that gets back to your point, where it's easy to figure out when you already know the system, but if you don't, then it's not so easy.
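
The attenuation argument above can be illustrated with a moving average, which is about the simplest low-pass filter there is. This is a sketch under assumptions: the cause is pure Gaussian noise, the effect is its 10-sample moving average, and the window length and seed are arbitrary.

```python
import random
from statistics import pvariance

rng = random.Random(2)
window = 10  # length of the moving-average filter (arbitrary)

# The "cause": a noisy input signal.
cause = [rng.gauss(0, 1) for _ in range(5000)]

# The "effect": the same signal after a simple low-pass filter.
effect = [
    sum(cause[i - window + 1 : i + 1]) / window
    for i in range(window - 1, len(cause))
]

var_cause = pvariance(cause)
var_effect = pvariance(effect)
print(f"cause variance: {var_cause:.3f}, effect variance: {var_effect:.3f}")
```

Averaging ten independent samples cuts the variance to roughly a tenth, so here the effect is visibly less noisy than its cause. A method that assumes the effect always carries the extra noise would have to model the filter to avoid being misled.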
  
  --
  "First they came for the slanderers and i said nothing."
3. Re:Always by Anonymous Coward · 2014-12-18 06:35 · Score: 0
  
  No, always, and then the current test for it only works in 80% of cases. "In other words" should not prefix a non sequitur.
4. Re:Always by TechyImmigrant · 2014-12-18 06:43 · Score: 1
  
  Data dependent changes. They're a problem in statistics and they're evil in crypto.
  
  --
  I should use this sig to advertise my book ISBN-13 : 978-1501515132.
5. Re:Always by aaaaaaargh! · 2014-12-18 06:58 · Score: 0
  
  I don't understand the initial problem. If A temporally precedes B, then B cannot have caused A but it may have been caused by A. (Unless there is backwards causation, which presumably violates many laws of physics.)
  So why can't they properly take into account time?
6. Re:Always by Anonymous Coward · 2014-12-18 07:01 · Score: 0
  
  They didn't set up NTP on their servers, so the timestamps are FUBAR'd.
7. Re:Always by dinfinity · 2014-12-18 07:03 · Score: 1
  
  Seriously? You clicked on an article on statistics, yet haven't the faintest clue what 'correlation' means?
  Try going to Yahoo Answers instead of Slashdot. You'll do everybody a favor.
8. Re:Always by itzly · 2014-12-18 07:05 · Score: 2
  
  Probably because A and B have a large overlap in time, combined with poor record keeping at the beginning.
9. Re:Always by phantomfive · 2014-12-18 07:18 · Score: 1
  
  I'm not sure what a data dependent change is.
  
  --
  "First they came for the slanderers and i said nothing."
10. Re:Always by TechyImmigrant · 2014-12-18 07:51 · Score: 3, Informative
  
  An algorithm changes its behavior based on the value.
  The example I gave is a sneaky algorithm in the FIPS spec that deletes consecutive values when they match.
  I.E.
  if this_value == last_value:
      don't output this_value
  else:
      output this_value
  This is on the output of an RNG and so it reduces the entropy in the random numbers because there are no matching consecutive numbers, whereas in a full entropy stream, all pairs would be equally likely.
  In the context of noise in statistical analysis, it can confound the additive noise models.
  Algorithms that do things to data but don't look at the values of the data when deciding what to do are not data dependent, and that limits the scope for various bad things to happen.
  
  --
  I should use this sig to advertise my book ISBN-13 : 978-1501515132.
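The entropy loss described in the comment above can be measured directly. This is a sketch of the filter as the comment describes it, not of any particular specification: it assumes 2-bit symbols from Python's `random` module standing in for a full-entropy source, and the helper names are made up for illustration.

```python
import math
import random
from collections import Counter


def drop_repeats(stream):
    """Data-dependent filter: suppress a value equal to the previous one."""
    out = []
    for value in stream:
        if out and value == out[-1]:
            continue
        out.append(value)
    return out


def conditional_entropy(stream):
    """Empirical entropy of the next symbol given the previous one, in bits."""
    pairs = Counter(zip(stream, stream[1:]))
    previous = Counter(stream[:-1])
    total = sum(pairs.values())
    return -sum(
        count / total * math.log2(count / previous[a])
        for (a, _), count in pairs.items()
    )


rng = random.Random(5)
raw = [rng.randrange(4) for _ in range(20000)]  # 2 bits per symbol, ideally
filtered = drop_repeats(raw)

h_raw = conditional_entropy(raw)
h_filtered = conditional_entropy(filtered)
print(f"raw: {h_raw:.3f} bits, filtered: {h_filtered:.3f} bits")
```

After filtering, each symbol can only be one of the three values that differ from its predecessor, so the per-symbol entropy cannot exceed log2(3), about 1.585 bits, instead of the 2 bits a uniform 4-symbol source would give.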
11. Re:Always by Hussman32 · 2014-12-18 08:01 · Score: 1
  
  Nice description. Obviously this one is susceptible to a dead sensor, or stuck value. I run into these issues all the time, which I circumvent by keeping track of the local error (when the error decreases to zero too, I know it's a dead sensor).
  
  --
  "Who are you?" "No one of consequence." "I must know." "Get used to disappointment."
12. Re:Always by Anonymous Coward · 2014-12-18 08:01 · Score: 1
  
  80 percent of the time it works every time...
13. Re:Always by Anonymous Coward · 2014-12-18 17:43 · Score: 0
  
  "In other words" should not prefix a non sequitur.
  What phrase should we use to prefix our non sequitur in future?
14. Re:Always by Capsaicin · 2014-12-18 17:53 · Score: 1
  
  So why can't they properly take into account time?
  Because the original set up may be buried in time. You find a wind-turbine turning, the wind is blowing. Merely by measuring the speed of each how can you tell which came first? (Yes, I know ... you compare the noise profile of the respective data sets.)
  But now back to dwelling exclusively on the potential problems without acknowledging any usefulness, however limited, that this methodology might have ...
  
  --
  Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke
15. Re:Always by mcswell · 2014-12-20 18:44 · Score: 1
  
  Now, now. (https://www.youtube.com/watch?v=Hw5OoVXeJbU)
That 20% is the killer though by neilo_1701D · 2014-12-18 06:17 · Score: 2

Reading through the article, it wasn't clear to me how it is determined whether it worked correctly or not.
But still, an interesting statistical breakthrough, and one that allows researchers to ask interesting questions about their data.
1. Re:That 20% is the killer though by Hussman32 · 2014-12-18 08:09 · Score: 1
  
  I agree, as long as it is recognized as a tool to assist in defining the confidence, as opposed to a guarantee of confidence. I also wonder what adjustments can be made to 'tune' the result to the desired conclusions.
  
  --
  "Who are you?" "No one of consequence." "I must know." "Get used to disappointment."
What's that saying? by fahrbot-bot · 2014-12-18 06:18 · Score: 0

Lies, damned lies and statistics.

--
It must have been something you assimilated. . . .
So, correlation CAN mean causation? by Anonymous Coward · 2014-12-18 06:18 · Score: 5, Insightful

Well, of course it can. How do you think causation is determined? First by noticing a correlation. There can't be causation without correlation.
Gawd I hate the brain-dead fools who thoughtlessly parrot, "Correlation is not causation!"
1. Re:So, correlation CAN mean causation? by Anonymous Coward · 2014-12-18 06:32 · Score: 1
  
  There can't be causation without correlation.
  That is an interesting statement. I would love to see some proof of that.
  Wouldn't a one-shot event with a delayed consequence have causation without correlation?
  I speculate that there can't be correlation between non-repeating, non-simultaneous events.
2. Re:So, correlation CAN mean causation? by Anonymous Coward · 2014-12-18 06:45 · Score: 1
  
  Every consequence is delayed. Except for maybe some effects of quantum entanglement. (grin)
  What delay do you suppose is the cutoff for something to not be correlated?
3. Re:So, correlation CAN mean causation? by Anonymous Coward · 2014-12-18 06:48 · Score: 2, Interesting
  
  Gawd I hate the brain-dead fools who thoughtlessly parrot, "Correlation is not causation!"
  The proper term is: "Correlation does not imply causation". Perhaps you are being pedantic, but I'd rather hang around people who think "Correlation is not causation" (since it is more correct), than people who think "Correlation is causation".
4. Re:So, correlation CAN mean causation? by Anonymous Coward · 2014-12-18 07:18 · Score: 0
  
  There can be causation without correlation. It just takes a moderating variable which suppresses the correlation. e.g. http://srmo.sagepub.com/view/the-sage-encyclopedia-of-social-science-research-methods/n989.xml
5. Re:So, correlation CAN mean causation? by clovis · 2014-12-18 08:40 · Score: 1
  
  There can't be causation without correlation.
  That is an interesting statement. I would love to see some proof of that.
  Wouldn't a one-shot event with a delayed consequence have causation without correlation?
  I speculate that there can't be correlation between non-repeating, non-simultaneous events.
  You are correct, it is possible to have a causal relationship that does not result in a correlation.
  This occurs if the consequence of the cause has a mediating factor occurring before the consequence, and the mediating factor varies in some way that is not dependent upon the causal action.
  Here's a simplified example:
  There are causes that make stock market prices vary, but the direction of the price depends upon how the information regarding the event is presented.
  Falling oil prices cause oil company share prices to vary, but whether they rise or fall depends upon how the news media presents the cause and expected outcome - something that may depend upon political factors (Let's punish Russia!), or whether it is presented as "the sky is falling" or "buy opportunity" which may be influenced by the news reporting advertiser's needs.
6. Re:So, correlation CAN mean causation? by Wraithlyn · 2014-12-18 08:42 · Score: 4, Interesting
  
  I prefer "Correlation does not prove causation".
  Edward Tufte suggested "Correlation is not causation but it sure is a hint."
  
  --
  "Mind, as manifested by the capacity to make choices, is to some extent present in every electron." -Freeman Dyson
7. Re:So, correlation CAN mean causation? by Anonymous Coward · 2014-12-18 08:58 · Score: 0
  
  Gawd I hate the brain-dead fools who thoughtlessly parrot, "Correlation is not causation!"
  The proper term is: "Correlation does not imply causation". Perhaps you are being pedantic, but I'd rather hang around people who think "Correlation is not causation" (since it is more correct), than people who think "Correlation is causation".
  I think the AC was referring to slashdot posters who reply with "Correlation is not causation" to articles that they clearly have not read.
  I can recall one in which a biochemical pathway was well understood in a rat model, and an experiment was done to verify if the same cause produced the same outcome in humans. With rats, you can test them, kill them, grind up their brains, or whatever and exactly determine the biochemical changes and how the outcome was caused. The experiment can be repeated upon humans to see if the same cause is correlated with the same outcome, but you can't (shouldn't?) kill them and cut them up to measure the mediating biochemical processes. It's testing a hypothesis by making a prediction and conducting an experiment.
  And, yet, some AC will post a single reply "Correlation is not causation". I dunno, maybe they assumed that the experiment was some data-mining exercise.
8. Re:So, correlation CAN mean causation? by turbidostato · 2014-12-18 13:08 · Score: 1
  
  "You are correct, it is possible to have a causal relationship that does not result in a correlation."
  Too long to explain. An easier way: there can be causation without correlation. This is called deterministic chaos (the butterfly inducing a hurricane in the other side of the world, remember?).
9. Re:So, correlation CAN mean causation? by Anonymous Coward · 2014-12-18 14:11 · Score: 0
  
  The textbook counterexample is Y=X^2, where X is a random variable uniformly distributed between -1 and 1. Y and X will be nearly perfectly uncorrelated, but Y is directly caused by X.
  Correlation is not causation, and causation is not correlation, either.
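The Y = X^2 counterexample above is easy to check numerically. A minimal sketch, assuming 10,000 uniform draws with an arbitrary seed; the `pearson` helper is written out by hand so that nothing beyond the standard library is needed.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(3)
x = [rng.uniform(-1, 1) for _ in range(10000)]
y = [xi * xi for xi in x]  # y is completely determined by x

r_linear = pearson(x, y)                        # linear correlation of x and y
r_squared = pearson([xi * xi for xi in x], y)   # correlation after transforming x
print(f"corr(x, y) = {r_linear:.3f}, corr(x^2, y) = {r_squared:.3f}")
```

The linear correlation is close to zero because positive and negative values of x pull in opposite directions and cancel, even though y depends on nothing but x. Pearson correlation only detects the linear part of a relationship.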
10. Re:So, correlation CAN mean causation? by gzuckier · 2014-12-19 10:07 · Score: 1
  
  Or stuff like "this drug increases survival in men, but decreases survival in women" If you don't suspect enough to look for that and just look at overall survival, you'd be mystified.
  
  --
  Star Trek transporters are just 3d printers.
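The drug example in the comment above can be simulated to show how opposite effects in two subgroups cancel in the aggregate. All numbers here are made up for illustration: a treatment that shifts the outcome by +1 in men and -1 in women, unit Gaussian noise, and 4000 simulated patients.

```python
import random
from statistics import fmean

rng = random.Random(4)

# Each row is (sex, treated, outcome); the data are entirely synthetic.
rows = []
for _ in range(4000):
    sex = rng.choice(["M", "F"])
    treated = rng.random() < 0.5
    shift = (1.0 if sex == "M" else -1.0) if treated else 0.0
    rows.append((sex, treated, shift + rng.gauss(0, 1)))


def treatment_effect(subset):
    """Difference in mean outcome between treated and untreated rows."""
    treated = [outcome for _, t, outcome in subset if t]
    control = [outcome for _, t, outcome in subset if not t]
    return fmean(treated) - fmean(control)


overall = treatment_effect(rows)
men = treatment_effect([r for r in rows if r[0] == "M"])
women = treatment_effect([r for r in rows if r[0] == "F"])
print(f"overall: {overall:.2f}, men: {men:.2f}, women: {women:.2f}")
```

The treatment clearly causes a change in every patient, yet the pooled comparison shows almost nothing. The effect only becomes visible once the data are split by the variable that flips its sign.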
11. Re:So, correlation CAN mean causation? by gzuckier · 2014-12-19 10:10 · Score: 1
  
  Indeed, if you have a correlation, and a well established mechanism that would account for it, you've got causation, until something better comes along; and this better thing has to not only account for the correlation, but also explain why the mechanism actually doesn't work in this case. Which brings us, of course, to CO2.......
  
  --
  Star Trek transporters are just 3d printers.
12. Re:So, correlation CAN mean causation? by cwsumner · 2014-12-20 10:50 · Score: 1
  
  Every consequence is delayed. Except for maybe some effects of quantum entanglement. (grin)
  What delay do you suppose is the cutoff for something to not be correlated?
  Actually, no.
  Repetitive events may not have a beginning or end, within the data gathering period. It is quite common to not be able to tell which is "first". In that test condition, the result can "come before" the cause, at least mathematically. Or on the display screens. It's called "phase reset", and probably other things in other tech dialects.
  And most of the questions discussed on slashdot are repetitive events...
  The article seems practical, though. The cause and effect both have noise. The noise from the cause is transferred to the effect. So, the effect has much more noise.
  Of course, the normal troubleshooting technique is to inject artificial noise. As in: Touch it with a square wave, and see if it comes out the speakers! 8-)
We are all victims of causality by Anonymous Coward · 2014-12-18 06:20 · Score: 0

I drank too much wine, I must take a piss.
1. Re:We are all victims of causality by gzuckier · 2014-12-19 10:11 · Score: 1
  
  Drunk falls into an open grave, passes out. In the AM, thinks to self: "If I'm not dead, then why am I in a grave? And if I am dead, then why do I need to pee?"
  
  --
  Star Trek transporters are just 3d printers.
What if there is a third party? by Skewray · 2014-12-18 06:24 · Score: 1

So if Z causes both X and Y, I assume that this amazing test gives garbage?
1. Re:What if there is a third party? by Anonymous Coward · 2014-12-18 06:40 · Score: 0
  
  That ("Z causes both X and Y") is basically the definition of a confounding factor (https://en.wikipedia.org/wiki/Confounding), which, if I recall, was even mentioned in the summary as a limitation.
2. Re:What if there is a third party? by nine-times · 2014-12-18 06:58 · Score: 1
  
  That was one of my thoughts, as well. I think I understand the concept, and it seems like an interesting and possibly useful approach. However, it doesn't seem like it will necessarily give us causal links in a very certain way, since many real-life situations have many factors with complex relationships. Like: Z causes X and Y, but perhaps it always causes X and only makes Y more likely. Or: A, B, and C all independently increase the chances of E, but only when an unknown factor D is present.
  So I'd guess that this isn't going to be anything like a magic bullet, but I don't know that the people who came up with it expected it to be. It might just be another useful tool for analysis.
3. Re:What if there is a third party? by Black+Parrot · 2014-12-18 07:23 · Score: 2
  
  So if Z causes both X and Y, I assume that this amazing test gives garbage?
  Perhaps in some cases it would be possible to detect that both X and Y were being affected by the same noise, implying the existence of some unknown Z?
  
  --
  Sheesh, evil *and* a jerk. -- Jade
4. Re:What if there is a third party? by Anonymous Coward · 2014-12-18 07:46 · Score: 1
  
  Ideally, this test should say that neither caused the other.
5. Re:What if there is a third party? by Anonymous Coward · 2014-12-18 08:42 · Score: 0
  
  That is freaking simple: put X and Y together as one new single entity R; then it changes to "Z causes R", and the test remains true.
6. Re:What if there is a third party? by ClickOnThis · 2014-12-18 10:28 · Score: 1
  
  Not to mention A causes X and B causes Y. You know, a coincidence.
  One would assume that multiple trials would cause A and B to occur at different times, thus eliminating any perceived correlation between X and Y.
  
  --
  If it weren't for deadlines, nothing would be late.
David Hume by pigiron · 2014-12-18 06:26 · Score: 0

You blithering idiots have not exactly solved Hume's fundamental Problem of Induction.
1. Re:David Hume by Black+Parrot · 2014-12-18 07:25 · Score: 5, Insightful
  
  Yes, but now we can find out whether we read Slashdot because we are nerds, or we are nerds because we read Slashdot.
  
  --
  Sheesh, evil *and* a jerk. -- Jade
2. Re:David Hume by gzuckier · 2014-12-19 10:18 · Score: 1
  
  I know, my sound system has terrible problems with 60 Hz Hume from Induction.
  
  --
  Star Trek transporters are just 3d printers.
3. Re:David Hume by Anonymous Coward · 2014-12-23 03:01 · Score: 0
  
  Teasing apart noise caused by shouting nerds from random Slashdot noise will be a challenge, though.
"up to 80%" - is that a joke? by Anonymous Coward · 2014-12-18 06:28 · Score: 1

Is that a joke for the quantitatively pedantic?
Hey, we have this new technique. It's somewhere between 0% and 80% reliable.
Lies, damned lies, and statistics by Anonymous Coward · 2014-12-18 06:41 · Score: 0, Funny

80 percent accuracy would get you laughed out of a room 100% of the time.
1. Re:Lies, damned lies, and statistics by Anonymous Coward · 2014-12-18 06:59 · Score: 0
  
  The above post is 80% accurate.
2. Re:Lies, damned lies, and statistics by skids · 2014-12-18 08:06 · Score: 2
  
  Almost any level of accuracy above pure randomness can be fruitfully added to the Bayesian inference process. You can pretty harmlessly add the pure noise as well, it's just not going to be fruitful.
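To put numbers on that, here is a minimal Bayes update for a yes/no question ("does X cause Y?"). The function name `posterior` and the assumption that the test is right with the same probability whichever answer is true are simplifications made for this sketch.

```python
def posterior(prior, accuracy):
    """P(hypothesis | test says 'yes') when the test is right with probability `accuracy`."""
    hit = prior * accuracy                 # hypothesis true, test agrees
    miss = (1 - prior) * (1 - accuracy)    # hypothesis false, test wrongly says yes
    return hit / (hit + miss)

once = posterior(0.5, 0.8)      # one 80%-accurate verdict on a coin-flip prior
twice = posterior(once, 0.8)    # a second, independent verdict that agrees
noise = posterior(0.3, 0.5)     # a 50%-accurate "test" is pure noise
print(once, twice, noise)
```

An 80 per cent test moves a 50/50 prior to 0.8, a second independent agreeing verdict moves it to about 0.94, and a coin-flip test leaves the prior exactly where it was.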
  
  --
  Someone had to do it.
Re:Finally prove climate change! by Noah+Haders · 2014-12-18 06:44 · Score: 1

80% of the time it confirms the scientists' expectations.
New? by Anonymous Coward · 2014-12-18 06:49 · Score: 0

I only read the summary, not TFA, but it doesn't seem like a new idea to me. I can think of scads of areas in engineering in which assuming there is (typically independent, Gaussian) noise in the model and/or the measurement is the basis that makes the calculations work out. E.g., using random perturbations and a Kalman filtering algorithm to uncover the model of an unknown system knowing only the output of the black box.
1. Re: New? by Anonymous Coward · 2014-12-18 07:25 · Score: 0
  
  I think the OPs point was simply that this wasn't new or revolutionary. He (or she) didn't even address whether it was effective or not.
Other causality tests exist by Anonymous Coward · 2014-12-18 06:57 · Score: 5, Informative

Many other attempts at detecting causality exist. There's one based on dynamical systems theory (Takens' theorem): in a multidimensional, causally linked dynamical system, all the information in the high-dimensional system can be recovered from multiple values of a single dimension over time.
The method works by reconstructing values of X from lagged vectors of Y(t) nearest-neighbor lagged vectors of Y in a training set. As the training set gets larger, the predictions get better. If they keep getting better, X probably causes Y. The idea that the noise in X(t) shows up in Y(t) but not the other way around is implicitly captured in that approach, although not in a statistically rigorous way.
Sugihara et al. Science 2012 (sorry about paywall).
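A rough sketch of the cross-mapping idea described above, not the published implementation: the two coupled logistic maps (X drives Y, Y does not drive X), the embedding dimension of 2, and the helper names `embed`, `cross_map_skill` and `coupled_logistic` are all assumptions chosen for this demo. X(t) is estimated from the nearest neighbours of each lagged vector of Y, and the skill is the correlation between estimate and truth.

```python
import numpy as np

def embed(series, dim):
    """Rows are lagged vectors [s(t), s(t-1), ..., s(t-dim+1)]."""
    n = len(series) - dim + 1
    return np.column_stack([series[dim - 1 - i : dim - 1 - i + n] for i in range(dim)])

def cross_map_skill(source, target, dim=2):
    """Estimate `target` from nearest neighbours on the lagged vectors of `source`."""
    manifold = embed(source, dim)
    tgt = target[dim - 1:]                      # align target with manifold rows
    dists = np.linalg.norm(manifold[:, None, :] - manifold[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)             # a point may not be its own neighbour
    k = dim + 1
    nn = np.argsort(dists, axis=1)[:, :k]       # indices of the k nearest neighbours
    nn_d = np.take_along_axis(dists, nn, axis=1)
    w = np.exp(-nn_d / np.maximum(nn_d[:, :1], 1e-12))
    w /= w.sum(axis=1, keepdims=True)           # exponential distance weights
    estimate = (w * tgt[nn]).sum(axis=1)
    return float(np.corrcoef(estimate, tgt)[0, 1])

def coupled_logistic(n, burn_in=200):
    """Toy system: x evolves on its own, y is nudged by x."""
    x, y = 0.4, 0.2
    xs, ys = [], []
    for t in range(n + burn_in):
        x, y = x * (3.8 - 3.8 * x), y * (3.5 - 3.5 * y - 0.1 * x)
        if t >= burn_in:
            xs.append(x)
            ys.append(y)
    return np.array(xs), np.array(ys)

xs, ys = coupled_logistic(800)
skill_small = cross_map_skill(ys[:100], xs[:100])  # short training set
skill_large = cross_map_skill(ys, xs)              # longer training set
print("X from lagged Y, 100 points:", skill_small)
print("X from lagged Y, 800 points:", skill_large)
```

The thing to watch is whether the skill keeps improving as the training set grows; a single number on its own says little.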
1. Re:Other causality tests exist by Anonymous Coward · 2014-12-18 07:10 · Score: 1
  
  Proofreading...should say:
  The method works by reconstructing values of X(t) from lagged vectors of Y(t) USING nearest-neighbor lagged vectors of Y(t) in a training set.
2. Re:Other causality tests exist by Anonymous Coward · 2014-12-18 10:50 · Score: 0
  
  Don't worry dude, everyone who read past the first paragraph without glazing over understood that.
I predict by gurps_npc · 2014-12-18 07:03 · Score: 2

1) A sudden disappearance of studies claiming that video games cause violent behavior rather than the other way around.
2) A whole bunch of people totally ignoring this study because they don't like what it means.

--
excitingthingstodo.blogspot.com
1. Re:I predict by bobaferret · 2014-12-18 07:38 · Score: 1
  
  I think you're dead on with #2, but I doubt anyone will ever say that violent behavior caused video games....
2. Re:I predict by skids · 2014-12-18 08:16 · Score: 1
  
  I'm hazy on the details but ISTR sports deriving from wardances which derived from a desire not to have all your warriors killed just to figure out who got to use the better stream for the following month.
  
  --
  Someone had to do it.
3. Re:I predict by gurps_npc · 2014-12-18 10:12 · Score: 1
  
  I meant that violent people enjoy violent video games.
  That is, you don't become violent because you play a lot of violent video games. Instead you play a lot of violent video games because you have an aggressive personality.
  
  --
  excitingthingstodo.blogspot.com
4. Re:I predict by Anonymous Coward · 2014-12-18 12:08 · Score: 0
  
  Having an aggressive personality is not equivalent to being violent.
  On the other hand, stupid is as stupid does.
Re:Morons by itzly · 2014-12-18 07:06 · Score: 1

Don't fucking delete your fucking data you fucking dipshit.
Unless, of course, you know that some fucking data is bad, and other fucking data is good. In that case it makes sense to fucking delete the fucking bad data.
Crime v Ice Cream by Anonymous Coward · 2014-12-18 07:14 · Score: 0

Finally, we can discover whether increased crime causes ice cream sales to rise...or if it's the other way around.
1. Re:Crime v Ice Cream by turbidostato · 2014-12-18 13:13 · Score: 1
  
  "Finally, we can discover whether increased crime causes ice cream sales to rise...or if it's the other way around."
  Nonsense... increased ice cream sales come from global warming which, in turn, reduces pirates as everybody knows, therefore reducing crime, not the other way around.
2. Re:Crime v Ice Cream by Anonymous Coward · 2014-12-18 21:40 · Score: 0
  
  global warming which, in turn, reduces pirates as everybody knows, therefore reducing crime
  Nope. Global warming causes continents to shrink, which puts people closer to the shore, making it easier to become pirates.
3. Re:Crime v Ice Cream by mcswell · 2014-12-20 18:49 · Score: 1
  
  I don't know about that, but I can tell you that global warming causes the ground to get harder. Proof: when I was younger, I could camp out on the ground with no air mattress; can't do that now.
Re:Morons by cultiv8 · 2014-12-18 07:15 · Score: 1

But then how are you supposed to get your research published?

--
sysadmins and parents of newborns get the same amount of sleep.
Re:Morons by Anonymous Coward · 2014-12-18 07:15 · Score: 0

Wow, so angry! Look at all those fucks and fuckings you wrote! boy are you mad, ropeable even...spitting tacks, cross and angry angry!
So much vile, so much hatred, so much angry anger and swearing and lots and lots of fucks!
Boy, you are as mad as anybody I have ever seen.
So angry!
Wait...verifying........ ...yes, he's angry alright! So angry!
So...um...why so angry bro?
Re:Morons by itzly · 2014-12-18 07:17 · Score: 1

You torture the data until they confess.
Re:Morons by Anonymous Coward · 2014-12-18 07:18 · Score: 1

I would like to officially confirm that, indeed, OP is angry.
sexconker, can you please point to me the place on the doll where the bad ebil statistician touched you?
We will get you some therapy sorted out. Please don't rape, torture and mutilate the dead body of an innocent person in the meantime.
So angry!
This assumes some sort of causal relationship. by MouseTheLuckyDog · 2014-12-18 07:18 · Score: 1

It implicitly presumes that there is some relatively direct causal relationship between the two events.
Fundamental flaw.
1. Re:This assumes some sort of causal relationship. by obenchainr · 2014-12-18 07:26 · Score: 1
  
  ... a flaw which is explicitly pointed out early on in the paper and stated pretty plainly, but that would take actually reading it to know. I'm actually kind of interested in this. We've got a few data sets we do that have highly-correlated variables, and I'll be curious to apply one or two of the methods here (once I figure them out - I'm not a statistician, my boss is) to see if there's any difference in the noise between them. Won't tell me anything definitively, but could suggest which if any might be more likely to be causal. That in turn would suggest future avenues of study.
Re:Finally prove climate change! by neilo_1701D · 2014-12-18 07:27 · Score: 1

Now that they've found a way to filter out ("ignore") data that doesn't fit, maybe now they'll actually be able to conclusively prove that climate change exists!
AGW skeptic here, but I'd be very cautious about applying this technique to climatic data to try and prove anything.
This technique works best when there are a limited number (read: two) of variables, and a clear cause & effect (i.e. one variable is dependent). At least that's my understanding.
Climate data is mindbogglingly complex, with a huge number of known variables with known and unknown dependencies. Even something as seemingly straightforward as the carbon cycle has a large number of feedbacks, which (again as I understand it) would only mess this approach up.
To my mind, the AGW hypothesis either succeeds or fails based on the predictions it makes and how much in-line those predictions are with observed reality. Clever statistical tricks don't help nor lend credibility in either case.
Bad turbine example by Moof123 · 2014-12-18 07:34 · Score: 1

The turbine example is poor. Adequate data will show causality in time between a wind gust, and a delayed turbine rotation rate. Momentum easily causes a lag between one data set and the other, and the concept of time running in one direction can easily be used to suggest causality.
I am more curious about a test that would show if 2 data sets are clearly caused by a third non-measured factor.
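The lag argument can be checked with a plain cross-correlation scan. This is a toy sketch: the wind is random gusts, the turbine is a made-up momentum model that responds three samples late, and `best_lag` is a helper invented for the example. A peak at a positive lag means the first series leads the second.

```python
import random

def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

def best_lag(cause, effect, max_lag):
    """Correlate cause[t] with effect[t + k] for each k; return the k with the peak."""
    n = len(cause)
    scores = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            a, b = cause[: n - k], effect[k:]
        else:
            a, b = cause[-k:], effect[: n + k]
        scores[k] = pearson(a, b)
    return max(scores, key=scores.get), scores

rng = random.Random(1)
wind = [rng.gauss(0, 1) for _ in range(2000)]

# Assumed turbine model: rotor speed carries momentum and sees the gust 3 samples late.
turbine = [0.0] * len(wind)
for t in range(3, len(wind)):
    turbine[t] = 0.5 * turbine[t - 1] + 0.5 * wind[t - 3]

lag, scores = best_lag(wind, turbine, max_lag=20)
print("peak correlation at lag", lag)
```

This only helps when the sampling is fast enough to resolve the delay; steady-state averages would hide it.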
1. Re:Bad turbine example by BenSchuarmer · 2014-12-18 08:39 · Score: 3, Funny
  
  That's just what big turbine wants you to believe.
Re:Morons by Anonymous Coward · 2014-12-18 07:38 · Score: 0

There's no such thing as bad data, only bad methods of taking measurements. If you can't quantify, precisely and deterministically, what's wrong with your measurement method, then trying to just noise-filter the resulting data is a net loss of real data.
This tells us nothing about the arrow of time by Culture20 · 2014-12-18 07:40 · Score: 3, Insightful

Which direction in time does cause/effect flow? The world may never know.
1. Re:This tells us nothing about the arrow of time by Anonymous Coward · 2014-12-18 17:30 · Score: 0
  
  sideways
2. Re:This tells us nothing about the arrow of time by cwsumner · 2014-12-20 11:04 · Score: 1
  
  Which direction in time does cause/effect flow? The world may never know.
  Time flows in all directions, but you only see one direction. That direction depends on which direction -you- are from the "big bang".
covariance matrix? by grep_rocks · 2014-12-18 08:39 · Score: 1

I looked at the article - I don't understand how this is different than a covariance matrix?
Re:Morons by Anonymous Coward · 2014-12-18 08:55 · Score: 0

It seems you don't like statistics ... wondering how you would 'look' at TBs of raw data (e.g. from the LHC)?
Example of bad data by Anonymous Coward · 2014-12-18 09:24 · Score: 0

Example of bad data: a series of measurements of windspeed that has, during the series, a block of flats put right next to it.
This is the "Garbage In" that climate deniers are supposed to be against because it gives "Garbage Out", however, they often DEMAND the garbage data is put in, without any reference to the factors you provide for making sense of that bad data.
1. Re:Example of bad data by Anonymous Coward · 2014-12-18 12:47 · Score: 0
  
  Probably the worst application of bias in popular scientific studies right now is the African data relating HIV infections to intact males. Looks great, until you find out that they circumcised these adult men, then found that gee, they didn't get HIV in the 3 weeks after they had the ends of their dicks cut off. Set aside the awful ethics in paying poor Africans to be mutilated, and the fact that they set these men into the world to see what AIDS they'd get... they were recovering from a circumcision. You don't have sex after that for several weeks. Amazingly, even the US CDC has been taken in by this junk science. Here's a correlation: infant foreskins are a 400 million $/year industry in face creams and skin grafts - and somehow this awful act of "science" was perpetrated on the world. It's going to take decades to unwind the damage that comes from people citing it without using a hint of critical thinking.
  Applying the logic from this study, as long as you'd made certain to get some other random variables, like money or social status or something, you'd be good to go. What makes anybody think the overt biases would be at all mitigated by this?
How helpful? by Daemonic · 2014-12-18 10:42 · Score: 1

It's not clear from the article how helpful this would be in the HRT example they give. The test can generally tell you which direction the causality runs in, but if there is no causation, will this be a clear enough test?
Can they say:
Does A cause B? Probably not.
Does B cause A? Probably not.
So there's probably a C causing A and B.
There's a lot of probablys in that.
1. Re:How helpful? by Anonymous Coward · 2014-12-18 17:19 · Score: 0
  
  Does tanning cause gold watches? Probably not.
  Do gold watches cause tanning? Probably not.
  So there's probably a 'being rich' causing tanning and gold watches.
At last... by skelly33 · 2014-12-18 10:51 · Score: 1

... light at the end of the tunnel re: Chicken v. Egg... Pretty interesting though!
Onsager reciprocity by Anonymous Coward · 2014-12-18 11:20 · Score: 0

Wind is generated by the turbine, and the turbine spins due to the wind. If one only takes measurements at the steady-state situation, there is no way to tell which is cause and which is effect!
Didn't Judea Pearl solve this decades ago? by Sanity · 2014-12-18 12:05 · Score: 1

This excellent blog article describes a technique developed by Judea Pearl decades ago to do exactly this. Would be interested to understand how this is different/better.
1. Re:Didn't Judea Pearl solve this decades ago? by mcswell · 2014-12-20 18:53 · Score: 1
  
  They cite him in the article, so they're aware of him. I haven't read the article enough to know how this relates, but I think they're proposing another way to infer causation from correlation, beyond the structural modeling (and experimental methods) that Pearl describes.
Re:Morons by sexconker · 2014-12-18 13:08 · Score: 1

I love statistics. I hate "statisticians".
Re:Morons by sexconker · 2014-12-18 13:18 · Score: 1

You can't know your data is bad when doing experimentation. That's the point of experimentation - you control variables and observe others to test a hypothesis.
The point at which you can KNOW data is bad is the point at which you know all of the variables and all the details of the phenomena you are observing. It's like "experimenting" with 1+1 on a calculator. When it gives you a 12 you know you've got bad data (you keyed in 11+1 or 1+11 or something), but that's only because you know the entire system and what it's supposed to do. It's not an experiment at that point, and there's no fucking point in doing it.
If you're experimenting on something then you don't know the entire system. If you don't know the entire system then you cannot know for sure whether any data is bad or not.
Even without going to that extreme, "bad data" - even obviously "bad data" - is merely a failure to control variables. The methodology and experiment as a whole are then suspect. Repeat with better control and methodology, or deal with the small amount of ugliness in the graph that the "bad data" may have contributed.
Re:Morons by blue+trane · 2014-12-18 21:19 · Score: 1

Understandable reaction to the quantity of smugness in the story?
altitude noise? by Anonymous Coward · 2014-12-19 07:45 · Score: 0

How should noise be interpreted with regard to altitude or temperature? I guess it is much more related to observational error.
Temp vs CO2? by iMactheKnife · 2014-12-19 09:02 · Score: 1

Hmmm. There is a lot more noise in the global temperature data than there is in the atmospheric concentration of CO2.
Re:Finally prove climate change! by gzuckier · 2014-12-19 10:21 · Score: 1

Now that they've found a way to filter out ("ignore") data that doesn't fit, maybe now they'll actually be able to conclusively prove that climate change exists!
Oh, wait. They are already ignoring the data that doesn't fit, so I guess this won't help. Well, maybe sometime in the next 50 years they'll actually come up with a model that is accurate for more than 2-3 years in the future.
There's undoubtedly more noise in climate data than in CO2 data, so you've just reminded us about the "climate makes CO2 rise, not the other way around" argument and it is now even more clear that it is false. Good job!

--
Star Trek transporters are just 3d printers.