Metrics Mania and the Countless Counting Problem
mobkarma writes "Einstein once said, 'Not everything that can be counted counts, and not everything that counts can be counted.' A New York Times article suggests that unless we know how things are counted, we don't know if it's wise to count on the numbers. The problem isn't with statistical tests themselves, but with what we do before and after we run them. If a person starts drinking day in and day out after a cancer diagnosis and dies from acute cirrhosis, did he kill himself? The answers to such questions significantly affect the count."
Cancer itself could be considered a form of killing yourself.
Q: What did the clinically depressed alcoholic man with acute cirrhosis get for Christmas?
A: Cancer
You don't die of cirrhosis by drinking heavily for a short time. You may die of alcohol poisoning.
Intron: the portion of DNA which expresses nothing useful.
I don't disagree, but, it cuts both ways. I think the article has a point... the numbers only have meaning in context.
If I tell you "X people die every year from being shot, in their home, with their own gun", that tells you something. It evokes images of burglars or irresponsible people playing. It SOUNDS like a statement about how safe or dangerous it is to have a gun in your own house.
However, if my number "X" includes suicides, well, then how much of a statement about the relative danger of owning a gun am I making? How about if I can find no link between owning a gun and committing suicide?
Clearly the statement is correct, "shot, in their home, with their own gun" but, even so, it's misleading if you then use the numbers wrong.
Take texting while driving. The claim is 900 deaths a year. How do they arrive at that number? Even better, how big is that number? 900 sounds like a lot. However, it's less than the estimated number of serial murder victims in the US. Overall driving deaths are more like 40,000 a year. Context is everything. If I said "about 1 driving death in 40 is related to texting while driving," that's suddenly a lot smaller, yet it represents the same data.
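To make the reframing concrete, here is a quick sketch of the arithmetic, using only the two figures quoted above (900 and 40,000, both taken as given rather than verified). The helper name one_in_n is made up for illustration.

```python
def one_in_n(part, whole):
    """Return n such that part/whole reads as roughly '1 in n'."""
    return whole / part

# Figures as quoted in the comment above; not independently verified.
texting_deaths = 900
driving_deaths = 40_000

share = texting_deaths / driving_deaths
print(f"{share:.2%} of driving deaths")
print(f"about 1 in {one_in_n(texting_deaths, driving_deaths):.0f}")
```

Strictly speaking the ratio comes out nearer 1 in 44 than 1 in 40, but the point stands: the same count reads very differently as a raw number, a percentage, or a "1 in n" rate.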
Frankly, I tend to think a LOT of statistics are meaningless. NY state enacted a law against handheld phone use while driving. It resulted in a 70% decrease in OBSERVED use. There was no decrease at all in accident rate.
What this tells me is, someone really believed that this was going to make a difference, came up with numbers and statistics and, in reality, the one little item that he picked out had about as much bearing on accident rates as the price of butter in Bangladesh does.
Much of the time statistics are used to just bullshit and make it look like we aren't playing blindfolded darts when we make public policy.
-Steve
"I opened my eyes, and everything went dark again"
Sounds like a restatement of the simultaneously-discovered Goodhart's Law, Lucas critique, and Campbell's Law.
Basically, once you start measuring something as a proxy for what you really want to know, people start to take the proxy into account when making decisions, to the point where it becomes useless as a measure for whatever it was intended.
Here, people take these cancer tests as a measure of their probability of cancer. But once they start to treat them as reliable, they start doing more self-destructive things, destroying the correlation between the proxy (the cancer test) and the actual probability of cancer.
Information theory is life. The rest is just the KL divergence.
had about as much bearing on accident rates as the price of butter in Bangladesh does.
Thanks, now I have to check the price of butter in Bangladesh to see whether it's safe to drive home, you insensitive clod!
"This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
I'd wager that the 1 in 40 people who've died from texting while driving came out of a sample of 1 in 40 drivers who don't put enough importance into paying attention to the road. So, take away texting drivers, and you'll still have 1 in 40 people dying because they were adjusting their radio one station at a time without looking up, or rolling up the rear-passenger window by hand because they don't have power windows.
I don't think texting while driving has increased accidents, I just think it's made it easier to point out who the stupid drivers are.
I finally learn how to count and now they tell me it's useless. What's next, I learn how to type and I find out nobody is reading what I write?
There are more things in heaven and earth than are dreamt of in your philosophy.
Actually the "counting" problem they mentioned is a categorization problem. Depending how you define your categories, you get different counts. But that's because those are really different categories (they are defined differently). So the question is not really one of counting, but one of the "correct" definition of the category.
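A toy sketch of that point, assuming a handful of made-up records (the data and field names below are invented purely for illustration, not real statistics): the same list gives two different "counts" depending on where you draw the category boundary.

```python
# Hypothetical records, invented for illustration only.
deaths = [
    {"cause": "gunshot", "intent": "suicide"},
    {"cause": "gunshot", "intent": "accident"},
    {"cause": "gunshot", "intent": "homicide"},
    {"cause": "gunshot", "intent": "suicide"},
    {"cause": "cirrhosis", "intent": "unknown"},
]

def count_where(records, predicate):
    """Count the records that fall inside a category, given its definition."""
    return sum(1 for r in records if predicate(r))

# Definition A: anyone who died of a gunshot.
broad = count_where(deaths, lambda r: r["cause"] == "gunshot")

# Definition B: gunshot deaths, excluding suicides.
narrow = count_where(
    deaths, lambda r: r["cause"] == "gunshot" and r["intent"] != "suicide"
)

print(broad, narrow)
```

Neither number is miscounted. They answer different questions, which is why the definition has to travel with the figure.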
The Tao of math: The numbers you can count are not the real numbers.
Many years ago, I had an in-depth discussion about gathering statistics on heart disease with a woman on the board of the American Heart Association. This was a big deal. Serious ethical issues were in play and there was a great deal of infighting going on.
I asked her how you make a definitive decision that someone has heart disease. I was trying to figure out what to measure. Her answer surprised me. She said "You wait till they die. Then you cut out their heart and have a look." She then went on to patiently explain to me that the only thing that could be measured and evaluated were "markers" of heart disease. Those markers, as revealed by various diagnostic tests, could be mighty reliable. But you never know if someone is going to die of heart disease until they...you know...actually *die*.
Thus informed, I came to realize that what we measure is almost never what we really want to know. Measuring the right stuff is simply too hard to do. No matter where you look, this is almost universally true. In my job, for example, we fix computer problems. Thus, we measure how many incidents get closed and how much time it took. If you quickly close an incident, then surely you've provided good service, right? Most slashdotters should realize that's not true. In fact, my job is actually to get other, more important workers back to work asap. The only way to measure that would be to interview my customers and their bosses. We'd have to pry for an hour into their effectiveness to find out if I properly completed a job that took me five minutes. That's too much trouble, so we look for markers. Closed incidents. Timeliness of closures.
Measures are inadequate so often that I pretty much don't trust anything that contains them. After years of training in Quality Improvement Processes, I came to realize that the amount of time needed to understand a process and perfectly spec out what needs to be measured is 452% of the expected life cycle of the project, plus or minus a 17.5% margin of error. (Aside - How much do you trust those statistics?)
Almost no one can devote the time required to do the job (no matter what "the job" is) right. We just hope people do their best and trust to good intentions.
As a computer guy who wants things to be either "yes" or "no", unambiguously, I found this state of affairs very difficult to accept. But it's just part of being human.
Ah, the irony of using a statistic to prove that statistics are meaningless.
"Molest me not with this pocket calculator stuff."
- Deep Thought
However, if my number "X" includes suicides, well, then how much of a statement about the relative danger of owning a gun am I making? How about if I can find no link between owning a gun and committing suicide?
Clearly the statement is correct, "shot, in their home, with their own gun" but, even so, it's misleading if you then use the numbers wrong.
I don't think I disagree with your overall point, but I do have a quibble with this. I think there's reason to believe that owning a gun makes it more likely that you will commit suicide. Suicidal thoughts are not unusual, and suicide attempts usually fail. Becoming suicidal is often in part a response to a sudden crisis. People usually don't plan to lose their jobs, or get dumped by their romantic partners, or so on, but sooner or later, something as upsetting as those things happens to anyone. If you're horribly upset, and decide to drive out to the bridge to jump off it, you've got more time to change your mind than if you decide to shoot yourself with the pistol in the safe in the living room.
I don't know. Decade when I started driving, if I was behind someone drifting in and out of their lane, driving 15 MPH below the speed limit for no reason, etc., it was 2:30 AM and they were drunk. It rarely if ever happened during the day (and that was 500 of freeway commute time every month). Now I see this constantly; like every other day at least. Some knob almost causing an accident while texting or dialing with one hand.
I judt got a nre Kinesis keybiartf so please excusr ant egregiou typos.