Visualizing False Positives In Broad Screening
AlejoHausner writes "To find one terrorist in 3000 people, using a screen that works 90% of the time, you'll end up detaining 300 people, one of whom might be your target. A BBC article asks for an effective way to communicate this clearly. 'Screening for HIV with 99.9% accuracy? Switch it around. Think also about screening the millions of non-HIV people and being wrong about one person in every 1,000.' The problem is important in any area where a less-than-perfect screen is used to detect a rare event in a population. As a recent NYTimes story notes, widespread screening for cancers (except for maybe colon cancer) does more harm than good. How can this counter-intuitive fact be communicated effectively to people unschooled in statistics?"
How can this counter-intuitive fact be communicated effectively to people unschooled in statistics?
Hmm, teach them statistics?
While it's true that there will be false positives, as well as false negatives, you don't convict someone, or have a lung removed, without further testing. When I was diagnosed with cancer, I was tested and re-tested to verify that there was, indeed, cancer. The same goes for screening for terrorists, or anything else. Did the article mention the rate for false negatives as well? After all, if you have a five pound tumor hanging off your face, and your doctor tells you there's nothing wrong, you'd definitely want a second opinion!
That's easy, just tell them that the screenings work about as well as speech recognition. It's 95% accurate and everyone knows how much it sucks.
Great! Thank you for identifying yourself as one of those "unschooled in statistics" people the summary mentioned. Now we just need to experiment with different ways to get you to understand this simple concept.
Wow. Way to illustrate the point. Remember, terrorists are roughly zero percent of the population (at least, of the population going on plane trips in the U.S./U.K.). Odds are, at most one of those 3000 actually is a terrorist. So if it is 90% accurate in identifying terrorist vs. non-terrorist (and vice versa), then 10% of the non-terrorists will be identified as terrorists (or ~300), while the 0-1 terrorists will be missed 10% of the time. And of course, since you don't know for sure if there was a terrorist in the group, an in-depth search of the 300 will usually be a waste of time.
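A rough sketch of that arithmetic, assuming the single "90%" figure covers both kinds of error (10% false positives and 10% false negatives) and that exactly one of the 3000 is a terrorist. Both are assumptions for illustration, and the function name is made up here:

```python
def expected_screen_counts(population, targets, accuracy):
    """Expected outcomes when one accuracy figure covers both error types."""
    innocents = population - targets
    false_pos = innocents * (1 - accuracy)   # innocents wrongly flagged
    true_pos = targets * accuracy            # targets correctly flagged
    missed = targets * (1 - accuracy)        # targets wrongly cleared
    flagged = false_pos + true_pos
    return {
        "flagged": flagged,
        "false_positives": false_pos,
        "true_positives": true_pos,
        "missed": missed,
        "share_of_flagged_that_are_targets": true_pos / flagged,
    }

counts = expected_screen_counts(population=3000, targets=1, accuracy=0.9)
for name, value in counts.items():
    print(f"{name}: {value:.4f}")
```

Under those assumptions roughly 300 people get flagged and well under 1% of them are the target.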
$_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
http://www.amazon.com/Manga-Guide-Statistics-Shin-Takahashi/dp/1593271891
I hate math, always did. I was good at it but just could not stand it. As such I skipped out on about anything math related beyond algebra (college level). Didn't impede my programming ability at all.
Still there are times where I like to learn how stuff works and honestly this series of books, Manga Guide to ......, has given me a quick leg up on a few subjects I would never have gained from traditional text books.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
Exactly. So, someone who doesn't have a grasp on the terminology wants to educate folks who don't have a grasp on it either.
And this, kids, we call journalism.
I am the lawn!
If you have a screen that works 90% of the time, and you detain 300 people, 270 will be terrorists.
Congratulations, you got it wrong exactly the way that is being complained about.
The test accuracy is measured compared to the population tested. In fact, a test that consistently says "no cancer" in all cases is 99% accurate when run on the general population.
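A minimal sketch of that point, assuming a prevalence of 1% (100 cases in 10,000 people, which is what the "99%" figure implies). The population size and function name are illustrative only:

```python
def always_negative_metrics(population, sick):
    """Score a 'test' that reports 'no cancer' for everyone."""
    healthy = population - sick
    accuracy = healthy / population   # only the healthy are classified correctly
    detected = 0 / sick               # no sick person is ever flagged
    return accuracy, detected

accuracy, detected = always_negative_metrics(population=10_000, sick=100)
print(f"accuracy: {accuracy:.0%}, cases detected: {detected:.0%}")
```

The headline accuracy looks excellent even though the test never finds a single case.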
Finally! A year of moderation! Ready for 2019?
Back during the TQM fad they'd make this point by giving everyone a clear plastic box with 10,000 little balls in it. There was a cribbage board like affair in it, with 1,000 holes, such that by inverting and shaking the box, then turning it upright, 1,000 of the balls would settle into the holes more or less at random, but still be visible through the clear box. The balls were color coded -- 10 red balls, 40 black ones, 50 blue ones, and the rest white. The odds of getting no red and no black are lower than 1%, contrary to most people's expectations.
This was used to drive home a point about the difficulty of "testing in quality" (quality tests suffer false negatives and if there are, say, 1000 such individual measurements on a piece of machinery it's nearly impossible to ship a machine without at least one thing wrong unless the tolerances are well controlled at the point of manufacture). The same idea works any time you want to illustrate the effects of low-incidence events on a large population.
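The claim about the box can be checked with a hypergeometric calculation. This is a sketch that assumes the 1,000 holes are filled by a uniformly random draw without replacement from the 10,000 balls; the helper name is made up:

```python
from math import comb

TOTAL_BALLS = 10_000
HOLES = 1_000
RED, BLACK = 10, 40

def prob_none_drawn(total, drawn, marked):
    """Chance that a draw without replacement misses every marked ball."""
    return comb(total - marked, drawn) / comb(total, drawn)

p = prob_none_drawn(TOTAL_BALLS, HOLES, RED + BLACK)
print(f"P(no red and no black ball in the holes) = {p:.4%}")
```

Under that assumption the result comes out at roughly half a percent, consistent with "lower than 1%".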
I've always wondered how much injustice is perpetrated by drug screening on large populations, since false positives do occur and statistically must occur twice in a row at least some of the time, which is the threshold considered conclusive proof of abuse by most employers and the courts.
The article itself started out by oversimplifying the test. It would be an astounding coincidence if the test had both a 10% false-positive and a 10% false-negative rate. In fact, any normal test has a very different false-positive and false-negative rate. People who describe the test should mention both, not this meaningless "90% accurate" number.
The BBC article, while claiming to want to reduce confusion, actually perpetuates the problem by using the meaningless "90%" number instead of the specific positive and negative failure rates. If every article describing tests would quote both failure rates, that would go a long way to getting people to understanding the situation.
I hate it when I make a joke and I get modded "+5 insightful". Mod the stupid comments "funny", not "insightful", pleas
"Works 90% of the time" here means that it will correctly identify a person as terrorist or not-terrorist in 90% of tests.
On a sample of 3000 with accuracy 90%, you will end up with 300 results guaranteed to be wrong or ambiguous, which may or may NOT mean the subject is a terrorist. To be safe, obviously you have to detain these "ambiguous" subjects.
Considering that we know the number of terrorists is incredibly small (from a UK perspective, I'd say something like 100 in 70 million, or 1 in 700,000, probably even less), we can deduce that these tools are guaranteed to victimize thousands of innocents (at least 69,999) for each "terrorist" ever caught.
-- Let's go Viridian.
Anyone can write software to look for a turban
This sort of racist bollocks is what has been getting people attacked in the US for wearing them, despite them being an optional part of the Muslim faith so most turban wearers are from entirely different religions which actually require them.
http://en.wikipedia.org/wiki/Turban
Please educate yourself before posting such drivel.
I dont read
The given version of "terrorist" is arbitrary and thus subject to change over time: from people who hijack planes with guns and explosives to, apparently nowadays, Iceland. However, I think that if you're starting with a number of 1 in 3000, you are so far from reality anyway that what you really want to do is harass innocent people.
Let's look at ALL the hijackings from 1970 to 2000, a total of 924 hijackings. I couldn't find more recent figures quickly, but let's assume that hijackings have continued at a rate of around 30 per year (the average from 1970-2000). That would add another 30 * 9 = 270 hijackings, for a total of 1194. OK, I will be generous: 1200 hijackings.
Now let's assume (and this is a BIG assumption - I am again going to be very generous) that TEN people, (the terrorists), board the plane for EACH hijacking event. So now we have 12,000 terrorists.
Now let's just look at the passenger data for the LAST YEAR ALONE for the top 5 airlines. They carried last year 420 million people. LAST YEAR. Now assuming that since 1970 till today there have been a total of 12000 "terrorists" (a VERY generous number), when you divide 420 million by that, you would be looking at 1:35,000 people being a "potential terrorist". However do remember that I am only including passenger data for ONE SINGLE YEAR. Assuming again a 90% accuracy, you are still wrongly intimidating well over 3500 people.
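The chain of estimates above can be laid out step by step. Every input is the poster's own figure or stated assumption; the variable names are invented for this sketch, and "90% accuracy" is read as a 10% false-positive rate:

```python
hijackings_1970_2000 = 924
assumed_per_year = 30
extra_years = 9
terrorists_per_hijacking = 10        # the poster's "generous" assumption
passengers_one_year = 420_000_000    # top 5 airlines, one year
accuracy = 0.9

hijackings = hijackings_1970_2000 + assumed_per_year * extra_years
hijackings_rounded = 1200            # rounded up from the total, as in the post
terrorists = hijackings_rounded * terrorists_per_hijacking
passengers_per_terrorist = passengers_one_year / terrorists
false_alarms_per_terrorist = passengers_per_terrorist * (1 - accuracy)

print(f"hijackings: {hijackings} (rounded to {hijackings_rounded})")
print(f"terrorists: {terrorists}")
print(f"passengers per terrorist: {passengers_per_terrorist:,.0f}")
print(f"false alarms per terrorist: {false_alarms_per_terrorist:,.0f}")
```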
If I were to go through year by year and add up the billions of people who have been transported by air, the chance of the person being screened actually being a terrorist drops to almost zero.
I will not argue against the value of security as a deterrent. However I think that airport security employees should be well aware that they are, more likely than not, harassing innocent people. Therefore all the excessive bullying, posturing, abuse, privacy and rights violations are completely unnecessary in this context. Airline terrorism is NOT a real threat, be it ever so dramatic on the few times when it does happen. Use technology to screen for the obvious, and lock the god damned cockpit door with a solid lock, for the not so obvious.
Seven puppies were harmed during the making of this post.
And here, silly statisticians use two numbers, alpha and beta, to represent failure rates. Someone needs to educate them that they really only need one number.
See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
The test accuracy is measured compared to the population tested. In fact, a test that consistently says "no cancer" in all cases is 99% accurate when run on the general population.
Wow, thanks! I was going to have this mole checked out at the doctor, you really just saved me a lot of time! I mean, I didn't understand your magic numbers, but if it means I don't have cancer, I'm for it!
I think they totally forget that there is ALSO a 10% possibility that you _don't_ detect the terrorist...
Watch this TED : http://www.ted.com/talks/peter_donnelly_shows_how_stats_fool_juries.html
Privacy is terrorism.
You'd have a really good point if there weren't actually bigoted assholes and/or ignorant people in the world who agree with the great-grand-parent. Earlier in my life I may have been one of them.
I remember painting Muslims with a very broad and unfair brush. People would tell me that all Muslims aren't bad and most want the same thing I do, peace and prosperity. Why don't they speak out against the bigoted extremist representatives then? I would ask.
I didn't have the slightest understanding of the culture and environment those types of ideas breed in and probably still don't. However, I can come out of my own bubble enough to ask myself the question - What motivation would I have to speak out against wrongs being done against a culture who shows repeated disrespect and ignorance for my own?
I'm not suggesting we adopt sharia law and that all North American women start wearing burqa as a sign of respect. There is a very thick line between embracing and adopting a culture and respecting it.
Turbans are worn by Sikhs. This is a completely different religion to Islam which is alleged to harbour these terrorists.
This sort of racist bollocks is what has been getting people attacked in the US for wearing them
What a load of crap. Back in 2006, NBC Dateline had a bunch of Muslims go to a NASCAR race to see if they would be harassed. Guess what: they were NOT bothered at all. This sort of idiotic bollocks is what perpetuates the myth that the US is full of racists.
Knowledge = Power
P= W/t
t=Money
Money = Work/Knowledge so the less you know the more you make
You are wrong. Let's suppose there are 60,000,010 people in the country, of which 10 are terrorists and 60,000,000 are not terrorists.
The test will incorrectly identify 6,000,000 of the non-terrorists (10% of them) as terrorists, and 1 of the terrorists as a non-terrorist.
What this means is that out of the 6,000,009 people it identifies as terrorists, only 9 actually are.
You'll know when your people are ready for statistics. . . don't even bother trying until state-run lotteries go broke for lack of players.
Er, not really. The usual cost-benefit, expected-payoff analysis doesn't really work when you're talking about extreme examples like winning the lottery, at least not with huge payoffs measured in tens or hundreds of millions of dollars. You can know, perfectly well, that the ROI on a lottery ticket is less than the cost of the ticket, and still consider it a perfectly rational investment.
If I buy $150 worth of groceries and throw in a $1 lottery ticket on top of it, the effective cost to me is zero. I'm never going to notice that dollar being gone. Not having that dollar is going to make no difference to my life. But in the (exceedingly unlikely, yes) event that I win a $100 million jackpot, the payoff is damn near infinite. Having that kind of money can't really be compared to, say, getting a raise, or seeing your stocks go up in the market. It's just on a whole different scale.
So in short: infinity - (0 * 10^-9) = infinity. Don't assume that everyone who buys a lottery ticket is ignorant. Actually, I suspect most people who buy lottery tickets are making this kind of calculation, even if they're not doing the numbers quite as explicitly.
Here's an example in the opposite direction, which I think will make things a little more clear. Suppose I were to set up a "reverse lottery," which works as follows. You have, let's say, a net worth of $100,000. If you sign up for my lottery, I pay you a dollar. Then you pick six numbers between 1 and 10, I draw six balls out of urns, and if the numbers match ... I take everything you own. Your house, your car, your computer, the clothes off your back. You're turned out on the street.
In probabilistic terms, it would make perfect sense for you to play. 1 - (100000 * 10^-6) = 0.9, which means that the game has a positive expected payoff. In fact, it would make sense for you to play a lot, up to whatever limit is allowed, let's say once a day. But would you do it? I kind of doubt you would, because every day, you'd be looking at that one-in-a-million chance of having your life shattered. Most people would consider that a bad risk, no matter what the raw numbers say. And people who play the lottery consider it a pretty good risk for the same reason.
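The expected-payoff figure can be reproduced exactly. This sketch takes the poster's model at face value (each of the six picks matches with chance 1/10, so a full match has probability 10^-6); the constant names are made up, and the once-a-day year of play is the scenario suggested above:

```python
from fractions import Fraction

PAYMENT = 1                      # dollars paid to the player per game
NET_WORTH = 100_000              # dollars forfeited on a full match
P_MATCH = Fraction(1, 10) ** 6   # six picks, each assumed to match with chance 1/10

expected_payoff = PAYMENT - NET_WORTH * P_MATCH
print(f"expected payoff per game: ${float(expected_payoff):.2f}")

# Chance of losing everything at least once in a year of daily play
p_ruin_year = 1 - (1 - P_MATCH) ** 365
print(f"chance of ruin within a year: {float(p_ruin_year):.4%}")
```

The expectation is positive, yet the downside is total, which is the asymmetry the argument turns on.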
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
A broken clock is right at least twice a day.
Yet you failed to learn from that post that making insulting jokes about how anyone wearing a turban in the US can be beaten senseless "because they're a Muslim terrorist" is unacceptable in public
Unacceptable to whom? And why?
As I read the joke, it is making fun of ignorant American rednecks. While I guess some people might find painting such people with such a broad brush insulting or offensive, I don't see it, myself.
Apparently you a) have a different interpretation of the joke and b) feel that your interpretation justifies declaring the joke "unacceptable in public." I don't get your logic, and I certainly don't appreciate your arbitrary and unjustified declaration regarding what is or is not acceptable behaviour "in public".
This is particularly true since /. has a significant world-wide readership--if you clowns can't control your bigots that's your problem, not justification for declaring, in typically American imperialist fashion, what is and is not acceptable here in this international, albeit US-dominated, forum.
Ok, the problem with this comment is that it is now a) exactly what I feel and b) -1 flamebait. Oh well.
Blasphemy is a human right. Blasphemophobia kills.
I think that's called the "prosecutor's fallacy." If there's a 1/10,000 chance of a child dying of cot death, and a woman has two children die of cot death, the prosecutor tells the jury that the chances could only be 1/10,000 * 1/10,000 = 1/100 million that both deaths were a cot death, so she must have murdered them.
This only works if the deaths are statistically independent, which they're not. The parents could have a genetic defect which causes 2 successive infants to die.
If each parent had 1 fatal recessive genetic defect, then 1/4 of their children would die, so the odds are 1/16 that two successive children would die. But actually a lot of fatal birth defects are more complicated than that simple Mendelian pattern.
It's even more complicated because some mothers have been captured on video trying to smother their children.
Math has a way of warping almost anything. Take the miles per gallon rating we use in the US to tell us how efficient our cars are. Miles per gallon is actually a very misleading measurement. What we should probably use is gallons per mile, or gallons per 100 miles.
Take an example where a Range Rover gets 14 MPG, a Toyota Rav4 gets 24 mpg, and a Prius gets 46 mpg. It isn't intuitive based on the miles per gallon, but moving from the Range Rover to the Rav4 saves more fuel than moving from the Rav4 to the Prius. That is because people don't drive a fixed number of gallons, but drive (more or less) a fixed number of miles. When you look at the gallons used per 100 miles it is clear. The Range Rover uses 7.14 gallons per 100 miles, while the Rav4 uses 4.17 and the Prius 2.17. So it is clear that changing from a Range Rover to a Rav4 will save almost 3 gallons per 100 miles, while changing from a Rav4 to a Prius only saves 2 gallons per 100 miles.
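A quick sketch of the conversion, using the three mileage figures quoted above (the function name is made up for illustration):

```python
def gallons_per_100_miles(mpg):
    """Convert miles per gallon into gallons burned per 100 miles."""
    return 100 / mpg

cars = {"Range Rover": 14, "Rav4": 24, "Prius": 46}
usage = {name: gallons_per_100_miles(mpg) for name, mpg in cars.items()}

for name, gallons in usage.items():
    print(f"{name}: {gallons:.2f} gal per 100 miles")

saving_rover_to_rav4 = usage["Range Rover"] - usage["Rav4"]
saving_rav4_to_prius = usage["Rav4"] - usage["Prius"]
print(f"Range Rover -> Rav4 saves {saving_rover_to_rav4:.2f} gal per 100 miles")
print(f"Rav4 -> Prius saves {saving_rav4_to_prius:.2f} gal per 100 miles")
```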
A family with two children is chosen at random from a large population.
If I tell you only that they have at least one daughter, what is the probability that both children are girls?
Most people can get that one (it's 1/3), but fail miserably on this question:
You are incorrect. Your statistic would be true if we were randomly picking families with two children until we came across one with (at least) one girl. There are 1/3 odds there that we'd pick one with two girls, and 2/3 that we'd pick one with just one.
However, that is not what you said. You said we picked the groups at random, and, hence, telling us the gender of a child tells us nothing about the other one. The genders are entirely independent of each other.
You can see how that works by imagining that the second child has not actually been born yet.
Or imagine it as coin flips. If I announce the result of one coin flip, it's not going to alter the other. If I make pairs of coin flips, and deliberately select a pair that has at least one tails in it, however, I have removed certain flips from the odds.
You actually understand this in your second example, and get the right odds, but surreally miss it in the first, despite using exactly the same example. If only girls are named Mary, saying one is named Mary is exactly identical to saying one is a girl. Your two examples are the same. You meant for your first example to be:
If we pick out two parents who have at least one girl, what are the odds that their other child is a girl?
That has the odds of 1/3, because the possibilities are M/F, F/M, and F/F.
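The two readings being argued over can be separated by enumerating the four possible families. This is a sketch that assumes boys and girls are equally likely and births are independent; the helper names are made up:

```python
from fractions import Fraction
from itertools import product

# (older, younger) for every two-child family, all assumed equally likely
families = list(product("BG", repeat=2))

def prob(event, given):
    """Conditional probability of `event` among families satisfying `given`."""
    pool = [f for f in families if given(f)]
    return Fraction(sum(1 for f in pool if event(f)), len(pool))

def both_girls(family):
    return family == ("G", "G")

# Reading 1: keep only families known to have at least one girl
p_at_least_one_girl = prob(both_girls, lambda f: "G" in f)

# Reading 2: one particular child (here the older) is revealed to be a girl
p_specific_child_girl = prob(both_girls, lambda f: f[0] == "G")

print(p_at_least_one_girl, p_specific_child_girl)
```

Which answer applies depends entirely on how the "at least one daughter" information was obtained, which is the point in dispute.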
If corporations are people, aren't stockholders guilty of slavery?
... and a utility function too!
The article is confusing because it doesn't indicate the false negative rate. You basically need to know the entire confusion matrix before inferring anything. This way, you can not only calculate the accuracy and the false positive rate, but you can also calculate the false negative rate, the precision and the recall. Precision and recall are much more useful metrics than accuracy when it comes to tests like these.
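As a sketch of what the full confusion matrix gives you, here are those metrics computed from the four cell counts. The counts are hypothetical (10,000 people screened, 10 real targets, 10% error each way), chosen only to illustrate, and the function name is made up:

```python
def screen_metrics(tp, fp, fn, tn):
    """Derive the usual summary metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "precision": tp / (tp + fp),   # share of flagged people who are targets
        "recall": tp / (tp + fn),      # share of targets who get flagged
    }

# Hypothetical counts, not taken from either article
metrics = screen_metrics(tp=9, fp=999, fn=1, tn=8991)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts accuracy and recall are both 90% while precision is below 1%, which is the gap the single "90% accurate" number hides.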
Also, you need to know how much it really costs you to have false negatives and false positives. If you accuse someone erroneously of being a terrorist, and the only inconvenience is a few extra minutes of body search (and the humiliation) at the airport, it *might* still be worth the trouble. If on the other hand you end up sending the poor dude to jail, and he sues you for wrongful conviction, then not so much. You therefore need to have a utility function that assesses the cost of getting it right and wrong both ways (positive and negative). That's basically what is discussed in the other article (the cost of cancer screening tests), albeit in an informal way.
"In our tactical decisions, we are operating contrary to our strategic interest."