Slashdot Mirror


Anti-Terrorist Data Mining Doesn't Work Very Well

Presto Vivace and others sent us this CNet report on a just-released NRC report coming to the conclusion, which will surprise no one here, that data mining doesn't work very well. It's all those darn false positives. The submitter adds, "Any chance we could go back to probable cause?" "A report scheduled to be released on Tuesday by the National Research Council, which has been years in the making, concludes that automated identification of terrorists through data mining or any other mechanism 'is neither feasible as an objective nor desirable as a goal of technology development efforts.' Inevitable false positives will result in 'ordinary, law-abiding citizens and businesses' being incorrectly flagged as suspects. The whopping 352-page report, called 'Protecting Individual Privacy in the Struggle Against Terrorists,' amounts to at least a partial repudiation of the Defense Department's controversial data-mining program called Total Information Awareness, which was limited by Congress in 2003."

4 of 163 comments

  1. The actual report by Americano · · Score: 4, Informative

    I know this is Slashdot and all, but if anybody's actually interested in looking at the full report, it's available for reading in PDF format online.

  2. Paradox of the False Positive by gknoy · · Score: 4, Informative

    I realize this is likely starting to sound old, but Cory Doctorow's Little Brother should be required reading for people doing something like this. He lays out the "Paradox of the False Positive" there, and also in other sources:

    http://www.guardian.co.uk/technology/2008/may/20/rare.events

    Statisticians speak of something called the Paradox of the False Positive. Here's how that works: imagine that you've got a disease that strikes one in a million people, and a test for the disease that's 99% accurate. You administer the test to a million people, and it will be positive for around 10,000 of them because for every hundred people, it will be wrong once (that's what 99% accurate means). Yet, statistically, we know that there's only one infected person in the entire sample. That means that your "99% accurate" test is wrong 9,999 times out of 10,000!

    Terrorism is a lot less common than one in a million, and automated "tests" for terrorism (data-mined conclusions drawn from transactions, Oyster cards, bank transfers, travel schedules, etc.) are a lot less accurate than 99%. That means practically every person who is branded a terrorist by our data-mining efforts is innocent.

    (emphasis mine)

    And, as others have pointed out, this system is likely to have a false positive rate higher than 1%.
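
For anyone who wants to check the arithmetic in the quote, here is a minimal sketch using Bayes' rule. It assumes "99% accurate" means both that 99% of real cases are caught and that 1% of everyone else is wrongly flagged; the quote does not spell that out, and the function and variable names are made up for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability that a flagged person is a real case (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


population = 1_000_000
prevalence = 1 / population    # one real case per million, as in the quote
sensitivity = 0.99             # assumption: 99% of real cases are flagged
false_positive_rate = 0.01     # assumption: 1% of everyone else is flagged

# Expected number of innocent people flagged when the whole population is tested.
expected_false_alarms = (population - 1) * false_positive_rate

# Chance that any one flagged person is actually a real case.
ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)

print(f"expected false alarms: {expected_false_alarms:,.0f}")
print(f"chance a flagged person is a real case: {ppv:.6f}")
```

Under those assumptions the false alarms number about ten thousand, and the chance that a flagged person is a real case is roughly one in ten thousand, which matches the quote. Lowering `prevalence` or raising `false_positive_rate` only makes it worse.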

  3. Re:I'd run on that platform. by Geoffrey.landis · · Score: 4, Informative

    The no fly list doesn't identify people, just names, and it's very exact, so changing charles to chuck will defeat it.

    No, actually it won't. The newspapers are full of stories of people who were detained or forbidden from flying because their name was similar to a name on the list, or a nickname of a name on the list, or a possible alternative spelling of a name on the list, or names that had once been used as an alias of names on the list.

    For example, the name "T. Kennedy" was on the list. Senator Edward Kennedy (whose name does not begin with "T", but who is nicknamed "Teddy") was stopped.
    From Wikipedia:

    "In August 2004, Senator Ted Kennedy (D-MA) told a Senate Judiciary Committee discussing the No Fly List that he had appeared on the list and had been repeatedly delayed at airports. He said it had taken him three weeks of appeals directly to Homeland Security Secretary Tom Ridge to have him removed from the list. Kennedy said he was eventually told that the name "T Kennedy" was added to the list because it was once used as an alias of a suspected terrorist. There are an estimated 7,000 American men whose legal names correspond to "T Kennedy". (Senator Kennedy, whose first name is Edward and for whom "Ted" is only a nickname, would not be one of them.)"

    --
    http://www.geoffreylandis.com

  4. Re:I'd run on that platform. by Free+the+Cowards · · Score: 4, Informative

    It doesn't matter, because the only place where you have to get your ID checked is at the TSA checkpoint, and they don't check it against any databases.

    So, the easy recipe for bypassing the no-fly list is:

    1. Purchase tickets in a fake name.
    2. Check in at home before your flight, and print your boarding pass on your home printer.
    3. Using any number of techniques which are trivial to the computer literate, capture that boarding pass, alter it to match your real name, and print a second copy.
    4. When you arrive at the airport, go straight to the security checkpoint.
    5. Use the altered pass with your real name in combination with your real ID to get through security.
    6. Use the original, non-altered pass to board the plane.

    I flew as recently as last month and was not subjected to anything which would defeat this scheme. It fails if you need to check luggage, but I doubt a terrorist is going to be doing that. The no-fly list is such an obvious joke.

    --
    If you mod me Overrated, you are admitting that you have no penis.