Slashdot Mirror


Anti-Terrorist Data Mining Doesn't Work Very Well

Presto Vivace and others sent us this CNet report on a just-released NRC report coming to the conclusion, which will surprise no one here, that data mining doesn't work very well. It's all those darn false positives. The submitter adds, "Any chance we could go back to probable cause?" "A report scheduled to be released on Tuesday by the National Research Council, which has been years in the making, concludes that automated identification of terrorists through data mining or any other mechanism 'is neither feasible as an objective nor desirable as a goal of technology development efforts.' Inevitable false positives will result in 'ordinary, law-abiding citizens and businesses' being incorrectly flagged as suspects. The whopping 352-page report, called 'Protecting Individual Privacy in the Struggle Against Terrorists,' amounts to at least a partial repudiation of the Defense Department's controversial data-mining program called Total Information Awareness, which was limited by Congress in 2003."

6 of 163 comments

  1. The actual report by Americano · · Score: 4, Informative

    I know this is Slashdot and all, but if anybody's actually interested in looking at the full report, it's available for reading in PDF format online.

  2. Paradox of the False Positive by gknoy · · Score: 4, Informative

    I realize this is likely starting to sound old, but Cory Doctorow's Little Brother should be required reading for people doing something like this. His writings about the "Paradox of the False Positive" are enumerated there, but also in other sources:

    http://www.guardian.co.uk/technology/2008/may/20/rare.events

    Statisticians speak of something called the Paradox of the False Positive. Here's how that works: imagine that you've got a disease that strikes one in a million people, and a test for the disease that's 99% accurate. You administer the test to a million people, and it will be positive for around 10,000 of them because for every hundred people, it will be wrong once (that's what 99% accurate means). Yet, statistically, we know that there's only one infected person in the entire sample. That means that your "99% accurate" test is wrong 9,999 times out of 10,000!

    Terrorism is a lot less common than one in a million, and automated "tests" for terrorism (data-mined conclusions drawn from transactions, Oyster cards, bank transfers, travel schedules, etc.) are a lot less accurate than 99%. That means practically every person who is branded a terrorist by our data-mining efforts is innocent.

    (emphasis mine)

    And, as others have pointed out, this system is likely to have a false positive rate higher than 1%.
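
The arithmetic in the quoted passage can be checked with a few lines of Python. This is only a rough sketch: it assumes that "99% accurate" means the test is right 99% of the time for both infected and healthy people, which the quote does not spell out, and the function and variable names are made up for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability that a flagged person is a true case (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Numbers from the quoted example: one case per million, "99% accurate" test.
population = 1_000_000
prevalence = 1 / population
accuracy = 0.99  # assumption: applies equally to sick and healthy people

# Expected counts when the whole population is tested.
expected_true_flags = population * prevalence * accuracy
expected_false_flags = population * (1 - prevalence) * (1 - accuracy)

ppv = positive_predictive_value(prevalence, accuracy, 1 - accuracy)

print(f"expected true flags:  {expected_true_flags:.2f}")
print(f"expected false flags: {expected_false_flags:.0f}")
print(f"chance a flagged person is a real case: {ppv:.6f}")
```

Under those assumptions the false flags outnumber the true ones by roughly 10,000 to 1, which matches the figure in the quote. Lowering `prevalence` or `accuracy` makes the ratio worse.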

  3. Re:I'd run on that platform. by Geoffrey.landis · · Score: 4, Informative

    The no fly list doesn't identify people, just names, and it's very exact, so changing charles to chuck will defeat it.

    No, actually it won't. The newspapers are full of stories of people who were detained or forbidden from flying because their name was similar to a name on the list, or a nickname of a name on the list, or a possible alternative spelling of a name on the list, or names that had once been used as an alias of names on the list.

    For example, the name "T. Kennedy" was on the list. Senator Edward Kennedy (whose name does not begin with "T", but who is nicknamed "Teddy") was stopped. From Wikipedia:

    "In August 2004, Senator Ted Kennedy (D-MA) told a Senate Judiciary Committee discussing the No Fly List that he had appeared on the list and had been repeatedly delayed at airports. He said it had taken him three weeks of appeals directly to Homeland Security Secretary Tom Ridge to have him removed from the list. Kennedy said he was eventually told that the name "T Kennedy" was added to the list because it was once used as an alias of a suspected terrorist. There are an estimated 7,000 American men whose legal names correspond to "T Kennedy". (Senator Kennedy, whose first name is Edward and for whom "Ted" is only a nickname, would not be one of them.)"

    --
    http://www.geoffreylandis.com
  4. Re:Seems by Martin Blank · · Score: 2, Informative

    I'm actually well aware of how intelligence works. Merely cultivating contacts is an arduous process, because pushing it too fast can cause them to become suspicious and either stop talking to or actively turn on the recruiter. Some are eager to provide what the recruiter wants, and some take years to provide any useful information.

    Your 80/20 assertion is at least partially incorrect. If it were correct, the US would have been far less worried about the Soviet space program in the latter part of the 1960s, and we'd be spending less effort keeping certain sensitive technologies from getting out to various other entities. We wouldn't spend billions on the NRO, and the NSA wouldn't need to keep upgrading its SIGINT capabilities each year.

    There are situations where you have to interface with informants who are part of the entity being watched, and some of those informants aren't people with whom the US government wants its dealings made public. Congress had a small fit about that in the 1990s, and it made life difficult for field agents.

    --
    You can never go home again... but I guess you can shop there.
  5. Re:I'd run on that platform. by Free the Cowards · · Score: 4, Informative

    It doesn't matter, because the only place where you have to get your ID checked is at the TSA checkpoint, and they don't check it against any databases.

    So, the easy recipe for bypassing the no-fly list is:

    1. Purchase tickets in a fake name.
    2. Check in at home before your flight, and print your boarding pass on your home printer.
    3. Using any number of techniques which are trivial to the computer literate, capture that boarding pass, alter it to match your real name, and print a second copy.
    4. When you arrive at the airport, go straight to the security checkpoint.
    5. Use the altered pass with your real name in combination with your real ID to get through security.
    6. Use the original, non-altered pass to board the plane.

    I flew as recently as last month and was not subjected to anything which would defeat this scheme. It fails if you need to check luggage, but I doubt a terrorist is going to be doing that. The no-fly list is such an obvious joke.

    --
    If you mod me Overrated, you are admitting that you have no penis.
  6. Re:Seems by Martin Blank · · Score: 2, Informative

    Yes, I know about OSINT. It still doesn't replace SIGINT, which cannot replace HUMINT. They're all interlocking pieces of the intelligence realm. HUMINT is more expensive than OSINT, and SIGINT is more expensive than HUMINT. Costs for all of them reach points of diminishing returns. A satellite that shows movements in real time at 1m resolution is better than nothing. Improving that to .5m may cost ten times as much but deliver only five times the value. Improving it to .1m may cost 100 times as much but deliver only 20 times the value.
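
To put numbers on the diminishing-returns point, here is a rough Python sketch using the illustrative figures from the paragraph above. It assumes the 1m satellite is the baseline (cost 1, value 1) and that both the "ten times" and "100 times" multipliers are relative to that baseline; the comment does not say so explicitly, and the names are hypothetical.

```python
# (resolution in metres, relative cost, relative value)
# Figures are the comment's own illustrative numbers, not real program costs.
# Assumption: all multipliers are relative to the 1 m baseline.
options = [
    (1.0, 1, 1),
    (0.5, 10, 5),
    (0.1, 100, 20),
]


def value_per_cost(cost, value):
    """Relative value delivered per unit of relative cost."""
    return value / cost


ratios = {res: value_per_cost(cost, value) for res, cost, value in options}

for res, ratio in ratios.items():
    print(f"{res} m resolution: {ratio:.2f} value per unit of cost")
```

Each step up in resolution still adds value in absolute terms, but the value per unit of cost falls at every step, which is the point about diminishing returns.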

    Any good intelligence network makes use of everything that it can, whether newspapers, forum posts, criminal contacts, or radio intercepts. All of it is important.

    --
    You can never go home again... but I guess you can shop there.