Slashdot Mirror


Data Mining In Law Enforcement

jcatcw points out a blog entry by Scott McPherson, CIO for the Florida House of Representatives. McPherson condemns the state of data sharing and data mining in law enforcement, saying that the US causes itself a great deal of trouble by focusing more on "antiterror armor and nuke-sniffing devices" than a useful information distribution network. He discusses a few such projects, and how they could have directly affected the events of 9/11. Quoting: "One of those ingenious things that actually worked, Seisint founder Hank Asher's brilliant MATRIX system, remains mired in controversy and politics. Hank showed me MATRIX just a few short weeks after the 9/11 attacks. Using law enforcement data and commercial data, all of the commercial data available in the public domain, Asher's query produced [hijacker Mohamed] Atta's photo -- and about 80 others, many of them fellow 9/11 hijackers, many of them associates of the 9/11 hijackers. It was simple data mining and algorithms, and none of the information was obtained illegally."

17 of 148 comments

  1. Hold on a minute here by goldcd · · Score: 4, Insightful

    so he managed to write some software that analyzed the internet - and managed to produce photos of some of the people that erm had already erm been identified. Surely (and maybe I've misunderstood something here) a 'result' would be identifying people likely to commit terrorist attacks, allowing enforcement agencies to monitor them and prevent them from committing future attacks. (and no - this doesn't mean off-shoring every Muslim who downloaded the Jolly Roger Cookbook).

    1. Re:Hold on a minute here by k1e0x · · Score: 3, Insightful

      Yeah, he wrote software that detects terrorists after they have committed a crime.. Its key component searches google news. heh.

      But really. Lots of people *may* commit crimes. Computers may decide you are likely to rob a bank tomorrow; that does not mean you will. We need to make sure the law is always about what you do, not what a computer projects you're going to do. The day we jail people who *might* be about to commit a crime is the day we put people in jail for their thoughts.

      --
      Bringing liberty to the masses. - http://freetalklive.com/
    2. Re:Hold on a minute here by Alpha830RulZ · · Score: 4, Insightful

      If you assume for a minute that the author of TFA is smart enough to figure out if this was a google search or not, this is probably pretty interesting. I'm going to, perhaps naively, assume that the data mining approach was done as a reasonable experiment of a mining approach on some set of data, and arrived at a set of names that should be interesting to check up on. I'll further assume that he properly restricted his training set of data to only data that was available before 9/11.

      If that is the case, this is a pretty impressive set of results. Being able to identify, say, 5 of the attackers, and to have a number of the other hits be known associates, when the training set likely consisted of at least tens of thousands of names, is pretty fair accuracy. The false positive rate is pretty fair, as well, especially when you contrast it to the No Fly list, which has numerous false positives, and no known successes in identifying anyone of interest.
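
A rough way to put numbers on "pretty fair accuracy" is to compare the hit rate inside the query results with the hit rate you would get by picking names at random. The sketch below is only an illustration: the 5 identified attackers is the "say, 5" from the paragraph above, the 81 returned records is Atta plus the "about 80 others" from the summary, the 19 targets is the number of 9/11 hijackers, and the 50,000-name population is an assumed stand-in for "tens of thousands of names". The function name `precision_and_lift` is made up for this example.

```python
def precision_and_lift(hits, returned, targets, population):
    """Compare a query's hit rate with the hit rate of random selection.

    hits:       records in the result set that are real targets
    returned:   total records in the result set
    targets:    real targets in the whole data set
    population: total records in the whole data set
    """
    precision = hits / returned        # share of returned records that are targets
    base_rate = targets / population   # share of all records that are targets
    lift = precision / base_rate       # how much better than random the query is
    return precision, lift


# All four inputs are illustrative assumptions, not figures reported for MATRIX.
precision, lift = precision_and_lift(hits=5, returned=81, targets=19, population=50_000)
print(f"precision: {precision:.1%}")
print(f"lift over random selection: {lift:.0f}x")
```

Under these assumed inputs most of the returned names are still not attackers, but the query does far better than chance, which is the sense in which the result could be called impressive.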

      There is likely some sort of clustering algorithm behind this, and the math behind those is pretty solid. Before you dis this, or even get excited about privacy issues, I'd suggest you check out a reference such as this

      I'm not really concerned about data mining as a privacy issue, and I think it's a pretty legitimate approach for law enforcement. As a side note, I do data mining and predictive analytics for a living. It's objective, it's factual, and if the practitioner is knowledgeable about it, it shouldn't be stigmatizing. Indeed, it would reduce scrutiny on the majority of the folks that would otherwise be tarred by having an Arabic surname and swarthy skin.

      It would have the potential to be vastly more effective, and vastly less expensive than the path we are on now. One reason that we might not be using it could be that we -have- used it, and didn't find anything. That's the thing about objective data mining, if there is nothing there, it'll tell you that. I don't think, for our current administration, that it's a desirable outcome to find that there is nothing to worry about. If that happened, the populace would be less fearful, and less easy to control.

      Take this one step further, and apply this bit of thought. It has been shown time and again that the TSA is incompetent, and that any motivated terrorist could get a weapon on board a plane. It is further obvious that our ports are porous, and that soft targets abound. We have seen no triumphant pictures of the authorities frog marching attempted terrorists away, no success stories of how these measures have saved our lives again. We have also seen no further attacks.

      This strongly suggests to this practitioner that we have a near zero incidence rate of terrorists in the US; that when a terrorist attempts an attack, he succeeds, and that the lack of attacks suggests that the attack rate is close to zero.

      Data mining would be a useful tool to calibrate this theory.

      --
      I was taught to respect my elders. The trouble is, it's getting harder and harder to find some.
    3. Re:Hold on a minute here by networkBoy · · Score: 2, Insightful

      Let me start by saying I agree with you 100%.

      Now for the thought experiment.
      Stipulation: The computer produces 0.00% false positive identifications.
      The computer identifies a suspect as 100% likely to rob a bank (he's at the teller window, has demanded cash and is pointing a gun) is it OK to arrest him?
      The computer identifies a suspect as 99.9% likely to rob a bank (he's next in line for a teller, has a gun and a demand note) is it OK to arrest him?
      The computer identifies a suspect as 99% likely to rob a bank (he's at the door, gun and note in hand) blah blah blah.
      At what point is the line drawn that the computer's threshold is high enough to warrant an arrest?

      Naturally the stipulation is false. No automated system will have *no* false positives. Schneier has shown how even a 99.99% accurate system is useless when you have millions of "things" to track and identify as threats...
      Just curious as to your thoughts?
      -nB
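
The base-rate argument mentioned above can be worked through with a few lines of arithmetic. This is a minimal sketch under assumed numbers: a screened population of 300 million, 100 real threats among them, and a system that is 99.99% accurate in both directions (sensitivity and specificity). None of these figures come from the thread apart from the 99.99%, and the function name `positive_predictive_value` is hypothetical.

```python
def positive_predictive_value(population, true_threats, sensitivity, specificity):
    """Return (total flagged, false positives, share of flags that are real)."""
    innocents = population - true_threats
    true_positives = true_threats * sensitivity       # real threats correctly flagged
    false_positives = innocents * (1.0 - specificity)  # innocents wrongly flagged
    flagged = true_positives + false_positives
    ppv = true_positives / flagged if flagged else 0.0
    return flagged, false_positives, ppv


# Assumed inputs for illustration only.
flagged, false_pos, ppv = positive_predictive_value(
    population=300_000_000,
    true_threats=100,
    sensitivity=0.9999,
    specificity=0.9999,
)
print(f"people flagged:       {flagged:,.0f}")
print(f"false positives:      {false_pos:,.0f}")
print(f"chance a flag is real: {ppv:.2%}")
```

With these assumptions the system raises roughly 30,000 false alarms to catch about 100 real threats, so well under 1% of the people it flags are actually dangerous. That is why a very high accuracy figure does not by itself justify acting on a flag.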

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    4. Re:Hold on a minute here by Anonymous Coward · · Score: 1, Insightful

      The computer identifies a suspect as 100% likely to rob a bank (he's at the teller window, has demanded cash and is pointing a gun) is it OK to arrest him?

      Well, yes. He's in the process of committing a crime.

      The computer identifies a suspect as 99.9% likely to rob a bank (he's next in line for a teller, has a gun and a demand note) is it OK to arrest him?

      If the bank has a sign on it saying that possession of a weapon inside is illegal, then yeah.

      The computer identifies a suspect as 99% likely to rob a bank (he's at the door, gun and note in hand) blah blah blah.

      Then the cop could wait until he walks through the door.

      Of course, beyond the fact that the stipulation is false, these are all very contrived situations (that the computer is aware of the person holding the gun, the note etc, and that the computer waits until the last instant before making its decision) that aren't going to have anything to do with how "data mining" systems will be used in the real world. It's going to be more like "Timothy McVeigh called 555-1212 once a week for the month before he blew up a government building, and now John Doe is calling 555-1212 once a week." Maybe the computer will "know" that 555-1212 is the most famous pizza parlor in Oklahoma City. Maybe the computer will know it but will ignore this because maybe the pizza delivery guy is a terrorist. Hell, maybe the pizza delivery guy IS a terrorist, and you can call up and order the "daily special" to the address of your enemies.

  2. Or not by geekoid · · Score: 1, Insightful

    "... obtained illegally"

    As counter intuitive as it may seem at first, agencies have strict rules on this kind of behavior.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    1. Re:Or not by shmlco · · Score: 5, Insightful

      Well, as in many things it would seem that there's a loophole or two involved. While there are many restrictions placed on government in terms of data collection and data mining, there are few placed on individual businesses who do the same thing (think credit agencies). As such, there's little stopping the government from simply contracting out its needs to private companies.

      --
      Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
    2. Re:Or not by mOdQuArK! · · Score: 2, Insightful

      On an individual-by-individual basis perhaps, but if rule-breakers are regularly, visibly & effectively punished, then statistically speaking, an organization will have fewer rule breakers.

  3. Hindsight is 20/20 by garcia · · Score: 4, Insightful

    Wow, really? You were able to identify after the fact? Great! Real useful -- that and the fact that it's much easier to find that information when you are looking for a specific result. If this guy had come out and said, "hey, I was able to find those people before the fact," then I'd be impressed.

    1. Re:Hindsight is 20/20 by Chris+Burke · · Score: 4, Insightful

      Yeah, I've got a mother-fucking perfect Suicide Bomber detector. It never fails. 100% specificity, 100% sensitivity. Here's how it works (it's patented, so my lucrative business is not in danger by sharing my methods):

      I stand around a marketplace in Baghdad. When a guy runs up to a crowd, screams "Allah Akhbar", pulls a string on his coat, and fucking explodes all over the place, I point at the spot where he used to be, and say "That was a suicide bomber".

      And before you try to horn in on my business, know that I've already sold the DoD enhancements to my algorithm that covers cases where the bomber doesn't scream "Allah Akhbar", or where the bomber is a she not a he, or where the explosives are in a car not a coat. Or combinations thereof.

      But seriously, it says that "his query" produced Atta's photo (and 80 others only some of which apparently had anything to do with 9/11). What exactly was this query? "9/11 hijackers"? "terrorists named Atta"? "Arabs who've been pulled over"? So Atta's driving citations mean it was theoretically possible for someone to pull his name up. The question is, why would they have done this? What would have motivated someone to perform that query, and how exactly does data mining driving citations lead to the important conclusion that Atta was a terrorist?

      The article makes good points that data sharing between law enforcement agencies is a good thing, and helps with such rather mundane things as finding fugitives who skip out on parole, or people who don't show up for court dates. But that MATRIX nonsense is yet another attempt to cash in on post-9/11 anti-terror funding bonanzas. Which, now that I've gotten my slice of the pie, I'm against. :)

      --

      The enemies of Democracy are
  4. Bad news actually by iamacat · · Score: 3, Insightful

    The same techniques will likely be effective for identifying the most effective protestors against the current administration, or people that can be most effectively exploited sexually, financially or politically. In fact, terrorists generally cover their tracks much better than innocent civilians.

  5. Re:I could find 19 terrorists in like 5 minutes! by nbert · · Score: 2, Insightful

    Plus in this adult version of the game people tend to ignore that the next top terrorist will not have a profile on www.myspace.com/insaneplancehijacker/, because he/she knows that data mining exists. Legislators and the public in most western countries tend to ignore that any new countermeasures/laws will result in instant adaptation on the other side.

    Especially at airports I sometimes get so angry about all the silliness that I play some mind-game with the aim of blowing it all up. My current favorite is to put all kinds of fluids in my hand-luggage to distract them from my laptop. I'd simply replace the MBP's CD-Drive with C4 (and some perfectly centered metal rings to make it look like the actual bay). I'm sure it would work out.
    On the other hand I'm quite aware that some circumstances make it easier for me: Blond hair and no beard, terrorists use Dell ;) and I know some European airports which don't even check your luggage if you have a gallon of fluids in your hand luggage (I usually realize on the security check flying back). Heathrow for example is busier enforcing their non-smoking policy and tracking lost luggage. If you wanted to transport a nuke Heathrow would be the place to start your journey. But if you are simply looking for a pleasant flight avoid it at all costs :D

  6. Re:License plates by Anonymous Coward · · Score: 2, Insightful

    I've always wondered why they don't equip police cars with a video camera and the ability to OCR every single plate that comes into view

    There are already systems like this deployed. I don't know specifically where, but I receive a Law Enforcement monthly magazine and I've seen many ads for exactly this type of product.

    A quick search for 'automated license plate' on google brings up a bunch of relevant results if you're interested in finding out more.
  7. Re:Wonder how long until this is all public domain by Anonymous Coward · · Score: 2, Insightful

    I keep watching the bar for spying on people get lower and lower.

    First it was suspected enemy agentz.

    Now it's anyone suspected of a crime.

    What the hell are you talking about? People suspected of crimes have always been subject to spying, e.g. wiretaps.

  8. Re:I could find 19 terrorists in like 5 minutes! by Klaus_1250 · · Score: 2, Insightful

    Especially at airports I sometimes get so angry about all the silliness that I play some mind-game with the aim of blowing it all up.

    Last time I was at an airport dropping my sister off, I was thinking the exact same thing. I saw her going through the security-checkpoint and she had to turn on her laptop so they knew it wasn't a bomb. How silly is that: "could you please activate the potential on-switch of a bomb, so we can be sure it isn't a bomb?"

    Not sure if it is the same everywhere, but the security-checkpoint was pretty crowded, at least 50 at the checkpoint and 100 in close vicinity. If your goal, as a terrorist, is to instill fear, what better way to get people frightened to death of security checkpoints? As a bonus, you kill off some infidels and shut down the airport for several days (depending on the airport anywhere between hundreds of thousands to millions of dollars/euros/etc. of damage/loss)

    The reality is of course, that the "real terrorist masterminds" and their cells, won't do that. They attack important/unique symbols. The fact that people die in the process, economic damages arise, etc are just bonuses. So the only thing I have to worry about at security checkpoints are those who are in control of them, or some radical religious fruitcake reading slashdot.

    --
    It only takes one man to change the Wisdom of the Crowd to Tyranny of the Masses.
  9. Re:Algorithms are easy by rtb61 · · Score: 2, Insightful

    I think you have missed the ultimate frailty of data mining as well. It has much more to do with eliminating false data than jamming ever more of it into the mix. Once you have false data in there it contaminates all worthwhile intelligence, it creates false connections and obscures the truth.

    The most likely sources of false data are not the people they are trying to catch but supposedly legitimate sources pushing their own barrow: intelligent consultants trying to rack up hundreds of thousands of dollars in fees, incompetent local law enforcement on petty ego bloating revenge trips, federal agents too tightly involved in politics creating evidence for electioneering purposes and, of course, corporations abusing data in every way imaginable to justify hundreds of millions of dollars in contracts creating giant gigo data dumps, over and over again.

    Then of course there is the waste of intelligence and law enforcement resources in pursuing a mountain of worthless leads because somewhere in there are valid ones. Hmm, crime prevention, which is the most sensible route: based upon an initial lead you question a possible suspect, put them on alert and dissuade them from any further activity. Or you insert agents who further stir up the mix, who exacerbate a situation that would likely have petered out, who actually create the threat in order to gain an expensive conviction and costly imprisonment for a crime that was never committed, but it looks good for promotions.

    So is data mined evidence sufficient to trash someone's house, terrify their family, attend their place of employment and get them fired and, to not only threaten their future but actively destroy it, all of course completely free of any legal responsibility for those clearly immoral actions?

    --
    Chaos - everything, everywhere, everywhen
  10. Re:Maybe by Antique+Geekmeister · · Score: 2, Insightful

    Sure, now you and your girlfriend have a gun to call on in a family spat. You do realize how much more common domestic violence is than home invasions with someone present?