Slashdot Mirror


Data Mining In Law Enforcement

jcatcw points out a blog entry by Scott McPherson, CIO for the Florida House of Representatives. McPherson condemns the state of data sharing and data mining in law enforcement, saying that the US causes itself a great deal of trouble by focusing more on "antiterror armor and nuke-sniffing devices" than on a useful information distribution network. He discusses a few such projects and how they could have directly affected the events of 9/11. Quoting: "One of those ingenious things that actually worked, Seisint founder Hank Asher's brilliant MATRIX system, remains mired in controversy and politics. Hank showed me MATRIX just a few short weeks after the 9/11 attacks. Using law enforcement data and commercial data, all of the commercial data available in the public domain, Asher's query produced [hijacker Mohamed] Atta's photo -- and about 80 others, many of them fellow 9/11 hijackers, many of them associates of the 9/11 hijackers. It was simple data mining and algorithms, and none of the information was obtained illegally."

6 of 148 comments

  1. Islands of Automation by rlp · · Score: 4, Informative

    I've worked in the field of law enforcement data sharing. Fact is that most law enforcement agencies are either islands of automation or very loosely connected to other agencies. The stuff you see in TV and movies ("24") is a fantasy. Adjacent towns and cities rarely share information, and this lack of knowledge can put members of their police forces in danger (for instance, when making a traffic stop). A few years ago, the DOJ kicked off a sharing initiative with the Global Justice XML Data Model (GJXDM). This is an XML-based specification for exchanging law enforcement data that was developed at Georgia Tech. I was involved in an initiative in Ohio to share police record management system information at a state level. The system was deployed and is operational today. GJXDM has been superseded by the National Information Exchange Model (NIEM). It should be noted that the NIEM model is even more complex than its predecessor and tends to break many XML tools. The data exchanged tends to be fairly rudimentary and fairly sparse: arrests, bookings, warrants. Nevertheless, most agencies and most states have either not implemented data sharing or are in the earliest stages of doing so.

    --
    [Insert pithy quote here]
    1. Re:Islands of Automation by Ronin+Developer · · Score: 2, Informative

      I, too, worked with law enforcement data sharing and, as a senior engineer for a (probably THE) leader in law enforcement software, wrote an interface for our Ohio customers to access the OLLEISN system (and about 10 other data sharing systems as well).

      Personally, I think the company I worked for had a system that kicked the butts of the larger initiatives. It replicated in near real time, worked with incremental data, optimized network resources and bandwidth, was fault tolerant and highly scalable (from local to national level), and allowed the departments to keep their information in data silos and, thus, to control the release of the information. It was accessible using a desktop client or from our award-winning mobile product.

      Additionally, it has been recognized by several law enforcement publications. Yes, it was deployed in some large markets as well as some smaller implementations of just a few departments wishing to share data with one another and has proven itself highly effective. Yet, it was not listed in this article. Why? Damned good question.

      GJXDM was limited. NIEM is not all that difficult to understand and it fixes a lot of GJXDM's problems. But both models place some real restrictions on vendors, who need to be able to map their own data model to NIEM or to the data model of whatever other data sharing initiative they join. It's not practical to rewrite a 25-year-old, proven product just to conform to NIEM, GJXDM or the next model that is touted as the standard. Hence, I had a job interfacing our system to all the data sharing systems out there.
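
The mapping work described above can be sketched roughly as follows. This is a minimal illustration, not any vendor's actual interface: every field name on both sides is hypothetical, and the target names are deliberately not real NIEM or GJXDM element names.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from a vendor's legacy column names to a shared
# exchange vocabulary. Neither side reflects a real product or standard.
FIELD_MAP: Dict[str, str] = {
    "lname": "subject_family_name",
    "fname": "subject_given_name",
    "arr_dt": "arrest_date",
}


def to_exchange_record(
    vendor_record: Dict[str, Any],
) -> Tuple[Dict[str, Any], List[str]]:
    """Translate one vendor record into the shared vocabulary.

    Returns the translated record plus a sorted list of vendor fields
    that have no counterpart in the shared model and were left out.
    """
    exchange: Dict[str, Any] = {}
    unmapped: List[str] = []
    for key, value in vendor_record.items():
        target = FIELD_MAP.get(key)
        if target is None:
            unmapped.append(key)  # stays in the vendor's own silo
        else:
            exchange[target] = value
    return exchange, sorted(unmapped)


if __name__ == "__main__":
    record = {"lname": "Doe", "fname": "Jane", "arr_dt": "2008-01-15", "beat_code": "7B"}
    print(to_exchange_record(record))
```

Keeping the translation in a table like this is one way a legacy product could serve several sharing systems without being rewritten: each target model gets its own map, and fields with no counterpart simply stay local.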

      No, I don't work there any more. I recently moved on to a new career. But, the work was highly rewarding and I support it wholeheartedly.

  2. Re:Hmm by thisissilly · · Score: 3, Informative
    "public domain" has different meanings in different contexts. In the context of copyright, which is the more common usage on /., "public domain" means "not under copyright", i.e. either there is no copyright or it has expired.

    In the context of intelligence analysis, "public domain" means information that is available publicly, as opposed to classified or secret information. Whether something is copyrighted or not doesn't enter into it.

  3. they have had this for years by Anonymous Coward · · Score: 1, Informative


    and it's called ANPR
    http://en.wikipedia.org/wiki/ANPR

    cars, bridges, tunnels

  4. Algorithm training by aero6dof · · Score: 3, Informative

    Hank showed me MATRIX just a few short weeks after the 9/11 attacks. Using law enforcement data and commercial data, all of the commercial data available in the public domain, Asher's query produced Atta's photo -- and about 80 others, many of them fellow 9/11 hijackers, many of them associates of the 9/11 hijackers.

    Without additional information it's impossible to say if this is impressive, or just a stupid algorithm trick. With many mining algos, you can easily train them to pull certain needles out of the haystack. The question is, will your training situation look anything like the future situations? If you train the algo only on the 9/11 terrorists, would it pull out the trade center bombers, or Timothy McVeigh? Will future predictions be right, or will it pull out groups of Arabic student pilots who had the misfortune of buying the same shampoo most preferred by 9 out of 10 terrorists? Especially with rare events, I think you mostly get into a hyper-complicated version of correlation != causation.
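
The rare-event worry can be made concrete with some base-rate arithmetic. The sketch below is only an illustration: the population size, number of real targets, detection rate and false-positive rate are all invented numbers, and nothing here describes how MATRIX actually worked.

```python
def expected_flags(population, true_targets, sensitivity, false_positive_rate):
    """Expected outcome of screening a population for a rare group.

    sensitivity: fraction of real targets the algorithm flags.
    false_positive_rate: fraction of everyone else it flags by mistake.
    Returns (true_hits, false_hits, precision), where precision is the
    share of flagged people who are real targets.
    """
    true_hits = true_targets * sensitivity
    false_hits = (population - true_targets) * false_positive_rate
    precision = true_hits / (true_hits + false_hits)
    return true_hits, false_hits, precision


if __name__ == "__main__":
    # Invented scenario: 20 real targets among 1,000,000 records, an
    # algorithm that catches 95% of them and wrongly flags 0.1% of the rest.
    hits, misses, precision = expected_flags(1_000_000, 20, 0.95, 0.001)
    print(f"true hits: {hits:.0f}, false hits: {misses:.0f}, precision: {precision:.1%}")
```

With these made-up figures the algorithm flags about 19 real targets alongside roughly 1,000 innocent people, so fewer than 2% of the flagged records are correct even though the per-record error rates look small.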

  5. I call BS by nitpickers · · Score: 3, Informative

    I do web data mining for a living and there is no way any algorithm, or any combination of them, can give you that kind of accuracy. You would have to be a few light years ahead of current published research to do that. Unless, of course, the system is drawing from published news about the suspected terrorist attacks, in which case what they did was do-able (not as easy as one might naively think... the web is a pretty dirty medium, but definitely do-able). I will believe that kind of thing when I see it.