Data Mining In Law Enforcement
jcatcw points out a blog entry by Scott McPherson, CIO for the Florida House of Representatives. McPherson condemns the state of data sharing and data mining in law enforcement, saying that the US causes itself a great deal of trouble by focusing more on "antiterror armor and nuke-sniffing devices" than a useful information distribution network. He discusses a few such projects, and how they could have directly affected the events of 9/11. Quoting:
"One of those ingenious things that actually worked, Seisint founder Hank Asher's brilliant MATRIX system, remains mired in controversy and politics. Hank showed me MATRIX just a few short weeks after the 9/11 attacks. Using law enforcement data and commercial data, all of the commercial data available in the public domain, Asher's query produced [hijacker Mohamed] Atta's photo -- and about 80 others, many of them fellow 9/11 hijackers, many of them associates of the 9/11 hijackers. It was simple data mining and algorithms, and none of the information was obtained illegally."
So he managed to write some software that analyzed the internet - and managed to produce photos of some of the people that, erm, had already, erm, been identified. Surely (and maybe I've misunderstood something here) a 'result' would be identifying people likely to commit terrorist attacks, allowing enforcement agencies to monitor them and prevent them from committing future attacks. (And no - this doesn't mean off-shoring every Muslim who downloaded the Jolly Roger Cookbook.)
Wow, really? You were able to identify after the fact? Great! Real useful -- that and the fact that it's much easier to find that information when you are looking for a specific result. If this guy had come out and said, "hey, I was able to find those people before the fact," then I'd be impressed.
I keep watching the bar for spying on people get lower and lower.
First it was suspected enemy agentz.
Then it was suspected associates, even though separation may be 3-4 people away in a chain.
Now it's anyone suspected of a crime.
How long until everyone is dumped in this database, not just for intel or law enforcement, but for potential employers, stalkers, and violent criminals data mining for easy marks?
"It was simple data mining and algorithms, and none of the information was obtained illegally."
1. He doesn't tell us what "Asher's query" was, leaving us with the impression that anyone could magically ask the right question and stop crime.
2. I wonder what he means by "commercial data available in the public domain". Either it's commercial and you have to pay for it, or it's public domain. My long distance calling patterns are commercial data (and are sold by the phone company for marketing), but they're not "public domain" in the way that most of us would understand it.
[Fuck Beta]
o0t!
I have a lot of issues with the various things in this article, but I'll keep it to one for now. Maybe Atta could have been arrested because of better coordination between local law enforcement. But his arrest almost certainly would NOT have prevented 9/11. Moussaoui was supposed to be there that fateful day, and it still went down. One person arrested, even one of the many masterminds, would not have prevented it.
Also, no local law enforcement officer would have been able to piece together this plot from looking through one car BEFORE the event. Piloting multiple planes simultaneously into various landmarks was just too implausible to be believed before it happened. Even if John McClane himself figured it out, he wouldn't be able to convince anyone to help him stop 19 other people from boarding planes in multiple airports.
Sharing information sure beats what we're doing now, both in law enforcement and the intelligence community where I work, which is holding everything close so no one else can take credit. But let's not exaggerate the benefits here.
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
A few short weeks after the Kentucky Derby, I devised a database system that predicted the winner. Impressive, no?
I wrote parts of this stuff
You often hear about the police pulling over some guy for whatever reason and finding out he had an outstanding warrant or something. I've always wondered why they don't equip police cars with a video camera and the ability to OCR every single plate that comes into view. License plates all use the same font, so they should be easy to OCR, and in theory they use a high-visibility color scheme (though that's not always the case.) The camera would scan, read the characters, and compare it to a big list of stolen vehicles, stolen license plates, vehicles that fled accident scenes or other crimes, vehicles that belong to people that have warrants, Amber alerts, etc., and any "interesting" plates would pop up on the laptop that's now in most police cars.
I'm not saying it would put up a big "pull over and detain!" notice, but it could pop up the plate, the vehicle it should be on, the owner, and why it's of interest, then the officer would decide what to do. I.e., if a car pops up as belonging to a wanted 22-year-old male but it's obviously someone else in the car (too old, wrong gender, etc.) then they would ignore it.
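For what it's worth, the matching step described above is the easy part. Here is a minimal sketch, assuming the OCR stage already hands over a plate string. The hot-list entries, the `normalize_plate` helper and the record fields are all invented for illustration; no real police system or data format is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HotListEntry:
    plate: str             # plate as stored in the (hypothetical) hot list
    expected_vehicle: str  # vehicle the plate should be on
    owner: str
    reason: str            # why the plate is of interest


def normalize_plate(raw: str) -> str:
    """Uppercase and keep only letters and digits, so 'abc-123' matches 'ABC 123'."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def build_index(entries):
    """Index hot-list entries by normalized plate for fast lookup."""
    index = {}
    for entry in entries:
        index.setdefault(normalize_plate(entry.plate), []).append(entry)
    return index


def check_plate(raw_ocr_text, index):
    """Return every hot-list entry matching an OCR'd plate (empty list if none)."""
    return index.get(normalize_plate(raw_ocr_text), [])


# Hypothetical data, invented for this example.
HOT_LIST = [
    HotListEntry("ABC 123", "blue 1998 sedan", "J. Doe", "reported stolen"),
    HotListEntry("XYZ-789", "white van", "R. Roe", "outstanding warrant"),
]

if __name__ == "__main__":
    index = build_index(HOT_LIST)
    for seen in ["abc-123", "QQQ 000"]:
        for hit in check_plate(seen, index):
            # Only informs the officer; the decision stays with a human.
            print(f"{hit.plate}: {hit.reason} "
                  f"(expect {hit.expected_vehicle}, owner {hit.owner})")
```

The lookup returns the reason and the expected vehicle rather than an instruction, which matches the idea that the officer decides what to do with a hit.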
Of course, like anything, there is the potential for abuse, but before you freak out about privacy, remember that driving, by definition, is a very public act. We're not talking about millimeter-wave radio or looking behind closed curtains with an infrared camera, we're talking about reading the required-by-law several-inch-high unique identifier on a hunk of steel with unobstructed windows on the public roads. If you're wanted and don't want to get caught, it's your responsibility to not go out in public with a visible unique identifier.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
I've worked in the field of law enforcement data sharing. Fact is that most law enforcement agencies are either islands of automation or very loosely connected to other agencies. The stuff you see in TV and movies ("24") is a fantasy. Adjacent towns and cities rarely share information, and this lack of knowledge can put members of their police force in danger (for instance when making a traffic stop). A few years ago, the DOJ kicked off a sharing initiative with the Global Justice XML Data Model (GJXDM). This is an XML-based specification for exchanging law enforcement data that was developed at Georgia Tech. I was involved in an initiative in Ohio to share police record management system information at a state level. The system was deployed and is operational today. GJXDM has been superseded by the National Information Exchange Model (NIEM). It should be noted that the NIEM model is even more complex than its predecessor and tends to break many XML tools. The data exchanged tends to be fairly rudimentary and fairly sparse - arrests, bookings, warrants. Nevertheless, most agencies and most states have either not implemented data sharing or are in the earliest stages of doing so.
[Insert pithy quote here]
Rules are pretty damn useless at modifying behavior if violators aren't caught & punished.
"Data Mining In Law Enforcement"
I'll take "How do you round up the most possible innocent people and make false charges against them" for $500, Alex...
If you can read this, I forgot to post anonymously.
Well, as in many things it would seem that there's a loophole or two involved. While there are many restrictions placed on government in terms of data collection and data mining, there are few placed on individual businesses who do the same thing (think credit agencies). As such, there's little stopping the government from simply contracting out its needs to private companies.
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
The same techniques will likely be effective for identifying the most effective protesters against the current administration, or the people who can be most effectively exploited sexually, financially or politically. In fact, terrorists generally cover their tracks much better than innocent civilians.
Plus, in this adult version of the game, people tend to ignore that the next top terrorist will not have a profile on www.myspace.com/insaneplancehijacker/, because he/she knows that data mining exists. Legislators and the public in most western countries tend to ignore that any new countermeasures/laws will result in instant adaptation on the other side.
;) and I know some European airports which don't even check your luggage if you have a gallon of fluids in your hand luggage (I usually realize at the security check when flying back). Heathrow, for example, is busier enforcing their non-smoking policy and tracking lost luggage. If you wanted to transport a nuke, Heathrow would be the place to start your journey. But if you are simply looking for a pleasant flight, avoid it at all costs :D
Especially at airports I sometimes get so angry about all the silliness that I play some mind-game with the aim of blowing it all up. My current favorite is to put all kinds of fluids in my hand-luggage to distract them from my laptop. I'd simply replace the MBP's CD-Drive with C4 (and some perfectly centered metal rings to make it look like the actual bay). I'm sure it would work out.
On the other hand I'm quite aware that some circumstances make it easier for me: Blond hair and no beard, terrorists use Dell
I don't read replies by ACs.
Capital punishment is.
"I would rather die on my feet than live forever on my knees!"
This guy also doesn't seem to have much knowledge of intel gathering. The idea that forward projection isn't happening is...uh...wrong, and that's all I'll say on the matter (disclaimer: I'm ex-NSA)
If you're ex-NSA, then you also know that the difficulty isn't in writing the algorithms, it's in getting somebody to stitch together all the goddamn databases that are strung out all over creation.
Shit, *I* can write the social networking algorithms, anomaly detection, etc. But it doesn't do any good if you don't have the data integrated, and despite what's happened the last 8 years we still don't have it.
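To make the point concrete, here is a minimal sketch of the kind of link-analysis query being talked about: a breadth-first search that lists everyone within a given number of hops of a starting person. The graph and the names in it are hypothetical; in practice every edge would have to be assembled from a separately held database, which is exactly the hard part.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each person reachable in <= max_hops to their distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph.get(person, ()):
            if contact not in distance:
                distance[contact] = distance[person] + 1
                queue.append(contact)
    del distance[start]  # the starting person is not their own associate
    return distance


# Hypothetical association graph, invented for this example.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}

if __name__ == "__main__":
    print(within_hops(GRAPH, "A", 2))
```

The algorithm fits in twenty lines. Building a trustworthy `GRAPH` out of databases that don't talk to each other does not.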
I also don't get the false dichotomy the author uses to rag on sensor-based detection.
On an individual-by-individual basis perhaps, but if rule-breakers are regularly, visibly & effectively punished, then statistically speaking, an organization will have fewer rule breakers.
Last time I was at an airport dropping my sister off, I was thinking the exact same thing. I saw her going through the security checkpoint and she had to turn on her laptop so they knew it wasn't a bomb. How silly is that: "could you please activate the potential on-switch of a bomb, so we can be sure it isn't a bomb?"
Not sure if it is the same everywhere, but the security checkpoint was pretty crowded, at least 50 at the checkpoint and 100 in close vicinity. If your goal, as a terrorist, is to instill fear, what better way to get people frightened to death of security checkpoints? As a bonus, you kill off some infidels and shut down the airport for several days (depending on the airport, anywhere between hundreds of thousands and millions of dollars/euros/etc. of damage/loss).
The reality is, of course, that the "real terrorist masterminds" and their cells won't do that. They attack important/unique symbols. The fact that people die in the process, economic damage arises, etc. are just bonuses. So the only thing I have to worry about at security checkpoints are those who are in control of them, or some radical religious fruitcake reading Slashdot.
It only takes one man to change the Wisdom of the Crowd to Tyranny of the Masses.
Hank showed me MATRIX just a few short weeks after the 9/11 attacks. Using law enforcement data and commercial data, all of the commercial data available in the public domain, Asher's query produced Atta's photo -- and about 80 others, many of them fellow 9/11 hijackers, many of them associates of the 9/11 hijackers.
Without additional information it's impossible to say if this is impressive, or just a stupid algorithm trick. With many mining algos, you can easily train them to pull certain needles out of the haystack. The question is, will your training situation look anything like the future situations? Training the algo only with the 9/11 terrorists, would it pull out the trade center bombers, or Timothy McVeigh? Will future predictions be right, or will it pull out groups of Arabic student pilots who had the misfortune of buying the same shampoo most preferred by 9 out of 10 terrorists? Especially with rare events, I think you mostly get into a hyper complicated version of correlation != causation.
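The rare-event problem can be put in numbers. This is a back-of-the-envelope sketch using Bayes' rule; the population size, the number of real targets and both error rates are assumptions picked purely for illustration, not figures from MATRIX or any real system.

```python
def precision(prevalence, sensitivity, false_positive_rate):
    """Fraction of flagged people who are true positives (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Assumed numbers, chosen only to show the effect of rarity.
    population = 300_000_000
    actual_targets = 100
    fpr = 0.001  # the classifier wrongly flags 1 innocent person in 1,000

    p = precision(actual_targets / population,
                  sensitivity=0.99,
                  false_positive_rate=fpr)
    flagged_innocent = (population - actual_targets) * fpr

    print(f"innocent people flagged: {flagged_innocent:,.0f}")
    print(f"chance a flagged person is a real target: {p:.4%}")
```

Under these assumed numbers, a classifier that catches 99% of real targets still flags roughly 300,000 innocent people, so a flagged person has only about a 0.03% chance of being a real target.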
I do web data mining for a living and there is no way any algorithm or a combination of them can give you that kind of accuracy. You will have to be a few light years ahead of current published research to do that. Unless of course the system is drawing from published news about the suspected terrorist attacks in which case what they did was do-able (not as easy as one might naively think... the web is a pretty dirty medium but definitely do-able). I will believe that kind of a thing when I see it.
Then you believe that suicide bombing will go away because the bombers know they will die?
Capital punishment has its uses, but as a deterrent it's pretty limited.