Slashdot Mirror


Academics Confirm Major Predictive Policing Algorithm Is Fundamentally Flawed (vice.com)

An anonymous reader quotes a report from Motherboard: Last week, Motherboard published an investigation which revealed that law enforcement agencies around the country are using PredPol -- predictive policing software that once cited the controversial, unproven "broken windows" policing theory as a part of its best practices. Our report showed that local police in Kansas, Washington, South Carolina, California, Georgia, Utah, and Michigan are using or have used the software. In a 2014 presentation to police departments obtained by Motherboard, the company says that the software is "based on nearly seven years of detailed academic research into the causes of crime pattern formation. The mathematics looks complicated -- and it is complicated for normal mortal humans -- but the behaviors upon which the math is based are very understandable."

The company says those behaviors are "repeat victimization" of an address, "near-repeat victimization" (the proximity of other addresses to previously reported crimes), and "local search" (criminals are likely to commit crimes near their homes or near other crimes they've committed, PredPol says.) But academics Motherboard spoke to say that the mathematical theory that is used to power PredPol is flawed, and that its algorithm -- at least as pitched to police -- is far too simplistic to actually predict crime. Kristian Lum, who co-wrote a 2016 paper that tested the algorithmic mechanisms of PredPol with real crime data, told Motherboard in a phone call that although PredPol is powered by complicated-looking mathematical formulas, its actual function can be summarized as a moving average -- or an average of subsets within a data set.
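
To make Lum's "moving average" description concrete, here is a minimal Python sketch. It is not PredPol's actual algorithm, which the story does not show; the function name, the window size, and the weekly counts are all hypothetical and chosen only for illustration.

```python
from collections import deque


def moving_average(counts, window):
    """Return, for each step, the mean of the most recent `window` values."""
    recent = deque(maxlen=window)  # oldest value drops out automatically
    averages = []
    for count in counts:
        recent.append(count)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical weekly incident counts for one map cell (made-up numbers).
weekly_counts = [2, 0, 4, 6, 2, 0]
forecast = moving_average(weekly_counts, window=3)
print(forecast)
```

Under this reading, a cell's "prediction" is simply its recent recorded history, smoothed: places with many recent reports score high, and nothing else enters the calculation.
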
"The academic foundation for PredPol's software takes a statistical modeling method used to predict earthquakes and apply it to crime," reports Motherboard. "Much like how earthquakes are likely to appear in similar places, the papers argue, crimes are also likely to occur in similar places. Suresh Venkatasubramanian, a professor of computing at the University of Utah and a member of the board of directors for ACLU Utah, told Motherboard that earthquake data and crime data are, naturally, collected in different ways."

"I would say in our mind, the key difference is that in earthquake models, you have seismographs everywhere -- wherever an earthquake happens, you'll find it," Venkatasubramanian said. "The crux of the issue really is that to what extent are you able to get data about what you're observing that is not also totally on the model itself." "If you build predictive policing, you are essentially sending police to certain neighborhoods based on what what they told you -- but that also means you're not sending police to other neighborhoods because the system didn't tell you to go there," Venkatasubramanian said. "If you assume that the data collection for your system is generated by police whom you sent to certain neighborhoods, then essentially your model is controlling the next round of data you get."

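The feedback loop Venkatasubramanian describes can be sketched as a toy simulation in Python. This is an illustration under a strong, deliberately simplified assumption: incidents are recorded only in the area that police were sent to. The function name, the rates, and the starting counts are hypothetical and do not come from PredPol or from real data.

```python
def simulate_feedback(true_rate, initial_records, rounds):
    """Each round, patrol the area with the most recorded incidents.

    ASSUMPTION (for illustration only): new incidents are recorded
    solely in the patrolled area, at that area's true rate.
    """
    records = list(initial_records)
    for _ in range(rounds):
        target = records.index(max(records))  # the model's "hot spot"
        records[target] += true_rate[target]  # only this area gains data
    return records


# Two areas with the SAME true rate; area 0 starts one record ahead.
records = simulate_feedback(true_rate=[5, 5], initial_records=[3, 2], rounds=10)
print(records)  # [53, 2]
```

Although both areas have identical underlying rates, the small initial gap decides where police go, and every later round only reinforces it. For crimes that residents report regardless of police presence, the recording assumption above would not hold, so the effect would be weaker.
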
3 of 145 comments

  1. Re:A city by 110010001000 · · Score: 4, Insightful

    That really makes no sense for most crimes. Look at murder or burglary: it doesn't matter whether police are in the neighborhood "noticing crimes" or not; in theory, it is going to get reported equally. The only way that applies is for victimless crimes. Traffic violations aren't going to be reported unless a police officer notices them.

  2. Re:A city by 110010001000 · · Score: 4, Insightful

    Yes we do. If there is a murder, burglary, or mugging, it is going to be "detected" and reported by the populace no matter where it occurs. The only crimes that won't get reported are minor violations (traffic, etc.). Police rarely detect crimes - they respond to crimes after they happen.

  3. *sigh* by jythie · · Score: 4, Insightful

    So... while I am not an academic, this is pretty close to my field of research. Looking at their model, I am not surprised they sold this product, but I am deeply disappointed. This is the type of model that is REALLY easy to sell to people; both law enforcement and the military (our customer) are enamored with such models for their near-magic ability to 'predict' things. Only they don't; they tend to fail in unpredictable ways. They are not bad in multi-model systems where you take a dozen or so different systems built by different teams, run them in parallel, then have subject matter experts ponder the conflicting results. But actual policy out of a single model? Madness... or hubris... or stupidity... or simply being enamored with a slick sales pitch from 'one of your own' offering to solve problems in the way you want them solved.

    Oddly enough, we actually DID do a LEO model years back, which was actually pretty effective, but it encouraged things like community outreach and police/citizen interaction which worked really well for officers on the ground but pissed off lawmakers and 'police unions', so it was largely dropped.

    Which gets back to this story and one of the fundamental flaws in such attempts. The decision makers are not interested in solutions that make things better for high-crime areas in the first place; the people in those areas are not part of their power bloc. They want solutions that 'sound right' to people who live elsewhere and confirm what they already believe. Which is exactly what models like this are good at producing. They are kinda like torture... useless for prediction or information gathering, but an excellent political tool for confirming the story your career depends on being 'true'.