Software 'No More Accurate Than Untrained Humans' At Predicting Recidivism (theguardian.com)
An anonymous reader quotes a report from The Guardian: The credibility of a computer program used for bail and sentencing decisions has been called into question after it was found to be no more accurate at predicting the risk of reoffending than people with no criminal justice experience provided with only the defendant's age, sex and criminal history. The algorithm, called Compas (Correctional Offender Management Profiling for Alternative Sanctions), is used throughout the U.S. to weigh up whether defendants awaiting trial or sentencing are at too much risk of reoffending to be released on bail. Since being developed in 1998, the tool is reported to have been used to assess more than one million defendants. But a new paper has cast doubt on whether the software's predictions are sufficiently accurate to justify its use in potentially life-changing decisions.
The academics used a database of more than 7,000 pretrial defendants from Broward County, Florida, which included individual demographic information, age, sex, criminal history and arrest record in the two year period following the Compas scoring. The online workers were given short descriptions that included a defendant's sex, age, and previous criminal history and asked whether they thought they would reoffend. Using far less information than Compas (seven variables versus 137), when the results were pooled the humans were accurate in 67% of cases, compared to the 65% accuracy of Compas. In a second analysis, the paper found that Compas's accuracy at predicting recidivism could also be matched using a simple calculation involving only an offender's age and the number of prior convictions.
Only bad programmers/designers.
Slashdot, fix the reply notifications... You won't get away with it...
It seems obvious that someone with more relapses in the past will also be more likely to do it again. However, I will assume that at that point a judge won't allow bail anyway, so if this is about people with three or fewer offenses on their record, I'd imagine that going ONLY by the criminal history is going to be inaccurate no matter who or what is looking at it.
Isn't this more a case of bad data as opposed to bad programming? Because "no more accurate than an untrained person" implies pure chance.
Tl;Dr Single old program tested in situation vendor says is inaccurate use of software, software doesn't work well. Thus all programs will forever be terrible at this task and these computer guys should give up and do something useful. Like writing headlines for news sites!
Isn't this precisely what you would expect when the information gathered to make the decision isn't influential enough on the outcome? It says they have 137 variables, which were as useful as 2. It suggests that the additional variables are either unrelated to the outcome, or so strongly related to the 2 suggested that either way they provide no additional accuracy.
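A rough way to see the point above is with made-up data. The sketch below is a toy under stated assumptions, not the Compas model and not the Broward County data: it invents an outcome driven only by age and priors, adds one variable that is nearly a copy of priors and one that is pure noise, and measures how each correlates with the outcome. The names `make_synthetic` and `pearson`, and the generating rule itself, are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_synthetic(n, seed=0):
    """Invented data: the outcome depends only on age and priors (an assumption)."""
    rng = random.Random(seed)
    cols = {"age": [], "priors": [], "priors_copy": [], "noise": [], "reoffended": []}
    for _ in range(n):
        age = 18 + int(rng.random() * 50)   # ages 18..67
        priors = int(rng.random() * 10)     # 0..9 prior convictions
        # Hypothetical rule: risk rises with priors and falls with age.
        p = 1.0 / (1.0 + math.exp(-(0.4 * priors - 0.05 * (age - 18) - 0.3)))
        cols["age"].append(age)
        cols["priors"].append(priors)
        # A variable "strongly related" to priors: priors plus a coin flip.
        cols["priors_copy"].append(priors + (1 if rng.random() < 0.5 else 0))
        # A variable unrelated to the outcome.
        cols["noise"].append(rng.random())
        cols["reoffended"].append(1 if rng.random() < p else 0)
    return cols


data = make_synthetic(5000)
corr_with_outcome = {
    name: pearson(data[name], data["reoffended"])
    for name in ("age", "priors", "priors_copy", "noise")
}
copy_vs_priors = pearson(data["priors_copy"], data["priors"])

for name, value in corr_with_outcome.items():
    print(f"{name:12s} correlation with outcome: {value:+.3f}")
print(f"priors_copy vs priors: {copy_vs_priors:+.3f}")
```

On data built this way, the noise column correlates with the outcome at roughly zero and the near-copy tracks priors almost exactly, so neither can add much beyond the two variables that actually drive the outcome.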
So. Is there any better algorithm? You'd think that if there were a consensus among people studying this, they'd code in the consensus. Maybe the interesting thing here is that age and priors are the only useful information for predicting recidivism. This doesn't seem like rocket science. We've got decades of data. We ought to be able to run some other algorithms over it--something that takes into account a 3rd variable, and see if it helps. Maybe it does. Maybe it doesn't.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
They said the software used 137 data points in determining the probability of re-offending but they were no better than if someone used just 2, age and prior convictions. Perhaps I've had more statistics training than most but this seems highly probable. This is pretty basic data analysis, or so I thought. If you take a bunch of data points and correlate them to re-offend rate there will be some data points that correlate more than others. If the one doing the analysis tossed out the data points that had little to no correlation then the accuracy of the prediction would still be effectively unchanged.
I don't think pure statistics is the proper approach as much as machine learning. To be honest the problem sounds like something that could be tackled in an undergraduate course, "here's your variables, there's your outcomes, run a classifier, and submit your results".
Perhaps there's something fundamentally difficult about getting above 2/3 accuracy, but it seems you should REALLY be able to beat untrained workers on a problem like this.
I suspect this is just a case of a product built in the late 90's on weak ML (or some homegrown stats) and they never felt the need to improve their results since.
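For what the "run a classifier" exercise might look like, here is a minimal sketch assuming only two inputs, age and number of priors, and a plain logistic regression fitted by gradient descent. The data is synthetic and the generating rule is invented, so the accuracy it prints says nothing about Compas or the study; `make_rows`, `fit_scaler`, `train` and `accuracy` are hypothetical helper names.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_rows(n, seed):
    """Synthetic (age, priors, reoffended) rows from an invented rule."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = 18 + int(rng.random() * 50)
        priors = int(rng.random() * 10)
        p = sigmoid(0.4 * priors - 0.05 * (age - 18) - 0.3)  # assumption
        rows.append((age, priors, 1 if rng.random() < p else 0))
    return rows


def fit_scaler(rows):
    """Mean and standard deviation of each feature, from training data only."""
    n = len(rows)
    stats = []
    for j in (0, 1):
        mean = sum(r[j] for r in rows) / n
        sd = math.sqrt(sum((r[j] - mean) ** 2 for r in rows) / n)
        stats.append((mean, sd))
    return stats


def features(row, stats):
    return [(row[j] - stats[j][0]) / stats[j][1] for j in (0, 1)]


def train(rows, stats, lr=0.5, epochs=200):
    """Full-batch gradient descent on the logistic loss."""
    data = [(features(r, stats), r[2]) for r in rows]
    n = len(data)
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in data:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def accuracy(rows, stats, w, b):
    hits = 0
    for r in rows:
        x = features(r, stats)
        pred = 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5 else 0
        hits += int(pred == r[2])
    return hits / len(rows)


train_rows = make_rows(3000, seed=1)
test_rows = make_rows(1000, seed=2)   # held-out rows, never seen in training
stats = fit_scaler(train_rows)
w, b = train(train_rows, stats)
acc = accuracy(test_rows, stats, w, b)
print(f"weights (age, priors): {w}, bias: {b:.3f}, held-out accuracy: {acc:.3f}")
```

Because the outcome here is itself random given age and priors, even a perfectly fitted model cannot reach 100% on this toy data, which is one way an accuracy ceiling can show up regardless of the classifier used.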
Another thing that I've learned, and I'll admit is controversial to the SJWs out there, is the correlations between ethnicity and intelligence, and between criminal tendencies and intelligence. This is not controversial to the people that do this analysis, it's been established with considerable evidence.
There's no question that there's correlations between IQ and skin colour, the question is whether that's the characteristic of the ethnicity or race or due to socio-economic factors.
Those with an IQ around 85 or 90 (depending on who you ask) will be most likely to be criminals. Above that IQ there is greater profit in getting a job. Below that IQ the people will have problems concocting the means to break the law and still come out ahead.
The biggest predictor of criminality is age, the cause isn't poor earning potential, it's poor self-control and ability to anticipate consequences.
People from certain areas of the world will, on average, have a lower IQ. Average IQ, by definition, is 100.
If these people want a more accurate indication of criminal behavior then give an IQ test. They won't do that though because people with a certain ethnic background will "fail" this test and be considered more likely to offend.
They won't do that because you'd be denying people bail for being dumb. Anyway, you're probably not getting useful data because we're already dealing with convicted criminals and we have data on their criminal history.
With this trend of ethnic background having some correlation to skin color this algorithm would immediately be considered "racist" and be tossed out by the SJWs. Even though it would be highly accurate in determining future criminal behavior we can't tolerate a "racist" algorithm.
Why do we see more people with dark skin in prisons? Not because of some inherent racism. It's because low IQ people are more likely to break the law, and people with dark skin tend to have a lower IQ. This should not reflect on any individual because "trend" does not mean "will" or "did". Also, even with a 90 average IQ in a population still leaves a lot of room on a bell curve for many geniuses in that population.
Posted anonymously because I'm sure just mentioning these indisputable facts will likely get me labeled a racist.
If you wanted to discuss the role of race in a predictive algorithm that's valid, there's definitely an issue where machine learning algorithms can learn racial bias, even when they have to infer it through secondary measures. And depending on your view that's a good thing (it improves accuracy) or a bad thing (people are literally being judged by the colour of thei
I stole this Sig
The software works for free 24 hours a day, 7 days a week, doesn't need sick days, vacation, or maternity leave, nor does it want a pension when it will be replaced by a much better AI version.
When it comes to homicides, most of them (about 90%) are perpetrated by yourself, your close relatives (spouse, parents, children) or your acquaintances.
It has been shown that COMPAS overestimates the recidivism of black people by a factor of about two, while it underestimates the recidivism of white people at about the same rate -- all without even including race in the list of variables.
So it would rather deny bail to a black person who never commits a crime again, but it will let a white person go free on bail who later becomes a repeat offender. As the exact inner workings of COMPAS are regarded as a trade secret, there were some experiments to find out why it is so bad at estimating people's recidivism rate, and it seems that it heavily overweights social factors (stable/unstable family background, unemployment, debts, etc.), because there are many of them in the list of factors it considers. On the other hand, there are not many variables for the type of crime committed, and thus it consistently underweights those in the total. It would thus grant bail to a sexual offender who comes from a stable family background with steady income, though the recidivism rate of those is 70%, because that is only a single factor weighing against the offender. On the other hand, it would deny bail to a petty thief who does not have a stable family life, is indebted, has only short periods of employment and moves often.
Basically: COMPAS is biased against people in poverty.
Are they much more accurate? How much?
60% of all homicides are suicides
most of them (about 90%) are perpetrated by yourself
Neither of which is relevant to the racial disparity question.
Looking at Wikipedia, 52% of the USA's murders (i.e. not including suicides) are committed by black murderers.
Our AC troll presumably thinks that this proves that black people are naturally violent, or some such nonsense. Nope. The USA is very far from a colour-blind society, so the figures aren't all that surprising. Black Americans are far more likely to have the misfortune of growing up around violent gangs, etc.
Steven Pinker spoke about exactly this recently. There's no need to deny seemingly awkward facts in order to be a good liberal. Really, the facts show that racial inequality is still high in the USA, which supports the liberal view.
Alternatively: vendor oversells effectiveness of its proprietary, secret sauce methodology and doesn't like any independent evaluation of its products unless it's favorable. Customers, having a naive faith in technology, buy anyways, which produces exactly the results you mention: programs will be forever terrible at this task. Why should anyone bother to make a program good when customers will shell out good money for mediocre?
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
No. The problem is that people have realized the software is racist. What happens is this:
Black citizens tend to get more minor criminal issues than white ones because of institutional racism. Then this software sees that a black man has two citations for, say crossing the street away from a crosswalk, while the white man does not. So it gives him a higher risk of recidivism, which means more bail/longer jail time.
Then the software guys complain and say they aren't racist, they are just applying the algorithm.
This article is trying to shut them up by saying their algorithm, in addition to being racist, doesn't work any better than simple common sense.
It is not an attack on the business model, just of the current state of the art.
excitingthingstodo.blogspot.com
It tried to be fair and actually failed, because it uses a methodology that clearly wasn't designed by a statistician.
The program uses over a hundred factors in its classification scheme, but statisticians and data scientists make a point of pruning factors because long experience has shown that introducing many irrelevant factors actually reduces predictive accuracy. And just because race is not an explicit factor doesn't mean that the algorithm is race blind either. It's entirely feasible, given the huge number of factors involved, to recover the subject's race with better-than-chance reliability, whether explicitly or implicitly, intentionally or even by accident.
Now the program's score is equally correlated with reoffending rates whether the subject happens to be white or black, which sounds impressive and color-blind -- to a layman. To a mathematician not so much. It's actually quite easy to produce this result by tweaking your model, implicitly recovering race in the manner suggested above and forcing it to produce a result that looks right -- in aggregate.
But what a statistician wants to know is about conditional probabilities, and it turns out that when applied to retrospective data the program is twice as likely to commit a type 1 error (falsely predicting reoffending) for black subjects as white. If this makes the whole process of achieving fairness sound hard, that's because it is. Color-blindness in aggregate isn't the same as color blindness on a case-by-case basis, and that's the thing that actually matters.
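The gap between the aggregate view and the conditional one can be shown with a small worked example. The confusion counts below are made up for two hypothetical groups, A and B, and are not the COMPAS or Broward County figures. They are chosen so that a "high risk" flag is equally reliable in both groups (same positive predictive value) while the type 1 error rate, meaning the share of non-reoffenders who were flagged anyway, is twice as high in group A.

```python
def rates(tp, fp, fn, tn):
    """Conditional rates from confusion counts.

    tp: flagged and reoffended      fp: flagged, did not reoffend
    fn: not flagged, reoffended     tn: not flagged, did not reoffend
    """
    return {
        # P(flagged | did not reoffend): the type 1 error rate
        "fpr": fp / (fp + tn),
        # P(not flagged | reoffended): the type 2 error rate
        "fnr": fn / (fn + tp),
        # P(reoffended | flagged): how much a flag can be trusted
        "ppv": tp / (tp + fp),
    }


# Illustrative counts only (assumption): 1000 people per group,
# with a higher base rate of reoffending in group A than in group B.
groups = {
    "A": dict(tp=300, fp=200, fn=200, tn=300),
    "B": dict(tp=210, fp=140, fn=90, tn=560),
}

results = {name: rates(**counts) for name, counts in groups.items()}
for name, r in results.items():
    print(f"group {name}: PPV={r['ppv']:.2f}  FPR={r['fpr']:.2f}  FNR={r['fnr']:.2f}")

fpr_ratio = results["A"]["fpr"] / results["B"]["fpr"]
print(f"FPR ratio A/B: {fpr_ratio:.1f}")
```

With these counts a flag means the same thing in both groups (60% of flagged people reoffend), yet a person in group A who would not reoffend is twice as likely to be flagged as one in group B. When base rates differ between groups, equalizing one of these measures generally leaves the other unequal.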
Ultimately you want criminal justice decisions to be based on reason, and mathematics is the purest form of reason there is. And because you want those decisions to be based on reason, they have to be transparent. Secret methods for arriving at decision-making are fundamentally antithetical to our concept of justice.
Is it really important for a potential murder victim whether the potential perpetrator is "naturally" or "culturally" violent?
No, but it may be important to a society that has a desire to reduce the murder rate.
I disagree with what you say on multiple levels. I did NOT claim that criminal history is a proxy for race. Instead I claimed that blacks are disproportionately likely to have a criminal history. I also do not agree that race predicts recidivism independently; that is your blatantly racist belief that certain races commit more crimes. One study (or two or three) does not confirm your racist beliefs.
There are a multitude of other studies that contradict yours, and yours have major holes in them. One of the big holes is that you assume arrest statistics are fair, and the cops clearly are not. As demonstrated by this story (https://features.propublica.or...), blacks are far more likely to be punished by police for the same infraction that is ignored when white men do it. This negates the value of statistics showing blacks commit more crimes.
Finally, I do not eliminate valid independent variables. Instead, I claim they are not valid.
When it comes to homicides, most of them (about 90%) are perpetrated by yourself, your close relatives (spouse, parents, children) or your acquaintances.
Seems like we should do "something" about this person and their spouse, parents, children and acquaintances.