How Statistics Can Foul the Meaning of DNA Evidence
azoblue writes in with a piece in New Scientist that might make you rethink the concept of "statistical certainty." As the article puts it, "even when analysts agree that someone could be a match for a piece of DNA evidence, the statistical weight assigned to that match can vary enormously, even by orders of magnitude." Azoblue writes: "For instance, in one man's trial the DNA evidence statistic ranged from 1/95,000 to 1/13, depending on the different weighing methods used by the defense and the prosecution."
"Members of the jury, there's only a 1 in 13 chance that the defendant is actually the killer based on the DNA evidence. If the defendant were sitting in the jury with you, then there's an equal chance that it was any one of you. And since we can eliminate all 12 of you, that leaves only the defendant left over. So you must find the defendant guilty of all charges since he's the only one left out of 13 people. The prosecution rests."
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
Since when in the hell do you count common matches as proof that it comes from one person? Some of these labs are doing something very wrong, and I hate to think of both the false positives, and negatives, that came from their "expert" opinions.
Absolute power corrupts absolutely. indymedia
The trouble that this paper and many others illustrate is the HUGE ignorance of proper statistical methods in the scientific community. Things like Student's t-test are - statistically speaking - simple. Yet they are often beyond many in the science community. Thus, there is a tendency to misuse techniques, which in turn leads to divergent interpretations of what a data set means. The legal profession is even worse, as they don't care about the laws of mathematics. In a court, you are not required to answer to a professor of mathematics, hence you can assert anything. If your opponent doesn't have the necessary skill or knowledge to call BS on what you say, you can win an argument with a completely baseless assertion. Take an example. A man is fired for missing work on a Monday. The company's lawyer states "Fully 40% of this employee's absenteeism occurs on Mondays and Fridays. It is appalling that this weekend-extending behaviour continues, and we must do something about it". The mathematically challenged lawyer for the poor sap can't see the issue with this and lets it stand.
JE (always wanted to use that example. May have the justification a bit!)
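For anyone who doesn't see the catch in the lawyer's figure, here is a minimal sketch of the arithmetic. It assumes a five-day work week and absences spread evenly across the days; the function name and day labels are made up for illustration.

```python
from fractions import Fraction

# Assumption: a standard five-day work week.
WORKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def expected_share(days, workdays=WORKDAYS):
    """Share of absences expected to fall on `days` if absences are
    spread evenly across all workdays (i.e. nothing suspicious at all)."""
    hits = len(set(days) & set(workdays))
    return Fraction(hits, len(workdays))


share = expected_share(["Mon", "Fri"])
print(f"Expected share on Mondays and Fridays: {float(share):.0%}")
```

Two days out of five is 40%, so the "appalling" figure is exactly what a perfectly innocent absence record would produce.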
Why don’t you just suggest that anyone who’s arrested is “statistically” guilty and we should just skip the trial...
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
it shouldn't be used to free someone who was justly convicted with other evidence.
And you know that the other evidence wasn't faulty, how? Police make mistakes, witnesses lie or remember things wrong, etc etc.
You either believe your justice system is fair or else you scrap the entire thing.
Or you ditch that false dichotomy and realize that within every system mistakes will be made. There is nothing in fixing past errors that means you throw out the whole system.
Your alternative would mean that we would have to release every murderer and rapist.
No, actually it wouldn't.
Genetics means "out of your control" and touches on some raw nerve issues, so there's a lot of throwing around of "statistical" information and unrealistic mental models.
For example of statistical confusion:
New research shows that at least 10 percent of genes in the human population can vary in the number of copies of DNA sequences they contain--a finding that alters current thinking that the DNA of any two humans is 99.9 percent similar in content and identity.
http://www.hhmi.org/news/scherer20061123.html
And broken mental models:
http://en.wikipedia.org/wiki/Lewontin's_Fallacy
Until our knowledge improves, you're going to see more "politicization" of DNA-related science.
Futurist Traditionalism
You can prove anything with statistics.
No. You can prove anything with BAD statistics. Unfortunately, most statistics are bad.
-Scientist Statistician (enough to know that I don't know statistics)
I speak from personal experience. I use them all the time and still don't really understand them. Not how they apply in criminal investigations anyway.
Let's say you have evidence that matches 1 in a thousand people. You search through your database of all 1000 suspects and you get a single match. Did he do it? Logically you'd expect this to mean you can be 99.9% sure. You then search through the database of a million random people. You get 1000 matches. Does this mean there's only a 0.1% chance that your original suspect was guilty? Well, maybe there's some other compelling evidence that makes it most likely that one of those 1000 suspects was the culprit. But you have 10000 outliers. They're each a tenth as likely to have committed the crime. You get 10 matches among them, and ten matches at a tenth of the weight each count for as much as your one suspect. So now we're at a 50% probability of guilt, or something in that ballpark.
I'm sure this is a somewhat different example than that given in the article but that's not the point. The point is that is there a 99.9% probability, a 0.1% probability, a 50% probability or some other probability of guilt? Or am I just trying to confuse you by throwing numbers at you?
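The three answers above can be reproduced with a few lines of arithmetic. This is only a sketch under loud assumptions: the real culprit is somewhere among the people searched, the culprit always matches, and the "prior weights" (1.0 for a suspect, 0.1 for an outlier) are the hypothetical figures from this comment, not anything a court would use. The function name is invented for illustration.

```python
def posterior_of_match(prior_weights, suspect_index=0):
    """Probability that the matching person at `suspect_index` is the culprit,
    given the prior weight of every person who matched the evidence.

    Assumes the culprit is among the matches, so each matching person's
    share of the total prior weight is their posterior probability.
    """
    total = sum(prior_weights)
    if total <= 0:
        raise ValueError("prior weights must sum to a positive number")
    return prior_weights[suspect_index] / total


# Only the original suspect matches among the 1000 suspects.
only_match = posterior_of_match([1.0])

# The suspect plus 1000 matches from a million equally likely people.
equal_priors = posterior_of_match([1.0] + [1.0] * 1000)

# The suspect plus 10 matching outliers, each a tenth as likely a priori.
weighted = posterior_of_match([1.0] + [0.1] * 10)

for label, p in [("suspects only", only_match),
                 ("million random people", equal_priors),
                 ("weighted outliers", weighted)]:
    print(f"{label}: {p:.4f}")
```

The match statistic is the same in all three runs; only the assumed pool of alternative sources changes, which is why the answer swings so widely.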
This should only surprise people who think court cases are about facts and justice. It is well known that facts just get in the way of what's true and real.
This sounds like a good reason to stop releasing all of those convicted murderers and rapists who were freed on DNA evidence.
Not at all.
There is no problem determining that the DNA is from somebody else than the accused. All it takes is a single marker that's different. That's easy.
The problem is going from some bunch of markers that match to saying "This IS the bum!" (Well, except for a one-in-[some number] chance it really isn't.) That requires a lot of information about the prevalence of genetic markers and whether their distributions are correlated. That information isn't well researched and the different estimates are based on different wild guesses by different experts. Further, the whole independent-probability thing gets knocked into a cocked hat with FAR lower numbers if the police found the accused by searching a DNA database for matches. And what if he had an evil identical twin? Or somebody with access to PCR gene-amplification materials, a DNA sample, and an atomizer decided to frame him?
IMHO DNA evidence is decisive for the defense. But pending a lot more research it's still voodoo for the prosecution.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Sadly true, but there's so much about DNA analysis that you don't get on an episode of CSI. On TV DNA analysis only takes a few minutes and matches are proudly announced by flashing messages on the DNA machine.
In real life good DNA matching takes days, costs a lot of money and, as the article points out, matching can be in the eye of the beholder. DNA samples are incredibly easy to contaminate; whole labs can become contaminated over time if they don't have and follow strict contamination protocols. And there has been more than one reported case of harried techs gun-decking DNA analysis when police and prosecutors were certain they had the right guy.
Well done DNA analysis can be an amazing crime fighting tool but the science is not perfect and it's okay to be skeptical. There is no magic identification test that's completely fool proof. And DNA tests are only as good as the fool running the test.
That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
In this case, however, there were many people present at the discovery of the object from which the DNA was taken for analysis. As it happens, several of these people were relatives (brother, mother) of the person the prosecution were trying to persuade us was the person that possessed (in legal terms) the object.
The question that I kept hoping the defense attorney would ask was "what are the probabilities of an erroneous match if the people are relatives, not just two random people off the street"? Unfortunately, he didn't.
As it happened, there were so many other peculiarities in this case as well as some pretty bizarre testimony from prosecution witnesses that we voted to acquit without making much of the DNA evidence.
udin
... in a case I was on the jury for. (Sorry for the bait-and-switch title, couldn't resist.)
This was a case of armed home invasion. The victim was a big bruiser of a man, a multiple convicted drug addict. The defendant was a scrawny young Cape Verdean guy. (Cape Verdean drug gangs are common in the area: this is important later.) The victim testified that, after buying drugs from the defendant, he got a series of enraged voicemails demanding the return of the defendant's cell phone. A few hours later, the defendant allegedly shows up at the victim's house with a gun and barges in yelling. A struggle ensued, a shot was fired into the floor, and the guy with the gun fled.
Evidence against the defendant included eyewitness testimony from the victim, matching ammunition found at the defendant's house, and crucially a do-rag found at the scene of the scuffle. DNA tests matched the do-rag to a mixture of at least 3 people, including the defendant. The DNA mixing was probably due to really awful police work: a paper bag borrowed from the defendant's cupboard is not a proper evidence collection container.
As in TFA, mixed DNA dramatically affected the "probability of exclusion" statistics: the state's expert testified there was a 1 in 50 chance that a random man on the street would match the DNA on the do-rag. The odds that a random *black* man on the street would match were much higher, like 1 in 20; the defense pointed out that the odds that a random *Cape Verdean* would match would be higher still.
We've grown used to DNA evidence saying things like, "not one other person on the planet could match this DNA", but in this case, the odds were good that the DNA evidence would match at least one other person sitting in the *courtroom*. The defense also took the unusual tactic of introducing the defendant's sister, who testified that her *other* brother looks very much like the defendant, and she said it was *his* voice on the enraged voicemails. What are the odds that the DNA matches the *brother* instead? Damned good.
Between the fact that the eye witness seemed shifty and unreliable and was probably on crack at the time of the incident, and the fact that all the physical evidence could just as well implicate the brother as the defendant, we couldn't rule out the possibility that the cops got the wrong guy, so we found him not guilty. If I had to take a bet, I'd say he did it, but I wouldn't bet his life on it.
Anyway. Moral of the story is: on cop shows and in the public awareness, DNA evidence is rock solid and incontrovertible. But in the real world, the statistics of DNA mixtures make things a whole lot less cut-and-dried.
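To put a rough number on the "someone else in the courtroom" point, here is a back-of-the-envelope sketch. It assumes everyone in the room is unrelated and matches independently with the same probability, which is exactly what fails for a brother, and the head count of 30 is a made-up figure, not something from the trial.

```python
def prob_at_least_one_match(match_prob, n_people):
    """Chance that at least one of `n_people` unrelated people matches,
    if each matches independently with probability `match_prob`."""
    if not 0.0 <= match_prob <= 1.0:
        raise ValueError("match_prob must be between 0 and 1")
    return 1.0 - (1.0 - match_prob) ** n_people


COURTROOM_SIZE = 30  # hypothetical head count, for illustration only

for label, p in [("1 in 50", 1 / 50), ("1 in 20", 1 / 20)]:
    chance = prob_at_least_one_match(p, COURTROOM_SIZE)
    print(f"{label}: {chance:.0%} chance of another match among {COURTROOM_SIZE} people")
```

Close relatives share far more markers than strangers do, so a figure computed this way understates the chance that the brother would match.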
That's being done routinely all over the world today.
People who drink are statistically more likely to commit traffic accidents, so they are convicted without the need to actually do any harm to anyone.
Well, then, if you expect your opponent to pull something like that, bring in a statistician, qualify him as an expert witness and let him rip the assertion to shreds.
That doesn't always work so well. Read about John Puckett sometime...
Rather than try to sort out the disparities between its numbers and database findings, the FBI has fought to keep this information under wraps. After Barlow subpoenaed the Arizona database searches, the agency sent the state's Department of Public Safety a cease-and-desist letter. Eventually, the Arizona attorney general obtained a court order to block Barlow's distribution of the findings. In other instances, the FBI has threatened to revoke access to the bureau's master DNA database if states make the contents of their systems available to defense teams or academics. Agency officials argue they have done so because granting access would violate the privacy of the offenders (although researchers generally request anonymous DNA profiles with no names attached) and tie up the FBI's computers, impeding investigations. These justifications baffle researchers.
Source: DNA's Dirty Little Secret
And most people don't understand statistics, however good they are, and draw wrong conclusions. Case in point: the author of TFA doesn't seem to have a clue what a likelihood ratio (LR) is. In the article it comes across as a type of comparison - contrasted with RMP (random match probability) and RMNE (random man not excluded), which are different tests to apply to the data. But actually LR is a way of presenting a probability which is used by forensic scientists because it's supposed to be easier for juries to understand - so you could present an RMP result as an LR, or an RMNE result as an LR.
FWIW, I'm not defending any of the statistics in TFA as good. I notice a complete absence of any error estimates. And I distrust forensic match probabilities in general because I've seen forensic fingerprint analysis software which uses pseudorandom numbers in the computation of the match probability and can vary the LRs presented (to 16 significant figures, believe it or not) by an order of magnitude if you recalculate.