Scientists and Lawyers Argue For Open US DNA Database

Posted by Soulskill on Saturday January 9, 2010 @03:29AM from the they-must-have-caught-an-episode-of-csi dept.

chrb writes "New Scientist has an article questioning the uniqueness of DNA profiles. 41 scientists and lawyers recently published a high-profile Nature article (sub. required) arguing that the FBI should release its complete CODIS database. The request follows research on the already released Arizona state DNA database (a subset of CODIS) which showed a surprisingly large number of matches between the profiles of different individuals, including one between a white man and a black man. The group states that the assumption that a DNA profile represents a unique individual, with only a minuscule probability of a secondary match, has never been independently verified on a large sample of DNA profiles. The new requests follow the FBI's rejection of similar previous requests."

Misuse Of Statistics by Jane+Q.+Public · 2010-01-09 03:56 · Score: 4, Interesting

I have been concerned for years about this, because you often hear prosecutors and "expert" witness testimony to the effect that "the odds are billions to one against this being someone else".

Among other possible statistical mistakes, these unrealistically large numbers are based on the idea that each genetic location being compared is statistically independent. But in fact we know that to not be so. What we definitely do not know is how, or how often, many of these may actually depend on each other.

Let me give you a purely hypothetical example: what are the odds that a genetic profile taken from a random person contains a gene determining curly hair?

You can answer this approximately by simply observing what percentage of the population has curly hair. Let's say 1/4 just for argument. So your odds are 1 in 4.

But here's the kicker question: what are the odds that a genetic profile includes a gene for curly hair, given that it also contains a gene for sickle cell anemia?

The odds are going to change drastically.

This is not a real example, of course, just illustrative. But one can easily see that the contents of genetic locations are NOT necessarily statistically independent, even if one of them does not directly cause the other.

We simply do not know enough to say that any two genetic locations are truly independent. Therefore these huge probabilities ("billions to one" for example) being spouted by prosecutors are completely specious.
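
To put numbers on the hypothetical above, here is a minimal Python sketch. The joint distribution is invented purely for illustration (it is not real genetic data); it is chosen so that trait A has the 1-in-4 frequency used in the example, while trait B is deliberately correlated with it.

```python
from fractions import Fraction

# Hypothetical joint distribution over two traits (A, B).
# Keys are (has_A, has_B); the values are made-up population fractions.
joint = {
    (True, True): Fraction(3, 20),
    (True, False): Fraction(2, 20),
    (False, True): Fraction(1, 20),
    (False, False): Fraction(14, 20),
}
assert sum(joint.values()) == 1

# Marginal frequencies of each trait.
p_a = sum(p for (a, _), p in joint.items() if a)  # 1/4
p_b = sum(p for (_, b), p in joint.items() if b)  # 1/5

# Conditional probability of A given B.
p_both = joint[(True, True)]
p_a_given_b = p_both / p_b  # 3/4

# What the product rule would claim if A and B were independent.
naive_both = p_a * p_b  # 1/20

print("P(A) =", p_a)
print("P(A | B) =", p_a_given_b)
print("product rule says P(A and B) =", naive_both)
print("actual P(A and B) =", p_both)
```

With these invented numbers the product rule understates the chance of seeing both traits together by a factor of three. Chain that kind of error across a dozen or more locations and the quoted odds can be off by orders of magnitude.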
1. Re:Misuse Of Statistics by Cassini2 · 2010-01-09 05:16 · Score: 4, Interesting
  
  How it really works? Imagine that you already identified several suspects. If you take DNA samples of these few people and one of them matches the DNA from the hair from the scene, you can still conclude that given your knowledge, with a very high probability the person in question was present at the crime scene.
  
  While true, this statement is yet another example of the trap of misleading statistics. For a single test, your statement is likely true. Taken collectively, across all the tests the FBI lab is likely doing, it is likely false.
  Look at it this way: "The probability of me, as a random individual, winning the lottery today is near zero." From this, it is tempting to conclude that: "no random individual in North America will win the lottery today." However, this is clearly not true. Multiple random strangers will almost certainly win the lottery today.
  The statement "no random individual will win the lottery today" is false because the numbers involved changed enormously: there are millions of people in North America. A similar problem happens with the FBI genetic testing. They do a great deal of testing. Proving an individual test is likely correct is very different from proving that large numbers of tests are all correct.
  From a statistical analysis point of view, you would be better off matching any given DNA profile against everyone else's in North America. Then you would know exactly how many random matches occurred, and whether lab contamination occurred, because the sample would match the lab techs' and the police officers' DNA too. This is the test the FBI is arguing against. Nevertheless, this is the validation test that needs to be done, because modern PCR DNA techniques should detect significant numbers of people connected with the location and/or sample access path, over significant periods of time.
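
The lottery argument can be sketched in a few lines of Python. The per-comparison match probability and the database size below are hypothetical values chosen only for illustration, and the calculation treats every comparison as independent, which is itself a simplifying assumption.

```python
import math

def prob_at_least_one(p: float, trials: int) -> float:
    """P(at least one hit) in `trials` independent tries, each with chance p.

    Computes 1 - (1 - p)**trials via log1p/expm1 so tiny p values
    do not get lost to floating-point rounding.
    """
    return -math.expm1(trials * math.log1p(-p))

def pair_count(n: int) -> int:
    """Number of distinct pairs among n profiles: n * (n - 1) / 2."""
    return n * (n - 1) // 2

p_match = 1e-9      # hypothetical chance that two unrelated profiles coincide
db_size = 100_000   # hypothetical number of profiles in a database

single = prob_at_least_one(p_match, 1)
pairs = pair_count(db_size)
any_pair = prob_at_least_one(p_match, pairs)

print(f"one comparison:        {single:.3g}")
print(f"pairs in the database: {pairs:,}")
print(f"some pair matches:     {any_pair:.3g}")
```

Under these assumptions a single comparison is almost certainly clean, yet a coincidental match somewhere among the roughly five billion pairs is close to certain. Both statements are true at once, which is the point of the lottery analogy.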
Re:chimps have 97% of human DNA by Anonymous Coward · 2010-01-09 04:02 · Score: 4, Interesting

I am consistently horrified that juries offload their responsibility by blindly applying the judgement of expert witnesses (who are often paid to say the same thing over and over again), whether forensic scientists, psychologists or IT specialists. I take DNA evidence the same way as I take the contents of a third party /var/log: with a pound of salt, because I know it could have been planted.
When I was a juror I was interested in means, motive and opportunity as necessary but not sufficient conditions to vote guilty. I also made use of the defendant's inconsistencies in his testimony, details about the background of the defendant and victim to the extent that it was relevant to his alleged act, consistency of information from eye witnesses around the time of the event, known and unknown, doctors' reports, police officers, etc. I paid little attention to forensic details which might, according to the arguments of a scientist, help /confirm/ the prosecution's case, because I have more than reasonable doubt in my mind of any evidence which requires me to be an expert to interpret correctly - especially when I'm not that expert, instead deferring to some guy I just heard in a courtroom.
It's not just the veracity of the DNA testing by BetterSense · 2010-01-09 04:37 · Score: 4, Interesting

Even more important than the issue of statistical independence or veracity of the DNA testing process itself (which SHOULD be investigated) is the simple possibility of corruption, incompetence, or simple mistake. If a DNA testing lab accepts a bribe to give its expert testimony, makes a mistake and switches sample vials, etc, their expert court-testimonyer will still show up in court claiming "The chances are approximately eighty-three bazillion to one".

This giant number has the emotional effect of certainty, but that number is just the chances that the sample the DNA lab received corresponds to the DNA of the accused--IF NO MISTAKES WERE MADE and nobody is planting evidence or accepting bribes. It's not the chance that the accused is innocent. I'm sure this distinction is made in the verbal "fine print" but the jury will still be swayed. The giant odds numbers are nothing but powerful emotional hooks. The real possibility that the DNA evidence does not finger the accused breaks down like this:

1:1billion the DNA matches someone else due to a flaw in the statistics of DNA testing
TIMES
1:$smallernumber the DNA lab has accepted a bribe, has a mole, made a mistake, etc
TIMES
1:$smallernumber the DNA lab has honestly received a sample from the accused but the sample was planted at the scene by police, the real criminal, or really bad luck.

The jury won't be considering these factors when they hear the "1:1billion" number. It's nothing but sensationalism.
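
One way to read the breakdown above, sketched in Python: the evidence points at the right person only if none of the listed failure modes occurred, so the per-step success probabilities multiply. The failure probabilities below are placeholders standing in for "$smallernumber" (they are not measured rates), and the failure modes are assumed to be independent.

```python
def prob_evidence_misleads(*failure_probs: float) -> float:
    """Chance that at least one independent failure mode occurred.

    The evidence is sound only if every step is sound, so multiply the
    per-step success probabilities and subtract the product from 1.
    """
    all_ok = 1.0
    for p in failure_probs:
        all_ok *= 1.0 - p
    return 1.0 - all_ok

# Hypothetical values, for illustration only.
coincidental_match = 1e-9   # the headline "1 in a billion" figure
lab_failure = 1e-3          # bribe, mole, swapped vials, ...
planted_sample = 1e-3       # genuine sample, but planted at the scene

total = prob_evidence_misleads(coincidental_match, lab_failure, planted_sample)
print(f"headline figure:        {coincidental_match:.1e}")
print(f"chance something broke: {total:.1e}")
```

With these placeholder inputs the overall chance of misleading evidence is about 2 in 1,000, and the "1 in a billion" term contributes almost nothing to it. The least reliable step dominates the result.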
Don't forget the human epigenome by junglebeast · 2010-01-09 04:39 · Score: 4, Interesting

Scientists already know that the human genome (DNA) is not the complete blueprint for an organism. The human epigenome, which is far more complex and contains more of the details about how to put those building blocks together, is no less important, and it seems likely that it contains more of what separates us as individuals.
Re:chimps have 97% of human DNA by Leo_07 · 2010-01-09 04:39 · Score: 4, Interesting

I agree with mangu that "DNA tests should be used for finding someone innocent rather than guilty." Paternity tests work in a similar way, even though the general public does not seem to know it: genetic microsatellite tests can disprove paternity but, because of false positives, cannot prove that a man is in fact the father. The question should be: how many microsatellite sites (sites that usually differ across the human population) should be analyzed to arrive at a conclusion?
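
As a rough illustration of the "how many sites" question, the Python sketch below counts how many sites are needed to push a combined coincidental-match probability under a target. It assumes every site matches by chance with the same probability and that sites are independent, which is exactly the assumption questioned elsewhere in this discussion. The function name and the per-site values are hypothetical.

```python
from fractions import Fraction

def loci_needed(per_locus_match: Fraction, target: Fraction) -> int:
    """Smallest k such that per_locus_match**k <= target.

    Assumes identical, independent sites. Exact fractions are used so the
    boundary case is not disturbed by floating-point rounding.
    """
    if not (0 < per_locus_match < 1 and 0 < target):
        raise ValueError("need 0 < per_locus_match < 1 and target > 0")
    k = 0
    combined = Fraction(1)
    while combined > target:
        combined *= per_locus_match
        k += 1
    return k

one_in_a_billion = Fraction(1, 10**9)

# Hypothetical per-site chances of a coincidental match.
print(loci_needed(Fraction(1, 10), one_in_a_billion))  # 9
print(loci_needed(Fraction(1, 4), one_in_a_billion))   # 15
```

The answer is very sensitive to the per-site figure, and any correlation between sites would raise the true number of sites required.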
Re:chimps have 97% of human DNA by honkycat · 2010-01-09 05:29 · Score: 4, Interesting

If the purpose is to independently evaluate the rate of false matches in a DNA database to be used in criminal investigations, what better database is there than the one that will be used for that purpose?
Privacy issues can easily be worked around here---there's no need for personally identifiable information (i.e., name or location, not the dna data itself :-P ) to accompany the database for this purpose. You might also worry about statistical independence between the sample to be used for the analysis and that used for testing the results, but there are very well established methods for using subsamples of a data set in just this way.
Re:chimps have 97% of human DNA by hey! · 2010-01-09 05:46 · Score: 4, Interesting

I wouldn't call it a case of mission creep. Research is needed to confirm that the database is suitable for the purposes it was created for.
These issues were identified as early as 1973, in a landmark HEW report on computer records and the rights of citizens. It boils down to this: inferences drawn from data that affect the lives of people ought to be rationally justifiable. This means not using data until its suitability can be established. Mission creep can lead to data being used outside the context it is reliable in; but we can also run afoul of privacy and due process concerns by collecting data in the first place without establishing that it means what we hope it means.
I've been concerned for years about the reasoning used in DNA screening. It entails a long chain of assumptions, and while all the assumptions *seem* plausible, the chance that one or more of them is wrong or has some unknown wrinkle is not negligible.

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.