Scientists and Lawyers Argue For Open US DNA Database
chrb writes "New Scientist has an article questioning the uniqueness of DNA profiles. 41 scientists and lawyers recently published a high-profile Nature article (sub. required) arguing that the FBI should release its complete CODIS database. The request follows research on the already released Arizona state DNA database (a subset of CODIS) which showed a surprisingly large number of matches between the profiles of different individuals, including one between a white man and a black man. The group states that the assumption that a DNA profile represents a unique individual, with only a minuscule probability of a secondary match, has never been independently verified on a large sample of DNA profiles. The new requests follow the FBI's rejection of similar previous requests."
Before DNA tests are accepted as conclusive much better studies should be done, particularly for false positives.
I believe DNA tests should be used for finding someone innocent rather than guilty. Negatives aren't that big a problem. If there are discrepancies then obviously it's not the same DNA.
Positives are another issue: how many common features must there be to accept two DNA samples as coming from the same individual?
I have been concerned for years about this, because you often hear prosecutors and "expert" witness testimony to the effect that "the odds are billions to one against this being someone else".
Among other possible statistical mistakes, these unrealistically large numbers are based on the idea that each genetic location being compared is statistically independent. But in fact we know that to not be so. What we definitely do not know is how, or how often, many of these may actually depend on each other.
Let me give you a purely hypothetical example: what are the odds that a genetic profile from a random person contains a gene determining curly hair?
You can answer this approximately by simply observing what percentage of the population has curly hair. Let's say 1/4 just for argument. So your odds are 1 in 4.
But here's the kicker question: what are the odds that a genetic profile includes a gene for curly hair, given that it also contains a gene for sickle cell anemia?
The odds are going to change drastically.
This is not a real example, of course, just illustrative. But one can easily see that the contents of genetic locations are NOT necessarily statistically independent, even if one of them does not directly cause the other.
We simply do not know enough to say that any two genetic locations are truly independent. Therefore these huge probabilities ("billions to one" for example) being spouted by prosecutors are completely specious.
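To make the independence point concrete, here is a minimal Python sketch. Every frequency in it is made up for illustration and none comes from real genetics. It compares the chance of seeing two traits together when the per-location frequencies are simply multiplied (the independence assumption) with the chance when the second trait is correlated with the first.

```python
# Hypothetical numbers for illustration only; not real genetic frequencies.
p_a = 0.25          # P(trait A), e.g. the "curly hair" gene in the example
p_b = 0.01          # P(trait B) in the whole population
p_a_given_b = 0.75  # assumed P(A | B): A is far more common among carriers of B

# Product rule under the independence assumption: P(A and B) = P(A) * P(B)
p_both_independent = p_a * p_b

# General product rule, valid whether or not the traits are independent:
# P(A and B) = P(A | B) * P(B)
p_both_actual = p_a_given_b * p_b

# How much the independence-based figure overstates the rarity
overstatement = p_both_actual / p_both_independent

print(f"assuming independence: 1 in {1 / p_both_independent:.0f}")
print(f"with the dependence:   1 in {1 / p_both_actual:.0f}")
print(f"rarity overstated by a factor of {overstatement:.1f}")
```

With only two locations the gap is modest, but the same kind of error compounds with every additional location that is multiplied in.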
By the way, I should point out that there are at least several public and private DNA databases being developed in the U.S. alone. However, some of them are for special purposes (genealogy for example), and will test different locations than those used by forensics labs.
What are we talking about here? If this is a catalog of DNA of convicted criminals then it might be ok. But if it's also DNA samples from other people who gave a sample to clear their name, then I don't think it should be made public.
Letter is at http://www.bioforensics.com/articles/Krane_Science_letter_2009.pdf/
Even more important than the issue of statistical independence or veracity of the DNA testing process itself (which SHOULD be investigated) is the simple possibility of corruption, incompetence, or simple mistake. If a DNA testing lab accepts a bribe to give its expert testimony, makes a mistake and switches sample vials, etc, its expert witness will still show up in court claiming "The chances are approximately eighty-three bazillion to one".
This giant number has the emotional effect of certainty, but that number is just the chance that the sample the DNA lab received corresponds to the DNA of the accused--IF NO MISTAKES WERE MADE and nobody is planting evidence or accepting bribes. It's not the chance that the accused is innocent. I'm sure this distinction is made in the verbal "fine print" but the jury will still be swayed. The giant odds numbers are nothing but powerful emotional hooks. The real possibility that the DNA evidence does not finger the accused breaks down like this:
1:1billion the DNA matches someone else due to a flaw in the statistics of DNA testing
PLUS
1:$smallernumber the DNA lab has accepted a bribe, has a mole, made a mistake, etc
PLUS
1:$smallernumber the DNA lab has honestly received a sample from the accused but the sample was planted at the scene by police, the real criminal, or really bad luck.
The jury won't be considering these factors when they hear the "1:1billion" number. It's nothing but sensationalism.
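One way to combine such figures, assuming the failure modes are independent, is to ask for the chance that at least one of them occurred. The sketch below does that in Python. The lab-error and planted-sample rates are invented placeholders, since nobody in this thread has real numbers for them.

```python
import math

# Hypothetical per-case probabilities, chosen only for illustration.
p_coincidental_match = 1e-9   # the "1 in a billion" figure quoted in court
p_lab_error = 1e-3            # assumed: mix-up, contamination, bribery
p_planted_sample = 1e-4       # assumed: evidence planted or innocently transferred

failure_modes = [p_coincidental_match, p_lab_error, p_planted_sample]

# Treating the failure modes as independent, the evidence is misleading
# if at least one of them occurs: 1 - product of (1 - p_i).
p_misleading = 1 - math.prod(1 - p for p in failure_modes)

print(f"chance the match is misleading: about 1 in {1 / p_misleading:,.0f}")
```

Whatever the exact figures, the combined chance is dominated by the most likely failure mode, not by the tiny coincidental-match number.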
Scientists already know that the human genome (DNA) is not the complete blueprint for an organism. The human epigenome, which is far more complex and contains more of the details about how to put those building blocks together, is no less important...and it seems likely that it contains more of what separates us as individuals.
Having the names of the people associated with each DNA analysis would be completely unnecessary. Just assign each person a unique, meaningless number in place of their name and the problem is solved. There's probably 6 other ways to solve the privacy problem and still make the data useful. If researchers find special cases where they need actual identities to better understand what's going on, make them sign NDAs and release the information to only them.
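The "unique, meaningless number" idea can be sketched in a few lines of Python. The record layout and the `pseudonymize` helper are hypothetical, not anything the FBI or CODIS is known to use. The key map stays private, so identities can still be looked up for the special cases mentioned above.

```python
import secrets

def pseudonymize(records):
    """Replace each name with a random ID; return (released rows, private key map)."""
    id_for_name = {}
    released = []
    for name, profile in records:
        # Reuse the ID if the same name appears twice, so repeats stay linkable.
        if name not in id_for_name:
            id_for_name[name] = secrets.token_hex(8)
        released.append((id_for_name[name], profile))
    return released, id_for_name

# Made-up records: (name, DNA profile as a tuple of marker values)
records = [
    ("Alice Example", (12, 14, 9)),
    ("Bob Example", (11, 14, 10)),
    ("Alice Example", (12, 14, 9)),
]
released, key_map = pseudonymize(records)
for row in released:
    print(row)
```

Random IDs are used rather than a hash of the name, because a hash of a guessable name can be reversed by simply trying names.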
The FBI doesn't want to release this because they know there's a lot of partial or complete matches in the database. Suddenly having news stories about how there's 100 people in the FBI DNA database with the same 13 identifiers (flash to expert testimony claiming billions to one of such a match) would be a major disaster for the FBI. The FBI would then talk about how most of them are the same person using different names, and various other explanations, but the damage would be done (flash to news story about one side of a match being a 22 year old male from Alaska, and another a 76 year old female from Florida).
I understand why the FBI doesn't want to do this, but it's extremely important data about how valid this type of DNA testing is (especially within certain populations) (flash to news story about racism). Essentially the government holds evidence about the validity of DNA testing that's relevant to thousands of criminal cases that it refuses to release. That sounds like a strong constitutional issue to me.
AccountKiller
The FBI's database only uses 15 markers, checking 15 sites in DNA. That's not good enough, and there are false matches. The problem is that they're using DNA technology from about 1990.
23andme, the commercial DNA analysis service, checks 580,000 sites in DNA. 23andme probably has enough data to validate the quality of the FBI's marker selection. That's a good way to check. Identical twins do match, even at the 23andme level of analysis.
No, it wouldn't, for two reasons given in the article. First is that it is possible someone has two entries in the database. The only way to discover this is to find a matching pair of DNA sequences and then look at the personally identifiable information to figure out if you have a duplicate or not. Second is the possibility that the information in the database was entered wrong, and that someone's profile does not match what their DNA is.
It's similar to the birthday problem. Given a class of 35 students, the odds that one of them has the same birthday as you are about 35/365 ≈ 9.6%. However, the probability that there are two students in the class who have the same birthday (not necessarily yours) is about 81% (check Birthday Problem on Wikipedia).
It's the same here. The probability of there being matches between different people in a large database of DNA is going to be a lot higher than the probability that there is a match to a given person or crime scene DNA.
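Both birthday figures are easy to check. The short Python sketch below assumes 365 equally likely birthdays and ignores leap years. Dividing 35 by 365 is a slight overestimate for the first case, because it counts the chance of several classmates sharing your birthday more than once.

```python
def p_match_with_you(n, days=365):
    """Chance that at least one of n other people shares one specific birthday."""
    return 1 - (1 - 1 / days) ** n

def p_any_shared(n, days=365):
    """Chance that at least two of n people share a birthday (any day)."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct

print(f"someone matches you: {p_match_with_you(35):.1%}")
print(f"any two match:       {p_any_shared(35):.1%}")
```

The first number stays small because only 35 comparisons are made against one fixed birthday. The second is large because every one of the 595 pairs in the class gets a chance to match.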
Assume several thousand matches are found in the database. A defense lawyer will argue that the odds are in the thousands that the defendant was falsely matched. This is wrong. It is much like the puzzle of how many people you need at a party to have even odds of two sharing a birthday (about 23). But the odds that two given people have the same birthday are about 1 in 365, not 23/365 as would be falsely concluded using the same argument as above.
Assume the odds are 1 in 10,000,000 that two people have the same DNA profile. Then the defense lawyer asks the expert witness:
"How many people would have to be in a stadium before the odds are greater than 50% that two have the same profile?"
Witness: "About 3,700."
Of course, the readers of Slashdot would be excused from the jury by the defense, as they would not fall for this.
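The stadium question can be worked out directly. This Python sketch takes the 1 in 10,000,000 pairwise figure assumed above and treats every pair of people as an independent comparison, which is a simplification. The function names are mine.

```python
def p_some_pair_matches(n, p_pair):
    """Chance that at least one of the n*(n-1)/2 pairs matches (pairs treated as independent)."""
    pairs = n * (n - 1) // 2
    return 1 - (1 - p_pair) ** pairs

def smallest_crowd(p_pair, target=0.5):
    """Smallest n for which the chance of some matching pair exceeds target."""
    n = 2
    while p_some_pair_matches(n, p_pair) <= target:
        n += 1
    return n

p_pair = 1e-7  # the 1 in 10,000,000 figure assumed in the comment above
n = smallest_crowd(p_pair)
print(n, f"{p_some_pair_matches(n, p_pair):.3f}")
```

Under this model the answer is a few thousand people, even though any one specific pair still has only a 1 in 10,000,000 chance of matching. That gap is exactly the one the lawyer's question glosses over.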
Forgive me that I'm a layperson who didn't RTFA
I'd forgive you, but the article was written for lay people and it clearly answered your question.
I was always assuming that, given that scientists who know what they're doing should have invented this test, there was some sophisticated process that would ensure that they would somehow only choose base pairs from the subset that was actually different in different individuals
If you had read the article, you might have noticed that it says the test selects for non-coding DNA (that is, DNA that doesn't produce proteins) that commonly varies in humans.
A properly convicted criminal serving a jail sentence has lost a portion of his rights, the most obvious being the right to leave the jail.
Rights are what you have as a result of being human, i.e. a rational animal. When you act to hurt an innocent person (violating his rights), you have thrown away some portion of your rights immediately. If the violated rights are among those recognized by the government and you're caught and successfully prosecuted, then the government can punish or force restitution in proportion to the damaged rights of the hurt person. The government does this without violating your rights because you have forfeited them to the extent of the damage you've done. When the punishment or restitution is complete, the deficit in your rights is gone. Your rights are restored - whether the government recognizes it or not.
Rights in the sense of civil rights or political rights have a lot of similarity to the phrase "It is right that." If you are about to leave a grocery with a can of soup, "it is right that" you pay the grocer: he has a right to be paid.
-----
The protection of others is not the only reason governments jail people (and please don't confuse government with the fiction that is society). Punishment, political revenge, "protective custody", and "crimes" that have no victims are all reasons governments use for imprisonment.