The 1000 Genomes Project
jd writes "An international consortium of specialists in genetics has announced the 1000 Genomes Project, in which at least 1,000 people from around the world will have their genomes fully sequenced as part of an effort to discover the relationship between genetics and disease. At present, over 100 regions of DNA are known to be related to illnesses, but the maps that exist are vague and are drawn from an extremely small population pool. According to the article, this results in the need for slow, expensive, and laborious studies to pinpoint causes, especially for rarer conditions. This project aims to find conditions that might only appear once in every 2,000 people (though how they intend to do that with half that number is unclear). The researchers hope to massively speed up the diagnosis of genetically linked illnesses and to improve the reliability of such diagnoses."
I wonder why there's so much funding coming from China for this project.
You can see the list of all participants (including funders) here.
This project aims to find conditions that might only appear once in every 2,000 people (though how they intend to do that with half that number is unclear).
Well, they could sequence the DNA of people known to have rare diseases.
Let's try to make it clearer, then.
The probability that a given condition appears in an individual is 1 in 2000, or 0.0005. The probability that it does not appear in that individual is 0.9995. The probability that it does not appear in any of 1000 individuals is 0.9995^1000 = 0.6 approximately; and the probability that at least one of the 1000 individuals has it is 0.4. Not bad at all. (If you used 2000 people, the probability that at least one of them would have it would improve to about 0.6.)
Suppose you aren't interested in just one condition, but in lots of conditions -- say, ten of them, each with the same 1-in-2000 frequency. The probability that at least one individual would have at least one of those conditions is 1 - 0.9995^(1000*10) = 0.993, i.e. practically certain.
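For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The function name `prob_at_least_one` is made up for illustration, and it assumes what the calculation above assumes: every person and every condition is independent, and each condition has the same frequency.

```python
def prob_at_least_one(prevalence: float, sample_size: int, n_conditions: int = 1) -> float:
    """Chance that at least one person in the sample has at least one condition.

    Assumes independence between people and between conditions, and that
    every condition has the same prevalence (a simplification).
    """
    # Probability that nobody has any of the conditions, then take the complement.
    prob_none = (1.0 - prevalence) ** (sample_size * n_conditions)
    return 1.0 - prob_none


if __name__ == "__main__":
    # (sample size, number of conditions) pairs used in the comment above.
    for people, conditions in [(1000, 1), (2000, 1), (1000, 10)]:
        p = prob_at_least_one(1 / 2000, people, conditions)
        print(f"{people} people, {conditions} condition(s): {p:.3f}")
```

The three cases correspond to the roughly 0.4, 0.6 and 0.993 figures quoted above.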
They really ought to teach basic probability theory in schools...
I have no idea what they plan to do with 1000 gnomes, but I can only guess that whatever it is will end in a giant explosion.
Anyone reading up on the progress in genomics over the last decade has seen the huge leaps in speed and accuracy and the insane cuts in cost to work with nucleic acids.
At the lab level, what used to be a week's work with lots of chemicals and processing is now usually a 20-minute protocol with a kit from Qiagen. What used to be massive amounts of work with hundreds of gels, digestions and labeling steps to analyse nucleic acid sequences is now a few days with an Affymetrix kit, giving far more accurate and usable results. Across every step this progress has been rapid.
And in the near future, say within a decade, all these methods will become outdated and be replaced with near-realtime analysis and diagnosis. The best point in all of this is that no matter how advanced medical tech has become, the limiting factor has been that it's necessary to actually BRING your disease-ridden body to the hospital or doctor. Companies like www.decodeme.com are what I expect DNA assessment to be like in the future. You send off some samples you scrape off your cheek yourself, and within a few days you get a full diagnosis of any known predisposition to disease or genetic problems.
Which is why a lot more attention should be put into the debate on morality and genetic profiling. It's going to be here before you can blink, so it might be nice to know what you think about using embryo selection to wipe out CF before it becomes a possibility.
I've got Marfan syndrome. I am really eager to have my genome sampled so that this condition is better understood.
www.marfan.org
It's sort of right. Usually the phenotype will be recessive, so two bad copies need to exist for the condition to be seen, but only one bad copy needs to exist for it to be a useful sequence. For example, the frequency of cystic fibrosis in Caucasians is 1/400, but the allele frequency is 1/20. So you need to look at the square root, which gives you a much higher probability of a hit. (BTW, the frequency in Asians is, I believe, on the order of 1/500,000, so CF could be cured simply by outbreeding - and no - that never worked for me as a pickup line...)
Note, that you don't necessarily need to have a visible phenotype for the sequence to be useful. You might have a marker already from previous studies to allow you to identify a single bad copy.
There are other projects that sequence the DNA of people known to have rare diseases such as cystic fibrosis, and there are projects that sequence the DNA of people with common diseases like heart disease, but we don't know much about the variants in the middle that are neither very common nor very rare. This is an attempt to fill in that gap in our knowledge.
Finding diseases that occur in 1 in 2000 people with a genomic study of 1000 people is entirely possible. With one thousand people you have two thousand sets of genes. Since most genetic diseases are caused by two copies of the same recessive allele (usually resulting from broken genes) in a single individual, there would be lots of carriers: people with a single disease allele, which could be spotted as, for example, a major deletion relative to the genomic reference sequence.
What do we propose to do once we have genetic maps anyway? Scientists (especially within the drug industry) have no clue what they're doing - all we do is "best guess" diagnoses, and then pump people full of drugs that may or may not help, and that induce more serious side effects than they're supposed to be "helping".
This whole idea of "early detection" pisses me off; it just reminds me of the drug industry. It really does come down to the almighty human thinking they know what they're doing. Hopefully we find a genetic marker for depression... that way we can take 95% of people taking anti-depressants off their drugs for not actually having depression.
Every single other mammal on the planet survives without this bullshit; why can't we? Oh that's right, there's money to be made.
Does an individual's DNA structure change at all throughout one's lifetime?
I believe you are talking about the DNA sequence, and not the structure of DNA itself? The DNA sequence is relatively unchanged throughout your life. The only things that change it are spontaneous mutations and pathogen-induced mutations (bacteria, but especially viruses). Most of the time, cells with lethal malfunctions in their DNA undergo self-killing, known as apoptosis. Others that behave abnormally, whether due to infection, infection-induced DNA mutations or spontaneous DNA mutations, are usually killed off by specific immune defence killer cells that can recognize cells that are different from the others. But in the end, some persist and become uncontrollable, like cancer cells that divide extremely fast. Many mutations are silent though, meaning that even though the end products, the proteins, are different from 'normal', they can act as if nothing had happened.
When scientists use the word "complete" they are being misleading. There are very large, difficult-to-sequence regions (heterochromatin, e.g. centromeres) that have not been sequenced, ever, and that are biologically important (centromeres are required in every cell division, to ensure that each cell has the proper set of chromosomes).
Even within the "normal", euchromatic, sequenceable DNA, there are gaps that have not been sequenced.
Beyond this, you need to know haplotypes. That is, for most of your DNA there are two copies (except the X and Y sex chromosomes), one from dad and one from mom.
Since these two copies are different, it matters, a lot, what differences are where.
Remember that human chromosomes are diploid - we have two copies of (most) genes. (A few of the genes on the male Y chromosome have no analogue on the X chromosome, but that's a very small percentage of the human genome). So in total they will have roughly 2000 samples for each gene - 2 for every individual.
Of course, that doesn't provide a correlation with specific genetic diseases - but here classical genetics techniques allow you to get an insight into how some of those diseases might be related to specific genes. The easiest to understand are those genetic diseases that are dominant - that is, you need only one copy of the gene in order to have the disease. A dominant genetic disease which has a frequency in the population of 1 in 1000 would have roughly a 63% probability (1 - 0.999^1000) of being represented in a sample of 1000 people.
The situation is more complex for genetic diseases caused by recessive genes - which form the majority of genetic diseases. People carrying only one copy of the disease allele will be asymptomatic 'carriers' of the disease, so their children would have some chance of getting the disease if the other parent was also a carrier. However, even there, you often have a good idea who might be a carrier based on family history: how many others in that person's family are affected. If any of those individuals are the person's children, you know that that person is a carrier for the disease; if one of them is a (full) sibling, then there's (at least) a 50% chance that the individual is a carrier, etc.
These sorts of familial relationships are the bread-and-butter of traditional investigations into genetic diseases, and this promises the ability to multiply their effectiveness. In effect you are getting a window into the genomes of many more individuals than merely those whose genomes were sequenced.
p + q = 1
p^2 + 2pq + q^2 = 1
p and q are the frequencies of the two variants of a specific gene (assuming there are only two variants, but let's KISS). Each organism has two copies of a given gene. They can be pp, pq, or qq. So the proportions of p genes and q genes must add up to 100%. And the proportions of people who are pp, pq, or qq must add up to 100%, hence the two equations.
In the case of a simple autosomal recessive gene, the disease exists when an individual is qq. So q^2 = 1/2000 = 0.0005, and q (the prevalence of the allele) is about 0.022. The carriers are the pq people, and 2pq is about 0.044, so you would expect roughly 1 in 23 people to have the q gene (almost all of them as heterozygotes who have one p and one q gene). If a gene exists in about 1 in 23 people and you sample 1000, the odds that you wouldn't find it are pretty remote.
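Here is a small Python sketch of the same Hardy-Weinberg calculation, in case anyone wants to plug in other numbers. The function names are invented for illustration, and it assumes a single gene with two variants in Hardy-Weinberg equilibrium, as in the simplified model above. Taking the square root without rounding gives q of about 0.022, and the 2pq term gives the share of people carrying exactly one copy.

```python
import math


def hardy_weinberg(disease_prevalence: float) -> dict:
    """Genotype frequencies for a simple autosomal recessive disease.

    Assumes two alleles (p = normal, q = disease) and that the disease
    prevalence equals q^2, i.e. Hardy-Weinberg equilibrium.
    """
    q = math.sqrt(disease_prevalence)  # disease allele frequency
    p = 1.0 - q                        # from p + q = 1
    return {
        "p": p,
        "q": q,
        "unaffected_noncarriers": p * p,  # pp
        "carriers": 2 * p * q,            # pq
        "affected": q * q,                # qq
    }


def prob_allele_seen(q: float, n_people: int) -> float:
    """Chance that at least one of the 2 * n_people gene copies is the q allele."""
    return 1.0 - (1.0 - q) ** (2 * n_people)


if __name__ == "__main__":
    freqs = hardy_weinberg(1 / 2000)
    print(f"allele frequency q: {freqs['q']:.4f}")
    print(f"carrier frequency 2pq: {freqs['carriers']:.4f}")
    print(f"chance of seeing q in 1000 people: {prob_allele_seen(freqs['q'], 1000):.6f}")
```

With 1000 people there are 2000 copies of the gene, so the chance of missing the allele entirely is effectively zero under these assumptions.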
There, fixed it for you. No need to thank me.