Human Chromosome 22 Mapped
tuck was the first of many to submit this important milestone in arguably the world's most important scientific endeavor. The Human Genome Project has completed mapping its first entire chromosome, number 22. Second-smallest of our 23 chromosomes, some of 22's genes can cause "heart defects, immune system disorders, cancers, schizophrenia and mental retardation." Portion of its DNA which is "junk" (encodes no protein): 42%. Read it at your favorite source:
CNN,
MSNBC,
the Boston Globe,
the Christian Science Monitor,
the AP,
or Reuters.
I could be asking a stupid question here (blame it on my genes), but since everyone has different DNA, how do the scientists decide which genes are normal? I mean, if there is a hereditary disease that only a very few people get, you probably decide that the 'healthy' gene is the more normal gene. But what if the gene encodes, for example, eye color: how do the scientists decide which version of the gene to map? Or do they simply say that such-and-such a gene encodes eye color and leave open what DNA sequence it has?
The HGP uses a "clone by clone" strategy. That means the chromosome is cut into smaller pieces, then again into even smaller pieces; the pieces get cloned (cloning in molecular biology does not mean Dolly the sheep: it means a certain sequence is inserted into an extrachromosomal element called a plasmid, which can itself propagate and thrive in bacteria) and are then sequenced. This is cumbersome and requires a lot of manual work, but it provides unmatched sequencing quality.
TIGR adopted another strategy, the so-called shotgun method. In this strategy, you get a sequence, but you have not the slightest idea where in the genome it comes from. Only when you have a lot of these sequences can you start assembling - using a lot of CPU, trying to match them one to another, like a puzzle, but trickier: a sequence often contains errors, especially at the end (you can only read a few hundred bases, then the signals are too weak and not clear enough). This strategy requires considerably less highly trained manpower: just a lot of technicians and $$ for expensive sequencing machines (which were provided to TIGR by Perkin-Elmer). However, this method has a serious drawback: there are many sequences in the human genome which are partly or totally repeated. These repeat elements can be a royal pain in a sequencing project (been there, done that).
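To make the "puzzle" part concrete, here is a minimal sketch in Python of greedy overlap assembly. It is a toy under stated assumptions, not what any real assembler does: the reads are error-free, all come from the same strand, and the function names (`overlap`, `greedy_assemble`) and the example reads are made up for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the largest overlap.

    Stops when no pair overlaps by at least `min_len` bases and returns
    whatever contigs are left.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads taken from the made-up sequence ATGGCGTACGTTAGC.
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(reads))
```

With repeats, several reads can overlap equally well in more than one place, and a greedy merge like this one can then join the wrong pieces, which is the drawback described above.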
It is really a good thing that there are two sequencing projects with two different strategies: the comparison between the two sequences could provide an enormous insight into a) the quality of the sequencing and b) human variability. Venter from TIGR is definitely an enfant terrible, but I don't think he is one of those corporate Bad Guys (TM), rather one who was fed up with the slow pace of conventional scientific projects. On the other hand, HGP, representing "the slow pace", also shows us why a slow pace is sometimes needed.
Regards,
January
I can see the fnords!
Imagine hiding easter eggs in junk DNA... Bend the frog's arm the right way and it croaks the genetic engineers' names....
Of itself, the periodic table didn't make any new chemicals. What it did was provide a framework to identify patterns that could be used to predict areas of research. For example, the discovery of helium: the table predicted the existence of the element, and allowed calculation of the spectral lines. The element was then identified in the sun, hence the name (helium, from helios, the sun).
Similarly, the HGP of itself won't cure any diseases; rather it will allow the mapping of patterns. We'll be able to say, "This gene, which we know does this in wheat, is present in humans. Perhaps it does the same thing?"
Once we get one copy of the human genome sequenced, we'll still need to sequence many others, from [tall|short|skinny|fat|bald|hairy...] people, and start cross-referencing the results.
Think of it as a massive reverse engineering project on a program we only have uncommented object code for.
Unless the "junk" DNA are comments...
www.eFax.com are spammers
The pairs are not identical. Each gene on a chromosome corresponds to another on the second half of the pair. Some genes are dominant and others are recessive. If a gene is dominant, then the behaviour it prescribes is used; recessive genes only surface when both genes across the pair are the same, i.e. brown eyes+brown eyes=brown eyes, brown eyes+blue eyes=brown eyes, blue eyes+blue eyes=blue eyes. It is the difference between the chromosomes in a pair that makes us what we are. This also explains why males are more prone to some conditions, e.g. hemophilia: a recessive gene on the X chromosome is not dominated by anything on the Y chromosome, as that gene simply is not there, whereas females need two copies of the recessive gene to get it.
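The dominant/recessive rule above can be written down as a tiny Python sketch. It takes the post's single-gene eye-colour model at face value (real eye colour involves more than one gene), and the function names and the allele labels 'r' and 'N' are hypothetical.

```python
def eye_colour(allele_a: str, allele_b: str) -> str:
    """Toy single-gene model from the post: 'brown' is dominant over 'blue'."""
    for allele in (allele_a, allele_b):
        if allele not in ("brown", "blue"):
            raise ValueError(f"unknown allele: {allele}")
    return "brown" if "brown" in (allele_a, allele_b) else "blue"


def shows_x_linked_recessive(x_copies: list[str]) -> bool:
    """A recessive X-linked condition shows only if every X copy carries it.

    Pass one entry for a male (one X) and two for a female (two X);
    'r' marks the recessive allele and 'N' the normal one.
    """
    if not x_copies:
        raise ValueError("need at least one X chromosome")
    return all(copy == "r" for copy in x_copies)


print(eye_colour("brown", "blue"))           # the dominant allele wins
print(shows_x_linked_recessive(["r"]))       # male with one affected X
print(shows_x_linked_recessive(["r", "N"]))  # female carrier
```

A male needs only one recessive copy to be affected, while a female with one recessive and one normal copy is an unaffected carrier.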
J-aims
--
Yo, whatever happened to peas? Join T( H)GS
It'll be interesting to see if male and female lifespans are equal in another 200 years or so, after sexual equality has been fully established.
According to Desmond Morris' The Human Sexes, recently rerun on TLC, men's life expectancy was several years longer than women's for most of recorded history.
It has been only in this century that women's lifespans have caught up with, and exceeded, men's -- Morris attributes this to improvements in medical care, specifically the dramatic reduction in the number of women who die while giving birth.
DAMNED!!! Why can't I send a cancel message? :-) I pressed the "Submit" button again! Grrr....
Back to the topic. There are cases where finding the one single defect that causes a specific disease is easy. There are cases where you can pin down a certain region by tracing the genetic tree of a family whose members have the disease. There are cases where you are able to tell that, well, there *is* a genetic component to a certain disease. In some cases, you can tell apart two forms of the disease: a genetically inherited form and a genetically independent form (e.g. early-onset Alzheimer's and age-dependent Alzheimer's disease).
There is yet one thing you have to keep in mind: there is no "gene causing disease X". It's rather: "a gene whose malfunction or absence causes disease X". For example, a single nucleotide substitution can result in a non-active enzyme, or an enzyme with much slower activity. The whole metabolic pathway to which this enzyme belongs is hampered. In some cases a heterozygous organism will have another copy of the gene, which will do the job, or do the job at least in part - and the disease shows fully only in homozygous organisms.
Regards,
January
we have 23 PAIRS of chromosomes, for a total of 46. I think. someone tell me im not smoking phillipino crack rock here.
Joseph?
(This is all an oversimplification, I'm sure.)
Gates' Law: Every 18 months, the speed of software halves.
Has any experimentation been done on creatures with differences in only their 'junk' DNA?
It just seems a bit iffy to say it's junk because it doesn't do something that we know other DNA does.
To reliably say it does nothing you would have to know how the whole system works, wouldn't you?
-- That which does not kill us has made its last mistake.
I think we all know why the "junk" section is approx. 42% of the chromosome... this just proves that Douglas Adams has been on to something far greater than any of the rest of us could possibly imagine.
I have never seen this answered to my satisfaction. Are they using a particular individual's DNA, multiple DNA samples from many individuals, or does it matter?
The Nature article said that individual human DNA differs from person to person by about 1 base pair in 1000.
If this is true, it seems like having one individual's sequence might be useful, but it is not going to tell you all that much about the variance from person to person. You'll get a general idea of what's going on, but it seems like you would have to sequence quite a few more individuals before you could really say how genetic changes affect a gene's expression.
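As a rough back-of-the-envelope check on the 1-in-1000 figure, here is a small Python sketch. It assumes the differences are spread evenly along the DNA (they need not be) and borrows the chromosome 22 length of 33,400,000 bases quoted elsewhere in this discussion; the function name is made up for illustration.

```python
def expected_differences(length_in_bases: int, one_in: int = 1000) -> int:
    """Expected number of differing positions between two individuals,
    assuming one difference per `one_in` bases, spread evenly."""
    return length_in_bases // one_in


CHROMOSOME_22_BASES = 33_400_000  # length quoted elsewhere in this thread

print(expected_differences(CHROMOSOME_22_BASES))  # 33400
```

So even on one of the smallest chromosomes, two people would be expected to differ at tens of thousands of positions, which is why a single reference sequence says little about variation.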
-josh
Your friend is right. Rather than the word 'diseases', I should have probably used 'afflictions'.
Myoneurogastrointestinal encephalomyopathy.. Shall we break it down, perhaps?
Myo - muscle
neuro - nerves
gastro - stomach
intestinal - speaks for itself
encephalo - brain
pathy - feeling/suffering. So far as I can tell, this means that due to something between the muscles and nerves in the gastrointestinal region, the brain is feeling a lot of pain. Fun, neh?
------
If a tree falls on an anonymous coward yelling 'first post' in the forest, does anybody hear?
but I strongly doubt there would be a gene whose ONLY function is disease... what sense would that make?
Why should it make sense? Assuming you don't subscribe to Creationism, there's no reason to assume a reason behind any particular genetic coding, any more than you should assume the function of gravity is to make your milk spill. Rather than 'function', which tends to sound like a design with a purpose, think 'effect' or 'result'. So, the effect of foo genes or gene-sequences is bar disease. A mutation has 4 possible results:
1. It helps a creature and/or its offspring thrive and reproduce.
2. It hinders a creature and/or its offspring from thriving or reproducing.
3. It has advantages/disadvantages which don't (yet) affect the reproduction chances.
4. It has no effect at all (or yet).
That's it. No point system other than:
1. You have children.
2. They inherit some of your genes and some of your partner's genes, and perhaps some of the genes mutate.
3. Repeat.
People tend to think that there's some grand design behind everything that is. I'll leave the resolution of this question to the Philosophers and Theologians, but I think we can agree that if there is one, it's not something we're capable of recognising...
Chris
San Francisco values: compassion, tolerance, respect, intelligence
I know next to nothing about DNA but the term junk DNA seems... wrong. First of all, contrast the articles in the 6 or so URLs listed. Only one referred to it as junk DNA; this sounds more like the reporter's lack of understanding or bias than something the scientists said.
Second, consider a gene as an information exchange mechanism. Most forms of information exchange include some amount of material that isn't essential to the message but can't really be classified as junk either. It may be redundancy, it may be for error detection or correction or it may be for clarification.
Run an estimate of the actual needed text in the average paragraph written or spoken in English. The percentage that is 100% essential is pretty astonishingly small. It's a bit higher for a text but a bit lower for a novel. Mathematical proofs are pretty concentrated information but consider what happens if a little bit of information is transmitted wrong, say a sign is reversed. It's difficult to recover from it.
Likewise I think a 100% essential gene would be very difficult. Any random genetic damage would have impact. Gene replication would have to be absolutely exact and so on.
As I stated, I don't know anything about genes or DNA, but from an information theory standpoint calling 'unused' DNA junk seems wrong.
I would like to learn more about genes though, can anybody recommend a good progression of texts on the subject? Something to take somebody from absolute layman to at least having a general idea of the subject?
Actually, introns != junk DNA. Introns are intervening sequences within genes that are spliced out before the RNA is translated into protein. Introns often contain regions important for regulation of gene expression and serve as a way to generate more diversity in gene products (by the production of alternatively spliced RNA transcripts). Junk DNA refers to regions of DNA between genes with function (if any) unknown. Just want to keep the discussions accurate.
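For the programmers here, splicing can be sketched in a few lines of Python. This is only an illustration: the sequences are invented, DNA letters are used throughout instead of RNA's U for simplicity, and the exon coordinates are handed in directly, whereas a real cell recognises splice sites from signals in the sequence.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon slices (start inclusive, end exclusive), dropping the rest."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy transcript: exons in upper case, introns in lower case (made-up sequences).
segments = [
    ("exon", "ATGGCC"),
    ("intron", "gtaagtttcag"),
    ("exon", "GGAGAA"),
    ("intron", "gtgagtccag"),
    ("exon", "AAATGA"),
]

pre_mrna = ""
exon_coords = []
for kind, seq in segments:
    if kind == "exon":
        exon_coords.append((len(pre_mrna), len(pre_mrna) + len(seq)))
    pre_mrna += seq

full_length = splice(pre_mrna, exon_coords)                        # all three exons
exon_skipped = splice(pre_mrna, [exon_coords[0], exon_coords[2]])  # middle exon left out
print(full_length, exon_skipped)
```

Leaving out the middle exon gives a second, shorter product from the same gene, which is the alternative splicing mentioned above.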
They are doing the MDMI (multiple data many individuals) approach. What they get is basically an average genome. And certainly a good map of the genome.
The next step is to hunt for SNPs, which stands for Single Nucleotide Polymorphisms. And the race has already started. Both companies and universities are hunting for them. They are expected to be useful for things like identifying inheritable diseases.
Lars
--
Reality or nothing.
Imagine this:
You are a computer programmer, faced with the task of decyphering almost a gigabyte of machine code, which was written by billions of programmers making random changes and seeing if the result was an improvement.
And you thought Perl was hard to read...
--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Junk DNA is one of the worst misnomers possible, coined back when researchers honestly believed that non-coding DNA had no purpose. I believe what they mean is junk = introns + intergenic space, i.e., all non-coding sequence on chromosome 22.
This is a bad misnomer because the junk DNA is required for the proper expression of all of our genes. We have on the order of a trillion cells, so 100,000 proteins (all combinations of 2, representing gene A regulating gene B) can only differentiate, at best, 10 billion. The complexity, and where a lot of the interesting research will be in a few years, is in how these genes are regulated to properly create all of our cells, each of which "knows" what it is, and what it is supposed to do.
I must also say that I am surprised at their estimate of only 42% non-coding. The usual estimates are of ~3% (at most 10%) coding sequence in the genome as a whole, which gives a greater than 90% non-coding estimate.
So... the interesting question, maybe I should send this to Ask Slashdot, is: "How will what we're learning about our genes today affect medicine X years down the road?" where X = 10, 20, 50, 100.
-Todd
"I'm almost done with classes! Again!" (me)
For those of you who can't stand not having the source code for everything you use, you can download the results of the human genome sequencing project from http://www.ncbi.nlm.nih.gov/genome/seq/.
(Before you all rush and slashdot the site, please ask yourself whether you really need to download over one gigabyte of what is, to the uninitiated's intents and purposes, a random string of A's, T's, C's and G's.)
What exactly does "mapped" mean?
In general it means that the location has been established relative to known markers. In this case, though, the chromosome has been sequenced: the areas have had their composition established base-by-base.
Does that mean they know what all the bases are in the average human?
Roughly, yes. The sequence is a mosaic derived from several people.
Does this imply any knowledge of the pattern of such variations?
Not in itself, no, although other work is continuing to establish this.
Does it imply any knowledge of the function of the encoded proteins?
Again, not in itself. Many of the identified genes have been studied already. Others have similarities to genes already known, either from humans or other creatures. Some have been inferred from features of the sequence itself and are of totally unknown function.
A biology class I took said that human DNA was 96% junk (not protein encoding).
Was this biology class wrong?
No. The vast majority doesn't code for protein, and most of this has no known function. Closely related species have widely differing amounts of this, so (together with other reasons) the current hypothesis is that it doesn't do much that's useful for the organism. Some of it is composed of "selfish" elements such as transposons: it might be the case that in a looser sense a lot of it is.
I extracted the Junk DNA and respliced it so that it would stand without the DNA that is necessary to humans, inserted it in a cell and watched it grow. 5 hours later, to my horror, it took a flat rectangular shape, and black lines appeared on a white surface. They connected to form letters in clear English which read...
"Mr _________ , You have been selected as a final entry for the Publisher's Clearinghouse largest drawing, enclosed is a Check worth $30,000,000 if you have the winning number!!! Please open and send your entry form within the next 24 hours, and get a GUARANTEED prize."
I tried the Junk DNA of other chromosomes and got ads for term life insurance, timeshares, and then the Junk DNA materialized in front of me into a pushy Amway distributor!!!! The horror!!! Cellular SPAM!!! AGHHHHHHHH
Of course, that would be awful, but what about the installation process? Everyone would want to improve it.
:-)
Currently, parents are forced to accept all the default values, and many are clamoring to get at least an installation menu, to be able to choose hair color, IQ and IP address
Does that make it "holy writ"?
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Congratulations to all who participated in its sequencing. We look forward to the first draft of the human genome by spring 2000.
I'll do it for cheesy poofs.
"unemployable underclass".
That "underclass" will strive to become as large as possible.
Remember at the end of 1984, the fake society is falling apart. The proles, being the vast majority, are poised to take over.
If "most" people are in this underclass, they have the opportunity to organize and the sheer size of the class makes them predominate.
As long as they are the minority, they haven't a chance.
-fb Everything not expressly forbidden is now mandatory.
Will this finish the task? No, it is just a beginning - with the sequence in hand, the real work starts: searching for ORFs (Open Reading Frames - sequences which could possibly be genes), running database searches, and slowly moving on to the most exciting fields of modern molecular biology - from genomics to transcriptomics and proteomics. Transcriptomics looks for genes which actually get expressed, and proteomics, similarly, looks for expressed proteins. Making transcription / translation profiles (translation is the process in which proteins get synthesized) can lead us to 1) the function of proteins (e.g. protein X is expressed under such-and-such conditions, so it must take part in such-and-such a metabolic response) and 2) regulation - the DNA sequence stays the same, but various enzymes are present in various copy numbers under various conditions.
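To give an idea of what "searching for ORFs" means at its very simplest, here is a minimal Python sketch. It is a deliberately naive illustration, not how real gene-finding software works: it scans only the forward strand, knows nothing about introns, and simply reports stretches that start with ATG and run to the first in-frame stop codon. The function name, the `min_codons` parameter and the test sequence are made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ATG...stop stretch on the forward strand.

    `min_codons` counts the codons before the stop codon, start codon included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # the three possible reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs


print(find_orfs("CCATGGCTTAAGG"))  # [(2, 11, 'ATGGCTTAA')]
```

On a real chromosome you would also scan the reverse complement and demand far longer ORFs, since short ones turn up by chance all the time.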
Those are enormous projects. A lot of work has to be done before the raw sequence will actually be of any use; nevertheless, it is a milestone of molecular biology and will be a fine achievement for the end of our century.
Another project will be to determine the variability of the human genome: screening for different gene alleles, mutations etc. This will be one of the most important goals in human genomics in the next few years.
What's on the catch... erm, chromosome 22? Chromosome 22 is 33,400,000 bases long (the genome of Mycoplasma pneumoniae, one of the smallest bacteria, has about 816,394 bases). It contains several already known genes responsible for various genetic disorders, and possibly a gene responsible for certain types of schizophrenia.
By the way, a much better source of information is the Nature science update page - the original scientific publication has been published today in Nature.
Regards,
January
Are they using a particular individual's DNA .. ?
:-)
After we finish mapping some DNA we can go munch on some grindage, buuuuuuuuudy!
We're going down, in a spiral to the ground
I heard somewhere that the sequence "GATTACA" actually appears at least once in every single human gene (this being why they chose it for the movie title). Furthermore, it's the only sequence that does that.
I'm no genetic researcher, and neither is the person who told me this, but I suppose it's possible.
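One way to sanity-check a claim like this is to ask how often any 7-letter sequence would turn up by chance. The Python sketch below assumes the four bases are equally likely and independent of each other, which real DNA is not, so treat the numbers as a rough guide only; the function name and the 10,000-base gene length are illustrative assumptions.

```python
def expected_occurrences(length: int, k: int = 7) -> float:
    """Expected number of times one specific k-letter sequence appears in a
    random DNA string of `length` bases (bases uniform and independent)."""
    positions = max(length - k + 1, 0)  # number of places the k-mer could start
    return positions / 4 ** k


print(4 ** 7)                        # 16384 possible 7-letter sequences
print(expected_occurrences(10_000))  # expected count in a 10,000-base stretch
```

Under this model every 7-letter sequence has the same expected count, so nothing singles out GATTACA, and a 10,000-base stretch would contain any given 7-letter sequence less than once on average.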
I'd be careful calling it that. Someone proposed the idea that it's mutation fodder (that is, a safe place for mutations to occur). That's a possibility.
But there've also been posts talking about a lot of redundancy and such. It's possible that all this "junk" DNA still has uses that we haven't seen yet. I guess we won't know until we've mapped out the whole thing.
Who knows... maybe someday we'll all have something like a mini-RAID coded into our DNA.
The human genome project is funded in the US by the National Institutes of Health and the Department of Energy, and in Britain, by the Wellcome Trust, a charitable organization.
Every base that we sequence is put in the public domain.
We strongly oppose the patenting of sequences. Some of our strategies are designed to preempt attempts by companies to patent sequences from the human genome.
AFAIK, most of the genome sequence being produced by the HGP is from a single male individual. (Male, because we need to see a Y chromosome too.) I dunno for sure about Sanger, but WashU and Whitehead in the US are working from the same clone library.
The identity of this person is a closely guarded secret, as well it should be: this person's genome sequence will be available on the Internet. We'd like to avoid a nightmare scenario where a well-meaning "genome hacker" discovers a fatal disease gene in the sequence, and calls this guy up out of the blue to tell him.
That's just an extreme example. Basically, there's serious privacy and confidentiality issues. We consider the genome sequence to be a "reference sequence" or a "typical example", and we don't need (or want) to know who it came from.
Soon we will be Open Source. I fear that the temptation to develop and try patches will be irresistible to many.
From the CNN article:
More than 30 human disorders are already associated with changes to
genes of chromosome 22. These include a form of leukemia, disorders
of fetal development and the nervous system, and schizophrenia.
From the introduction:
some of 22's genes can cause "heart defects, immune system disorders, cancers, schizophrenia and mental retardation."
Is it just me or is there not a big difference between causation and association? Seems to me along the lines of correlation vs. causation. Anyway, I believe that scientists still have a long way to go before they find the genes that cause certain disorders. And then they will still have to prove that these genes are in no way responsible for any other functions in the human body before they can safely alter them. For all the good possible uses in medicine, it still gives me the creeps how, through the use of genetics and monocausal argumentation, a new "scientifically backed" racism could emerge again. Now don't scream technophobe, but how would you all react once the genes allegedly causing things like alcoholism, homosexuality, autism, criminality, laziness or whatever unwanted psychic or physical trait you can think of were identified? Have we got our ethics ready to handle this or will it be "what can be done will be done"? On whom will we test genetic engineering for a better race? The inhabitants of prisons, mental institutions, military institutions or just unwanted embryos? Will we allow babies to live with these disorders? Will we allow people to work without mandatory testing of genetic normality?
History has shown that scientists have often produced technology that was later misused by the willing. Hopefully this time they will think more before they hand this Pandora's box to the masses.
It is still an open question what role the junk DNA, technically called introns, plays in organism development. Unlike the simple unicellular critters (prokaryotes) such as bacteria, all higher-level organisms (eukaryotes) have these long non-coding sequences, which have been retained across evolutionary generations despite the extra energy/space required. The whole area is akin to the physicists' search for all the various subatomic particles after the cracking of the atom. We can see the bits and pieces, we can assemble the various sequences, but there's no unifying standard model of how or why. With Nobel prizes and new killer apps in the air, it is not surprising that universities and institutes are throwing money into the research.
The 19th century might have been the dominance of physics and engineering, but there's a lot of speculation and anticipation (especially by the empty hands of the biologists and zoologists) that the next century will be their turn at the gravy train :-). Fun times ahead.
LL
First of all, "junk" is a loaded term, which is certainly evidenced by all the nonsense it has spawned in this discussion. So let's do this by enumerating the different types of DNA a typical eukaryotic genome contains:
1. Coding Regions. DNA that gets transcribed to RNA. RNA transcripts in turn have exons, which get translated into proteins, and introns, which get spliced out before translation. Why this added level of complexity? Many reasons. In sexual reproduction, new chromosomes are produced by mixing and matching old chromosomes at random. It's more likely for the new chromosome to be functional if the crossover point is in an intron, because crossovers can introduce mutations, especially a nasty sort of mutation called a frameshift, which would render everything downstream unintelligible. Exons also allow for a certain modularity of function: evolutionary mutations can involve entire exons being combined instead of having to try changes on a base-by-base level.
2. Regulatory regions. DNA that turns other bits of DNA up or down. Mainly used to control transcription, but also used to control DNA replication.
3. Structural regions. Eukaryotic DNA is a huge, long string requiring a certain amount of overhead to prevent it from becoming an unmanageable tangle. Lots of the chromosome is dedicated to binding structural proteins, generally known as histones, around which the DNA is wound, as well as centromeric and telomeric proteins.
4. Repeats, cryptic genes, etc: In order to avoid overloading the term "junk," let's call this category "cruft." Cruft arises for lots of reasons. For example, sexual reproduction produces gametes, and it's far from perfect: Regions get repeated, regions get dropped. So-called cryptic genes are probably the result of a spliced RNA being reverse-transcribed back into DNA and reinserted into the genome without introns or regulatory elements. What's useful about the cruft is that it provides fodder for further evolution.
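Since the frameshift mentioned under item 1 is easier to see than to describe, here is a small Python sketch comparing a single-base substitution with a single-base insertion. The sequence is invented for illustration and the helper name `codons` is hypothetical.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive 3-base codons, dropping any leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


original = "ATGGCTGAAACCTGA"                    # made-up coding sequence
substituted = original[:4] + "A" + original[5:]  # replace the base at position 4
inserted = original[:4] + "A" + original[4:]     # insert one extra base at position 4

print(codons(original))     # ['ATG', 'GCT', 'GAA', 'ACC', 'TGA']
print(codons(substituted))  # ['ATG', 'GAT', 'GAA', 'ACC', 'TGA']
print(codons(inserted))     # ['ATG', 'GAC', 'TGA', 'AAC', 'CTG']
```

The substitution changes one codon and leaves the rest alone, while the insertion shifts the reading frame so that every codon after it is read differently; in this example it even creates a premature TGA stop codon.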
In summary: Eukaryotes are big and complex, which means that you have to allow for a certain amount of overhead and slop.
I hope this has helped.
whuppy enjoys smelling like diesel fuel
Bottom line: human chromosomes may be patented. Fight it.