Human Gene Count Slashed
jd continues: "This has the potential for making life extremely interesting for genetic engineers, given that both individual genes and interactions between genes must be proportionately more complex, in order to get the same level of complexity out. Half the number of genes equates to twice the information encoded in forms other than discrete physical blocks of code.
There is no mention in the article of a story running in 2002 of genetic therapies unexpectedly causing cancer, although if you now factor in the increased complexity of interactions, it is possible that such side-effects can be better understood and therefore prevented. The new estimates, therefore, are more than just idle curiosity but have the potential for impacting how the science is approached."
Finally scientific proof that it's not the size that matters, it's how you use it.
It is the number of genes that has been revised down. The genome is the complete set of DNA and contains all the genes.
25,000 genes will be enough for everyone. - 2004
The new estimate, of between 20,000 and 25,000 genes, is marginally less than the 27,000 for Arabidopsis, a flowering plant in the mustard family.
Damn elitist mustard, looking down on us.
Where'd they off-shore the genes to?
"Like fire and fusion, government is a dangerous servant and a terrible master."~RAH
According to scientists, we gained 1000 genes compared to rodents when we diverged from them 75 million years ago. And we 'lost' 33 genes compared to them (they have a functional copy, we have a nonfunctional pseudogene; it's still there, only not working - stop codons, etc).
The "we must have more genes than (insert stupid animal or plant here)" attitude is funny. Our superiority complex at its best.
Read about the whole thing (with more links) on my blog (see sig)
Eureka Science News - automatically updated
Gauging the complexity is difficult, given there are a number of factors not currently understood, particularly the importance of non-coding DNA, which accounts for roughly 98% of the genome. In the past, the information content of these regions was thought to be low, but this attitude is changing. As knowledge of the genome increases, the estimated number of genes drops, and more emphasis is put on the non-coding portions of the genome.
Evaluating the function of ncRNA is difficult because as of yet there are no statistically significant markers for them. Given the release today, and trends of late, more and more attention will be put on trying to decipher the utility of "junk" DNA.
Well, technically, you CAN buy genes. There are quite a few companies that sell pre-sequenced genes. In fact, the entire genomes of several organisms are available in varying amounts ligated into Bacterial Artificial Chromosomes (BACs) and plasmids. An interesting link is http://www.arabidopsis.org/ - There's a lot of information on Arabidopsis, where they keep a database of the entire Arabidopsis genome as well as many freely-available tools for its analysis.
to how many genomes are in a single human genome. However, speaking about genes in a genome, as the article states, this "correction" only counts those genes that make some discernible protein product. The number misses the open reading frames (ORFs) that may not encode a protein at all, but a regulatory or enzymatic RNA. Probably, the next big project in life/medical research, after the big proteomics initiatives, will be the study of non-protein-encoding ORFs. This problem is very tough to crack since 1) these RNAs do not have a common sequence element like "normal" messenger RNAs, 2) they may be as short as 15 base pairs (LIN12(?) in C. elegans), and 3) there are MANY, MANY possible ORFs in the genome.
Are these technically genes? They are regulated. They have a function. They are transcribed. The only thing different from the standard definition of a gene is that the RNA is not translated into protein.
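To give a feel for point 3 above, here is a minimal sketch of a naive ORF scan in Python. It is only an illustration: the function name `find_orfs` and the `min_codons` cutoff are made up for this example, it looks only at the forward strand, and it searches for classic ATG-to-stop stretches, which is exactly the kind of signature the RNA-only genes discussed above do not have. Real gene finders are far more involved.

```python
# Naive forward-strand ORF scan (illustrative only, not a real gene finder).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=5):
    """Return (start, end, frame) for each ATG...stop stretch.

    start/end are 0-based, end is exclusive and includes the stop codon.
    min_codons is a hypothetical cutoff counting codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next candidate
    return orfs


if __name__ == "__main__":
    print(find_orfs("ATGAAATTTGGGCCCTAA"))
```

Lowering `min_codons` makes the number of candidates grow very quickly on any long sequence, which is the practical problem: short elements drown in chance hits.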
In addition to multiple protein products from one "gene" as the article states, regulation of the gene may also be much more complex compared to "lower" organisms. For example, the gene expression profile of the malarial parasite Plasmodium falciparum suggests very limited regulation. Basically, it looks like a linear progression with a very limited amount of response. So, temporal and spatial regulation makes even multiple-product genes seem like a larger cohort of genes. Take the daughterless gene in Drosophila. It is used very early in embryonic development to control sexual differentiation. However, later, the gene product is used in neuronal differentiation. So, for the fly, sex is literally on the brain.
The thing is, we've had the Arabidopsis genome sequenced for a while now. And because the organism has a lower degree of complexity it is a lot easier to study in many ways. I don't know if I'd necessarily say that there is more study being done on humans than on Arabidopsis - In fact, I highly doubt it.
We have a much clearer idea of most of the inner workings of that lowly little mustard plant than of our own. It's a matter of understanding the simple stuff and then working our way up. Like with the nematode C. elegans -- we know more information about that than you could possibly imagine. We know how many cells it has at every stage of its life and what they are doing. We have its genome sequenced. And from all of this information we have learned a lot about the inner workings of our cells as well. You find a lot of homologies between organisms.
In fact, if you examine the RNA polymerases of humans, bacteria and archaea you would find that ours are much closer to archaea (the most ancient of ancient organisms still around) than to bacteria.
So looking at these organisms that have been around since the beginning of life, we can learn about the development of our genomes and by examining their functions we can learn much about how ours work. Even if we do have our entire genome sequenced, that doesn't mean we know what it all does.
Actually last month's Scientific American had a good article on this. Basically we are finding that what we once thought was junk (non-coding areas and RNA-coding areas which do not code for proteins) is probably among the more important parts of the nucleus. I quote:
"But investigators have since sequenced the genomes of diverse species, and it has become abundantly clear that the correlation between numbers of conventional genes and complexity truly is poor. The simple nematode worm Caenorhabditis elegans (made up of only about 1,000 cells) has about 19,000 protein-coding genes, almost 50 percent more than insects (13,500) and nearly as many as humans (around 25,000). Conversely, the relation between the amount of nonprotein-coding DNA sequences and organism complexity is more consistent."
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Only one? Ahem: Mitochondrial genome; Nuclear genome.
:)
As a mitochondrial researcher, I resent the most important organelle of the cell being overlooked or lumped in together with the nucleus here!
So I would say two genomes
Let's look at the stats:
Terrorists killed ~3000 people in 2001 and it became a focus of the US nation. While:
Breast cancer kills > 40,000 / year
Prostate cancer kills > 30,000 / year
Diabetes kills > 70,000 / year
The numbers world wide of course are much larger.
Yeah, OT I know, but these kinds of discoveries convince me our priorities are misplaced.
I've read the headline as "Human Genome Slashdotted" and I shouted: "Dear God, we're doomed!" My God, what an embarrassment... I need sleep.
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
I'm not sure how that is.
"We just have to get used to the fact that we don't have many more genes than a worm," Rubin said.
So how can humans be so complex with relatively few genes?
Seems to me like the instruction sets are the same, while the coding complexity varies?
On the contrary, the complexity now increases. There are many genes that act in completely different roles depending on the cell type (nerve, epidermal, etc.). So a common language changes from cell type to cell type-- if one would even call it a common language. There is a large part of Bioinformatics/Computational Biology that deals with trying to determine interaction networks between genes. It's very complex, and difficult to deal with.
With fewer genes we then expect to have a larger number of downstream interactions between other genes. It might seem that with fewer genes we have less to worry about, but we have already speculated for a long time that gene regulatory networks are complex.
To use an analogy (for all you computer geeks), it's like a programmer trying to read poorly modularized code. When you have no idea what class is doing what, and how they interact with other classes (as every class has multiple roles and talks to multiple other classes) then it is difficult to understand why the program behaves the way it does. If the program had many classes that were well modularized and designed with very distinct roles, then it would be easier to understand why things work the way they do.
With fewer genes and increased complexity we have an even more difficult task. It also highlights some of the reasons why microarray analysis has not done what we expected it to do. Increasing the complexity and dependency between genes means that we are probably going to take longer to understand and extrapolate information from all these networks (which means more job security for me :) ).
A little-known fact is that a single gene can code for thousands of different proteins. Protein regulation can occur in a variety of ways, one of which is through "junk" DNA.
Currently little is known on the exact mechanism, which is a huge impediment to proteomics. As the phenomenon is elucidated, expect to see a lot more useful information coming out of genome projects.
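As a rough illustration of how one gene can reach "thousands" of products: with alternative splicing, if a transcript picks exactly one exon from each of several clusters of alternatives, the number of possible isoforms is the product of the cluster sizes. The sketch below uses the commonly quoted cluster sizes for the Drosophila Dscam gene (12, 48, 33 and 2) as an example; the function name is made up, and this is just the counting argument, not a model of the regulatory mechanism the parent post is talking about.

```python
from math import prod


def isoform_count(variants_per_cluster):
    """Number of isoforms if exactly one variant is chosen per cluster."""
    return prod(variants_per_cluster)


# Commonly quoted alternative-exon cluster sizes for Drosophila Dscam
# (treat these as illustrative figures, not as data from the article).
dscam_clusters = [12, 48, 33, 2]

if __name__ == "__main__":
    print(isoform_count(dscam_clusters))  # 12 * 48 * 33 * 2 = 38016
```

That is more potential protein variants from one gene than the whole fly genome has genes, which is why raw gene counts say so little.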
Computationally predicting the 3-D structure and function of a gene is far more important than you probably realize. Reaching this point will revolutionize almost every aspect of your life, from pharmaceuticals, to nutrition, to silico-neural interfaces.
gatacgtactgagtctacgtacgtactgagtcatcagtctacgtacgtac gtatgcagtcagtcagtcagtctactgacgtacgtatactacgtatacgg gtagcgatctacgcatccggactgggatctcgtgtacgtacgtacgttag tcgtacgtgtgtatgcgttacgtttagcccaacacactgatgctgatcta gtactcgtaacgtgtacgtacgtacgtacgtacgtacgtacgtatcgagt acgtgtacgtacgtcatgacgtacgttagcgtagtagtagttcgtagtag tcgtgtagtcgtactggtactactacagtactacgtacgtacgttacggt acgtac gatacgtactgagtctacgtacgtactgagtcatcagtctacgtacgtac gtatgcagtcagtcagtcagtctactgacgtacgtatactacgtatacgg gtagcgatctacgcatccggactgggatctcgtgtacgtacgtacgttag tcgtacgtgtgtatgcgttacgtttagcccaacacactgatgctgatcta gtactcgtaacgtgtacgtacgtacgtacgtacgtacgtacgtatcgagt acgtgtacgtacgtcatgacgtacgttagcgtagtagtagttcgtagtag tcgtgtagtcgtactggtactactacagtactacgtacgtacgttacggt acgtacgatacgtactgagtctacgtacgtactgagtcatcagtctacgt gtatgcagtcagtcagtcagtctactgacgtacgtatactacgtatacgg gtagcgatctacgcatccggactgggatctcgtgtacgtacgtacgttag tcgtacgtgtgtatgcgttacgtttagcccaacacactgatgctgatcta gtactcgtaacgtgtacgtacgtacgtacgtacgtacgtacgtatcgagt acgtgtacgtacgtcatgacgtacgttagcgtagtagtagttcgtagtag tcgtgtagtcgtactggtactactacagtactacgtacgtacgttacggt acgtacgatacgtactgagtctacgtacgtactgagtcatcagtctacgt acgtac gtatgcagtcagtcagtcagtctactgacgtacgtatactacgtatacgg gtagcgatctacgcatccggactgggatctcgtgtacgtacgtacgttag tcgtacgtgtgtatgcgttacgtttagcccaacacactgatgctgatcta gtactcgtaacgtgtacgtacgtacgtacgtacgtacgtacgtatcgagt acgtgtacgtacgtcatgacgtacgttagcgtagtagtagttcgtagtag tcgtgtagtcgtactggtactactacagtactacgtacgtacgttacggt acgtacgatacgtactgagtctacgtacgtactgagtcatcagtctacgt acgtac gtatgcagtcagtcagtcagtctactgacgtacgtatactacgtatacgg gtagcgatctacgcatccggactgggatctcgtgtacgtacgtacgttag tcgtacgtgtgtatgcgttacgtttagcccaacacactgatgctgatcta gtactcgtaacgtgtacgtacgtacgtacgtacgtacgtacgtatcgagt acgtgtacgtacgtcatgacgtacgttagcgtagtagtagttcgtagtag tcgtgtagtcgtactggtactactacagtactacgtacgtacgttacggt acgtacgatacgtactgagtctacgtacgtactgagtcatcagtctacgt acgtac gtatgcagtcagtcagtcagtctactgacgtacgtatactacgtatacgg gtagcgatctacgcatccggactgggatctcgtgtacgtacgtacgttag 
tcgtacgtgtgtatgcgttacgtttagcccaacacactgatgctgatcta gtactcgtaacgtgtacgtacgtacgtacgtacgtacgtacgtatcgagt acgtgtacgtacgtcatgacgtacgttagcgtagtagtagttcgtagtag tcgtgtagtcgtactggtactactacagtactacgtacgtacgttacggt acgtac
Game: Player 'Donald J Trump' now has AI skill level 'experimental'.
Actually, that's a bad analogy, since modern assembly possesses a significantly richer grammar than C. However, it is correct to say that the interactions between language elements (instructions) in ASM are very much simpler than in C.
More on topic: Why are people surprised that millions of years of evolution have resulted in a high-entropy encoding "format" (the genome) whose constituent elements are multipurpose and have complex interactions with each other? An animal is more evolved (has a history of more complex environmental interactions) than a plant, so why shouldn't its genome be less redundant / contain more entropy? Comparisons of number of genes are (to return to the computing analogy) like comparing two processors based on their physical size.
D.
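For anyone who wants to play with the "entropy" idea above, here is a minimal sketch that computes the Shannon entropy of a sequence from its k-mer frequencies. Big caveat: this is only a crude proxy for redundancy in the raw symbols, and it says nothing about biological function. The function name and the choice of overlapping k-mers are assumptions made for this example.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq, k=1):
    """Shannon entropy in bits per k-mer, using overlapping k-mers."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    # H = -sum(p * log2(p)) over the observed k-mer frequencies.
    return -sum((n / total) * log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(shannon_entropy("ACGT"))      # all four bases equally likely
    print(shannon_entropy("AAAAAAAA"))  # fully redundant
```

With k=1 the maximum for DNA is 2 bits per base (four equally likely bases), and a sequence made of a single repeated base scores zero. Counting genes tells you about as much as counting characters in a file without looking at how compressible it is.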