Human Gene Count Slashed
jd continues: "This has the potential for making life extremely interesting for genetic engineers, given that both individual genes and interactions between genes must be proportionately more complex, in order to get the same level of complexity out. Half the number of genes equates to twice the information encoded in forms other than discrete physical blocks of code.
There is no mention in the article of a story running in 2002 of genetic therapies unexpectedly causing cancer, although if you now factor in the increased complexity of interactions, it is possible that such side-effects can be better understood and therefore prevented. The new estimates, therefore, are more than just idle curiosity but have the potential for impacting how the science is approached."
That would be incorrect. The number of genomes in the human genome is 1.
Brandon
It is the number of genes that has been revised down. The genome is the complete set of DNA and contains all the genes.
Did you mean the number of genes?
The article poster mistook 'Genome' for 'Gene'. Organisms only have one genome as it is a collection of genes.
Go to the back of the class!
Arabidopsis is essentially the lab rat of the plant biology world so trust me, there is a lot of research into Arabidopsis as well.
Gauging the complexity is difficult, given there are a number of factors not currently understood, particularly the importance of non-coding sequence, which accounts for roughly 98% of the genome, and of the non-coding RNA transcribed from it. In the past, the information content of these regions was thought to be low, but this attitude is changing. As knowledge of the genome increases, the estimated number of genes drops, and more emphasis is placed on the information carried by the non-coding portions of the genome.
Evaluating the function of ncRNA is difficult because as of yet there are no statistically significant markers for them. Given the release today, and trends of late, more and more attention will be put on trying to decipher the utility of "junk" DNA.
to how many genomes are in a single human genome. However, speaking about genes in a genome, as the article states, this "correction" only counts those genes that make some discernible protein product. The number misses the open reading frames (ORFs) that may not encode a protein at all, but a regulatory or enzymatic RNA. Probably, the next big project in life/medical research, after the big proteomics initiatives, will be the study of non-protein-encoding ORFs. This problem is very tough to crack since these RNAs 1) do not have a common sequence element like "normal" messenger RNAs, 2) may be as short as 15 base pairs (LIN12(?) in C. elegans), and 3) there are MANY, MANY possible ORFs in the genome.
Are these technically genes? They are regulated. They have a function. They are transcribed. The only thing different from the standard definition of a gene is that the RNA is not translated into protein.
In addition to multiple protein products from one "gene" as the article states, regulation of the gene may also be much more complex compared to "lower" organisms. For example, the gene expression profile of the malarial parasite Plasmodium falciparum suggests very limited regulation. Basically, it looks like a linear progression with a very limited amount of response. So, temporal and spatial regulation makes even multiple-product genes seem like a larger cohort of genes. Take the daughterless gene in Drosophila. It is used very early in embryonic development to control sexual differentiation. However, later, the gene product is used in neuronal differentiation. So, for the fly, sex is literally on the brain.
When the articles talk of "estimate" numbers of genes, they are not referring to the known numbers of genes. Instead, they are referring to computational predictions, based on certain patterns found in the genome.
A gene is predicted if it has traits such as known start and stop codons, promoter regions, G-C content, and so on. These patterns are quite complex, and current algorithms are about 50-60% correct.
The actual number of experimentally confirmed sequences is in the low thousands, IIRC.
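To make the "patterns" idea concrete, here is a minimal sketch of the crudest possible predictor: scan the forward strand for a start codon followed in-frame by a stop codon, then filter on G-C content. The function names and both thresholds (min_codons, min_gc) are made up for illustration. Real gene finders use much richer statistical models and also handle introns, promoters, and the reverse strand, none of which this does.

```python
# Toy illustration only: not how production gene finders work.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each ATG...stop run on the forward strand.

    min_codons is a hypothetical cutoff that counts the start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first in-frame ATG opens a candidate
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next candidate in this frame
    return orfs


def predict_genes(dna, min_codons=3, min_gc=0.3):
    """Keep only ORFs whose G-C content clears an arbitrary threshold."""
    return [orf for orf in find_orfs(dna, min_codons)
            if gc_content(orf[2]) >= min_gc]


if __name__ == "__main__":
    print(predict_genes("CCATGGCCTGATT"))
```

A scan like this flags huge numbers of spurious candidates on real sequence, which is one way to see why computational predictions and experimentally confirmed genes are such different numbers.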
The thing is, we've had the Arabidopsis genome sequenced for a while now. And because the organism has a lower degree of complexity it is a lot easier to study in many ways. I don't know if I'd necessarily say that there is more study being done on humans than on Arabidopsis. In fact, I highly doubt it.
We have a much clearer idea of most of the inner workings of that lowly little mustard plant than of our own. It's a matter of understanding the simple stuff and then working our way up. Like with the nematode C. elegans -- we know more information about that than you could possibly imagine. We know how many cells it has at every stage of its life and what they are doing. We have its genome sequenced. And from all of this information we have learned a lot about the inner workings of our cells as well. You find a lot of homologies between organisms.
In fact, if you examine the RNA polymerases of humans, bacteria and archaea you would find that ours are much closer to archaea (the most ancient of ancient organisms still around) than to bacteria.
So looking at these organisms that have been around since the beginning of life, we can learn about the development of our genomes and by examining their functions we can learn much about how ours work. Even if we do have our entire genome sequenced, that doesn't mean we know what it all does.
Actually, last month's Scientific American had a good article on this. Basically we are finding that what we once thought was junk (non-coding areas and RNA-coding areas which do not code for proteins) is probably some of the more important aspects of the nucleus. I quote:
"But investigators have since sequenced the genomes of diverse species, and it has become abundantly clear that the correlation between numbers of conventional genes and complexity truly is poor. The simple nematode worm Caenorhabditis elegans (made up of only about 1,000 cells) has about 19,000 protein-coding genes, almost 50 percent more than insects (13,500) and nearly as many as humans (around 25,000). Conversely, the relation between the amount of nonprotein-coding DNA sequences and organism complexity is more consistent."
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Only one? Ahem: Mitochondrial genome; Nuclear genome.
:)
As a mitochondrial researcher, I resent the most important organelle of the cell being overlooked or lumped in together with the nucleus here!
So I would say two genomes
PBS has excellent videos from the program Cracking the Code of Life about the teams (the Human Genome Project and the private company Celera) that worked on decoding the entire 3 billion base pairs of the human genome. It is well worth watching to understand this article.
The shift from 100,000 to 20,000 predicted genes is important because it signals a fundamental change in the way genomics is viewed. Scientists have to consider non-obvious explanations for genetic phenomena. Why do we have a small number of genes, but a high level of complexity?
The genome is ~2% gene; the rest is largely unknown. Traditionally, this has been referred to as junk DNA, good for spacing, but not much else. A growing consensus holds that there is more to these regions, and efforts are underway to explain them. One of the more significant points to consider is the amount of RNA made which never codes for proteins. Biology generally does away with useless actions, but non-coding RNA is rampant.
The number of genes influences how hard scientists look at other explanations for phenomena.
There is no mention in the article of a story running in 2002 of genetic therapies unexpectedly causing cancer,
Nor should there be; general estimates of the number of genes have nothing to do with mechanisms by which gene therapy might cause cancer. Nor is it unexpected that gene therapy can cause cancer; that has always been a known risk.
To be more specific, the cancer caused by that form of gene therapy seems to arise because the retrovirus used to insert a block of engineered DNA into the genome puts the piece in an "unlucky" spot. The genes are broadly spaced in most regions of the genome, and most insertion sites will not cause problems. But if the engineered DNA gets inserted in the wrong place, say in the midst of a potential oncogene (cancer-promoting gene), then cancer might result.
So if there are fewer genes in the genome, if anything there would be fewer "vulnerable" spots to hit that would cause cancer. But really the number of total genes is not tightly linked to the number of insertion sites that could be oncogenic.
Besides, there is still plenty of complexity. Alternative splicing can take one gene and make many alternative mRNAs that can produce different proteins. Alternative splicing takes the estimated number of _transcripts_ back up to several times (?) the number of genes.
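As a rough illustration of how the transcript count can outrun the gene count, here is a sketch that treats some exons as optional ("cassette" exons) and enumerates every keep-or-skip combination. The function name and the example exon sequences are invented, and the assumption that every combination actually occurs is mine: real splicing is regulated, so this is an upper bound for this simplified model, not a prediction.

```python
from itertools import product


def splice_isoforms(exons, optional):
    """Enumerate mRNAs made by keeping or skipping each optional exon.

    exons: list of exon sequences in genomic order.
    optional: set of indices into exons that may be skipped.
    Exons not listed in optional are always kept.
    """
    optional_idx = sorted(optional)
    isoforms = []
    # One True/False choice per optional exon: 2**n combinations in total.
    for choices in product([True, False], repeat=len(optional_idx)):
        keep = dict(zip(optional_idx, choices))
        isoforms.append(
            "".join(seq for i, seq in enumerate(exons) if keep.get(i, True))
        )
    return isoforms


if __name__ == "__main__":
    # Hypothetical four-exon gene where the two middle exons are optional.
    for mrna in splice_isoforms(["AUG", "GCC", "UUU", "UAA"], {1, 2}):
        print(mrna)
```

With n optional exons this simple model gives 2**n transcripts from one gene, so even a handful of optional exons per gene multiplies the count quickly.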
To up the level of complexity, imagine that the blocks of code are randomly ordered (although blocks of genes tend to stay on the same chromosomes), are all executing in parallel, and can trigger reordering & rewriting of themselves & each other.
Yep, that's going to be one helluva debugger!
I remember reading about a researcher who wanted to study genetic algorithms. I wish I had a link handy, but googling didn't turn it up.
Anyway, this guy wants to create a genetic algorithm that results in a circuit that can detect the difference between two tones, one something like 200 Hz and the other 2 kHz.
He uses an FPGA chip to do the testing with. After a few weeks, he has an FPGA programmed such that it reliably discerns between the two input signals.
So, how does it work? Downloading the program from the FPGA chip results in a nonsensical circuit - except that it works. Running the same program on another FPGA chip of the same model results in a total failure.
Even changing the power supply makes the circuit not work! Months of study results in a complete, total unknown. Results inconclusive.
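For anyone who hasn't seen one, a genetic algorithm in its barest form looks something like the sketch below. This is not the researcher's setup: his fitness score came from measuring a real chip, while here the fitness function is whatever you pass in (the example just counts 1 bits). All names and parameter values are placeholders I picked for illustration.

```python
import random


def evolve(fitness, genome_len=20, pop_size=30, generations=50,
           mutation_rate=0.02, seed=0):
    """Evolve bit strings toward higher fitness.

    Returns (best_genome, history), where history holds the best score
    seen at the start of each generation plus the final best score.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]   # truncation selection: top half breeds
        children = [pop[0][:]]           # elitism: best genome survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)   # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve(sum)  # stand-in fitness: number of 1 bits
    print(best, history[-1])
```

Note that nothing in the loop cares how a genome achieves its score, only that the score goes up. That is exactly why the evolved circuit could end up leaning on quirks of one particular chip and power supply.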
The human genome is not built of simple, engineered pieces. Interactions will occur with the total sum of possible interactions, down to the molecular level.
It will be many, many years before our own microbiological structure is understood. As we proceed, we'll see information technology and biology merge, as, when push comes to shove, both consist of the replication of complex patterns.
I have no problem with your religion until you decide it's reason to deprive others of the truth.
Natural selection is not evolution. Natural selection is a reduction in the gene pool, not an extension. Useful genetic modifications are rare and hard to come by, not to mention they don't get passed on well. If you go and kill all white people, and only black people are left, evolution did not occur.
This is definitely a misperception, usually based on the fact that most evolutionary descriptions only describe those things that lead up to humans. Plants are, in many cases, more highly evolved than animals are, even more than humans are. They just haven't specialized for intelligence.
It is a mistake to think that supremacy in one area (intelligence) means supremacy in all areas. Some people pride themselves on being efficient workers, others pride themselves on being paid well to do very little. In the biological world, plants would be the "blue pill" type of creature, the type B personalities, and they're REALLY REALLY good at it.
When I was working at Monsanto, I was told that wheat has a genetic strand about three times as long as the human genetic strand. This may or may not have relevance to the rest of the post, but I thought I'd toss it in just because it's interesting.
As another point, the length of the strand doesn't necessarily indicate a more evolved state. It can be assumed that some strands are more efficient than others, and thus don't NEED to be as long. Take Microsoft code, for instance. Just because they take more code to do the job doesn't mean it's a superior product.
Wake up - the future is arriving faster than you think.