Central Dogma of Genetics May Not Be So Central
Amorymeltzer writes "RNA molecules aren't always faithful reproductions of the genetic instructions contained within DNA, a new study shows (abstract). The finding seems to violate a tenet of genetics so fundamental that scientists call it the central dogma: DNA letters encode information, and RNA is made in DNA's likeness. The RNA then serves as a template to build proteins. But a study of RNA in white blood cells from 27 different people shows that, on average, each person has nearly 4,000 genes in which the RNA copies contain misspellings not found in DNA."
We have known for many years that the same DNA codes for different proteins, with the adjustments driven by the information in the non-coding regions AND the information in the epigenome. That people have discovered that the intermediate step is also adjusted can hardly be called a shock. The proteins have to get built differently somehow, so some alteration in the intermediate coding was inevitable. Honestly! If geneticists aren't even reading their own bloody papers, maybe the government grants should be issued to those Slashdot readers who do.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
What it does in fact say is that information flows from DNA to RNA to proteins, and not the other way around: proteins can't write DNA.
This is not nearly as earth-shattering as the journo makes out.
When DNA is copied to make new DNA, you get a certain number of copying errors, called mutations - most of them harmless. I assume everyone knows about those.
When DNA is copied to make a temporary-working-copy RNA, you get a larger number of these copying errors because, in general, they are one-shot non-critical deals. The need for stringency is much lower, the selective advantage for stringency is not so great, so it comes as no surprise that the level of proof-reading is also reduced.
Now, it's also possible that there are mechanisms by which these RNA molecules can be purposefully edited. As mentioned in the article, there is significant post-transcriptional editing (including, in eukaryotes, the redaction of big chunks, which are called "introns"). But this finding doesn't speak much to that, although the rate is a *sconch* higher than I might expect for random errors. Even so, this doesn't shake the central dogma of molecular biology in any meaningful way, as, for example, reverse transcriptases did.
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
The amazing thing is not that there are mistakes, but that the exact same mistakes occur in (almost) every strand of RNA! They aren't random errors; they occur the same way every time!
From the article: The most common of the 12 different types of misspellings was when an A in the DNA was changed to G in the RNA. That change accounted for about a third of the misspellings.
This is a textbook example of RNA editing by adenosine deaminase. It will convert the Adenosine bases ('A') to Inosine ('I'). When they try to sequence the RNA the first step is to make a DNA copy. During the process the positions that contain 'I' are copied mostly as 'G'. This is because 'I' can pair with any base, but prefers 'C'. So in the first strand you will get 'C' paired with 'I'. When you build the second strand these 'C' positions will direct incorporation of 'G'.
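To make the chain of steps above concrete, here is a minimal toy sketch in Python. It is not how any real sequencing pipeline works: the ten-base sequence, the function names, and the rule that 'I' always pairs with 'C' are all simplifying assumptions for illustration (the comment above says 'I' only *prefers* 'C'), and strand orientation is ignored.

```python
# Toy model of A-to-I editing followed by cDNA sequencing.
# Assumptions (not from the study): 'I' always pairs with 'C',
# and strands are written left to right without reversing them.

# Base incorporated into the first DNA strand opposite each RNA base.
RNA_TO_CDNA = {"A": "T", "C": "G", "G": "C", "U": "A", "I": "C"}
# Base incorporated into the second DNA strand opposite each DNA base.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def edit_a_to_i(rna, positions):
    """Return the RNA with the A at each 0-based position changed to I."""
    bases = list(rna)
    for pos in positions:
        if bases[pos] != "A":
            raise ValueError(f"position {pos} holds {bases[pos]}, not A")
        bases[pos] = "I"
    return "".join(bases)


def first_strand(rna):
    """First cDNA strand: the complement of each RNA base, in place."""
    return "".join(RNA_TO_CDNA[base] for base in rna)


def second_strand(cdna):
    """Second cDNA strand: the complement of the first strand, in place."""
    return "".join(DNA_COMPLEMENT[base] for base in cdna)


def sequenced_read(rna):
    """What the sequencer reports for this RNA after both copying steps."""
    return second_strand(first_strand(rna))


rna = "AGGCAUAGGC"               # hypothetical transcript
edited = edit_a_to_i(rna, [4])   # deaminate the A at index 4

print(sequenced_read(rna))       # AGGCATAGGC
print(sequenced_read(edited))    # AGGCGTAGGC
```

The unedited transcript reads back as its DNA template, while the edited one shows a G where the DNA has an A, which is the A-to-G "misspelling" described in the article.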
Mystery solved
It's actually believed that the earliest forms of biochemical life consisted almost entirely of RNA. It is the only molecule we know of that can act as both information storage/transport and chemical catalyst (all proteins made by modern life are in fact polymerized by a reaction catalyzed by RNA). There is some disagreement as to whether this "RNA world" came before or after lipid membranes.
There are many, many twists to this sordid puzzle, but you are correct. The concept of a 1:1:1 translation has been dead for a very long time.
"In his autobiography, What Mad Pursuit, Crick wrote about his choice of the word dogma and some of the problems it caused him:
"I called this idea the central dogma, for two reasons, I suspect. I had already used the obvious word hypothesis in the sequence hypothesis, and in addition I wanted to suggest that this new assumption was more central and more powerful. ... As it turned out, the use of the word dogma caused almost more trouble than it was worth.... Many years later Jacques Monod pointed out to me that I did not appear to understand the correct use of the word dogma, which is a belief that cannot be doubted. I did apprehend this in a vague sort of way but since I thought that all religious beliefs were without foundation, I used the word the way I myself thought about it, not as most of the world does, and simply applied it to a grand hypothesis that, however plausible, had little direct experimental support."
It's worth noting that this kind of thing happens a lot in biology, where a name gets appropriated without the borrower fully understanding its meaning—or in some cases, the correct pronunciation. Classicists are frequently driven mad when they discover the plural of "locus" is pronounced "low-sigh".
Bio questions? Ask me to start a Q&A journal. Computer analogies available for most topics!
Actually, you too are incorrect, albeit closer.
The issue is not a change in accuracy (RNA copying is well known to be significantly lower in fidelity than DNA copying), but rather that the same deviations from the expected sequence are happening consistently. An example (simplified):
Let's say that the coding strand is these 10 bases:
AGGCATAGGC
Further, we'll say there are 1,000 bases on either side of the transcribed sequence. We'll say that, excluding what is shown below, each copy of this 2,010-base sequence has an average of 1 base other than what is expected (a reasonable error rate). Now, let's say we get the following copies:
1) AGGCGTAGGC
2) AGGCGTAGGC
3) AGGCATAGGC
4) AGGCGTAGGC
5) AGGCATAGGC
6) AGGCGTAGGC
7) AGGCGTAGGC
8) AGGCGTAGGC
9) AGGCGTAGGC
10) AGGCGTAGGC
We have an average error rate of about 1 base per 1,000, which, IIRC, is par for the course and well within the model. Nothing to write home about. Except:
Note the "G" in the fifth position in most strands. That is not what was expected, nor is it in every strand, but it is still way too regular to be an accident. That kind of issue is what is being described in the article. For the most part our model works, but there are too many consistent errors for us to think we have a complete understanding.
This means that the change was not an accident, but it also does not fit with our model. Thus, we need to find out what it is. It's not a matter of quantity, but precision.
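For anyone who wants to poke at the toy example above, here is a rough Python sketch that tallies the mismatches per position across the ten copies and then estimates how likely that pattern would be by chance. The function names are made up for illustration, and the probability estimate assumes each copy errs independently at a given base with the 1-in-1,000 rate quoted above, which is a simplification.

```python
from collections import Counter
from math import comb

reference = "AGGCATAGGC"
reads = [
    "AGGCGTAGGC", "AGGCGTAGGC", "AGGCATAGGC", "AGGCGTAGGC", "AGGCATAGGC",
    "AGGCGTAGGC", "AGGCGTAGGC", "AGGCGTAGGC", "AGGCGTAGGC", "AGGCGTAGGC",
]


def mismatch_counts(reference, reads):
    """Count each (1-based position, expected base, observed base) mismatch."""
    counts = Counter()
    for read in reads:
        pairs = zip(reference, read)
        for pos, (expected, observed) in enumerate(pairs, start=1):
            if expected != observed:
                counts[(pos, expected, observed)] += 1
    return counts


def chance_of_at_least(k, n, p):
    """Binomial probability of k or more hits in n independent tries."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


counts = mismatch_counts(reference, reads)
for (pos, expected, observed), n in sorted(counts.items()):
    print(f"position {pos}: {expected}->{observed} in {n} of {len(reads)} reads")
# position 5: A->G in 8 of 10 reads

# Assumed per-base error rate of 1 in 1,000, errors independent across reads.
p_random = chance_of_at_least(8, len(reads), 0.001)
print(f"chance of 8+ random errors at one position: {p_random:.1e}")
```

Under those assumptions the chance of eight or more copies sharing a random error at the same base is vanishingly small, which is the point: the total number of errors is ordinary, but their placement is not.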