Slashdot Mirror


Using Bacterial DNA For Data Storage

NPV writes "January ACM Communications has an article on the use of DNA in genetically modified bacteria to store information. This is an attempt to achieve the ultimate in archival storage (one of the modified bacteria can tolerate 1000X more radiation than a human being). Now just suppose that the "junk DNA" in the human genome is the documentation package for the machine code. Who wrote that manual?" Here's the article abstract.

3 of 211 comments

  1. Bacteria Have No Introns and Other Considerations by mustermark · Score: 5, Informative

    Just to be clear, no non-coding segments have been found in bacteria yet (last I heard). So putting data in as 'junk-DNA' in humans is quite a bit different from interrupting a fully functional bacterial DNA segment with the data to be stored.

    Also note that the introns in eukaryotes are highly mutable (look up 'tandem repeats' if you have the inclination), so the fidelity of the data would be sacrificed by putting it there. The longest lifetime for the data would be achieved by tricking the replication machinery into thinking the segment was an exon, which would involve tying it to a functional protein that would be absent were the sequence to be mutated.

    Duplication of the data would also work, but it would only hammer down the probability of mutation, since the probability of a point mutation of a base at the same location in two widely separated sequences is roughly 10^-18 to 10^-17 per year for exons.
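As a rough sanity check on the figure quoted above, here is a minimal sketch. It assumes the two copies mutate independently at the same per-site rate, so the chance of the same position changing in both copies is the square of that rate. The function names are made up for illustration, and the per-site rates it prints are simply back-solved from the quoted range, not measured values.

```python
import math

def same_site_in_both_copies(p_site: float) -> float:
    """Chance that one given position mutates in both copies, assuming
    the two copies mutate independently at the same per-site rate."""
    return p_site ** 2

def implied_site_rate(p_both: float) -> float:
    """Per-site rate that would produce the given two-copy probability."""
    return math.sqrt(p_both)

# The two ends of the range quoted in the comment (per year).
for p_both in (1e-18, 1e-17):
    p_site = implied_site_rate(p_both)
    print(f"two-copy {p_both:.0e}/yr -> single-site about {p_site:.1e}/yr")
```

Under that independence assumption, the quoted range corresponds to a single-copy rate on the order of 10^-9 per site per year, which is why duplication lowers the risk of losing a base but does not remove it.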

  2. Re:Bacteria Have No Introns and Other Consideratio by Rainier Wolfecastle · Score: 5, Informative

    I think that you may have your terms a little mixed up. An intron is the DNA between exons (coding regions) in a gene. i.e.

    junk---junk---junk---exon-intron-exon-intron-exon---junk---junk---junk.

    The junk DNA often referred to is mainly intergenic DNA, and this is where most of the non-coding DNA is found. This also makes up the majority of the eukaryotic genome. Prokaryotes (bacteria) do contain intergenic DNA, but no introns.

  3. Re:quaternary vs. binary by reverseengineer · Score: 3, Informative

    IAAB too (not the same one as above), and I have to say, sorry, you're wrong. Yes, adenine (A) pairs with thymine (T) and guanine (G) pairs with cytosine (C), but no base is restricted to one strand of a double-stranded DNA molecule: A and T, or G and C, can be found in the same strand. In fact, there are some regions where sequences consisting of A's and T's together (or C's and G's together) play a critical role, like the sequence TATAAT (or similar) called the TATA box, which is recognized by RNA polymerase and leads to initiation of transcription. Usually all 4 bases are present in each of the two strands, and since there are three bases in each codon, there are 4^3, or 64, possible codons, so up to 64 different amino acids could in principle be coded for. Now, only 20 amino acids are actually coded for (there are a few exceptions to this that depend on specific context), so a few of the spare codons are used to code for a stop in protein translation, and the rest give a built-in redundancy, associated with "wobble", that allows correct translation despite certain slight mutations.
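The codon arithmetic above can be checked with a short sketch. It enumerates all 4^3 triplets and maps them through the standard genetic code, written here as the usual 64-letter string in T, C, A, G order with `*` marking a stop. The variable names are illustrative only, and the context-dependent exceptions mentioned above are ignored.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"  # conventional ordering for the standard codon table
# Standard genetic code, one letter per codon in TCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# All 4^3 = 64 triplets, in the same order as AMINO.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO))

# How many codons map to each amino acid (the redundancy).
counts = Counter(CODON_TABLE.values())
n_stop = counts.pop("*")

print("codons:", len(codons))
print("amino acids:", len(counts))
print("stop codons:", n_stop)
print("codons per amino acid:", dict(sorted(counts.items())))
```

The per-amino-acid counts show the redundancy: most amino acids have two to six codons, and synonymous codons usually differ only in the third base.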

    Now, although there are two strands in most DNA molecules, only one actually codes for a given protein: the two strands are sometimes referred to as the sense and nonsense (or antisense) strands. Both are involved in replication, however: a DNA helicase splits the two strands, and each acts as a template for a new complementary strand. And both can and usually do contain all four bases, with the concentration of each base within a single strand not being fixed by the pairing rule. Since the two strands in a double helix are complementary, the amount of adenine must equal the amount of thymine, and the amount of cytosine must equal the amount of guanine, when both strands are counted together. In fact, recognizing this relationship led to the realization that complementary base-pairing occurs. The original IAAB is correct though: the genetic code is indeed base 4, although nature has chosen not to use it to its full potential (i.e. code for 64 different amino acids) in favor of building in some redundancy.
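To make the base-4 point and the complementarity point concrete, here is a minimal sketch that packs bytes into bases at 2 bits per base, builds the complementary strand, and counts bases. The bit-to-base mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions for illustration. They are not taken from the article and say nothing about how the actual experiment encoded its data.

```python
from collections import Counter

BASES = "ACGT"  # hypothetical mapping: 00->A, 01->C, 10->G, 11->T
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def encode(data: bytes) -> str:
    """Write each byte as four base-4 digits, most significant pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )

def decode(strand: str) -> bytes:
    """Inverse of encode: every four bases become one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

def reverse_complement(strand: str) -> str:
    """The partner strand, read in its own 5'-to-3' direction."""
    return strand.translate(COMPLEMENT)[::-1]

sense = encode(b"DNA")
antisense = reverse_complement(sense)

single = Counter(sense)              # one strand: no A = T constraint
duplex = Counter(sense + antisense)  # both strands: A = T and G = C

print(sense, dict(single))
print(antisense, dict(duplex))
```

Within the single strand the counts of A and T differ, but once the complementary strand is included they are forced to match, which is the relationship described above.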

    --
    "FDA staff reviewers expressed concern about the number of patients who were left out of the study because they died."