Slashdot Mirror


Encrypted DNA Storage Investigated by DOE Researchers (darkreading.com)

Biological engineers at a Department of Energy lab "are experimenting with encrypted DNA storage for archival applications." Slashdot reader ancientribe shares an article from Dark Reading: Using this method, the researchers could theoretically store 2.2 petabytes of information in one gram of DNA. That's 200 times the printed material at the Library of Congress... Instead of needing a 15,000 square-foot building to store 35,000 boxes of inactive records and archival documents, Sandia National Laboratories can potentially store information on much less paper, in powder form, in test tubes or petri dishes, or even as a bacterial cell... "Hard drives fail and very often the data can't be recovered," explains Bachand. "With DNA, it's possible to recover strands that are 10,000 to 20,000 years old... even if someone sneezes and the powder is lost, it's possible to recover all the information by just recovering one DNA molecule."

6 of 42 comments

  1. The sheer scale of it by dhaen · · Score: 2

    I deal in archiving film and video by the petabyte. At a storage symposium a couple of years ago I met my equivalent in the DNA research sphere; his data requirements blew me away. And all encoded in my cells.

    1. Re:The sheer scale of it by SNRatio · · Score: 2
      Sequencing DNA these days means creating a library of millions of short segments (100-300 bp) of DNA from your sample, and then assembling the data into longer fragments by finding the segments that overlap and stringing them together. To sequence 4 billion base pairs they actually read about 120 billion base pairs (multiple reads are needed to eliminate errors, generate overlaps, etc.). And that raw data is not 2 bits per base: it's an intensity level from the machine and a probability score that the algorithm has called the correct base for that position in the image, plus all of the associated indexing. About 40 bits per "base" for Illumina sequencing. Illumina X-10 sequencers can generate ~10 petabytes of data per year - each.
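
      To put those numbers side by side, here is a rough back-of-the-envelope sketch. It only uses the figures quoted above (4 billion bases, 120 billion bases read, ~40 bits per raw base, 2 bits per packed base); the variable names and the decimal-gigabyte convention are my own assumptions, not anything from a sequencer vendor.

```python
# Back-of-the-envelope sizes, using only the figures quoted in the comment.
GENOME_BASES = 4e9         # bases in the finished sequence
RAW_BASES_READ = 120e9     # bases actually read to get there
BITS_PER_RAW_BASE = 40     # intensity + quality score + indexing (approximate)
BITS_PER_PACKED_BASE = 2   # A/C/G/T fits in 2 bits


def bits_to_gigabytes(bits):
    """Convert a bit count to decimal gigabytes (1 GB = 1e9 bytes)."""
    return bits / 8 / 1e9


# How many times, on average, each position is read.
coverage = RAW_BASES_READ / GENOME_BASES

# Size of the raw machine output versus the final 2-bit-per-base sequence.
raw_gb = bits_to_gigabytes(RAW_BASES_READ * BITS_PER_RAW_BASE)
packed_gb = bits_to_gigabytes(GENOME_BASES * BITS_PER_PACKED_BASE)

print(f"coverage: {coverage:.0f}x")
print(f"raw data: {raw_gb:.0f} GB, packed sequence: {packed_gb:.0f} GB")
```

      Under these assumptions the raw output for one sample is several hundred times larger than the finished sequence, which is where the storage pressure comes from.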

      The final archived data, what you might use for clinical purposes, could indeed be a diff file more or less. But in the meantime the world was generating 1 zettabyte of DNA sequencing data per year in 2015, and the rate doubles every ~7 months.

      http://www.ncbi.nlm.nih.gov/pm...

  2. Re:Mutation by ShanghaiBill · · Score: 4, Funny

    You'd need robust error detection and correction because of mutation and damage.

    We already have that. There are a few billion years of prior art.

    But copying seems trivial.

    The hard part is writing the device driver to interface the ribosome to /dev/dna.

  3. Re:Mutation by TFAFalcon · · Score: 3, Insightful

    I think mutation isn't really that much of an issue if the DNA isn't actually doing anything (being duplicated or transcribed to RNA).

    It's supposed to be one of the more stable ways of storing data, much better than tape in fact. What I'd be more worried about is reading it again - current ways of reading DNA can misread it and have problems with long sequences of the same base pair, so some kind of encoding to avoid those would be needed.
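
    As an illustration of what such an encoding could look like, here is a minimal sketch of a rotating base-3 code: each byte becomes six base-3 digits, and each digit picks one of the three bases that differ from the previous base, so the same base never appears twice in a row. This is one known way to avoid repeated bases, not necessarily what the Sandia researchers use; the function names and the "A" starting convention are assumptions made for the example.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits cover one byte


def bytes_to_trits(data):
    """Expand each byte into six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits):
    """Inverse of bytes_to_trits: fold each group of six digits into a byte."""
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


def encode(data, start_prev="A"):
    """Map bytes to a strand in which no two adjacent bases are equal."""
    prev = start_prev
    strand = []
    for trit in bytes_to_trits(data):
        # Only the three bases that differ from the previous one are allowed.
        choices = [b for b in BASES if b != prev]
        prev = choices[trit]
        strand.append(prev)
    return "".join(strand)


def decode(strand, start_prev="A"):
    """Recover the original bytes from a strand produced by encode()."""
    prev = start_prev
    trits = []
    for base in strand:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits_to_bytes(trits)


message = b"hello DNA"
strand = encode(message)
assert decode(strand) == message
```

    The cost is density: six bases per byte instead of the four you would need at 2 bits per base. A real scheme would also need error correction on top, since this only prevents repeats and does not detect misreads.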

  4. Quartz Glass by simpz · · Score: 3, Informative

    Isn't the potential of quartz glass storage for archiving better? http://themindunleashed.org/2014/02/data-storage-crystal-quartz-will-change-everything.html It is stable for longer and won't get eaten by bacteria.

  5. DNA degrades after just a few years by Tony Isaac · · Score: 2

    I work for a DNA lab. After about 10 years, DNA samples that have been sent to us are basically unusable because they degrade over time. Sure, it might be possible to still read some strands of the remaining DNA, but significant percentages are lost. DNA archaeologists don't mind, because they are looking for whatever fragments they can still read. But if they required most of the DNA to be readable after long periods of time, they would be out of luck.