Encrypted DNA Storage Investigated by DOE Researchers (darkreading.com)
Biological engineers at a Department of Energy lab "are experimenting with encrypted DNA storage for archival applications." Slashdot reader ancientribe shares an article from Dark Reading:
Using this method, the researchers could theoretically store 2.2 petabytes of information in one gram of DNA. That's 200 times the printed material at the Library of Congress... Instead of needing a 15,000 square-foot building to store 35,000 boxes of inactive records and archival documents, Sandia National Laboratories can potentially store information on much less paper, in powder form, in test tubes or petri dishes, or even as a bacterial cell... "Hard drives fail and very often the data can't be recovered," explains Bachand. "With DNA, it's possible to recover strands that are 10,000 to 20,000 years old... even if someone sneezes and the powder is lost, it's possible to recover all the information by just recovering one DNA molecule."
You'd need robust error detection and correction because of mutation and damage.
But copying seems trivial.
OTOH they are the DOE so maybe they can sustain a 20000 year project?
DMCA takedown notice for getting infected with a Kanye West "Music" Bacteria...
I deal in archiving film and video by the petabyte. At a storage symposium a couple of years ago I met my equivalent in the DNA research sphere; his data requirements blew me away. And all encoded in my cells.
see subject
Sounds suspiciously like xor compression.
So then what?
Whenever the press covers the "data storage in DNA" topic, they boast about huge storage capacities based on the assumption that you can basically store 2 bits per base pair. But DNA has not quite evolved to be a long-term mass-storage device. DNA is rather an energy-efficient way to store relatively small amounts of data (~0.8 GB of very redundant data in a human) that exists in so many copies (billions in a human) that it doesn't matter too much if millions of those billions of copies suffer some "bit rot" over time. DNA storage also needs a living organism around it to sustain the constantly ongoing activities that repair or sort out damaged data. And DNA is meant to be variable over time, as mutation is important for the ongoing success of a species.
I don't think that DNA-based storage will ever beat simple, inorganic storage in terms of providing reliable long-term mass storage. It's just not optimized for that purpose.
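For anyone wondering what "2 bits per base pair" means in practice, here is a minimal sketch. The A/C/G/T-to-bit mapping is an arbitrary assumption for illustration, not the scheme the Sandia researchers use, and it has none of the redundancy or error correction a real system would need.

```python
# Hypothetical mapping: each base carries one 2-bit value (A=00, C=01, G=10, T=11).
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn bytes into a base string, four bases per byte, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(strand: str) -> bytes:
    """Reverse of encode(); expects a length that is a multiple of four."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand, decode(strand))
```

A single misread base flips two bits here with no way to notice, which is the core of the objection above.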
It's kind of like SOLAR ROADWAYS. SOLAR FREAKING ROADWAYS. (Maybe also flying cars.) All Bullshit ideas if you're an engineer.
True enough. Although looking at the figures given in the summary, there's one hell of a lot of redundancy in their 2.2 petabyte/gram estimate. Looking up the molecular masses of the base pairs plus the sugar chain to make up a DNA molecule and assuming 2 bits per base pair, I get approximately 160,000 petabytes per gram of material (no redundancy), so the estimate given in the summary has a redundancy factor of about 73,000.
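The back-of-envelope estimate above can be reproduced roughly as follows. The ~650 g/mol average mass per base pair is my assumption (a commonly quoted round figure), so the raw density comes out at roughly 230,000 PB/gram rather than exactly the 160,000 quoted; same order of magnitude either way. The redundancy factor is just the raw figure divided by the 2.2 PB/gram from the summary.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # ASSUMED average mass of one base pair incl. backbone
BITS_PER_BP = 2               # the usual "2 bits per base pair" assumption


def petabytes_per_gram(bp_mass=BP_MASS_G_PER_MOL, bits_per_bp=BITS_PER_BP):
    """Raw capacity of one gram of DNA with no redundancy, in PB (1 PB = 1e15 bytes)."""
    base_pairs = AVOGADRO / bp_mass
    return base_pairs * bits_per_bp / 8 / 1e15


def redundancy_factor(raw_pb, claimed_pb=2.2):
    """How many times over the claimed capacity fits into the raw capacity."""
    return raw_pb / claimed_pb


if __name__ == "__main__":
    raw = petabytes_per_gram()
    print(f"raw density: {raw:,.0f} PB/g")
    print(f"redundancy vs 2.2 PB/g: {redundancy_factor(raw):,.0f}")
    print(f"redundancy using 160,000 PB/g: {redundancy_factor(160_000):,.0f}")
```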
The internet tells me that the human genome weighs 3.59 x 10^-12 grams.
1 gram of dna * 1 complete strand of dna / (3.59 x 10^-12 grams) = 278 x 10^9 strands = 278,000,000,000 strands of dna.
Length of human dna stretched out: about 2 meters
(278 x 10^9 strands) * (2 meters / strand) = 556 x 10^9 meters
I can't conceive of how you can organize that in order to read it.
Then again, I don't know the length of a blu-ray, if you could unravel it and stretch it out straight. Or that of a record.
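The strand arithmetic above is easy to check in a few lines. The genome mass and the 2 meter length are the figures quoted in the comment, not values I have verified independently; without intermediate rounding the total comes out at about 557 x 10^9 meters.

```python
GENOME_MASS_G = 3.59e-12   # grams per human genome, as quoted above
GENOME_LENGTH_M = 2.0      # meters of stretched-out DNA per genome, as quoted above


def genome_copies_per_gram(genome_mass_g=GENOME_MASS_G):
    """Number of complete genomes in one gram of DNA."""
    return 1.0 / genome_mass_g


def total_length_m(copies, length_per_copy_m=GENOME_LENGTH_M):
    """Combined length if every copy were laid end to end."""
    return copies * length_per_copy_m


if __name__ == "__main__":
    copies = genome_copies_per_gram()
    print(f"copies per gram: {copies:.3e}")
    print(f"total length:    {total_length_m(copies):.3e} m")
```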
Not useful data to people.
It's called "junk DNA".
Escher was the first MC and Giger invented the HR department.
Now everyone is potentially possessing child porn or terrorist documents.
Isn't quartz glass storage a better prospect for archiving? http://themindunleashed.org/2014/02/data-storage-crystal-quartz-will-change-everything.html It's stable for longer and won't get eaten by bacteria.
I work for a DNA lab. After about 10 years, DNA samples that have been sent to us are basically unusable because they degrade over time. Sure, it might be possible to still read some strands of the remaining DNA, but significant percentages are lost. DNA archaeologists don't mind, because they are looking for whatever fragments they can still read. But if they required most of the DNA to be readable after long periods of time, they would be out of luck.
Today's DNA reading techniques begin with PCR, a process that multiplies small amounts of DNA so that millions of copies are made. These copies are needed to be accurately read by the equipment, in order to distinguish between "good" copies and noise. Getting the results amounts to statistical analysis of the number of A, T, C, or G results read at a certain location; a "call" can be made only if a high enough percentage of the results agree.
The bit density claims are massively overstated, and reading the data would not be trivial!
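To make the "call" step concrete, here is a minimal sketch of majority voting across reads. It assumes the reads are already aligned and of equal length, and the 90% agreement threshold is a made-up number for illustration; real base callers also weigh per-base quality scores, which this ignores.

```python
from collections import Counter


def call_base(reads_at_position, min_fraction=0.9):
    """Return the agreed base at one position, or 'N' if agreement is too low."""
    counts = Counter(b for b in reads_at_position if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return "N"
    base, n = counts.most_common(1)[0]
    return base if n / total >= min_fraction else "N"


def consensus(reads, min_fraction=0.9):
    """Call every position across a set of aligned, equal-length reads."""
    return "".join(call_base(column, min_fraction) for column in zip(*reads))


if __name__ == "__main__":
    reads = ["ACGT", "ACGT", "ACGT", "ACCT"]  # one read disagrees at position 3
    print(consensus(reads, min_fraction=0.75))
    print(consensus(reads, min_fraction=0.9))
```

With few copies to vote, more positions fall below the threshold and come back as "N", which is why starting from a single molecule is so much harder.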
to recover the information, you have to sequence the DNA
That is, determine the physical order of A,T,C,G
The error rates are high with current technology
and they are a LOT Higher if you start with one molecule of DNA
When people talk about recovering 10,000 year old DNA, they are talking 50%-80% recovery
Is that really what you want for storage of financial records?
I assert that anyone using DNA for industrial purposes is stupid or a grant whore
and yes, I know about Illumina, PacBio, Oxford Nanopore, etc
I know about WGA error rates, and deamination and all that good stuff
Mainly because scientists have focused on reading and invented clever technologies to do so. The guy who made the reading breakthrough, Craig Venter, is also a writing pioneer in his synthetic biology work. Earlier this year there was a secret meeting run by a Harvard prof to launch the DNA-WRITE project to improve write technology. The meeting was secret because it was feared that anti-GMO groups and general Frankenstein fear might quash the efforts prematurely. P.S. Some computer memory technologies are also much slower at writing than at reading.