Researchers Achieve Storage Density of 2.2 Petabytes Per Gram of DNA
SternisheFan sends news of researchers who encoded an MP3, a PDF, a JPG, and a TXT file into DNA, along with another file that explains the encoding. The researchers estimate the storage density of this technique at 2.2 petabytes per gram (abstract). "We knew we needed to make a code using only short strings of DNA, and to do it in such a way that creating a run of the same letter would be impossible. So we figured, let's break up the code into lots of overlapping fragments going in both directions, with indexing information showing where each fragment belongs in the overall code, and make a coding scheme that doesn't allow repeats. That way, you would have to have the same error on four different fragments for it to fail – and that would be very rare," said one of the study's authors. "We've created a code that's error tolerant using a molecular form we know will last in the right conditions for 10 000 years, or possibly longer," said another.
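For anyone wondering what "a coding scheme that doesn't allow repeats" could look like, here is a rough sketch, not the researchers' actual code. It turns bytes into base-3 digits and lets each digit pick one of the three bases that differ from the previous base, so a run of the same letter cannot occur. The fixed width of six base-3 digits per byte, the starting base, and the step of 25 between fragments are all assumptions made for illustration; the summary only says the fragments are short, overlapping, and indexed. The alternating strand direction the authors mention is left out.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # assumption: fixed width, since 3**6 = 729 >= 256


def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits


def encode(data: bytes, start: str = "A") -> str:
    """Each digit selects one of the three bases that differ from the previous one."""
    prev = start  # hypothetical 'virtual' base before the first real one
    out = []
    for trit in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]
        prev = choices[trit]
        out.append(prev)
    return "".join(out)


def decode(dna: str, start: str = "A") -> bytes:
    """Invert encode(); raises ValueError if a repeated base is encountered."""
    prev = start
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


def fragments(dna: str, length: int = 100, step: int = 25) -> list[tuple[int, str]]:
    """Overlapping (index, fragment) pairs; with step = length / 4,
    interior positions are covered by four fragments."""
    return [(i // step, dna[i:i + length]) for i in range(0, len(dna), step)]


if __name__ == "__main__":
    strand = encode(b"hello")
    print(strand, decode(strand))
```

A repeated base in the input to decode() can only come from a read or synthesis error, which is why it raises instead of guessing.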
How many Libraries of Congress is that?
They download data into Keanu's head as a courier. However, I think they spoke of "gigabytes" back in 1995.
It's useless unless it's reasonably fast.
Memory upgrade kits of the future could just be a razor blade and a plastic bag. Bleed your own upgrade!
Solving Unix problems since 1989...
How fast does it spin? What's the IOPS on something like that? How fast will Windows 7 boot on it?
I understand they wanted the overall system to be fault tolerant, but it might be better to leave that part to established computer science. I understand DNA might be uniquely prone to certain types of errors or reading problems - but there's a lot of computer science theory (and practice) established here that would likely make the overall system more robust than what looks like a fairly simple redundancy scheme.
Let's not stir that bag of worms...
I can't wait to see what happens when a video stored on DNA goes viral...
*ducks*
So that is how the Goa'uld stored everything in their 'genetic memory'. And here, I always thought StarGate SG-1 was a crock of lies!
Each DNA nucleotide has a molecular weight of about 150. So a gram of DNA should contain about 6e23/150 = 4e21 bases. At two bits per base, that is 8e21 bits, or 1e21 bytes. These guys are getting 2e15. So, in theory, they are getting about two millionths of the potential storage, or 0.0002%.
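Redoing that estimate with the comment's own inputs gives roughly two parts per million. The sketch below is only back-of-the-envelope arithmetic: the molecular weight of 150 is the commenter's figure, taken here as an assumption, and 2.2e15 bytes per gram is the number from the story.

```python
AVOGADRO = 6.022e23
NUCLEOTIDE_MASS = 150.0          # g/mol, the commenter's figure (assumption)
BITS_PER_BASE = 2                # four letters -> two bits
REPORTED_BYTES_PER_GRAM = 2.2e15  # 2.2 petabytes, from the story


def theoretical_bytes_per_gram(nucleotide_mass: float = NUCLEOTIDE_MASS) -> float:
    """Upper bound if every base in one gram carried two bits of unique data."""
    bases_per_gram = AVOGADRO / nucleotide_mass
    return bases_per_gram * BITS_PER_BASE / 8


limit = theoretical_bytes_per_gram()
fraction = REPORTED_BYTES_PER_GRAM / limit

print(f"theoretical limit: {limit:.1e} bytes per gram")
print(f"fraction achieved: {fraction:.1e} ({fraction * 100:.4f}%)")
```

Plugging in a heavier nucleotide mass lowers the theoretical limit and raises the fraction achieved in proportion.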
This seems like an amazing development, but just today we've had a story about Monsanto and how well their error correction is going despite having the best in Western thinking available to them. Why should we trust that IBM's procedures are any better?
It's 2.2 petabytes per gram, but only if you don't mind that it contains a billion copies of the same 2.2 megabytes. Making lots of copies of a short DNA sequence is easy. Making a whole gram of unique DNA sequences is much, much harder. What's the non-redundant storage density of this process?
Give me Classic Slashdot or give me death!
Keep a strand of DNA "alive" in your computer?
If you could reason with religious people, there would be no religious people
It could be they are already using a fancier scheme - it's hard to tell which parts are real details of their method and which are pop-sci "summary". So I apologize if I'm not giving them deserved credit here.
Let's not stir that bag of worms...
Kinda makes me wonder...
When we hit a point where we can easily manipulate DNA as data, what will the implications of that be biologically?
Will people be able to write biological weapons as they currently do with computer viruses?
How rare is "very rare"? If they have that 2.2 petabyte gram of storage, and "rare" means 0.0001% of the time, that's still 9 billion failures in your archived data.
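The 9 billion figure works out if you count bases at two bits each. A quick sketch of that arithmetic follows, along with what happens if a failure really does need the same position to go wrong on four fragments. The fourfold case assumes errors on different fragments are independent, which is an assumption of this sketch and not something the summary states.

```python
BYTES_PER_GRAM = 2.2e15  # 2.2 petabytes, from the story
BASES_PER_BYTE = 4       # assumption: two bits per base


def expected_failures(per_base_rate: float, copies_needed: int = 1,
                      n_bytes: float = BYTES_PER_GRAM) -> float:
    """Expected number of unrecoverable positions.

    copies_needed=1 treats every base error as a failure;
    copies_needed=4 requires the same position to fail on four
    independent fragments (independence is assumed here).
    """
    positions = n_bytes * BASES_PER_BYTE
    return positions * per_base_rate ** copies_needed


rate = 1e-6  # "rare" read as 0.0001% per base
print(f"single copy: {expected_failures(rate):.1e} failures")
print(f"four copies: {expected_failures(rate, copies_needed=4):.1e} failures")
```

Under the independence assumption, the fourfold overlap is what moves the estimate from billions of failures to far less than one.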
Think about it, saving your stuff in a dna tattoo... very cool and very creepy at the same time.
KERNEL PANIC -SIGFAULT AT ADDRESS #51A54D07
Computer memory is about reading and writing data efficiently, NEVER the obvious observation that stable arrangements of something can 'encode' information. Why would Slashdot promote such self-serving garbage?
A more intelligent question would be why we have failed to create organic memory, based on very long carbon-based molecules. In theory, we could march electron states up and down such a molecule, in a way analogous to early CCD devices or 'bubble' memory, and use these states to encode information at an atomic density.
Building unique molecule chains to encode information would be the worst idea ever. It would be impossible to think of a slower, more expensive, and less reliable method of data storage. If density and endurance are desired over cost and speed, you simply create a very thin layer of some stable material, and burn or impress extremely small holes or indentations into the material. This way, you store one bit for every handful of atoms, essentially forever.
However, DNA 'manipulation' is trendy, no matter how pointless the enterprise, and Slashdot seems to take pride in being the 'tabloid' of tech forums.
would like to buy your tech for our animus project.
If this is an optimal encoding, we should start looking for encoded messages from ancient alien civilizations.
So, while I realize that the intent here is not to put it inside a living organism.... some part of me wants to know what would happen if the data for various Windows malware packages was encoded, and injected into bacterial hosts.
Think of all the new diseases that could come about from pure happenstance, coincidence, and Murphy's law!
Kind of a "throw stuff at the wall and see what sticks" silliness side effect of using DNA for data storage.
DNA is the machine language of biological life. What happens if it starts perpetuating itself?
Are YOU using the TOOL, or is the TOOL using YOU? Think about it!
Where's the de-mutation program for that?
Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
will somehow involve porn, no doubt.
and still just 4 calories per gram!
Okay, storing is "solved". How about retrieval? Especially random-access retrieval that is simple and fast (relatively speaking), so that such a storage medium can be practical? Certainly not DNA sequencing that can take weeks to complete?
The second problem: DNA denatures and fragments at room temperature. Certainly a -80C lab freezer for storage wouldn't be practical.
Third problem: DNA secondary and tertiary structure. The coding scheme must also solve the problem of DNA's tendency to form secondary structures (like hairpins) or tertiary structures (like supercoils) that can hamper reading of / access to the information. I think this is the reason why the storage uses short sequences. But short DNA sequences like the ones proposed (~100 bp, from the figure) could still form such structures.
--
Error 500: Internal sig error
"That was the best sex ever and BTW, I just gave you copies of all my videos".
Nate
You copied an MP3? Expect to be sued by the RIAA and their European buddies.
The question is: What if others already used a similar method to send messages to us? How would you find that out? Has anybody tried to find out? Considering the possibility we are not alone...
How long until the crypto guys start using this to pass messages in live animals?
Good grief, this has to be the one storage medium slower than a Commodore 1541 disk drive. Slower than an ASR-33 paper tape reader.
ok, those were both really fowl.
I've fallen off your lawn, and I can't get up.
Using the numbers provided above:
1g DNA =~ 4x10^21 bases (but there is also the phosphate backbone), so let's lowball at the 2x10^15 bases in the story. The going rate for DNA synthesis is ~$0.28 per base [http://www.genscript.com/gene_synthesis.html], and the company can generate 6.5*10^6 bases in a month...
then 1g of synthesized DNA would cost ~$5.6x10^14 and take ~25,600,000 years to generate.
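The same numbers as a few lines of arithmetic, in case anyone wants to plug in a different price or throughput. All three inputs are the figures quoted in the comment; nothing here is checked against the vendor's current pricing.

```python
BASES_TO_SYNTHESIZE = 2e15     # the commenter's lowball figure
COST_PER_BASE_USD = 0.28       # quoted synthesis price per base
BASES_PER_MONTH = 6.5e6        # quoted monthly throughput


def synthesis_estimate(bases: float = BASES_TO_SYNTHESIZE,
                       cost_per_base: float = COST_PER_BASE_USD,
                       bases_per_month: float = BASES_PER_MONTH) -> tuple[float, float]:
    """Return (total cost in USD, time in years) for a single production line."""
    cost = bases * cost_per_base
    years = bases / bases_per_month / 12
    return cost, years


cost, years = synthesis_estimate()
print(f"cost: ${cost:.1e}")
print(f"time: {years:,.0f} years")
```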
we're in trouble now - they aren't coming for our water, bodies or women. And we're broadcasting their material unidirectionally. Humanity is so screwed. - AC21874812
Great! Now we need to install AV software in all our brains to protect us from computer viruses that leap into the WetWare (/SARC)
Should we start checking our own DNA for encoded files from our deep ancestors?
I was actually trying to come up with a ReiserFS gag.
Science is all about firing a drunk pig out of a cannon just to see what happens.
Slashdot told me so
10,000 years my ass....
I'm not signing anything
So now I can store my entire porn collection in one spurt.
Science!
I filed a patent along similar lines back in 2006, IIRC. Although long since lapsed, it did include more sophisticated error correction and compression. The text of the patent can be found here: https://docs.google.com/file/d/0BwCRbg2GjBaddHU5UnRYTWJUS3c/edit
Wouldn't it be great if we could accumulate knowledge by eating a knowledge-encoded steak?
It's like a millisecond per base pair, or a kilobit or two per second at two bits per base. However a cell may have tens of thousands of ribosomes to parallelise this function.
A medium that is DRM free because the rip-and-burn tools cost about a billion dollars. I for one would not want to carry around a box of test tubes with gelatinous MP3s of every note and recording humanity has ever emitted...gimme my iPod.
SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
what happens when the dna becomes a virus and it really becomes a video
or becomes aware like skynet and....hides waiting elusive to pay back humanity for losing the terminator wars
Okay, storing is "solved". How about retrieval? Especially random-access retrieval that is simple and fast (relatively speaking), so that such a storage medium can be practical? Certainly not DNA sequencing that can take weeks to complete?
Sequencing has moved on, especially for short fragments. Look up Ion Torrent systems - this would take minutes to hours at most.
The second problem: DNA denatures and fragments at room temperature. Certainly a -80C lab freezer for storage wouldn't be practical.
DNA is perfectly happy at RT if you dry it out. Research labs routinely exchange samples spotted onto blotting paper.
Third problem: DNA secondary and tertiary structure. The coding scheme must also solve the problem of DNA's tendency to form secondary structures (like hairpins) or tertiary structures (like supercoils) that can hamper reading of / access to the information. I think this is the reason why the storage uses short sequences. But short DNA sequences like the ones proposed (~100 bp, from the figure) could still form such structures.
Not relevant in most cases. Massively GC-rich sequences might cause some problems, but you could adjust your coding scheme to avoid this if it ever became a problem.
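If it ever did become a problem, screening fragments before synthesis would be straightforward. The sketch below is a crude filter, not anything described in the paper: it computes GC fraction and flags fragments where some window's reverse complement appears further along the same strand, which is the basic ingredient of a hairpin. The stem length of 8 is an arbitrary assumption, and real structure prediction also weighs loop size and binding energy.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """The sequence that would pair with seq, read in the usual direction."""
    return seq.translate(COMPLEMENT)[::-1]


def has_possible_hairpin(seq: str, stem: int = 8) -> bool:
    """True if any stem-length window could pair with a later,
    non-overlapping stretch of the same strand."""
    for i in range(len(seq) - 2 * stem + 1):
        window = seq[i:i + stem]
        if reverse_complement(window) in seq[i + stem:]:
            return True
    return False


if __name__ == "__main__":
    fragment = "AAAACCCC" + "TTTT" + "GGGGTTTT"
    print(gc_fraction(fragment), has_possible_hairpin(fragment))
```

A coding scheme could re-encode or re-index any fragment that trips either check, at the cost of a little density.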
Sorry, this is just the best news I've read in ages. Fucking AWESOME!!!