Slashdot Mirror


Using Bacterial DNA For Data Storage

NPV writes "January ACM Communications has an article on the use of DNA in genetically modified bacteria to store information. This is an attempt to achieve the ultimate in archival storage (one of the modified bacteria can tolerate 1000X more radiation than a human being). Now just suppose that the "junk DNA" in the human genome is the documentation package for the machine code. Who wrote that manual?" Here's the article abstract.

211 comments

  1. Of course its junk DNA... by packeteer · · Score: 3, Interesting

    I mean these bacteria have evolved for millions of years to be as streamlined as possible and yet i a few short years we can figure it all out and more. Also we can make it better of course.

    --
    unzip; strip; touch; finger; mount; fsck; more; yes; unmount; sleep
    1. Re:Of course its junk DNA... by zatz · · Score: 2, Insightful

      It's not like copying some extra genetic material is that expensive for the cell. What's the selective disadvantage in having some superfluous introns (non-coding regions) in your DNA?

      We may not immediately be able to make natural organisms "better" in terms of natural fitness, but we can still make plenty of modifications which are beneficial to us. We can do it even without the use of direct genetic engineering; we call that "domestication".

      --

      Java: the COBOL of the new millenium.
    2. Re:Of course its junk DNA... by packeteer · · Score: 2

      And of course domestication has never been anythig but pleasent for us. All i am saying is thatfor thousands of years scientists have been very wrong and we keep fixing that. Sometimes we find things that are most likely right but i dont think we get too exited and assume we know it all by now.

      --
      unzip; strip; touch; finger; mount; fsck; more; yes; unmount; sleep
    3. Re:Of course its junk DNA... by 0x0d0a · · Score: 5, Interesting

      To be entirely fair, they were using a brute force mechanism and dealing with a changing, hostile environment. We can use a controlled environment.

      Yet I don't see this hitting the market in the next ten years.

      I remember an article from about eight years ago about how the future of storage was going to be a frozen solid containing bacteria that change shape when a certain intensity of light hits them -- two lasers, each with half the requisite amount of light, would shine in to cause the bacteria to change shape where they met. Terabytes in a little cube. Never happened.

    4. Re:Of course its junk DNA... by zatz · · Score: 1, Flamebait

      I am amazed that you can post on Slashdot at a default of +2 with your broken English and Luddite views.

      Domestication has, on the whole, been *extremely* pleasant for us. Occasionally we get a plague from an animal species that way, yes. We've also managed to do such amazing things as grow food so efficiently that we can support a large and sedentary (vs nomadic) population almost anywhere on Earth, and give them lots of leisure time to display their ignorance by commenting on Slashdot.

      Also, the "junk DNA" remark you are criticizing was from the submitter. Maybe the article suggested somehow adding more chromosomes rather than replacing apparently unused regions. Either way, it sounded like primarily a thought experiment, so you can relax. Perhaps after you eschew modern medicine and die at 30, we will encode your sad story into some bacterial DNA for the edification of future generations.

      <flamethrower mode=off>

      --

      Java: the COBOL of the new millenium.
    5. Re:Of course its junk DNA... by marshac · · Score: 2, Informative

      The thing about 'junk' DNA is that it's not junk at all. When you remove the 'junk', the organism dies. It's just junk until we find out what it really does.

    6. Re:Of course its junk DNA... by Anonymous Coward · · Score: 0

      "I am amazed that you can post on Slashdot at a default of +2 with your broken English and Luddite views"

      Because everyone has english as a first language, and if you have a gramatical or spelling error, your comments are worthless.

      Biggot.

    7. Re:Of course its junk DNA... by Anonymous Coward · · Score: 0

      I think you'll find that the word is "bigot", not "biggot". Are you fucking stupid or something?

    8. Re:Of course its junk DNA... by Anonymous Coward · · Score: 0
      Because everyone has english as a first language, and if you have a gramatical or spelling error, your comments are worthless.

      Of course not. It is jarring and irritating to read posts with errors that preview would catch.

      That is reality, not bigotry.

    9. Re:Of course its junk DNA... by Anonymous Coward · · Score: 0

      READ THE BOOK 'PREY'!!! You people must read this before messing with technologies such as these.

    10. Re:Of course its junk DNA... by Anonymous Coward · · Score: 0

      Posting obvious spelling and structural mistakes on the internet is like appearing in a public forum with egg on your tie.

      This might sound harsh, but I wouldn't be seen posting in my broken German on German SuSE lists. Although with English the default language of the internet, it's not such a problem.

      It's no great hassle, and unlike the above poster, I don't mind reading these posts, but I agree they are a speedbump to my reading of the page.

      Oh well, my 2c.

  2. Fascinating! by Anonymous Coward · · Score: 0

    Now we can have computers so small that we can't even operate them without an electron microscope. Who wants to build the keyboard and monitor?

    1. Re:Fascinating! by hackwrench · · Score: 1

      Uh, just what do you think a microchip is?

  3. Gaia by Scaba · · Score: 2

    It reminds me of the planet Gaia in the later Foundation books of Asimov. Memories stored in all living things and in the very planet down to the molten core.

    1. Re:Gaia by Anonymous Coward · · Score: 0

      Fag.

    2. Re:Gaia by Anonymous Coward · · Score: 0

      Nice...you eat your mother out with that mouth?

  4. Who wrote that manual? by Anonymous Coward · · Score: 5, Funny

    Who wrote that manual?

    I think the important question is... who has IP rights over it?

    1. Re:Who wrote that manual? by Anonymous Coward · · Score: 0

      Be careful. Otherwise, someone's going to start demanding that we call ourselves Gnu/Humans.

    2. Re:Who wrote that manual? by Anonymous Coward · · Score: 0

      Our Lord and Saviour Jesus Christ

      Anyone messing with His plans would be in more than they bargained for! :)

    3. Re:Who wrote that manual? by Anonymous Coward · · Score: 0

      Silence, troll!

    4. Re:Who wrote that manual? by Anonymous Coward · · Score: 0

      I do.

    5. Re:Who wrote that manual? by mlush · · Score: 2, Interesting
      The Raelians, duh!

      Ha! Raelians can't read biology and think p53 is a new gene (first published 1984) that makes evolution impossible, because it's a DNA repair enzyme, which makes mutation and hence evolution impossible (1).

      Which is true in that DNA repair exists and p53 is involved in it (although it's more involved in getting cells to commit suicide if they're feeling a bit precancerous), but it won't stop genes mutating, as all it does is check and correct DNA base pairing, sometimes correcting it the wrong way and creating a new mutation.

      (1) under Evidence -> Science & Future -> Alt theorys of Evolution.... (F***ing frames)

      PS Being involved in human gene nomenclature I feel duty bound to mention p53 approved symbol is TP53.

    6. Re:Who wrote that manual? by Anonymous Coward · · Score: 0

      RTFM : Read the freaking mitochondria

    7. Re:Who wrote that manual? by bluFox · · Score: 1

      Looks like our very own creator ........

      the documentation is harder to read than

      the code.... :)

      --
      ~561
    8. Re:Who wrote that manual? by Anonymous Coward · · Score: 0

      God does not exist.

    9. Re:Who wrote that manual? by Wild+Ennui · · Score: 1

      More germane is when Microsoft claims the use of four bases as an integral part of Windows and decides that our carbon based life form is now infringing on their intellectual property. But of course, that may be a moot point if Jeff Bezos can patent the 'single dick' model of reproduction.

  5. mutations? by Anonymous Coward · · Score: 1, Interesting

    Isn't allowing for mutations an important aspect of DNA? Doesn't make for a good place to store info, methinks.

  6. This brings up a concern by Anonymous Coward · · Score: 0

    Please check out the article on:

    http://www.chemical-engineering.com

    This brings up the concern as to whether we should be using DNA for storage. There is a serious concern about if this should be done.

    1. Re:This brings up a concern by Mr+Teddy+Bear · · Score: 1

      i think they should just clone good people over and over... that way stories can be passed down from a copy of a copy of a copy of a copy. Ok, maybe I am being stupid. In fact, I know I am. ;-)

      A little on-topic, messing with DNA in such a way seems a little... sketchy.

  7. Who wrote that manual? by zatz · · Score: 5, Funny

    The Raelians, duh! That's how come Clonaid is so far ahead of other human cloning efforts... they read the documentation.

    --

    Java: the COBOL of the new millenium.
  8. No spy! You can't have it (swallow) by jpt.d · · Score: 2, Insightful

    .... ....

    Doctor, my stomach hurts! .... ....
    (1 year later)
    Plague Plague Plague!

    --
    What we see depends on mainly what we look for. -- John Lubbock Now search for that bug slave!
  9. Great! by Travoltus · · Score: 5, Funny

    So when one of these engineered bacteria wipes out the human species, and some alien species comes along and ganders a look, the bacteria will be carrying a precise record of how we humans fscked ourselves.

    --
    --- Grow a pair, liberals... stop letting the Republicans bully you!
    1. Re:Great! by Anonymous Coward · · Score: 0

      ... a precise record of how we humans fscked ourselves.

      You spelt "fucked" wrong, moron.

    2. Re:Great! by Anonymous Coward · · Score: 0
      You're stupid, so shut the fuck up!!!

      You like my spelling better?

    3. Re:Great! by Anonymous Coward · · Score: 0

      Hey, bitch. Stay out of this unless you want your skull smashed in. Comprehendé?

    4. Re:Great! by Anonymous Coward · · Score: 0

      smash your momma, moron's son...

    5. Re:Great! by Anonymous Coward · · Score: 0

      That's the point.

    6. Re:Great! by Anonymous Coward · · Score: 0

      That's not the fucking point. The fucking point is that you fucking morons can't fucking spell.

    7. Re:Great! by Anonymous Coward · · Score: 0

      GOOD TROLL, you got a lot of fish with the one OR IF you don't know what FSCK is then go back to fucking your moma, Windows FOO.

    8. Re:Great! by derekb · · Score: 1


      More like, the society comes to earth and finds nothing but stored copies of Star Trek Nemesis screener...

      Sounds like a new RIAA slogan 'do you want to be responsible for destroying the world?'

    9. Re:Great! by Saeger · · Score: 1
      Certain species of uber-geeks spell "fuck" as "fsck" on purpose to appear a bit more professionally pretentious, and to display their tribal *nix membership.

      This pathetic meme will live on as long as it gets a *chortle* for being a "witty" and politically correct misspelling of an otherwise 'uncivilized' curse word.

      --

      --
      Power to the Peaceful
    10. Re:Great! by billburroughs · · Score: 1

      I can see it now...

      Patient: Doc, I think I have Bronchitis.

      Doctor: Let's have a look. I see, yes, there it is, no not Bronchitis. Actually, it looks like a few O'Reilly manuals are in there.

      --
      - The word is a virus.
  10. Cheating possiblities by joelt49 · · Score: 2, Interesting

    Disclaimer: I am not advocating any behavior whatsoever here :) Just think: We could store entire textbooks in our DNA. The professors would have no way of taking it out of us. That would be interesting. Not only that, but we could put tons of info there. The only problem is that we would need a way to access it.
    This is interesting though. What if the entire human population became just a storage bank? What if EVERY LIVING THING on Earth became part of this bank? That would be an interesting scenario. For now, though, I'll just stick to normal HD's. A big problem, I suppose, is in changing the data. I wonder how many bacteria they had to go through to get it right.

    1. Re:Cheating possiblities by baryon351 · · Score: 3, Funny

      ...and making backup organisms would be more pleasurable than waiting for a tape unit to finish whirring

    2. Re:Cheating possiblities by Maria+D · · Score: 1

      What if this project with every living being a storage bank has already happened? ;-)

      Now we just need to decode the stored info...

    3. Re:Cheating possiblities by Anonymous Coward · · Score: 2, Funny

      So, a French Kiss becomes file-sharing.. Omigosh! MPAA and RIAA are not going to like this...

    4. Re:Cheating possiblities by justzisguy · · Score: 2, Funny

      Maybe not quite a storage bank...more like a computer. The greatest computer in the universe! Designed by Deep Thought for a few white mice to determine the Ultimate Question(TM) to the Ultimate Answer(TM) all so they could profit on some talk shows.

    5. Re:Cheating possiblities by Com2Kid · · Score: 1
      • What if the entire human population became just a storage bank? What if EVERY LIVING THING on Earth became part of this bank?


      Oh sheez, I hope you didn't just start something worse than "imagine a beowulf cluster of these" jokes.

      Nah, no catch name, no worries. :)
    6. Re:Cheating possiblities by Anonymous Coward · · Score: 0

      How do you know we aren't already? We don't know what everything in our DNA is for and surely we already have come across parts called "Junk DNA". Maybe there's illegal warez in those "Junk DNA" strands or something? Cool!

    7. Re:Cheating possiblities by Anonymous Coward · · Score: 0

      Well, at least they won't have to worry about slashdotters doing any file-sharing.

    8. Re:Cheating possiblities by morgajel · · Score: 1

      duh, it has. When you add all our DNA extra bits together, it makes a large equation..... ...and the answer is 42.

      --
      Looking for Book Reviews? Check out Literary Escapism.
    9. Re:Cheating possiblities by CSG_SurferDude · · Score: 2

      The beowulf cluster already exists....

      They found out the answer is 42....

      Then they made the cluster to find out what the question really is.

  11. hmmm by zachusaf · · Score: 5, Funny

    so much for P2P networks, if anyone wants the new Apache release, I just sneezed.......

    1. Re:hmmm by Anonymous Coward · · Score: 0

      Well, lucky you. I just vomited the latest leaked copy of .NET server.

    2. Re:hmmm by Anonymous Coward · · Score: 0

      I just shitted out a copy of Windows XP (Now with added security flaws, bugs, and spyware!)

    3. Re:hmmm by 0x20 · · Score: 1

      and corn!

    4. Re:hmmm by mandolin · · Score: 1
      zachusaf

      gesundheit (bless you) :)

  12. Uhh, perhaps not. by smoondog · · Score: 4, Interesting

    (one of the modified bacteria can tolerate 1000X more radiation than a human being).

    I haven't read the article (I don't have access from where I am) nor have I thought about this subject much, but one question I have is how the authors keep the sequences under selective pressure. DNA sequences are only conserved over many years if evolution needs them. Non-coding regions (so-called "junk DNA", a poor choice of words, btw) would easily mutate into other sequences. One could imagine sequencing many cells and inferring the original sequence, but this gets more expensive as time goes on (as the number of cells you need to sequence goes up).

    -Sean
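The "sequence many cells and infer the original" idea can be sketched as a per-position majority vote. This is only an illustration, not the article's method: the function name `consensus` and the toy reads are made up, and it assumes the reads are already aligned, of equal length, and differ only by point mutations.

```python
from collections import Counter

def consensus(reads):
    """Return the per-position majority base across equal-length reads.

    Assumes the reads are already aligned and differ only by substitutions
    (no insertions or deletions), which is a simplification.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) walks the reads column by column; keep the commonest base.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

# Toy data: one stored sequence and five descendants, three of them mutated.
original = "ACGTACGTAC"
reads = [
    "ACGTACGTAC",
    "ACGAACGTAC",  # point mutation at position 3
    "ACGTACGTCC",  # point mutation at position 8
    "ACGTACGTAC",
    "TCGTACGTAC",  # point mutation at position 0
]
recovered = consensus(reads)
print(recovered == original)
```

The vote only works while most cells still agree at each position, which is why the cost grows over time: the more mutations accumulate, the more cells have to be sequenced to keep a clear majority.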

    1. Re:Uhh, perhaps not. by iconian · · Score: 2, Interesting

      It seems more like the non-coding region uses the coding region to get itself around. "Life" as we define could just be a means for the non-coding region to reproduce itself. In other words, we could just be containers for these so-called junk regions.

    2. Re:Uhh, perhaps not. by martyn+s · · Score: 1

      That idea is well illustrated in the first few chapters (I don't remember exactly which ones) of The Selfish Gene by Richard Dawkins.

      Blew my mind.

    3. Re:Uhh, perhaps not. by Anonymous Coward · · Score: 0

      One could presumably compute the error rate, and if it was known, could use an error correcting method similar to forward error correction (FEC).

      FEC all your data first, then encode it into the critter's alleles.

      Given that E. coli (for example) has ~5M base pairs, and an assumed error rate (and usable gene rate) of 50%, you could still store around 2.5Mb of data in a single species. Compressed, that's enough to store something like the full text of Cryptonomicon and Snow Crash.

      At least that's what I'd store in it. :P

      Kordless
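For a rough feel of the numbers in the estimate above, here is a small calculation. It is a sketch under assumptions: the function name `capacity_bytes` is made up, each base is assumed to carry 2 bits (four possible bases), and `code_rate` is a hypothetical stand-in for the share of stored bits left for payload after forward error correction.

```python
def capacity_bytes(base_pairs, usable_fraction, code_rate, bits_per_base=2):
    """Estimate payload capacity in bytes.

    base_pairs      -- genome length in base pairs
    usable_fraction -- share of the genome assumed free for data
    code_rate       -- payload bits / stored bits after error correction
    bits_per_base   -- 2, since each base is one of four symbols
    """
    usable_bases = base_pairs * usable_fraction
    payload_bits = usable_bases * bits_per_base * code_rate
    return payload_bits / 8

genome = 5_000_000  # the ~5M base pairs quoted above

# Half the genome usable, no error-correction overhead.
raw = capacity_bytes(genome, usable_fraction=0.5, code_rate=1.0)

# Same, but half of the stored bits are spent on redundancy.
with_fec = capacity_bytes(genome, usable_fraction=0.5, code_rate=0.5)

print(raw, with_fec)
```

Under these assumptions, 2.5 million usable bases come to about 625 kB before error correction and about 312 kB with a rate-1/2 code, so the "2.5Mb" figure reads best as megabases rather than megabytes.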

    4. Re:Uhh, perhaps not. by Abraxsis · · Score: 1

      Personally I find this idea amazing. But in response to the problem of mutation I would imagine that a technology could come along soon that would allow the DNA to be changed so that self-induced mutations wouldn't appear. I would imagine that this kind of technological jump isn't a self sustaining technology. It would HAVE to have other supporting technology to make the initial idea stable. Sort of like the research in prolonging the cellular production of telomerase to literally eliminate programmed cell death, and we got the idea from CANCER no less. The number one killer could be our ticket to immortality. And, vice versa, turning off telomerase makes cancer cells mortal again, allowing tumors to die off like normal cells. Ah well, so goes life.

    5. Re:Uhh, perhaps not. by smoondog · · Score: 2

      Actually this is not quite correct. As I understand it, telomere shortening is a system engineered into the cells. Turn the system off, and telomeres don't shorten any more. Mutations, on the other hand, are difficult to prevent. This is because DNA damage happens spontaneously and often. This DNA damage makes fidelity very, very difficult. Our cells spend a lot of time preventing this.

      -Sean

  13. Don't wash your hands... by dagg · · Score: 2

    Because if you use your new computer after washing your hands with anti-bacterial soap, you could kill all the little buggers.

    --
    Sex - Find It
    1. Re:Don't wash your hands... by orthogonal · · Score: 2
      Because if you use your new computer after washing your hands with anti-bacterial soap, you could kill all the little buggers.

      Let me save /. some time and effort here:

      1. [Please insert one of

      smelly unwashed ego-maniacal GNU/lixux anti-Stallman
      or

      smelly un-washed Moutain Dew-swilling linux geek
      or

      smelly never-showering always-surrendering Frenchman
      'will be the only ones able to use this bacterial memory'
      joke here.]

      2. ???

      3. Profit.

  14. Yipes! by beatbox · · Score: 1

    You could literally program a virus by abusing a buffer overflow... well, a mean bacterium anyway...

  15. Bad news by Anonymous Coward · · Score: 1, Funny

    I suspect that this technology will be eagerly developed, at least until someone comes down with a bad case of Shakespeare's Macbeth in the lung.

  16. So, ah... by boomgopher · · Score: 1

    why is this better than magnetic or optical storage?
    Or the holographic/crystal whatever storage that's being developed?

    This seems, ah, messy.

    --
    Your hybrid is not saving the environment. Its purpose is to make you feel good about buying something.
  17. In other news... by pVoid · · Score: 5, Funny

    Scientists have discovered that humans and all life on Earth were just a discarded bacterial disk drive from a geek with pimples living in his mother's basement 5 million light years from the solar system.

    1. Re:In other news... by Anonymous Coward · · Score: 0

      It was also just discovered that God is exactly like your typical Slashdot reader.

    2. Re:In other news... by Exiler · · Score: 2

      That's a little harsh, we may be rare in the real world but we do exist.

      --
      Banaaaana!
  18. Doctors Visit by Kelerain · · Score: 1

    I can see it now:

    Patient: I think I'm dying.

    Doctor: RTFM!

  19. dna in violation of dmca by mjp9055 · · Score: 1

    It is only a matter of time before this becomes a violation of the DMCA.

    1. Re:dna in violation of dmca by fucksl4shd0t · · Score: 2

      It is only a matter of time before this becomes a violation of the DMCA.

      Now that you mention it, all the President has to do to get his way with contraceptives is get a law passed that says that every person immediately gets copyright over their DNA (grammar?). Then contraceptive devices themselves and even talking about them would be a violation of the DMCA.

      --
      Like what I said? You might like my music
  20. So that's why by Tablizer · · Score: 2, Funny

    I think the back of my fridge has the Library of Congress.

  21. relijjin by Anonymous Coward · · Score: 0

    Now just suppose that the "junk DNA" in the human genome is the documentation package for the machine code. Who wrote that manual?

    This looks like a religious statement. If so, I am outraged! If not, then I'm not outraged. Carry on.

  22. Talk about.... by Anonymous Coward · · Score: 0

    Potential Viral Code.

  23. So thats where it went.. by Kelerain · · Score: 1

    They have been looking for it for the past week!

  24. I wonder what bacteria would look like by Kasmiur · · Score: 1, Offtopic

    if we were to store Microsofts code in its DNA. Then give it a year to see what pops out.

    hmm.. then we could load up linux and see if it creates anything better.

    We could then put both the Microsoft bacteria in a dish with the linux bacteria and see who wins out.

    Would a beowulf cluster of microsoft bacteria be another form of cancer?

    wait wait
    so its out of everyones system.

    In Russian the bacteria puts DNA in you!
    or
    All your Disease are belong to us!

    blah

    --
    -THIS SPACE FOR RENT!
    1. Re:I wonder what bacteria would look like by Anonymous Coward · · Score: 0

      Do shut up, old chap. You are embarrassing yourself.

    2. Re:I wonder what bacteria would look like by orthogonal · · Score: 2

      I wonder what bacteria would look like if we were to store Microsoft[']s code in its DNA. Then give it a year to see what pops out.

      After a year, a new EULA pops out. If you want the Service Pack that fixes the compromised immune system DNA, you have to agree to the EULA, which installs the automatic apoptosis DNA, forcing what the EULA euphemistically calls a "planned obsolescence of all cellular function" just in time for the rollout of MS-DNA 2010.

  25. Ummmm..... by Necrotica · · Score: 0

    This could be interesting. Since bacteria mutate (or evolve, as it were) how would that affect previously stored data? Methinks the outcome would be funny. Save my resume one day, the next it becomes microbiotic porn. Cool!!

    I didn't read the article, but seriously, who in their right mind would save presumably important data on a mechanism that constantly changes?

  26. man man by 5alligator · · Score: 1

    Yep, it's in here, all right. OK, i've only just glanced at it, but it says some stuff about environment, path (jeez, maybe those buddhists are right...), and something about cats. And to think that i believed all of that stuff about apes. Cats - who woulda thunk?!

  27. There is a kind of bactera by autopr0n · · Score: 5, Interesting

    That keeps four copies of its DNA in rings and error checks constantly. They're probably using one of these, as it happens to be very radiation resistant, and so the mutation rate would be very, very low. So it wouldn't keep forever, but it would for a very long time.

    You could also put in error checking (parity, checksums, etc.) so once you found some bacteria you could check to make sure they had the right version and not a mutation.

    --
    autopr0n is like, down and stuff.
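The checksum idea above might look something like this. It is a minimal sketch, not anything from the article: the helper names (`crc_bases`, `protect`, `is_intact`) and the base-to-bits mapping are made up for illustration, and CRC-32 from Python's standard `zlib` module is just one possible checksum.

```python
import zlib

BASES = "ACGT"  # assumed mapping for illustration: A=00, C=01, G=10, T=11

def crc_bases(seq):
    """Return the CRC-32 of a base string, itself written as 16 bases."""
    crc = zlib.crc32(seq.encode("ascii"))
    # 32 bits -> 16 bases, most significant bit pair first.
    return "".join(BASES[(crc >> shift) & 0b11] for shift in range(30, -2, -2))

def protect(seq):
    """Append the checksum bases to the payload before storing it."""
    return seq + crc_bases(seq)

def is_intact(stored):
    """Recompute the checksum of the payload and compare with the stored one."""
    payload, check = stored[:-16], stored[-16:]
    return crc_bases(payload) == check

stored = protect("ACGTTGCAACGT")
mutated = "T" + stored[1:]  # simulate a point mutation in the first base

print(is_intact(stored), is_intact(mutated))
```

A checksum like this only says whether a given bacterium still carries the right version; picking the correct version back out of a mutated population needs redundancy on top, such as the multiple copies mentioned above.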
    1. Re:There is a kind of bactera by Anonymous Coward · · Score: 0

      ...You could also put error checking (parity, checksums, etc) so once you found some bacteria you could check to make sure they had the right version and not a mutation
      of course, you'd use CVS to keep the right version at hand...

  28. Bacteria Have No Introns and Other Considerations by mustermark · · Score: 5, Informative

    Just to be clear, no non-coding segments have been found in bacteria yet (last I heard). So putting data in as 'junk-DNA' in humans is quite a bit different from interrupting a fully functional bacterial DNA segment with the data to be stored.

    Also note that the introns in eukaryotes are highly mutable (look up 'tandem repeats' if you have the inclination), so the fidelity of the data would be sacrificed by putting it there. The longest lifetime for the data would be achieved by tricking the replication machinery into thinking the segment was an exon, which would involve tying it to a functional protein that would be absent were the sequence to be mutated.

    Duplication of the data would also work, but it would only hammer down the probability of mutation, since the probability of a point mutation of a base at the same location in two widely separated sequences is roughly 10^-18 to 10^-17 per year for exons.
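The duplication figure can be read as a back-of-the-envelope product. This is a sketch under assumptions not stated in the post: p is a hypothetical per-base, per-year point-mutation rate, and the two copies are assumed to mutate independently. The rates below are chosen only to reproduce the quoted range.

```latex
\[
  P(\text{same position mutated in both copies within a year}) \approx p \cdot p = p^{2}
\]
% Assumed illustrative rates, not measured values:
\[
  p = 10^{-9} \;\Rightarrow\; p^{2} = 10^{-18},
  \qquad
  p \approx 3 \times 10^{-9} \;\Rightarrow\; p^{2} \approx 10^{-17}
\]
```

A single copy is hit with probability p, so a second copy squares an already small number, but the result never reaches zero.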

  29. The "Junk DNA" by bahwi · · Score: 2

    I'm sure the Junk DNA in the human genome, if they have anything to do with the Secret Message of Pi, or the Intelligence In Pi, then I'm sure it's written in the English Alphabet because that's what our Alien(Raelian?) ancestors wrote in. Haven't you seen Star Wars or Futurama?

    1. Re:The "Junk DNA" by Selfbain · · Score: 1

      Let me get this straight.. they are attempting to convert Pi into a language WE created? I somehow doubt any of our languages formed because they're naturally occurring forces.

      --
      Well, it has never been successfully tested.
  30. The Manual by E-Rock-23 · · Score: 2, Funny

    Who wrote that manual?

    And where the hell did they hide it? I've been trying to figure out the human race (more specifically, the female of the species) for years. Chicks are always telling me to RTFM, so hurry up and fork that thing over so I can get ahead (bad pun intended) in the world!

    --
    Blog Prophyts - Right On, Man
    1. Re:The Manual by kha0z · · Score: 1

      haha... i thought perhaps it was an attachment in one of their emails... i have been reading their email for a long time now and i can't find it... i have concluded that it doesn't exist. but i do know that we (men) know one thing about women: We know we know nothing about them.

      Knowledge is power!

      --
      kha0z
      Master of ImportChaos.com
    2. Re:The Manual by Anonymous Coward · · Score: 0

      The leading cause? Make that "excuse".

    3. Re:The Manual by Anonymous Coward · · Score: 0

      Stop trying to figure out females and you won't be stressed anymore...its worked for me, jus be an asshole/jerk/dick to them and they like it, y cuz they think they can change u...don't ask y, it jus does...

  31. 1000X radiation by Albinoman · · Score: 1

    This is important 'cause you know if we could travel the stars and we found an advanced race that had been wiped out, and there was a massive amount of radiation on the planet, the first place we would look for their history books is in the bacteria. (Yes, I'm being sarcastic.)

    I also haven't seen any posts commenting on how much pr0n could be stored in this.

    And before anyone gets to it, I'm going to patent a "method for backing up your pr0n collection in your own DNA". I won't be hindered by the fact that I don't have the slightest idea how to do this.

  32. What? by Anonymous Coward · · Score: 2, Interesting

    "the ultimate in archival storage (one of the modified bacteria can tolerate 1000X more radiation than a human being)."
    What kind of comparison is that? Are human beings presently used as archival storage in irradiated areas?
    Seems that the punched metal tape the Army uses for ultimate reliability is the way to go. Even if the stuff rusts, is radioactive and glowing red, you can still read it.

  33. No, you mean a virus by autopr0n · · Score: 2

    Actually, I think you just don't know what you're talking about. But you could 'overflow' (I think) the non-coding region and overwrite a part of the bacteria's DNA with the DNA for a virus, and perhaps the virus would come alive once that bit got read.

    hrm...

    --
    autopr0n is like, down and stuff.
  34. We wrote the manual! by SHEENmaster · · Score: 3, Funny

    Don't you people watch the outer limits?

    I'll probably write this code in sometime in the future. Human cloning is stealing and I will sue your ass for infringement.

    --
    You can't judge a book by the way it wears its hair.
    1. Re:We wrote the manual! by Styros · · Score: 2

      Don't you people watch the outer limits?

      I'll probably write this code in sometime in the future. Human cloning is stealing and I will sue your ass for infringement.


      Yeah, but then someone will hack your code, call it DeDNA, post it onto Kazaa, and then it's all down hill from there.

  35. this idea was proposed in NYT millenium issue by wisebabo · · Score: 1

    I believe the idea of storing and transmitting information via DNA was proposed by Jaron Lanier in the Y2K issue of the NYT magazine. The NYT was running a contest to come up with a "time capsule" that would last till Y3K and asked various prominent scientists, architects etc. how they would make something that would last and would be easily found by future generations. Lanier proposed encoding a message in the DNA of cockroaches and then letting them reproduce naturally in the wild. In a thousand years they'd be everywhere! (His idea didn't win; a more conventional capsule with physical records was selected.) Also, in various science fiction books (Greg Bear), messages were encoded in people's DNA.

    1. Re:this idea was proposed in NYT millenium issue by tbsmkdn · · Score: 1

      This idea was also proposed by David E.H. Jones in his Daedalus column (which now appears in Nature). His article on 31 January, 1985, entitled "Archival Junk" discusses the use of DNA to encode the essence of human culture in case of another Dark Age. The article also appears in his compendium, The Further Inventions of Daedalus (Oxford University Press, 1999).

      --
      How's that working out for you? What's that? Being clever.
    2. Re:this idea was proposed in NYT millenium issue by jericho4.0 · · Score: 2
      Isn't it the final reason given for the existence of the human race in 'Hitchhikers Guide to the Galaxy'? That humans were created to carry a message across time in DNA, but the receiver of the message had already known the content?

      Unless I'm totally wrong, of course.

      --
      "A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis
  36. After much testing... by Rui+del-Negro · · Score: 3, Funny

    Scientists have concluded that they can use a bacterium's DNA to store the complete description of... a bacterium. Revolutionary.

    What I really want to know is, can the same be done with the DNA of a bug? Because if it can, I'm going to buy some MSFT shares...

    RMN
    ~~~

  37. Re:Bacteria Have No Introns and Other Consideratio by Rainier+Wolfecastle · · Score: 5, Informative

    I think that you may have your terms a little mixed up. An intron is the DNA between exons (coding regions) in a gene. i.e.

    junk---junk---junk---exon-intron-exon-intron-exon---junk---junk---junk.

    The junk DNA often referred to is mainly intergenic DNA, and this is where most of the non-coding DNA is found. This also makes up the majority of the eukaryotic genome. Prokaryotes (bacteria) do contain intergenic DNA, but no introns.

  38. Well..... by mlg9000 · · Score: 1

    Guess you really WILL feel dirty for watching those porn divx movies you've wiped onto your MASSIVE petrie disk collection.

  39. kind of ironic isn't it? by circletimessquare · · Score: 2

    because the reverse has been true since before we were human beings. that is, viruses (i know, not bacteria, but certainly the same thing under the rubric of "bad guys" in the most pop science sense) have been storing their info in US since before we were human beings.

    it's only good hollywood movie justice that we should play switch up and start storing our information in THEM. ;-)

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
  40. quaternary vs. binary by paughsw · · Score: 1

    IAAB (I am a biochemist). This is an interesting concept from a computer science perspective. Imagine that instead of 1s and 0s you had 4 items (A, T, G, C), or more if you were creative, to choose from. You could pack a lot more data into a shorter string because you wouldn't be limited to an on/off state. Also, evolution/mutation takes a relatively long time, and a bacterium has a ton of different ways to correct for mutations. Also, information could be exchanged through bacterial sex, plasmids and the like.
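
To put a number on the density claim: with four symbols, each base carries two bits, so one byte fits in four bases. The sketch below assumes an arbitrary mapping (A=00, C=01, G=10, T=11) that does not come from the article, and the function names are made up for illustration.

```python
# Hypothetical 2-bit mapping; any fixed assignment of the four bases would do.
BASES = "ACGT"


def bytes_to_bases(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; expects a length that is a multiple of 4."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


print(bytes_to_bases(b"A"))  # CAAC  (0x41 = 01 00 00 01)
```

The base string is half as long as the equivalent string of binary digits, although the amount of information stored is the same.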

    1. Re:quaternary vs. binary by xamel · · Score: 0

      I used to think this was true, until I stopped to consider the following:

      WTF DO THOSE OTHER TWO SYMBOLS STAND FOR???

      You see, in binary, you have 1/0, or on/off. Fairly self-explanatory. However, add two more states. What are they? 1/2 on and 1/2 off? Almost on? Electricity only works two ways, so adding two digits to binary to make quaternary would be almost infeasible...

      --
      GOD DAMNIT , MODERATE ME!
    2. Re:quaternary vs. binary by Anonymous Coward · · Score: 0

      Well, correct me if I'm wrong mister IAAB, but ATG&C are paired with another. I don't remember which is paired with which, but let's say that A and T are paired, and G and C are paired. This would mean that A-T mean the same thing, and G-C mean the same thing, which results in you only having base-2.

    3. Re:quaternary vs. binary by Anonymous Coward · · Score: 0
      You're joking, right?

      Er, in case you're not joking just take a second and think about it. The binary system is simply a numbering system in base 2, because you have two states. Obviously, math still works if you change your numbering system to another base (we use base 10 commonly), and storing data digitally is just math.

      So, I guess you now have 4 states to choose from instead of 2... so you'd do your math in base 4. Or, more likely you'd translate from base 4 to base 2 and do your math there if for no other reason than current hardware works on base 2.

      e.g. (base 10) 43 = (base 2) 101011 = (base 4) 223
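
Conversions like this are easy to check with a few lines of code. The routine below is only a sketch (`to_base` is a name invented here); it writes digits most significant first, and it shows that each base-4 digit stands for exactly two binary digits.

```python
def to_base(n: int, base: int) -> str:
    """Return the digits of a non-negative integer n in the given base (2..10)."""
    if n < 0 or not 2 <= base <= 10:
        raise ValueError("n must be >= 0 and base must be in 2..10")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % base))  # collects least significant digit first
        n //= base
    return "".join(reversed(digits))


print(to_base(43, 2))  # 101011
print(to_base(43, 4))  # 223  (10 10 11 in binary, pair by pair)
```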

    4. Re:quaternary vs. binary by sponge_absorbent · · Score: 1

      so that explains the information unit used in Star Trek, quads!? but the other poster was right about going from binary to quaternary, it would not be a simple task.. sounds like something that could be done with spin states? (wild conjecture) anyway, i don't think this storage method is worth investing in until it becomes MUCH easier to sequence DNA. compare sustained data transfer speeds between today's hard drives and today's DNA sequencing machines.

    5. Re:quaternary vs. binary by paughsw · · Score: 1

      Imagine you are a bacterium. Then ask yourself: would you rather have 2 nucleotides or 4 nucleotides with which to code for proteins? with only 2, your codon size would grow from 3 to 4 (with a loss of a lot of redundancy in the process), a 33 percent increase. now multiply your entire genome length by 33 percent. it's more efficient to store data in this way. if you were to conceptualize files as exons you would find that bacteria are very clever in their file storage. exons overlap, some portion of one is read backwards for another. don't think computers. think data storage. we aren't talking about electricity, we are talking about mother nature.

    6. Re:quaternary vs. binary by reverseengineer · · Score: 3, Informative

      IAAB too (not the same one as above), and I have to say, sorry, you're wrong. Yes, adenine (A) pairs with thymine (T) and guanine (G) pairs with cytosine (C), but bases are not restricted to one strand of a double stranded DNA- A and T or G and C can be found in the same strand. In fact, there are some regions where sequences consisting of A's and T's or C's and G's together play a critical role, like a sequence of TATAAT (or similar) called the TATA box, which is recognized by RNA polymerase, and leads to initiation of transcription. Usually, all 4 bases are present in each of the two strands, and since there are three bases in each codon, 4^3, or 64, possible different amino acids can be coded for from a single codon. Now, there are only actually 20 amino acids that are coded for (there are a few exceptions to this that depend on specific context), so a few of the possible codons can be used to code for a stop in protein translation, and there is a redundancy built in called "wobble" that allows correct translation despite certain slight mutations.

      Now, although there are two strands in most DNA molecules, only one actually codes for proteins- the two strands are sometimes referred to as sense and nonsense (or antisense) strands. Both are involved in replication, however- a DNA helicase splits the two strands, and each acts as a template for a new complementary strand. And both can and usually do contain all four bases, with the concentration of each base within a single strand being unconstrained. Since the two strands in a double helix are complementary, however, the amount of adenine must equal the amount of thymine and the amount of cytosine must equal the amount of guanine in the double-stranded molecule as a whole. In fact, recognizing this relationship led to the realization that complementary base-pairing occurs. The original IAAB is correct though- the genetic code is indeed base 4- although nature has chosen to not use it to its full potential (i.e. code for 64 different amino acids) in favor of building in some redundancy.

      --
      "FDA staff reviewers expressed concern about the number of patients who were left out of the study because they died."
    7. Re:quaternary vs. binary by MillionthMonkey · · Score: 2
      Well, correct me if I'm wrong mister IAAB, but ATG&C are paired with another. I don't remember which is paired with which, but let's say that A and T are paired, and G and C are paired. This would mean that A-T mean the same thing, and G-C mean the same thing, which results in you only having base-2.

      Let me club you on the head with some ASCII art:
      AAAAATTTATTTAAAAAAAAA
      TTTTTAAATAAATTTTTTTTT
      According to your argument, this sequence contains no information.

    8. Re:quaternary vs. binary by paughsw · · Score: 1

      erm, make that a codon 5 nucleotides long.
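
The figure of 5 can be checked by counting. Assuming, as a simplification, that the code has to distinguish 20 amino acids plus at least one stop signal (21 meanings), the shortest workable codon is the smallest n for which k^n is at least 21, where k is the number of available bases. The helper name below is invented for this sketch.

```python
from itertools import product


def min_codon_length(alphabet_size: int, meanings: int) -> int:
    """Smallest n such that alphabet_size ** n >= meanings."""
    if alphabet_size < 2 or meanings < 1:
        raise ValueError("need at least 2 symbols and 1 meaning")
    n = 1
    while alphabet_size ** n < meanings:
        n += 1
    return n


MEANINGS = 21  # assumption: 20 amino acids plus one stop signal

print(min_codon_length(4, MEANINGS))  # 3, giving 4**3 = 64 codons
print(min_codon_length(2, MEANINGS))  # 5, giving 2**5 = 32 codons
print(len(list(product("ACGT", repeat=3))))  # 64 distinct triplets
```

Under that assumption, a two-base alphabet stretches each codon from 3 to 5 bases, roughly a 67 percent increase in length for the same protein.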

  41. No, you didn't steal someone's yogurt by kfg · · Score: 3, Funny

    You just ate the entire sum of human knowledge. Nice work Sparky. Now you might want to go looking for a Tums and start polishing up your resume.

    KFG

  42. there is no such thing as junk dna by Anonymous Coward · · Score: 0

    think about it

  43. The Matrix... by darekana · · Score: 4, Funny

    ...if only the machines had used the humans for data storage!
    Morpheus coulda pointed to a SAN/NAS box!

    Instead they make a duracell commercial and mumble about the "human body generating more bio-electricity than a 120-volt battery and over 25,000 BTUs of body heat."

    Ok I'll quit ze bitching... it was spiffy anyway.

    1. Re:The Matrix... by isorox · · Score: 4, Funny

      ...if only the machines had used the humans for data storage!

      Just remember to feel sorry for the guy that gets slashdotted

    2. Re:The Matrix... by Wraithlyn · · Score: 3, Interesting

      Yeah.. the fundamental reason why the machines kept the humans alive (energy generation) is a completely absurd contrivance. That always bugged me, but not enough to ruin the movie. It does force an unreal (In addition to surreal? Whoah...) feeling to the whole movie, makes you go, "OK, we're in comic book/fantasy land now". It's somehow not as gritty and dark as it could be if you could actually believe machines might eventually enslave (breed, really) our entire species in a virtual zoo for a reason you could actually swallow.

      They coulda used some wonky vague ass stuff about the machines figuring out a method of harvesting the untapped power of the human consciousness and I would've been happier... the mind could generate the power itself somehow (emotional energy perhaps?) or maybe act as a conduit for drawing energy from extradimensional space. It would also give em a reason to stimulate and develop healthy human brains via their Matrix simulation instead of just keeping em doped all the time.

      --
      "Mind, as manifested by the capacity to make choices, is to some extent present in every electron." -Freeman Dyson
    3. Re:The Matrix... by lameland · · Score: 2, Funny

      My complaint with it was, if all they needed was body heat and electrical impulses from our nervous system, why use humans? Why not use a large, stupid animal like cows?

      The matrix would have been easy as hell to code: a big field, lots of grass -- that's it. I don't think you'd have any bovine Keanu Reeves breaking out (although a cow that knows Kung-Fu would be pretty damn funny).

    4. Re:The Matrix... by matrix29 · · Score: 2

      ...if only the machines had used the humans for data storage!
      Morpheus coulda pointed to a SAN/NAS box!


      I think that was some of the subtext of The Matrix movie. Humans were used as storage and processing as well as batteries and capacitors.

      Remember the "power from fusion" line? Where exactly would you dump the extra electrical output until you required it later? You'd store it in a stable chemical form (catalytic thermal salts) or highly combustible forms (gasoline or hydrogen).

      Why else would the A.I. even BOTHER keeping the humans alive in the first place unless it was required to do so for its own survival. Note that The Matrix guardians were seeking the keys to the "Zion Mainframe" so they could "leave this world". That means it was trapped there because it could not leave of its own free will.

      I assume the sequels will go into more detail on this. Of course I COULD be reading the subtext wrong and misinterpret the "Zion Mainframe" as a mobile computer storage & processing ship in place perhaps of a worldwide space defense system in orbit or on the moon (thus preventing The Matrix's A.I. from leaving the planet in a more literal sense as all attempts to leave end up with it being shot from the sky).

      --
      "Face it, a nation that maintains a 72% approval rating on George W. Bush is a nation with a very loose grip on reality.
    5. Re:The Matrix... by Dirtside · · Score: 2
      ...if only the machines had used the humans for data storage!
      Yeah! Then we could make a Beowulf cluster out of Beowulf himself! Erm, wait...
      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
    6. Re:The Matrix... by Kaz+Riprock · · Score: 3, Funny

      . It does force an unreal feeling to the whole movie, makes you go, "OK, we're in comic book/fantasy land now".

      Right...the leaping from building to building, spider robots, and 'faster than 5 speeding bullets' were fine...it was the human battery plot that made it seem like fantasy...

      --
      Mordor...a magical, mythical land where women are more rare than dragons--but where every man would rather find a dragon
    7. Re:The Matrix... by Wraithlyn · · Score: 2

      "Right...the leaping from building to building, spider robots, and 'faster than 5 speeding bullets' were fine...it was the human battery plot that made it seem like fantasy..."

      Well yeah, actually it was fine... :) The power jumping and bullet dodging and stuff merely took place in a SIMULATION. The Matrix. AI controlled virtual reality with a direct brain interface. Certainly conceivable within 40 years or so. And, the "spider robots" well, if you accept AI, designs like that, centuries from now, should be simple.

      But using human bodies to generate more energy than you put into them? Sorry, can't buy that. That's not a question of being too far-fetched, it's simply not mathematically possible.

      --
      "Mind, as manifested by the capacity to make choices, is to some extent present in every electron." -Freeman Dyson
  44. Yoghurt Harddrives? by v8interceptor · · Score: 1

    Yeah!!!! Hmm, I hope I don't get hunry...

    --
    --- Why are you wearing that stupid bunny suit? | Why are you wearing that stupid man suit?
    1. Re:Yoghurt Harddrives? by v8interceptor · · Score: 1

      And if I do I hope I remember the G!

      --
      --- Why are you wearing that stupid bunny suit? | Why are you wearing that stupid man suit?
  45. Bacteria dna code? by fucksl4shd0t · · Score: 2

    So, they decode some of the bacteria dna at some point thinking that maybe there's some important information left there, and they come up with:

    #include "stdio.h"

    void main(void)
    {
        printf("Hello, world!\n");
    }

    Now just suppose that the "junk DNA" in the human genome is the documentation package for the machine code. Who wrote that manual?

    The article posting was obviously just someone using it as a steppingstool to push their own preconceived notions of science upon us. I declare the article a troll.

    --
    Like what I said? You might like my music
  46. Store data by absurdhero · · Score: 1

    Oh good! I won't be needing a palm pilot any more. I can just write on my hand. Well, the cells in my hand.

  47. relevant star trek reference by marksilverman · · Score: 1

    The crew of the Enterprise found out about this a while back. Turns out an ancient alien race encoded a trite message of good will in the combined DNA of Humans, Cardassians, Klingons and Romulans. This was part of an attempt to explain why all the intelligent life forms all over the universe look suspiciously like humans with different forehead wrinkles.

  48. Sexy Manual! by Anonymous Coward · · Score: 0

    http://slashdot.org/~$$$$$exyGal

    +-----------+
    |There once |
    | was a girl|
    | named |
    | |
    |$$$$$exyGal|
    | |
    | She was |
    | very |
    | very |
    | very |
    | *horny |
    | |
    | She would |
    | walk |
    | around |
    | her |
    | house |
    | completly |
    | nude. |
    | |
    | She left |
    | her |
    | curtains |
    | open |
    | so you |
    |could watch|
    | |
    | come! |
    +-----------+

    http://slashdot.org/~$$$$$exyGal

    1. Re:Sexy Manual! by Anonymous Coward · · Score: 0

      For the curious, Emacs has a nice mode for drawing such ASCII brochures. Type "M-x picture-mode" to use the picture mode.

      The M'gmt

  49. Swapping Data by Charcharodon · · Score: 1

    If they can put all the data they want into bacteria then downloading files could get pretty kinky. My question would be how would antivirus work (pill or hard drive?), and how would computer geeks ever get any new software?
    She said she downloaded that MP3 from a safe place doc, but after I got the song from her it started burning when I pee.

  50. Re:Bacteria Have No Introns and Other Consideratio by skeedlelee · · Score: 5, Insightful

    Just to be clear, no non-coding segments have been found in bacteria yet (last I heard).

    My first impulse was that this is way off. I'm used to working with plasmids where frequently like 60% of the sequence is junk. They use E. coli and D. radiodurans in the study mentioned in the article. A brief survey of E. coli K12 (the parent of most common lab strains) sez that about 5-10% of it is non-coding. The old initial reference claims about 11% is non-coding, but a good chunk of that may be regulatory. The radiodurans genome is about 9% non-coding. The upshot is that there is actually a fair amount of 'junk DNA' in bacterial genomes (at least in coli). Not a lot by human standards but enough to be able to squeeze in a chunk here or there if you're careful.

    Another impulse was 'gad... that made it into Nature!?' (the journal, the article cited is a self congratulatory summary of their Nature paper). A lot of it follows a well duh kind of reasoning. 'Well duh' science is often the really good kind, but I wasn't particularly amazed by this. The DNA manipulation methods are beyond standard now, the only really clever thing was proposing the use of radiodurans as the host. Even that was sort of obvious (a blazingly well studied organism that is transformable). The DNA -> text using a 6 bit space? Well if you've ever designed linker regions in proteins I'm sure you've at least thought about spelling out your name or something in amino acids (unless your name is BOB). In part this is because everyone learns the amino acids by doing stupid things like spelling out their name. Few people actually do this, mind you, as it usually would have some deleterious effect, but the point is I'm sure they weren't the first ones to try something like this, probably just the first to get funded to do this explicitly. Their big addition was to come up with a 3-letter code that includes all the letters and, ooo, punctuation. Then they spelled out bits of 'It's a small world.' My point is that it's not that far fetched and a bit surprising (to me) that it made it to Nature.

    As to the utility of these things for information carriers... Mutation would be a problem in the long term. Sure radiodurans would survive nuclear war (these guys put cockroaches to shame) but they do it using lots of mismatch repair and recombinational repair methods. These are not perfect repair systems, they can and frequently do introduce many errors, especially in non-essential DNA space. Tying it to a functional protein isn't a bad idea, but unless the added sequence adds some survival advantage it won't enhance the lifetime of the message (i.e. only if uncorrupted data gives an advantage are corrupted copies statistically less likely to propagate). Also, as you mentioned, the bacterium might notice long chunks (they're using 100 characters here) of useless DNA and excise it. For that kind of text, it might be better to just etch it into stone or something, at least you have some hope of seeing it intact in 2000 years.

  51. Since these are bacteria, it'd probably be like: by Dthoma · · Score: 2

    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        int i = 0;
        /* Each fork() doubles the process count, so 20 passes yield 2^20 processes. */
        for (i = 1; i <= 20; i++) {
            fork();
        }
        return 0;
    }

    /* Whatever made you think that bacteria wouldn't be ANSI compliant? */

    --

    Note to M1-ers: a curt but otherwise insightful message is not "Flamebait" or "Troll".

  52. what's kinda neat about that junk dna... by tfoss · · Score: 2
    Now just suppose that the "junk DNA" in the human genome is the documentation package for the machine code.

    On a more realistic note, that junk DNA is probably more like a revision history of life. Many scientists are of the opinion that a significant portion of the junk DNA is really the product of virus infection way back in the evolutionary tree. Many viruses can copy their DNA into the host's genome where it will be propagated throughout life, and potentially into offspring. If this infection happens in the wrong section of a host's genome, the DNA is never read and while the virus doesn't propagate, its DNA will. Do this over a time scale of millions to billions of years and you get a lot of leftover virus DNA hanging around silent.

    So basically, some large percentage of your DNA is really just virus turds.

    -Ted

    --
    -=-=- Quantum physics - the dreams stuff are made of.
  53. Prankster-prone technology. by Anand_S · · Score: 4, Funny

    "All right. Which one of you bastards put the penicillin in my hard drive?"

  54. Is there a connection here? by euxneks · · Score: 2

    So would Bacteria that had better information kill out the other bacteria with lame information?
    Sounds like a primitive form of Slashdot ratings!!

    --
    in girum imus nocte et consumimur igni
  55. Junk DNA isn't the manual for our genetic code... by Dthoma · · Score: 2

    ...it's the statically linked libraries!

    *ducks*

    --

    Note to M1-ers: a curt but otherwise insightful message is not "Flamebait" or "Troll".

  56. Full text of article, for those of you who care by Anonymous Coward · · Score: 1, Informative

    A data preservation problem looms over today's information superhighway. Ancient humans preserved their knowledge by engraving bones and stone. About two millennia ago people invented paper to publish their thoughts. Today, we use magnetic media and silicon chips to store our data. But bones and stone erode, paper disintegrates, and electronic memory degrades. All these storage media require constant attention to maintain their information content, and all are easily destroyed by people and natural disasters, whether intentionally or by accident. In light of the vast amount of information being generated every day, it's time to find a new medium.

    Searching for an inexpensive, long-lasting medium for information storage, scientists at the Pacific Northwest National Laboratory (PNNL) are investigating deoxyribonucleic acid--commonly known as DNA--to develop a data memory technology with a life expectancy much greater than any existing counterpart. Our initial DNA memory prototype consists of four main steps: encoding meaningful information as artificial DNA sequences; transforming the sequences to living organisms; allowing the organisms to grow and multiply; and eventually extracting the information back from the organisms. Here we describe the objective of this investigation, which began in 1998, and experiments we've conducted to determine the feasibility of our approach, as well as several potential applications.

    Nature magazine reported a study [1] resembling the first part of our effort--encoding meaningful information as DNA sequences. It described an experiment in which a group of scientists at Mount Sinai School of Medicine in New York created an encoded DNA strand and hid it behind a period (a dot) in a printed document. The document was then sealed and mailed to its owners through the U.S. Postal Service. Eventually, the embedded message was successfully recovered in a laboratory environment.

    The article reported that the embedded information survived its rough handling in the mail, proving that a DNA strand can be as dependable as a piece of paper in terms of information storage. It is, however, still far from being able to outlast existing data-memory devices. In fact, a naked DNA molecule is easily destroyed in any open environment inhabited by people or potential enemies of nature. The so-called "double-strand break" of DNA, which is usually fatal, can be caused by common unfavorable environmental conditions, including excessive temperature and desiccation/rehydration. Even nucleases (a kind of DNA-degrading enzyme) in the environment can corrupt DNA molecules over time. Therefore, a key to our success is finding a super-dependable storage medium to ensure adequate protection for the encoded DNA strands. Our solution is to provide a living host for the DNA that tolerates the addition of artificial gene sequences and survives extreme environmental conditions. Perhaps more important, the host with the embedded information must be able to grow and multiply.

    Challenges

    Recent advances in genetic engineering have allowed the introduction of foreign DNA molecules into the living cells of bacteria, humans, and other organisms. Typically, a short, one-of-a-kind, well-researched DNA strand is applied to a living host for some particular biological study, with little or no intention of retrieving the embedded DNA afterward. This process is somewhat contrary to our basic DNA memory requirements that new and artificial DNA be generated frequently and that we be able to retrieve the embedded information afterward. These requirements pose serious challenges to our DNA memory design due to the size of a whole genome, which ranges from a few million DNA units in a bacterium to more than three billion in a mouse or human. It is practically impossible to retrieve an embedded message from a whole genome in a wet laboratory without knowing the content or whereabouts of the encoded DNA. The unpredictable nature of genomic mutation represents yet another obstacle, further reducing the odds of locating the message within a whole genome.

    Experimental Design

    The customized computational and wet-laboratory approach we developed leaves a trail of the embedded message for later retrieval while allowing us to preserve the integrity of the message. Our experiments were carried out in four primary stages:

    DNA host identification. In the process of identifying candidates to carry the embedded DNA molecules, we considered microorganisms (such as bacteria) and other agents (such as plants, including Arabidopsis) as message hosts. We eventually selected two well-understood bacteria--Escherichia coli (E. coli) and Deinococcus radiodurans (Deinococcus)--because microorganisms generally grow quickly and embedded information can be inherited quickly and continuously. We also considered the physical endurance of the DNA host candidates. Deinococcus, we learned, survive extreme conditions, including ultraviolet, desiccation, partial vacuum, and ionizing radiation up to 1.6 million rad, or radiation absorbed dose (about 0.1% of this dose is fatal to humans); some strains of Deinococcus also tolerate high temperatures.

    Information encoding. The four basic building units in DNA are bases called deoxyribonucleosides. In the biology literature, they are usually labeled A (Adenine), C (Cytosine), G (Guanine), and T (Thymine). Each base always bonds with another base (such as A with T and C with G). A single chain of these bases is called an oligonucleotide, or oligo. It pairs with another complementary oligo (such as GATCG with CTAGC) to form a double-stranded DNA. Each of these AT or CG pairs in a double-stranded DNA is called a base pair.
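
The pairing rule amounts to a four-entry lookup table. The sketch below reproduces the article's GATCG/CTAGC example; the function names are illustrative only. Note that the article writes the complement position by position, whereas sequence databases usually write the second strand in the opposite direction (the reverse complement), because the two strands of a double helix run antiparallel.

```python
# Watson-Crick pairing: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(oligo: str) -> str:
    """Return the base-by-base complement of a single-stranded oligo."""
    return "".join(PAIRS[base] for base in oligo.upper())


def reverse_complement(oligo: str) -> str:
    """Complement read in the opposite direction (the usual 5'->3' notation)."""
    return complement(oligo)[::-1]


print(complement("GATCG"))          # CTAGC
print(reverse_complement("GATCG"))  # CGATC
```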

    Because a DNA sequence is digital, we can use them to construct any English text, just as we use binary numbers 0 and 1 to encode ASCII characters. Table 1 outlines the encoding table of our experiments using a set of triplets--a DNA sequence with any three of the four bases--the exact encoding scheme for our initial experiment. Note several triplets listed at the end of the table are open (intentionally) for later expansion.

    Unique DNA searching. The whole genomes of E. coli and Deinococcus have been completely sequenced and are available from The Institute for Genomic Research (www.tigr.org). Our task is to identify a set of fixed-size sequences (20-base-pair long in our experiments) that do not exist in the candidate bacteria yet satisfy all the genomic constraints and restrictions. This process is critical to our experiment, as we do not want to cause unnecessary mutation or damage to the bacteria. The resultant sequences also serve as sentinels to tag the beginning and end of the embedded messages--similar to file headers and footers in magnetic tape--for later identification and retrieval. Of the 10 billion potential candidates in the bacterium Deinococcus, we found through intensive computation only 25 qualified sequences that would be acceptable for our experiments. These sequences (see Table 2) serve as blueprints for chemically synthesizing oligos for subsequent steps in our experiments. The multiple triplets (such as TAA, TGA, and TAG) seen in many of the sequences are called stop codons and tell the bacterium repeatedly it has reached the end of the native DNA sequence and should stop translating its contents. Without the protection of stop codons, the bacterium could misinterpret the encoded information and produce artificial proteins that could destroy the integrity of the embedded message or even kill the bacterium.

    Wet laboratory procedures. We conducted four main procedures:

    Create complementary oligos. We started by creating two complementary oligos, each with 46 bases and consisting of two different segments of 20-base-long oligos connected by a six-base-long restriction enzyme site. The two 20-base-long oligos were based on two different sequences listed in Table 2. Enzymes that recognize a specific sequence of double-stranded DNA and that cut the DNA at that location are known as restriction enzymes. We created a restriction enzyme site for later insertion of encoded DNA fragments. These two 46-base-long complementary oligos form a double-stranded, 46-base-pair DNA fragment. The DNA fragment was then cloned into a recombinant plasmid--a union of foreign DNA fragments into a circular DNA molecule (see Figure 1). Because the two 20-base-long oligos do not exist in the genome of the host, they serve as identification markers for later message retrieval. The stop codons in these two oligos also help protect the message, as well as the host, from potential damage.

    Insert DNA. The embedded DNA was then inserted into cloning vectors--a circular DNA molecule that can self-replicate within a bacterial host. The resultant vectors were then transferred into E. coli by electroporations (high-voltage shocks), allowing the vector with the encoded DNA fragment to multiply for later applications.

    Incorporate into genome. The vector and the encoded DNA were then incorporated into the genome of Deinococcus for permanent information storage and retrieval. Deinococcus granted perfect protection for the embedded message, as it tolerates extreme desiccation, high doses of radiation, high levels of organic solvent, and vacuum-pressure environments, as shown in our experiments.

    Extract the message. Finally, whenever embedded information was needed, we extracted the message part of the DNA strand from the bacterium through a laboratory procedure called polymerase chain reaction.
    Using prior knowledge of the sequences at both borders of the segments, it proceeds through a series of heating and cooling cycles to amplify the DNA segment. The whole process took about two hours. Figure 2 shows a machine readout of our DNA analysis and its English interpretation at the top.

    Enormous Potential Capacity

    We successfully stored and retrieved seven chemically synthesized DNA fragments with 57-99 base pairs of non-native information in seven different individual bacteria. Even without further technology improvement, the capacity of bacterial-based DNA memory can be expanded dramatically by storing different pieces of information in a population of bacteria; for example, the seven bacteria in our experiment carry different parts of the children's song "It's a Small World" [2] in their genomes. Considering that a milliliter of liquid can contain up to 10^9 bacteria, the potential capacity of bacterial-based DNA memory is enormous, assuming we have a well-designed data index scheme.

    A potential challenge is the mutation of the organisms affecting the integrity of the embedded messages. Although a bacterium can be selected with a low mutation rate, random changes still occur. Nature has had to deal with this problem since the beginning of biological evolution and developed mechanisms for detecting and correcting errors. With the extremely efficient DNA repair mechanisms associated with Deinococcus, we did not detect any mutations in our experiment in which we retrieved the DNA after the bacteria that carried the message was allowed to propagate for about a hundred generations. However, the mutation rate may depend on a specific sequence and the bacterium's genetic background.

    DNA Memory Applications

    Most of the potential applications of this organic data memory technology relate to the core missions of the U.S. Department of Energy (DOE). Other security-related applications include information hiding and data steganography for commercial products, as well as those related to national security.

    As one of nine DOE national laboratories, a major PNNL concern is protecting information in case of nuclear catastrophe. Suppose the U.S. experienced a devastating nuclear disaster and the national information infrastructure was paralyzed or deactivated by radiation and fire. Suppose we had planted critical relief information in certain bacteria (such as Deinococcus) that could live and multiply independently without human intervention. Suppose these data hosts could survive high doses of radiation and other extreme conditions. All critical information would therefore be available upon the arrival of a disaster relief team.

    The research into and development of sterile seeds--yield one crop, then terminate--has prompted recent controversy, especially in the farming community throughout the U.S. The competition between the proprietary rights of seed companies to protect their investments and the overwhelming need of poor farmers in third-world countries who cannot afford new seed every year will probably continue until a practical solution emerges. Suppose the seed companies were able to put unique DNA watermarks based on our technology in all their seeds. They could effectively track their sales and protect their proprietary products against illegal planting by greedy farmers without affecting the needy farmers.

    Remediating environmental pollution in the U.S. has been a PNNL core mission since the 1980s. PNNL scientists periodically drill sampling wells and collect soil samples to monitor the migration of pollutants that might contaminate the U.S.'s natural resources, including water. Suppose we were able to put enough information in a bacteria population in the water and update it continuously and progressively according to the bacteria's spatial and temporal distribution. The bacteria would eventually provide both a chronological overview of the migration and a complete local database useful to scientists. The same technological approach could also be used to study endangered species; for example, a DNA watermark in the subject's genome could replace other artificial identification means (such as microchip implants).

    For the computer science and information technology communities, suppose people could safely and permanently store their personal information (such as family history and medical data) in the cells in their own bodies. Suppose we could replace computer disks with our bodies as a primary memory storage medium. Such options no longer represent speculative science fiction. All are potentially accomplished through organic data memory based on DNA. The DNA memory described here is neither impossible nor impractical--only challenging.

    Conclusion

    With a careful coding scheme and arrangement, important information can be encoded as an artificial DNA strand and safely and permanently stored in a living host. In the short run, this technology can be used to identify origins and protect R&D investment in, say, agricultural products and endangered species. It can also be used in environmental research to track generations of organisms and observe the ecological effect of pollutants. The microorganisms that survive heavy radiation exposure, high temperatures, and other extreme conditions are among the perfect protectors for the otherwise fragile DNA strands that preserve encoded information. Finally, living organisms, including weeds and cockroaches, that have lived on Earth for hundreds of millions of years represent excellent candidates for protecting critical information for future generations.

    References

    1. Clelland Taylor, C., Risca, V., and Bancroft, C. Hiding messages in DNA microdots. Nature 399 (June 10, 1999), 533-534.
    2. Sherman, R.M. and Sherman, R.B. It's a Small World. Walt Disney Enterprises, Inc., 1963.
Authors Pak Chung Wong (pak.wong@pnl.gov) is a chief scientist in the Energy Science and Technology Division at the Pacific Northwest National Laboratory, Richland, WA. Kwong-Kwok Wong (kkwong@txccc.org) is an assistant professor at the Baylor School of Medicine and the Director of Microarray Laboratory at Texas Children's Cancer Center, Houston, TX. This research was conducted while he was a senior research scientist at the Pacific Northwest National Laboratory, Richland, WA. Harlan Foote (harlan.foote@pnl.gov) is a senior research scientist in the Energy Science and Technology Division at the Pacific Northwest National Laboratory, Richland, WA. Footnotes The Pacific Northwest National Laboratory is operated for the U.S. Department of Energy by Battelle Memorial Institute under contract DE-AC06-76RL0.

  57. Its NOT junk DNA... by Anonymous Coward · · Score: 2, Informative
    Its the CVS repository


    _

  58. Ah... they read Bear? by Anonymous Coward · · Score: 0

    *cough* Blood Music anyone?

    http://www.sfsite.com/02a/bm121.htm

  59. Re:Since these are bacteria, it'd probably be like by giel · · Score: 2

    #include <sys/types.h>
    #include <unistd.h>

    int
    main(void) {
        fork();
        main();
    }

    /* It should be written like this: simple, accurate and destructive. */

    --
    giel.y contains 2 shift/reduce conflicts
  60. Old technology... by Anonvmous+Coward · · Score: 2

    People have been storing data via bacteria for as long as I can remember. Without fail, my mother always knew when it was time for me to bath.

  61. Jesus fucking christ on a vibrating bed. by Anonymous Coward · · Score: 0

    Nobody believes in 'junk DNA'. It's a stupid media buzzword. Ask any geneticist, any at all, whether they consider 'junk DNA' to be a misnomer or not. If the unknown equalled "junk", there would be no scientists. Go figure.

    This has to be the 434340930493rd article where the presenter considers himself clever because he sees an insight... that everyone else does, too. Give it up. The abstract is interesting, if lacking in news or useful information, but its presentation is nothing but annoying.

    1. Re:Jesus fucking christ on a vibrating bed. by matrix29 · · Score: 3, Insightful

      Nobody believes in 'junk DNA'. It's a stupid media buzzword. Ask any geneticist, any at all, whether they consider 'junk DNA' to be a misnomer or not. If the unknown equaled "junk", there would be no scientists. Go figure.

      This has to be the 434340930493rd article where the presenter considers himself clever because he sees an insight... that everyone else does, too. Give it up. The abstract is interesting, if lacking in news or useful information, but its presentation is nothing but annoying.


      The easiest way to disprove the "junk DNA" is to remove the "junk DNA" and see if the organism still works. Take for example a computer program where "junk code" is removed. If the program still runs then the code might not be important. However, the "junk code" could be comment code not removed by the compiler, error checking code (which will not activate unless the program hits an overflow then all heck breaks loose), or even just graphic data which would allow a program to run (but with a corrupted image display).

      The basic truth of "junk DNA" is that unless somebody has a "decompile into a higher level language" device then removed code could cause all sorts of things to go GOOEY later on when certain conditions are met. Heck, if we look back at the early days of BBS protocols you'd remember the FOO junk padding code at the end of many ZIP files just to compensate for buggy data transmission protocols. That padding allowed a certain amount of send errors at the end of a file to be tolerable while keeping the important parts of the file intact.

      --
      "Face it, a nation that maintains a 72% approval rating on George W. Bush is a nation with a very loose grip on reality.
  62. just imagine by Anonymous Coward · · Score: 0

    the next home entertainment storage could come in a petri dish.... and the PR0N industry will be the first to take advantage of it.

  63. Reminds of that STTNG Episode by cosmosis · · Score: 4, Interesting

    Reminds me of that Star Trek episode The Chase, in which Dr. Galen, Captain Picard's old archaeology professor, found genetic data blocks from various species around the galaxy stored in the junk portion of each species' DNA, including our own. When a sufficient number of these data blocks were put together, they completed a stellar map identifying the precise location of the origin of life on our planet and countless others. The jury is still out on the Panspermia Theory, but my own hunch is that there is lots of intelligence out there vastly older and greater than we are.

    Planet P Blog - Liberty with Technology.

  64. Great, we get a memory fault and.. by RumGunner · · Score: 2

    Hello, Black Plague!

    .

  65. Great by Cyberllama · · Score: 2

    Now you can write a virus that actually infects you. . .

    The right data "saved" creates some sort of deadly super-bacteria.

    Ok, maybe not. But it still seems like a bad idea for reasons I can't quite think of right now. . .

  66. Well, we are all dead by Anonymous Coward · · Score: 0

    write the wrong program, and you will create a disease that will wipe out humans. I'm a little more comfortable with dvd-r for now.

  67. Duplicate? by sheriff_p · · Score: 2

    I thought we'd already had the story about funny comments in code? I remember reading: // +5 Wand of obfuscation! hee hee

    in George W Bu... ooops, I've said too much

    --
    Score:-1, Funny
  68. The Matrix wasn't all that good by 0x0d0a · · Score: 2

    Seriously, what was *up* with that? I was thinking that too much was being made of the movie -- okay, good and new special effects, okay, grab a pretty basic philosophical idea, have some detailed fight scenes. And that makes it a great movie?

    I started laughing out loud when they did the power generation explanation.

    And when they started doing the "phones mysteriously transport you in and out of the Matrix" bit, the image that came to mind was the people first adapting to phones and thinking people could do things like poison them or reach through the phone across the phone line.

    I mean, as tech movies go, *Tron* was more plausible. Does none of this come off as *stupid* to anyone else?

    1. Re:The Matrix wasn't all that good by Anonymous Coward · · Score: 0

      Who gives a fuck. It's a fucking movie. Deal with it.

  69. What if we already are? by Anonymous Coward · · Score: 0

    -Alex

  70. Can tolerate lots of radiation, but... by billstewart · · Score: 2

    Keep it away from penicillin, and harsh chemicals, and mutation-inducing tobacco smoke (as opposed to head-crash-inducing smoke for older disk drives.) And the term "computer virus" acquires a whole new range of meanings....

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  71. Re:Since these are bacteria, it'd probably be like by Anonymous Coward · · Score: 0

    (defclass bacteria (organism parasite)
      (single))

    (defun breed ()
      (cons (make-instance 'bacteria) (breed)))

    ; ANSI compliant, with built-in GC and a flexible
    ; class definition (atop THE most flexible
    ; object system). Still, lethal and nasty, unless
    ; your compiler is schemish and has tail-call
    ; optimization.
    ; NOTE: block #| comments |# are also available

  72. Bio-steganography? by Anonymous Coward · · Score: 0

    Seems like this kind of storage would be just about perfect for sending covert intelligence.

    Infect a potentially unwitting agent with an "inoculation" before travelling abroad, where it could be recovered with a blood test.

    The fact that the message would be obscured or excised over time in the junk DNA would be a feature not a bug.


    Posted anonymously because I don't really give a crap about /. karma.

    1. Re:Bio-steganography? by chadjg · · Score: 1

      This could make getting interrogated by the CIA a lot less, or more, depending on which way you go, fun. So does this mean that the man that made Lady Justice put on a top now regulates our friends at http://www.trojancondoms.com?

      --
      Why do I have this? I don't smoke.
  73. Instead of bugs in the code... by clickety6 · · Score: 2

    ... we can now have code in the bugs?!

    --
    ----------------------------------- My Other Sig Is Hilarious -----------------------------------
  74. Technology behind Microsoft's next filesystem by Anonymous Coward · · Score: 0

    It's certainly buggy.

    Imagine a Beowulf... ah never mind.

  75. The 'dog-ate-my-homework' of the 21st Century... by Cruciform · · Score: 2

    "But teacher, my homework mutated!"

  76. True .... but what message to send?? by mlush · · Score: 2, Insightful

    You're right about Nature; to me it's more of a New Scientist article. (I recall seeing a paper in Biotechniques about encoding text in DNA some 5-6 years ago; I think that was for copyright messages.)

    Mutation may not be too much of a problem, as you could reconstruct the data by sequencing many different strains of the bug (a sort of bacterial TCP protocol: if the packet is corrupted, sequence a different strain).
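
    The "sequence a different strain" idea amounts to error correction by redundancy. A minimal sketch, assuming the same message has been read from several strains, the reads are of equal length, and mutations are substitutions only (no insertions or deletions), is a per-position majority vote. The function name consensus and the sample reads are invented for the example.

```python
from collections import Counter

def consensus(reads: list[str]) -> str:
    """Rebuild a message by majority vote at each position across reads."""
    if len({len(r) for r in reads}) != 1:
        raise ValueError("need at least one read, all of equal length")
    # zip(*reads) yields one column of bases per position in the message.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

# Five hypothetical reads of the same 9-base message; two carry one substitution each.
reads = [
    "AAGACAAAT",
    "AAGACTAAT",  # substitution at position 5
    "AAGACAAAT",
    "CAGACAAAT",  # substitution at position 0
    "AAGACAAAT",
]
print(consensus(reads))
```

    With only a few corrupted copies per position, the vote recovers the original; it offers no protection if the same mutation has spread through most of the population.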

    What I'd like to know is what sort of data would you send? Encoding the data would be a bit of a fiddle... but extracting the data would be an expensive, soul-destroying project requiring late 20th/early 21st century tech, and if the target descendants have that sort of tech there must be better ways of sending messages.

  77. Junk DNA by hackwrench · · Score: 0, Offtopic

    Names like that are funny. I don't know why people keep doing that. Here's another one: imaginary numbers. Anybody else got some?

  78. DNA and the secret of life by Bazman · · Score: 2

    Armando Iannucci had a comedy show on UK TV. In one sketch he explained how scientists were sending him his very own DNA code in weekly installments, in the form of strips of paper that he was pasting up on his wall as a sort of decorative frieze. He explained how DNA contained the secrets to life, and the camera panned round the lengthening strings of letters as Armando read them off:

    ADDGCTCTCTDONTPISSITAWAYAGGGDTTDONTPISSITAWAYGGG CC

  79. rael... by m1chael · · Score: 1

    is his real name earl?

    --
    I know you are psychotic, but please make an effort.
  80. The aliens beat us to it by sjames · · Score: 2

    All we have to do now is search the junk DNA in our own genome until we find that the encoded message reads:

    First Post!

  81. Shoulda, Coulda, Woulda.... by Dynedain · · Score: 2

    Crap....I knew I should have patented the idea when I had it....

    Talk about a great way to smuggle/hide data....in your own DNA..

    --
    I'm out of my mind right now, but feel free to leave a message.....
  82. Junk DNA isnt junk DNA by Zapdos · · Score: 2

    Research shows a gene repair role for the so-called junk DNA. The junk DNA can jump to chromosomes with broken strands of DNA, slip into the break and repair the damage. This is an essential function in keeping the cell alive.

    ref Nature Genetics, 1 June 2002, pp. 159-165

  83. Sorry there are intron-like things in prokaryotes by upstateguy · · Score: 3, Interesting

    Just had to throw in that there *are* non-coding intergenic sequences (akin to introns) and bunches of other non-coding goodies in prokaryotes including bacteriophages such as T4 (look back to the mid-80's).

    And if you consider RNA editing (where nucleotides are whacked out or modified prior to translation), you gain a tremendous amount of flexibility in the smaller genomes of these bugs.

    Of course, the long term storage they're looking at is best done by the spores of gram positive bugs, like Bacillus subtilis. When they're in this non-replicative stage, there is little chance of sequence alteration. And by having some 10^8 spores around, even if there were a few mucking things up, the majority would maintain the original sequence.

    But engineering a bug to not alter sequences is much more difficult than knocking out RecA. :-)

  84. Anyone else thinking of Greg Bear's 'Blood Music'? by Wirr · · Score: 1
    Using Introns for storage was the way the protagonist in Blood Music made the intelligent bacteria, which then destroyed all of North America. The Book

    Well, goodbye America then, it was nice knowing you. But I think I will miss Slashdot and memepool.

  85. Too good to ignore... by Doctor+Hu · · Score: 1
    Using bugs to store information. Cool.

    (There's probably scope for a list ending with Profit! here as well, but let that pass.)

  86. The real purpose of junk DNA: by mogrinz · · Score: 1

    Comments in the source code!

  87. Documentation in junk DNA: by Brian+Stretch · · Score: 3, Funny

    /* I know, I know, I should write more unit tests, but I've only got six days until my long vacation on the 7th and I'm not taking homework with me. Oh well, if I missed anything, it'll evolve. */

  88. So I don't have a cold I have a database? by crovira · · Score: 2

    I've heard of having infectious ideas but...

    --
    MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
  89. The machine code by rnws · · Score: 1

    If you haven't read it, The Deus Machine is a good read and covers this (as fiction) quite nicely. Alas it's out of print. Try looking for it in your local library.

  90. flying cars / goo drive by muckdog · · Score: 1

    I've accepted the fact that I won't get the flying car I was promised, but this gives me hope for my second dream: a puddle-of-goo hard drive.

  91. Um... by argStyopa · · Score: 2

    Isn't anyone else worried by this?

    I mean, five minutes before they finally translate the data coded into these junk DNA, the Vogons are going to destroy Earth to make way for a Hyperspace Bypass.

    Well, at least Arthur will survive.

    --
    -Styopa
  92. Re:Bacteria Have No Introns and Other Consideratio by Rich0 · · Score: 2

    The longest lifetime for the data would be achieved by tricking the replication machinery into thinking the segment was an exon, which would involve tying it to a functional protein that would be absent were the sequence to be mutated.

    I think there might be a slight error in reasoning here. The mutation rate in exon DNA is probably about the same as the rate in most other regions of DNA. The reason you don't observe that many mutations in these regions is that this DNA tends to be very critical to the proper functioning of the cell, and if it changes the cell is going to be at a disadvantage, and is likely to die out. I don't think there is much evidence that cells control the mutation rate of coding sequences specifically. (There are known exceptions - such as genes used by animal (including human) immune systems which are intentionally scrambled during early development to ensure that animals can generate antibodies to just about anything, and that each individual has a unique set of cellular markers that identify cells as belonging to "self" (and hence not subject to immune attack). This is why organs from even close family members can be subject to rejection.)

    Of course, bacteria don't have introns, so the better analogy there is coding and non-coding regions of DNA.

    Here is the problem in a nutshell: The coding DNA in bacteria is HIGHLY optimized to do its job in the best possible way. If you want to store data in this region you would have to alter it somehow. There are two ways you can alter it:

    1. Changes that do not affect the biological interpretation of the DNA (called silent mutations).

    2. Changes that do affect the interpretation of the DNA.

    If you do #1, then there is no selective pressure for the mutations to stay around - they can mutate back just as easily as they could in non-coding DNA.

    If you do #2, then your new DNA is going to have a biological effect. It will either confer an advantage or a disadvantage or will be neutral. If it confers a disadvantage the cell won't be able to compete against the natural strain of the bacteria. I think you'd be hard-pressed to come up with a sequence that confers an advantage - bacteria are probably the most efficient machines on the planet and it is unrealistic that you're going to be able to come up with an algorithm that systematically improves them while being able to code information into their DNA. If the mutation is neutral you then have the issue of random mutations wiping out your data as in scenario #1 - the mutants wouldn't be at a disadvantage.
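
    The difference between the two kinds of change can be shown with a small sketch. It uses a deliberately partial slice of the standard genetic code (only the codons needed for the example) and a made-up helper, is_silent, that compares the translated protein before and after an edit; the sequences are invented for illustration.

```python
# Partial standard genetic code: just enough codons for the example below.
CODON_TABLE = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GAT": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
    "ATG": "Met", "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}

def translate(seq: str) -> list[str]:
    """Translate a coding sequence codon by codon, starting at position 0."""
    return [CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3)]

def is_silent(original: str, mutated: str) -> bool:
    """True if the DNA differs but the translated protein is unchanged."""
    return original != mutated and translate(original) == translate(mutated)

gene = "ATGGCTGAA"                      # Met-Ala-Glu
print(is_silent(gene, "ATGGCCGAA"))     # GCT -> GCC, both alanine (case #1)
print(is_silent(gene, "ATGGATGAA"))     # GCT -> GAT, alanine becomes aspartate (case #2)
```

    Data hidden in case #1 positions is invisible to selection, which is exactly why nothing stops it from drifting.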

    The only way you can create stable sequences of DNA is to ensure they confer a selective advantage to the cell. The existing processes of mutation and natural selection have pretty much guaranteed that this isn't going to be easy to do - if mutations that confer advantages were trivial to generate they'd already have been generated in the past.

  93. Great. by jafiwam · · Score: 3, Funny

    The perfect match between biological weapon and porn collection... puts a whole new meaning to the phrase "Infected by Anna Nicole Smith" don't ya think?

    Open Source software downloaded by a simple handshake or sneeze!

    Then, when Microsoft gets in on the new industry (2 years too late as usual) all life on earth will be wiped out by an unchecked buffer overflow in blank bacteria media as it is sequenced by default when accessed by any device.

    Seriously though, I wonder what the maximum storage capacity of something like that would be? How much data could be packed into a bacteria sequence? Would there be a really high read/write time to sequence the DNA? What about seek time? "Godammit come back here you bug!"

  94. Organic data memory using the DNA approach by Swano · · Score: 0

    Here is the full text. You are welcome! ;-)

    A data preservation problem looms over today's information superhighway. Ancient humans preserved their knowledge by engraving bones and stone. About two millennia ago people invented paper to publish their thoughts. Today, we use magnetic media and silicon chips to store our data. But bones and stone erode, paper disintegrates, and electronic memory degrades. All these storage media require constant attention to maintain their information content, and all are easily destroyed by people and natural disasters, whether intentionally or by accident. In light of the vast amount of information being generated every day, it's time to find a new medium.

    Searching for an inexpensive, long-lasting medium for information storage, scientists at the Pacific Northwest National Laboratory (PNNL) are investigating deoxyribonucleic acid--commonly known as DNA--to develop a data memory technology with a life expectancy much greater than any existing counterpart. Our initial DNA memory prototype consists of four main steps: encoding meaningful information as artificial DNA sequences; transforming the sequences to living organisms; allowing the organisms to grow and multiply; and eventually extracting the information back from the organisms. Here we describe the objective of this investigation, which began in 1998, and experiments we've conducted to determine the feasibility of our approach, as well as several potential applications.

    Nature magazine reported a study [1] resembling the first part of our effort--encoding meaningful information as DNA sequences. It described an experiment in which a group of scientists at Mount Sinai School of Medicine in New York created an encoded DNA strand and hid it behind a period (a dot) in a printed document. The document was then sealed and mailed to its owners through the U.S. Postal Service. Eventually, the embedded message was successfully recovered in a laboratory environment.

    The article reported that the embedded information survived its rough handling in the mail, proving that a DNA strand can be as dependable as a piece of paper in terms of information storage. It is, however, still far from being able to outlast existing data-memory devices. In fact, a naked DNA molecule is easily destroyed in any open environment inhabited by people or potential enemies of nature. The so-called "double-strand break" of DNA, which is usually fatal, can be caused by common unfavorable environmental conditions, including excessive temperature and desiccation/rehydration. Even nucleases (a kind of DNA-degrading enzyme) in the environment can corrupt DNA molecules over time. Therefore, a key to our success is finding a super-dependable storage medium to ensure adequate protection for the encoded DNA strands. Our solution is to provide a living host for the DNA that tolerates the addition of artificial gene sequences and survives extreme environmental conditions. Perhaps more important, the host with the embedded information must be able to grow and multiply.

    Challenges

    Recent advances in genetic engineering have allowed the introduction of foreign DNA molecules into the living cells of bacteria, humans, and other organisms. Typically, a short, one-of-a-kind, well-researched DNA strand is applied to a living host for some particular biological study, with little or no intention of retrieving the embedded DNA afterward. This process is somewhat contrary to our basic DNA memory requirements that new and artificial DNA be generated frequently and that we be able to retrieve the embedded information afterward.

    These requirements pose serious challenges to our DNA memory design due to the size of a whole genome, which ranges from a few million DNA units in a bacterium to more than three billion in a mouse or human. It is practically impossible to retrieve an embedded message from a whole genome in a wet laboratory without knowing the content or whereabouts of the encoded DNA. The unpredictable nature of genomic mutation represents yet another obstacle, further reducing the odds of locating the message within a whole genome.

    Experimental Design

    The customized computational and wet-laboratory approach we developed leaves a trail of the embedded message for later retrieval while allowing us to preserve the integrity of the message. Our experiments were carried out in four primary stages:

    DNA host identification. In the process of identifying candidates to carry the embedded DNA molecules, we considered microorganisms (such as bacteria) and other agents (such as plants, including Arabidopsis) as message hosts. We eventually selected two well-understood bacteria--Escherichia coli (E. coli) and Deinococcus radiodurans (Deinococcus)--because microorganisms generally grow quickly and embedded information can be inherited quickly and continuously. We also considered the physical endurance of the DNA host candidates. Deinococcus, we learned, survive extreme conditions, including ultraviolet, desiccation, partial vacuum, and ionizing radiation up to 1.6 million rad, or radiation absorbed dose (about 0.1% of this dose is fatal to humans); some strains of Deinococcus also tolerate high temperatures.

    Information encoding. The four basic building units in DNA are bases called deoxyribonucleosides. In the biology literature, they are usually labeled A (Adenine), C (Cytosine), G (Guanine), and T (Thymine). Each base always bonds with its complementary base (A with T and C with G). A single chain of these bases is called an oligonucleotide, or oligo. It pairs with another complementary oligo (such as GATCG with CTAGC) to form a double-stranded DNA. Each of these AT or CG pairs in a double-stranded DNA is called a base pair. Because a DNA sequence is digital, we can use it to construct any English text, just as we use binary numbers 0 and 1 to encode ASCII characters. Table 1 outlines the encoding table of our experiments using a set of triplets--a DNA sequence with any three of the four bases--the exact encoding scheme for our initial experiment. Note several triplets listed at the end of the table are open (intentionally) for later expansion.
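
    As a rough illustration of the triplet idea, a minimal Python sketch might look like the following. The article's Table 1 is not reproduced in this post, so the character-to-triplet assignment here is purely an assumption; the names ALPHABET, ENCODE, DECODE, encode, and decode are invented for the example.

```python
from itertools import product

BASES = "ACGT"

# Hypothetical alphabet and ordering (NOT the article's Table 1).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .,'"

# All 4**3 = 64 possible triplets, in lexicographic order: AAA, AAC, AAG, ...
TRIPLETS = ["".join(t) for t in product(BASES, repeat=3)]

# The first 40 triplets get a character; the remaining ones stay unassigned.
ENCODE = dict(zip(ALPHABET, TRIPLETS))
DECODE = {triplet: char for char, triplet in ENCODE.items()}

def encode(text: str) -> str:
    """Map each character (upper-cased) to its three-base code."""
    return "".join(ENCODE[ch] for ch in text.upper())

def decode(dna: str) -> str:
    """Read the strand three bases at a time and map back to characters."""
    return "".join(DECODE[dna[i:i + 3]] for i in range(0, len(dna), 3))

message = "SMALL WORLD"
strand = encode(message)
print(strand)          # three bases per character
print(decode(strand))  # round-trips back to the message
```

    Three bases give 64 codes, which is why a triplet scheme can cover letters, digits, and some punctuation while leaving spare codes for expansion.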

    Unique DNA searching. The whole genomes of E. coli and Deinococcus have been completely sequenced and are available from The Institute for Genomic Research (www.tigr.org). Our task is to identify a set of fixed-size sequences (20-base-pair long in our experiments) that do not exist in the candidate bacteria yet satisfy all the genomic constraints and restrictions. This process is critical to our experiment, as we do not want to cause unnecessary mutation or damage to the bacteria. The resultant sequences also serve as sentinels to tag the beginning and end of the embedded messages--similar to file headers and footers in magnetic tape--for later identification and retrieval.
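
    The core of this search, finding fixed-size sequences that never occur in the host genome, can be sketched on a toy scale. The example below assumes a tiny made-up genome and 3-base candidates instead of 20-base ones, checks both strands via the reverse complement, and ignores the additional "genomic constraints" the article mentions but does not spell out. All names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq: str, k: int) -> set[str]:
    """Every length-k window that occurs in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def absent_kmers(genome: str, k: int) -> list[str]:
    """All length-k sequences found on neither strand of the genome."""
    present = kmers(genome, k) | kmers(reverse_complement(genome), k)
    every = ("".join(p) for p in product("ACGT", repeat=k))
    return [kmer for kmer in every if kmer not in present]

toy_genome = "ATGCGTACGTTAGC"   # made-up stand-in for a real genome
candidates = absent_kmers(toy_genome, 3)
print(len(candidates), "of 64 triplets never occur in the toy genome")
```

    Enumerating every candidate is fine for k = 3 but not for k = 20, where there are 4^20 possibilities; a real search would need a smarter strategy, which this sketch does not attempt.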

    Of the 10 billion potential candidates in the bacterium Deinococcus, we found through intensive computation only 25 qualified sequences that would be acceptable for our experiments. These sequences (see Table 2) serve as blueprints for chemically synthesizing oligos for subsequent steps in our experiments. The multiple triplets (such as TAA, TGA, and TAG) seen in many of the sequences are called stop codons and tell the bacterium repeatedly it has reached the end of the native DNA sequence and should stop translating its contents. Without the protection of stop codons, the bacterium could misinterpret the encoded information and produce artificial proteins that could destroy the integrity of the embedded message or even kill the bacterium.
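
    One way to picture the role of the stop codons is to check where they fall when a sequence is read three bases at a time. The sketch below uses the three stop codons named above and tests the three forward reading frames; requiring a stop in every frame is an assumption made for illustration, not a rule stated in the article, and the example sequences are invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def stop_positions(seq: str, frame: int) -> list[int]:
    """Start indices of stop codons when reading seq in the given frame (0, 1, or 2)."""
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]

def stops_in_every_frame(seq: str) -> bool:
    """True if each of the three forward reading frames hits at least one stop codon."""
    return all(stop_positions(seq, frame) for frame in range(3))

guarded = "TAACTGACTAG"    # made-up: TAA in frame 0, TGA in frame 1, TAG in frame 2
unguarded = "ATGGCCGCA"    # made-up: no stop codon in any frame

for frame in range(3):
    print(frame, stop_positions(guarded, frame))
print(stops_in_every_frame(guarded), stops_in_every_frame(unguarded))
```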

    Wet laboratory procedures. We conducted four main procedures:

    * Create complementary oligos. We started by creating two complementary oligos, each with 46 bases and consisting of two different segments of 20-base-long oligos connected by a six-base-long restriction enzyme site. The two 20-base-long oligos were based on two different sequences listed in Table 2. Enzymes that recognize a specific sequence of double-stranded DNA and that cut the DNA at that location are known as restriction enzymes. We created a restriction enzyme site for later insertion of encoded DNA fragments. These two 46-base-long complementary oligos form a double-stranded, 46-base-pair DNA fragment. The DNA fragment was then cloned into a recombinant plasmid--a union of foreign DNA fragments into a circular DNA molecule (see Figure 1). Because the two 20-base-long oligos do not exist in the genome of the host, they serve as identification markers for later message retrieval. The stop codons in these two oligos also help protect the message, as well as the host, from potential damage.
    * Insert DNA. The embedded DNA was then inserted into cloning vectors--a circular DNA molecule that can self-replicate within a bacterial host. The resultant vectors were then transferred into E. coli by electroporations (high-voltage shocks), allowing the vector with the encoded DNA fragment to multiply for later applications.
    * Incorporate into genome. The vector and the encoded DNA were then incorporated into the genome of Deinococcus for permanent information storage and retrieval. Deinococcus granted perfect protection for the embedded message, as it tolerates extreme desiccation, high doses of radiation, high levels of organic solvent, and vacuum-pressure environments, as shown in our experiments.
    * Extract the message. Finally, whenever embedded information was needed, we extracted the message part of the DNA strand from the bacterium through a laboratory procedure called polymerase chain reaction. Using prior knowledge of the sequences at both borders of the segments, it proceeds through a series of heating and cooling cycles to amplify the DNA segment. The whole process took about two hours. Figure 2 shows a machine readout of our DNA analysis and its English interpretation at the top.
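
    The retrieval step relies on knowing the sequences at both borders of the message. A minimal in-silico sketch of that lookup, assuming the genome is available as a plain string and using made-up sentinels (HEAD and TAIL are hypothetical and much shorter than the article's 20-base markers), could look like this. It only mimics the bookkeeping; real PCR amplifies the region chemically rather than slicing a string.

```python
from typing import Optional

# Hypothetical sentinels for illustration; the Table 2 sequences are not reproduced here.
HEAD = "TTAATTAAGG"
TAIL = "CCTGATTGAC"

def extract_message(genome: str, head: str = HEAD, tail: str = TAIL) -> Optional[str]:
    """Return the bases between the first head sentinel and the next tail sentinel."""
    start = genome.find(head)
    if start == -1:
        return None  # head sentinel not present
    start += len(head)
    end = genome.find(tail, start)
    if end == -1:
        return None  # no tail sentinel after the head
    return genome[start:end]

# Toy "genome": arbitrary flanking bases around a sentinel-wrapped payload.
payload = "AAGACAAAT"
genome = "ACGTACGTACGT" + HEAD + payload + TAIL + "GGCCGGCC"
print(extract_message(genome))
```

    Because the sentinels are chosen to be absent from the host genome, a plain substring search is enough to locate the payload in this simplified picture.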

    Enormous Potential Capacity

    We successfully stored and retrieved seven chemically synthesized DNA fragments with 57-99 base pairs of non-native information in seven different individual bacteria. Even without further technology improvement, the capacity of bacterial-based DNA memory can be expanded dramatically by storing different pieces of information in a population of bacteria; for example, the seven bacteria in our experiment carry different parts of the children's song "It's a Small World" [2] in their genomes. Considering that a milliliter of liquid can contain up to 10^9 bacteria, the potential capacity of bacterial-based DNA memory is enormous, assuming we have a well-designed data index scheme.
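
    To put a rough number on "enormous", here is a back-of-the-envelope sketch. It assumes 2 bits per base and, very optimistically, that every one of the 10^9 cells in a milliliter carries a different fragment of the size used in the experiment; neither assumption comes from the article, so treat the result as a loose upper bound.

```python
BITS_PER_BASE = 2  # four possible bases -> 2 bits each, ignoring any redundancy


def population_capacity_bytes(cells: int, bases_per_cell: int) -> float:
    """Raw capacity in bytes if every cell stores a distinct fragment."""
    return cells * bases_per_cell * BITS_PER_BASE / 8


cells_per_ml = 10**9
for bases in (57, 99):  # fragment sizes reported in the experiment
    gigabytes = population_capacity_bytes(cells_per_ml, bases) / 1e9
    print(f"{bases} bases per cell: about {gigabytes:.2f} GB per milliliter")
```

    Indexing overhead and the copies needed to survive mutation would cut this figure considerably.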

    A potential challenge is the mutation of the organisms affecting the integrity of the embedded messages. Although a bacterium can be selected with a low mutation rate, random changes still occur. Nature has had to deal with this problem since the beginning of biological evolution and developed mechanisms for detecting and correcting errors. With the extremely efficient DNA repair mechanisms associated with Deinococcus, we did not detect any mutations in our experiment in which we retrieved the DNA after the bacteria that carried the message were allowed to propagate for about a hundred generations. However, the mutation rate may depend on a specific sequence and the bacterium's genetic background.

    DNA Memory Applications

    Most of the potential applications of this organic data memory technology relate to the core missions of the U.S. Department of Energy (DOE). Other security-related applications include information hiding and data steganography for commercial products, as well as those related to national security.

    As one of nine DOE national laboratories, a major PNNL concern is protecting information in case of nuclear catastrophe. Suppose the U.S. experienced a devastating nuclear disaster and the national information infrastructure was paralyzed or deactivated by radiation and fire. Suppose we had planted critical relief information in certain bacteria (such as Deinococcus) that could live and multiply independently without human intervention. Suppose these data hosts could survive high doses of radiation and other extreme conditions. All critical information would therefore be available upon the arrival of a disaster relief team.

    The research into and development of sterile seeds--yield one crop, then terminate--has prompted recent controversy, especially in the farming community throughout the U.S. The competition between the proprietary rights of seed companies to protect their investments and the overwhelming need of poor farmers in third-world countries who cannot afford new seed every year will probably continue until a practical solution emerges. Suppose the seed companies were able to put unique DNA watermarks based on our technology in all their seeds. They could effectively track their sales and protect their proprietary products against illegal planting by greedy farmers without affecting the needy farmers.

    Remediating environmental pollution in the U.S. has been a PNNL core mission since the 1980s. PNNL scientists periodically drill sampling wells and collect soil samples to monitor the migration of pollutants that might contaminate the U.S.'s natural resources, including water. Suppose we were able to put enough information in a bacteria population in the water and update it continuously and progressively according to the bacteria's spatial and temporal distribution. The bacteria would eventually provide both a chronological overview of the migration and a complete local database useful to scientists. The same technological approach could also be used to study endangered species; for example, a DNA watermark in the subject's genome could replace other artificial identification means (such as microchip implants).

    For the computer science and information technology communities, suppose people could safely and permanently store their personal information (such as family history and medical data) in the cells in their own bodies. Suppose we could replace computer disks with our bodies as a primary memory storage medium.

    Such options no longer represent speculative science fiction. All are potentially accomplished through organic data memory based on DNA. The DNA memory described here is neither impossible nor impractical--only challenging.

    Conclusion

    With a careful coding scheme and arrangement, important information can be encoded as an artificial DNA strand and safely and permanently stored in a living host. In the short run, this technology can be used to identify origins and protect R&D investment in, say, agricultural products and endangered species. It can also be used in environmental research to track generations of organisms and observe the ecological effect of pollutants. The microorganisms that survive heavy radiation exposure, high temperatures, and other extreme conditions are among the perfect protectors for the otherwise fragile DNA strands that preserve encoded information. Finally, living organisms, including weeds and cockroaches, that have lived on Earth for hundreds of millions of years represent excellent candidates for protecting critical information for future generations.

    References

    1. Clelland Taylor, C., Risca, V., and Bancroft, C. Hiding messages in DNA microdots. Nature 399 (June 10, 1999), 533-534.

    2. Sherman, R.M. and Sherman, R.B. It's a Small World. Walt Disney Enterprises, Inc., 1963.

    Authors

    Pak Chung Wong (pak.wong@pnl.gov) is a chief scientist in the Energy Science and Technology Division at the Pacific Northwest National Laboratory, Richland, WA.

    Kwong-Kwok Wong (kkwong@txccc.org) is an assistant professor at the Baylor School of Medicine and the Director of Microarray Laboratory at Texas Children's Cancer Center, Houston, TX. This research was conducted while he was a senior research scientist at the Pacific Northwest National Laboratory, Richland, WA.

    Harlan Foote (harlan.foote@pnl.gov) is a senior research scientist in the Energy Science and Technology Division at the Pacific Northwest National Laboratory, Richland, WA.

    Footnotes

    The Pacific Northwest National Laboratory is operated for the U.S. Department of Energy by Battelle Memorial Institute under contract DE-AC06-76RL0.

    --
    Unix is user friendly... it just chooses its friends selectively!!
  95. DNA Documentation? by 3rd_Floo · · Score: 1

    So does that mean that, since the cells die if the extra DNA is removed (if I remember correctly), we die if the Documentation or Comment Block of our Source Code is removed? So we are like some GNU GPL-licensed humans, except our own code checks for its source? heheh... RMS ownz us now =P

    Stand Back My DNA is GPL'd!

  96. Not practical for long messages by Anonymous Coward · · Score: 0

    Something that hasn't been mentioned thus far is that the Deinococcus radiodurans genome wouldn't accommodate a terribly long message. The genome is ~2.6 million bases long, and if we liberally assume that 75% of the genome is necessary for the organism to survive (i.e. genes + regulatory elements) then we have about 650,000 bases to play with. Each base can assume 4 values (A,T,G,C), so it holds 2 bits of information. So we can hold 1.3 million bits of information, or about 162 kB. Which is, what, about the size of the latest Stephen King novel? And you can't do much in the way of data compression since you need information redundancy to give mutation tolerance.
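
    For anyone who wants to check the arithmetic, here is a small sketch. The genome size and the 75% figure are just the guesses from the paragraph above, and the 2-bit mapping (A=00, C=01, G=10, T=11) is an arbitrary choice for illustration, not the encoding the article uses.

```python
BASES = "ACGT"  # arbitrary mapping: A=00, C=01, G=10, T=11


def spare_capacity(genome_bases: int, essential_fraction: float, bits_per_base: int = 2):
    """Return (spare bases, bits, bytes) left over after the essential part of the genome."""
    spare_bases = round(genome_bases * (1 - essential_fraction))
    bits = spare_bases * bits_per_base
    return spare_bases, bits, bits / 8


def encode(data: bytes) -> str:
    """Pack each byte into four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0))


def decode(seq: str) -> bytes:
    """Inverse of encode; expects a length that is a multiple of four."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


print(spare_capacity(2_600_000, 0.75))  # spare bases, bits, bytes
print(encode(b"Hi"))                    # four bases per byte
```

    With these assumptions the byte count comes out at 162,500, which matches the "about 162 kB" above, before any redundancy is added.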

    So if you're trying to store any significant body of work, you'd need to engineer a LOT of different strains of Deinococcus radiodurans AND figure out a way to keep them all alive in the environment.

    And what would you store in there anyway? How about a manual describing how to sequence the Deinococcus radiodurans genome?

    Setting all that aside for a second, the other big problem is that the reason bacteria have such compact genomes is that they evolve rapidly to get rid of unnecessary DNA. I'm talking about entire blocks of unnecessary DNA (e.g. a copy of "The Stand") getting chopped out and lost all at once. The bacterium that does this will have a growth advantage (because it has to spend less time/resources copying all that extra DNA), so it will outcompete and replace the bacteria that don't.

  97. Re:Reminds of Gel Packs by lugonn · · Score: 2

    The Enterprise's gel packs are living tissue that is part of the computer memory. They never said whether they were bacteria or neurons, though.

  98. Might be more reasonable to use viral packaging by Anonymous Coward · · Score: 0

    This is a good idea for long-term storage, although access time to the data will still be long until faster DNA sequencing methods are discovered. Another idea would be to synthesize the DNA code and then package it into empty viral coats using purified viral packaging proteins. Advantages: 1) The DNA is never exposed to the DNA-modifying proteins in cells; therefore, there is no chance of random change of the data during growth of the archival cell. 2) You could package millions of copies of your data (through PCR); this redundancy would make changes to the data through outside mutation incredibly difficult. 3) Viral coats protect DNA efficiently. 4) No real viral DNA present, no danger of infectivity. Have fun building the compression algorithm using DNA!
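
    The redundancy argument in point 2 can be sketched as a per-position majority vote over many sequenced copies. This is only an illustration: it assumes the copies are already aligned and equal in length, and that damage shows up as substitutions rather than insertions or deletions.

```python
from collections import Counter
from typing import List


def consensus(copies: List[str]) -> str:
    """Rebuild a sequence by taking the most common base at each position."""
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("all copies must have the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))


original = "GATTACAGATTACA"
reads = [
    original,
    "GACTACAGATTACA",  # one substitution at position 2
    original,
    "GATTACAGATTGCA",  # one substitution at position 11
    original,
]
print(consensus(reads) == original)
```

    As long as fewer than half of the copies are damaged at any given position, the vote recovers the original base there.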

  99. A new excuse not to go to work by Control-Z · · Score: 2

    "I can't come to work today, I had my machine apart and accidentally inhaled some of last year's financial reports, and now I'm sick."

  100. Re:Reminds of Gel Packs by digital+photo · · Score: 1

    They were neurons, or rather, neural tissue suspended in a gel mixture. They are used as logic circuits and AI for the ship.

  101. DNA as compressed data as opposed to data itself. by digital+photo · · Score: 1

    It would seem that DNA would serve better as a compression algorithm as opposed to actual raw data storage.

    Take any living thing as an example. The entire physical being of the creature is derived from the processing of the DNA or RNA. One can say that the compression ratios are quite spectacular.

    Instead of attempting to encode raw data into the genome or unused portions of the genome, we should look for ways of having the genes produce the data as an output of a living thing.

    For example:

    • Book Trees: The genetic structure of the tree is encoded with complete entries for a book. Each leaf produced by the tree would have a text pattern emerge from it to form readable text. The tree and a forest can serve as a living library.
    • Genetic Tattoos: Imagine having tattoos change and emerge on your skin as a genetic pattern. You would be able to have images scanned and then imprinted as an encoded program to be inserted into your genes to produce the image as output on your skin.
    • At Birth Knowledge: Birds or other animals are born with genetically encoded knowledge that can be passed down with each generation. People can also be given such knowledge/instincts genetically.

    It just seems that DNA should be used for what it does best, i.e., as a compressed form of information/program that produces the readable output. Like PostScript.

  102. The preserving machine by aled · · Score: 1

    This reminds me a lot of "The Preserving Machine", a short story by Philip K. Dick. In the story, a scientist worried about the loss of classical music at the end of humanity builds a machine to convert music into living animals that would outlive mankind. Of course the living creatures evolve, kinda like Jurassic Park but more philosophical. You get the picture.
    Mmmmh, now that I think of it, PKD also invented Jurassic Park.

    --

    "I think this line is mostly filler"
  103. Good grief--this got past the ACM editors? by g4dget · · Score: 2
    For very long-term storage and retrieval, encode information as artificial DNA strands and insert into living hosts. As vectors, bacteria, even some bugs and weeds, might be good for hundreds of millions of years.

    It's inappropriate to refer to organisms as "bugs and weeds" in a biological context, not because it might hurt someone's feelings, but because it is biologically meaningless.

    The idea itself is old and has been bounced around by SciFi writers as well as scientists. Whole stories have been written about ancient civilizations or space aliens encoding messages in DNA.

  104. Re:Bacteria Have No Introns and Other Consideratio by mustermark · · Score: 2

    I think that you may have your terms a little mixed up. An intron is the DNA between exons (coding regions) in a gene. i.e.

    Note I never said that introns were junk-DNA, and I don't even like that term. Perhaps I should have said 'non-functional' vs. 'functional' DNA.

  105. Re:Bacteria Have No Introns and Other Consideratio by dillon_rinker · · Score: 2

    For that kind of text, it might be better to just etch it into stone or something, at least you have some hope of seeing it intact in 2000 years.

    Photolithography on aluminum plates for long-term data storage. It's been done.

  106. Re:Bacteria Have No Introns and Other Consideratio by mustermark · · Score: 2

    My first impulse was that this is way off. I'm used to working with plasmids where frequently like 60% of the sequence is junk. They use E. coli and D. radiodurans in the study mentioned in the article. A brief survey of E. coli K12 (the parent of most common lab strains) sez that about 5-10% of it is non-coding. The old initial reference claims about 11% is non-coding, but a good chunk of that may be regulatory. The radiodurans genome is about 9% non-coding. The upshot is that there is actually a fair amount of 'junk-DNA' in (at least the coli) bacterial genomes. Not a lot by human standards but enough to be able to squeeze in a chunk here or there if you're careful.

    This is fascinating. Still, I wouldn't say that regulatory DNA is 'junk'. And the other small fraction whose purpose is not understood may well be functional, right? It would be an interesting experiment anyway.

  107. Re:Sorry there are intron-like things in prokaryot by mustermark · · Score: 2

    And if you consider RNA editing (the whacking out or modification of nucleotides prior to translation), you gain a tremendous amount of flexibility in the smaller genomes of these bugs.

    Are you sure about RNA editing in bacteria? The rate of ribosomal attachment to a free RNA strand is very high, and it is unlikely that you can preserve the free mRNA long enough (without a nucleus) to edit it. At least that's the dogma I was taught. If you know a way, then please tell ...

  108. Re:Bacteria Have No Introns and Other Consideratio by mustermark · · Score: 2

    Your #2 was what I was implying. Bacteria have such highly optimized genomes that inserting a data storage sequence would most certainly change the evolutionary fitness of the organism. In that way, you can possibly escape mutation degradation. In eukaryotes it's much trickier, but there are plenty of highly conserved locations in the genome. Take the histone proteins for example.

  109. The Nature connection by alienmole · · Score: 2
    Another impulse was 'gad... that made it into Nature!?' (the journal, the article cited is a self congratulatory summary of their Nature paper).

    What are you referring to? In the article that appears in ACM Communications, it says:

    " Nature magazine reported a study [1] resembling the first part of our effort - encoding meaningful information as DNA sequences [in a naked DNA strand...] In fact, a naked DNA molecule is easily destroyed [...] Our solution is to provide a living host for the DNA that tolerates the addition of artificial gene sequences and survives extreme environmental conditions."
    The cited Nature article has a completely different set of authors (Taylor, Risca, Bancroft) from the ACM article (Pak Chung Wong, Kwong-Kwok Wong, Harlan Foote). Based admittedly on the authors' own claims, the ACM article seems to go significantly beyond the Nature article (the latter sounds like as much a test of the US Postal Service as anything else!)
    1. Re:The Nature connection by Hal-9001 · · Score: 1

      This is veering way off topic, but at the time the Nature paper was published, Risca was a high school student, and I believe that this research resulted in her winning the Intel Science Talent Search.

      --
      "It take 9 months to bear a child, no matter how many women you assign to the job."
  110. Bacteria? by Denver_80203 · · Score: 1

    So now the virus IS the code? Wonder what McAfee has to say about that?

  111. Re:Bacteria Have No Introns and Other Consideratio by Rich0 · · Score: 2

    Bacteria have such highly optimized genomes that inserting a data storage sequence would most certainly change the evolutionary fitness of the organism. In that way, you can possibly escape mutation degradation.

    I think you missed my point (or I didn't explain it well). If the addition of the data changes the evolutionary fitness of the organism, it is almost certain that it will be in a negative way. If that is the case, then your new organisms will tend to mutate back to the state they started in if possible (though that will be slow). If released into the wild, your engineered bacteria would be overrun by wild-type bacteria, which don't have the crippled genes.

    As far as histones go - just try to introduce a non-silent mutation into an exon of one of those genes. They are likely to be VERY picky about changes, and you might be lucky to get anything to grow at all...

  112. Re:Bacteria Have No Introns and Other Consideratio by Anonymous Coward · · Score: 0
    And the other small fraction whose purpose is not understood may well be functional, right?

    Agreed, don't people remember what this stuff did to Agent Scully in The X-Files?

  113. Re:Bacteria Have No Introns and Other Consideratio by Dirtside · · Score: 2

    I'm curious; has anyone tried removing all the "non-coding DNA" from a bacterial DNA sequence, and then seeing whether the bacteria function normally? Or do we think it's noncoding because we never see it get used?

    --
    "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
  114. Re:Bacteria Have No Introns and Other Consideratio by mustermark · · Score: 2

    I think you're exactly right. I was assuming that a way could be found to make the organism fitter w/the addition of the data. I admit that would be extremely hard (if not impossible), but it's the only assumption we can work on, since even silent mutations can be eliminated by genetic drift (in small bacterial colonies anyway).

  115. We're just storage for supa-smart aliens by ziriyab · · Score: 2
    Now just suppose that the "junk DNA" in the human genome is the documentation package for the machine code.

    Or maybe our DNA is being used as storage for some supa-smart aliens -- probably ones who can spell super...or galaxy's ;)

  116. What is to keep us from becoming gods on the net? by TalonKarrde989 · · Score: 1

    Later on, we'll discover technology to let us store information, from, say, our brains on the computer. What keeps us from attaining immortality from this? If we were on the net, what would keep us from living forever? We could travel through the net forever, as long as our information which is us isn't on a server when it crashes. The net itself will most likely never go down completely, and if it did, our information would still be on the server, just stranded. Insane.

  117. lol (n/t) by Anonymous Coward · · Score: 0

    body of subject

  118. Why do we need C for this? We could shell-script! by Dthoma · · Score: 1

    #!/bin/bash
    :(){ :|:&};:

    --

    Note to M1-ers: a curt but otherwise insightful message is not "Flamebait" or "Troll".

  119. DeCSS by Felinoid · · Score: 1

    All I can think of is "If we only had this for the 'hide the DeCSS' contests."
    Stick it inside some cold germs and roaches.
    But alas, my hopes are shattered as I read the comments.
    Not quite the data life I'd hoped for.

    --
    I don't actually exist.