Slashdot Mirror


The DNA Data Deluge

the_newsbeagle writes "Fast, cheap genetic sequencing machines have the potential to revolutionize science and medicine--but only if geneticists can figure out how to deal with the floods of data their machines are producing. That's where computer scientists can save the day. In this article from IEEE Spectrum, two computational biologists explain how they're borrowing big data solutions from companies like Google and Amazon to meet the challenge. An explanation of the scope of the problem, from the article: 'The roughly 2000 sequencing instruments in labs and hospitals around the world can collectively generate about 15 petabytes of compressed genetic data each year. To put this into perspective, if you were to write this data onto standard DVDs, the resulting stack would be more than 2 miles tall. And with sequencing capacity increasing at a rate of around three- to fivefold per year, next year the stack would be around 6 to 10 miles tall. At this rate, within the next five years the stack of DVDs could reach higher than the orbit of the International Space Station.'"
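For anyone who wants to check the DVD arithmetic in the summary, here is a rough back-of-the-envelope sketch. The article gives none of the constants, so these are assumptions: a single-layer DVD holding 4.7 GB, a 1.2 mm disc thickness, decimal petabytes, and an ISS altitude of roughly 250 miles. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the DVD-stack claim.
# All constants below are assumptions, not figures from the article.
PETABYTE = 10**15             # bytes, decimal petabyte (assumed)
DVD_BYTES = 4.7e9             # single-layer DVD capacity (assumed)
DVD_THICKNESS_M = 1.2e-3      # thickness of one disc in meters (assumed)
METERS_PER_MILE = 1609.344
ISS_ALTITUDE_MILES = 250      # approximate ISS orbital altitude (assumed)


def stack_height_miles(petabytes):
    """Height in miles of a stack of DVDs holding `petabytes` of data."""
    discs = petabytes * PETABYTE / DVD_BYTES
    return discs * DVD_THICKNESS_M / METERS_PER_MILE


def years_to_exceed(start_miles, growth, target_miles):
    """Years of `growth`-fold annual increase until the stack passes the target."""
    years = 0
    height = start_miles
    while height <= target_miles:
        height *= growth
        years += 1
    return years


base = stack_height_miles(15)  # 15 PB per year, per the summary
print(f"Stack for 15 PB: {base:.2f} miles")
for growth in (3, 5):
    n = years_to_exceed(base, growth, ISS_ALTITUDE_MILES)
    print(f"{growth}x per year: passes ISS altitude after {n} years")
```

Under these assumptions the stack comes out a bit over 2 miles, and both the threefold and fivefold growth rates pass the ISS within five years, which is consistent with the quoted figures.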

6 of 138 comments

  1. The problem will solve itself by Krishnoid · · Score: 5, Funny

    To put this into perspective, if you were to write this data onto standard DVDs, the resulting stack would be more than 2 miles tall.

    Once that happens, they'll be able to stop storing it on DVDs and move it into the cloud.

  2. Simple. Get the NSA to do it. by Anonymous Coward · · Score: 5, Funny

    Publish a scientific paper stating that potential terrorists or other subversives can be identified via DNA sequencing. The NSA will then covertly collect DNA samples from the entire population, and store everyone's genetic profiles in massive databases. Government will spend the trillions of dollars necessary without question. After all, if you are against it, you want another 9/11 to happen.

  3. The answer is obvious! by plopez · · Score: 3, Funny

    They should use a NoSQL multi-shard vertically integrated stack with a RESTful Rails-driven in-memory virtual multi-parallel JPython-enabled solution.

    Bingo!

    --
    putting the 'B' in LGBTQ+

  4. Re:Database Replication by __aasqbs9791 · · Score: 3, Funny

    I propose we call this new data method Data Neutral Assembly.

  5. Re:Who uses DVDs? by Samantha+Wright · · Score: 4, Funny

    And we can double storage efficiency by using two stacks! Clearly, they need to hire one of us.

    --
    Bio questions? Ask me to start a Q&A journal. Computer analogies available for most topics!

  6. Yay, AdEnine & 1 click splicing by charlesjo488 · · Score: 4, Funny

    Scientists who viewed this sequence also viewed these sequences...