Slashdot Mirror


Chinese Lab Speeds Through Genome Processing With GPUs

Eric Smalley writes "The world's largest genome sequencing center once needed four days to analyze data describing a human genome. Now it needs just six hours. The trick is servers built with graphics chips — the sort of processors that were originally designed to draw images on your personal computer. They're called graphics processing units, or GPUs — a term coined by chip giant Nvidia. This fall, BGI — a mega lab headquartered in Shenzhen, China — switched to servers that use GPUs built by Nvidia, and this slashed its genome analysis time by more than an order of magnitude."

14 of 408 comments

  1. The Future Is Here!! by mastershake82 · · Score: 5, Funny

    Sounds like these newfangled "GPUs" are gonna change the world.

    1. Re:The Future Is Here!! by MollyB · · Score: 4, Insightful

      If one reads to page 2 of tfa, they only claim the technique works well in this instance. They go on:

      Even for computer-intensive aspects of analysis pipelines, GPUs aren’t necessarily the answer. “Not everything will accelerate well on a GPU, but enough will that this is a technology that cannot be ignored,” says Gollery. “The system of the future will not be some one-size-fits-all type of box, but rather a heterogeneous mix of CPUs, GPUs and FPGAs depending on the applications and the needs of the researcher.”

      and

      GPUs have cranked up the speed of genome sequencing analysis, but in the complicated and fast-moving field of genomics that doesn’t necessarily count as a breakthrough. “The game changing stuff,” says Trunnell, “is still on the horizon for this field.”

      So yes, the article is a bit breathless, but if utilizing GPUs helps cure my potentially impending genetic disorder, I'm all for it.

    2. Re:The Future Is Here!! by Stormthirst · · Score: 3, Insightful

      Unless you are fortunate to live in a civilised part of the world with a universal healthcare system.

  2. News for nerds by Anonymous Coward · · Score: 5, Funny

    I always wondered what GPUs are. Thanks Slashdot!

    1. Re:News for nerds by galanom · · Score: 5, Funny

      No, it's "Guinea Pig Units"

    2. Re:News for nerds by nevillethedevil · · Score: 3, Funny

      I thought it was 'Gnomes Processing Underpants' and that we finally had that elusive missing step

      1. Steal underpants
      2. Process underpants
      3. Profit

      --
      Be gone from my sight or prepare to feel my flaming wraith!

  3. Summary dumbed down enough for you? by Anonymous Coward · · Score: 3, Insightful

    Explaining what a GPU is in a slashdot summary? Come on.

    This is similar to someone telling you a story about something funny happening to them while shopping at the store, pausing mid-story to inform you that a 'store' is a business where goods are displayed and exchanged for a papery substance called 'money'.

  4. A reminder by Mannfred · · Score: 5, Insightful

    It's hardly news that GPUs can be used to speed up parallel tasks/computations, but even so this article is a useful reminder of two things: 1) there are still many important processes that can be sped up by using GPUs, and 2) this can be achieved pretty much anywhere in the world.

  5. A better article by arielCo · · Score: 4, Informative

    http://hpcwire.com/hpcwire/2011-12-15/bgi_speeds_genome_analysis_with_gpus.html

    Excerpt:

    At BGI, he says, they are currently able to sequence 6 trillion base pairs per day and have a stored database totaling 20 PB.

    The data deluge problem stems from an imbalance between the DNA sequencing technology and computer technology. According to Dr. Wang, using second-generation sequencing machines, genomes can now be mapped 50,000 times faster than just a decade ago. The technology is on track to increase approximately 10-fold every 18 months. That is 5 times the rate of Moore's Law, and therein lies the problem.

    Obviously it would be impractical to upgrade one's computational infrastructure at that rate, so BGI has turned to NVIDIA GPUs to accelerate the analytics end of the workflow. The architecture of the GPU is particularly suitable for DNA data crunching, thanks to its many simple cores and its high memory bandwidth.
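The growth-rate comparison in the excerpt can be sketched with a few lines of arithmetic. This is only an illustration, not anything from the article: it assumes Moore's Law means a 2x gain every 18 months (the reading under which "5 times" is simply 10 / 2), and the function names are made up for this example.

```python
import math

# Figures: 10-fold per 18 months comes from the excerpt.
# ASSUMPTION: Moore's Law is modeled as 2x per 18 months.
PERIOD_MONTHS = 18
SEQUENCING_FACTOR = 10.0
MOORE_FACTOR = 2.0


def growth(factor_per_period: float, months: float) -> float:
    """Compound growth after `months`, given a per-period multiplier."""
    return factor_per_period ** (months / PERIOD_MONTHS)


def gap(months: float) -> float:
    """How far sequencing output outruns Moore's-Law compute after `months`."""
    return growth(SEQUENCING_FACTOR, months) / growth(MOORE_FACTOR, months)


def months_until_gap(target_gap: float) -> float:
    """Months until the sequencing/compute gap reaches `target_gap`."""
    per_period_gap = SEQUENCING_FACTOR / MOORE_FACTOR
    return PERIOD_MONTHS * math.log(target_gap) / math.log(per_period_gap)


if __name__ == "__main__":
    for years in (1.5, 3, 6):
        months = years * 12
        print(f"after {years} years the gap is about {gap(months):.0f}x")
```

Under these assumptions the gap itself compounds: it is 5x after one 18-month period and 25x after two, which is why buying faster CPUs on the usual schedule does not keep up.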

    --
    This post contains no rudeness or derision of any kind. All arguments are friendly. Terms and exclusions may apply.

    1. Re:A better article by Samantha Wright · · Score: 4, Informative

      ...countering this stunning and exciting revelation is BGI's stunning and exciting reputation for producing stunningly and excitingly low-quality raw data from said stunning and exciting second-generation sequencing machines. This is a little like the biology equivalent of being told that your least-favourite Slashdot editor (please pick just one) has just gotten a brain implant so he can spam the front page with dupes, typo-ridden summaries, and fallacy-laden opinion pieces ten times an hour.

      --
      Bio questions? Ask me to start a Q&A journal. Computer analogies available for most topics!

  6. Re:This article is almost painfully dumbed down... by Zakabog · · Score: 5, Informative

    The summary is pulled directly from the top of the article.

    Here's the article from HPC Wire and some details from nvidia as well as the nvidia press release

  7. Re:first by Pieroxy · · Score: 4, Insightful

    So, a site dedicated to nerds needs to explain what a GPU is? Are we not nerds anymore?

  8. Part of the problem is Low Standards by MaizeMan · · Score: 3, Informative

    Although at least in my field the problem is that no one ever thought to set lower limits on the quality of what you can call a genome. So now we get "genomes" made up of 100,000 contigs (many only a couple of hundred base pairs long) and even counting all of those, the total sequence might account for only 70% of the total size of the genome. But it's still a "genome" paper, which is still an instant ticket to Nature Genetics (or Nature Biotechnology if the assembly is REALLY bad).

    BGI is certainly one of the biggest offenders (Cucumber and Pigeonpea are both examples of the sort of terrible genomes-in-name-only BGI puts out) but I think the real problem is that Illumina sequence data is so cheap people keep trying to use it to sequence genomes, thinking if they throw enough raw data and enough mate-pair libraries at the problem it'll eventually make up for the fact that Illumina reads are so short. Illumina data is great for a lot of things: calling SNPs, measuring gene expression, studying methylation patterns.

    But, at least for any genome with significant transposon content, it simply does not work.

  9. For the curious... by Cow Jones · · Score: 5, Funny

    --
    Ah, arrogance and stupidity, all in the same package. How efficient of you. -- Londo Mollari