Slashdot Mirror


Chinese Lab Speeds Through Genome Processing With GPUs

Eric Smalley writes "The world's largest genome sequencing center once needed four days to analyze data describing a human genome. Now it needs just six hours. The trick is servers built with graphics chips — the sort of processors that were originally designed to draw images on your personal computer. They're called graphics processing units, or GPUs — a term coined by chip giant Nvidia. This fall, BGI — a mega lab headquartered in Shenzhen, China — switched to servers that use GPUs built by Nvidia, and this slashed its genome analysis time by more than an order of magnitude."

23 of 408 comments

  1. The Future Is Here!! by mastershake82 · · Score: 5, Funny

    Sounds like these newfangled "GPUs" are gonna change the world.

    1. Re:The Future Is Here!! by MollyB · · Score: 4, Insightful

      If one reads to page 2 of tfa, they only claim the technique works well in this instance. They go on:

      Even for computer-intensive aspects of analysis pipelines, GPUs aren’t necessarily the answer. “Not everything will accelerate well on a GPU, but enough will that this is a technology that cannot be ignored,” says Gollery. “The system of the future will not be some one-size-fits-all type of box, but rather a heterogeneous mix of CPUs, GPUs and FPGAs depending on the applications and the needs of the researcher.”

      and

      GPUs have cranked up the speed of genome sequencing analysis, but in the complicated and fast-moving field of genomics that doesn’t necessarily count as a breakthrough. “The game changing stuff,” says Trunnell, “is still on the horizon for this field.”

      So yes, the article is a bit breathless, but if utilizing GPUs helps cure my potentially impending genetic disorder, I'm all for it.

    2. Re:The Future Is Here!! by Stormthirst · · Score: 3, Insightful

      Unless you are fortunate enough to live in a civilised part of the world with a universal healthcare system.

  2. News for nerds by Anonymous Coward · · Score: 5, Funny

    I always wondered what GPUs are. Thanks Slashdot!

    1. Re:News for nerds by galanom · · Score: 5, Funny

      No, it's "Guinea Pig Units"

    2. Re:News for nerds by nevillethedevil · · Score: 3, Funny

      I thought it was 'Gnomes Processing Underpants' and that we finally had that elusive missing step

      1. Steal underpants
      2. Process underpants
      3. Profit

      --
      Be gone from my sight or prepare to feel my flaming wraith!

  3. Summary dumbed down enough for you? by Anonymous Coward · · Score: 3, Insightful

    Explaining what a GPU is in a slashdot summary? Come on.

    This is similar to someone telling you a story about something funny happening to them while shopping at the store, pausing mid-story to inform you that a 'store' is a business where goods are displayed and exchanged for a papery substance called 'money'.

  4. This article is almost painfully dumbed down... by tiffany352 · · Score: 2

    Submitter couldn't find a more technically-oriented one?

    1. Re:This article is almost painfully dumbed down... by gman003 · · Score: 2

      Hell, even the summary is condescending.

      This is Slashdot. You don't have to explain what a GPU is.

    2. Re:This article is almost painfully dumbed down... by Zakabog · · Score: 5, Informative

      The summary is pulled directly from the top of the article.

      Here's the article from HPC Wire and some details from nvidia as well as the nvidia press release

    3. Re:This article is almost painfully dumbed down... by heironymous · · Score: 2

      I agree, and it would be a better policy to define acronyms the first time they are used. The same could be said about the names of software packages in other summaries. I'm mystified that so many commenters are miffed that GPU is explained.

  5. A reminder by Mannfred · · Score: 5, Insightful

    It's hardly news that GPUs can be used to speed up parallel tasks/computations, but even so this article is a useful reminder of two things: 1) there are still many important processes that can be sped up by using GPUs, and 2) this can be achieved pretty much anywhere in the world.

    1. Re:A reminder by peragrin · · Score: 2

      The only reminder should be that processors designed for different types of math can do that math faster than processors designed for other types of math.

      I don't understand why companies don't realize that. Running graphics on a floating-point processor is like using a train to cross an ocean. Sure, you can do it, but that doesn't mean it's a good idea.

      --
      i thought once I was found, but it was only a dream.

    2. Re:A reminder by blahplusplus · · Score: 2

      "The only reminder should be that processors designed for different types of math can do that math faster than processors designed for other types of math."

      Not all kinds of math can be parallelized.
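
      To make the distinction concrete, here is a minimal sketch. Both functions and the sample inputs are illustrative assumptions, not anything taken from BGI's pipeline. Counting G/C bases splits cleanly into independent chunks, while iterating a recurrence needs each previous result before it can take the next step.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(chunk: str) -> int:
    """Count G and C bases in one chunk; no chunk depends on any other."""
    return sum(1 for base in chunk if base in "GC")


def parallel_gc(sequence: str, workers: int = 4) -> int:
    """Split the sequence into chunks, count each independently, then add up."""
    size = max(1, len(sequence) // workers)
    chunks = [sequence[i:i + size] for i in range(0, len(sequence), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_gc, chunks))


def logistic_orbit(x0: float, r: float, steps: int) -> float:
    """Iterate x -> r*x*(1-x); step n needs the result of step n-1,
    so this loop cannot simply be cut into independent chunks."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x
```

      The thread pool only shows the shape of the decomposition; in CPython it will not speed up this CPU-bound count. The same chunk-and-combine structure is what maps well onto many simple GPU cores.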

    3. Re:A reminder by the gnat · · Score: 2

      "I always wondered why FPGAs aren't used for this kind of stuff, or if they already are. I would imagine they would even be faster because you can design a circuit specifically optimized for the problem."

      I think they are to some degree, but there is a major barrier to adopting them: they require specialized programming knowledge which you won't find in most genomics centers. GPUs are commodity technology and APIs like CUDA are easier to tackle (and more transferable to other fields) than FPGA programming. (Or such was my impression - I know a lot about bioinformatics, but much less about FPGAs.)

      There is at least one company that sells hardware specially accelerated for bioinformatics, CLC bio. I don't know if they use FPGAs or some kind of ASIC.

  6. Wonder the speed for using AMD by witherstaff · · Score: 2

    I wonder whether AMD's approach of using more cores, where Nvidia uses faster ones, would change the time. I have no idea how genetic algorithms work. I do know that simple hashing workloads like Bitcoin run best on AMD.

  7. A better article by arielCo · · Score: 4, Informative

    http://hpcwire.com/hpcwire/2011-12-15/bgi_speeds_genome_analysis_with_gpus.html

    Excerpt:

    At BGI, he says, they are currently able to sequence 6 trillion base pairs per day and have a stored database totaling 20 PB.

    The data deluge problem stems from an imbalance between the DNA sequencing technology and computer technology. According to Dr. Wang, using second-generation sequencing machines, genomes can now be mapped 50,000 times faster than just a decade ago. The technology is on track to increase approximately 10-fold every 18 months. That is 5 times the rate of Moore's Law, and therein lies the problem.

    Obviously it would be impractical to upgrade one's computational infrastructure at that rate, so BGI has turned to NVIDIA GPUs to accelerate the analytics end of the workflow. The architecture of the GPU is particularly suitable for DNA data crunching, thanks to its many simple cores and its high memory bandwidth.
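
    The "5 times the rate of Moore's Law" figure in the excerpt can be checked with a short sketch. It assumes Moore's Law means a 2x gain every 18 months, which is one common reading and is not stated in the excerpt; the function names are made up for illustration.

```python
# Assumption: Moore's Law is read as a 2x gain every 18 months.
# The 10x figure for sequencing throughput comes from the excerpt above.
PERIOD_MONTHS = 18
SEQUENCING_FACTOR = 10.0
MOORE_FACTOR = 2.0


def growth(factor_per_period: float, months: float) -> float:
    """Compound growth after `months`, given `factor_per_period` every 18 months."""
    return factor_per_period ** (months / PERIOD_MONTHS)


def gap(months: float) -> float:
    """How far sequencing output has pulled ahead of compute after `months`."""
    return growth(SEQUENCING_FACTOR, months) / growth(MOORE_FACTOR, months)


# Ratio of the two per-period factors: the "5 times" in the excerpt.
per_period_ratio = SEQUENCING_FACTOR / MOORE_FACTOR
```

    Under that assumption the gap compounds: `gap(18)` is about 5 and `gap(36)` is about 25, which is why upgrading hardware at the usual pace cannot keep up.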

    --
    This post contains no rudeness or derision of any kind. All arguments are friendly. Terms and exclusions may apply.

    1. Re:A better article by Samantha Wright · · Score: 4, Informative

      ...countering this stunning and exciting revelation is BGI's stunning and exciting reputation for producing stunningly and excitingly low-quality raw data from said stunning and exciting second-generation sequencing machines. This is a little like the biology equivalent of being told that your least-favourite Slashdot editor (please pick just one) has just gotten a brain implant so he can spam the front page with dupes, typo-ridden summaries, and fallacy-laden opinion pieces ten times an hour.

      --
      Bio questions? Ask me to start a Q&A journal. Computer analogies available for most topics!

  8. Re:first by Pieroxy · · Score: 4, Insightful

    So, a site dedicated to nerds needs to explain what a GPU is? Are we not nerds anymore?

  9. Part of the problem is Low Standards by MaizeMan · · Score: 3, Informative

    Although at least in my field the problem is that no one ever thought to set lower limits on the quality of what you can call a genome. So now we get "genomes" made up of 100,000 contigs (many only a couple of hundred base pairs long) and even counting all of those, the total sequence might account for only 70% of the total size of the genome. But it's still a "genome" paper, which is still an instant ticket to Nature Genetics (or Nature Biotechnology if the assembly is REALLY bad).

    BGI is certainly one of the biggest offenders (Cucumber and Pigeonpea are both examples of the sort of terrible genomes-in-name-only BGI puts out) but I think the real problem is that Illumina sequence data is so cheap that people keep trying to use it to sequence genomes, thinking if they throw enough raw data and enough mate-pair libraries at the problem it'll eventually make up for the fact that Illumina reads are so short. Illumina data is great for a lot of things: calling SNPs, measuring gene expression, studying methylation patterns.

    But, at least for any genome with significant transposon content, it simply does not work.
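
    The kind of lower limit described above could be expressed with two simple numbers. This is a rough sketch only: the function names and the toy contig lengths are assumptions for illustration, and N50 is a commonly used assembly summary that the comment itself does not mention.

```python
def assembled_fraction(contig_lengths, genome_size):
    """Share of the estimated genome size covered by the assembled contigs."""
    return sum(contig_lengths) / genome_size


def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of the
    assembled bases. Many tiny contigs drag this value down."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly
```

    For a toy assembly with contigs of 80, 70, 50, 40, 30 and 20 bases against a 400-base genome, `assembled_fraction` gives 0.725 and `n50` gives 70. A journal or database could, in principle, set minimums on numbers like these before something is called a genome.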

  10. For the curious... by Cow Jones · · Score: 5, Funny

    --
    Ah, arrogance and stupidity, all in the same package. How efficient of you. -- Londo Mollari

  11. Re:The Answer Lies in Parallel Computation by the gnat · · Score: 2

    "the Chinese are picking up on the technology and on genomic data mining far faster and with more intensity than is the broader US tech community."

    You're forgetting that the vast majority of companies actually developing this technology, and making it available to consumers, are based in the US (and Britain, to some degree). One article about BGI that I read last year noted the irony of seeing several crates of sequencing machines stamped "MADE IN THE USA" waiting to be unloaded in Shenzhen. The Chinese government is certainly willing to spend large amounts of money advancing their capabilities, but I haven't seen any evidence that they're significantly surpassing the US in anything other than sequencing capacity. (And the machines they're using are very good for generating large quantities of data, but the quality of said data is somewhat suspect.)

    "Given the size of their brainpower base and the rate at which they are adapting the technology the Chinese are well on their way to dominating the drug development and physiological/functional genomic sciences in the next 10 years."

    Except that genomics has so far proven only minimally useful for drug development. Until they actually develop significant amounts of homegrown technology (which, to be fair, they are actually doing in the bioinformatics arena, as opposed to sequencing), I'm not convinced that they're that much of a threat. What they will certainly accomplish, I think, is a record of high-profile scientific output and the ability to compete on even terms with the rest of the industrial superpowers. No mean feat considering where they were 40 years ago, and certainly some cause for concern given their large and inexpensive labor force, but it's not the same thing as suddenly eclipsing the USA in technology that they're still mostly importing or stealing.

  12. Re:NOT coined by Nvidia by russotto · · Score: 2

    A search of Usenet reveals the Atari Jaguar had a unit called a "GPU" in 1993, considerably before NVIDIA's "first GPU" in 1999. The Amiga unit was also called a GPU.

    The term's generic, and NVIDIA knows it... they don't have it registered as a trademark.