IBM To Build 3-Petaflop Supercomputer

Posted by Soulskill on Monday December 13, 2010 @07:02PM from the onward-and-upward dept.

angry tapir writes "The global race for supercomputing power continues unabated: Germany's Bavarian Academy of Science has announced that it has contracted IBM to build a supercomputer that, when completed in 2012, will be able to execute up to 3 petaflops, potentially making it the world's most powerful supercomputer. To be called SuperMUC, the computer, which will be run by the Academy's Leibniz Supercomputing Centre in Garching, Germany, will be available for European researchers to use to probe the frontiers of medicine, astrophysics and other scientific disciplines."

Re:Not POWER7, Not BlueGene(BlueGene/Q) by Required+Snark · 2010-12-13 20:06 · Score: 3, Informative

Here is a look at the guts of the IBM next generation BlueGene/Q. http://www.theregister.co.uk/2010/11/22/ibm_blue_gene_q_super/page2.html

The Sequoia super that Lawrence Livermore will be getting in 2012 — IBM said it'd be in late 2011 back when the deal was announced in February 2009, so there's been some apparent slippage — will consist of 96 racks and will be rated at 20.13 petaflops. Argonne National Laboratory said back in August that it wanted a BlueGene/Q box, too, and it will have 48 racks of compute drawers for a total of 10 petaflops of floating-point power.

Neither the Chinese machine nor the German machine is a cutting-edge design. They represent what you can do with near-commodity hardware and good but not fully custom packaging. They may look like top-end machines today, but by 2012 they will not be in the top ten.

--
Why is Snark Required?
How should we measure supercomputers now? by Entropius · 2010-12-13 20:09 · Score: 3, Informative

Once upon a time, supercomputers were bunches of general-purpose CPUs, and you made them faster by connecting up more of them.
Now people have realized that massively parallel special-purpose chips (like Cell and, even more so, GPUs) can be used to do general-purpose computing, and have started to add those to clusters. But those chips have a lower bandwidth:flops ratio than the x86 and similar CPUs that have been historically used; the gap between a computer's "peak" FLOPS (on an ideal job with no communication requirements to either other nodes or to memory) and the performance it actually achieves is wider using something like CUDA than on a standard supercomputer. CUDA machines are so bandwidth-limited that people use rather harebrained data compression schemes to move data from place to place, just because all the nodes have extra compute power lying around anyway, and the bottleneck is in communication. (The example that comes to mind is sending the coefficients of the eight generators of an SU(3) matrix rather than just sending the eighteen floats that make up the damn matrix. It's a lot of work to reassemble, relatively speaking, but it's worth it to avoid sending a few bits down the wire.)
CUDA is wonderful, and my field at least (lattice QCD) is falling over itself trying to port stuff to it. Even though it falls far short of its theoretical FLOPS, it's still a hell of a lot faster than a supercomputer made of Opterons. But we shouldn't fool ourselves into thinking that you can accurately measure computer speed now by looking at peak FLOPS. It makes the CUDA/Cell machines look better than they really are.
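The gap between peak and achieved FLOPS described above can be sketched with a simple roofline-style estimate. This is a swapped-in illustration, not something from the comment itself: the function name `attainable_flops` and every number below are hypothetical, chosen only to show how a chip with a lower bandwidth:flops ratio can be faster in absolute terms while landing further from its own peak.

```python
def attainable_flops(peak_flops, bandwidth_bytes_per_s, flops_per_byte):
    """Roofline-style estimate: a job cannot run faster than the compute
    ceiling, nor faster than the rate at which data can be delivered."""
    return min(peak_flops, bandwidth_bytes_per_s * flops_per_byte)


# Hypothetical machines; these are NOT measurements of real hardware.
machines = {
    "cpu_node": {"peak": 100e9, "bandwidth": 25e9},
    "gpu_node": {"peak": 1000e9, "bandwidth": 150e9},
}

# Assumed arithmetic intensity of the job: 1 flop per byte moved.
intensity = 1.0

for name, m in machines.items():
    achieved = attainable_flops(m["peak"], m["bandwidth"], intensity)
    fraction = achieved / m["peak"]
    print(f"{name}: {achieved / 1e9:.0f} GFLOPS attainable, {fraction:.0%} of peak")
```

With these made-up numbers the GPU node attains six times the CPU node's throughput but only 15% of its own peak, against 25% for the CPU node, which is why peak FLOPS alone flatters the accelerator.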
1. Re:How should we measure supercomputers now? by afidel · 2010-12-13 20:54 · Score: 4, Informative
  
  No, the computers that the term supercomputer was coined for were all special-purpose vector machines that couldn't even run an OS; they had to be fronted by a management processor. Only much later were clusters of commodity machines (often with specialized interconnects for high bandwidth and low latency) accepted as contenders for the name. Now with Cell and GPUs we are getting back to fast vector machines with a management computer in the front, but now the front-end computer is capable of computations (at least in the case of the GPGPU machines) and each machine is a few rack units instead of a couple of racks.
  
  Oh, and the measure you are looking for is the ratio of Rmax to Rpeak, which will tell you how efficient the machine is (at least for LINPACK, which may or may not track with your own code depending on how chatty it is in comparison to the benchmark).
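As a rough sketch of that ratio: Rmax is the measured LINPACK result and Rpeak the theoretical peak, so Rmax / Rpeak gives the fraction of peak actually sustained on the benchmark. The helper name `linpack_efficiency` and the figures below are hypothetical, not taken from any published list.

```python
def linpack_efficiency(rmax, rpeak):
    """Fraction of theoretical peak (Rpeak) sustained on LINPACK (Rmax).
    Both arguments must use the same unit, e.g. petaflops."""
    if rpeak <= 0:
        raise ValueError("rpeak must be positive")
    return rmax / rpeak


# Hypothetical figures in petaflops, for illustration only.
systems = {
    "cpu_cluster": (1.8, 2.3),
    "gpu_cluster": (2.5, 4.7),
}

for name, (rmax, rpeak) in systems.items():
    print(f"{name}: {linpack_efficiency(rmax, rpeak):.1%} of peak")
```

In this invented comparison the GPU cluster posts the higher Rmax but sustains a smaller share of its Rpeak, which is the pattern the ratio is meant to expose.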
  
  --
  There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Re:Why I love Moore's law by Esvandiary · 2010-12-13 20:43 · Score: 3, Informative

I believe May's Law is the one you're referring to; a corollary to Moore's Law, stating that software efficiency halves every 18 months (or two years).
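Taken literally, the two laws cancel out. A toy calculation, assuming both effects are smooth exponentials with the same 18-month period (the function `net_speedup` and that simplification are mine, not part of either law's usual statement):

```python
def net_speedup(months, period_months=18.0):
    """Hardware doubling (Moore) times software efficiency halving (May),
    both modeled with the same period. The two factors cancel."""
    periods = months / period_months
    hardware_gain = 2.0 ** periods
    software_efficiency = 0.5 ** periods
    return hardware_gain * software_efficiency


for months in (18, 36, 120):
    print(f"after {months} months: net speedup x{net_speedup(months):.2f}")
```

Under that assumption the user-visible speed stays flat no matter how many years pass, which is the joke behind the corollary.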