Teraflop In A Box At SC2003
HPC Prophet writes "For those of you that can't go to SC2003 or can't afford the US$750 late registration, here is a small taste of what we put together for our friends at Mellanox Technologies...It benches out at over 1.2TFLOP (192 dual Intel Xeon Processor blades, 64 in a Rackable chassis, 128 crammed into a Ciara chassis and all connected via InfiniBand) and loaded up with Callident Rx (based on NPACI Rocks) OS/Middleware. Total estimated time to unpack, build and get up and running was 17 hours." Read on for some details on this power-hog.
"We had the highest power density for the smallest booth size they offer: 380 amps @ 208 V in 5U of rack space (look closely at the bottom of the middle rack containing all the cables and InfiniBand switches). Cooling was very nice too; we maxed out our Liebert HVAC when building it initially. Oh, by the way, this would end up somewhere in the neighborhood of #38 on the June 2003 Top500 list. There are a couple of other pictures on there too of some of the other attractions at SC2003, like the 128-node cluster that the NPACI folks will build in a 2-hour period. Sorry about the cheesy slide show, I had to be quick."
I thought Itanic was supposed to be wonderful according to Intel and HP. So why are they not promoting huge clusters of Itanics? Why are they talking teraflops with cheap and nasty Xeons? 32-bit Xeons?! Everyone else is 64-bit nowadays.
If you look at the more recent November 2003 list instead of the older June 2003 one, this cluster would rate more like #84 than #38.
/cj
I only wish the price of these things would slide down a little more.
Cost of this 1 teraflop Mellanox machine is less than US$1e6 according to this brochure.
That's considerably less than the US$50e6 that the first teraflop machine (Sandia's ASCI Red; see this SC1996 flier) cost 7 years ago.
I don't have a spare million, either, but that kind of 98% price reduction is still fairly impressive.
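For anyone who wants to check the arithmetic behind that 98% figure, here is a quick sketch. It only uses the two round numbers quoted above (US$50e6 and "less than US$1e6"), so treat the result as a lower bound on the reduction rather than an exact figure; the function name is just something made up for illustration.

```python
# Approximate costs in US dollars, as quoted in the comment above.
ASCI_RED_COST = 50e6   # first teraflop machine, circa 1996
CLUSTER_COST = 1e6     # upper bound quoted for the Mellanox cluster


def price_reduction(old_cost, new_cost):
    """Return the fractional drop in price going from old_cost to new_cost."""
    return (old_cost - new_cost) / old_cost


reduction = price_reduction(ASCI_RED_COST, CLUSTER_COST)
print(f"Price reduction: at least {reduction:.0%}")
```

Since the cluster is quoted as costing less than US$1e6, the real reduction would be a bit larger than what this prints.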
"Provided by the management for your protection."
It's not actually the speed that matters here. It's how well the applications are parallelized. Things like protein folding, most population modelling simulations, graphics rendering, etc., are -highly- parallel in nature, and run beautifully on clusters and large SMP machines (by large we're talking >32-way).
A really good example is the genomic search tool BLAST. The "stock" version from NIH isn't natively parallel; however, because it's available in source form, it's been modified to run in parallel...and it's -much- faster that way.
Basically, if your problem set can be broken into chunks and -then- worked on, you can make good use of any sort of parallel system. Clusters are really the "poor man's" way of parallelizing computation...they're also becoming the most prevalent -because- you get a lot of bang for your buck...think about it: Earth Simulator cost 8 figures to build, IIRC, to get 17 TFlops. Earth Simulator is a more traditional vector system, so it's -really- freaking good at certain operations...but it's also freakishly expensive to design and build.
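To make the "break it into chunks, then work on them" idea concrete, here is a minimal sketch in Python. It is a toy stand-in, not how BLAST or any parallel BLAST port actually works: the sequences, the motif, and the function names are all made up for illustration. It also uses a thread pool purely so the example stays self-contained; on a real cluster the same map step would presumably be farmed out to separate processes or nodes (e.g. over MPI) instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search pattern and data, invented for this example.
MOTIF = "GATT"
SEQUENCES = ["GATTACA", "CCGGTT", "TTGATTAC", "ACGT", "GATTGATT"]


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, roughly equal pieces."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_matches(chunk):
    """Per-chunk job: count the sequences in this chunk that contain MOTIF."""
    return sum(1 for seq in chunk if MOTIF in seq)


def parallel_count(sequences, executor, n_chunks):
    """Map count_matches over independent chunks, then combine the results."""
    chunks = split_into_chunks(sequences, n_chunks)
    return sum(executor.map(count_matches, chunks))


# Each chunk is independent of the others, which is what makes the
# problem easy to spread across workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = parallel_count(SEQUENCES, pool, n_chunks=4)

print(f"{total} of {len(SEQUENCES)} sequences contain {MOTIF}")
```

The important property is that no chunk needs anything from any other chunk until the final sum. Problems where the chunks have to talk to each other constantly are the ones where the interconnect (InfiniBand, in this cluster's case) starts to matter a lot more.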