Cray XT-3 Ships
anzha writes "Cray's XT-3 has shipped. Using AMD's Opteron processor, it scales to a total of 30,580 CPUs. The starting price is $2 million for a 200-processor system. One of its strongest advantages over the standard Linux cluster is that it has an excellent interconnect built by Cray. Sandia National Labs and Oak Ridge National Labs are among the very first customers. Read more here."
When you have a single CPU, designing the system to be pretty fast is easy. There's no major contention to deal with.
Two CPUs? Slightly harder, but reasonably straightforward. You don't see a 2x improvement in speed over one CPU, but it's around 1.95x, give or take a bit.
Four CPUs? Now you're starting to see less improvement ... probably around 3.2x, because of all the contention issues.
Sixty-four CPUs? You'll be lucky to get a 50x speed up over a single CPU.
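To put those rough figures side by side, here is a minimal sketch that turns each quoted speedup into a parallel efficiency (speedup divided by CPU count). The numbers are the ballpark estimates from this comment, not measurements, and the function and variable names are made up for illustration.

```python
# Speedup figures quoted in the comment above (rough estimates, not benchmarks).
quoted_speedups = {2: 1.95, 4: 3.2, 64: 50.0}


def parallel_efficiency(cpus, speedup):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cpus


for cpus, speedup in quoted_speedups.items():
    eff = parallel_efficiency(cpus, speedup)
    print(f"{cpus:>3} CPUs: {speedup:5.2f}x speedup, {eff:.1%} of linear")
```

Anything below 100% is time lost to contention and communication, which is the cost a better interconnect is supposed to buy back.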
When you get to 200 CPUs, the issue of access to shared memory and other shared resources becomes critically important. It's also an issue that most computer buyers don't need to worry about, because they don't have 200 CPUs in their system. This means that you have a lot of highly specialised research going on, and relatively few buyers to spread the cost of that research over.
Two million for a 200 CPU box which has low latency, low contention, and solid reliability is not a lot at all. You might not buy it. That doesn't mean nobody will.
What a value!!
That is, until you throw a tightly coupled problem at it and the Cray is 10 times faster because it has much better internode bandwidth and lower latency.
And did you forget to count the cost of the InfiniBand interconnect that the VT cluster used? That's a couple grand per node.
Bottom line: apples and oranges. If your application is easily parallelizable (i.e. doesn't require much communication between the nodes), you'd be stupid to piss away your money on a "real" supercomputer instead of a cluster. And vice versa.
Ah the joys of youth.
Back in my day we spelled "enuff" without the 'f' character and it was good enough for us.
Actually, there is no reason to cluster a few of these. If you have a 2000 node XT3 (or T3E, Paragon, Blue Gene, CM-5, insert mesh-structured MPP here) and a 4000 node XT3, you stick them together and make a 6000 node XT3. But that's just picking nits.
Curiously, the XT3 IS about shaving dollars off the price. If you go read the original whitepapers on the system, they go through EXTENSIVE cost-return analysis. They studied their (then-)current generation of cluster systems, as well as future Linux/Solaris/AIX clusters, and rejected them as (interestingly) FAR TOO EXPENSIVE once the administrative costs are factored in. They then looked at, and rejected, Cray's vector solution, the X1. They then decided that the (amazingly) most cost-effective solution was to underwrite Cray's product development cycle on a wholly new product. Basically they asked for an update to the system they already had (ASCI Red, i.e. Intel Paragon++). Nobody was building such a thing. Since Cray had a really strong similar product in the 90s (T3D, T3E), the Department of Energy asked them to create an update. Some designs never die.
What I'm most interested in is the reliability. One of the biggest difficulties in the T3D engineering cycle was dealing with memory failure. Red Storm is going to have 10,000 processors. Let's assume each has 2 banks times 3 DIMMs (chip-kill) of memory, with 18 chips per DIMM. That means there are 10,000 x 6 x 18 = 1 million+ memory chips in the system. If 1/100th of a percent of these fail, that's still a lot of memory failures. How well are faults isolated? That's the big question for systems this big.
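For anyone who wants to check the back-of-the-envelope math, here is a quick sketch. All the inputs are this comment's assumptions (2 banks of 3 DIMMs per processor, 18 chips per DIMM, a 1/100th-of-a-percent failure fraction with no time period attached), not published Red Storm specs, and the variable names are made up.

```python
# Assumed figures from the comment above, not official Red Storm numbers.
processors = 10_000
dimms_per_processor = 2 * 3      # 2 banks times 3 DIMMs
chips_per_dimm = 18              # assumed chip-kill DIMM layout
failure_fraction = 0.01 / 100    # 1/100th of a percent

total_chips = processors * dimms_per_processor * chips_per_dimm
expected_failures = total_chips * failure_fraction

print(f"memory chips in the system: {total_chips:,}")
print(f"chips failing at that rate: {expected_failures:.0f}")
```

Under those assumptions that is a bit over a million chips and on the order of a hundred failed chips, which is why fault isolation matters so much at this scale.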
I'm also a little wary of Cray's use of Lustre. I've used Lustre before, as well as other cluster filesystems. While I'm not aware of other filesystems that will scale to 700+ I/O nodes, I'm not confident in Lustre. It's an immature product at best. (I don't mean to disparage the people working on it. It's a neat architecture, but it's a hard problem, and I'm not sure it's ready for prime time.)