Cray XT-3 Ships


Posted by ryuzaki0 on Monday October 25, 2004 @07:32PM from the and-they-mean-ships dept.

anzha writes "Cray's XT-3 has shipped. Using AMD's Opteron processor, it scales to a total of 30,580 CPUs. The starting price is $2 million for a 200-processor system. One of its strongest advantages over the standard Linux cluster is that it has an excellent interconnect built by Cray. Sandia National Labs and Oak Ridge National Labs are among the very first customers. Read more here."

Re:How big is it? by Anonymous Coward · 2004-10-25 19:41 · Score: 4, Informative

Dimensions (cabinet): H 80.50 in. (2045 mm) x W 22.50 in. (572 mm) x D 56.75 in. (1441 mm)

Weight (maximum): 1529 lbs per cabinet (694 kg)

http://www.cray.com/products/xt3/specifications.html
Just the name brings back memories by Dancin_Santa · 2004-10-25 19:42 · Score: 3, Informative

In this day and age of very fast computers and clusters built in our basements, there sometimes comes along a story that whispers of the computing age of days long past. Cray is one of those names that can drop a jaw by its mere utterance.

The name is synonymous with speed and power and the unwillingness to cut corners in order to shave a few dollars off the final product. When you buy a Cray, you know you are getting top of the line hardware.

It looks like Sandia wants to build the fastest supercomputer in the world by clustering a few of these monsters, and I have no doubt that they will. Looks like more fun articles about this in the future. :-D

There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that. The other application is in weather prediction. By feeding current weather variables into a well-written model, a supercomputer is able to predict the future weather to a large degree of accuracy. Such an application will always be welcome.

I think I'm going to have to fire up the old ][e, the nostalgia is killing me!
1. Re:Just the name brings back memories by joib · 2004-10-25 21:59 · Score: 4, Informative
  
  There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that. The other application is in weather prediction.
  
  Oh, please. Buy a clue, will ya? There are lots and lots and lots of applications that use supercomputers, or could use them if they were more affordable. A few examples off the top of my head:
  
  Materials science, that is ab initio simulations, moldyn, you name it. This alone probably uses > 50 % of all supercomputer cpu time in the world. By comparison, weather prediction and nuke simulations are small potatoes (or shall we say, the simulations as such are big, but the number of people engaged in weather prediction or nuke simulation is really small compared to all the supercomputing materials scientists).
  
  CFD, the automobile and aerospace sectors are big users.
  
  Electronic design.
  
  Seismic surveys, the oil industry uses lots and lots of supercomputers to find oil deposits.
  
  Biology. Gene sequencing, moldyn simulations of lipid layers and whatever.
  
  Climate prediction, somewhat related to weather prediction. Official purpose of the Earth Simulator.
  
  All of the examples above could easily use almost any amount of cpu power you can throw at them. The only thing that stands between a lot of scientists and improved understanding of the world is computing power.
Re:real FPU operations by jmv · 2004-10-25 19:44 · Score: 4, Informative

Opterons beat the pants off the Pentium 4s in x87 (i.e. old) FPU operations. If you want to get good performance, you need SSE/SSE2, both for AMD and Intel. For pure SSE, the Pentium 4s beat the Opterons mainly because of the clock speed, but for multi-processor systems, HyperTransport and the rest more than make up for that.

--
Opus: the Swiss army knife of audio codec
You don't have to begin to imagine by commodoresloat · 2004-10-25 19:47 · Score: 3, Informative

You could just read on the spec page:

Power: 14.8 kVA (14.5 kW) per cabinet
Circuit Requirement: 80 AMP at 200/208 VAC (3 Phase & Ground), 63 AMP at 400 VAC (3 Phase, Neutral & Ground)
Cooling Requirement: Air Cooled
Air Flow: 3000 cfm (1.41 m³/s), Intake: bottom, Exhaust: top
1. Re:You don't have to begin to imagine by wronskyMan · 2004-10-25 21:46 · Score: 5, Informative
  
  Disclaimer: IANACEBIATAPEC (I Am Not A Cray Engineer But I Am Taking A Power Engineering Course)
  It's fairly common for kVA != kW.
  Overall power used by a load is expressed as S=P+jQ, where P is the "real" power and Q is the reactive power (capacitive/inductive from motors, fluorescent lamp ballasts, etc).
  
  While the "units" of S, P, and Q are power=voltage*current, S is generally expressed in VA, P in W, and Q in VAR (volt-ampere reactive) to differentiate the variables. Because the magnitude of S = sqrt(P^2 + Q^2), S will always be greater than or equal to P (in this case, 14.8 kVA = sqrt((14.5 kW)^2 + (+-2.965 kVAR)^2)).
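As a rough check of the arithmetic above, here is a minimal Python sketch that recovers the reactive power and the power factor from the two spec-sheet figures. The function names are made up for illustration, and the sign of Q (inductive vs. capacitive) cannot be determined from the magnitudes alone.

```python
import math


def reactive_power_kvar(apparent_kva: float, real_kw: float) -> float:
    """Return |Q| in kVAR from |S|^2 = P^2 + Q^2 (sign of Q is unknown)."""
    if real_kw > apparent_kva:
        raise ValueError("real power cannot exceed apparent power")
    return math.sqrt(apparent_kva ** 2 - real_kw ** 2)


def power_factor(apparent_kva: float, real_kw: float) -> float:
    """Return the power factor P / |S|."""
    return real_kw / apparent_kva


# Per-cabinet figures quoted from the spec page.
q = reactive_power_kvar(14.8, 14.5)
pf = power_factor(14.8, 14.5)
print(f"|Q| = {q:.3f} kVAR, power factor = {pf:.3f}")
```

The power factor comes out a little under 1, which is what you would expect from a load that is mostly power supplies rather than big motors.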
  
  --
  --- You shall know the truth, and the truth shall make you mad- Neal (not Cowboy) Boortz
Re:software by Coryoth · 2004-10-25 20:04 · Score: 5, Informative

what kind of operation system runs on this beast?

UNICOS is usually a safe bet. In this case the specs say UNICOS/lc, which is made up of "SUSE(TM) Linux(TM), Cray Catamount Microkernel, CRMS and SMW software"

I'm not entirely clear how to interpret that, but I think it runs as follows: It runs the Catamount Microkernel as the kernel, and uses SUSE for everything else (so we have SUSE Linux, without the Linux - all of a sudden that GNU/Linux stuff starts to make sense). The CRMS is their interconnect management and monitoring software, and SMW is the System Management Workstation - which I'm guessing is their administration frontend.

It's worth noting that that's some pretty serious software there (because Cray has a lot of experience dealing with large systems) - you can bet that the management and monitoring software is some very serious stuff.

This thing is to a Beowulf cluster what a dual G5 PowerMac is to a homebuilt PC system running Linux From Scratch. It's going to work flawlessly "out of the box" with a smooth and polished interface that lets you get everything you want done simply and easily. You can of course make your homebuilt PC with LFS work just as well, it's just going to take you an awful lot of effort.

Jedidiah.

--
Craft Beer Programming T-shirts
Re:So......the cost compared to? by Coryoth · 2004-10-25 20:16 · Score: 4, Informative

So, how does this compare to running Apple's Xserve? Bang per buck? Heat? Space? Etc etc....

There's not a lot to compare. We're talking apples and oranges. It's like asking to compare a PowerMac G5 with a bunch of PC parts scattered on the floor as desktop machines. Sure, you can put the PC together, load it with Linux, tinker with it to get everything working, etc. but that's a fair amount of work compared to taking the PowerMac out of the box, plugging it in, turning it on, and having everything work perfectly.

Read the specs, particularly with regard to the interconnect, system administration, and hardware and software reliability features. This thing is seriously engineered to be a massively parallel system with top of the line hardware and software to support and maintain that, as well as extremely impressive reliability features.

Jedidiah.

--
Craft Beer Programming T-shirts
Re:MP performance overhead by Big+Mark · 2004-10-25 20:39 · Score: 3, Informative

If Crays were built the same way as desktop dual-proc machines, then yes, the multi-CPU overhead would cripple it. Fortunately, it's designed completely differently - e.g. they use PowerPC chips to handle almost all of the inter-processor communication.

You can't really compare something that can hold thousands of CPUs to something powered by Abit that can hold two, anyway. It's like comparing apples and a strange bug thing with tentacles.
Re:imagine a... by crimsun · 2004-10-26 00:35 · Score: 4, Informative

It's not just hardware: the amount of non-parallelizable code in parallel applications impacts scalability most tremendously.

The upper bound on speedup is generally given by Amdahl's law. Plainly, for a fixed problem size the efficiency approaches zero as the number of processes is increased. Generally we consider the major sources of overhead to be communication, idle time, and extra computation. Interprocess communication is considered negligible for serial programs in this context (we consider message passing), so in a parallel program it is pure overhead. Idle time ends up contributing to overhead, because processes idle awaiting information from others. Extra computation is virtually unavoidable at some point; for instance in MPI's Single Program Multiple Data model, each process in tree-structured communication other than the root is eventually idled prior to the completion of computation, and each process determines IPC at some point based on rank.
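To put numbers on that, here is a small Python sketch of the textbook form of Amdahl's law, speedup = 1 / (s + (1 - s) / n), where s is the serial fraction and n the number of processes. The 5 % serial fraction and the process counts are arbitrary values chosen for illustration, and the model ignores the communication and idle-time overheads mentioned above, so real codes do worse.

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Ideal speedup on n processes for a fixed problem size."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def efficiency(serial_fraction: float, n: int) -> float:
    """Speedup per process; tends to zero as n grows when serial_fraction > 0."""
    return amdahl_speedup(serial_fraction, n) / n


# Illustrative values only: 5 % of the work cannot be parallelized.
for n in (1, 16, 256, 4096, 30000):
    s = amdahl_speedup(0.05, n)
    e = efficiency(0.05, n)
    print(f"n={n:6d}  speedup={s:7.2f}  efficiency={e:.4f}")
```

With any nonzero serial fraction s, the speedup can never exceed 1/s no matter how many CPUs you add, which is why the efficiency column collapses.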

There are notable exceptions to Amdahl's law, however; Gustafson, Montry and Benner wrote about such in Development of parallel methods for a 1024-processor hypercube, SIAM Journal on Scientific and Statistical Computing 9(4):609-638, 1988.
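For contrast, here is a sketch of the scaled-speedup formula commonly attributed to Gustafson, which I am assuming is the kind of exception meant here: if the problem size grows with the machine so that the serial part stays a fixed fraction s of the parallel run time, the speedup is s + (1 - s) * n. The function name and the 5 % figure are illustrative, not taken from the paper.

```python
def scaled_speedup(serial_fraction: float, n: int) -> float:
    """Scaled speedup when the parallel work grows in proportion to n.

    serial_fraction is measured on the parallel run, not on the serial one.
    """
    return serial_fraction + (1.0 - serial_fraction) * n


# Illustrative values only: same 5 % serial fraction, problem grows with n.
for n in (1, 16, 256, 1024):
    print(f"n={n:5d}  scaled speedup={scaled_speedup(0.05, n):8.2f}")
```

The difference from Amdahl is purely in what is held fixed: Amdahl fixes the problem size, while this fixes the run time and lets the problem grow, which is closer to how big machines actually get used.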
hybrid system with multiple kernels by Dink+Paisy · 2004-10-26 01:01 · Score: 3, Informative

From the documents, it looks like it runs Linux on the management nodes and Catamount on the compute nodes. The idea is you can do what you like with the general purpose nodes, but for the compute nodes, you run a lightweight operating system that has low overhead, minimal services and predictable scheduling. BlueGene/L works the same way; it runs Linux on the management nodes and a custom operating system on the compute nodes. Compute nodes likely provide scheduling for only the number of threads that run on the node, communication through MPI and some proprietary API, and basic debugging facilities. Compute nodes probably lack normal OS services like network, disk, or even a console.

--

Whoever corrects a mocker invites insult;
whoever rebukes a wicked man incurs abuse.
--Proverbs 9:7
Re:software by flaming-opus · 2004-10-26 03:00 · Score: 3, Informative

This split microkernel architecture has been in use for a long time on big MPP systems like the Paragon and the T3E. The software base (Catamount/Linux) is new, but the design is old.

Catamount is the kernel that runs on the compute nodes. It's a tiny kernel that packages up the OS service requests and sends them, over the interconnect, to an OS or I/O node, which does the real work of the operating system. Catamount is a descendant of Cougar, which came from PUMA. These are heavily derived from work done at Caltech. (I believe CMU and one of the UTexas schools also played a role, but am not sure.) The idea is that the microkernel is small and unobtrusive, and it gets the hell out of the way so the application can use the CPU as much as is possible.

The OS and I/O nodes run Linux, and provide services to the compute nodes. This is probably done in the kernel, but it could just as easily be running as a user-space daemon on the OS node. (Though you might have to do some memory copies that way, which would lower performance.)
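The forwarding idea can be sketched in a few lines of Python. This is a toy model only, not Catamount's actual interface: the class and function names are hypothetical, two in-process queues stand in for the interconnect, and a dict stands in for the file system on the service node.

```python
import queue
import threading


def service_node(requests: queue.Queue, replies: queue.Queue) -> None:
    """Full-OS side: do the real work for each forwarded request."""
    files = {}  # toy in-memory stand-in for a real file system
    while True:
        msg = requests.get()
        if msg is None:  # shutdown marker
            break
        op, path, data = msg
        if op == "write":
            files[path] = files.get(path, b"") + data
            replies.put(len(data))
        elif op == "read":
            replies.put(files.get(path, b""))
        else:
            replies.put(None)  # unknown request


class ComputeNodeStub:
    """Compute-node side: package the request, send it, wait for the reply."""

    def __init__(self, requests: queue.Queue, replies: queue.Queue) -> None:
        self._requests = requests
        self._replies = replies

    def write(self, path: str, data: bytes) -> int:
        self._requests.put(("write", path, data))
        return self._replies.get()

    def read(self, path: str) -> bytes:
        self._requests.put(("read", path, b""))
        return self._replies.get()


def demo():
    requests, replies = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=service_node, args=(requests, replies))
    worker.start()
    stub = ComputeNodeStub(requests, replies)
    written = stub.write("/scratch/out", b"hello")
    content = stub.read("/scratch/out")
    requests.put(None)  # tell the service node to stop
    worker.join()
    return written, content


print(demo())
```

The point of the split is visible even in the toy: the compute side holds no file-system state and does nothing but package, send, and wait, so almost all of its cycles stay with the application.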

NOTE: Though these nodes take advantage of some of Linux's features (like the Lustre file system) they do NOT necessarily implement these features for the system as a whole. They probably provide a minimal set of features necessary for the sorts of problems that the XT3 runs. All the scheduling work that has gone into more recent Linux kernels is of little use, as the compute nodes have their own scheduler, probably more closely tied to the batch dispatcher than to the Linux kernel. To say that the system runs Linux is true, but a little misleading. It's a very different Linux than what runs on my desktop, and it's used in a very different way.