Cray XT-3 Ships
anzha writes "Cray's XT-3 has shipped. Using AMD's Opteron processor, it scales to a total of 30,580 CPUs. The starting price is $2 million for a 200-processor system. One of its strongest advantages over the standard Linux cluster is that it has an excellent interconnect built by Cray. Sandia National Labs and Oak Ridge National Labs are among the very first customers. Read more here."
single node of those.
I read the article (okay, so I kinda read it :-) ) and it has the speed and specs to be a geek's improvement on sliced bread. But how big is it, physically?
The article doesn't appear to mention its dimensions, and I'm curious to know what kind of space you need to install this baby. Anyone got any idea?
There is no spoon
This is only the XT-3. I'll wait for the Pentium-3-4.
Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
A few more years of advances like this and we might have a machine capable of running Longhorn!
or else!
I can't believe people complain about the price of iMacs....
How are the Opterons at standard FPU operations in double precision? SSE2 and friends are nice, unless you have to make compromises in your simulations.
I ask because I remember that the Athlons beat the pants off the Pentium 4s in FPU operations, so all the benchmarks were rewritten to use SSE2.
In this day and age of very fast computers and clusters built in our basements, there sometimes comes along a story that whispers of the computing age of days long past. Cray is one of those names that can drop a jaw at its mere utterance.
:-D
The name is synonymous with speed and power and the unwillingness to cut corners in order to shave a few dollars off the final product. When you buy a Cray, you know you are getting top of the line hardware.
It looks like Sandia wants to build the fastest supercomputer in the world by clustering a few of these monsters, and I have no doubt that they will. Looks like more fun articles about this in the future.
There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point of that. The other application is weather prediction. By feeding current weather variables into a well-written model, a supercomputer is able to predict the future weather with a high degree of accuracy. Such an application will always be welcome.
I think I'm going to have to fire up the old ][e, the nostalgia is killing me!
from TFA -
Dimensions (cabinet):
H 80.50 in. (2045 mm) x W 22.50 in. (572 mm) x D 56.75 in. (1441 mm)
Sorry to reply twice but I forgot this detail.
You could just read on the spec page: Power: 14.8 kVA (14.5 kW) per cabinet. Circuit Requirement: 80 A at 200/208 VAC (3 Phase & Ground), 63 A at 400 VAC (3 Phase, Neutral & Ground). Cooling Requirement: Air Cooled. Air Flow: 3000 cfm (1.41 m³/s). Intake: bottom. Exhaust: top.
It seems that the XT-3 not only uses Opteron processors but also uses PowerPC 440 co-processors from IBM to offload inter-processor communication from the main computing CPUs. Quite an interesting setup.
The XT-3's biggest competitor in this segment must be the Blue Gene/L type supercomputer made by IBM. The processor in Blue Gene/L is a custom-built dual-core version of the PowerPC 440 with built-in high-speed interconnects.
Just as IBM has a finger in all the future game consoles, it seems to have a finger in several of the next-generation supercomputers too. Nice going, IBM.
- Henrik
- when the Shadows descend -
Xserve clusters would be cheaper, but I think that Cray has the edge in the interconnect tech. So, if you need massive bandwidth in the system, get the Cray. If you need the next best bandwidth at a low price, get the Xserve cluster.
I am the Alpha and the Omega-3
It seems like Cray is not capable of sustaining its heritage. Buying cheap AMD processors and connecting them with a customized HT interconnect is not enough to build a machine capable of the record-breaking single-task performance that old Crays exhibited. Whereas with the Cray X-MP one could be sure of having the best machine money can buy (with outstanding scalar and vector abilities), the new Cray is just another loosely coupled AMD cluster. Thank god it's not a NEC clone (at least).
Strangely, it took roughly a week. The second test was a simulation of the moderation results of this post.
It received a +5 Funny, which puzzled researchers, as it is currently modded -1 Offtopic.
Damn you Schroedinger!
What kind of operating system runs on this beast?
UNICOS is usually a safe bet. In this case the specs say UNICOS/lc, which is made up of "SUSE(TM) Linux(TM), Cray Catamount Microkernel, CRMS and SMW software"
I'm not entirely clear how to interpret that, but I think it runs as follows: It runs the Catamount Microkernel as the kernel, and uses SUSE for everything else (so we have SUSE Linux, without the Linux - all of a sudden that GNU/Linux stuff starts to make sense). The CRMS is their interconnect management and monitoring software, and SMW is the System Management Workstation - which I'm guessing is their administration frontend.
It's worth noting that that's some pretty serious software there (because Cray has a lot of experience dealing with large systems) - you can bet that the management and monitoring software is some very serious stuff.
This thing is to a Beowulf cluster what a dual G5 PowerMac is to a homebuilt PC system running Linux From Scratch. It's going to work flawlessly "out of the box" with a smooth and polished interface that lets you get everything you want done simply and easily. You can of course make your homebuilt PC with LFS work just as well, it's just going to take you an awful lot of effort.
Jedidiah.
Craft Beer Programming T-shirts
So, how does this compare to running Apple's Xserve? Bang per buck? Heat? Space? Etc etc....
There's not a lot to compare. We're talking apples and oranges. It's like asking to compare a PowerMac G5 with a bunch of PC parts scattered on the floor as desktop machines. Sure, you can put the PC together, load it with Linux, tinker with it to get everything working, etc. but that's a fair amount of work compared to taking the PowerMac out of the box, plugging it in, turning it on, and having everything work perfectly.
Read the specs, particularly with regard to the interconnect, system administration, and hardware and software reliability features. This thing is seriously engineered to be a massively parallel system with top of the line hardware and software to support and maintain that, as well as extremely impressive reliability features.
Jedidiah.
So with 96 processors per cabinet, AMD gets about $144K per cabinet at $1500 per CPU, or does Cray get a discount?
Also, with a 30,000-CPU complex, AMD must be making a tidy sum.
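For anyone who wants to play with those numbers, here is a quick sketch. The $1500 unit price is only the guess made above (no volume discount assumed), and `cpu_revenue` is a made-up helper name, not anything from Cray or AMD.

```python
# Rough sketch of AMD's processor revenue. The unit price is a guess
# taken from the comment above, not a quoted figure.
ASSUMED_PRICE_PER_CPU = 1_500  # dollars per Opteron, no discount assumed


def cpu_revenue(cpu_count: int, price_per_cpu: int = ASSUMED_PRICE_PER_CPU) -> int:
    """Return total processor spend in dollars for a given CPU count."""
    return cpu_count * price_per_cpu


print(cpu_revenue(96))      # 144000   (the 96-processor case)
print(cpu_revenue(30_000))  # 45000000 (a 30,000-CPU installation)
```

Change the default price to see how much a discount would eat into that tidy sum.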
If Crays were built the same way as desktop dual-proc machines, then yes, the multi-CPU overhead would cripple it. Fortunately, it's designed completely differently - e.g. they use PowerPC chips to handle almost all of the inter-processor communication.
You can't really compare something that can hold thousands of CPUs to something powered by Abit that can hold two, anyway. It's like comparing apples and a strange bug thing with tentacles.
From their tech sheet, they are using the Lustre file system.
This is the first time I have seen a shipping Linux with this file system. Now the interesting part is that Lustre is made for Linux clusters, but this monster is not a cluster... can anybody shed some light?
The lunatic is in my head
Maybe if you included promises of free iPods...
Cray never went "belly up". It was acquired by SGI in 1996, then divested in 2000 and merged with Tera, which renamed the resulting entity "Cray Inc."
Although it's true that Cray was not growing strongly before the SGI buy-out, it was not failing either. It could have kept running quite happily for many years, but in the bizarro-world of Wall Street, a company which is not growing is dying. I so love it when economists use biological terminology for corporations. In Wall Street's thinking, the only healthy growth would be a cancerous tumor.
Anyway....
The whole SGI period of Cray is actually quite fascinating, and I suspect the true story will never be fully known. Lots of SGI engineers had their non-Cray technology branded with Cray marketing names, most egregiously LegoNet becoming CrayLink. Lots of Cray folks - aka. Crayons - felt that the core of their company was gutted by an SGI operation which didn't care for the extreme high-ends of HPC.
One rumor I heard, from a well-placed source, is that the Cray merger with SGI was primarily arranged by the USG. The intelligence services have huge investments in both companies' products, so the merger between them made sense. I was told that as a quid pro quo, the USG had an in-principle agreement to continue purchasing Cray gear to provide enough revenue inside SGI to keep both Cray architectures alive. However, certain parts of SGI felt that the US government didn't live up to its agreement, negotiations to rectify that weren't successful, and so SGI management defunded significant aspects of the Cray engineering work.
Also, FYI, Cray is one of those companies which will never totally go "belly up" anyway. Given the sensitivity of the work which they did, their support databases alone are full of sensitive and/or classified information. Should the company cease trading, it would be acquired by a shelf company whose sole function is to ensure this data would remain private. That's been the fate of almost all of the now-defunct supercomputer and high-end graphics companies who formerly supplied the defence and intelligence market.
I admire your positive outlook on the prospects of simulations, but as an experimentalist, I find this "soon we won't need experiments at all" (see Rev. Mod. Phys. 64, 1045-1097 (1992), for instance) attitude very dangerous. Simulations and models, even at the first principles level, should never be trusted implicitly. The only sure way to tell how nature works is via experimentation.
I can sort of understand simulating nuclear explosions, but simulating the aging process of a warhead doesn't make that much sense to me - unless the simulations are accompanied by direct observation of the (accelerated) aging of a warhead.
The owls are not what they seem
Ha - well I'm sure the guy behind CherryOS will have a press release that it runs The Mac OS at 30 Terahertz.
Yell & scream & rant & rave... it's no use... you need a shaaaave ~ Bugs Bunny
It seems to be really lacking in the blinkenlights department though.
What good is a supercomputer without blinkenlights? They just don't make them like they used to...
May contain traces of nut.
Made from the freshest electrons.
You say it's comparing Apples to Oranges but it's not really ...
The VT Supercomputer specs vs the Cray specs page you pointed to:
CRAY 460 GFLOPS per cabinet (96 processors @ 2.4 GHz)
Apple - if my math is right - 420 GFLOPS (100 processors @ 2.0 GHz)
The new specs for the specialized VT Supercluster are pretty impressive.
Their throughput and interconnect are most likely weaker - but still VERY strong with Fibre Channel.
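For reference, per-cabinet numbers like these are usually theoretical peaks: processors x clock x floating-point operations per cycle. A minimal sketch follows, assuming 2 double-precision operations per cycle for the Opteron, which happens to reproduce the 460 GFLOPS quoted for the Cray cabinet. `peak_gflops` is just an illustrative name. The per-cycle rate for the Apple nodes is left as a parameter, since the 420 figure depends on which rate (or which measured efficiency) you plug in.

```python
def peak_gflops(processors: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: every processor retires flops_per_cycle on every clock."""
    return processors * clock_ghz * flops_per_cycle


# Assumption: 2 double-precision floating-point operations per cycle per Opteron.
cray_cabinet = peak_gflops(processors=96, clock_ghz=2.4, flops_per_cycle=2)
print(f"{cray_cabinet:.1f} GFLOPS")  # 460.8 GFLOPS
```

Peak numbers say nothing about interconnect bandwidth or latency, so they are only the starting point of a comparison.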
What a value!!
That is, until you throw a tightly coupled problem at it and the Cray is 10 times faster because it has much better internode bandwidth and lower latency.
And did you forget to count the cost of the InfiniBand interconnect that the VT cluster used? That's a couple grand per node.
Bottom line, apples and oranges. If your application is easily parallelizable (i.e. doesn't require much communication between the nodes) you'd be stupid to piss away your money on a "real" supercomputer instead of a cluster. And vice versa.
There's not a lot to compare. We're talking apples and oranges.
;-)
No, we're talking Apples and Crays... Didn't you read the post before replying?
...Sadly I think that beats my Volkswagen on all three
Now we'll be able to install Gentoo in just a few days!
The real problem that stands between scientists and them having lots of shiny toys is funding.
E.g., yeah, having a 30,000 CPU super-computer to simulate your gene model on would be nice. Forking over half a billion for it, well, it's suddenly not that nice any more.
Having one of those to simulate an electronic circuit, now that would probably rock. Again, paying half a billion for it, suddenly isn't that attractive.
The real question isn't how nice a toy you'd like to have, it's ROI. (Unless you work for the government, and just have a budget you _have_ to blow on stuff, whether you need that stuff or not.)
And in that context, you'd be surprised what you _can_ do with a lot less expensive toys.
Having Cray's custom interconnects sure is impressive, but for a lot of problems they're not even needed any more. _That_ is what killed Cray.
Most RL problems are not really the kind described as "_one_ huge indivisible data set, that you have to process in _one_ huge batch process." They're more like "we have this process with a small data set that we have to run 100,000,000 times." Most design problems or biology problems are really of that kind: run the same thing 100,000,000 times with different parameters.
And as Seti@Home or Folding@Home proved, a helluva lot of those don't really need _any_ kind of shared memory or fancy interconnects. The real ticket is noting that instead of accelerating the batch run 200 times, you could just split it into 200 smaller batches run on 200 single-CPU machines.
The super-computer solution costs 2,000,000 just for the machine alone, while the 200 PCs solution costs 200,000 or so. I.e., 10 times cheaper. Better yet, the 200 PCs solution is also far cheaper to program. (Anyone can program a non-threaded batch app.) _And_ for that kind of a problem the 200 PCs solution would actually finish faster, since it has no contention issues whatsoever.
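To make the "split it into 200 smaller batches" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `simulate` stands in for one small independent model run, and a handful of worker threads stand in for the separate single-CPU machines. The point is only that no batch ever needs to talk to another one.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(parameter: int) -> int:
    """Hypothetical stand-in for one small, independent model run."""
    return parameter * parameter


def split_into_batches(items, batch_count):
    """Deal items round-robin into batch_count independent batches."""
    return [items[i::batch_count] for i in range(batch_count)]


def run_batch(batch):
    """Run one batch start to finish; it never talks to any other batch."""
    return [(p, simulate(p)) for p in batch]


def run_sweep(parameters, machines=200):
    """Run the whole parameter sweep as `machines` independent batches."""
    batches = split_into_batches(list(parameters), machines)
    results = {}
    # A few worker threads stand in for the separate machines here.
    with ThreadPoolExecutor(max_workers=min(machines, 16)) as pool:
        for batch_result in pool.map(run_batch, batches):
            results.update(batch_result)
    return results


results = run_sweep(range(1000), machines=200)
print(len(results))  # 1000
```

On a real farm each batch would simply be a separate job on a separate box, with the results collected afterwards.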
Again, that's what really killed Cray and the super-computers. They're technologically impressive, they're a geek's wet dream, but... for 99.9% of the problems out there they're just not worth the price any more.
A polar bear is a cartesian bear after a coordinate transform.
So come on, ante up. How many remember being awed at the mere sight of old Crays back in the day? Like the Cray-3? I remember the first time I saw a Cray .... thing was in an anti-static environment. To access it, one had to pass through an airlock and be "decharged" or "depolarized" etc. Basically they somehow charged the air to get rid of static electricity. Then you had this system that was running *in* liquid! Take that, "Oh I'm so cool cause I have a l337 haX0r water cooled CPU" overclockers.
They (Cray) were so proud of this accomplishment that the upper portion of the cabinet was some kind of plexiglass so you could see the fluid as it moved, and moved wiring and what not with it. Very surreal feeling, almost like the thing was breathing.
And what about the Cray-1? Wasn't that a true testament to 70's *art* and sculpture? The thing looks like some kind of freaky bus station bench with its odd red and white panels and black base. Though, I don't know if they all looked like that, maybe you could get them in other colors?
Ahh .... those were the days.
"Genius may shine aloof and alone, like a star, but goodness is social, and it takes two men and God to make a Brother."
Last time I bought a Cray super-computer, I was kicking myself for weeks about the 2 million dollars I wasted.
Next time, I'm just gonna build a beowulf cluster out of 200 overclocked AMD Barton 2500s. I shall NOT be suckered again!
Please stop stalking me, bro.
From the documents, it looks like it runs Linux on the management nodes and Catamount on the compute nodes. The idea is you can do what you like with the general purpose nodes, but for the compute nodes, you run a lightweight operating system that has low overhead, minimal services and predictable scheduling. BlueGene/L works the same way; it runs Linux on the management nodes and a custom operating system on the compute nodes. Compute nodes likely provide scheduling for only the number of threads that run on the node, communication through MPI and some proprietary API, and basic debugging facilities. Compute nodes probably lack normal OS services like network, disk, or even a console.
Whoever corrects a mocker invites insult;
whoever rebukes a wicked man incurs abuse.
--Proverbs 9:7
Because, IIRC, that was the one that they were only building one of, and when the government cancelled the order, that's when Cray Research went under.
My opinions are my own, and do not necessarily represent those of my employer.
You're leaving out a lot of stuff necessary to make a cluster:
#1 RAM: The $3000 for the G5 cluster node includes 512 MB of RAM. Most places demand at least 2 GB of RAM per CPU; we require 3 GB of RAM per CPU in all new system purchases. This brings the node price (from apple.com) to $6500.
200x $6500 = $1,300,000
#2 Racks and power: Each rack can hold about 32 machines (without getting way too hot/dense), so for 200 nodes this would be about 7 racks.
7x $1200 = $8400
#3 Interconnect: No HPC system is useful without an interconnect. An 80-node Myrinet system was $250,000, so at $3125/node you're looking at:
200x $3000 (estimate) = $600,000
#4 Networking: You need a network switch and cabling to connect all the nodes... GigE is a must these days. Let's say we go cheap with the HP ProCurve 2848 Layer 2 managed switch for $3300 each. We need 7 of those, one for each rack cabinet. With trunking we can get 4 Gb back to a central switch. Not too bad. Say we add $10/cable for pre-made patch cables (length averaged); that's about $2200 in cables.
7x $3300 + $2200 = $25,300
#5 Disk: You quoted a bunch of Xserve RAIDs without any kind of AppleCare... with IDE RAID... I'm not going without some kind of support on it. Oh wait... 1 file server is NOT enough to handle 200 nodes of HPC, and Apple doesn't have a clustered filesystem. You're going to have to go with Linux/Intel with Red Hat GFS for that one (yes, there are other options, but I know GFS).
Say we do 4 Xserve RAIDs with AppleCare:
4x $16,000 = $64,000
We also need four dual-whatever Intel machines (I'll be nice and include F-C cards in the price):
4x $3000 = $12,000
We also need an F-C switch to link all the nodes:
SANbox 8-port at $5200 plus 8x SFP modules at $750 each = $11,200
I'll pretend like we don't need GFS software support, but most places would want it. (It's another $20,000 or so, but eh... we want a cheap solution.)
Disk total comes to: $87,200
Price so far: $2,020,900
And that doesn't even include setup!
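If you want to rerun the tally with your own quotes, a short sketch like this does it. The dictionary keys are just labels I made up; the dollar figures are the ones from the list above, with cables taken as the $2,200 used in the networking sum.

```python
# Cluster cost tally using the figures quoted in this comment.
costs = {
    "nodes with RAM": 200 * 6_500,        # 200 G5 nodes upgraded to 3 GB per CPU
    "racks": 7 * 1_200,                   # about 32 machines per rack
    "interconnect": 200 * 3_000,          # Myrinet estimate per node
    "networking": 7 * 3_300 + 2_200,      # one GigE switch per rack plus cables
    "disk": 4 * 16_000 + 4 * 3_000 + (5_200 + 8 * 750),  # RAIDs, servers, F-C switch
}

for item, dollars in costs.items():
    print(f"{item:<16} ${dollars:>10,}")

total = sum(costs.values())
print(f"{'total':<16} ${total:>10,}")     # total            $ 2,020,900
```

Swap in a different interconnect price and you can see quickly that it and the RAM upgrade dominate everything else.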
This split microkernel architecture has been in use for a long time on big MPP systems like the Paragon and the T3E. The software base (Catamount/Linux) is new, but the design is old.
Catamount is the kernel that runs on the compute nodes. It's a tiny kernel that packages up the OS service requests and sends them, over the interconnect, to an OS or I/O node, which does the real work of the operating system. Catamount is a descendant of PUMA, which came from Cougar. These are heavily derived from work done at Caltech. (I believe CMU and one of the UTexas schools also played a role, but am not sure.) The idea is that the microkernel is small and unobtrusive, and it gets the hell out of the way so the application can use the CPU as much as possible.
The OS and I/O nodes run Linux, and provide services to the compute nodes. This is probably done in the kernel, but it could just as easily be running as a user-space daemon on the OS node. (Though you might have to do some mem-copies that way, which would lower performance.)
NOTE: Though these nodes take advantage of some of Linux's features (like the Lustre file system) they do NOT necessarily implement these features for the system as a whole. They probably provide a minimal set of features necessary for the sorts of problems that the XT3 runs. All the scheduling work that has gone into more recent Linux kernels is of little use, as the compute nodes have their own scheduler, probably more closely tied to the batch dispatcher than to the Linux kernel. To say that the system runs Linux is true, but a little misleading. It's a very different Linux than what runs on my desktop, and it's used in a very different way.
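For those who haven't seen this kind of design before, the request-forwarding idea looks roughly like the toy below. To be clear, this is NOT Catamount's actual interface: `ServiceNode`, `ComputeNodeStub`, and the JSON message format are all invented for illustration, and a plain function call stands in for the interconnect.

```python
import json


class ServiceNode:
    """Stands in for an OS or I/O node that does the real work."""

    def __init__(self):
        self.files = {}  # in-memory stand-in for a real file system

    def handle(self, message: bytes) -> bytes:
        """Unpack one request, perform it, and return a packed reply."""
        request = json.loads(message)
        if request["op"] == "write":
            path = request["path"]
            self.files[path] = self.files.get(path, "") + request["data"]
            reply = {"ok": True, "result": len(request["data"])}
        elif request["op"] == "read":
            reply = {"ok": True, "result": self.files.get(request["path"], "")}
        else:
            reply = {"ok": False, "result": "unsupported operation"}
        return json.dumps(reply).encode()


class ComputeNodeStub:
    """Tiny kernel side: packages requests and ships them elsewhere."""

    def __init__(self, send):
        self._send = send  # callable standing in for the interconnect

    def _call(self, op, **args):
        message = json.dumps({"op": op, **args}).encode()
        reply = json.loads(self._send(message))
        if not reply["ok"]:
            raise OSError(reply["result"])
        return reply["result"]

    def write(self, path, data):
        return self._call("write", path=path, data=data)

    def read(self, path):
        return self._call("read", path=path)


service = ServiceNode()
node = ComputeNodeStub(send=service.handle)
node.write("/results", "step 1 done\n")
print(node.read("/results"), end="")
```

The compute side holds no file system state at all, which is the whole point: it stays small and out of the application's way.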
You have to understand though that the stock market's expectations have nothing to do with whether the company is doing well or not.
Surreal case in point: at one point 3Com had a lower market value than its Palm daughter company. Basically if you subtract the value of the Palm shares, the whole rest of 3Com was actually worth a _negative_ value for the stock market.
And we're talking divisions which were making a tidy profit. Yet they were apparently worth a _negative_ number.
No, it's not a joke. Roll it around a bit in your head to fully grasp how completely sad and idiotic that is. Real profits, real assets, worth a negative number of dollars. Stupid.
Or at the other end of the spectrum you have Microsoft whose stock market value is _way_ above the value of its assets. Without paying any dividends or acquiring much in the way of long term assets, people just flocked to drive the price up and make Bill Gates rich. Basically to give their money to Bill Gates and not even get a Windows CD in return.
The thing is, however, the stock market value has _nothing_ to do with a company's value or profits. A share is only worth as much or as little as people want to believe it is. It is like Monopoly (the board game, not MS;) money: if tomorrow we decide that the blue bills are worth 10% more and the red bills are worth 10% less, who's to argue with that.
The _only_ reason the stock market on the whole goes up is basically because yearly people dump more money into it. Basically it goes up just because people want to believe it's going up, and put their money where their belief is.
And the way those values fluctuate, now that just has to do with hype and greed.
The stocks worth buying are those that'll make you a profit, typically meaning they'll rise in value. The stocks worth selling are those that won't.
Except with no intrinsic value it becomes a game of guessing what the other lemmings will buy (driving the price up), and what the other lemmings will sell (driving the price down.)
One thing that makes lemmings buy is the prospect of growth. Hence, hype is good. Hence, yes, shares in a cancerous tumor would sell like hot cakes and rocket sky high in price.
Hence, conversely, shares in a company which doesn't grow or otherwise cause more lemmings to buy, are not worth holding on to. Because they won't bring a profit. If Microsoft truly plateaued and didn't pay dividends either, regardless of how much profits it made at that point, its shares would plummet. Because between holding onto a share in MS that doesn't bring a profit, and investing in some startup that grows quickly, the second promises more of a ROI.
Now that's all a bit of an over-simplification.
Of course, there are other factors. Like just paying dividends to give people a reason to hold onto your shares even without massive hype and growth. (See why MS started doing that when its market explosion slowed down.) Or like fraud: "analysts" just telling lemmings what to buy, and thus driving up the price of the shares owned by the "analyst" and his/her clients. Etc.
But as a quick intro to the madness of the stock market, it will have to do.
Cray-3 memories by Steve Gombosi From a comp.unix.cray posting
Graywolf ("S5") was installed at NCAR. Like all NCAR supercomputers, until fairly recently, it was named after a Colorado locale.
This was the *only* Cray-3 shipment. Installed in May 1993, the machine was a 4-processor, 128-megaword system.
Two problems in the Cray-3 system were uncovered as a result of running NCAR's production climate codes (particularly MM5): a problem with the "D" module causing intermittent problems with parallel codes, and an error in the implementation of the square root approximation algorithm which caused incorrect results for certain data patterns (kinda like the Pentium divide bug ;-) ). These were rectified and replacement CPU modules were installed, although I can't remember the date.
The machine ran NCAR production until CCC folded in March, 1995. Since NCAR never paid for it, at some point we reduced the CPU count to 2 and let the machine run essentially unattended. I'm not too sure when that happened, although it marked the end of my regular commuting between Colorado Springs and Boulder.
There were a total of 7 Cray-3 "tanks" constructed. S1-S4 were single "octant" tanks (the smallest that could be constructed) which accommodated up to a 2 processor/128MW configuration. S5 and S6 were two-octant tanks. S7 was a four-octant tank which we used as a software development and benchmarking platform. S6 was chiefly used for system testing.
S1-S3 were diverted to Cray-4 testing once the Cray-4 project built up steam. S4 was diverted to the quite possibly suicidal Cray-3/SSS project after S7 became available (S4 was previously our software development machine).
For those of you who have Cray-3 posters lying around (by the way, I took all the photos on that poster as well as the Cray-3 and Cray-4 brochures and all the annual reports except the first two):
1) The big photo is of S5 ;-)
2) Seymour is leaning on S5 (and you have no idea how hard it was to get him to hold still that long while wearing a suit...or to talk him into that particular pose)
3) The two "cooling system" photos are S6
4) The hand holding the module is mine
Cray-3 modules were 4x4x0.25 inches in size. Each module consisted of a multi-layer "sandwich" of PC boards (69 electrical layers), with 2 layers of 16 1x1 inch stacks. The stacks were the circuit boards containing the actual circuits (GaAs for logic, SRAM for memory modules). There were 16 bare GaAs chips mounted to each side of a logic stack. I think there were 12 bare SRAM chips on each side of a memory stack (the logic chips were square, the memory chips were rectangular).