Cray XD1 Now Available
cyngus writes "Cray announced the availability of their XD1 systems. Each XD1 chassis has up to 12 AMD Operton processors. Up to 12 chassis can be clustered together in a rack. The XD1 uses Cray RapidArray Interconnect technology, based on HyperTransport, for high bandwidth and low latency communications between processors and chassises. The XD1 also has a handful of other technologies aimed at the HPC market, including Xilinx FPGAs, communications accelerators, etc."
It would take sixty racks of these to best the Earth Simulator's theoretical peak; more than 60% more processors.
Still, if they need someone to, uh, test one...
That what was all this school was for... to teach us how to solve our own problems. -- janeowit
Since they had been bought by SGI, I've actually been wondering whether they would make me dream again.
Trolling using another account since 2005.
Maybe! Linux runs on the MicroBlaze softCPU on Xilinx FPGAs, as a derivative of uClinux. Care to port it to this new hot hardware?
--
make install -not war
" this is a chance to hunt them down ". Go get 'em, tiger!
--
make install -not war
A Cray is not a true Cray unless it can be used as a stylish sofa :p
frotz grue
So, is the Operton more or less powerful than the Opteron?
Also, mandatory: imagine a Beowulf cluster of these.
Dr Cray was killed in 1995 (I think that's the year) when his SUV was T-boned by a short car and rolled over.
I only know this from a Sci-Am article on using supercomputers to predict crash situations.
Bacardi + slashdot = negative karma.
He's dead.
http://en.wikipedia.org/wiki/Seymour_Cray
Seymour Roger Cray (September 28, 1925 - October 5, 1996) was a supercomputer architect who founded the company Cray Research. For about 30 years, the short answer to the question "What company makes the fastest computer?" was "Wherever Seymour Cray is working now."
You can laugh, but you'll probably need that much to run the next iteration of Windows.
It's good to use your head, but not as a battering ram.
I've heard conflicting reports on this - reading Cray's own literature, you see them say:
"Tightly coupled to the AMD Opterons and switching fabric, [the RapidArray Communications Processors] handle memory to memory copies, global memory management, and system wide process synchronization, freeing..."
(Emphasis mine)
Does this mean the HT links give the OS the view of a single system for each chassis? (Or rack, even?) I.e., can I utilize a single processor out of those 12 in a chassis, and access 96GB of RAM with that one process WITHOUT using MPI or RDMA?
Most computer journalism is PR. One of the reasons I like computing is that many of the ads are interesting. I read some of the paper magazines in the field mainly for their ads. It's a competitive marketplace of ideas, even though some are bad and wrong. There's very little news about tech that isn't announcements about products, even free ones. What's the alternative? _Nerd People Magazine_ (shudder)?
--
make install -not war
I thought Cray was trying to convince the world that Clusters were not as good as true supercomputers, but this looks like a glorified cluster. In looking under the hood it appears to be just a collection of 2-way SMP Opterons with a superfast proprietary network backbone.
And it's running Linux, if that matters to you
Hey jack nuts, I posted this. I like Cray, because I think companies that put a lot of thought into their product and make great ones deserve a cheering section. Of course you're a BC kid, so I'll forgive you; we (BU) spank you enough in hockey to let you have a shot here and there.
Dilbert: I can compute many values of pi. Some people discuss areas of circles, but I'm doing something about it!
"A witty saying proves nothing." ~Voltaire
"d'Oh!" ~Homer
From the linked page:
Highly modular, the Cray XD1 base unit is a chassis. Up to 12 chassis can be installed in a rack. Multirack configurations integrate hundreds of processors into a single system.
Farther down the same page:
The Cray XD1 compute subsystem is composed of 12 AMD Opteron(TM) 64-bit processors that run Linux and are organized as six 2-way SMPs to deliver 58 GFLOPs* per chassis. Finely tuned memory and I/O performance removes bottlenecks and maximizes processor performance.
Wow - do the math: 696 GFLOPs per chassis. That's rather impressive.
However, part of me is a bit saddened by seeing the Cray name attached to X86s. Yes, I felt the same thing with SGI, DEC, and Sun. Yes, I need to get over it and move on.
I want to drag this out as long as possible. Bring me my protractor.
yes -- Seymour Cray died in an auto accident in Colorado Springs in 1996.. I googled "Seymour Cray" for this: http://ei.cs.vt.edu/~history/Cray.Pepper.html
Yes, I believe it is 12-way SMP. The memory is connected to the CPUs in a crossbar switch.
The plural of chassis is chassis. This Gollum thing on /. is going too far.
Cray has announced a lot of different sales of the XD1 the past couple of weeks. We have all the details here.
ignorance is bliss. googlefiberatx.com
the nec SX architecture uses these ridiculously huge custom vector processors to get performance (similar to the Cray 1, 2, XMP, YMP, etc design)
this Cray is more like building MPPs off of scalar units (opterons) and doing some real innovation around the MPP interconnect. It's sort of off the shelf, yet not at the same time.
The big thing here that kicks ass is the 6 FPGAs per chassis. If you can write a highly tuned software algorithm, there's a chance you can write a highly tuned piece of hardware, deploy that to the FPGA, and you've got an application specific hardware accelerator. 6 per chassis, in fact. That's pretty cool, and it's in some ways a HUGE innovation over having a dedicated vector unit (as was the cray1 design).
the really interesting thing here is that these are essentially opterons running linux, with custom interconnect goo. The interconnect bypasses the PCI bus - it's closer to the PEs than that.. their claim is that it attaches to the AMD hypertransport bus (the Proc -> Proc -> Mem bus for SMP AMD machines)
My opinions are my own, and do not necessarily represent those of my employer.
> Cray HPC-enhanced Linux, Kernel version 2.4.21
I wonder what that means - Red Hat EL 3.0 with enhancements, or their own thing..
Interconnect - I wonder how their proprietary interconnect compares to IB..
File system - ext3? No cluster file system?
Chassisses?
Why do I have the sneaking suspicion that you're not pronouncing it correctly, either?
Proud member of the Weirdo-American community.
Hmm, I think now I was wrong; from reading further it looks like you do need MPI to get beyond 2-way. But there is enough memory bandwidth that single system image would probably work quite well.
Actually this is AMD64 / x86-64 - it just emulates x86...
Every time I shut down one of their nefarious schemes, the Countess herself personally thanks me for rooting out corruption in her corporation. I'm beginning to think Crey's not as innocent as they claim... Wait, what? Oh, sorry, wrong Crey.
Crayola!
For my apps, I do iterative matrix calculations. However, one of the required data tables scales as n^2.3 (ish) of the system size. These can be precalculated, or calculated on demand. Typical size for a small run is 4-6 GB. I've filled a 40 GB array with data tables before.
Thus, the part that impacts runtimes the most is the on-disc lookup, which is still faster than direct calculation, which we've also had to do.
I looked into FPGAs a while back. Some back of envelope calculations show that a single FPGA should be able to calculate the data table on demand, and it'll be faster than reading from disc.
(Turns out that to actually get a usable solution for a basic PC, you would need to hack up the whole tool chain. FPGA cards for a PC are all designed for DSP, rather than numerics).
So, with an FPGA and a CPU, I could eliminate the slowest part of the job, and scale up to, what, a 1GB working matrix, which is about 8 times larger than the biggest job I've ever run, which hogged a T3E1200 for 6 hours.
So, in short, gimme an FPGA and some reasonable tool chain, and I will be able to about halve runtimes, and, more importantly, scale up to 10 times larger calculations. 5 times larger calculations is the most I've ever been asked about.
Time to brush up on my VHDL, I think.
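To put rough numbers on the scaling described above, here is a minimal sketch. It assumes the table size grows as n^2.3 exactly (the comment only says "ish"), and it reads "10 times larger" as a 10x larger system size n, which is one possible reading. The function names are made up for illustration.

```python
# Sketch only: assumes table size grows as n**2.3, the rough exponent quoted above.
EXPONENT = 2.3  # "n^2.3 (ish)" from the comment; the true exponent is not known


def table_growth(size_factor: float, exponent: float = EXPONENT) -> float:
    """Factor by which the data table grows when system size grows by size_factor."""
    return size_factor ** exponent


def size_growth(table_factor: float, exponent: float = EXPONENT) -> float:
    """Inverse: system-size factor that corresponds to a given table growth."""
    return table_factor ** (1.0 / exponent)


# A 10x larger system needs a table roughly 200x larger.
print(f"10x system -> {table_growth(10):.0f}x table")
# Going from a ~5 GB table to a 40 GB table is only about 2.5x in system size.
print(f"8x table   -> {size_growth(40 / 5):.2f}x system")
```

Under those assumptions the table, not the working matrix, dominates quickly, which is why computing entries on demand looks attractive compared with storing them.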
Cray did not develop this system themselves, they simply bought this little startup company and relabeled its product.
i was looking at cray.com and there's no mention of the Tera MTA. The Tera MTA was the innovative idea they had to have 128 logical threads on a single CPU.. think of hyperthreading but with 128 logical threads instead of.. 2.. and also it was working at least 8 years ago.
:/
If you look at cray.com today it's pretty sad. 3 product lines - the XD1 opteron+magic, the X1, which is traditional cray vector (smp vector nodes, and MPP's of those nodes), and their 3rd product line is the NEC SX-6... they're reselling it in the states for NEC.
If you hit tera.com, you get a 404
My opinions are my own, and do not necessarily represent those of my employer.
Id releases Doom 3 for Linux, Cray announces availability of new supercomputer.
Dare we say, we've finally actually found the hardware that can run this game?
You can accomplish anything you set your mind to. The impossible just takes a little longer.
Obviously I pissed somebody off, cause I got modded down as "overrated." I just get worn down by the endless parade of PR trumpeted as news in the mainstream media. And sorry about your misfortune in attending BU.
... the Gentoo Chief Marketing Officer made the following statement :
"We welcome the Cray XD1 as the first platform on which Gentoo installs in less than 12 hours. Looking forward to renaming Gentoo to 'One-Click-Linux'. Stay tuned !"
War doesn't prove who's right, just who's left.
Have the NSF buy a few billion dollars worth of high end Opteron hardware and make it available to those who are doing public research. I think that access to a few million dollars worth of high end hardware would help cut down on the R&D costs for drugs that are partially paid for with taxpayers' dollars. The good side is that if part of the package is free time on really sophisticated NSF clusters capable of really cutting down number crunching time, the public can demand on its end lower prices and/or shorter patent intervals.
Click here or a puppy gets stomped!
This meme has been around so long that my great-great grandfather used to include it in his messages sent by carrier pigeon.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
I guess the chickens win after all.
~D
This sig has been enciphered with a one-time pad. It could say almost anything.
And I have never seen a press release that "says what needs to be said." Press releases say what companies want to be said.
That's what it says on its specs sheet "Cray HPC-enhanced Linux, Kernel version 2.4.21"
I think it's wise they went with the 2.4 kernel though, but I wonder what this Cray Linux is; never heard of it.
The lunatic is in my head
XD1 would probably be a good emoticon for anyone who manages to get their hands on one of these babies.
Firstly, let's see: 12 processors in a rack. That sounds great, but what happens when a chip blows or something breaks? How easy is it to fix? 58 GFLOPs (I personally think it should be GFLOPS). Anyway, IBM's BlueGene will make Cray look like a digital watch. With already great performance, we are gonna see a lot of changes in how we do computing. BlueGene is the future; Cray is the clock telling us how far _into_ the future we are. Also, another thing: thank God (please substitute) for a 12-way x86 processor (man, what a waste...). Hopefully this will cause better MPI programs to be written and become more mainstream.
Go Red Coats
Loads of Opterons? Who cares if GFX card is teh sux? Cray are a bunch of noobs. I bet it doesn't even have neon fans! You'll never get the chix showing them your 1337 skillz in CS with that heap of junk.
Chernobyl 'not a wildlife haven' - BBC News
Quote the marketing info page: "runs wide variety of ISV applications and open source codes"
That's the power of Cray's parallel processing: each machine runs its own "open source code" therefore a cluster is more powerful because the entire cluster runs "open source codes". At least that's their sales reps understanding of it.
This machine is really not much different to SGI's Altix, except running the AMD processors rather than Intel. This means that although each processor likely runs faster than the ones SGI uses, Cray can't bundle as many together, as AMD hasn't progressed nearly as far on SMP-aware chipsets as Intel.
This is some of the stupidest piles of drivel I have read on slashdot. SGI and Cray both do ALL of the glue logic chips themselves, that's the whole point of buying from them. They don't use the off the shelf chipset, they design their own with the design goal of large scalable systems. Besides, Intel uses a shared bus where AMD uses the point to point bus they bought from Compaq which was originally designed for the Alpha. So if anyone has a scalability lead it's AMD.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
If you can't afford this Cray, you can at least buy the parts to start putting together your own multi-processor Opteron system:
:-\
http://www.monarchcomputer.com/
A friend of mine and I were talking the other night about local Atlanta, GA computer stores, and he mentioned that Monarch Computer is one of the only vendors from whom you can purchase the 4-way Opteron 800 series processors ($1200 a piece -- damn!).
He's been in grad school out of state for a few years and was surprised to learn that Monarch Computer is, in fact, in his hometown backyard. Kind of kewl to walk in a store in your own town and walk out with a $1200 4-way processor.
Until the wife finds out and sends you back to said store with the receipt in hand for a refund.
IronChefMorimoto
P.S. - I don't work for these guys or advocate their store. I just thought it was cool to have such a vendor nearby. Too bad they don't sell Shuttle XPCs.
They still make vector-based supercomputers, massively parallel systems, clusters, etc. These are designed to appeal to different markets.
If you read Cray's 10-K, you will see that they believe that some computer problems can be resolved optimally on clusters, but others are better suited for vector-based systems.
LedgerSMB: Open source Accounting/ERP
Cray systems may not always be the fastest thing around, but they are solid. It would be nice to see more producers paying careful attention to clean design and reliability over having the latest speed-booster.
It's nice to see our old friend Cray continue to keep a foot in the market -- if nothing else, it makes everyone else stay on their toes.
We may not imagine how our lives could be more frustrating and complex—but Congress can. – Cullen Hightower
We likes chassisessess
The MTA idea is neat, but nobody's ever been able to find a problem that runs all that well on them. The original MTA didn't have enough memory bandwidth to make it competitive with a vector machine, and the small number of them in the field (less than 10 IIRC) are notoriously cantankerous. When Tera bought Cray, one of the main things they were buying, aside from name recognition, was Cray's CMOS design experience; they were hoping Cray's designers could help with the problems they'd run into with MTA.
"My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
Cray now has three product lines to address 3 different market segments.
They have the X1, which is a massively parallel vector system for the very high-end. (For those who need 30+Gbytes/second of memory bandwidth for EACH cpu) These things are huge, expensive, and used by a limited number of users, mostly governments.
They are getting ready to productize red storm, which is also a bunch of opterons, but strung together in a shared-memory system like the T3E. also a high-end solution.
This system, the XD1, is a low end system designed to be a half-step better than a cluster of off-the-shelf opterons. It's a multi-kernel cluster using MPI for all the data sharing. However the interconnect basically sits where the south-bridge sits on most opteron boxes.
So Cray still has the absolute cutting edge systems, but has now expanded down-market. (Rather, they acquired OctigaBay, who did the early design work).
This is also not the first time this has happened. In the early 90s, Cray purchased a small start-up that was developing a NUMA-style mini-super based on sparc processors. They turned it into a product and sold a few, though not as many as they would have liked. During the SGI acquisition they sold the product to SUN, who branded it the E10000, and made about a billion dollars off of it. It's now the foundation for all of Sun's high-end Unix servers.
Cray also bought a small company (I forget the name) that made a CMOS implementation of the YMP. This became the YMP-EL, then the J90, which pioneered technology for the SV1.
Cray has often built mid-range systems. Nothing new.
CRAY used to be so cool back in the day. Hell, the old CRAY systems are CHEAP now! http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&category=1484&item=5722664206&rd=1&ssPageName=WDVW
DAMN YOU OCTODOG! DAMN YOU TO HELL!
NO.
For that you need to buy Cray's MPP system "strider", which is a productized version of Red Storm.
The xd1 hardware is probably capable of shared memory, the software is not. The nodes (each 2-cpu blade) run off-the-shelf linux, and use MPI to share data.
12 Opterons deliver 58* GFlops (where * = peak). The Army's recent G5 cluster (1566*2 G5 processors running at 2GHz) deliver 25* TFlops. 58 divided by 12 yields 4.8* GFlops per chip for an Opteron, and 25000 divided by 3132 yields 8* GFlops per chip for the G5. What's wrong with this math? I didn't think the G5 had numbers THAT much better than an Opteron. And with G5s hitting 2.5GHz today the numbers would be much worse (or better, depending on your point of view).
If I didn't have absolutely NOTHING to do, I wouldn't be here.
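The arithmetic in the comment above holds up; a quick sketch makes the per-chip gap easier to read as "peak flops per clock cycle". The totals and the 2 GHz G5 clock come from the comment. The Opteron clock is never stated in this thread, so the 2.4 GHz used here is an assumption picked only because it reproduces the 58 GFLOPs figure.

```python
# Peak figures quoted in the comment above (all theoretical peaks, not measured).
XD1_CHASSIS_GFLOPS = 58.0
XD1_CHASSIS_CHIPS = 12
G5_CLUSTER_GFLOPS = 25_000.0
G5_CLUSTER_CHIPS = 1566 * 2
G5_CLOCK_GHZ = 2.0

# ASSUMPTION: the thread does not give the Opteron clock; 2.4 GHz is a guess.
ASSUMED_OPTERON_CLOCK_GHZ = 2.4


def gflops_per_chip(total_gflops: float, chips: int) -> float:
    """Peak GFLOPS attributed to each processor."""
    return total_gflops / chips


def flops_per_cycle(per_chip_gflops: float, clock_ghz: float) -> float:
    """Peak floating-point operations per clock cycle implied by a per-chip peak."""
    return per_chip_gflops / clock_ghz


opteron = gflops_per_chip(XD1_CHASSIS_GFLOPS, XD1_CHASSIS_CHIPS)
g5 = gflops_per_chip(G5_CLUSTER_GFLOPS, G5_CLUSTER_CHIPS)

print(f"Opteron: {opteron:.2f} GFLOPS/chip, "
      f"{flops_per_cycle(opteron, ASSUMED_OPTERON_CLOCK_GHZ):.1f} flops/cycle (assumed clock)")
print(f"G5:      {g5:.2f} GFLOPS/chip, "
      f"{flops_per_cycle(g5, G5_CLOCK_GHZ):.1f} flops/cycle")
```

So nothing is wrong with the division itself. The G5 figure implies about four peak flops per cycle, which would be consistent with two fused multiply-add units per chip, against about two per cycle for the Opteron under the assumed clock. Peak numbers say little about sustained performance.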
Cray not-too-long-ago had major announcements with the RedStorm project. I believe that system is supposed to be a single image 10,000 CPU AMD based rig. There are some oddities friends have pointed out, like the OS is based on IRIX I believe...
Yea check this out:
Cray Unicos/mp"
Actually that references the X1, which is not based on PeeCee stuff, but is actually an 8 core MPM.
Sad thing is, even with Red Storm I think IBM will remain on top as their contract calls for 130,000 of their powerPCs on one system?
It would be nice to see Cray on top, with something other than commodity processors. I realize the T3D and T3E were both Alpha based systems.
PS, I still have a J932se 32 proc Vector Cray ( for sale ) if anyone wants a Cray for home. $4500, real deal 3 cabinet Cray from '97, most likely used for gov't nuclear energy something-or-other. Located in Southeastern Virginia.
Southeastern Virginia REPRESENT!
Sandia contracted out Red Storm development, hence RS is based on Sandia technologies towards petascale computing: Puma, Catamount, Portals, etc. No UNICOS, no IRIX, no Linux (except on the service nodes).
This is a quintessential kernel architecture without support for threads, VM, IPC, etc. The interconnect is also asynchronous (5ns end-end speculated), point-to-point, and uses Portals.
I meant "696 GFLOPs per rack", not "per chassis", where a fully-loaded rack contains 12 chassis (58 * 12 = 696). D'oh! I appreciate the correction.
I want to drag this out as long as possible. Bring me my protractor.
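For anyone who wants to redo the rack math, a small sketch follows. The 58 GFLOPs per chassis, 12 processors per chassis and 12 chassis per rack are from the quoted Cray page. The Earth Simulator figures (40.96 TFLOPS theoretical peak, 5,120 processors) are not in this thread; they are the commonly published numbers and are included here as an assumption.

```python
import math

# From the quoted Cray page.
GFLOPS_PER_CHASSIS = 58
CPUS_PER_CHASSIS = 12
CHASSIS_PER_RACK = 12

# ASSUMPTION: commonly published Earth Simulator figures, not taken from this thread.
EARTH_SIM_PEAK_GFLOPS = 40_960
EARTH_SIM_CPUS = 5_120

rack_gflops = GFLOPS_PER_CHASSIS * CHASSIS_PER_RACK   # peak per full rack
rack_cpus = CPUS_PER_CHASSIS * CHASSIS_PER_RACK       # processors per full rack

# Smallest whole number of racks whose combined peak reaches the target.
racks_needed = math.ceil(EARTH_SIM_PEAK_GFLOPS / rack_gflops)
cpus_needed = racks_needed * rack_cpus
extra_cpus_pct = 100 * (cpus_needed / EARTH_SIM_CPUS - 1)

print(f"{rack_gflops} GFLOPS and {rack_cpus} CPUs per rack")
print(f"{racks_needed} racks, {cpus_needed} CPUs, {extra_cpus_pct:.0f}% more CPUs")
```

With the rounded 58 GFLOPs figure this comes out one rack short of the "sixty racks" estimate at the top of the thread; the difference is within the rounding of the per-chassis number, and either way the processor count is well over 60% higher.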
After searching everywhere for the legendary "Wang Computer" tshirt, I decided to fall back on the second geekiest computer company to get a shirt from, Cray. I couldn't find a shirt through the normal outlets (eBay/ThinkGeek), so I called them directly. The woman that answered was glad to help and shipped out, not a tshirt, but a very nice collared shirt that makes it look like I work for Cray! I wear it to all the conventions and I become cool(er).
*cue calls to Cray*
My all-time favorite Seymour Cray quote is this one, possibly apocryphal: "A supercomputer is a tool for turning compute-bound problems into I/O-bound problems."
I write in my journal
I would REALLY like to know what appearance an Operton beowulf cluster would have...
Your head a splode
My math shows this to be a 6u unit (72/12=6)
There are quad-opteron 1U boxes... So currently 6u of space can hold twice as many Opterons as these Cray units (24 Opterons with normal servers, 12 Opterons with the 6u Cray). The rapidly approaching introduction of dual-core Opterons would allow 48 opteron cores in the space of this 12 opteron Cray.
Yes, the Cray has many extras, (The FPGAs for example?), but for pure power, you might be better off with normal servers.
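The density comparison above can be written out as a short sketch. Every input is the commenter's own figure: the 6U chassis height is an estimate from 72/12, not a confirmed spec, and the 1U quad-socket server is the hypothetical alternative being compared.

```python
# All inputs are the commenter's estimates, not confirmed specifications.
CHASSIS_HEIGHT_U = 72 // 12      # estimated XD1 chassis height in rack units
XD1_SOCKETS_PER_CHASSIS = 12     # Opterons per XD1 chassis
SERVER_HEIGHT_U = 1              # hypothetical commodity quad-Opteron box
SOCKETS_PER_SERVER = 4
CORES_PER_DUAL_CORE_SOCKET = 2


def sockets_in_space(space_u: int, box_height_u: int, sockets_per_box: int) -> int:
    """Sockets that fit when a space of space_u rack units is filled with identical boxes."""
    return (space_u // box_height_u) * sockets_per_box


commodity_sockets = sockets_in_space(CHASSIS_HEIGHT_U, SERVER_HEIGHT_U, SOCKETS_PER_SERVER)
commodity_cores_dual = commodity_sockets * CORES_PER_DUAL_CORE_SOCKET

print(f"XD1 chassis ({CHASSIS_HEIGHT_U}U): {XD1_SOCKETS_PER_CHASSIS} Opterons")
print(f"1U quad boxes in the same space: {commodity_sockets} Opterons, "
      f"{commodity_cores_dual} cores with dual-core parts")
```

This counts sockets only. It ignores the interconnect and the FPGAs, which are the parts of the XD1 that a stack of 1U servers does not provide.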
Sort of sad they abandoned their custom CPUs for these commodity CPUs. Their liquid cooling was pretty nihilistic. You'd think there would be a lot to be gained from the old techniques of restricting everything to 64 bit operations, liquid evaporation cooling, and quad core parts.
Id did a pretty lame job of porting Doom3 to linux, and hardly any interesting features are present. No alsa, no amd64... doesn't work with radeons.
I wonder what John has been thinking... a few years ago he was on
/. answering questions, pretty into the linux movement. Has he dumped us?
"And we have seen and do testify that the Father sent the Son to be the Savior of the World"
1 John 4:14
If you get the Operton Prime, it can also drive itself around.
paintball
Can't one use some parts of NUMA kernel?
...
:)
It was designed by SGI after they acquired Cray
Well anyway, I suppose you can run something like http://www.mosix.org/ on the XD1 racks
Most itanium systems use a shared bus.
One should note, however, that the altix does not use the shared bus features of the itanium. Or at least that that bus is only shared by one cpu, the memory, and the bridge chip. The interconnect architecture of the altix is identical to the interconnect used on the old SGI origin systems, which were based on MIPS processors. From an architecture point of view, the Altix and the XD1 are very very similar. One uses itanium, one uses opteron.
Altix tries to run a single OS image across the entire machine, while XD1 relies on MPI to do data sharing. However, even SGI doesn't spread a single linux image across their biggest machines. They also create a cluster-in-a-box on large enough configurations.
And no, clarifying the source of a typographical error by using [sic] in a quotation does not make me feel "way smart." Being way smart makes me feel way smart.
> Each XD1 chassis has up to 12 AMD Operton processors. Up to 12 chassis
> can be clustered together in a rack.
Man, that's just
... (drumroll please...) gross.
Cut that out, or I will ship you to Norilsk in a box.
What do Slashdot's editors do again?
Or a fountain
http://www.csm.ornl.gov/ssi-expo/cray2.jpg/
The coolest computer ever.
Michael
This is also not the first time this has happened. In the early 90s, Cray purchased a small start-up that was developing a NUMA-style mini-super based on sparc processors. They turned it into a product and sold a few, though not as many as they would have liked.
The SuperDragon! I worked at Cray Research in that timeframe, I have some SuperDragon memorabilia. When I was there, the T3D was under development, and was considered a mid-range supercomputer, beneath the C90, above the "entry-level" EL series.
Larry