Time For A Cray Comeback?

Posted by simoniker on Monday August 4, 2003 @09:41AM from the cray-cray-come-again-another-day dept.

Boone^ writes "The New York Times has an article (free reg. req.) talking about Cray Inc.'s recent resurgence in the realm of supercomputing. It discusses a bit of Cray's decline when the Cold War ended, "the occupation" under SGI, and the rebirth of the company after the Tera (now Cray Inc.) purchase. Recently Cray Inc. has been shipping their vector-based Cray X1 machine, designing ASCI Red Storm, and recently was one of 3 (also Sun, IBM) to win a large DARPA contract (PDF link) to design and develop a PetaFlops machine by 2010. Could Cray Inc. be poised for a comeback? Wall Street seems to think so."

Re:explain by virtual_mps · 2003-08-04 10:12 · Score: 5, Interesting

MTBF: Mean time between failures. Commodity hardware goes kaputt much more often. A cluster capable of teraflop performance of custom hardware tends to need constant and evil levels of care and feeding: i.e., you better have a grad student on roller blades.

Hahahaha. Have you ever actually run a supercomputer? They tend to have much higher failure rates than normal servers. Couple of reasons: first, they push the envelope of a given technology. The sweet spot for stability is not the leading edge. Second, they're not nearly as well tested as mainstream hardware. On a platform with thousands of installations you're much less likely to run into a problem nobody has seen before than you are on a platform with only dozens of installations.

The trick is keeping ahead of the commodity guys by putaro · 2003-08-04 10:47 · Score: 5, Interesting

Supercomputing per se died because Intel, DEC, IBM/Motorola had a lot more money to throw at speeding things up than the supercomputing community.

In the 70's up until the early 90's it was possible to build a custom CPU out of discrete logic that ran significantly faster than the available microprocessors. Cray was able to push their clock cycle down into the nanosecond range through clever design. However, a 1 ns clock cycle == 1 GHz. You can go buy that multi-million dollar CPU for a couple of hundred bucks in today's market.
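For anyone who wants to check the cycle-time-to-clock-rate conversion, here is a minimal Python sketch. The function names are made up for illustration; the only inputs are the 1 ns figure above and the roughly 80 MHz CRAY-1 clock quoted elsewhere in this post.

```python
def period_to_hz(period_seconds: float) -> float:
    """Clock frequency in Hz for a given cycle time in seconds (f = 1 / T)."""
    return 1.0 / period_seconds


def hz_to_period_ns(frequency_hz: float) -> float:
    """Cycle time in nanoseconds for a given clock frequency in Hz (T = 1 / f)."""
    return 1e9 / frequency_hz


# A 1 ns cycle time, as mentioned in the post.
one_ns_clock_hz = period_to_hz(1e-9)

# The CRAY-1 clock of about 80 MHz quoted in the post, expressed as a cycle time.
cray1_period_ns = hz_to_period_ns(80e6)

print(f"1 ns cycle  -> {one_ns_clock_hz / 1e9:.2f} GHz")
print(f"80 MHz clock -> {cray1_period_ns:.1f} ns per cycle")
```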

In order for supercomputing to be viable you have to be able to provide quantum leap performance above the commodity hardware AND keep your cost/performance ratio in line as well.

The CRAY-1 came out with a clock speed of about 80 MHz and vector processing and high memory bandwidth at a time when mainstream systems like the PDP 11/70 were running at about 7 MHz with a 1 MB/s memory bus. Microprocessors weren't even a joke compared with the Cray.

The new Japanese NEC supercomputer came with a price tag of about $160 million if I remember correctly (some estimates say that it took $1G in research funding) and hits 35 TFlops (sustained). #3 on the Top 500 supercomputers list is a Beowulf cluster with 2304 processors coming in at 7.6 TFlops (sustained). Even figuring $2000/processor + interconnect, that puts the Beowulf cluster at around $5 million or 1/32 of the cost for 1/5th of the performance (roughly speaking).
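The back-of-the-envelope numbers above can be rechecked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the $2000 per processor (interconnect included) is the guess from this post, not a real price.

```python
# Figures as quoted in the post; none of them are independently verified here.
nec_cost_usd = 160e6             # quoted price tag of the NEC machine
nec_tflops = 35.0                # quoted sustained TFlops

cluster_processors = 2304        # processors in the #3 Top 500 cluster
cost_per_processor_usd = 2000    # the post's guess, processor + interconnect
cluster_tflops = 7.6             # quoted sustained TFlops

# Estimated cluster cost, before rounding up to "around $5 million".
cluster_cost_usd = cluster_processors * cost_per_processor_usd

# How many times more the NEC machine costs, and how many times faster it is.
cost_ratio = nec_cost_usd / cluster_cost_usd
perf_ratio = nec_tflops / cluster_tflops

# Dollars per sustained TFlops for each machine.
nec_usd_per_tflops = nec_cost_usd / nec_tflops
cluster_usd_per_tflops = cluster_cost_usd / cluster_tflops

print(f"Cluster cost estimate: ${cluster_cost_usd:,.0f}")
print(f"NEC costs {cost_ratio:.1f}x as much for {perf_ratio:.1f}x the performance")
print(f"NEC:     ${nec_usd_per_tflops:,.0f} per TFlops")
print(f"Cluster: ${cluster_usd_per_tflops:,.0f} per TFlops")
```

Without rounding the cluster cost up to $5 million, the cost ratio comes out a little above 1/32 and the performance ratio a little better than 1/5th, so "roughly speaking" is about right.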

There are other factors, of course, but the key is that for the supercomputer to stay ahead of the microprocessor a boatload of funding is needed for the supercomputer and the payoff just isn't really there. If it were, a lot more supercomputer companies would still be in business.

Re:Icon is back by CausticWindow · 2003-08-04 12:09 · Score: 5, Interesting

I remember a story from an NSA contract worker.

In the early days of Cray, he and many others were wondering how the company could keep things running, considering that its official budgets only showed ten or so sales per year.

Until he got the tour of the NSA computer plant, where they had a hall the size of two football fields, filled with Crays.

--
How small a thought it takes to fill a whole life