Slashdot Mirror


Time For A Cray Comeback?

Boone^ writes "The New York Times has an article (free reg. req.) about Cray Inc.'s recent resurgence in supercomputing. It covers Cray's decline after the Cold War ended, "the occupation" under SGI, and the rebirth of the company after the purchase by Tera (now Cray Inc.). Cray Inc. has been shipping its vector-based Cray X1 machine and designing ASCI Red Storm, and was recently one of three vendors (along with Sun and IBM) to win a large DARPA contract (PDF link) to design and develop a PetaFlops machine by 2010. Could Cray Inc. be poised for a comeback? Wall Street seems to think so."

17 of 266 comments

  1. Registration not required by Anonymous Coward · · Score: 5, Informative

    Partner Link

    Posting as Anonymous Coward, please award my Karma to starving children in the world.

  2. Definitely by Anonymous Coward · · Score: 4, Informative

    There are still MANY applications for supercomputers. A lot of people think that Linux/Beowulf clusters are going to replace supercomputers of the Cray/NEC/IBM variety. Not true. There are still many research, scientific, and military applications that need machines built not for "slow" distributed number crunching but for ultra-high-speed processor and memory architectures.

    So definitely, time for Cray to come back and retake the supercomputer industry crown.

  3. Re:explain by Alien+Being · · Score: 3, Informative

    memory bandwidth

  4. Re:explain by Moeses · · Score: 2, Informative

    Bandwidth.

  5. Re:explain by Doesn't_Comment_Code · · Score: 5, Informative

    Well, a well-engineered supercomputer has much less overhead than a cluster. One superfast processor doesn't have to deal with interprocessor communications like a cluster does.

    And if your supercomputer has multiple processors, they are generally designed to cooperate efficiently, whereas a cluster has to go through Ethernet and hardware layers to communicate between nodes. Granted, that is fast, but on-board communication is faster.

    It seems strange, but a multiple-processor computer can actually perform a task more slowly than a single processor working on the problem if the program and OS aren't designed well. So a lot of the value of a supercomputer comes from its design and the reputation of the manufacturer. And Cray is pretty reliable in my book.
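
    A rough way to see how more processors can end up slower: the sketch below combines Amdahl's law with a coordination cost that grows with the processor count. The function name and every number in it (parallel fraction, per-processor overhead) are made up for illustration; they are not measurements of any Cray or cluster.

```python
def run_time(procs, serial_time=100.0, parallel_fraction=0.9, comm_cost_per_proc=2.0):
    """Estimated wall time on `procs` processors (hypothetical model).

    Amdahl's law splits the job into a serial part and a part that divides
    evenly across processors. On top of that we ASSUME a coordination cost
    that grows linearly with the number of extra processors.
    """
    serial_part = serial_time * (1.0 - parallel_fraction)
    parallel_part = serial_time * parallel_fraction / procs
    comm = comm_cost_per_proc * (procs - 1)
    return serial_part + parallel_part + comm


if __name__ == "__main__":
    for p in (1, 2, 4, 8, 16, 64):
        print(f"{p:3d} processors: {run_time(p):7.2f} time units")
```

    With these assumed values the run time improves up to a handful of processors and then climbs again, and at 64 processors the job takes longer than on one.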


    But the REAL key to the potential comeback of the Cray computer will be whether or not it still has cool bubbles! Wow!!! Cray computing... the inventor of case mods.


    --

    Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
  6. Re:explain by anzha · · Score: 5, Informative

    Memory-to-processor feeding: standard off-the-shelf (OTS) processors are often idle because the memory subsystem cannot feed them fast enough. This is bad now. It will get a lot worse.

    Interconnections between processors: this goes beyond processors on a board to connections between boxes. The bus architectures in standard off-the-shelf hardware get saturated very quickly, and it gets worse between boxes. In addition, the latency on Myrinet and Quadrics (compared to what Cray et al. do) is horrible, even if it is excellent compared to Ethernet.
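
    To put the latency point in numbers, here is a minimal sketch of the usual "latency plus size over bandwidth" cost model for sending one message. The two interconnect profiles are hypothetical placeholders, not published figures for Myrinet, Quadrics, Ethernet, or any Cray network.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver one message: fixed latency plus the time on the wire."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Hypothetical interconnects (assumed values, for illustration only).
INTERCONNECTS = {
    "commodity": {"latency_s": 50e-6, "bandwidth_bytes_per_s": 100e6},
    "custom": {"latency_s": 5e-6, "bandwidth_bytes_per_s": 1e9},
}

if __name__ == "__main__":
    for size in (8, 1_000_000):
        for name, link in INTERCONNECTS.items():
            t = transfer_time(size, **link)
            print(f"{size:>9} bytes over {name:9s}: {t * 1e6:10.2f} microseconds")
```

    For tiny messages the fixed latency term is nearly the whole cost, which is why codes that exchange many small messages care more about latency than about raw bandwidth.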

    Problem set vs. architecture: not all problems map well to clusters, or even SMP boxen. Some map best to vector machines. Some map best to tightly integrated MPPs. Some map to moderately tight clusters. Some are just plain 'embarrassingly parallel'. Others are highly threaded and don't work well on vector or scalar machines, etc. The architecture ought to match the problem set.

    MTBF: mean time between failures. Commodity hardware goes kaput much more often. A cluster that matches the teraflop performance of custom hardware tends to need constant and evil levels of care and feeding, i.e., you had better have a grad student on roller blades.
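
    A back-of-the-envelope sketch of why node count hurts: if node failures are independent and each node fails at a constant rate, the failure rates add, so the time between failures somewhere in the system is the per-node MTBF divided by the number of nodes. The per-node figure below is an assumed value, not data for any real hardware.

```python
def system_mtbf_hours(node_mtbf_hours, node_count):
    """MTBF of the whole machine, assuming independent nodes with constant
    failure rates (failure rates add, so MTBF divides by the node count)."""
    return node_mtbf_hours / node_count


if __name__ == "__main__":
    assumed_node_mtbf = 50_000.0  # hours per node; hypothetical
    for nodes in (1, 100, 1_000, 10_000):
        hours = system_mtbf_hours(assumed_node_mtbf, nodes)
        print(f"{nodes:6d} nodes: a failure roughly every {hours:10.1f} hours")
```

    Under that assumption, a node that fails once in several years still means a thousand-node cluster loses a node every couple of days.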

    Those are just off the top of my head. I am sure that others will tell you more before I can post again. ;)

    Summarized: bandwidth, latency, problem set, and failure rate.

    HTH.

    --
    Do you know why the road less traveled by is littered with the bones of the unwary?
  7. Sun Enterprise 10000 by DNS-and-BIND · · Score: 2, Informative

    Didn't Sun basically buy out or hire away a bunch of Cray Inc.? I always heard the E10000 was actually a Cray product. Oh, and just to brag, I have a blue jacket with a picture of a Y-MP-90 on the back with the words, "CRAY - WORLD'S FASTEST SUPERCOMPUTERS". Too cool for words. eBay rules.

    --
    Shutting down free speech with violence isn't fighting fascism. It IS fascism!
    1. Re:Sun Enterprise 10000 by putaro · · Score: 4, Informative

      The E10000 is a Celerity product. Celerity was an independent Unix box maker back in the 80's with their own processor architecture. Celerity went bust trying to bring a "minisupercomputer" version of the architecture to market in about 1987 (33 MHz, whoo hoo!). The assets and technology of Celerity along with the design team in San Diego were acquired by Floating Point Systems (FPS). FPS brought the system to market and made the transition to a SPARC based architecture (66 MHz) before going bust. The assets and technology of FPS along with the design team in San Diego and now the manufacturing team in Beaverton were acquired by Cray. Cray did a couple of turns of the crank on the FPS product and sold it as a "business supercomputer". When Cray was acquired by SGI, SGI wanted no part of the SPARC business and sold (yes, again) the San Diego design team (and I think the Beaverton group) to Sun who finally brought a SUCCESSFUL product to market with the E10000.

      But it's still the same core team down in San Diego, so I like to think of the E10000 as being a Celerity product.

    2. Re:Sun Enterprise 10000 by laird · · Score: 2, Informative

      Quite a few of the people working on the E10K were from Thinking Machines Corporation. TMC was Danny Hillis' company that introduced massively parallel supercomputing. The first-generation machine was a Symbolics workstation coordinating up to 65,536 single-bit CPUs connected by a hypercube network. Each CPU was fairly slow, but there were tons of CPUs, and CPU performance was balanced nicely with network throughput (whereas most MPP machines have fast CPUs starved for data). Weird, but also astoundingly fast. Anyway, more relevant to Sun, the last-generation machine from TMC was based on UltraSPARCs with custom FPUs (128 MFLOPS per compute node, which was cool at the time). I don't think that there was an upper limit on the number of CPUs, but the biggest I saw (I worked there for a few years) was 4,096 compute nodes and a few hundred storage nodes. Anyway, TMC ended up getting out of the hardware business (check out think.com), and Sun hired quite a few of the engineers (who knew how to build an MPP SPARC-based machine, with compilers, etc.), which rolled into the E10K nicely.

  8. Cascade Link: Karma Whoring by anzha · · Score: 2, Informative

    The home page at Cray for the Cascade project.

    There are some interesting PDFs there. Chew, mull, and consider.

    Also consider what Horst Simon, head of NERSC, said here.

    --
    Do you know why the road less traveled by is littered with the bones of the unwary?
  9. Re:explain by fgodfrey · · Score: 5, Informative
    As other replies have posted, bandwidth is the big issue. And by bandwidth, we are talking about bandwidth from the processor to memory. Cache is great and all, but if you are stepping through gigabytes of data (or in some cases terabytes of data), your problem isn't going to fit in cache. The speed of your processor will then be dominated by the speed at which it can get to main memory. On a PC, that's slow. What's even slower is when you have to exchange data with a remote node in the cluster. Current massively parallel supercomputers (which is pretty much all of them) have phenomenal bandwidth between processors and memory and between nodes.
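
    One simple way to sketch "dominated by the speed at which it can get to main memory" is a roofline-style bound: the achievable floating-point rate is capped either by the processor's peak or by memory bandwidth times the number of operations done per byte fetched. This is a generic textbook-style model, not a Cray tool, and all figures below are assumed values.

```python
def attainable_flops(peak_flops, mem_bandwidth_bytes_per_s, flops_per_byte):
    """Upper bound on sustained FLOP/s: the lower of the compute peak and
    what memory can feed (bandwidth times arithmetic intensity)."""
    return min(peak_flops, mem_bandwidth_bytes_per_s * flops_per_byte)


if __name__ == "__main__":
    peak = 4e9          # assumed processor peak, FLOP/s
    bandwidth = 2e9     # assumed memory bandwidth, bytes/s
    intensity = 1 / 8   # one operation per 8-byte value streamed from memory
    rate = attainable_flops(peak, bandwidth, intensity)
    print(f"attainable: {rate / 1e9:.2f} GFLOP/s ({100 * rate / peak:.1f}% of peak)")
```

    With a streaming workload that does one operation per value fetched, the assumed processor sits far below its peak no matter how fast its clock is; only more memory bandwidth (or more work per byte) raises the bound.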


    Second (yes, I work for Cray, so now I'm going to put in a sales pitch :), our processors are vector processors. As such, you can hide a lot of the latency of getting to memory by queueing up 64 loads at once. Short vectors are what MMX and AltiVec use to accelerate graphics. With sufficient vector operation chains, you can keep the processor busy all the time. You can't do that on a PC. I've heard (no, I don't have actual links to articles) that 10% of peak performance on a cluster is considered really good. Our customers wouldn't consider that anywhere near "really good".
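
    The "64 loads at once" point can be sketched with Little's law: sustained bandwidth is the amount of data in flight divided by the memory latency. The 64 comes from the comment above; the 8-byte load size and 100 ns latency are assumptions chosen for illustration, not X1 specifications.

```python
def sustained_bandwidth(outstanding_loads, bytes_per_load, latency_s):
    """Little's law: bytes in flight divided by latency gives bytes per second."""
    return outstanding_loads * bytes_per_load / latency_s


if __name__ == "__main__":
    latency = 100e-9  # assumed memory latency in seconds
    for in_flight in (1, 8, 64):
        bw = sustained_bandwidth(in_flight, 8, latency)
        print(f"{in_flight:2d} loads in flight: {bw / 1e9:6.2f} GB/s")
```

    At a fixed latency, a processor that waits for each load before issuing the next gets 1/64 of the bandwidth of one that keeps 64 loads in flight, which is the sense in which vectors hide latency.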


    Third, there's memory. Lots of it. A single system image supercomputer can have terabytes of memory in one kernel image. You're simply not going to get that in a single PC cabinet.


    Finally, in case anyone doubts that vectors, big memory, and large bandwidth can make a good system, the fastest machine in the world right now is the Japanese "Earth Simulator" machine which is an NEC SX machine. That is somewhat similar in architecture to a Cray in that it has large bandwidth and vectors.

    --
    Go Badgers! -- #include "std/disclaimer.h"
  10. Re:Correct me if I'm wrong ... by morcheeba · · Score: 5, Informative
    Yep, you are a bit wrong... (you didn't think a challenge to the slashdot community would go unnoticed?!)

    From this site, you can see the breakdown by organization:

    Usage       Count    Share     Rmax    Rpeak    Procs
    Industry      202   40.4 %    82398   182964    62869
    Research      131   26.2 %   187689   278030   120046
    Academic      115   23.0 %    77143   133564    45216
    Classified     27    5.4 %    14167    20691    12892
    Vendor         22    4.4 %    11033    15545     5230
    Government      3    0.6 %     1317     2256      528
    Total         500  100.0 %   373749   633052   246781
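
    As a side calculation on the table above: Rmax is the measured Linpack result and Rpeak the theoretical peak, so their ratio gives a rough efficiency per sector. The sketch below just divides the two columns as quoted; the function and variable names are illustrative.

```python
# (Rmax, Rpeak) pairs copied from the table above.
SECTORS = {
    "Industry": (82398, 182964),
    "Research": (187689, 278030),
    "Academic": (77143, 133564),
    "Classified": (14167, 20691),
    "Vendor": (11033, 15545),
    "Government": (1317, 2256),
    "Total": (373749, 633052),
}


def efficiency(rmax, rpeak):
    """Fraction of theoretical peak actually achieved on the benchmark."""
    return rmax / rpeak


if __name__ == "__main__":
    for name, (rmax, rpeak) in SECTORS.items():
        print(f"{name:<11}{100 * efficiency(rmax, rpeak):5.1f} % of peak")
```

    On these figures the sectors land between roughly 45% and 71% of peak on Linpack, with industry systems at the low end.
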
    There are a lot of companies that use supercomputers, although maybe not the type you're thinking of. Of course, there are the number-crunchers: oil companies are big users (to crunch data and find new oil), as are car companies (BMW). But there are also the transaction-processors, like Sprint PCS and eBay (used to be in the top 500), that make the list just by the sheer number of connected processors.

    Here's the latest list
  11. Economies of Scale by dprice · · Score: 4, Informative

    In the 1970's and 1980's, Cray and other supercomputer companies fit in the niche of "fastest computing at any cost". The design cycles were long for the specialized hardware that pushed the boundaries of the available technology. Companies and government agencies were willing to pay the high price since there was enough processing speed difference between the supercomputers and the "vanilla" computers.

    By the early 1990's, the "attack of the killer microprocessors" came. The PC class processors were still weak, but the higher dollar RISC processors used in workstations, like Sun, were reaching performance levels close to what the supercomputers were able to deliver. Since they were based on higher volume and more standardized processors, the price/performance of the RISC workstations started eating into the mainframe and supercomputer market. Many of the supercomputer companies died off, and some started to incorporate RISC processors into their designs. By the mid 1990's I believe that Tera and Cray were the last remaining old-school supercomputer companies left. The rest either died or were absorbed into other companies.

    Today, the investment required to produce the fastest processor chips is so high that it requires large unit volumes to pay for the cost of development and production. The PC class processors, with their high volumes, are putting pressure on the old style workstation market, where each company makes their own processor (SPARC/Sun, PA-RISC/HP, Alpha/DEC). We see Sun struggling as the PC's eat their market. Even some large scale supercomputers are based on the PC processors. The majority of the computer spectrum from low to high end is based on the same families of processors (Intel, AMD, PowerPC).

    So that brings us to Cray/Tera. Cray seems to go against the economies of scale that drive the rest of the computing industry. What keeps them running is a small niche that the government is willing to keep funded. It is similar to the funding of exotic bombers and fighter jets. We probably won't see Cray grow much larger than they currently are. They will be kept running because they form a critical part of national security, or at least that is what the government believes.

  12. looks like Cray is going with the Opteron by Kargan · · Score: 5, Informative

    The Sandia National Labs supercomputer (code name: Red Storm), currently being built by Cray, is going to be powered by 10,000 Opteron processors. A 40 Teraflop theoretical peak will put it at the top of the supercomputer list, being approximately 4 Teraflops faster than the NEC Earth Simulator, the current champ.
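
    A quick sanity check on those figures: dividing the quoted 40 Teraflop peak by 10,000 processors gives the peak each Opteron would have to contribute. The 2 floating-point operations per cycle used to back out a clock speed is an assumption added here, not something stated above.

```python
def per_processor_gflops(peak_tflops, processors):
    """Peak GFLOPS each processor must supply to reach the system peak."""
    return peak_tflops * 1000.0 / processors


def implied_clock_ghz(gflops_per_processor, flops_per_cycle=2):
    """Clock rate implied by a per-processor peak, given an ASSUMED number
    of floating-point operations completed per cycle."""
    return gflops_per_processor / flops_per_cycle


if __name__ == "__main__":
    each = per_processor_gflops(40, 10_000)
    print(f"{each:.1f} GFLOPS per processor")
    print(f"about {implied_clock_ghz(each):.1f} GHz at 2 operations per cycle (assumed)")
```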

    --
    Palaces, barricades, threats, meet promises
  13. Re:More elegant than the macs, back in the day by Anonymous Coward · · Score: 1, Informative

    The waterfall Cray is the T90 and that's not water, but Fluorinert. I've seen one in person. Talk about cool furniture.

  14. Cray Comeback? Desktop Cray! by Styx · · Score: 3, Informative

    I've been using Desktop Cray for a while now. It took me some time to tweak the settings to perfection, but now it's just running along. Check it out!

    --
    /Styx
  15. Re:Correct me if I'm wrong ... by adam872 · · Score: 2, Informative

    I work in the Oil&Gas business and we use Linux clusters (and in the past bloody large Sun, IBM and SGI systems) for seismic processing and reservoir simulation. These particular problems are DSP and FP intensive and also can require a fairly large amount of memory to run. They are exactly the kind of commercial workload either a supercomputer or cluster can chew on.

    Some of our customers (I work for a company that writes the software, amongst other things) have upwards of 100TB of 3D Seismic they want to process. These jobs can take weeks or months to run. The simulation jobs can take days as well. Obviously having a big computer or tight cluster of lots of small ones will help decisions get made faster and/or more accurately.

    There are other examples too: I met a gentleman who works for the lab that does crash simulation for Porsche, Audi and VW. Another example would be an ex-boss of mine who went to work for an engine manufacturer, who used a couple of SGIs to simulate the bore and stroke in a cylinder. The simulation took several weeks to run. They need large computers to do this too. So there is a market for these machines.