Slashdot Mirror


Time For A Cray Comeback?

Boone^ writes "The New York Times has an article (free reg. req.) talking about Cray Inc.'s recent resurgence in the realm of supercomputing. It discusses a bit of Cray's decline when the Cold War ended, "the occupation" under SGI, and the rebirth of the company after the Tera (now Cray Inc.) purchase. Recently Cray Inc. has been shipping their vector-based Cray X1 machine, designing ASCI Red Storm, and recently was one of 3 (also Sun, IBM) to win a large DARPA contract (PDF link) to design and develop a PetaFlops machine by 2010. Could Cray Inc. be poised for a comeback? Wall Street seems to think so."

19 of 266 comments

  1. Icon is back by aspelling · · Score: 2, Insightful

    Many scientists are very concerned about the state of supercomputing in the US. Hopefully a new generation of supercomputers will improve this situation.

  2. Correct me if I'm wrong ... by SuperDuG · · Score: 5, Insightful
    ... but isn't it a fact that the market for supercomputers isn't exactly that large? I mean you've got governmental contracts (research, educational, who knows what) that have to take up 95% of all the purchases made, and then a small private market. I mean how many companies are striving for a petaflop machine to run their database server?

    If you look at the list of top 100 supercomputers, there are systems that are almost 15 years old or even older (not sure on a few). I know these take years to build and are multibillion dollar projects, but the time in between has got to be a killer.

    Then there's the question of ... what do you need a supercomputer for? The applications that need a petaflop computer are pretty limited, unless you're doing mass storage, cryptography (cracking), or simulations.

    Don't get me wrong, I'm all about nuclear testing being done in 1's and 0's instead of in the ocean or in the desert, but how big of a bomb do you really need when it's estimated there are enough nukes to blast the entire land surface of the earth 3 times over?

    --
    Ignore the "p2p is theft" trolls, they're just uninformed
    1. Re:Correct me if I'm wrong ... by MxTxL · · Score: 4, Insightful

      Then there's the question of ... what do you need a supercomputer for?

      To advance the state of the art. And not just in the field of computers, but also in any field that ends up benefiting from this. Which is potentially very many. Aerospace, geology, meteorology... there are BUNCHES of fields that greatly benefit from having more and more massively powerful computers. Sure, most projects can't afford to have the latest and greatest of the state of the art in supercomputing, but the fact that the state of the art progresses will push prices down on the older technologies that most labs CAN afford. This is a benefit for science as a whole.

  3. Resurrection, not come back by Anonymous Coward · · Score: 2, Insightful

    Cray died. Anything else is just bartering on his name.

  4. Re:explain by Arker · · Score: 3, Insightful

    Other posters have already pointed out the bandwidth issues over and over, so I'll skip that obvious difference.

    The fact is that not all problems are suitable to parallel processing. Sometimes you really need to know the outcome of one operation before you can go on to the next.

    Beowulf clusters really suck on problems where that applies. Cray style supercomputers shine on them.
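    A minimal sketch of that distinction, in Python. The `step` function and the worker count are made up for illustration; the point is only that the first routine can be farmed out to many workers while the second cannot, because every operation waits on the one before it.

```python
from concurrent.futures import ThreadPoolExecutor


def step(x):
    """Hypothetical unit of work; stands in for a single operation."""
    return (x * x + 1) % 1_000_003


def independent(inputs):
    # Each result depends only on its own input, so the work can be
    # handed to separate workers and computed in any order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(step, inputs))


def chained(seed, n):
    # Each operation needs the previous result, so operation k cannot
    # start until operation k-1 has finished, however many workers exist.
    x = seed
    for _ in range(n):
        x = step(x)
    return x
```

    Adding nodes helps `independent` and does nothing for `chained`; only a faster single processor speeds up the second case.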

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-
    Friends don't let friends enable ecmascript.
  5. It's also about better (not just faster) computing by binaryDigit · · Score: 3, Insightful

    Don't just think about solving a static problem faster, it's also about solving a problem better through the use of more variables. Take weather simulation. If having too many variables stretches today's forecast into next week, then it's useless. So you limit the number of variables to come up with a "close enough" forecast in a more timely manner. With a faster computer, you can get a more accurate simulation in a more reasonable time period. This increase in accuracy/complexity is then useful in many fields.

  6. Comeback? by virtual_mps · · Score: 5, Insightful

    Probably not. Cray made some money back when a supercomputer was something that an ordinary company might need. The capabilities of "normal" computers were much more limited than today, so there was a much higher percentage of the buying public likely to want something more. These days the vast majority of users are happy with something mainstream.

    But, you ask, isn't there a lunatic fringe who wants more power at any price? Well, the lunatic fringe ain't what it used to be. During the heyday of cray you got a damn fine box and nothing else. Cray didn't want to worry about your software--or even an OS. A person who needed the speed would plunk down the money for the box and then pay a couple of guys to code everything from scratch. Those days are gone--software is the driving factor these days, and people are far less willing to buy something that's going to force a total code rewrite. Especially if that thing is only going to buy them a couple of years of edge before they need to recode for the next best thing.

    Then there's the question of whether cray can afford to be bigger. The answer is "probably not". If you sell to a lot of customers you need a huge support infrastructure. Cray doesn't have much of one anymore, so they'd need to buy one. (Most of the old support guys left one way or another when SGI came in, or stayed with SGI.) If you have a lot of customers you can spread the costs around, but in the case of a company like cray a support infrastructure means having people sitting around most of the time in every region where you sell a machine. Maybe two to four guys per system (24x7, right?) plus some sorta warehouse facility if you enter a new geographical market. That's expensive. You can bill a lot of that cost back to the customers, but that just makes your systems less competitive.

    I think the long term answer is that cray will be a very small niche player, selling to a very select group of (U.S.) government agencies, with the occasional pro forma business customer thrown in so the company can issue press releases. Even most government facilities aren't in a position to buy a cray anymore. (Research money is fairly tight, recoding costs are prohibitive, MTBFs are more of an issue than they used to be, etc.)

    1. Re:Comeback? by Rasta+Prefect · · Score: 4, Insightful
      Probably not. Cray made some money back when a supercomputer was something that an ordinary company might need. The capabilities of "normal" computers were much more limited than today, so there was a much higher percentage of the buying public likely to want something more. These days the vast majority of users are happy with something mainstream.

      Cray has never sold computers that are anything like what a normal company would need. Cray machines are made for heavy number crunching - Vector processors are made for simulation tasks. They're very good at them. However they perform abysmally at most other tasks - buying one for use as, say, a database or application server would be stupid.

      But, you ask, isn't there a lunatic fringe who wants more power at any price? Well, the lunatic fringe ain't what it used to be. During the heyday of cray you got a damn fine box and nothing else. Cray didn't want to worry about your software--or even an OS.

      Last time I checked Cray shipped UNICOS with their machines. It's a fairly BSDish UNIX variant. It's a bit of an oddball, but not all that much more of a PITA than say, IRIX or AIX. Want to port your beowulf apps? No problem! When I spent a summer working on a T3E all of our multi processor apps used MPI. Vectorization of C and FORTRAN apps is largely taken care of by the compiler. So where's all this programmer investment you're talking about? Most of the kinds of apps that you're going to run on a Cray (Weather models, crash simulations, Gaussian for chemical sims, etc) already run on a Cray, and you're probably going to be modifying them anyway.

      I think the long term answer is that cray will be a very small niche player, selling to a very select group of (U.S.) government agencies, with the occasional pro forma business customer thrown in so the company can issue press releases. Even most government facilities aren't in a position to buy a cray anymore. (Research money is fairly tight, recoding costs are prohibitive, MTBFs are more of an issue than they used to be, etc.)

      Cray isn't in the business of selling large business systems. Cray is, always has been, and likely always will be a competitor in the scientific computing market. Yeah, this means they're not going to be a Sun or IBM that sells to business customers for business needs, but that's not the sort of company they're trying to be so the comparison is pointless. They're selling machines to people who need to do heavy duty number crunching. This means universities, government agencies and large companies doing lots of product research. Typically the cost of using these sorts of machines is spread around - frequently instead of buying the machine, you'll go to a company like Network Computing Services and buy time on a machine. It works out well. There will always be a certain number of organizations that need this sort of heavy duty computing power, and Cray will be there to serve them.

      --
      Why?
    2. Re:Comeback? by virtual_mps · · Score: 3, Insightful
      Cray has never sold computers that are anything like what a normal company would need. Cray machines are made for heavy number crunching - Vector processors are made for simulation tasks. They're very good at them. However they perform abysmally at most other tasks - buying one for use as, say, a database or application server would be stupid.

      I don't recall saying that cray was trying to sell general business machines. But even for scientific applications, the number of customers who need a cray as opposed to being able to use a commodity cluster is much lower than the number who needed a cray instead of an IBM 360. There are businesses out there who use computers for more than spreadsheets and web servers. By "ordinary company" I meant to draw attention to that part of the market whose budget isn't classified.

      Last time I checked Cray shipped UNICOS with their machines. It's a fairly BSDish UNIX variant. It's a bit of an oddball, but not all that much more of a PITA than say, IRIX or AIX.


      I guess you didn't do much porting of mainstream applications to a cray. The lack of virtual memory, the funny type sizes in C, and other things that application writers make assumptions about (things that aren't technically guaranteed to work in ANSI C but do work on every other system in the world) could make porting a real problem. Things have gotten a lot better, but I can assure you that a unicos port of, say, perl or gcc was not in the same league as an irix port of the same app. One of the things cray is finally bowing to is the demand for virtual memory. Seymour never wanted it (didn't want the performance hit) but it's real hard to sell that in today's marketplace. The question is how much cray can back off of its old "speed is king" philosophy when their whole business is making fast computers.

      Want to port your beowulf apps? No problem! When I spent a summer working on a T3E all of our multi processor apps used MPI.

      You've kinda missed the boat. The point of the cutting-edge cray supercomputers isn't to run mpi apps--those do quite nicely on commodity clusters. The T3E is an MPP super--not a vector super. It's where cray was 10+ years ago, not where they want to be tomorrow. The point of cutting edge is to create new paradigms. That definitely helps your performance, but it kills your compatibility.

      Vectorization of C and FORTRAN apps is largely taken care of by the compiler.

      Wow. Let's just say that when you're on the kind of project that can command the state of the art you don't depend on compiler autoparallelization.

      So where's all this programmer investment you're talking about? Most of the kinds of apps that you're going to run on a Cray (Weather models, crash simulations, Gaussian for chemical sims, etc) already run on a Cray,

      Please, read up on the tera system, for example, and try to understand how it's different from a T3E.
    3. Re:Comeback? by virtual_mps · · Score: 4, Insightful
      My thinking, however, is that the same is true today and for all of the top 100 supercomputers in the world. That is to say, each one of those machines is a custom hardware installation,

      Yes and no. The problem is that a cray box has to cover the whole R&D cost for an entire system. When IBM sells you an SP2 most of the R&D is spread across their much higher volume business lines. Same with an intel based cluster--the technology specific to the HPC market is basically the interconnect, and the rest is subsidized by video game players. There's also the compiler cost (you don't sell many fortran compilers outside the scientific market) but the salaries for a few compiler writers are much lower than the cost of designing a cutting-edge cpu from scratch.

      At the same time, however, any of these applications are fully capable of utilizing as much hardware resources as you have available.

      That's always true. The question is whether they can use the resources efficiently, and whether the cost/op is competitive. You're right about the algorithms being the driving force, but I'd argue that it is unusual for an algorithm that's optimized for one architecture to run optimally if you move it to a radically different architecture. People can spend years trying to squeeze a couple more percent out of their code, and they don't want to start from scratch unless there's a very good reason. Then there's the problem that researchers tend to not work in a bubble. Even if you can afford to buy the most expensive machine on the block you might end up shooting yourself in the foot if nobody else in your field can collaborate with you.

      user interface software is kept at a minimum

      You've got that right--most of the examples I've seen are pretty...spartan.
  7. Re:explain by Anonymous Coward · · Score: 1, Insightful

    You're simply not going to get that in a single PC cabinet.

    In 10 years I might! =P

  8. Classified? Re:Correct me if I'm wrong ... by SpikeSpiff · · Score: 4, Insightful
    To me, the 5.4% classified is improbable. The same defense establishment that kept the $100s of millions stealth fighter secret for five years can certainly keep multi-million dollar computers secret.

    Especially because it's so much easier to hide a computer than an airplane. No sightings in area 51....

    We have to assume that the state of the art is way past the public data. Cray has a "lousy" $150 MM in yearly revenue. They could be spending 10X that on heavy computing for national security. The government is spending $25BB on intelligence and another $400 BB on defense every year. Cray could be a drop in the bucket, even a red herring. I'd love to know what is going on in the basements at Fort Meade.

    --
    "All that is required for evil to triumph is for good men to do nothing." - Edmund Burke
  9. More elegant than the macs, back in the day by BelugaParty · · Score: 2, Insightful

    I really want to see cray come out with more waterfall computers. I thought that was the greatest thing in the world when I saw it on Beyond2000! way back in the day. The contemporary "elegant mac" isn't even in the same aesthetic/functional dimension as that cray machine.

    Ah, glory days.

  10. Re:explain by imnoteddy · · Score: 4, Insightful
    I've heard (no, I don't have actual links to articles) that 10% of peak performance on a cluster is considered really good.

    Sounds like Cray marketing articles. For example, Daniel Katz at JPL wrote in 1997:

    it is possible to construct a 16-node machine with a theoretical peak performance of 3.2 GFlop/s and a typical sustained performance of 1.2 GFlop/s
    which is > 35% of peak. Or consider this from the University of Liverpool:

    The current Beowulf cluster can deliver a theoretical peak performance of about 100 Gigaflops (billions of floating point operations per second) and has been observed to deliver about 60 Gigaflops.

    The observed performance was based on LU decomposition.

    For sustained/peak of about 60%.

    I have no doubt that one could find problems where a Beowulf cluster has 10% efficiency, but there are many real problems that are good to go on a cluster. And even if you only got 10% it would be worth it if the cluster cost 5% of what a vector computer costs. Not to mention that performance/$ on commodity hardware increases by a factor of 2 every 12-24 months. It takes years to develop a supercomputer, and they are stuck at their level of technology for several years since they are so expensive to redesign.
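    The arithmetic behind those figures, as a rough Python sketch. The two measured cases use the numbers quoted above; the last comparison is hypothetical and assumes both machines have the same theoretical peak and that the vector machine reaches 100% of it, which is the most favourable case for the vector machine.

```python
def efficiency(sustained, peak):
    """Fraction of theoretical peak actually delivered."""
    return sustained / peak


def sustained_per_cost(peak, eff, cost):
    """Sustained performance bought per unit of money."""
    return peak * eff / cost


jpl = efficiency(1.2, 3.2)            # 16-node machine, GFlop/s
liverpool = efficiency(60.0, 100.0)   # Liverpool cluster, GFlop/s

# Hypothetical: equal peak (normalised to 1.0), cluster at 10% efficiency
# and 5% of the price, vector machine assumed to run at 100% of peak.
cluster = sustained_per_cost(peak=1.0, eff=0.10, cost=0.05)
vector = sustained_per_cost(peak=1.0, eff=1.00, cost=1.00)

print(f"JPL: {jpl:.1%}, Liverpool: {liverpool:.1%}")
print(f"cluster vs vector, sustained per dollar: {cluster / vector:.1f}x")
```

    Under those assumptions the cluster still delivers about twice the sustained work per dollar, and more if the vector machine falls short of its own peak.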

    --
    No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
  11. Re:explain by funbobby · · Score: 3, Insightful

    Moving people in planes is not a good analogy because it is perfectly parallel. Each person getting to the destination is not in any way dependent on the other people's journey, so splitting up the work has no overhead.

    The Cray design philosophy is for solving problems that can't be split up easily. If all of the parts of the problem depend heavily on one another, you pay a large price for communication when you split it up. That's the situation where the cluster doesn't do as well as the Cray. So each design has its strengths, and it really depends on the problem.
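    A toy model of that trade-off, in Python. Everything here is an assumption for illustration: the work divides evenly, and each extra worker adds a fixed communication cost. Real interconnects behave in more complicated ways, so treat the numbers as a shape, not a measurement.

```python
def run_time(work, workers, comm_per_worker):
    """Toy model: evenly divided compute plus a fixed cost per extra worker."""
    return work / workers + comm_per_worker * (workers - 1)


def speedup(work, workers, comm_per_worker):
    """Speedup relative to one worker, which pays no communication cost."""
    return work / run_time(work, workers, comm_per_worker)


# Perfectly parallel (the planes analogy): no communication at all.
print(speedup(work=100, workers=4, comm_per_worker=0))

# Tightly coupled: communication eats most of the gain, and adding
# workers eventually makes the run slower than a single processor.
print(speedup(work=100, workers=4, comm_per_worker=10))
print(speedup(work=100, workers=16, comm_per_worker=10))
```

    With zero communication cost the speedup equals the worker count; with a heavy cost, sixteen workers end up slower than one.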

  12. Re:The trick is keeping ahead of the commodity guy by Rasta+Prefect · · Score: 2, Insightful
    The new Japanese NEC supercomputer came with a price tag of about $160 million if I remember correctly (some estimates say that it took $1G in research funding) and hits 35 TFlops (sustained). #3 on the Top 500 supercomputers list is a Beowulf cluster with 2304 processors coming in at 7.6 TFlops (sustained). Even figuring $2000/processor + interconnect, that puts the Beowulf cluster at around $5 million or 1/32 of the cost for 1/5th of the performance (roughly speaking).

    Number of TFLOPS isn't everything. The move back to vector style processors in super computing has been largely inspired by the fact that beowulf clusters work really well for some problems - and very, very poorly for others. If you've got a problem that divides nicely into discrete chunks that don't require a lot of interprocessor communication, then yeah, sure go with beowulf. But complex simulation problems have a tendency to leave most of the processors idling while the cluster talks to itself due to network speed issues.
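    For reference, the quoted comparison works out as below (Python, using only the figures from the quote; the $5 million cluster price is the parent's own rounded estimate including interconnect, not a real invoice). It captures raw TFlops per dollar only and says nothing about how either machine behaves on communication-bound problems.

```python
nec_cost_usd = 160e6        # quoted price of the NEC machine
nec_tflops = 35.0           # quoted sustained TFlops
cluster_cost_usd = 5e6      # parent's rounded estimate, incl. interconnect
cluster_tflops = 7.6        # quoted sustained TFlops

processors_only_usd = 2304 * 2000              # before interconnect
cost_ratio = nec_cost_usd / cluster_cost_usd   # NEC cost / cluster cost
perf_ratio = cluster_tflops / nec_tflops       # cluster perf / NEC perf

nec_tflops_per_musd = nec_tflops / (nec_cost_usd / 1e6)
cluster_tflops_per_musd = cluster_tflops / (cluster_cost_usd / 1e6)

print(f"processors alone: ${processors_only_usd:,}")
print(f"cost ratio: {cost_ratio:.0f}x, performance ratio: {perf_ratio:.2f}")
print(f"TFlops per $M: NEC {nec_tflops_per_musd:.2f}, "
      f"cluster {cluster_tflops_per_musd:.2f}")
```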

    --
    Why?
  13. Re:Or maybe.... by virtual_mps · · Score: 2, Insightful

    Yeah, your point? You said nothing about the reliability of one system versus another. There's a lot more that goes into designing a reliable system than spouting off some made-up statistics about cpu failures.

  14. Cray and Wall St? by Anonymous Coward · · Score: 1, Insightful

    Wall St. can't buy it, whatever it is... Cray Inc has more shares outstanding than Cray Research did in its heyday approaching $1B/yr sales. Anybody on Wall St. who thinks this stock is going up like the old Cray simply hasn't done their homework. As other posters have pointed out, vectors are cool and have a place but way too much of the everyday supercomputer work can be handled by clusters and such. They have a niche and it's cool but don't expect it to grow like the last Cray did.

    The really frightening thing about Cray is the people in control (Seattle) built a computer that doesn't work (Tera) and the people not in control (Mpls / Chippewa Falls) are generating all the revenue with their boxes that do work. Too bad they have to carry Burton Smith around on their backs.

    No insider info here. You can find all this and more in the annual reports. Happy reading.

  15. Re:Economics of Scale by tesmako · · Score: 2, Insightful
    What you are missing is that Cray really does have a niche that PC processors cannot at this time touch: vector processors. Having insane performance at vector tasks with a somewhat specialised vector processor is a lot easier than with a general purpose mips-descendant. It is not an all that highly competitive niche but is highly profitable (if you have a vector-heavy task a modern Cray vector-processor is not only extremely fast, it is even price efficient at that speed). Let's not forget either that Cray holds a lot of neat patents (most interesting are their compilation technique patents) for vector-processing problems.

    Another interesting thing to note is that old Cray is not only keeping the company "Cray" afloat, to some extent it is a division from Cray that is making Sun the most money these days too. The extreme SMP machines from Sun (think 106 processor Fire 15K) are created by a division of the company that Sun bought from SGI when SGI bought Cray. Cray toyed with Sparc SMPs back in that day and SGI felt a bit uncomfortable dealing with sparcs so they sold it off cheap. The best purchase Sun has made in the last decade.

    All in all I am sure that Cray has a lot to offer, they have shown off their technical skills many times in the past and the technology has aged quite well for this business.