Time For A Cray Comeback?
Boone^ writes "The New York Times has an article (free reg. req.) talking about Cray Inc.'s recent resurgence in the realm of supercomputing. It discusses a bit of Cray's decline when the Cold War ended, "the occupation" under SGI, and the rebirth of the company after the Tera (now Cray Inc.) purchase. Recently Cray Inc. has been shipping their vector-based Cray X1 machine, designing ASCI Red Storm, and recently was one of 3 (also Sun, IBM) to win a large DARPA contract (PDF link) to design and develop a PetaFlops machine by 2010. Could Cray Inc. be poised for a comeback? Wall Street seems to think so."
Posting as Anonymous Coward, please award my Karma to starving children in the world.
SCO vs. Cray
Of course I expect that...in my Playstation IV,
equipped with an opto-quantic Emotion Engine VI
and a couple petabytes of holographic storage.
-><- no
There are still MANY applications for supercomputers. A lot of people think that linux/beo-clusters are going to be replacing supercomputers of the Cray/NEC/IBM variant. Not true. There are still many research, scientific, and military applications that require machines developed not for "slow" distributed number crunching, but for ultra high speed processor and memory architectures.
So definitely, time for Cray to come back and retake the supercomputer industry crown.
There's a whole bunch of PETAFlops outside of McDonalds right now having a sit in and screaming about how fur is murder.
I had to literally step on their faces to get a Big Mac.
I don't need no instructions to know how to rock!!!!
If you look at the list of top 100 supercomputers, there are systems that are almost 15 years old or even older (not sure on a few). I know these take years to build and are multibillion dollar projects, but between time has got to be a killer.
Then there's the question of ... what do you need a supercomputer for? The applications that need a petaflop computer are pretty limited, unless you're doing mass storage, cryptography (cracking), or simulations.
Don't get me wrong, I'm all about nuclear testing being done in 1's and 0's instead of in the ocean or in the desert, but how big of a bomb do you really need when it's estimated there's enough nukes to blast the entire land surface of the earth 3 times over.
Ignore the "p2p is theft" trolls, they're just uninformed
Well, a well engineered supercomputer has much less overhead than a cluster. One superfast processor doesn't have to deal with interprocessor communications like a cluster does.
And if your supercomputer has multiple processors, they are generally made to cooperate nicely to speed efficiency. Whereas a cluster has to go through ethernet and hardware layers to communicate between nodes. Granted that is fast, but on-board communication is faster.
It seems strange, but a multiple processor computer can actually perform a task slower than just one processor working on the problem if the program and os aren't designed well. So a lot of the value of a supercomputer comes in its design, and the reputation of the manufacturer. And Cray is pretty reliable in my book.
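To put rough numbers on the "more processors can be slower" point, here is a toy model. The cost formula and every constant in it are assumptions picked for illustration, not measurements from any real machine: a fixed serial part, a parallel part split evenly, and a communication cost that grows with the number of other processors.

```python
def run_time(n_procs, serial=1.0, parallel=10.0, comm_per_peer=0.5):
    """Toy run time in arbitrary units (all constants are hypothetical).

    serial        : work that cannot be split
    parallel      : work divided evenly across processors
    comm_per_peer : overhead paid for each additional processor to coordinate with
    """
    return serial + parallel / n_procs + comm_per_peer * (n_procs - 1)


for n in (1, 2, 4, 8, 16, 32):
    print(f"{n:3d} processors: {run_time(n):6.2f} time units")
```

With these made-up constants the run time bottoms out at a handful of processors and then climbs, and at 32 processors the job takes longer than on one. A faster interconnect corresponds to a smaller comm_per_peer, which pushes that turning point further out.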
But the REAL key to the potential comeback of the Cray computer will be whether or not it still has cool bubbles! Wow!!! Cray computing... the inventor of case mods.
Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
is there a secret message here? should tom ridge be called?
2 1337 4 u!
can someone explain to me what the benefit of a moving van is compared to buying a fleet of pintos?
Memory to processor feeding: std ots processors are often idle because the memory subsystem cannot feed the processor fast enough. This is bad now. It will be getting a lot worse.
Interconnections between processors: this goes beyond merely processors on a board, but between boxes. The bus architectures out there for the std ots hardware get saturated very quickly. This gets worse between boxes. In addition the latency on Myrinet and Quadrics (compared to what Cray et al do) is horrible even if it is excellent compared to ethernet.
Problem set vs architecture: Not all problems map out well to clusters, or even SMP boxen. Some map best to vector machines. Some map best to tightly integrated MPPs. Some map out to moderately tight clusters. Some are just plain 'embarrassingly parallel'. Others are highly threaded and don't work well on vector or scalar machines. etc, etc. The architecture ought to match the problem set.
MTBF: Mean time between failures. Commodity hardware goes kaputt much more often. A cluster capable of matching the teraflop performance of custom hardware tends to need constant and evil levels of care and feeding: i.e. you better have a grad student on roller blades.
Those are just off the top of my head. I am sure that others will tell you more before I can post again. ;)
Summarized: bandwidth, latency, problem set, and failure rate.
HTH.
Do you know why the road less traveled by is littered with the bones of the unwary?
SCO vs Nike
Look at me, I'm a stock analyst!
Second, (yes, I work for Cray so now I'm going to put in a sales pitch
Finally, there's memory. Lots of it. A single system image supercomputer can have terabytes of memory in one kernel image. You're simply not going to get that in a single PC cabinet.
Finally, in case anyone doubts that vectors, big memory, and large bandwidth can make a good system, the fastest machine in the world right now is the Japanese "Earth Simulator" machine which is an NEC SX machine. That is somewhat similar in architecture to a Cray in that it has large bandwidth and vectors.
Go Badgers! -- #include "std/disclaimer.h"
Hahahaha. Have you ever actually run a supercomputer? They tend to have much higher failure rates than normal servers. Couple of reasons: first, they push the envelope of a given technology. The sweet spot for stability is not the leading edge. Second, they're not nearly as well tested as mainstream hardware. On a platform with thousands of installations you're much less likely to run into a problem nobody has seen before than you are on a platform with only dozens of installations.
...isn't 'Cray' today about as 'Cray' as the company that now owns 'Atari'? What's left besides the name of the original company?
Does it hurt to hear them lying? Was this the only world you had?
Well, almost. Let's say I have a plane that can accommodate 100 people and does NY->London in 6 hours.
My problem is that I have to move 1000 people from NY to London
Now I can either:
1. I can buy a plane that is 20 times faster, 20 times more expensive. That's the supercomputer
2. I can buy 9 other planes (same as mine) and achieve the same results as in 1 for less than half the price (I'll let you do the math). That's the cluster.
3. I can buy a plane that has a capacity of 1000 people. That's the parallel supercomputer. But if that one can do the deal for my specific problem, it proves to be not that flexible if my problem changes (ie: 500 people NY->London and 500 people from NY->LA).
That's the power of the Beowulf cluster!!!
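For anyone who wants "the math" done, here is a quick sketch of the three options. It assumes a single plane has to fly back empty between loads and ignores boarding and turnaround time; neither assumption is stated in the post, and the function name is made up for this example.

```python
import math

PASSENGERS = 1000
BASE_CAPACITY = 100        # seats on the original plane
BASE_FLIGHT_HOURS = 6.0    # NY -> London, one way


def hours_to_move(passengers, planes, capacity, flight_hours):
    """Hours until the last passenger lands.

    Assumes every plane flies full loads and returns empty between trips
    (hypothetical simplification: no boarding or refuelling time).
    """
    trips_per_plane = math.ceil(passengers / (planes * capacity))
    legs = 2 * trips_per_plane - 1   # outbound legs plus empty returns
    return legs * flight_hours


fast_plane = hours_to_move(PASSENGERS, 1, BASE_CAPACITY, BASE_FLIGHT_HOURS / 20)
fleet = hours_to_move(PASSENGERS, 10, BASE_CAPACITY, BASE_FLIGHT_HOURS)
jumbo = hours_to_move(PASSENGERS, 1, 10 * BASE_CAPACITY, BASE_FLIGHT_HOURS)

print(f"1 plane, 20x faster : {fast_plane:.1f} h")
print(f"10 ordinary planes  : {fleet:.1f} h")
print(f"1 plane, 1000 seats : {jumbo:.1f} h")
```

Under those assumptions the 20x plane needs 19 legs (10 out, 9 back), which lands it at roughly the same finish time as the fleet of ten, at twice the price of the fleet.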
Write boring code, not shiny code!
Probably not. Cray made some money back when a supercomputer was something that an ordinary company might need. The capabilities of "normal" computers were much more limited than today, so there was a much higher percentage of the buying public likely to want something more. These days the vast majority of users are happy with something mainstream.
But, you ask, isn't there a lunatic fringe who wants more power at any price? Well, the lunatic fringe ain't what it used to be. During the heyday of cray you got a damn fine box and nothing else. Cray didn't want to worry about your software--or even an OS. A person who needed the speed would plunk down the money for the box and then pay a couple of guys to code everything from scratch. Those days are gone--software is the driving factor these days, and people are far less willing to buy something that's going to force a total code rewrite. Especially if that thing is only going to buy them a couple of years of edge before they need to recode for the next best thing.
Then there's the question of whether cray can afford to be bigger. The answer is "probably not". If you sell to a lot of customers you need a huge support infrastructure. Cray doesn't have much of one anymore, so they'd need to buy one. (Most of the old support guys left one way or another when SGI came in, or stayed with SGI.) If you have a lot of customers you can spread the costs around, but in the case of a company like cray a support infrastructure means having people sitting around most of the time in every region you sell a machine. Maybe two to four guys per system (24x7, right?) plus some sorta warehouse facility if you enter a new geographical market. That's expensive. You can bill a lot of that cost back to the customers, but that just makes your systems less competitive.
I think the long term answer is that cray will be a very small niche player, selling to a very select group of (U.S.) government agencies, with the occasional pro forma business customer thrown in so the company can issue press releases. Even most government facilities aren't in a position to buy a cray anymore. (Research money is fairly tight, recoding costs are prohibitive, MTBF's are more of an issue than they used to be, etc.)
You can't haul the A-Team around in a Pinto.
Does it hurt to hear them lying? Was this the only world you had?
My next couch should be a Cray..
Especially because it's so much easier to hide a computer than an airplane. No sightings in area 51....
We have to assume that the state of the art is way past the public data. Cray has a "lousy" $150 MM in yearly revenue. They could be spending 10X that on heavy computing for national security. The government is spending $25BB on intelligence and another $400 BB on defense every year. Cray could be a drop in the bucket, even a red herring. I'd love to know what is going on in the basements at Fort Meade.
"All that is required for evil to triumph is for good men to do nothing." - Edmund Burke
In the 1970's and 1980's, Cray and other supercomputer companies fit in the niche of "fastest computing at any cost". The design cycles were long for the specialized hardware that pushed the boundaries of the available technology. Companies and government agencies were willing to pay the high price since there was enough processing speed difference between the supercomputers and the "vanilla" computers.
By the early 1990's, the "attack of the killer microprocessors" came. The PC class processors were still weak, but the higher dollar RISC processors used in workstations, like Sun, were reaching performance levels close to what the supercomputers were able to deliver. Since they were based on higher volume and more standardized processors, the price/performance of the RISC workstations started eating into the mainframe and supercomputer market. Many of the supercomputer companies died off, and some started to incorporate RISC processors into their designs. By the mid 1990's I believe that Tera and Cray were the last remaining old-school supercomputer companies left. The rest either died or were absorbed into other companies.
Today, the investment required to produce the fastest processor chips is so high that it requires large unit volumes to pay for the cost of development and production. The PC class processors, with their high volumes, are putting pressure on the old style workstation market, where each company makes their own processor (SPARC/Sun, PA-RISC/HP, Alpha/DEC). We see Sun struggling as the PC's eat their market. Even some large scale supercomputers are based on the PC processors. The majority of the computer spectrum from low to high end is based on the same families of processors (Intel, AMD, PowerPC).
So that brings us to Cray/Tera. Cray seems to go against the economies of scale that drive the rest of the computing industry. What keeps them running is a small niche that the government is willing to keep funded. It is similar to the funding of exotic bombers and fighter jets. We probably won't see Cray grow much larger than they currently are. They will be kept running since they form a critical part of national security, at least that is what the government believes.
Supercomputing per se died because Intel, DEC, IBM/Motorola had a lot more money to throw at speeding things up than the supercomputing community.
In the 70's up until the early 90's it was possible to build a custom CPU out of discrete logic that ran significantly faster than the available microprocessors. Cray was able to push their clock cycle down into the nanosecond range through clever design. However, a 1ns clock rate == 1GHz. You can go buy that multi-million dollar CPU for a couple of hundred bucks in today's market.
In order for supercomputing to be viable you have to be able to provide quantum leap performance above the commodity hardware AND keep your cost/performance ratio in line as well.
The CRAY-1 came out with a clock speed of about 80 MHz and vector processing and high memory bandwidth at a time when mainstream systems like the PDP 11/70 were running at about 7MHz with a 1MB/s memory bus. Microprocessors weren't even a joke compared with the Cray.
The new Japanese NEC supercomputer came with a price tag of about $160 million if I remember correctly (some estimates say that it took $1G in research funding) and hits 35 TFlops (sustained). #3 on the Top 500 supercomputers list is a Beowulf cluster with 2304 processors coming in at 7.6 TFlops (sustained). Even figuring $2000/processor + interconnect, that puts the Beowulf cluster at around $5 million or 1/32 of the cost for 1/5th of the performance (roughly speaking).
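The back-of-the-envelope comparison above can be reproduced in a few lines. All inputs are the figures as recalled in the post (they are not checked against any price list), and the $5 million cluster total is the post's own rounding of processors plus interconnect.

```python
# Figures as quoted in the post above; treat them as the poster's recollection.
nec_cost_usd = 160e6       # NEC machine price tag
nec_tflops = 35.0          # sustained
cluster_procs = 2304
cluster_tflops = 7.6       # sustained
usd_per_proc = 2000.0      # per-processor guess, before interconnect

# Processors alone; the post rounds the total up to about $5 million.
cluster_proc_cost = cluster_procs * usd_per_proc
cluster_cost_usd = 5e6

cost_ratio = nec_cost_usd / cluster_cost_usd   # how many clusters one NEC buys
perf_ratio = cluster_tflops / nec_tflops       # cluster speed as a fraction of NEC

nec_usd_per_tflop = nec_cost_usd / nec_tflops
cluster_usd_per_tflop = cluster_cost_usd / cluster_tflops

print(f"processor cost alone : ${cluster_proc_cost:,.0f}")
print(f"cost ratio           : 1/{cost_ratio:.0f}")
print(f"performance ratio    : {perf_ratio:.2f}")
print(f"NEC $/TFlops         : ${nec_usd_per_tflop:,.0f}")
print(f"cluster $/TFlops     : ${cluster_usd_per_tflop:,.0f}")
```

The last two lines are the number that matters for the argument: dollars per sustained TFlops, which on these figures favours the cluster by a wide margin.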
There are other factors, of course, but the key is that for the supercomputer to stay ahead of the microprocessor a boatload of funding is needed for the supercomputer and the payoff just isn't really there. If it were, a lot more supercomputer companies would still be in business.
"Well, a well engineered supercomputer has much less overhead than a cluster. One superfast processor doesn't have to deal with interprocessor communications like a cluster does."
I like the way Cray put it:
"If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"
- Seymour Cray (1925-1996), father of supercomputing
And how about a few more Cray quotes?
"#3 pencils and quadrille pads."
- Seymour Cray (1925-1996) when asked what CAD tools he used to design the Cray I supercomputer; he also recommended using the back side of the pages so that the lines were not so dominant.
"I just bought a Mac to help me design the next Cray."
- Seymour Cray (1925-1996) when informed that Apple Inc. had recently bought a Cray supercomputer to help them design the next Mac.
I wonder what he's using now? a Palmpilot?
The E10000 is a Celerity product. Celerity was an independent Unix box maker back in the 80's with their own processor architecture. Celerity went bust trying to bring a "minisupercomputer" version of the architecture to market in about 1987 (33 MHz, whoo hoo!). The assets and technology of Celerity along with the design team in San Diego were acquired by Floating Point Systems (FPS). FPS brought the system to market and made the transition to a SPARC based architecture (66 MHz) before going bust. The assets and technology of FPS along with the design team in San Diego and now the manufacturing team in Beaverton were acquired by Cray. Cray did a couple of turns of the crank on the FPS product and sold it as a "business supercomputer". When Cray was acquired by SGI, SGI wanted no part of the SPARC business and sold (yes, again) the San Diego design team (and I think the Beaverton group) to Sun who finally brought a SUCCESSFUL product to market with the E10000.
But it's still the same core team down in San Diego, so I like to think of the E10000 as being a Celerity product.
The Sandia National Labs supercomputer (code name: Red Storm), currently being built by Cray, is going to be powered by 10,000 Opteron processors. A 40 Teraflop theoretical peak will put it at the top of the supercomputer list, being approximately 4 Teraflops faster than the NEC Earth Simulator, the current champ.
Palaces, barricades, threats, meet promises
Sounds like Cray marketing articles. For example, Daniel Katz at JPL wrote in 1997:
which is > 35% of peak. Or consider this from the University of Liverpool: For sustained/peak of about 60%.
I have no doubt that one could find problems where a Beowulf cluster has 10% efficiency, but there are many real problems that are good to go on a cluster. And even if you only got 10% it would be worth it if the cluster cost 5% of what a vector computer costs. Not to mention that performance/$ on commodity hardware increases by a factor of 2 every 12-24 months. It takes years to develop a supercomputer, and they are stuck at their level of technology for several years since they are so expensive to redesign.
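The 10%-efficiency-at-5%-cost argument works out as follows. This is a sketch that assumes both machines have the same peak rating and gives the vector machine a best-case 100% efficiency; those assumptions and the helper name are mine, not from the quoted sources.

```python
def sustained_per_dollar(peak, efficiency, cost):
    """Sustained performance bought per unit of cost (arbitrary units)."""
    return peak * efficiency / cost


# Normalise the vector machine to peak 1.0 and cost 1.0, and give it a
# perfect 100% efficiency (a deliberately generous assumption).
vector = sustained_per_dollar(peak=1.0, efficiency=1.0, cost=1.0)

# Cluster with the same peak, 10% efficiency, 5% of the cost.
cluster = sustained_per_dollar(peak=1.0, efficiency=0.10, cost=0.05)

print(f"vector  : {vector:.2f} sustained units per cost unit")
print(f"cluster : {cluster:.2f} sustained units per cost unit")
```

Even against a perfectly efficient vector machine, the cluster comes out around twice as good per dollar on these numbers. The comparison says nothing about problems that simply won't fit a cluster's latency or memory limits.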
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
I remember a story from an NSA contract worker.
In the early days of Cray, he and many others were wondering how they could keep things running, considering that their official budgets only showed ten or so sales per year.
Until he got the tour of the NSA computer plant, where they had a hall the size of two football fields, filled with Crays.
How small a thought it takes to fill a whole life