Time For A Cray Comeback?
Boone^ writes "The New York Times has an article (free reg. req.) talking about Cray Inc.'s recent resurgence in the realm of supercomputing. It discusses a bit of Cray's decline when the Cold War ended, "the occupation" under SGI, and the rebirth of the company after the Tera (now Cray Inc.) purchase. Recently Cray Inc. has been shipping their vector-based Cray X1 machine, designing ASCI Red Storm, and recently was one of 3 (also Sun, IBM) to win a large DARPA contract (PDF link) to design and develop a PetaFlops machine by 2010. Could Cray Inc. be poised for a comeback? Wall Street seems to think so."
Posting as Anonymous Coward, please award my Karma to starving children in the world.
Naturally. We have another Bush in the White House, and I even hear that Wang Chung is making a comeback -- so why not Cray?
Roving Web-Teleoperated Robot
SCO vs. Cray
Many scientists are very concerned about the state of supercomputing in the US. Hopefully a new generation of supercomputers will improve this situation.
Of course I expect that...in my Playstation IV,
equipped with an opto-quantic Emotion Engine VI
and a couple petabytes of holographic storage.
-><- no
There are still MANY applications for supercomputers. A lot of people think that Linux/Beowulf clusters are going to replace supercomputers of the Cray/NEC/IBM variety. Not true. There are still many research, scientific, and military applications that need machines built not for "slow" distributed number crunching but for ultra-high-speed processor and memory architectures.
So definitely, time for Cray to come back and retake the supercomputer industry crown.
memory bandwidth
It's SUPER! Off-the-shelf components are just kind of "Meh."
There's a whole bunch of PETAFlops outside of McDonalds right now having a sit in and screaming about how fur is murder.
I had to literally step on their faces to get a Big Mac.
I don't need no instructions to know how to rock!!!!
If you look at the list of the top 100 supercomputers, there are systems that are almost 15 years old or even older (not sure on a few). I know these take years to build and are multibillion dollar projects, but the time in between has got to be a killer.
Then there's the question of ... what do you need a supercomputer for? The applications that need a petaflop computer are pretty limited, unless you're doing mass storage, cryptography (cracking), or simulations.
Don't get me wrong, I'm all for nuclear testing being done in 1's and 0's instead of in the ocean or in the desert, but how big a bomb do you really need when it's estimated there are enough nukes to blast the entire land surface of the earth 3 times over?
Ignore the "p2p is theft" trolls, they're just uninformed
Bandwidth.
Well, a well engineered supercomputer has much less overhead than a cluster. One superfast processor doesn't have to deal with interprocessor communications like a cluster does.
And if your supercomputer has multiple processors, they are generally made to cooperate nicely to speed efficiency. Whereas a cluster has to go through ethernet and hardware layers to communicate between nodes. Granted that is fast, but on-board communication is faster.
It seems strange, but a multiple processor computer can actually perform a task slower than just one processor working on the problem if the program and os aren't designed well. So a lot of the value of a supercomputer comes in its design, and the reputation of the manufacturer. And Cray is pretty reliable in my book.
But the REAL key to the potential comeback of the Cray computer will be whether or not it still has cool bubbles! Wow!!! Cray computing... the inventor of case mods.
Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
Cray died. Anything else is just bartering on his name.
can someone explain to me what the benefit of a moving van is compared to buying a fleet of pintos?
Memory to processor feeding: standard off-the-shelf (OTS) processors are often idle because the memory subsystem cannot feed the processor fast enough. This is bad now. It will be getting a lot worse.
Interconnections between processors: this goes beyond merely processors on a board, and extends between boxes. The bus architectures out there for standard OTS hardware get saturated very quickly. This gets worse between boxes. In addition, the latency on Myrinet and Quadrics (compared to what Cray et al. do) is horrible, even if it is excellent compared to Ethernet.
Problem set vs. architecture: Not all problems map well to clusters, or even SMP boxen. Some map best to vector machines. Some map best to tightly integrated MPPs. Some map to moderately tight clusters. Some are just plain 'embarrassingly parallel'. Others are highly threaded and don't work well on vector or scalar machines. Etc., etc. The architecture ought to match the problem set.
MTBF: Mean time between failures. Commodity hardware goes kaputt much more often. A commodity cluster capable of matching the teraflop performance of custom hardware tends to need constant and evil levels of care and feeding: i.e., you'd better have a grad student on roller blades.
Those are just off the top of my head. I am sure that others will tell you more before I can post again. ;)
Summarized: bandwidth, latency, problem set, and failure rate.
HTH.
Do you know why the road less traveled by is littered with the bones of the unwary?
Didn't Sun basically buy out or hire away a bunch of Cray, Inc.? I always heard the E10000 was actually a Cray product. Oh, and just to brag, I have a blue jacket with a picture of a Y-MP-90 on the back with the words, "CRAY - WORLD'S FASTEST SUPERCOMPUTERS". Too cool for words. Ebay rules.
Shutting down free speech with violence isn't fighting fascism. It IS fascism!
SCO vs Nike
Look at me, I'm a stock analyst!
The home page at Cray for the Cascade project.
There are some interesting PDFs there. Chew, mull, and consider.
Also consider what Horst Simon, head of NERSC said here too.
Do you know why the road less traveled by is littered with the bones of the unwary?
Maybe now that there's once again a major player in the computer market with machine casing designs even SILLIER than Apple's, the rest of the Geek Community will give us a little slack..
Second, (yes, I work for Cray so now I'm going to put in a sales pitch
Finally, there's memory. Lots of it. A single system image supercomputer can have terabytes of memory in one kernel image. You're simply not going to get that in a single PC cabinet.
Finally, in case anyone doubts that vectors, big memory, and large bandwidth can make a good system, the fastest machine in the world right now is the Japanese "Earth Simulator" machine which is an NEC SX machine. That is somewhat similar in architecture to a Cray in that it has large bandwidth and vectors.
Go Badgers! -- #include "std/disclaimer.h"
Hahahaha. Have you ever actually run a supercomputer? They tend to have much higher failure rates than normal servers. Couple of reasons: first, they push the envelope of a given technology. The sweet spot for stability is not the leading edge. Second, they're not nearly as well tested as mainstream hardware. On a platform with thousands of installations you're much less likely to run into a problem nobody has seen before than you are on a platform with only dozens of installations.
Why do people buy those really expensive supercomputers, when they could just buy an Apple one instead? They're much cheaper!
Other posters have already pointed out the bandwidth issues over and over, so I'll skip that obvious difference.
The fact is that not all problems are suitable to parallel processing. Sometimes you really need to know the outcome of one operation before you can go on to the next.
Beowulf clusters really suck on problems where that applies. Cray style supercomputers shine on them.
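To put rough numbers on the point above, here is a minimal sketch using Amdahl's law. That model is an assumption brought in for illustration, not something the post cites, and the function name and sample serial fractions are made up.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Best-case speedup on `processors` CPUs when `serial_fraction`
    of the work must run one step after another (Amdahl's law)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Hypothetical workloads: fully parallel, 5% serial, half serial.
for serial in (0.0, 0.05, 0.5):
    for cpus in (16, 1024):
        print(f"serial={serial:.2f} cpus={cpus:5d} "
              f"speedup<={amdahl_speedup(serial, cpus):.1f}")
```

When half the work is serial, no number of nodes gets past a 2x speedup under this model, which is where a few very fast processors beat a big cluster.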
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
Don't just think about solving a static problem faster; it's also about solving a problem better through the use of more variables. Take weather simulation. If having too many variables stretches the computation of today's forecast into next week, then it's useless. So you limit the number of variables to come up with a "close enough" forecast in a more timely manner. With a faster computer, you can get a more accurate simulation in a more reasonable time period. This increase in accuracy/complexity is then useful in many fields.
...isn't 'Cray' today about as 'Cray' as the company that now owns 'Atari'? What's left besides the name of the original company?
Does it hurt to hear them lying? Was this the only world you had?
Have you ever actually run a supercomputer?
You know, that's kinda funny, since it's my current job. ;) I'm a NERSC employee. :P
You're right, until the system hits maturity. Our T3E, before being retired, had far fewer hardware problems than our Linux cluster does. Or the SP3 we have, for that matter.
BTW, since it's rather hard to find a job these days for some people in the computing realm, we're hiring.
Do you know why the road less traveled by is littered with the bones of the unwary?
OK this is about as much a kiddy thing as how many VWs fit inside a football stadium or something, but... ...anyone know of a site with info on how current and past supercomputers compare to current desktops? Where are we at now with 2GHz G5s and 3.3GHz P4s, relatively?
One of the comparisons made when I was at university was of a 30-something MHz 386, with a supercomputer from 1973, showing how they do about the same amount of processing/data transfer but in completely different ways. I found that fascinating
Well, almost. Let's say I have a plane that can accommodate 100 people and does NY->London in 6 hours.
My problem is that I have to move 1000 people from NY to London.
Now I can either:
1. Buy a plane that is 20 times faster and 20 times more expensive. That's the supercomputer.
2. Buy 9 other planes (same as mine) and achieve the same result as in 1 for less than half the price (I'll let you do the math). That's the cluster.
3. Buy a plane that has a capacity of 1000 people. That's the parallel supercomputer. But even if that one can do the job for my specific problem, it proves to be not that flexible if my problem changes (i.e., 500 people NY->London and 500 people NY->LA).
That's the power of the Beowulf cluster!!!
Write boring code, not shiny code!
Probably not. Cray made some money back when a supercomputer was something that an ordinary company might need. The capabilities of "normal" computers were much more limited then than today, so there was a much higher percentage of the buying public likely to want something more. These days the vast majority of users are happy with something mainstream.
But, you ask, isn't there a lunatic fringe who wants more power at any price? Well, the lunatic fringe ain't what it used to be. During the heyday of Cray you got a damn fine box and nothing else. Cray didn't want to worry about your software--or even an OS. A person who needed the speed would plunk down the money for the box and then pay a couple of guys to code everything from scratch. Those days are gone--software is the driving factor these days, and people are far less willing to buy something that's going to force a total code rewrite. Especially if that thing is only going to buy them a couple of years of edge before they need to recode for the next best thing.
Then there's the question of whether Cray can afford to be bigger. The answer is "probably not". If you sell to a lot of customers you need a huge support infrastructure. Cray doesn't have much of one anymore, so they'd need to buy one. (Most of the old support guys left one way or another when SGI came in, or stayed with SGI.) If you have a lot of customers you can spread the costs around, but in the case of a company like Cray a support infrastructure means having people sitting around most of the time in every region where you sell a machine. Maybe two to four guys per system (24x7, right?) plus some sort of warehouse facility if you enter a new geographical market. That's expensive. You can bill a lot of that cost back to the customers, but that just makes your systems less competitive.
I think the long term answer is that Cray will be a very small niche player, selling to a very select group of (U.S.) government agencies, with the occasional pro forma business customer thrown in so the company can issue press releases. Even most government facilities aren't in a position to buy a Cray anymore. (Research money is fairly tight, recoding costs are prohibitive, MTBFs are more of an issue than they used to be, etc.)
You can't haul the A-Team around in a Pinto.
Does it hurt to hear them lying? Was this the only world you had?
Oh!.... that Cray!
Never mind!
Our T3E was having problems well past the point where it was getting long in the tooth. Cray started adding functionality to make it more supportable a few years back, but when it was actually a cutting edge system it was pretty unstable. They probably couldn't widely sell a system today that had the problems of the earlier T3Es (one hardware problem and you need to reboot the whole thing), but that just increases the development costs and time to market in a market where delay means that the peasants will be nipping at your heels. Remember, by the time a super hits maturity, it's obsolete.
My next couch should be a Cray..
Especially because it's so much easier to hide a computer than an airplane. No sightings in area 51....
We have to assume that the state of the art is way past the public data. Cray has a "lousy" $150 MM in yearly revenue. The government could be spending 10X that on heavy computing for national security: it is spending $25 BB on intelligence and another $400 BB on defense every year. Cray could be a drop in the bucket, even a red herring. I'd love to know what is going on in the basements at Fort Meade.
"All that is required for evil to triumph is for good men to do nothing." - Edmund Burke
In the 1970's and 1980's, Cray and other supercomputer companies fit in the niche of "fastest computing at any cost". The design cycles were long for the specialized hardware that pushed the boundaries of the available technology. Companies and government agencies were willing to pay the high price since there was enough processing speed difference between the supercomputers and the "vanilla" computers.
By the early 1990's, the "attack of the killer microprocessors" came. The PC class processors were still weak, but the higher dollar RISC processors used in workstations, like Sun, were reaching performance levels close to what the supercomputers were able to deliver. Since they were based on higher volume and more standardized processors, the price/performance of the RISC workstations started eating into the mainframe and supercomputer market. Many of the supercomputer companies died off, and some started to incorporate RISC processors into their designs. By the mid 1990's I believe that Tera and Cray were the last remaining old-school supercomputer companies left. The rest either died or were absorbed into other companies.
Today, the investment required to produce the fastest processor chips is so high that it requires large unit volumes to pay for the cost of development and production. The PC class processors, with their high volumes, are putting pressure on the old style workstation market, where each company makes their own processor (SPARC/Sun, PA-RISC/HP, Alpha/DEC). We see Sun struggling as the PC's eat their market. Even some large scale supercomputers are based on the PC processors. The majority of the computer spectrum from low to high end is based on the same families of processors (Intel, AMD, PowerPC).
So that brings us to Cray/Tera. Cray seems to go against the economies of scale that drive the rest of the computing industry. What keeps them running is a small niche that the government is willing to keep funded. It is similar to the funding of exotic bombers and fighter jets. We probably won't see Cray grow much larger than they currently are. They will be kept running since they form a critical part of national security, or at least that is what the government believes.
Supercomputing per se died because Intel, DEC, IBM/Motorola had a lot more money to throw at speeding things up than the supercomputing community.
From the 70's up until the early 90's it was possible to build a custom CPU out of discrete logic that ran significantly faster than the available microprocessors. Cray was able to push their clock cycle down into the nanosecond range through clever design. However, a 1 ns clock period == 1 GHz. You can go buy that multi-million dollar CPU for a couple of hundred bucks in today's market.
In order for supercomputing to be viable you have to be able to provide quantum leap performance above the commodity hardware AND keep your cost/performance ratio in line as well.
The CRAY-1 came out with a clock speed of about 80 MHz, vector processing, and high memory bandwidth at a time when mainstream systems like the PDP 11/70 were running at about 7 MHz with a 1 MB/s memory bus. Microprocessors weren't even a joke compared with the Cray.
The new Japanese NEC supercomputer came with a price tag of about $160 million if I remember correctly (some estimates say that it took $1G in research funding) and hits 35 TFlops (sustained). #3 on the Top 500 supercomputers list is a Beowulf cluster with 2304 processors coming in at 7.6 TFlops (sustained). Even figuring $2000/processor + interconnect, that puts the Beowulf cluster at around $5 million or 1/32 of the cost for 1/5th of the performance (roughly speaking).
There are other factors, of course, but the key is that for the supercomputer to stay ahead of the microprocessor a boatload of funding is needed for the supercomputer and the payoff just isn't really there. If it were, a lot more supercomputer companies would still be in business.
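The arithmetic in the comparison above can be checked with a short sketch. All figures are the post's own recollections and estimates ($160 million and 35 TFlops for the NEC machine; 2304 processors at $2000 each and 7.6 TFlops for the cluster). The helper name `usd_per_tflop` is made up for illustration.

```python
def usd_per_tflop(cost_usd: float, sustained_tflops: float) -> float:
    """Dollars paid per sustained TFlop."""
    return cost_usd / sustained_tflops


# Figures as quoted in the post (estimates, not verified prices).
NEC_COST_USD = 160e6
NEC_TFLOPS = 35.0
CLUSTER_TFLOPS = 7.6

raw_cluster_cost = 2304 * 2000   # 4,608,000 USD before rounding
cluster_cost_usd = 5e6           # the post rounds this up to about $5 million

cost_ratio = NEC_COST_USD / cluster_cost_usd   # 32, i.e. 1/32 of the cost
perf_ratio = CLUSTER_TFLOPS / NEC_TFLOPS       # about 0.22, roughly 1/5
advantage = (usd_per_tflop(NEC_COST_USD, NEC_TFLOPS)
             / usd_per_tflop(cluster_cost_usd, CLUSTER_TFLOPS))

print(f"raw cluster cost: ${raw_cluster_cost:,}")
print(f"cluster costs 1/{cost_ratio:.0f} as much for {perf_ratio:.2f}x the TFlops")
print(f"cluster is about {advantage:.1f}x cheaper per sustained TFlop")
```

On these numbers the cluster comes out roughly seven times cheaper per sustained TFlop. That says nothing about how well a given problem maps onto it.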
"Well, a well engineered supercomputer has much less overhead than a cluster. One superfast processor doesn't have to deal with interprocessor communications like a cluster does."
I like the way Cray put it:
"If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"
- Seymour Cray (1925-1996), father of supercomputing
And how about a few more Cray quotes?
"#3 pencils and quadrille pads."
- Seymour Cray (1925-1996) when asked what CAD tools he used to design the Cray I supercomputer; he also recommended using the back side of the pages so that the lines were not so dominant.
"I just bought a Mac to help me design the next Cray."
- Seymour Cray (1925-1996) when he was informed that Apple Inc. had recently bought a Cray supercomputer to help them design the next Mac.
I wonder what he's using now? a Palmpilot?
The Sandia National Labs supercomputer (code name: Red Storm), currently being built by Cray, is going to be powered by 10,000 Opteron processors. A 40 Teraflop theoretical peak will put it at the top of the supercomputer list, being approximately 4 Teraflops faster than the NEC Earth Simulator, the current champ.
Palaces, barricades, threats, meet promises
I really want to see Cray come out with more waterfall computers. I thought that was the greatest thing in the world when I saw it on Beyond2000! way back in the day. The contemporary "elegant mac" isn't even in the same aesthetic/functional dimension as that Cray machine.
Ah, glory days.
Sounds like Cray marketing articles. For example, Daniel Katz at JPL wrote in 1997:
which is > 35% of peak.
Or consider this from the University of Liverpool:
For sustained/peak of about 60%.
I have no doubt that one could find problems where a Beowulf cluster has 10% efficiency, but there are many real problems that are good to go on a cluster. And even if you only got 10% it would be worth it if the cluster cost 5% of what a vector computer costs. Not to mention that performance/$ on commodity hardware increases by a factor of 2 every 12-24 months. It takes years to develop a supercomputer, and they are stuck at their level of technology for several years since they are so expensive to redesign.
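A quick sketch of the two claims above: 10% of the performance at 5% of the cost, and performance per dollar doubling every 12-24 months. It reads "10%" as performance relative to the vector machine, which is one interpretation of the post. The 4-year development time and the function names are assumptions for illustration.

```python
def perf_per_dollar_ratio(relative_performance: float, relative_cost: float) -> float:
    """How many times more performance per dollar the cheaper system delivers."""
    if relative_cost <= 0:
        raise ValueError("relative_cost must be positive")
    return relative_performance / relative_cost


def commodity_gain(years: float, doubling_months: float) -> float:
    """Growth in commodity performance per dollar over `years`,
    assuming it doubles every `doubling_months` months."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Cluster delivers 10% of the vector machine's performance at 5% of its cost.
print(perf_per_dollar_ratio(0.10, 0.05))

# Hypothetical 4-year supercomputer development cycle.
for months in (24, 12):
    print(f"doubling every {months} months: {commodity_gain(4, months):.0f}x")
```

Under those assumptions the cluster still delivers twice the performance per dollar, and commodity hardware improves somewhere between 4x and 16x while a four-year design is in development.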
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
They bought part of CRAY, the one that made the CS6400 server, which was a really neat SMP system based on supersparcs.
The rest of the company went to SGI.
So basically the server/sparc division went to SUN and then they got the technology for their Enterprise systems.
The rest of the supercomputer units (the Alpha based and the Vector based ones) went to SGI, which did.... nothing with them. Oh, yeah, they named some interconnects CRAYlink or something, but they had 0 CRAY technology in them; they just wanted the name.
Same with TERA, they wanted the name and a way of ditching their crappy TM technology.
Moving people in planes is not a good analogy because it is perfectly parallel. Each person getting to the destination is not in any way dependent on the other people's journey, so splitting up the work has no overhead.
The Cray design philosophy is for solving problems that can't be split up easily. If all of the parts of the problem depend heavily on one another, you pay a large price for communication when you split it up. That's the situation where the cluster doesn't do as well as the Cray. So each design has its strengths, and it really depends on the problem.
Number of TFLOPS isn't everything. The move back to vector style processors in supercomputing has been largely inspired by the fact that Beowulf clusters work really well for some problems - and very, very poorly for others. If you've got a problem that divides nicely into discrete chunks that don't require a lot of interprocessor communication, then yeah, sure, go with Beowulf. But complex simulation problems have a tendency to leave most of the processors idling while the cluster talks to itself due to network speed issues.
Why?
Every solution has to be chosen corresponding to any specific need. My point was just to show that in most cases the cluster makes sense. Of course some special cases might be better suited by option 1 or 3.
;-)
you couldn't surgically separate them
How do you stuff them in the plane then?
A good constraint for option 1 would be that you need to have them ASAP and the overall transfer could be interrupted at any time (before the 6th hour), and at that time you still want as many people as possible delivered. Let's say 3 hours. Option 1 will have brought half the people there, while option 2 leaves all the planes above Iceland at hour 3 with no one in England.
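The interrupted-transfer scenario can be checked with a small sketch. It assumes every plane leaves NY full at time zero, flies back empty at the same speed, and loses no time boarding. The function name and the use of whole minutes are choices made for this illustration.

```python
def delivered(minutes_elapsed: int, planes: int, seats: int,
              one_way_minutes: int, total_people: int = 1000) -> int:
    """People already landed in London after `minutes_elapsed`.

    Each plane shuttles back and forth, so its n-th load lands at
    (2n - 1) * one_way_minutes. Boarding time is ignored.
    """
    legs_flown = minutes_elapsed // one_way_minutes
    loads_landed = (legs_flown + 1) // 2
    return min(total_people, loads_landed * seats * planes)


SIX_HOURS = 360  # minutes for an ordinary plane, NY->London

options = {
    "1: one plane, 20x faster": dict(planes=1, seats=100, one_way_minutes=SIX_HOURS // 20),
    "2: ten ordinary planes": dict(planes=10, seats=100, one_way_minutes=SIX_HOURS),
    "3: one 1000-seat plane": dict(planes=1, seats=1000, one_way_minutes=SIX_HOURS),
}

for name, plane in options.items():
    print(f"{name}: {delivered(180, **plane)} people at hour 3, "
          f"{delivered(360, **plane)} at hour 6")
```

Under these assumptions option 1 has 500 people on the ground at hour 3 and the other two have none, while all three have moved everyone by hour 6.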
Write boring code, not shiny code!
Yeah, your point? You said nothing about the reliability of one system versus another. There's a lot more that goes into designing a reliable system than spouting off some made-up statistics about CPU failures.
SRC Computers is his legacy, not Cray Computer Corp.
He co-founded this company (with several other
ex-Cray employees) and died while still an employee/owner.
Interestingly, SRC is still around without any evidence on their website
of shipping a product. My guess is that their customers and/or investors
prefer to stay out of the limelight.
I've been using Desktop Cray for a while now. It took me some time to tweak the settings to perfection, but now it's just running along. Check it out!
/Styx
Well, it doesn't have to be. We could say that a company wants to send 250 people to London and want to use the 6 hours flight to have a corporate meeting in the plane... You're kind of screwed with 10 planes containing 100 people...
In this case option 3 makes sense.
You could say that 6 hours is a reasonable limit, but sometimes (not predictably) you need as many people as you can get in England before then (by an amount of time that is not predictable either). In this case, option 1 makes sense because neither option 2 nor option 3 delivers anything before the 6 hour mark.
Write boring code, not shiny code!
John Markoff, the same jerkoff that wrote the less than factual articles and book about Kevin Mitnick, and happens to belong to one of the less reputable media outlets (aka the plagiarized and false stories coming from the NY Times).
Lawyers, MBA's, RIAA? A jedi fears not these things!
However it happens, it is unlikely Cray was wrong about Gallium Arsenide -- he was not stupid. The question is when will a bureaucratic organization be able to throw marching morons at the problem and make it happen -- since that appears to be the only way technology is funded anymore.
It's unfortunate Seymour allowed Cray, Inc. to keep his name after he left to found CCC. Even though Cray himself was capitulating to massively parallel silicon in his final days -- he did die almost immediately thereafter.
PS: It seems creepy he died in a "jeeping accident" -- because that's exactly the way I had portrayed him dying in an April fools joke faxed to all members of congress a few years before -- an "accident" following shortly on the heels of CCC being taken over by Craig Fields of DARPA. I was sending out the joke because of the horrifying way DARPA had spent money on silly favorites within the academic community while guys who were really pushing the envelope like Seymour were going begging for customers -- having acquired private investments.
Seastead this.
I think there is one single reason that the market is poised for a Cray comeback... HEAT!
Commodity PCs managed to push the speed envelope by pushing the heat envelope... That's the main reason AMD took the speed advantage, because they were willing to operate their processors at higher temperatures than Intel would at the time.
Now, I would say it's quite a different story. First off, processors are getting closer and closer to the end of the line for heat increases.. Pretty soon, no known metal will be able to conduct heat away fast enough to allow computers to operate at room-temperatures. Even now, dumb little personal computers need serious cooling solutions... Either that, or they need to be some place that has serious air conditioning.
So, what are companies going to do, even with the current line of processors? Should they invest loads of money in dispersing waste heat, powerful air conditioners, system cooling fans, and software and/or hardware to closely monitor temperatures? OR Should they invest in a higher-end system that doesn't put off so much heat, doesn't use up so much electricity, etc?
In fact, I think we are even nearing the point where home users are going to get seriously pissed off and start demanding lower-power systems... It's interesting that C3 processors have become so popular despite their lousy performance... (Maybe AMD/Intel will learn something from that)
So, I do think that either commodity processors will hit the heat ceiling, and stagnate like the rotational speeds of current IDE hard drives, OR the electrical and major cooling requirements of commodity processors will become too much to justify the small price savings. Either way, that will leave the market wide open for serious computing companies once again. The only question really is how much longer will it be until one of those two things happens? Well, in the Southern California Desert, electricity prices are still very high, and the temperatures are so very high that running a modern computer 24 hours a day requires your home cooling to also be running 24 hours a day, just to operate within the heat tolerances. I don't think it will be much longer before more of the country, and the world, will reach the same point.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant