On the Supercomputer Technology Crisis
scoobrs writes "Experts claim America has been eating our 'supercomputer feed corn' by developing clusters rather than new supercomputer processors and interconnects. Forbes says America is playing catch-up and that the new federal budget items are too little too late. Cray is laying people off due to decreased federal spending and claims lower margin products have forced them to create products based on commodity parts. Red Storm, one of their new Linux-based products, is being delayed to next year."
when you can build a top 5 supercomputer for under 6 million dollars, using off the shelf parts. Why spend the hundreds of millions of dollars?
This is an expected and predicted fallout from the recent rise in popularity of beowulf clusters. Slowly but surely managers are realizing, yes, it is possible to have a supercomputer on mass-market hardware, running a free OS.
Don't see this as bad news... it's a sign that we're winning.
+ Donald Gunth
+ Email: dgunth@quicktek.net
"Caffeine is the greatest lubricant ever created." -ESR
Of course people are going to cry that companies like Cray are falling by the wayside, but the truth is that their services simply aren't as needed as they were in years past.
If you have to ask, you'll never know.
I think that should have been "Seed Corn."
Free market sucess might lead to us actually having to pay for our own supercomputer research that we use in profit making ventures.
Random Array of Inexpensive Servers.
If the 'supercomputers' of today are increasing performance, does it really matter the design?
Maybe that is a signal that monolithic computer tasks are best handled in a hive mentality - have the Queen issue the big orders, have the warriors performing security, have the workers transporting the goodies (data), and have the requisite extra daughters and suitors to grow the hive and assure its viability (redundancy).
The fact that it is cost-effective is even better.
a mesh of nodes on a network will do just as well
In some cases.
Unfortunately, some problems are particularly unsuitable for clusters of commercial computers, and really benefit from specialized architectures such as shared memory or vector processors.
A while ago it was decided by the US government to essentially abandon such specializations, and buy COTS. It is certainly cheaper, but not necessarily effective.
Its the fact that clusters require higher skill to program efficiently for than do single processor systems. Plus you have all of the wasted processing power used for communication between the nodes. Granted, many problems lend themselves well to distributed computing (essentially what a cluster is, but the nodes are closer and communicate faster), but there are also problems that are handled better by a smaller amount of specialized hardware. The other point is that by using off the shelf parts, we are not really innovating in this space like we should be. We are allowing the commodity computer market determine the direction of the supercomputer market.
Technology first developed on the high end slowly works it's way down into the low end. What happens when the high end is no longer there.
Not that many people really need a race care, but advances in fuels, materials, engineering in race cars eventually leads to bette passenger car. And for raw performsnce, strapping together a bunch of Festivas will not get you the same as an Indy racer.
There seems to be some historical revisionism going on regarding the demise of the "supercomputer industry". People are coming out of the woodwork now saying that lack of government support caused the great supercomputer die off.
As Eugene Brooks predicted in his paper Attack of the Killer Micros, the supercomputer dieoff was caused by the increasing performance of microprocessor based systems. Many of us now own what used to be called supercomputers (e.g., 3GHz Pentinum processors, capable of hundreds of megaFLOPs).
The problem with supercomputers is that high performance codes must be specially designed for the supercomputer. This is very expensive. As people were able to fill their needs with high performance microprocessors they quit buying supercomputers.
Many people who need supercomputer levels of performance for specialized applications (e.g., rendering Finding Nemo or The Lord of the Rings) are able to use walls of processors or clusters.
There are, of course, groups where putting together off-the-shelf supercomputers will not suffice. But these groups are few and far between. As far as I can tell they consist of the government and a few corporations doing complex simulations. The problem is that this is not much of a market. Even if the government funds computer and interconnect architectural research, there does not seem to be a market to sustain the fruits of this research.
In the heyday of supercomputers there were those who argued that when cheap supercomptuers were available the market would develop. The problem is, again, programming. High performance supercomputer codes tend to be specialized for the architecture. Also, no supercomputer architecture is equally efficient for all applications. It is difficult to build a supercompter that is good at doing fluid flow calculations for Boeing and VLSI netlist simulation for Intel (the first applications tends to be SIMD, the second, MIMD). The end result of these problems tends to suppress any emerging supercomptuer market.
The reality right now seems to be that those who are doing massive computation must build specialized systems and throw a lot of talent into developing specialized codes.
So, what tasks still require a high-speed shared data memory? Answer that, and you'll understand where you can still sell a supercomputer.
Bruce
Bruce Perens.
For things like weather forecasting, maybe big vector machines still have an edge, but I suspect that's changing as the weather guys get more experience in using machines with large numbers of micros. This seems to have already occurred, in fact; NCAR appears to have mostly IBM RS6000 and SGI computers these days, with nary a Cray in sight.
The most common term I used to hear in the early 90's was Killer Micros; I think the term dates back David Bailey in the 80's sometime. If you want more evidence that the death of the supercomputer has been going on for a long time, check out The Dead Supercomputer Society, which lists dozens of failed companies and projects over the years; this page was apparently last updated 6 years ago!
Have you read my blog lately?
I've been in this field over 25 years, been in public position at a major lab now for 8.
If this was a simple issue, the HPC community would already have completely moved to clusters and never looked back 3 or 4 years ago. But it's not kiddies.
Want to run a physics projection for more than 1 microsecond? Takes real horsepower that clusters cannot provide even distributed. Just too much damn data. Chem codes that include REAL data for useable time slices? too slow for clustered memory. Every auto maker in the world (almost) has been whining about the lack of BIG horsepower for a few years now.(crash codes and FEA) I could go on forever. Sure, some problems work awesome on clusters, which is why we have them. But definately not all of them.
The problem is partly diminishing returns, partly the pathetic ammount of useable memory on a cluster and its joke for memory throughput, partly the growth in power of the low end and clustered networking, partly the ridiculously long development cycles invloved in High Performance Computing and the low $ returns,
One of the biggest things congress sees is that this country will more than likely NEVER again lead the world in computing power for defense and research.
And thats something we ought to do as the last real Superpower.
The national labs TRIED clusters, they don't get all the jobs done they wanted. (see testimony before congress, writings in HPC jounals, and the last couple RFPs from US gov. labs,heck every auto maker in the world) People in HPC _know_ it now, but having let what little there was of the supercomputer industry die out, there isn't mcuh of an industry left to turn to now. It just may be too darned late. HPC hasn't been a money making industry since the early 80s.
Heck, even Intel abandoned their clustered machine they custom built for the government.
Most folks in HPC will readily admit the Top500 is kind of a joke. The HPC-challenge #s are a little more realistic for the tests, but we really do need something that approximately real world applications, not just a 70s cpu benchmark.
For those that think this is a 'Linux wins' issue,
consider that mostly it was fast interconnect networks that allowed clustering, not the OS. Examine the history of clusters and you'll see this is true. Btw, the last few SC companies are already mostly moving to linux anyway.(nec,fujitsu,cray;ibm dabbles in hpc)
Hopefully the industry will survive long enough to allow for even better mergers of supercomputing power with low end cost, but at this point I doubt it. Cray has been on the ropes since 96, fujitsu's sc division is a loss leader, and NEC has been trying to get out of it for a while for something with a margin.
Ed -gov labs HPC research punk
-former Cray-on
-former CDC type
I've seen a lot of naive comments suggesting that supercomputers are being replaced by clusters. The truth is, anyone who can replace their supercomputer with a cluster didn't need a supercomputer in the first place:
- (compared to a supercomputer):
- The prime advantage of an x86-based server is that it is cheap, and it has a fast processor. It is only fast for applications in which the whole dataset resides in memory - and even then, it is still the slowest of the group.
- Clusters are a little better, but suffer from severe scalability problems when driving IO-bound processes. As with the x86 server, if you can't put the full dataset into memory, you might as well forget using a cluster. The node to node throughput is several orders of magnitude slower than the processor bus in multiple CPU systems. (6.4GB/s vs 17MB/s for regular ethernet, or 170MB/s for Gigabit)
- Multiple CPU servers do better, but still lack the massive storage capacity of the mainframe. They work better than clusters for parallel algorithms requiring frequent syncronization, but still suffer from a lack of overall data storage capacity and throughput.
- Mainframes, OTOH, possess relatively modest processors, but the combined effect of having several of them, and the massive IO capability makes them very good for data processing. However, their processors aren't fast at anything, and often run at 1/2 or 1/3 the speed of their desktop counterparts.
- Supercomputers combine the IO throughput of a mainframe with the fast processors typically associated with RISC architectures (if you can still consider anything RISC or CISC nowadays). They have faster processors, more memory, and much greater IO throughput than any other category.
It used to be that the prime reason for faster computers came from the scientific and business communities. But now that the internet has turned computers into glorified televisions, the challenges have gone from that of crunching numbers to serving content:As our economy has shifted away from a technological base to an entertainment one, the need for supercomputers has begun to evaporate. We outsource innovation overseas so that we can lounge around on the couch watching tv and drinking beer (or surfing the net and drinking beer). The primary purpose of technological innovation has shifted from that of discovering the universe to merely bringing us better entertainment.
The society for a thought-free internet welcomes you.
Uh, Cray have a backlog of orders. A backlog to the tune of $153 million, if I recall correctly.
That's not the sign of a dying buisness model. If they are having problems, it's down to the mangement, not lack of demand.
There are problems that don't work well on clusters, but rocket on a proper supercomputer. These include a lot of interesting areas, there will always be demand for a few pieces of big iron. At the risk of echoing the ghost of IBM CEO's past, I think somewhere around 20-30 serious top end supercomputers in the world [0]. Most of the rest of the jobs will do just fine on high end clusters.
If you read the article, there are no quotes from Cray people. What there are quotes from is the people who used to get to play with special hardware, who now admin those clusters.
It's toys for the boys, not a buggy whip issue.
[0] That's informed by being someone who uses high perfromance computing, both cluster and supercomputer.