On the Supercomputer Technology Crisis
scoobrs writes "Experts claim America has been eating our 'supercomputer feed corn' by developing clusters rather than new supercomputer processors and interconnects. Forbes says America is playing catch-up and that the new federal budget items are too little too late. Cray is laying people off due to decreased federal spending and claims lower margin products have forced them to create products based on commodity parts. Red Storm, one of their new Linux-based products, is being delayed to next year."
When you can build a top-5 supercomputer for under 6 million dollars using off-the-shelf parts, why spend hundreds of millions of dollars?
This is an expected and predicted fallout from the recent rise in popularity of Beowulf clusters. Slowly but surely, managers are realizing that yes, it is possible to have a supercomputer on mass-market hardware, running a free OS.
Don't see this as bad news... it's a sign that we're winning.
+ Donald Gunth
+ Email: dgunth@quicktek.net
"Caffeine is the greatest lubricant ever created." -ESR
What most people don't seem to understand is that you don't need a supercomputer when a mesh of nodes on a network will do just as well. Just like most people don't understand that a 386 running Linux and Word Perfect 5.1 is just as good a word processor as a 2.5GHz Itanium running Windows and Word. Computer power has *useful* limits as well as technological limits.
SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.
Of course people are going to cry that companies like Cray are falling by the wayside, but the truth is that their services simply aren't as needed as they were in years past.
If you have to ask, you'll never know.
I think that should have been "Seed Corn."
Free market success might lead to us actually having to pay for our own supercomputer research that we use in profit-making ventures.
Random Array of Inexpensive Servers.
If the 'supercomputers' of today are increasing in performance, does the design really matter?
Maybe that is a signal that monolithic computer tasks are best handled in a hive mentality - have the Queen issue the big orders, have the warriors performing security, have the workers transporting the goodies (data), and have the requisite extra daughters and suitors to grow the hive and assure its viability (redundancy).
The fact that it is cost-effective is even better.
It's seed corn. Seed, as in, what you don't eat, but save to plant next year.
Kids these days.
With reasonable men I will reason; with humane men I will plead; but to tyrants I will give no quarter. -- William Lloyd
Clusters are not good for very chattery parallel processes; a shared-memory supercomputer can still do much better for computational fluid dynamics.
It's the fact that clusters require higher skill to program efficiently for than do single-processor systems. Plus you have all of the wasted processing power used for communication between the nodes. Granted, many problems lend themselves well to distributed computing (essentially what a cluster is, but the nodes are closer and communicate faster), but there are also problems that are handled better by a smaller amount of specialized hardware. The other point is that by using off-the-shelf parts, we are not really innovating in this space like we should be. We are allowing the commodity computer market to determine the direction of the supercomputer market.
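To put rough numbers on the point about communication eating into a cluster's processing power, here is a minimal sketch of a toy timing model. It is not taken from any benchmark: the function names and the default values for the serial share, parallel share, and per-node communication cost are all assumptions chosen only for illustration.

```python
def run_time(nodes, serial=0.05, parallel=0.95, comm_per_node=0.002):
    """Toy model of normalized wall-clock time on `nodes` nodes.

    serial        -- share of the work that cannot be split (assumed value)
    parallel      -- share that divides evenly across nodes (assumed value)
    comm_per_node -- time added for each extra node the job has to talk to;
                     a stand-in for interconnect cost (assumed value)
    """
    return serial + parallel / nodes + comm_per_node * (nodes - 1)


def speedup(nodes, **kwargs):
    """Speedup relative to running the same job on a single node."""
    return run_time(1, **kwargs) / run_time(nodes, **kwargs)


if __name__ == "__main__":
    for n in (1, 4, 16, 64, 256):
        print(f"{n:4d} nodes: speedup {speedup(n):5.2f}")
```

With these made-up parameters the speedup rises at first and then falls as nodes are added, because the communication term keeps growing while the parallel term shrinks. Setting comm_per_node to 0 removes that effect, which is roughly the case for problems that need little talk between nodes.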
Cray has been engaging in scare tactics about "America being dominated by overseas competitors" for a while. Because they're terrified of losing the lucrative contracts from government and big business, they'll pull out all the stops. They've come up in the IT press recently a couple of times.
Screw 'em. If there's a need, the market will provide. If it turns out that the important tasks can be parallelized and run on much less expensive clusters, then all that means is that we have a more efficient solution to the problem.
May we never see th
If you really want a vector-processor supercomputer you can program in Fortran, get yourself a G5 and gcc. The PPC64 supports SIMD vector processing. For that matter, any problem which benefits from vector processing is trivial to parallelize with threads.
Technology first developed on the high end slowly works its way down into the low end. What happens when the high end is no longer there?
Not that many people really need a race car, but advances in fuels, materials, and engineering in race cars eventually lead to better passenger cars. And for raw performance, strapping together a bunch of Festivas will not get you the same as an Indy racer.
There seems to be some historical revisionism going on regarding the demise of the "supercomputer industry". People are coming out of the woodwork now saying that lack of government support caused the great supercomputer die off.
As Eugene Brooks predicted in his paper Attack of the Killer Micros, the supercomputer die-off was caused by the increasing performance of microprocessor-based systems. Many of us now own what used to be called supercomputers (e.g., 3GHz Pentium processors, capable of hundreds of megaFLOPs).
The problem with supercomputers is that high performance codes must be specially designed for the supercomputer. This is very expensive. As people were able to fill their needs with high performance microprocessors they quit buying supercomputers.
Many people who need supercomputer levels of performance for specialized applications (e.g., rendering Finding Nemo or The Lord of the Rings) are able to use walls of processors or clusters.
There are, of course, groups where putting together off-the-shelf supercomputers will not suffice. But these groups are few and far between. As far as I can tell they consist of the government and a few corporations doing complex simulations. The problem is that this is not much of a market. Even if the government funds computer and interconnect architectural research, there does not seem to be a market to sustain the fruits of this research.
In the heyday of supercomputers there were those who argued that when cheap supercomputers were available the market would develop. The problem is, again, programming. High performance supercomputer codes tend to be specialized for the architecture. Also, no supercomputer architecture is equally efficient for all applications. It is difficult to build a supercomputer that is good at doing both fluid flow calculations for Boeing and VLSI netlist simulation for Intel (the first application tends to be SIMD, the second, MIMD). The end result of these problems tends to suppress any emerging supercomputer market.
The reality right now seems to be that those who are doing massive computation must build specialized systems and throw a lot of talent into developing specialized codes.
If there truly is a demand for that kind of processor, then somebody will likely meet that demand. Right now, it seems that actual demand is so low that they have to drum up this legislation as a sort of welfare for vector processor manufacturers.
It's a simple cost tradeoff. If you can save millions in purchasing computers, it means more money to pay for people to run those computers and do the real work.
This sig has been temporarily disconnected or is no longer in service
So, what tasks still require a high-speed shared data memory? Answer that, and you'll understand where you can still sell a supercomputer.
Bruce
Bruce Perens.
For things like weather forecasting, maybe big vector machines still have an edge, but I suspect that's changing as the weather guys get more experience in using machines with large numbers of micros. This seems to have already occurred, in fact; NCAR appears to have mostly IBM RS6000 and SGI computers these days, with nary a Cray in sight.
The most common term I used to hear in the early 90's was Killer Micros; I think the term dates back to David Bailey in the 80's sometime. If you want more evidence that the death of the supercomputer has been going on for a long time, check out The Dead Supercomputer Society, which lists dozens of failed companies and projects over the years; this page was apparently last updated 6 years ago!
Have you read my blog lately?
Granted, it is more difficult to program something (from the ground up) that runs distributed, than it is to program something that runs on a giant 2048-way box.
Just like it's more difficult to write multithreaded code than it is to write single-threaded code.
That's where software and platforms come in. There is a TON of research being done that uses technologies like Infiniband and Myrinet as interconnects and can make a cluster "look" like a big monolithic machine. If you as an end user write code that goes down into the TCP stack itself, you're working too hard, and you're going about it the wrong way.
Put it this way: In 5 years the odds are overwhelming that there will be a good software platform that can let you pick 5000 servers and run your app 10,000 threaded, with everything appearing just like a single process, and running "as it would on a Cray." It's easier to solve this stuff with software -- take your problem (distributed computing) and solve the problem with a different set of technologies (high performance/low latency interconnects, shared address space/DMA across machines, etc).
Apple's Xgrid is a step in this direction. It's missing a ton of "Supercomputer" functionality right now, but it's a nice cross-machine GUI scheduler. Right now this type of app can address maybe 20% of what supercomputer apps need... in the future maybe more like 98%.
And in other news today, buggy whip manufacturers demand increased government subsidies.
Jesus was all right but his disciples were thick and ordinary. -John Lennon
Forbes has been complaining that federal support of advanced computing is too little? If the government over-stimulates an industry that has too small a market, it will just delay the failure.
Of course the government should continue its current policy of funding a few leading-edge machines that are too costly to sell into the general market but will test new technology. The government itself is a customer, with energy testing, weather modeling, medicine development, etc.
I've been in this field over 25 years, and in a public position at a major lab for 8 now.
If this was a simple issue, the HPC community would have completely moved to clusters 3 or 4 years ago and never looked back. But it's not, kiddies.
Want to run a physics projection for more than 1 microsecond? Takes real horsepower that clusters cannot provide even distributed. Just too much damn data. Chem codes that include REAL data for usable time slices? Too slow for clustered memory. Every auto maker in the world (almost) has been whining about the lack of BIG horsepower for a few years now (crash codes and FEA). I could go on forever. Sure, some problems work awesome on clusters, which is why we have them. But definitely not all of them.
The problem is partly diminishing returns, partly the pathetic amount of usable memory on a cluster and its joke of a memory throughput, partly the growth in power of the low end and clustered networking, and partly the ridiculously long development cycles involved in High Performance Computing and the low $ returns.
One of the biggest things Congress sees is that this country will more than likely NEVER again lead the world in computing power for defense and research.
And that's something we ought to do as the last real Superpower.
The national labs TRIED clusters, and they don't get done all the jobs the labs wanted. (See testimony before Congress, writings in HPC journals, and the last couple of RFPs from US gov. labs; heck, every auto maker in the world.) People in HPC _know_ it now, but having let what little there was of the supercomputer industry die out, there isn't much of an industry left to turn to now. It just may be too darned late. HPC hasn't been a money-making industry since the early 80s.
Heck, even Intel abandoned their clustered machine they custom built for the government.
Most folks in HPC will readily admit the Top500 is kind of a joke. The HPC-challenge #s are a little more realistic for the tests, but we really do need something that approximates real-world applications, not just a 70s CPU benchmark.
For those that think this is a 'Linux wins' issue, consider that mostly it was fast interconnect networks that allowed clustering, not the OS. Examine the history of clusters and you'll see this is true. Btw, the last few SC companies are already mostly moving to Linux anyway. (NEC, Fujitsu, Cray; IBM dabbles in HPC.)
Hopefully the industry will survive long enough to allow for even better mergers of supercomputing power with low-end cost, but at this point I doubt it. Cray has been on the ropes since 96, Fujitsu's SC division is a loss leader, and NEC has been trying to get out of it for a while in favor of something with a margin.
Ed -gov labs HPC research punk
-former Cray-on
-former CDC type
I've seen a lot of naive comments suggesting that supercomputers are being replaced by clusters. The truth is, anyone who can replace their supercomputer with a cluster didn't need a supercomputer in the first place:
- (compared to a supercomputer):
- The prime advantage of an x86-based server is that it is cheap, and it has a fast processor. It is only fast for applications in which the whole dataset resides in memory - and even then, it is still the slowest of the group.
- Clusters are a little better, but suffer from severe scalability problems when driving IO-bound processes. As with the x86 server, if you can't put the full dataset into memory, you might as well forget using a cluster. The node-to-node throughput is several orders of magnitude slower than the processor bus in multiple-CPU systems. (6.4GB/s vs 17MB/s for regular Ethernet, or 170MB/s for Gigabit)
- Multiple-CPU servers do better, but still lack the massive storage capacity of the mainframe. They work better than clusters for parallel algorithms requiring frequent synchronization, but still suffer from a lack of overall data storage capacity and throughput.
- Mainframes, OTOH, possess relatively modest processors, but the combined effect of having several of them, and the massive IO capability makes them very good for data processing. However, their processors aren't fast at anything, and often run at 1/2 or 1/3 the speed of their desktop counterparts.
- Supercomputers combine the IO throughput of a mainframe with the fast processors typically associated with RISC architectures (if you can still consider anything RISC or CISC nowadays). They have faster processors, more memory, and much greater IO throughput than any other category.
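As a rough illustration of the throughput gap in the list above, the sketch below works out how long it would take to move a dataset at each of the rates quoted there. The bandwidth constants are simply the figures from the comment, read in decimal units; the 1 GB dataset size and the function name are assumptions for the example, and latency and protocol overhead are ignored.

```python
# Bandwidth figures as quoted in the list above, in bytes per second.
BUS_BYTES_PER_S = 6.4e9        # processor bus in a multiple-CPU system
ETHERNET_BYTES_PER_S = 17e6    # "regular" Ethernet
GIGABIT_BYTES_PER_S = 170e6    # Gigabit Ethernet


def transfer_seconds(num_bytes, bytes_per_second):
    """Time to move num_bytes at a sustained rate, ignoring latency."""
    return num_bytes / bytes_per_second


if __name__ == "__main__":
    dataset = 1e9  # hypothetical 1 GB working set
    links = [
        ("processor bus", BUS_BYTES_PER_S),
        ("regular Ethernet", ETHERNET_BYTES_PER_S),
        ("Gigabit Ethernet", GIGABIT_BYTES_PER_S),
    ]
    for name, rate in links:
        seconds = transfer_seconds(dataset, rate)
        slowdown = BUS_BYTES_PER_S / rate
        print(f"{name:17s} {seconds:8.3f} s  ({slowdown:6.1f}x slower than the bus)")
```

On these quoted figures the bus is a few hundred times faster than regular Ethernet and a few tens of times faster than Gigabit, so a job that has to ship its working set between nodes pays that factor on every exchange.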
It used to be that the prime demand for faster computers came from the scientific and business communities. But now that the internet has turned computers into glorified televisions, the challenges have gone from that of crunching numbers to serving content. As our economy has shifted away from a technological base to an entertainment one, the need for supercomputers has begun to evaporate. We outsource innovation overseas so that we can lounge around on the couch watching TV and drinking beer (or surfing the net and drinking beer). The primary purpose of technological innovation has shifted from that of discovering the universe to merely bringing us better entertainment.
The society for a thought-free internet welcomes you.
Uh, Cray have a backlog of orders. A backlog to the tune of $153 million, if I recall correctly.
That's not the sign of a dying business model. If they are having problems, it's down to the management, not lack of demand.
There are problems that don't work well on clusters, but rocket on a proper supercomputer. These include a lot of interesting areas, so there will always be demand for a few pieces of big iron. At the risk of echoing the ghost of IBM CEOs past, I think there is demand for somewhere around 20-30 serious top-end supercomputers in the world [0]. Most of the rest of the jobs will do just fine on high-end clusters.
If you read the article, there are no quotes from Cray people. The quotes are from the people who used to get to play with special hardware and who now admin those clusters.
It's toys for the boys, not a buggy whip issue.
[0] That's informed by being someone who uses high performance computing, both cluster and supercomputer.
What does it matter if we don't develop single-unit supercomputers? Clearly, in a free market, if these things had value they would be pursued. There are no predatory tax laws on supercomputers, or any other regulations on domestic use. The only reason development has slowed is that there is not much of a market for the beasts.
There are many reasons for that too. For one, other than in stellar, nuclear, mathematical, and bio research fields, few industries need more computing power than can be had off the shelf any day of the week. That was not true yesterday: it took all sorts of custom hardware to make CGI happen in films, which can now be done in my basement in reasonable time frames. So there is no more supercomputer market there; the ROI is gone. I am sure this plays out in all sorts of other engineering fields as well.
Many jobs that do need supercomputing power can be done with clustered systems that are cheap to build and cheap to maintain.
At least people in the pure science and research fields have learned to be better thinkers and programmers; they found ways to do things in parallel that were traditionally serial. Things that are still serial can be made to work on a cluster. Sure, it might take longer than on a single computer considered equal FLOPS-wise, but I could spend the money I saved either on making my cluster bigger and more powerful, so I get back to equal time, or on other profitable efforts while I wait. Either way, there is again no ROI.
It so happens that many of the most interesting questions in math, physics and computer science, such as quantum theory, need massive amounts of parallel work rather than serial, so that works better on a cluster anyway.
If there is a real reason to do it, people will build supercomputers, because there is nothing stopping them other than economics. No need to fear: supercomputers are not going away. Everyone else who needs that kind of processing power will settle for clusters, as well they should. This is just another largely obsolete industry wanting someone to bail it out because it has failed to adapt to a changing market. If they are going to die we should let them, just as we should let the universities adapt or die, and the RIAA adapt or die. We need to stop propping up obsolete industries so new ones can replace them!
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
There never really was a supercomputer market. There was a cold war, that subsidized the supercomputer market.
Then there is the cost. Companies stopped making SCs because they were too expensive. If the guy from Ford wants to pay 1 billion for a supercomputer, I am sure someone will build him one. The cost to build a fab is over 4 billion. Why do you think HP teamed with Intel? Why do you think there are so few processor families? You have to make a living in the commodity market, where you can sell things in the millions, because supercomputers even in their heyday were sold in the hundreds.
Then there is the problem that many problems are solvable on clusters. So those specialized problems cannot depend on other parts of the HPC market to help subsidize their corner of the market, i.e., clusters make the really hard problems more expensive.
It is a question of how much you want to pay to solve your problem. Simple economics, actually. If the numbers don't work, the problem doesn't get solved. If the Gov. wants to solve some problems (and during the cold war they did), then they can step in and subsidize the market.
And don't cry about Japan and the Top500. When the Top500 has a price column, then it will start to be meaningful.
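To show what a price column would change, here is a small sketch that ranks machines by performance per dollar instead of raw performance. The two entries are made-up placeholders, not real Top500 systems. The prices loosely echo the "under 6 million" versus "hundreds of millions" contrast from earlier in the thread, and the GFLOPS values are invented purely so the example runs.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    gflops: float      # sustained performance (placeholder value)
    price_usd: float   # purchase price (placeholder value)

    @property
    def gflops_per_million(self):
        """Performance bought per million dollars spent."""
        return self.gflops / (self.price_usd / 1e6)


def rank_by_speed(machines):
    """Order the way a raw performance list would."""
    return sorted(machines, key=lambda m: m.gflops, reverse=True)


def rank_by_value(machines):
    """Order the way a list with a price column could."""
    return sorted(machines, key=lambda m: m.gflops_per_million, reverse=True)


# Hypothetical entries for illustration only.
MACHINES = [
    Machine("custom vector system (hypothetical)", 40000.0, 300e6),
    Machine("commodity cluster (hypothetical)", 10000.0, 6e6),
]

if __name__ == "__main__":
    for m in rank_by_value(MACHINES):
        print(f"{m.name:38s} {m.gflops_per_million:8.1f} GFLOPS per $1M")
```

With these placeholder numbers the custom system tops the raw list while the cluster tops the value list, which is the kind of reordering the comment is getting at.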
HPC for Primates. Read Cluster Monkey