On the Supercomputer Technology Crisis
scoobrs writes "Experts claim America has been eating our 'supercomputer feed corn' by developing clusters rather than new supercomputer processors and interconnects. Forbes says America is playing catch-up and that the new federal budget items are too little too late. Cray is laying people off due to decreased federal spending and claims lower margin products have forced them to create products based on commodity parts. Red Storm, one of their new Linux-based products, is being delayed to next year."
When you can build a top-5 supercomputer for under 6 million dollars using off-the-shelf parts, why spend hundreds of millions of dollars?
This is an expected and predicted fallout from the recent rise in popularity of Beowulf clusters. Slowly but surely managers are realizing that, yes, it is possible to have a supercomputer on mass-market hardware, running a free OS.
Don't see this as bad news... it's a sign that we're winning.
+ Donald Gunth
+ Email: dgunth@quicktek.net
"Caffeine is the greatest lubricant ever created." -ESR
What most people don't seem to understand is that you don't need a supercomputer when a mesh of nodes on a network will do just as well. Just like most people don't understand that a 386 running Linux and Word Perfect 5.1 is just as good a word processor as a 2.5GHz Itanium running Windows and Word. Computer power has *useful* limits as well as technological limits.
SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.
Of course people are going to cry that companies like Cray are falling by the wayside, but the truth is that their services simply aren't as needed as they were in years past.
If you have to ask, you'll never know.
I think that should have been "Seed Corn."
Free market success might lead to us actually having to pay for our own supercomputer research that we use in profit-making ventures.
Don't see this as bad news... it's a sign that we're winning.
Right. The Cray folks have just realized that they are about to go the way of the buggy whip and the slide rule. They don't like it one bit. They can only complain by making a lot of noise. But it won't work. When you're extinct, there is no coming back.
It's the fact that clusters require higher skill to program efficiently than single-processor systems do. Plus you have all of the processing power wasted on communication between the nodes. Granted, many problems lend themselves well to distributed computing (essentially what a cluster is, but the nodes are closer and communicate faster), but there are also problems that are handled better by a smaller amount of specialized hardware. The other point is that by using off-the-shelf parts, we are not really innovating in this space like we should be. We are allowing the commodity computer market to determine the direction of the supercomputer market.
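To put rough numbers on the communication-overhead point, here is a minimal sketch of a toy cost model. It assumes compute time divides evenly across nodes while communication cost grows linearly with node count; the function names and the constants in the demo loop are made up for illustration, not measurements from any real cluster.

```python
def cluster_time(work_seconds, nodes, comm_seconds_per_node):
    """Estimated wall-clock time for a job split evenly across `nodes`.

    Toy model (an assumption, not a measurement): compute time divides
    evenly, while communication cost grows with each extra node.
    """
    compute = work_seconds / nodes
    communication = comm_seconds_per_node * (nodes - 1)
    return compute + communication


def speedup(work_seconds, nodes, comm_seconds_per_node):
    """Speedup relative to running the whole job on a single node."""
    return work_seconds / cluster_time(work_seconds, nodes, comm_seconds_per_node)


if __name__ == "__main__":
    # Hypothetical job: 1000 s of work, 0.5 s of chatter per extra node.
    for n in (1, 4, 16, 64, 256):
        print(n, round(speedup(1000.0, n, 0.5), 1))
```

Under these assumptions the speedup climbs, peaks, and then falls as nodes are added, which is the case for faster interconnects in miniature. Real codes rarely follow a linear communication cost, so treat the shape rather than the numbers as the takeaway.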
...and in many cases, even if they *don't* need them. If you want to ensure the economic health of a nation, investing in basic, long-term research would be a good start. This gives a head start in the area industry is weakest in.
Trying to create some sort of supercomputing subsidy out of a misplaced fear that the U.S. will miss out is silly -- if the demand is there, the companies will be as well.
One of my professors (everybody has one of these it seems) is working on cluster computing research, extensions of MOSIX. He's a guy with networking and operating systems expertise. I wouldn't hire him to build a new generation of super computing interconnects or processors. As the Republicans have taught us, federal budgets are not a zero sum game. Why divert focus from one to the other when we could have both?
We have to be careful about measuring these things, however. One of the goals of cluster computing was to lower the cost of computing. If the government is spending less and still meeting needs, that's not necessarily an indicator of a problem. If that means that we aren't writing code to fit into a vector platform, so be it!
I Browse at +4 Flamebait
Open Source Sysadmin
There seems to be some historical revisionism going on regarding the demise of the "supercomputer industry". People are coming out of the woodwork now saying that lack of government support caused the great supercomputer die off.
As Eugene Brooks predicted in his paper Attack of the Killer Micros, the supercomputer die-off was caused by the increasing performance of microprocessor-based systems. Many of us now own what used to be called supercomputers (e.g., 3GHz Pentium processors, capable of hundreds of megaFLOPS).
The problem with supercomputers is that high performance codes must be specially designed for the supercomputer. This is very expensive. As people were able to fill their needs with high performance microprocessors they quit buying supercomputers.
Many people who need supercomputer levels of performance for specialized applications (e.g., rendering Finding Nemo or The Lord of the Rings) are able to use walls of processors or clusters.
There are, of course, groups where putting together off-the-shelf supercomputers will not suffice. But these groups are few and far between. As far as I can tell they consist of the government and a few corporations doing complex simulations. The problem is that this is not much of a market. Even if the government funds computer and interconnect architectural research, there does not seem to be a market to sustain the fruits of this research.
In the heyday of supercomputers there were those who argued that when cheap supercomputers were available the market would develop. The problem is, again, programming. High performance supercomputer codes tend to be specialized for the architecture. Also, no supercomputer architecture is equally efficient for all applications. It is difficult to build a supercomputer that is good at doing fluid flow calculations for Boeing and VLSI netlist simulation for Intel (the first application tends to be SIMD, the second, MIMD). The end result of these problems tends to suppress any emerging supercomputer market.
The reality right now seems to be that those who are doing massive computation must build specialized systems and throw a lot of talent into developing specialized codes.
If there truly is a demand for that kind of processor, then somebody will likely meet that demand. Right now, it seems that actual demand is so low that they have to drum up this legislation as a sort of welfare for vector processor manufacturers.
It's a simple cost tradeoff. If you can save millions in purchasing computers, it means more money to pay for people to run those computers and do the real work.
This sig has been temporarily disconnected or is no longer in service
So, what tasks still require a high-speed shared data memory? Answer that, and you'll understand where you can still sell a supercomputer.
Bruce
Bruce Perens.
Granted, it is more difficult to program something (from the ground up) that runs distributed, than it is to program something that runs on a giant 2048-way box.
Just like it's more difficult to write multithreaded code than it is to write single-threaded code.
That's where software, and platforms come in. There is a TON of research being done, which uses technologies like Infiniband and Myrinet as interconnects, and can make a cluster "look" like a big monolithic machine. If you as an end user write code that goes down into the TCP stack itself, you're working too hard, and you're going about it the wrong way.
Put it this way: In 5 years the odds are overwhelming that there will be a good software platform that can let you pick 5000 servers and run your app 10,000 threaded, with everything appearing just like a single process, and running "as it would on a Cray." It's easier to solve this stuff with software -- take your problem (distributed computing) and solve the problem with a different set of technologies (high performance/low latency interconnects, shared address space/DMA across machines, etc).
Apple's Xgrid is a step in this direction. It's missing a ton of "Supercomputer" functionality right now, but it's a nice cross-machine GUI scheduler. Right now this type of app can address maybe 20% of what supercomputer apps need... in the future maybe more like 98%.
There's the argument in a nutshell. A cluster ain't worth shiite to a modeler who needs to move petabytes of contiguous data in his algorithms.
There is no supercomputer technology crisis; there is, however, a paradigm shift happening in the supercomputer market. Twenty years ago building your own supercomputer, even a loosely coupled cluster, was not a very viable option for most research institutions. Today this option is not only viable but often exercised.
Obviously the big SC vendors and designers are seeing less business roll their way: why pay them tons of money when you can have grad students assemble your cluster for the price of some pizzas? That isn't to say SC clusters are the end-all be-all of computing, but they're very useful and relatively inexpensive. Realistically they're simply an extension of what Cray started with their T3D supercomputer. The T3D was very impressive in its day, but now the technology to build such systems is in the hands of just about everyone.
Taco: What the hell is up with the IT color scheme? This is even worse than the scheme for the Games section. I know the Slashdot editors don't actually read the site but other people try to and we're not all colorblind or reading from grayscale monitors.
I'm a loner Dottie, a Rebel.
In particular, he wants a 2000s version of a 1980s architecture running a 1960s language. For $1M, he could train his technology guys to use newer programming techniques. Yes, I realize that Fortran 90 is newer than Fortran 77 which is newer than Fortran IV which is newer than Fortran 1, and that the biggest CPU job these guys do is usually crunching big matrices of floating point numbers. That's a job for a subroutine you write once and feed with data and user interfaces that are written in languages that are more efficient for prototyping and user interface design.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Is it because there is a perceived zero-sum game being played between Linux-based clusters and supercomputers? Hey, let's take a reality check here, a lot of research is not directly applicable. In fact, I've read numerous discussions on /. railing against the MBAs and the Bush regime for not funding anything that doesn't turn a profit within about 18 months or have something to do with killing brown people.
Letting supercomputing die may be harmless; after all, the US doesn't have to be the best at everything in the world and some other country will fund the research. But from some of the more coherent posts I've read, it seems like supercomputing has a definite niche in the natural sciences, something we should be pushing for a better society - learning for learning's sake - and paying for out of public coffers. My taxes go to a lot of shitty things I'd rather them not go to, like subsidizing Halliburton with no-bid contracts. Why is it so offensive to /.'rs that the country as a whole subsidizes advanced computing? Isn't computer science all about seeing what can be computed? Letting supercomputing die because it's expensive seems like an extraordinarily short-sighted thing to do.
What does it matter if we don't develop single-unit supercomputers? Clearly, in a free market, if these things had value they would be pursued. There are no predatory tax laws on supercomputers, or any other regulations on domestic use. The only reason development has slowed is that there is not much of a market for the beasts.
There are many reasons for that too. For one, other than in stellar, nuclear, mathematical, and biological research fields, few industries need more computing power than can be had off the shelf any day of the week. That was not true yesterday: it took all sorts of custom hardware to make CGI happen in films, which can now be done in my basement in reasonable time frames. So there is no more supercomputer market there; the ROI is gone. I am sure this plays out in all sorts of other engineering fields as well.
Many jobs where you do need supercomputing power can be done with clustered systems that are cheap to build and cheap to maintain.
At least people in the pure science and research fields have learned to be better thinkers and programmers; they found ways to do things in parallel that were traditionally serial. Things that are still serial can be made to work on a cluster. Sure, it might take longer than on a single computer considered to be equal FLOPS-wise, but I could spend all the money I saved either on making my cluster bigger and more powerful so I can get back to equal time, or on other profitable efforts while I wait. So again there is no ROI.
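How much a bigger cluster buys you on a partly serial job is usually reasoned about with Amdahl's law. A minimal sketch follows; the function name and the sample fractions in the demo loop are chosen purely for illustration.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Amdahl's law: best-case speedup when only part of a job parallelizes.

    `parallel_fraction` is the share of single-node runtime that can be
    spread across `nodes`; the remainder runs serially no matter what.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


if __name__ == "__main__":
    # Illustrative fractions only; real codes have to be profiled.
    for fraction in (0.5, 0.9, 0.99):
        print(fraction, [round(amdahl_speedup(fraction, n), 1) for n in (8, 64, 512)])
```

The serial share caps the gain: a job that is half serial never gets past 2x, however many nodes are bought, so "just make the cluster bigger" only pays off when the parallel fraction is high.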
It so happens that many of the most interesting questions in math, physics and computer science, such as quantum theory, need massive amounts of parallel work rather than serial, so that works better on a cluster anyway.
If there is a real reason to do it, people will build supercomputers, because there is nothing stopping them other than economics. No need to fear: supercomputers are not going away. Everyone else who needs that kinda proc-ing power will settle for clusters, as well they should. This is just another largely obsolete industry wanting someone to bail them out because they have failed to adapt to a changing market. If they are going to die we should let them, just like we should let the universities adapt or die, and the RIAA needs to adapt or die. We need to stop propping up obsolete industries so new ones can replace them!
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
Oh really. Don't blame me for not trusting a guy with that kind of potential bias.
Time makes more converts than reason
One AC to another: All the simulations you are talking about will have to be re-run a lot of times with different combinations of input parameters anyhow. Therefore, these are trivially parallelizable on a cluster (just run one combination of input data on each of the CPUs).
- non-punk who actually saw how Chem simulations are run
- old enough to see CDC run
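The "one combination of input data per CPU" pattern can be sketched in a few lines. This is an illustration only: `simulate` is a hypothetical stand-in for a real simulation run, the parameter names are invented, and a single-machine process pool stands in for the cluster's job scheduler.

```python
import itertools
from multiprocessing import Pool


def simulate(params):
    """Stand-in for one independent simulation run (hypothetical workload)."""
    temperature, pressure = params
    # Placeholder arithmetic; a real run would call the simulation code here.
    return temperature * pressure


def parameter_grid(temperatures, pressures):
    """Every combination of input parameters, one tuple per run."""
    return list(itertools.product(temperatures, pressures))


def run_sweep(grid, map_fn=map):
    """Run `simulate` on every grid point; `map_fn` decides where it runs."""
    return dict(zip(grid, map_fn(simulate, grid)))


if __name__ == "__main__":
    grid = parameter_grid([280, 300, 320], [1.0, 2.0])
    with Pool(processes=4) as pool:
        # Each worker gets whole runs; the runs never talk to each other.
        results = run_sweep(grid, map_fn=pool.map)
    for params, value in results.items():
        print(params, value)
```

Because no run depends on another, the interconnect hardly matters here. The argument does not carry over to a single run that is itself too large for one node's memory.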
There never really was a supercomputer market. There was a cold war, that subsidized the supercomputer market.
Then there is the cost. Companies stopped making SC because they were too expensive. If the guy from Ford wants to pay 1 billion for a supercomputer I am sure someone will build him one. The cost to build a fab is over 4 billion. Why do you think HP teamed with Intel? Why do you think there are so few processor families? You have to make a living in the commodity market, where you can sell things in the millions, because supercomputers even in their heyday were sold in the hundreds.
Then there is the problem that many problems are solvable on clusters. So those specialized problems cannot depend on other parts of the HPC market to help subsidize their corner of the market, i.e. clusters make the really hard problems more expensive.
It is a question of how much you want to pay to solve your problem. Simple economics, actually. If the numbers don't work, the problem doesn't get solved. If the Gov. wants to solve some problems (and during the cold war they did) then they can step in and subsidize the market.
And don't cry about Japan and the Top500. When the Top500 has a price column, then it will start to be meaningful.
HPC for Primates. Read Cluster Monkey
I hear this type of FUD all the time from some of the older folks I work with -- all the hype about Cray systems, shared memory capabilities. There are several problems with the 'supercomputing' market though. First off, if you go to any local university, they teach Fortran as a basic intro course, but most professors will footnote their comments by saying things like "Fortran is no longer considered a marketable language"... so students think -- why waste time learning it (other than for the basic programming capability)? This erodes the base of support for vectorized programming (putting aside arguments that Fortran is not vector programming for now please). More importantly (and secondly), Cray architectures are MASSIVELY expensive in relation to where the standard desktop CPU is, without enough benefit to make them worth it. People talk about the need to play "catch up" as if there is a real crisis at hand. The reality is that clusters are gaining in popularity because they're cheap, and per CPU tend to be more 'powerful'. Institutions that purchase Crays end up with a number of folks trying to run software and inevitably you run into time bottlenecks. What folks at these institutions realized is that, with a 'state-of-the-art' 32/64 bit system with dual processors and a couple gig of memory, by the time their jobs were starting on the Crays, they could already be running on their new desktops.... In the end I say why rent time on a big Cray when you can purchase your own system and run it into the ground with jobs? Despite some of the FUD I've seen splashed around by folks who are proponents of Cray systems, clusters are relatively simple to set up, do not require multi-million dollar yearly contracts, and can generally be maintained by the purchaser without much effort. Most of the cost of clusters anymore is the cost of the system administration -- for which you don't need a "staff" of SAs.
IMHO, you don't even need an SA anymore for designing, building, and running a cluster; with only a baseline of knowledge of computers, and a little reading just about anyone could build their own. When all is said and done, you end up with more compute power in a cluster at a much lower cost. Some folks would say that throughput is an issue; that's problem dependent, and can be rectified with a little problem solving. Some folks indicated that clusters aren't optimized for peak performance... WHO CARES?? If it takes me two minutes longer for a 5 week job to finish, I've still made out better for not purchasing a Cray. I've heard folks complain about not enough memory, or programming problems -- generally these tend to be older folks (what I like to call the obsolete-engineers; aka -- old fogies) These are generally the folks that complain about the 'new fangled' software they have to use, and simply don't want to have to reinvent their "marvelous software" in a new environment. I've run 64 gig memory jobs on clusters without a problem; again my problems allow for parallelization optimization... Eventually I end up telling folks who are steadfast supporters of the Cray to wake up, this is a brand new era about you. The days of the single supercomputer vendor filling all your needs are over. You now have choices, and with those choices you have to take the responsibility of defining how your new parallel supercomputer will function. Either learn and adapt, or become obsolete.
I'll pretend someone will read this late post, but most of the arguments people are presenting against supercomputers are business arguments. Why do we need a business model to want to build supercomputers? Governments build a lot of stuff that makes horrible business sense. For example, particle accelerators. These things are damn expensive and, I doubt, profitable. But they do enable fundamental scientific research. The same can be said about supercomputers. It's going to be less than 20 years before we could start thinking about almost science-fiction type problems --- for example, molecular dynamics simulations of whole human cells. However, we're not going to get there by strapping all of our 2025 cell phones together and making a conference call (the 2025 idea of a cluster). Supercomputers are designed for fundamentally different scalability, reliability, compute performance, and bandwidth than consumer systems. While they might not be a good business idea, it's hard to argue against the benefits to material science, physics, medicine, etc. that supercomputers could provide to society. In that light, $200M of government spending isn't about holding onto ideas of the past, it's about preparing for the future.