Slashdot Mirror


Supercomputers To Move To Specialization?

lucasw writes "The Japan Earth Simulator outperformed a computer at Los Alamos (previously the world's fastest) by a factor of three while using fewer, more specialized processors and advanced interconnect technology. This spawned multiple government reports that many suspected would ask for more funding in the U.S. for custom supercomputer architectures and less emphasis on clustering commodity hardware. One report released yesterday suggests a balanced approach."

15 of 174 comments

  1. Cost comparison? by Tyrdium · · Score: 4, Interesting

    Ignoring size, how does the cost of a cluster of fewer, highly specialized computers (with special interconnects, etc.) compare with that of a cluster of more, less specialized computers?

    1. Re:Cost comparison? by ybmug · · Score: 5, Insightful

      The problem is that it may not be possible to match the computation of a cluster with specialized interconnects using just commodity hardware, no matter how many machines you throw at it. If a simulation has a low computation-to-communication ratio, its scalability is bound by the performance of the interconnects. In this case, throwing more commodity machines at the problem will actually increase the total time required to run the experiment.
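
      To make the scaling argument concrete, here is a minimal toy model. It is a sketch under stated assumptions, not a measurement: the names `run_time` and `best_node_count`, the linear growth of communication cost with node count, and all of the numbers are hypothetical. It only illustrates how a low computation-to-communication ratio can make extra machines slow a job down.

```python
def run_time(nodes, compute_work, comm_cost_per_node):
    """Toy wall-clock model (assumed, not measured).

    compute_work / nodes       -> computation splits evenly across nodes
    comm_cost_per_node * nodes -> communication overhead grows with node count
    """
    return compute_work / nodes + comm_cost_per_node * nodes


def best_node_count(compute_work, comm_cost_per_node, max_nodes):
    """Return the node count in 1..max_nodes with the smallest modeled run time."""
    return min(
        range(1, max_nodes + 1),
        key=lambda n: run_time(n, compute_work, comm_cost_per_node),
    )


if __name__ == "__main__":
    work = 1000.0      # hypothetical units of computation
    slow_link = 0.1    # hypothetical per-node cost on a commodity interconnect
    fast_link = 0.001  # hypothetical per-node cost on a specialized interconnect

    for label, cost in (("commodity", slow_link), ("specialized", fast_link)):
        n = best_node_count(work, cost, max_nodes=2000)
        print(label, "best nodes:", n, "time:", round(run_time(n, work, cost), 3))
        # Doubling the node count past the optimum makes the modeled run slower.
        print(label, "time at 2x nodes:", round(run_time(2 * n, work, cost), 3))
```

      In this model the slower interconnect hits its best run time at far fewer nodes, and adding machines beyond that point only adds communication overhead.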

    2. Re:Cost comparison? by mfago · · Score: 4, Informative

      The interconnects are (usually) not commodity parts -- just the servers.

      As an example, the first IBM SP "supercomputers" were essentially just common Power workstations bolted into racks, but connected with a custom made SP switch.

      Nevertheless, EarthSimulator has shown what can be done by designing the entire server from the ground-up with the application in mind.

      We'll have to see how ASCI Purple performs...

  2. performance vs cost by harmless_mammal · · Score: 4, Interesting

    Teraflops per dollar is important, let's not forget that.

  3. Specialization by bersl2 · · Score: 4, Interesting

    If you're going to have a supercomputer do one thing, of course specialize it. An Earth simulation surely has a set number of formulae whose calculations are to be optimized as much as possible, even to the hardware level.

    But if you want a versatile, general-purpose supercomputer, why not go with the clustering solution?

  4. The motivation is a tad depressing by Faust7 · · Score: 4, Insightful

    The two studies resulted, in part, from NEC Corp.'s May 2002 announcement of the Earth Simulator, a custom-built supercomputer that delivers 35.8 teraflops. That system packed five times the performance of the fastest U.S. supercomputer at that time...

    "The Earth Simulator created a tremendous amount of interest in high-performance computing and was a sign the U.S. may have been slipping behind what others were doing," said Jack Dongarra...

    Graham said researchers should not overreact to NEC Corp.'s Earth Simulator that blindsided many in the high-performance computing community eighteen months ago by delivering a custom-built system five to seven times more powerful than the more off-the-shelf clusters developed in the U.S.


    I don't mean to draw a crude analogy here, but I really can't help but read this and be reminded of the space race.

    It took Sputnik to kickstart our spacemindedness; I for one consider it sad that a "tremendous amount of interest" -- and the funding that comes with it -- in high-performance computing seems only to have arisen/regenerated with the influence of competitive international politics. Are we really so hardly advanced that our respective national egos are still the driving force behind enthusiasm, financial or otherwise, in certain areas of science?

    1. Re:The motivation is a tad depressing by Pharmboy · · Score: 4, Interesting

      It took Sputnik to kickstart our spacemindedness; I for one consider it sad that a "tremendous amount of interest" -- and the funding that comes with it -- in high-performance computing seems only to have arisen/regenerated with the influence of competitive international politics. Are we really so hardly advanced that our respective national egos are still the driving force behind enthusiasm, financial or otherwise, in certain areas of science?

      I don't really see that as bad. Yes, it may look like pure ego, but the space race gave us so much that filtered into the commercial/private sector. From advanced computers to Velcro(tm). From my perspective, being the most advanced nation in as many areas as possible is a good defense, both economically and in a homeland security sense.

      Frankly, I don't want the fastest computer chips on the desktop to be designed by a company in another country (even if Intel makes them outside of the US), and I would rather that the cutting edge be cut here, in my native country. I am sure people in other countries feel the same, and that pushes all of us to new heights. In the end, the technologies are shared anyway. Most anyone in the world can buy Intel chips, for example.

      If no one cared who could race a bicycle the fastest, Lance Armstrong would be just some guy who had cancer. Instead, our desire to compete and excel and outdo our neighbors has benefited EVERYONE a great deal. It can bring out the bad side from time to time, but the benefits far outweigh the costs. This urge to compete and win is not unique to America by any means; it is part of being human: man the animal.

      I say bring on the computer chip wars: Let's all compete, Japanese, Americans, Europeans, Russians, come one come all. In the end, we will all benefit, no matter who has the bragging rights for a day.

      --
      Tequila: It's not just for breakfast anymore!
  5. Specialized always outperforms... by I'm+a+racist. · · Score: 5, Insightful

    Specialized hardware (almost) always outperforms commodity stuff.

    I use custom-designed amplifiers because they work better for my application. I could buy off-the-shelf stuff (~$500 to ~$10,000 range), but that won't be exactly what I want. I use custom software too... know why? Because it's designed specifically for the job. That same software shouldn't really be used for other fields of research, neither should my amplifiers. The thing about this stuff is that it takes a lot of time to maintain (plus initial development). That means grad students, postdocs, and technicians who may spend over 90% of their time just keeping systems in working order and/or adding features. The benefits of customized hardware/software, in this instance, are worth the headaches associated with it.

    All of my optics is commodity stuff (some is rare/exotic, but it's still basically black-box purchasing). I don't have the facilities to make coated optics, nor do I need anything that specialized, so... I just buy it.

    When I was in telecom, we used Oracle and Solaris and Apache. It worked, and the cost of developing the same functionality in-house was ridiculously high (plus we'd never get to designing our products that sit on top of it).

    Eventually, it always comes down to a comparison between the cost (man hours, equipment, etc) of custom building and of integrating stuff from OEMs.

    So, the question our labs need to answer is, does clustered COTS hardware get the job done? Supplementary to that, is it cost-effective to buy/design it in light of the previous answer?

    In any field where you are pushing the limits of technology, you have to make such trade-offs. Personally, I don't care who has the absolute fastest supercomputer (measured in flops, factoring-time, whatever)... what really counts is, who does the best research with the supercomputers.

    --


    Down with Saudi Arabia!!!
  6. Specialization by bytesmythe · · Score: 4, Insightful

    Specialized systems are almost always going to outperform generalized systems when you're dealing with similar levels of technology (for instance, specialized abacuses vs. a generalized Cray T3E).

    The great thing about generalized systems is you can use them to explore new areas, then design a specialized system to take advantage of specific optimizations the generalized one can't support.

    I'm glad for the report suggesting a "balanced approach". I can't imagine forsaking one type of system for the other, as each has its place. (Uhoh... generalized systems have a "place"? Does that mean they're specialized at being generalized? Oh, the irony! ;))

    --
    bytesmythe
    Hypocrisy is the resin that holds the plywood of society together.
    -- Scott Meyer
  7. Re:Oh! No! End of the World! by BabyDave · · Score: 5, Funny

    There's a far more important thing to worry about - could this be the end of "Imagine a Beowulf Cluster ..." jokes? After all, the phrase "Imagine a custom-built supercomputer utilising similar technology (albeit more specialised) to that found in one of those!" doesn't exactly roll off the tongue, does it?

  8. Re:trigonometry? by Gherald · · Score: 5, Funny

    Do HP's Saturn or other such special-purpose processors have hard-coded higher-level functions?

    Indeed, functions Cost_an_arm_and_a_leg() and Fork_over_much_dough() are hard-coded, and always return a value of "1".

  9. This greatly surprises me by ikewillis · · Score: 4, Interesting

    As an employee of an atmospheric modelling group I am very surprised to hear this. Our atmospheric modelling program, the Regional Atmospheric Modelling System, is not I/O bound in the slightest and is instead very much CPU bound. We currently use 100bT for the interconnect on our cluster, and have tried moving to Gigabit with negligible performance gains.

    The main area in which we saw benefit was switching from the Portland Group Fortran Compiler to the Intel Fortran Compiler, which cut the timestep (simulation time/real time) nearly in half.

    Every cluster in the department is assembled from commodity x86 components. Groups here have been moving from proprietary Unix architectures to Linux/x86 systems and clusters. Our group started out on RS/6000s, then moved to SPARC, and is now moving to x86. In terms of price/performance there really is no comparison.

    As for TCO, the lifetimes of clusters here are relatively short, one or two years at the most. Thus a high initial outlay cannot be offset by lower cost of operation.
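
    The TCO point can be sketched as simple arithmetic. This is only an illustration: the function names and every dollar figure below are hypothetical, not numbers from our department. It compares a system with a high purchase price and low running cost against a cheaper one with higher running cost.

```python
def total_cost(purchase, annual_operating, years):
    """Total cost of ownership over a given lifetime (simple linear model)."""
    return purchase + annual_operating * years


def break_even_years(purchase_a, operating_a, purchase_b, operating_b):
    """Years until system A's lower operating cost pays back its extra purchase price.

    Returns None if A never catches up (A is not cheaper to operate than B).
    A result of zero or less means A is already cheaper from day one.
    """
    saving_per_year = operating_b - operating_a
    if saving_per_year <= 0:
        return None
    return (purchase_a - purchase_b) / saving_per_year


if __name__ == "__main__":
    # Hypothetical figures, in dollars.
    custom = {"purchase": 1_000_000, "operating": 100_000}
    commodity = {"purchase": 400_000, "operating": 200_000}

    years = break_even_years(
        custom["purchase"], custom["operating"],
        commodity["purchase"], commodity["operating"],
    )
    print("break-even after", years, "years")
    for lifetime in (1, 2):
        print(
            lifetime, "year(s):",
            total_cost(custom["purchase"], custom["operating"], lifetime),
            "vs",
            total_cost(commodity["purchase"], commodity["operating"], lifetime),
        )
```

    If the break-even point lies well beyond the one or two years a cluster stays in service, the cheaper-to-buy system wins on total cost.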

    1. Re:This greatly surprises me by FullyIonized · · Score: 5, Interesting

      And I'm surprised to hear that you are surprised since fluid modeling is one of the applications that do very well with the vector processors that the Earth Simulator uses. I attended a lecture by Dr. Sato, head of the Earth Simulator, who stated that the best application usage was 65% peak usage (the theoretical peak which assumes that the processor always has data to crunch and no branches) and the average was 30% of theoretical peak. By contrast, typical fluid-like codes on current U.S. machines typically get less than 10% of peak usage if they have any type of implicitness (currently the magnetohydrodynamics code I use gives about 6% usage on an IBM SP that is #5 on the Top 500 supercomputer list).

      I get tired of seeing figures that compare peak flop rates and then don't mention that actual code usage isn't keeping up at all. The Japanese (and Europeans who are allowed to buy NEC machines) are absolutely spanking the US when it comes to fluid codes (for climate modeling for example) and it is largely because they are using vector machines with their old highly optimized Fortran (or High Performance Fortran) codes. The MPP revolution in the U.S. has been manna for the CompSci community, but has set the computational physics community back by 10 years (except for those lucky bastards with embarrassingly parallel jobs).

      I would give up an unnecessary body part for an Earth Simulator.

      --
      Sigs are bad for you.
  10. Re:Why oh why? by mfago · · Score: 4, Informative

    computers like the earth simulator go vastly under utilized for the most part

    From first-hand experience, such computers are running jobs almost 24x7. Due to job scheduling details there are times when some of the machine is idle, but this is still a small percentage. These machines are used for a vast array of applications, not just the advertised ones.

    Now the utilization as a percentage of peak theoretical is another matter. For some algorithms, 20% of peak performance (IIRC) is considered good (i.e., a particular code might only get 2 TFlops on a machine rated for 10).
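
    The arithmetic behind that example, as a small sketch. The function names are made up for illustration; the only numbers used are the 2 TFlops sustained and 10 TFlops rated from the example above.

```python
def fraction_of_peak(sustained_tflops, peak_tflops):
    """Share of the rated (theoretical peak) performance a code actually achieves."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return sustained_tflops / peak_tflops


def sustained_from_peak(peak_tflops, fraction):
    """Sustained performance implied by a rated peak and an achieved fraction of it."""
    return peak_tflops * fraction


if __name__ == "__main__":
    share = fraction_of_peak(2.0, 10.0)
    print(f"fraction of peak: {share:.0%}")
    print("sustained TFlops at that fraction:", sustained_from_peak(10.0, share))
```

    Comparing machines by this fraction, rather than by rated peak alone, is what separates "the machine is busy" from "the machine is being used efficiently".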

  11. In the real world it's a bit more complicated... by depeche · · Score: 4, Insightful

    There is also a direct trade-off between more general-purpose systems and systems custom-tailored to a task. Good examples are Deep Blue and Blue Gene. Both of these systems are designed with a particular task in mind (i.e. chess and protein folding) and therefore are able to leverage knowledge about the problem space to constrain the kind of hardware, the particular low-level instructions and the information flow within the system while achieving significantly greater performance on a small class of problems. I work with clusters that are used in scientific communities that have various researchers working on various problems. In these cases, the questions are about basic applicability of a particular problem to a particular architecture. For example, a cluster with high-speed interconnects made of good COTS hardware will allow a user with a very granular problem to effectively use the cluster, and it will also allow a user who needs the high-speed interconnect because the problem space demands a high degree of internal communication. But the first researcher might also be able to make use of a grid of (for instance) many more computers with a total lower cost because (s)he doesn't need the high-speed interconnect. The Earth Simulator gains a lot of performance (on a class of problems) because of the underlying vector processor architecture. Given the right internal bus, it is conceivable that adding vector processor daughter boards to the next generation of COTS clusters could achieve similar results--but, of course, only for problem spaces that make efficient use of such processors and aren't bottlenecked by the communication requirements.

    Real answers are always more complicated. For example: the equations needed for nuclear simulation will probably require dedicated hardware (as the need for protein folding has led to Blue Gene) to achieve the results that the Pentagon needs. But for many supercomputing tasks, the flexibility of COTS clusters will still be compelling, especially for areas where the algorithms are not yet fully developed (e.g. brain simulation). An interesting keynote at OLS 2003 argued that (some of) the problems are not going to be the local computing power but the need to move large quantities of data between research labs across the world and combine computational systems using the 'grid.' (For down-home examples of problems that have been successfully tackled through coarse-grained distribution, just look at SETI@Home and Distributed.Net.) So it's not just the flops anymore...