Slashdot Mirror


Two Directions for the Future of Supercomputing

aarondsouza writes: "The NY Times (registration required, mumble... mutter...) has this story on two different directions being taken in the supercomputing community. The Los Alamos labs have a couple of new toys: one built for raw number-crunching speed, and the other for efficiency. The article has interesting numbers on the performance/price (price in the power consumption and maintenance sense) ratios for the two machines. As an aside... 'Deep Blue', 'Green Blade' ... wonder what Google Sets would think of that..."

10 of 148 comments

  1. Full article w/o registration by af_robot · · Score: 1, Informative
    At Los Alamos, Two Visions of Supercomputing
    By GEORGE JOHNSON

    Moore's Law holds that the number of transistors on a microprocessor -- the brain of a modern computer -- doubles about every 18 months, causing the speed of its calculations to soar. But there is a downside to this oft-repeated tale of technological progress: the heat produced by the chip also increases exponentially, threatening a self-inflicted meltdown.

    A computer owner in Britain recently dramatized the effect by propping a makeshift dish of aluminum foil above the chip inside his PC and frying an egg for breakfast. (The feat -- cooking time 11 minutes -- was reported in The Register, a British computer industry publication.) By 2010, scientists predict, a single chip may hold more than a billion transistors, shedding 1,000 watts of thermal energy -- far more heat per square inch than a nuclear reactor.

    The comparison seems particularly apt at Los Alamos National Laboratory in northern New Mexico, which has two powerful new computers, Q and Green Destiny. Both achieve high calculating speeds by yoking together webs of commercially available processors. But while the energy-voracious Q was designed to be as fast as possible, Green Destiny was built for efficiency. Side by side, they exemplify two very different visions of the future of supercomputing.

    Los Alamos showed off the machines last month at a ceremony introducing the laboratory's Nicholas C. Metropolis Center for Modeling and Simulation. Named for a pioneering mathematician in the Manhattan Project, the three-story, 303,000-square-foot structure was built to house Q, which will be one of the world's two largest computers (the other is in Japan). Visitors approaching the imposing structure might mistake it for a power generating plant, its row of cooling towers spewing the heat of computation into the sky.

    Supercomputing is an energy-intensive process, and Q (the name is meant to evoke both the dimension-hopping Star Trek alien and the gadget-making wizard in the James Bond thrillers) is rated at 30 teraops, meaning that it can perform as many as 30 trillion calculations a second. (The measure of choice used to be the teraflop, for "trillion floating-point operations," but no one wants to think of a supercomputer as flopping trillions of times a second.)

    Armed with all this computing power, Q's keepers plan to take on what for the Energy Department, anyway, is the Holy Grail of supercomputing: a full-scale, three-dimensional simulation of the physics involved in a nuclear explosion.

    "Obviously with the various treaties and rules and regulations, we can't set one of these off anymore," said Chris Kemper, deputy leader of the laboratory's computing, communications and networking division. "In the past we could test in Nevada and see if theory matched reality. Now we have to do it with simulations."

    While decidedly more benign than a real explosion, Q's artificial blasts -- described as testing "in silico" -- have their own environmental impact. When fully up and running later this year, the computer, which will occupy half an acre of floor space, will draw three megawatts of electricity. Two more megawatts will be consumed by its cooling system. Together, that is enough to provide energy for 5,000 homes.

    And that is just the beginning. Next in line for Los Alamos is a 100-teraops machine. To satisfy its needs, the Metropolis center can be upgraded to provide as much as 30 megawatts -- enough to power a small city.

    That is where Green Destiny comes in. While Q was attracting most of the attention, researchers from a project called Supercomputing in Small Spaces gathered nearby in a cramped, stuffy warehouse to show off their own machine -- a compact, energy-efficient computer whose processors do not even require a cooling fan.

    With a name that sounds like an air freshener or an environmental group (actually it's taken from the mighty sword in "Crouching Tiger, Hidden Dragon"), Green Destiny measures about two by three feet and stands six and a half feet high, the size of a refrigerator.

    Capable of a mere 160 gigaops (billions of operations a second), the machine is no match for Q. But in computational bang for the buck, Green Destiny wins hands down. Though Q will be almost 200 times as fast, it will cost 640 times as much -- $215 million, compared with $335,000 for Green Destiny. And that does not count housing expenses -- the $93 million Metropolis center that provides the temperature-controlled, dust-free environment Q demands.

    Green Destiny is not so picky. It hums away contentedly next to piles of cardboard boxes and computer parts. More important, while Q and its cooling system will consume five megawatts of electrical power, Green Destiny draws just a thousandth of that -- five kilowatts. Even if it were expanded, as it theoretically could be, to make a 30-teraops machine (picture a hotel meeting room crammed full of refrigerators), it would still draw only about a megawatt.
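
The ratios quoted in the two paragraphs above can be checked with a few lines of arithmetic. This is only a sketch: the numbers are the ones given in the article, the variable names are illustrative, and the scaled-up power figure assumes that power grows linearly with the number of blades, which the article implies but does not state.

```python
# Figures as quoted in the article; all names here are illustrative.
Q_GIGAOPS = 30_000         # Q: 30 teraops
GD_GIGAOPS = 160           # Green Destiny: 160 gigaops
Q_COST_USD = 215_000_000   # excludes the $93 million building
GD_COST_USD = 335_000
Q_POWER_KW = 5_000         # 3 MW for the machine + 2 MW for cooling
GD_POWER_KW = 5

speed_ratio = Q_GIGAOPS / GD_GIGAOPS            # "almost 200 times as fast"
cost_ratio = Q_COST_USD / GD_COST_USD           # "640 times as much"
q_usd_per_gigaops = Q_COST_USD / Q_GIGAOPS
gd_usd_per_gigaops = GD_COST_USD / GD_GIGAOPS

# Assumption: scaling Green Destiny to Q's speed multiplies its power
# draw by the same factor as its blade count.
gd_scaled_power_kw = GD_POWER_KW * speed_ratio  # "about a megawatt"

print(f"speed ratio:              {speed_ratio:.1f}x")
print(f"cost ratio:               {cost_ratio:.1f}x")
print(f"Q dollars per gigaops:    {q_usd_per_gigaops:,.0f}")
print(f"GD dollars per gigaops:   {gd_usd_per_gigaops:,.0f}")
print(f"GD power at 30 teraops:   {gd_scaled_power_kw:.1f} kW (Q: {Q_POWER_KW} kW)")
```

On these figures Q is 187.5 times as fast for roughly 642 times the price, and a Green Destiny scaled to 30 teraops would draw 937.5 kW against Q's 5 megawatts.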

    "Bigger and faster machines simply aren't good enough anymore," said Dr. Wu-chun Feng, the leader of the project. The time has come, he said, to question the doctrine of "performance at any cost."

    The issue is not just ecological. The more power a computer consumes, the hotter it gets. Raise the operating temperature 18 degrees Fahrenheit, Dr. Feng said, and the reliability is cut in half. Pushing the extremes of calculational speed, Q is expected to run in sprints for just a few hours before it requires rebooting. A smaller version of Green Destiny, called Metablade, has been operating in the warehouse since last fall, requiring no special attention.
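
Dr. Feng's rule of thumb can be written as a one-line formula. The sketch below assumes the halving compounds, so that every further 18 degrees Fahrenheit halves reliability again; the article only states the single 18-degree step, and the function name and parameters are hypothetical.

```python
def relative_reliability(delta_f, halving_step_f=18.0):
    """Reliability relative to the baseline after a rise of delta_f degrees F.

    Assumption: the halving quoted in the article compounds exponentially.
    """
    return 0.5 ** (delta_f / halving_step_f)


for delta_f in (0, 9, 18, 36):
    print(f"+{delta_f:>2} F -> {relative_reliability(delta_f):.2f} of baseline reliability")
```

Under that assumption a machine running 36 degrees hotter would be a quarter as reliable as the baseline, which is the sort of gap the Q and Metablade uptime figures hint at.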

    "There are two paths now for supercomputing," Dr. Feng said. "While technically feasible, following Moore's Law may be the wrong way to go with respect to reliability, efficiency of power use and efficiency of space. We're not saying this is a replacement for a machine like Q but that we need to look in this direction."

    The heat problem is nothing new. In taking computation to the limit, scientists constantly consider the trade-off between speed and efficiency. I.B.M.'s Blue Gene project, for example, is working on energy-efficient supercomputers to run simulations in molecular biology and other sciences.

    "All of us who are in this game are busy learning how to run these big machines," said Dr. Mike Levine, a scientific director at the Pittsburgh Supercomputing Center and a physics professor at Carnegie Mellon University. A project like Green Destiny is "a good way to get people's attention," he said, "but it is only the first step in solving the problem."

    Green Destiny belongs to a class of makeshift supercomputers called Beowulf clusters. Named for the monster-slaying hero in the eighth-century Old English epic, the machines are made by stringing together off-the-shelf PC's into networks, generally communicating via Ethernet -- the same technology used in home and office networking. What results is supercomputing for the masses -- or, in any case, for those whose operating budgets are in the range of tens or hundreds of thousands of dollars rather than the hundreds of millions required for Q.

    Dr. Feng's team, which also includes Dr. Michael S. Warren and Eric H. Weigle, began with a similar approach. But while traditional Beowulfs are built from Pentium chips and other ordinary processors, Green Destiny uses a special low-power variety intended for laptop computers.

    A chip's computing power is ordinarily derived from complex circuits packed with millions of invisibly tiny transistors. The simpler Transmeta chips eliminate much of this energy-demanding hardware by performing important functions using software instead -- instructions coded in the chip's memory. Each chip is mounted along with other components on a small chassis, called a blade. Stack the blades into a tower and you have a Bladed Beowulf, in which the focus is on efficiency rather than raw unadulterated power.

    The method has its limitations. A computer's power depends not just on the speed of its processors but on how fast they can cooperate with one another. Linked by high-speed fiber-optical cable, Q's many subsections, or nodes, exchange data at a rate as high as 6.3 gigabits a second. Green Destiny's nodes are limited to 100-megabit Ethernet.

    The tightly knit communication used by Q is crucial for the intense computations involved in modeling nuclear tests. A weapons simulation recently run on the Accelerated Strategic Computing Initiative's ASCI White supercomputer at Lawrence Livermore National Laboratory in California took four months of continuous calculating time -- the equivalent of operating a high-end personal computer 24 hours a day for more than 750 years.

    Dr. Feng has looked into upgrading Green Destiny to gigabit Ethernet, which seems destined to become the marketplace standard. But with current technology that would require more energy consumption, erasing the machine's primary advantage.

    For now, a more direct competitor may be the traditional Beowulfs with their clusters of higher-powered chips. Though they are cheaper and faster, they consume more energy, take up more space, and are more prone to failure. In the long run, Dr. Feng suggests, an efficient machine like Green Destiny might actually perform longer chains of sustained calculations.

    At some point, in any case, the current style of supercomputing is bound to falter, succumbing to its own heat. Then, Dr. Feng hopes, something like the Bladed Beowulfs may serve as "the foundation for the supercomputer of 2010."

    Meanwhile, the computational arms race shows no signs of slowing down. Half of the computing floor at the Metropolis Center has been left empty for expansion. And ground was broken this spring at Lawrence Livermore for a new Terascale Simulation Facility. It is designed to hold two 100-teraops machines.

  2. all i see by martissimo · · Score: 3, Informative

    is that the writer has noticed that it is cheaper to run a beowulf than to run a true supercomputer, but in return for the price you sacrifice performance...

    though i did find the line about Q needing rebooted every few hours kinda funny, i mean when are they gonna learn to stop installing Windows on a 100 million dollar supercomputer ;)

  3. Re:Google set reply - OT by flonker · · Score: 2, Informative

    A little bit of research shows up this on how google sets works. There's a link on the bottom of that message for an introduction to faceted sets.

    And now for the fun bit. Looking for set with just the keyword Porn, I got some very interesting results:

    Predicted Items
    Porn
    Warez Sites
    pirated software
    Irc Bots
    Mp3
    Spamming Software

  4. Moore's Law by wiZd0m · · Score: 2, Informative

    Moore's Law holds that the number of transistors on a microprocessor -- the brain of a modern computer -- doubles about every 18 months, causing the speed of its calculations to soar.

    This is a myth for the non-techie: it's transistor density that doubles every 18 months, not the number of transistors.

  5. Re:Gridcomputing sites by grid+geek · · Score: 4, Informative
    Rubbish. Talking as someone doing a PhD in the subject, Grid computing is *not* the answer to every high performance computing problem.

    Latency issues are still going to be there, which would make Grid environments unsuitable for the majority of simulations. You can't do nuclear event simulations effectively if you have the multiple-second delay in communicating between processors that you get in Grids.

    On the other hand, Grids do have several advantages: they provide similar TFLOPS for a much lower price, using several geographically separated systems gives access to more researchers, and research in this area has a lot of practical spin-offs in the future.

  6. Q machine interconnect by Anonymous Coward · · Score: 3, Informative

    For those of you who are wondering what they mean by high performance networks inside the Q machine..

    The Q machine utilizes dual-rail Quadrics card according to this. Dual rail refers to using two NI cards (each one on a separate 64b/66MHz PCI bus so they can get the most out of the I/O system of the host).

    I hadn't heard of Quadrics so I looked them up. At the web site you find out that they're a switched network that gets 340 MBytes per second between applications and with latencies around 3-5 microseconds. Compare this to 100Mbps ethernet, which gets 10MBytes/s and latencies of 70+ microseconds and you'll understand why the Q machine will run fine grained parallel apps that the green machine won't be able to touch.

    Looking a bit through the literature, I noticed that Quadrics uses IEEE 1596.3 for its link signaling (400 MBaud, 10 bit). While they don't say it anywhere, this IEEE standard is the well-known SCI standard (Scalable Coherent Interface.. pretty popular in Europe, but the US has been dominated by Myrinet.. which I coincidentally use at school)..

    Hope this gives some more detail about the arch..
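
The comparison in the comment above can be made concrete with the usual first-order model, time = latency + size / bandwidth. This is a rough sketch rather than a benchmark: it takes the upper end of the quoted Quadrics latency (5 microseconds), treats a MByte as 10^6 bytes, and ignores contention and protocol overhead. The function and dictionary names are illustrative.

```python
def transfer_time_s(message_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order cost of sending one message: fixed latency plus wire time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Figures quoted in the comment; MByte taken as 10**6 bytes (an assumption).
QUADRICS = {"latency_s": 5e-6, "bandwidth_bytes_per_s": 340e6}
ETHERNET_100 = {"latency_s": 70e-6, "bandwidth_bytes_per_s": 10e6}

for size in (64, 1024, 1_000_000):
    t_quadrics = transfer_time_s(size, **QUADRICS)
    t_ethernet = transfer_time_s(size, **ETHERNET_100)
    print(
        f"{size:>9} bytes: Quadrics {t_quadrics * 1e6:10.1f} us, "
        f"Ethernet {t_ethernet * 1e6:10.1f} us, "
        f"ratio {t_ethernet / t_quadrics:5.1f}x"
    )
```

Small messages are dominated by latency, so the gap tends toward 70/5 = 14x; large messages are dominated by bandwidth, so it tends toward 340/10 = 34x. Fine-grained parallel codes send many small messages, which is why latency matters so much for them.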

  7. Re:Teraops? by pclminion · · Score: 3, Informative
    Since when did Flops turn into ops? It's important to make a distinction between floating point operations and integer operations, right?

    Not really, for two reasons: first, supercomputer CPUs are rigged for floating point, and they do it really fast anyway. Second, a super CPU is so fast compared to RAM that the time difference between an integer op and a floating point op is almost totally amortized into the RAM access time anyway. In other words, computing a float multiplication might be 1.5 times slower than an integer multiplication, but it's still 200 times faster than a RAM access.

    Then you have to work out what exactly you mean by "operation" -- a single multiplication, or a single vector instruction (which might multiply 64 numbers in one shot). It quickly becomes difficult to judge performance based on some "flops" or "ops" number. To figure out performance it's better to just run the real application and see how fast it goes...
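
The amortization argument in the comment above works out as follows. This is a toy model, not a measurement: it uses the comment's own ratios (a float multiply 1.5 times slower than an integer multiply, and 200 times faster than a RAM access) and assumes one uncached RAM access per operation with no overlap between memory and arithmetic. All names are illustrative.

```python
INT_OP = 1.0                 # cost of an integer multiply, arbitrary units
FLOAT_OP = 1.5 * INT_OP      # "1.5 times slower" than the integer multiply
RAM_ACCESS = 200 * FLOAT_OP  # the float op is "200 times faster than a RAM access"


def cost_per_element(op_cost, ram_accesses=1):
    """Total cost of one operation plus the memory traffic that feeds it."""
    return op_cost + ram_accesses * RAM_ACCESS


int_total = cost_per_element(INT_OP)
float_total = cost_per_element(FLOAT_OP)
slowdown = float_total / int_total

print(f"integer: {int_total} units, float: {float_total} units")
print(f"float/integer slowdown once RAM is counted: {slowdown:.4f}x")
```

With memory included, the 50 percent gap between the two kinds of multiply shrinks to well under 1 percent, which is the commenter's point about "ops" versus "flops".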

  8. Re:Efficiency of Programming? by joib · · Score: 3, Informative

    No. They use standard compilers and tools for their respective architectures (that is Tru64 for Q and I guess Linux for Green Destiny). The applications are programmed using MPI, a FORTRAN/C/C++ message passing API which is an absolute bitch to program.

  9. Re:Green Destiny looks like an RLX cluster by Anonymous Coward · · Score: 1, Informative

    That's because it is. :) But RLX initially intended their hardware for web hosting applications. Green Destiny just integrates it differently (kind of like typical PCs are meant for non-supercomputer use, but Beowulf clusters integrate them differently).

  10. Re:Green Destiny looks like an RLX cluster by hippster · · Score: 3, Informative

    It IS a cluster of RLX Transmeta blades, each containing a 667MHz processor and 640MB memory connected by 100Mbit Ethernet. It's not meant to compete with "Q". It's simply a great departmental or workgroup cluster. However, its efficiency suggests it might be a concept worth exploring for future cluster supercomputing architectures. Hippster