Slashdot Mirror


Has Supercomputing Hit a Brick Wall?

anzha writes "Horst Simon, Deputy Director of Lawrence Berkeley National Laboratory, has stood up at conferences of late and said the unthinkable: supercomputing is hitting a wall and will not build an exaFLOPS HPC system by 2020. This is defined as one that passes linpack with a performance of one exaFLOPS sustained or better. He's even placed money on it. You can read the original presentation here."

32 of 185 comments

  1. It is tough by cold+fjord · · Score: 2

    You can't really make factor 10 improvements indefinitely. Eventually the numbers overwhelm you and you hit roadblocks. The only real solution will ultimately be new computing technology, such as quantum computers.

    --
    much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
    1. Re:It is tough by unixisc · · Score: 2

      The problem is extracting parallelism. What is there to stop one from building, say, Itanium-based MPP systems and tossing more CPUs into the mix, using either a unified memory architecture or a distributed memory architecture? The point is that it won't speed up computing beyond a certain point, simply because there isn't that much parallelism in most processes.
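
      For a rough sense of the limit described above, Amdahl's law gives the best-case speedup of a fixed-size job when only part of it can run in parallel. The comment does not name the law, so treat this as one common way to model the point; the 95% parallel fraction is an illustrative assumption, not a measured figure.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Best-case speedup of a fixed-size job under Amdahl's law."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part never shrinks; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the run time parallelizes perfectly.
    for cpus in (10, 1_000, 100_000):
        print(f"{cpus:>7} CPUs -> speedup {amdahl_speedup(0.95, cpus):.2f}x")
```

      With a 5% serial part the speedup can never reach 20x, however many CPUs are added.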

    2. Re:It is tough by OakDragon · · Score: 2, Interesting

      Why not just make a Beowulf cluster?

      Can you imagine?

    3. Re:It is tough by fisted · · Score: 3, Funny

      > Just yesterday there was another thread where someone was trying to suggest [...] instead of realizing [...].

      What?! Someone was wrong on the Internet?

    4. Re:It is tough by gman003 · · Score: 2

      Memory latency. Beowulf clusters are good for things that are highly parallel *and* have a high degree of memory locality, i.e. you rarely need to make memory calls between boxes.

      True supercomputers use high-speed interconnects between systems for this reason, usually using something like Infiniband or a weird proprietary system, and usually with some network topology with numerous inter-system links. This gives them much lower latency when one system uses data in memory in another system.

  2. No? by oGMo · · Score: 4, Informative

    "Japan to develop new exaflop computer by 2020" ... why not? And if it's even a few microseconds into 2021 I suppose that supercomputing has failed, will pack up, and go home.

    --

    Don't think of it as a flame---it's more like an argument that does 3d6 fire damage

    1. Re:No? by oGMo · · Score: 2

      Sure, but they're one of many. Even if one of the many doesn't accomplish this, surely another will. If not by (or before!) 2020, sometime later. People aren't just going to give up if it doesn't happen by some arbitrary date. This is my real point.

      These days, how much is really revolutionary anyway? So many new supercomputing announcements are "we threw N parts at this, so it's Yflops".

      --

      Don't think of it as a flame---it's more like an argument that does 3d6 fire damage

    2. Re:No? by gentryx · · Score: 5, Informative

      Power consumption and MTBF: power consumption (high operating costs) can perhaps be solved by a larger budget, but the mean time between failures (MTBF) means that the machine will fail before it can compute anything meaningful. Right now the machines we build and, even more importantly, the software we build rely on all parts of the machine functioning. If even a single node fails, then the data it holds becomes inaccessible and the rest of the compute job crashes like a house of cards.

      This can be remedied by taking frequent snapshots and then restarting from the last snapshot, but the time for checkpoint/restart has been growing continuously over the last few generations of systems. No one really expects exascale systems to do full system checkpoint/restart in a reasonable time frame. They'd spend more time taking snapshots than actually computing.
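
      To put rough numbers on that trade-off, here is a minimal sketch using Young's first-order approximation for the checkpoint interval, tau = sqrt(2 * checkpoint_time * MTBF). The comment does not name this formula, the checkpoint time and MTBF values are hypothetical, and the approximation only holds while the checkpoint time is much smaller than the MTBF.

```python
import math


def young_interval(checkpoint_s: float, mtbf_s: float) -> float:
    """Young's first-order estimate of the compute time between checkpoints."""
    return math.sqrt(2.0 * checkpoint_s * mtbf_s)


def overhead_fraction(checkpoint_s: float, mtbf_s: float) -> float:
    """Approximate fraction of wall time lost at Young's interval.

    First term: time spent writing checkpoints.
    Second term: expected work redone after a failure (half an interval
    on average, once per MTBF).
    """
    tau = young_interval(checkpoint_s, mtbf_s)
    return checkpoint_s / tau + tau / (2.0 * mtbf_s)


if __name__ == "__main__":
    checkpoint_s = 600.0  # hypothetical: 10 minutes to write a full snapshot
    for mtbf_hours in (24.0, 4.0, 1.0):  # hypothetical system-wide MTBF values
        lost = overhead_fraction(checkpoint_s, mtbf_hours * 3600.0)
        print(f"MTBF {mtbf_hours:>4.0f} h -> roughly {lost:.0%} of time lost")
```

      As the MTBF shrinks toward the checkpoint time, the lost fraction climbs quickly, which is the situation described above.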

      Source: I'm doing my PhD in supercomputing.

      --
      Computer simulation made easy -- LibGeoDecomp
    3. Re:No? by Forever+Wondering · · Score: 2

      You might need to broaden your research beyond what is available in the academic literature. Google handles redundancy. When they do a map/reduce, the clusters are self forming. If a cluster leader/master goes down, the cluster reelects a new master. They trust the integrity of nothing. Not even DRAM. They checksum everything. The actual architecture of Google's data centers is a closely guarded trade secret, but from what [little] I've been able to glean, they're light years ahead of "big iron" vendors such as Cray. Likewise for Amazon and [even] Facebook.

      Also, there are some systems in development where the individual compute cells are modeled on neural networks. This matters for power consumption: the cells use a bare fraction of the power of even the lowest-power cores (even Intel's Haswell/trigate), something like 100x less or better.

      You might be astonished by this, but you're not alone. Students that do PhD's in information search get to Google. They find out that the best knowledge they have is 10 years out-of-date compared to what Google does internally.

      --
      Like a good neighbor, fsck is there ...
  3. Re:Ha, not the first by ssam · · Score: 5, Insightful

    Moore's law only talks about transistor counts. Building a supercomputer means getting thousands of CPUs to cooperate, which is a much harder challenge.

    Anyone (with a large wallet) can stick an exaflop worth of CPUs in a large room. By 2020 you'll be able to do that with a not so large wallet. But that does not result in a useful exaflop computer.

  4. Clarke's Three Laws by Tokolosh · · Score: 5, Interesting

    Clarke's Three Laws are three "laws" of prediction formulated by the British writer Arthur C. Clarke. They are:

    1. When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2. The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
    3. Any sufficiently advanced technology is indistinguishable from magic.

    --
    Prove anything by multiplying Huge Number times Tiny Number
    1. Re:Clarke's Three Laws by Anonymous Coward · · Score: 2, Insightful

      So, since Freeman Dyson said "Faster-than-light travel is rubbish" that means he's probably wrong, and we'll be warping around the galaxy soon enough?

    2. Re:Clarke's Three Laws by Anonymous Coward · · Score: 2, Funny

      he should stick to building vacuum cleaners.

    3. Re:Clarke's Three Laws by tgd · · Score: 3, Funny

      You don't seem to understand the "concept" behind "warp."

      You are not exceeding the speed of light, you are just not traveling the linear distance between the two points.

      That's like saying that he doesn't understand the concept behind a Stargate. Made up is made up is made up.

      You can't have an honest discourse on the speed of light when you're trying to involve fiction. You might as well go full Star Trek and say that thetalon radiation transmorphs subspace and changes the value of C, but only in the presence of an extradimensional rift, and if-and-only-if you have a humpback whale.

  5. Re:Ha, not the first by wagnerrp · · Score: 2

    Right. When it comes to supercomputing, the network is just as important, if not more so, than the nodes it connects. Grid computers like the various @Home projects are far and away more powerful than anything on the TOP500 list, but that doesn't make them supercomputers.

  6. Re:Ha, not the first by fuzzyfuzzyfungus · · Score: 5, Insightful

    It's a particular nuisance because the speed of light is pretty strictly enforced...

    Even if you went full-on-nuts and replaced fiber interconnects with little tubes full of hard vacuum, to squeak out that slight improvement over the speed of light in glass or air, you'll still see latency that meaningfully hinders the cooperation of multi-GHz CPUs and RAM across systems of any nontrivial size.

    For loosely coupled problems, that barely matters; but not all problems are loosely coupled.

  7. I just woke up... by Kaenneth · · Score: 3, Funny

    And still a little fuzzy headed, but the first thing I thought of was arranging the racks for shortest maximum path, instead of one big football field sized room, stacking the datacenter into a cube shape... Then I thought, "That's probably why Borg ships are Cubes."

  8. Re:Happy Tuesday from The Golden Girls! by Meyaht · · Score: 2

    Cronglebaun

    --
    I believe in karma, which is why, when I do something bad to people, I assume they deserve it.
  9. Latency not as important as expected by gentryx · · Score: 3

    Latency isn't so much of an issue, though: the #1 systems of the last ~3 years all had torus networks (all Blue Genes, all Crays, K computer, too). These networks only perform well for nearest-neighbor communication -- which is fine since most codes running on these machines are simulation codes and they only need this type of communication. If you scale up the system, you'll typically also scale the size of the simulation instance (this is known as "weak scaling").

    This means that your program can still spend the same time waiting for the network as it could on a smaller machine. The cables do not need to become shorter.
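
    The weak-scaling argument above can be sketched with a toy model. It assumes a 3D simulation split into cubic subdomains that only exchange their six faces with nearest neighbors; the function names and sizes are made up for illustration, and per-message latency is ignored.

```python
def surface_to_volume(cells_per_side: float) -> float:
    """Boundary cells exchanged per cell computed, for a cubic subdomain."""
    return 6.0 * cells_per_side**2 / cells_per_side**3


def weak_scaling_ratio(local_side: float, nodes: int) -> float:
    """Weak scaling: every node keeps the same subdomain, so `nodes` drops out."""
    return surface_to_volume(local_side)


def strong_scaling_ratio(global_side: float, nodes: int) -> float:
    """Strong scaling: a fixed global cube is cut into ever smaller subdomains."""
    local_side = global_side / nodes ** (1.0 / 3.0)
    return surface_to_volume(local_side)


if __name__ == "__main__":
    for nodes in (8, 1_000, 64_000):
        weak = weak_scaling_ratio(100.0, nodes)
        strong = strong_scaling_ratio(1_000.0, nodes)
        print(f"{nodes:>6} nodes: weak {weak:.3f}, strong {strong:.3f}")
```

    Under weak scaling the communication-to-compute ratio stays flat as nodes are added; under strong scaling it grows, because each node's subdomain shrinks.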

    --
    Computer simulation made easy -- LibGeoDecomp
    1. Re: Latency not as important as expected by epiphyte(3) · · Score: 2

      I was on the architecture team for Cray & SGI MPP & ccNUMA machines in the 90s. afaik the first cray mpp (T3D) had the lowest barrier sync latency of any machine ever built, before or since. we could sync 512 nodes in less than a microsecond. turned out to be extremely expensive overkill from the pov of app algorithms. may not be so these days since the compute phases are so much quicker w.r.t the comms than they were back then.

  10. Re:Ha, not the first by fuzzyfuzzyfungus · · Score: 4, Insightful

    I'm no expert on the refined world of supercomputers; but my money would be on latency. If you are made of money, bandwidth is a problem that you can substantially brute force. Not 100% efficiently; and layout gets to be a real headache; but if the state of the art in serial interconnects isn't good enough, you can bolt a bunch of them together and have a parallel interconnect (it'll be harder to do board layout for, the wiring will suck more, and it'll cost more; but the major sticking point is money).

    If you want to cut latency, even the most exotic photonics-on-die-with-hollow-fiber arrangement imaginable still gives you surprisingly short distances before you start losing CPU cycles to waiting for the return photon.
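
    A back-of-the-envelope version of that point: the sketch below counts how many clock cycles pass while a signal makes a round trip over a given distance. The 3 GHz clock, the 30 m distance, and the refractive index of about 1.47 for ordinary glass fiber are illustrative assumptions; switching and serialization delays are ignored, so real latencies would be higher.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact SI value)


def round_trip_cycles(distance_m: float, clock_hz: float,
                      refractive_index: float = 1.0) -> float:
    """Clock cycles elapsed while a signal travels there and back."""
    signal_speed = C_VACUUM_M_PER_S / refractive_index
    round_trip_s = 2.0 * distance_m / signal_speed
    return round_trip_s * clock_hz


if __name__ == "__main__":
    clock_hz = 3.0e9   # assumed 3 GHz core clock
    distance_m = 30.0  # assumed distance across a machine room
    vacuum = round_trip_cycles(distance_m, clock_hz)
    fiber = round_trip_cycles(distance_m, clock_hz, refractive_index=1.47)
    print(f"vacuum: about {vacuum:.0f} cycles, glass fiber: about {fiber:.0f} cycles")
```

    Even the idealized vacuum case costs hundreds of cycles per round trip at machine-room distances.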

  11. Re:Ha, not the first by swillden · · Score: 3, Interesting

    building a supercomputer means getting thousands of CPUs to cooperate which is a much harder challenge.

    Looking at his presentation, that seems to be his point. He concludes that power efficiency is going to become the limiting factor driving design decisions, and that since the power cost of increasing FLOPS has been so much lower than the power cost of moving larger quantities of data we're heading into an era where connectivity costs will so dominate the cost of cycles that cycles will be essentially free.

    He's then basically arguing that it won't be cost-effective to build data transmission architectures that can effectively utilize exaflops, so no one will bother to build an exaflop machine.

    He didn't state it, but if the rest of his arguments are correct, perhaps we're going to see the definition of a new metric for HPC, one that somehow captures the ability of a machine to distribute data to its computation nodes.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  12. The Nanosecond by wcrowe · · Score: 4, Interesting

    Back in the early 80's I got the opportunity to hear Grace Hopper speak. One of the stories she used to like to tell at her talks was about the time that she was having trouble visualizing a nanosecond. Eventually she sent a memo to her engineers which said, "Please send up one nanosecond." She waited, curious as to how they would respond. After a couple of days a response came back in the form of a metal rod 11-3/4 inches in length with the note attached, "One Nanosecond", and no other explanation. After puzzling over the metal rod she called down to the engineering department and asked, "I give up, what is it"? "That's the distance light travels in a nanosecond", was the response. Later, she sent another memo to the engineers with the request, "Please send up one picosecond." The engineers immediately responded with a memo instructing her to, "put the nanosecond in a pepper grinder and you can make picoseconds all over your desk."

    Grace Hopper's humorous anecdote underlines the serious problems faced by researchers when they push the boundaries. In her case, it was a real concern over how far a bit can travel at the speed of light. I have no idea if that has any bearing on the exascale problem, but it might illustrate the kinds of problems they might be running into.
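
    The rod length in the anecdote is easy to check. This small sketch converts a time interval into the distance light covers in vacuum; the function name is made up for illustration.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact SI value)
METERS_PER_INCH = 0.0254          # exact by definition of the inch


def light_distance_inches(seconds: float) -> float:
    """Distance light travels in vacuum during `seconds`, in inches."""
    return C_VACUUM_M_PER_S * seconds / METERS_PER_INCH


if __name__ == "__main__":
    print(f"nanosecond: {light_distance_inches(1e-9):.2f} in")
    print(f"picosecond: {light_distance_inches(1e-12):.4f} in")
```

    The nanosecond comes out a little under a foot, close to the 11-3/4 inches quoted, and a picosecond is about a thousandth of that, hence the pepper grinder.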

    --
    Proverbs 21:19
  13. so what? by markhahn · · Score: 4, Insightful

    I'm an HPC professional, and do not see much value in these "hero" machines. Yes, you can go on all you want about the march of progress and tier-1 and grand challenges, but you're just reiterating an unquestioned manifest destiny-based view of history. Why do we need an Exaflop machine? is it because some particular set of applications need it? where is the threshold for those applications where the compute facility will be fast enough to achieve some breakthrough?

    it's hard to find areas that are primarily limited by compute facilities. for instance, genetics/proteomics/metabolomics/whatever are *not* compute-limited, especially at the high end. they're laboratory-limited, the same way weather simulations are good and getting better, but not past the quality of their input data.

    we need more compute in general, but not necessarily in one machine. a single exaflop machine will cost much more than a thousand petaflop machines. letting a thousand flowers bloom is much prettier than one excruciatingly beautiful flower...

    and no, hero machines do not provide an efficient way to improve the tech of lesser or later machines. they have to be justified by their own need.

    1. Re:so what? by iggymanz · · Score: 2

      you are silly. systems biology modeling of cells will require exascale computing, so will simulations in chemistry of millions or more atoms for a hundredth of a second or more. Lattice simulations for physics are demanding them too.

    2. Re:so what? by Nite_Hawk · · Score: 3, Insightful

      I'm an HPC professional too.

      I don't totally disagree with your premise, but what the heck are you doing talking about genetics and proteomics in reference to giant supercomputers? If you know anything about proteomics codes, you know that the commonly used search engines like sequest and mascot were never designed to run on systems like that. Hell, they barely run on small clusters and yet people are getting enough science done that they just don't care. That doesn't mean that it's hard to find problems that need supercomputers though.

      If you want to talk about the really big systems, you are talking about things like nuclear weapons simulations, astrophysics, molecular dynamics, and quantum mechanics. There are only a handful of guys that will actually make really good use of those systems and scores of folks that would otherwise be perfectly fine running on significantly smaller ones. Having smaller jobs backfill on the big machines when the really hardcore guys are off doing something else isn't such a bad situation though. It lets you get the big science done and still keep the machines being used efficiently in the interim.

      Beyond that, just because some researchers aren't scaling their codes to those levels yet doesn't mean we should give up on big systems. There will always be people pushing the envelope and others playing catch up. Our job is to help the slow guys scale their codes when possible so they can do even better and more intensive science. Yes, not all problems require the big systems, but there are many that do, many that can be made to scale even when they don't appear to at first, and others that can serve as backfill to keep the systems busy. They have their place just as smaller clusters, cloud resources, and big data resources do.

    3. Re:so what? by angel'o'sphere · · Score: 2

      The problem itself can't. However, you can solve many problems of the same kind at the same time in parallel. (That is actually what most supercomputers do these days.)

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  14. Re:If you ignore the best news in supercomputing . by Anonymous Coward · · Score: 3, Insightful

    Even if you ignore all the controversy over D-Wave's system and its nature, and take it all at face value, it is still only applicable to a narrow class of problems. CMOS or not, it amounts to something similar in principle to an ASIC. It is no surprise that a custom-built chip can solve a specific class of problems orders of magnitude faster than a general-purpose processor. This was slightly more popular for a while in the 80s, when a few custom computers were built that were specifically designed for doing things like orbital calculations. And it pops up every so often, like custom chips for playing chess, and now Bitcoin mining chips. That is great for a small computer, but when your price gets into the millions or billions of dollars, the people bankrolling it will probably want to build a system that can be used for a wider class of problems even if it means running slower.

  15. Really? by Murdoch5 · · Score: 2

    I'm pretty sure at one point, someone stood up in a meeting and said "No one will ever make a 1MB memory chip" or "No one will ever achieve a 64 bit processor", so how about sit down and just wait.

    1. Re:Really? by ebno-10db · · Score: 2

      I'm pretty sure at one point, someone stood up in a meeting and said "No one will ever make a 1MB memory chip" or "No one will ever achieve a 64 bit processor", so how about sit down and just wait.

      First, the author of the presentation didn't say we'd never get to Exaflops, just that it might take longer than anticipated. Second, the fact that some technologies have scaled incredibly well doesn't mean that all technologies do or that there are no limits. Chips are perhaps history's greatest example of a technology that scales well. However, we were also supposed to have flying cars and visit Jupiter by 2001. Sometimes the limits are practical rather than strictly technical. SSTs were designed and built in the 60's (Concorde) and more were being designed in the 70's, but they turned out not to be worth the cost. I'm anything but a technology pessimist, but I'm old enough to have seen lots of predictions not materialize, or just take much longer than expected (in the 60's they said we'd have flat screen TV's by the 70's).

    2. Re:Really? by Xyrus · · Score: 2

      You seem to be forgetting about the laws of physics. In fact, we are already hitting them. You can't shrink transistors much more or you get slapped with Schrodinger's cat. The interconnects are already using fiber optics. You can only put machines so close to one another. So on and so forth.

      When people have made claims before, it was due to either their idea of market forces or the limits of the current technology. Now, the actual physical limits are beginning to present roadblocks. Even if quantum computing becomes an everyday thing by 2020, you still have to get data to the QPU which still requires a speed-of-light limited data transfer to every node running the computation.

      The problem isn't processing power or memory or even disk space. It's latency, and that is limited by the speed of light.

      --
      ~X~
  16. Re:Moores Law... by ebno-10db · · Score: 2

    If Intel can cut the power to its 'big iron' cpu's (the 4/6/8 core chips), then just increasing the number of processors in supercomputers from 10,000 to 100,000 will give you a 10x increase in speed while using the same or less power. ... An 80x increase at the same size/power as what we have now puts us into exaflops range.

    RTFA. Flops are easy. The scaling problem is data links between nodes.