Slashdot Mirror


Breaking Supercomputers' Exaflops Barrier

Nerval's Lobster writes "Breaking the exaflops barrier remains a development goal for many who research high-performance computing. Some developers predicted that China's new Tianhe-2 supercomputer would be the first to break through. Indeed, Tianhe-2 did pretty well when it was finally revealed — knocking the U.S.-based Titan off the top of the Top500 list of the world's fastest supercomputers. Yet despite sustained performance of 33 petaflops to 35 petaflops and peaks ranging as high as 55 petaflops, even the world's fastest supercomputer couldn't make it past (or even close to) the big barrier. Now, the HPC market is back to chattering over who'll first build an exascale computer, and how long it might take to bring such a platform online. Bottom line: It will take a really long time, combined with major breakthroughs in chip design, power utilization and programming, according to Nvidia chief scientist Bill Dally, who gave the keynote speech at the 2013 International Supercomputing Conference last week in Leipzig, Germany. In a speech he called 'Future Challenges of Large-scale Computing' (and in a blog post covering similar ground), Dally described some of the incredible performance hurdles that need to be overcome in pursuit of the exaflops barrier."

13 of 96 comments

  1. Re:Barrier? by Tastecicles · · Score: 2

    yeah, strange harmonics and shit, and word around the cooler is that it would require an infinite amount of energy as well... that or set the atmosphere on fire or some shit.

    --
    Operation Guillotine is in effect.
  2. NVIDIA's bread and butter long term by storkus · · Score: 2

    My takeaway from reading this and the blog post is that, while NVIDIA may consider graphics to be their bread & butter, it looks like they're looking at this space (HPC) very seriously in the long term--perhaps they even think they can dominate it. This is a big difference from the other players: IBM isn't bothering to throw POWER at it, and AMD/ATI is only present on older machines; ATI in particular seems more interested in going after the mobile space rather than HPC. I don't know what to make of Intel other than they know they're the choice for the non-GPU side and are at the top of their game.

    One problem I see is that NVIDIA is still a fabless house and has performance limitations tied to whatever fab they partner with; perhaps this is why they downplay process gains in the blog post.

    Of course, if the conspiracy theorists are to be believed, NSA and friends already have this 10-years-into-the-future technology...

    1. Re:NVIDIA's bread and butter long term by ebno-10db · · Score: 2

      Of course, if the conspiracy theorists are to be believed, NSA and friends already have this 10-years-into-the-future technology...

      I heard 20 years - they're still learning stuff from the Roswell crash.

    2. Re:NVIDIA's bread and butter long term by HuguesT · · Score: 2

      Actually Intel is pretty much the king of the hill at the moment for HPC. They don't have a "GPU" solution, but they do have a massively parallel CPU + PCIe compute card available called the "Xeon Phi". Extremely confusing, yet this is what the current fastest supercomputer uses.

      http://www.datacenterdynamics.com/focus/archive/2013/06/xeon-phi-powered-supercomputer-tops-top500

      Xeon Phi is easier to deal with than Nvidia's GPU solution, essentially because it is currently much easier to program.

      http://goparallel.sourceforge.net/independent-test-xeon-phi-shocks-tesla-gpu/

  3. Re:Barrier? by holmstar · · Score: 3, Insightful

    I'm sure the same sort of things were said about a petaflop machine, back in the day. Doesn't make exaflop a barrier. Just an engineering challenge, like every other bleeding-edge supercomputer has been.

  4. Re:Department of Energy secret supercomputer by elfprince13 · · Score: 2

    Dunno about "top secret", but the DoE puts a huge amount of computing resources into physical simulation. Check out some of the NERSC projects (GTC, for example).

  5. Re:Barrier? by Zargg · · Score: 3, Interesting

    I'm pretty sure the parent is questioning why the word "barrier" is used instead of something like "milestone", which I would have chosen. A barrier implies there is something special stopping you there that you need to work around or resolve, but milestone is just a convenient number to stop at, as in this case. I see no difference between passing exaflop and say 0.9 exaflop, since both require "a really long time, combined with major breakthroughs in chip design, power utilization and programming", so it isn't a barrier, just a convenient number.

  6. Re:Mea Culpa by ebno-10db · · Score: 4, Interesting

    Should have used "tera" in place of "giga"

    I'm getting tired of all the prefixes, couldn't we just use scientific notation? 1e18 flops means a lot more to me than exaflop.

  7. Re:Has this been turned into another pissing conte by CODiNE · · Score: 4, Interesting

    Well I don't know anything at all about nuclear simulations and fluid dynamics modeling...

    But for pure benefit to mankind I'd say folding@home is a pretty worthy project. It's been running for years and has helped make actual discoveries and raised understanding of protein folding's effects.

    According to Wikipedia it was running at 14 petaflops when last updated. Would taking that up to an exaflop be a huge benefit? You bet!

    How about being able to simulate an entire life cycle of a human body at atomic scale? That would gain us tremendous understanding of well... EVERYTHING.

    Most definitely there are worthy projects that have a real need for exaflop computing and it's not a waste of time.

    You remind me of my friend who years ago said that his 802.11b wireless network was as fast as he'd ever need. Guess he didn't plan on people watching multiple HDTV streams throughout the house.

    --
    Cwm, fjord-bank glyphs vext quiz
  8. Re:That'd be quite a piss! by timeOday · · Score: 2

    So if we JUST put roughly 30 of the Tianhe-2s or 500,000 nodes with 100,000,000 computing cores in one big system, we'd have our exascale computer!

    Actually, no, that's the problem/challenge... linking 30 Tianhe-2s would make a supercomputer that is only slightly faster than a single Tianhe-2, because the cores would mainly be sitting idle due to communication latency. Granted this is not true for computations that are completely parallel (e.g. cracking passwords) but that is NOT what "exaflop" means; it means an exaflop on a scientific computing benchmark.
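    A rough way to see why 30 linked machines need not give 30 times the speed is Amdahl's law. This is a swapped-in textbook model, not something from the comment itself: it assumes the share of the work stalled on communication behaves like a fixed non-scaling fraction, and the fractions below are made-up illustrations, not Tianhe-2 measurements.

```python
def amdahl_speedup(parallel_fraction, n_machines):
    """Amdahl's law: speedup on n machines when only part of the work scales."""
    # Assumption: the non-scaling share stands in for time lost to communication.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_machines)


if __name__ == "__main__":
    # Hypothetical fractions: fully parallel, mostly parallel, communication-bound.
    for p in (1.0, 0.9, 0.1):
        print(f"parallel fraction {p:.1f}: {amdahl_speedup(p, 30):.2f}x on 30 machines")
```

    Only the fully parallel case (the password-cracking kind of workload) gets close to 30x; once most of the time goes to waiting on other nodes, the speedup collapses toward 1x.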

  9. Re:Information != benefit by alexandre_ganso · · Score: 2

    Huge supercomputers have the advantage that they are efficient compared to projects such as those running "@home", and their interconnects allow them to solve problems that need strong communication between the computing elements. Such problems cannot be solved efficiently by this "@home" model, where a machine receives a work unit, computes it and returns the result for final aggregation.
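    The work-unit model described above can be sketched in a few lines. This is a minimal illustration, not how any real "@home" project is implemented: the function names, the sum-of-squares workload and the thread pool standing in for volunteer machines are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_work_unit(unit):
    """Process one independent work unit; it never talks to other units."""
    start, stop = unit
    return sum(i * i for i in range(start, stop))


def run_project(total, unit_size, workers=4):
    """Split the job into work units, farm them out, then aggregate the results."""
    units = [(s, min(s + unit_size, total)) for s in range(0, total, unit_size)]
    # Hypothetical stand-in: each pool worker plays the role of a volunteer machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(compute_work_unit, units))
    return sum(partials)  # final aggregation on the "server"
```

    This only works because no unit needs another unit's intermediate data. A simulation where every step depends on neighbouring cells would have to exchange data constantly, which is where the interconnect comes in.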

    Those interconnects can sum to as much as half the price of building a supercomputer.

    When you mention the rising environmental costs, I suspect you mean the carbon footprint caused by the energy consumed in manufacturing and operating those machines. The costs are not negligible, granted, but they are probably not as big as those caused by the cars of the thousands of scientists who use such machines :-) This is especially true in the US, where cars are horribly inefficient, public transport from the suburbs to research centers is spotty and distances are large.

    I understand that these environmental costs are much smaller than the benefits given by the use of such machines. Remember that supercomputers are used to simulate things such as nuclear explosions, ballistics and radioactive decay. The costs for the environment are certainly lower than those of blowing atomic bombs around! Not to mention the gains in health research, for example.

    So, yes, there is a HUGE demand for such behemoths, and they are much better than the alternative.

  10. Re:Mea Culpa by Blaskowicz · · Score: 2

    1 exponentiated to the 18th power is still 1.
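    The joke turns on reading "1e18" as 1 raised to the 18th power instead of 1 times 10 to the 18th. A small sketch of the notation, using only the prefixes mentioned in this thread; the helper name `flops` is made up for illustration, and the 33 petaflops figure is the lower sustained number from the summary.

```python
# SI prefix -> power of ten (only the prefixes that come up in this thread).
PREFIX_EXPONENT = {"giga": 9, "tera": 12, "peta": 15, "exa": 18}


def flops(value, prefix):
    """Convert a prefixed figure such as (33, "peta") to plain flops as an int."""
    return value * 10 ** PREFIX_EXPONENT[prefix]


exaflop = flops(1, "exa")
assert exaflop == 10 ** 18 == int(1e18)  # scientific notation: 1 x 10^18
assert 1 ** 18 == 1                      # the other reading: 1 to the 18th power

tianhe2_sustained = flops(33, "peta")
print(exaflop // tianhe2_sustained)  # prints 30
```

    The last line is where the "roughly 30 of the Tianhe-2s" figure quoted elsewhere in the thread comes from.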

  11. Xeon Phi vs GPU by Ottibus · · Score: 2

    The advantage of Xeon Phi cards is that parallelization on those cards works much like classical parallelization on supercomputers.

    Not really, no. Classic supercomputers were vector machines whereas Xeon Phi is wide SIMD.

    You just use MPI

    MPI is equally applicable to GPU or Xeon Phi; it operates at a level above the raw computation. In both cases you have controlling CPUs with accelerators attached (GPU in one case, Xeon Phi in the other). MPI is used to manage the data flow between these units but has little to do with the architecture of those units themselves.

    For GPUs, on the other hand, you have to adapt a lot of code.

    You have to adapt code either way:

    For GPU you express the problem as a scalar kernel that is executed in parallel. You have to make sure that the work doesn't overlap but you only have to consider one element at a time.

    For SIMD you break your problem into SIMD-width chunks that are computed in parallel. It is easier to synchronise operations but you have to fit the problem into chunks of the right size.

    Xeon Phi has an advantage where you have existing SIMD code (e.g. SSE), but if you are starting from scratch then there is no clear winner. And HPC code is increasingly being written in languages like OpenCL and CUDA which are designed for GPU rather than SIMD.
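    The two styles described above can be mimicked in plain Python to show how the code is shaped differently. This is only a structural sketch: nothing here runs in parallel, the SAXPY-style workload and all names are made up for illustration, and the width of 8 is an arbitrary assumption rather than a property of any particular chip.

```python
def saxpy_kernel(i, a, x, y, out):
    """Kernel style: the code describes one element; i plays the thread index."""
    out[i] = a * x[i] + y[i]


def run_kernel_style(a, x, y):
    out = [0.0] * len(x)
    for i in range(len(x)):  # on a GPU, each i would be its own thread
        saxpy_kernel(i, a, x, y, out)
    return out


SIMD_WIDTH = 8  # hypothetical vector width, chosen only for the example


def run_simd_style(a, x, y, width=SIMD_WIDTH):
    """SIMD style: process width-sized chunks, then mop up the leftovers."""
    n = len(x)
    out = [0.0] * n
    full = n - n % width
    for base in range(0, full, width):
        # One "vector instruction" worth of work on a full-width chunk.
        xs, ys = x[base:base + width], y[base:base + width]
        out[base:base + width] = [a * xv + yv for xv, yv in zip(xs, ys)]
    for i in range(full, n):  # scalar remainder loop for the partial chunk
        out[i] = a * x[i] + y[i]
    return out
```

    The kernel version never mentions chunk size; the SIMD version has to deal with the remainder explicitly, which is the "fit the problem into chunks of the right size" cost.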