Slashdot Mirror


LLNL/RPI Supercomputer Smashes Simulation Speed Record

Lank writes "A team of computer scientists from Lawrence Livermore National Laboratory and Rensselaer Polytechnic Institute have managed to coordinate nearly 2 million cores to achieve a blistering 504 billion events per second, over 40 times faster than the previous record. This result was achieved on Sequoia, a 120-rack IBM Blue Gene/Q normally used to run classified nuclear simulations. Note: I am a co-author of the coming paper to appear in PADS 2013."


  1. rPi is different from RPI by stewsters · · Score: 5, Funny

    Was I the only one who thought for a second that this was about a Raspberry Pi cluster?

    1. Re:rPi is different from RPI by DrData99 · · Score: 2, Funny

      Yes.

    2. Re:rPi is different from RPI by Anonymous Coward · · Score: 2

      No

    3. Re:rPi is different from RPI by TheAngryMob · · Score: 2

      No. For us old-timers, RPI stands for Rockwell Protocol Interface.

      http://encyclopedia2.thefreedictionary.com/Rockwell+Protocol+Interface

      POS Modems....

      --

      Don't just game, Dungeoneer
    4. Re:rPi is different from RPI by RabidReindeer · · Score: 2

      They didn't have acronyms back in the 19th century, silly.

      I always thought that acronyms were invented by IBM.

      They used so many of them that the same 3 letters often applied to 5 different products. At the same time.

  2. Only Warp 2.7? by maxwell+demon · · Score: 2

    I was already running Warp 3 in 1995! :-)
    (OS/2 Warp 3, to be exact)

    --
    The Tao of math: The numbers you can count are not the real numbers.
    1. Re:Only Warp 2.7? by olsmeister · · Score: 2

      At present, we are now at Warp Speed 2.7. It will be nearly 150 years before we expect to reach Warp Speed 10.0.

      And then, Delta Quadrant here we come!

  3. can you put the paper online? by Trepidity · · Score: 5, Insightful

    Note: I am a co-author of the coming paper to appear in PADS 2013.

    I clicked hoping to read the paper, but the actual paper doesn't seem to be posted, only the abstract. The ACM copyright policy explicitly allows authors to "Post the Accepted Version of the Work on ... the Author's home page", so there is no legal barrier to the authors putting a PDF online. Doing so would of course increase readership of the paper, so ought to benefit everyone.

    1. Re:can you put the paper online? by Lank · · Score: 5, Informative

      I didn't realize that it was acceptable to post it before the conference even happened. But you're right so here it is.

      --
      Gotta get me one of these!
    2. Re:can you put the paper online? by Trepidity · · Score: 2

      Thanks! My own policy is that I don't post draft or submitted versions, but once something is finalized (camera-ready final copy as it's going to appear in the proceedings), I'll post the PDF online.

      One plus side for those who care about such things is that it'll get into Google Scholar faster—GS is surprisingly good at picking these PDFs up in its crawls and figuring out how to index them.

  4. Re:Simulation of what? by Bill,+Shooter+of+Bul · · Score: 4, Funny

    No, those events are Who. Simulating is How. What is calculated.

    --
    Well.. maybe. Or Maybe not. But Definitely not sort of.
  5. Re:Simulation of what? by flayzernax · · Score: 3, Funny

    Cats.

  6. Re:keys still safe from brute force by SuricouRaven · · Score: 2

    These are simulation events. The summary doesn't say what the events are, but they are probably more complicated than just testing a key.

    Besides, brute-forcing a key wouldn't be best done on general-purpose CPUs or even GPUs. An ASIC would be the fastest, and you can be confident such chips would be easily within the capability of any major and a lot of not-so-major governments. So you're looking at a chip that can do, as a back-of-the-envelope estimate, a key every cycle, clocked at 1.2GHz - standard for a lot of systems, as a sort of performance-per-watt peak. Times 64 cores per chip, times eight chips per PCI-e card, times eight processor cards per 2U case, times 42/3=14 systems per rack (leave space for cooling and switch), that's 1.2 * 64 * 8 * 8 * 14 = 68812.8 GK/s per rack.
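    The estimate above is easy to re-run with different guesses. This is only a sketch of that arithmetic: every constant is the commenter's assumption rather than a measured figure, and the `seconds_to_exhaust` helper and its uniform full-keyspace search are additions for illustration.

```python
# Back-of-the-envelope throughput for the hypothetical ASIC rack in the comment.
# All constants are the commenter's guesses, not real hardware specifications.
CLOCK_GHZ = 1.2            # assumed: one key tested per core per cycle
CORES_PER_CHIP = 64
CHIPS_PER_CARD = 8
CARDS_PER_CASE = 8
CASES_PER_RACK = 42 // 3   # 42U rack, 3U budgeted per 2U case (cooling, switch)


def rack_gigakeys_per_second():
    """Keys tested per second by one rack, in units of 10^9 keys (GK/s)."""
    return (CLOCK_GHZ * CORES_PER_CHIP * CHIPS_PER_CARD
            * CARDS_PER_CASE * CASES_PER_RACK)


def seconds_to_exhaust(key_bits, racks=1):
    """Worst-case time to try every key of the given size (illustrative helper)."""
    keys_per_second = rack_gigakeys_per_second() * 1e9 * racks
    return 2 ** key_bits / keys_per_second


if __name__ == "__main__":
    print(f"{rack_gigakeys_per_second():.1f} GK/s per rack")
    for bits in (56, 64, 128):
        print(f"{bits}-bit keyspace, one rack: {seconds_to_exhaust(bits):.3e} s")
```

    Each extra key bit doubles the search time, so adding racks only buys a few bits of keyspace.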

  7. This could be good... by aussie.virologist · · Score: 4, Interesting

    I'd be interested in seeing if this system could run our full Poliovirus simulations (consisting of around 3.5 million atoms). I've run our simulations on the BlueGene/Q at VLSCI using 32,768 cores (65,536 threads) and have been getting a very respectable 11.2 nanoseconds per day of simulation data using NAMD. Some data on our full virus simulations can be found here... (VIDRL supercomputer simulation page). Hey Lank, maybe you can help me figure out a way to crack the millisecond mark for our full-virus sims??? Great work and cheers from down under :-)
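    For a sense of how far the millisecond mark is from 11.2 nanoseconds per day, here is a rough calculation. It assumes the quoted rate stays constant for the whole run, which is a simplification; the function name is made up for this sketch.

```python
# How long would a 1 ms trajectory take at the quoted 11.2 ns/day?
# Assumes a constant simulation rate; names are illustrative only.
NS_PER_DAY = 11.2          # rate quoted in the comment (NAMD on 32,768 cores)
NS_PER_MS = 1_000_000      # 1 millisecond = 10^6 nanoseconds


def days_to_simulate(target_ns, ns_per_day=NS_PER_DAY):
    """Wall-clock days needed to simulate target_ns of model time."""
    return target_ns / ns_per_day


days = days_to_simulate(NS_PER_MS)
years = days / 365.25

if __name__ == "__main__":
    print(f"{days:,.0f} days, about {years:.0f} years of wall-clock time")
    print(f"speed-up needed to finish in one year: about {years:.0f}x")
```

    At this rate a millisecond is centuries of wall-clock time away, so reaching it takes a speed-up of a few hundred times just to fit in a year.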

    1. Re:This could be good... by DJefferson · · Score: 2

      I would think that the macroscopic behavior of 3.5 million atoms in (poly)crystals or in fluid or plasma states is within the capability of Sequoia. That's about 2 atoms per core and per GB of RAM. But the complex dynamics of proteins, DNA, RNA, and any other complex polymers that comprise the polio virus interacting with, say, a cell membrane, are still probably out of reach for accurate calculation in a reasonable amount of time.

    2. Re:This could be good... by aussie.virologist · · Score: 2

      Agreed, at this point we are looking at virus dynamics in response to drug binding events and gross alterations in conformational structure in response to significant changes in temperature and ionic content. So for these simulations, the longer the better. I dream of a day when we can model complex host cell interactions, and hopefully I will be a grey-bearded old man still full of enthusiasm when these sorts of simulations are considered "run of the mill". Your work helps to keep me excited about the future of HPC and how it can benefit not only my research, but humanity's understanding of the world as a whole. Cheers.

    3. Re:This could be good... by aussie.virologist · · Score: 2

      Hey thanks "ratbag" for your kind words. The work that Barnes et al. are doing is so important for researchers like us. It opens the door for us to answer questions in a manner that even 5 years ago was considered "ambitious" to say the least. I am very lucky to be in a position where I have access to resources that allow me to explore new ways of answering some very old questions about how viruses behave, with the added bonus that we may hopefully be able to contribute to making the world just a little bit better. Fingers-crossed.

      "jkflying", I started off by working in electronics engineering when I left school. Funnily enough, I was running a company with some friends designing and building robotics systems, mainly focusing on animatronics. I wanted to start using my robotics background to work in the development of prosthetic limbs, but ended up changing the focus of my undergrad from anatomy and physiology to pathology, specifically microbiology with a lot of biochemistry thrown in. My post-grad was in computational biology. I actually started doing the simulation work after playing around with the tutorials on the VMD/NAMD website at the University of Illinois. I would recommend doing them; it's great nerdy fun and it gets you thinking about the different ways that you can apply the techniques.

      Have a great day:)

  8. It was an LLNL supercomputer, not an RPI supercomp by DJefferson · · Score: 4, Informative

    The title to this piece is wrong. The supercomputer in question was Sequoia, the Blue Gene/Q supercomputer located at Lawrence Livermore National Laboratory. Some preliminary work was done on a smaller RPI BG/Q machine, however. (I am a coauthor of the paper.)

  9. Re:what OS please? by DJefferson · · Score: 4, Informative

    It runs a custom IBM OS specifically designed for Blue Gene/Q. It provides an API very similar to Linux, but with some restrictions, e.g. static limits on threads, no process forking, and custom MPI messaging instead of a TCP/IP stack.

  10. Re: Simulation of what? by DJefferson · · Score: 3, Informative

    The simulation was a well-known parallel discrete event benchmark called PHold. It is not a model of any particular physical system, but is more of a stress and scalability test for the simulator, in this case the ROSS simulator developed at RPI. PHold has particularly fine-grained events, which stress the synchronization mechanism known as Time Warp, implemented in ROSS with support for reverse computation. It stresses the scalability of the Global Virtual Time commitment mechanism (used for I/O, error detection, storage management, and termination detection). And because PHold has no locality in its communication, it greatly stresses the underlying communication layer, MPI. The general idea is that a simulator that can achieve high performance on PHold at very large parallel scale can achieve high performance on just about any realistic, load balanced discrete event simulation at that scale.
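    To make the shape of the benchmark concrete, here is a minimal sequential sketch of a PHold-style workload. It is not ROSS and involves no Time Warp, reverse computation, GVT, or MPI. It only shows the event pattern: a fixed population of events, each of which, when handled, schedules one new event at a random logical process (LP) with no locality. The function name, parameters, and the exponential delay plus lookahead are assumptions made for this illustration.

```python
import heapq
import random


def run_phold(num_lps=8, events_per_lp=4, end_time=100.0,
              mean_delay=1.0, lookahead=0.1, seed=0):
    """Sequential PHold-style loop; returns (events handled per LP, events pending)."""
    rng = random.Random(seed)
    pending = []   # min-heap of (timestamp, sequence number, destination LP)
    seq = 0        # tie-breaker so equal timestamps pop in creation order

    def schedule(now):
        nonlocal seq
        dest = rng.randrange(num_lps)   # uniform destination: no locality
        stamp = now + lookahead + rng.expovariate(1.0 / mean_delay)
        heapq.heappush(pending, (stamp, seq, dest))
        seq += 1

    # Seed the fixed event population.
    for _ in range(num_lps * events_per_lp):
        schedule(0.0)

    handled = [0] * num_lps
    while pending and pending[0][0] <= end_time:
        now, _, lp = heapq.heappop(pending)
        handled[lp] += 1   # the "work" of a PHold event is nearly nothing
        schedule(now)      # each handled event creates exactly one successor
    return handled, len(pending)


if __name__ == "__main__":
    handled, still_pending = run_phold()
    print("events handled per LP:", handled)
    print("events still pending:", still_pending)
```

    Because each event does almost no work and may land on any LP, a parallel run spends most of its effort on synchronization and messaging, which is why the benchmark stresses the simulator rather than the model.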

  11. Re: Not that impressive - just running a benchmark by DJefferson · · Score: 2

    I have to disagree. PHold was not designed to run well under Time Warp. It was designed as a stress test for any parallel discrete event simulator, whether based on Time Warp or not, and in particular originally to compare optimistic to conservative synchronization algorithms. Also, Sequoia is much less biased toward regular geometry continuum simulations than other world-class supercomputers. It has no GPUs, for example. Machines of this class will be used more and more in the future for discrete simulations such as network models, or agent-based models, or for huge data problems, or for mixed continuous-discrete models such as of the power grid.

  12. Re:Fast money by tehcyder · · Score: 2

    It's good to see that you've thought this through properly.

    --
    To have a right to do a thing is not at all the same as to be right in doing it