Slashdot Mirror


1.21 PetaFLOPS (RPeak) Supercomputer Created With EC2

An anonymous reader writes "In honor of Doc Brown, Great Scott! Ars has an interesting article about a 1.21 PetaFLOPS (RPeak) supercomputer created on Amazon EC2 Spot Instances. From HPC software company Cycle Computing's blog, it ran Professor Mark Thompson's research to find new, more efficient materials for solar cells. As Professor Thompson puts it: 'If the 20th century was the century of silicon materials, the 21st will be all organic. The question is how to find the right material without spending the entire 21st century looking for it.' El Reg points out this 'virty super's low cost.' Will cloud democratize access to HPC for research?"

14 of 54 comments

  1. 1.21 PetaFLOPS (RPeak) by serviscope_minor · · Score: 4, Informative

    1.21 PetaFLOPS (RPeak)

    Getting RPeak high is simply a matter of getting enough computers which you have access to. They could be connected by TCP/IP over pigeons or PPP over two tin cans and a piece of wet string.

    Basically getting a high RPeak on EC2 requires the following procedure:
    1. Pay a fuck load of money
    2. Create new instance.
    3. Goto 2.

    Basically this article translates to "Amazon has a lot of computers and this guy rented out a bunch of them at once".

    Which I'm sure is good for his research, which must be of the very parallelizable type. I have done such stuff too in the past and it's nice when you have it.
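For readers new to the term: Rpeak is a theoretical peak, conventionally computed as cores times clock rate times floating-point operations per cycle, so the interconnect never enters the formula. A minimal sketch of that arithmetic follows; the fleet size, clock rate, and FLOPs-per-cycle figures are illustrative assumptions, not numbers from the article.

```python
def rpeak_flops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: every core retiring its maximum FLOPs on every cycle."""
    return cores * clock_hz * flops_per_cycle


# Hypothetical fleet (assumed values): 100,000 cores at 2.5 GHz,
# 8 double-precision FLOPs per core per cycle.
fleet = rpeak_flops(cores=100_000, clock_hz=2.5e9, flops_per_cycle=8)
print(f"{fleet / 1e15:.2f} PFLOPS")

# Doubling the core count doubles Rpeak, however the nodes are networked.
bigger_fleet = rpeak_flops(cores=200_000, clock_hz=2.5e9, flops_per_cycle=8)
```

Measured performance on a benchmark (Rmax) is a separate number and does depend on how well the nodes communicate.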

    --
    SJW n. One who posts facts.
    1. Re:1.21 PetaFLOPS (RPeak) by fuzzyfuzzyfungus · · Score: 5, Informative

      The one (slightly) novel aspect of this, presumably also made possible because the workload parallelized well, is the use of Spot Instances. As the name suggests, these aren't Amazon's standard fixed-price instances, but rather instances whose price changes according to demand.

      You make a bid (specifying maximum price/hour, number and type of instances, availability zones, etc.). If the spot price is at or below your maximum, your instance starts running. Should it exceed your maximum, your instance gets terminated. Using these things obviously requires a tolerance for server outages far above even the shoddiest physical systems; but if you can divide your problem space into relatively small, discrete chunks and get the results off the individual servers once computed, you won't lose more than a single chunk per shutdown, and spot instances can be crazy cheap, depending on demand at the time. My impression is that Amazon offers them whenever they don't have enough reserved instances to fill a given area, and will pretty much keep offering them as long as they pay better than they cost in additional electricity and cooling, so if you are willing to bottom feed, and potentially wait, there are some bargains to be had.
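The bid-and-terminate behaviour described above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the function name `run_on_spot`, the hourly price list, the one-price-per-hour granularity, and the simplification that every hour run is billed at that hour's spot price. It does not call any Amazon API.

```python
def run_on_spot(prices, max_bid, num_chunks, hours_per_chunk):
    """Simulate chunked work on one spot instance, given one price per hour.

    The instance runs in any hour where the spot price is at or below max_bid.
    When the price rises above the bid the instance is terminated and the
    chunk in progress is lost; finished chunks are assumed to be saved off-box.
    Returns (chunks_done, hours_wasted, cost).
    """
    done = 0
    progress = 0   # hours invested in the current, unfinished chunk
    wasted = 0     # hours of work thrown away by terminations
    cost = 0.0
    for price in prices:
        if done == num_chunks:
            break
        if price > max_bid:
            wasted += progress   # terminated: the partial chunk is lost
            progress = 0
            continue
        cost += price            # simplification: billed at the hourly spot price
        progress += 1
        if progress == hours_per_chunk:
            done += 1            # chunk finished and copied off the instance
            progress = 0
    return done, wasted, cost


# Made-up price history in dollars per hour, with two spikes above the bid.
prices = [0.10, 0.12, 0.30, 0.11, 0.09, 0.10, 0.35, 0.08, 0.08]
done, wasted, cost = run_on_spot(prices, max_bid=0.15, num_chunks=3, hours_per_chunk=2)
print(done, wasted, round(cost, 2))
```

In this run the second spike interrupts a half-finished chunk, so one hour of work is redone. Smaller chunks shrink that worst-case loss, which is why the approach suits workloads that split into independent pieces.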

    2. Re:1.21 PetaFLOPS (RPeak) by Graymalkin · · Score: 2

      Basically this article translates to "Amazon has a lot of computers and this guy rented out a bunch of them at once".

      No, the article translates to "if you've got embarrassingly parallel workloads you can use EC2 to churn through them without a massive infrastructure outlay of your own". Amazon isn't just renting out the actual CPUs but the power, HVAC, storage, and networking to go along with them. Infrastructure and maintenance are a huge cost of HPC and put it out of reach for many smaller projects.

      You're entirely correct that a massive Rpeak value isn't impressive in terms of actual purpose-built supercomputers, but reporting of the Rpeak is only half of the story. The lede buried in the reporting is that for $33,000 a professor was able to take off-the-shelf software and run it on a 1.21 petaflop parallel cluster. That's high teraflop to petaflop computing at relatively small research grant prices. I think that's the interesting fact out of this story.

      --
      I'm a loner Dottie, a Rebel.
  2. Old Joke by Squiddie · · Score: 2

    But can it run Crysis?

    1. Re:Old Joke by CaseCrash · · Score: 2

      But can it run Crysis?

      Dear lord, this is an old joke now?

      --
      No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."
  3. FTA by Saethan · · Score: 4, Insightful
    FTA:

    Megarun's compute resources cost $33,000 through the use of Amazon's low-cost spot instances, we're told, compared with the millions and millions of dollars you'd have to spend to buy an on-premises rig.

    Running somebody else's machines for 18 hours costs less than buying a machine that powerful for yourself to run 24/7...

    NEWS AT 11!

    1. Re:FTA by AvitarX · · Score: 2

      That's what I thought. It is great that it is possible to run a simulation on a five-figure budget, but if it's something that gets heavy use, having your own is better. I predict this will help Cray (contrary to the article's implication): companies can start using big power and see where it takes them without dropping the capital expense, and then move to more constant use of such resources, at lower marginal cost, by bringing it in house.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    2. Re:FTA by fuzzyfuzzyfungus · · Score: 2

      It wouldn't surprise me if organizational dynamics come into the picture as well. If a researcher can purchase consumables and services related to his work up to X dollars on his own (subject only to oversight after the fact if somebody raises an eyebrow) and up to Y dollars with a sign-off from the lab head or somebody, but would need 6 signatures, university-level approval for the facilities repurposing, and who knows what else to buy hardware, he has a pretty strong incentive to just pay Amazon to do it, even if getting an in-house system makes more sense in the longer term.

      On the other side of the coin, if a university is looking for a prestige project that'll look pretty damn cool through the glass when they take tours around, they might get a butch, black, blinkencomputer even if utilization ends up being tepid.

  4. HPC? by NothingMore · · Score: 5, Insightful

    "Supercomputing applications tend to require cores to work in concert with each other, which is why IBM, Cray, and other companies have built incredibly fast interconnects. Cycle's work with the Amazon cloud has focused on HPC workloads without that requirement." While this is cool, can you really call something like this an HPC system if you are picking workloads that require little cross-node communication? The requirement of cross-node communication is pretty much the whole reason large-scale HPC machines like ORNL's Titan exist at all. Wouldn't this system be classified closer to HTC, because it is targeting workloads similar to those that could run on HTC Condor pools?

    1. Re:HPC? by sunderland56 · · Score: 2

      If we follow the article's reasoning, then SETI@home was one massive supercomputer, not 10,000 individual computers working on parts of a common task.

  5. Good but not great by Enry · · Score: 4, Insightful

    So this ran for 18 hours, or about $1800/hour. That gives you just under $44,000 per day, or $16 million for a year.

    Give me $16 million a year and I can build you a very kick-butt cluster - the one I'm just finishing up is 5000 cores at about $3 million.

    EC2 is great if your needs are small and intermittent. But if you're part of a larger organization that has continual HPC needs, you're going to be better off building it yourself for a while.
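The extrapolation above can be checked in a few lines. The $33,000 and 18-hour figures come from the article as quoted in this thread; holding that rate constant for a full year is the comment's own simplifying assumption, since spot prices vary with demand.

```python
run_cost_usd = 33_000   # total cost of the run, per the article
run_hours = 18          # duration of the run, per the article

per_hour = run_cost_usd / run_hours
per_day = per_hour * 24
per_year = per_day * 365   # assumes the same rate all year, which spot pricing does not guarantee

print(f"${per_hour:,.0f}/hour, ${per_day:,.0f}/day, ${per_year / 1e6:.1f} million/year")
```

Using the unrounded hourly rate gives roughly $44,000 per day; the "just under" in the comment comes from rounding the rate down to $1800 first.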

    1. Re:Good but not great by cdrudge · · Score: 2

      Give me $16 million a year and I can build you a very kick-butt cluster - the one I'm just finishing up is 5000 cores at about $3 million.

      Presuming costs scale approximately linearly, $16m would net you 26-27k cores. They hit 6x that at peak. I didn't see them mention what they sustained over the long haul or averaged, but it looks like it was well above your scaled core numbers.
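The linear scaling in this reply works out as follows. The inputs are the figures quoted in the thread (5000 cores for about $3 million, a $16 million budget, and the reply's "6x" claim); linear cost scaling is the reply's stated assumption, and the variable names are only for illustration.

```python
cluster_cores = 5_000          # parent comment's cluster
cluster_cost_usd = 3_000_000   # parent comment's approximate cost
budget_usd = 16_000_000        # parent comment's yearly EC2 extrapolation

cost_per_core = cluster_cost_usd / cluster_cores   # dollars per core, assuming linear scaling
cores_for_budget = budget_usd / cost_per_core      # cores the budget buys at that rate
implied_peak_cores = 6 * cores_for_budget          # what "6x that at peak" would imply

print(round(cost_per_core), round(cores_for_budget), round(implied_peak_cores))
```

The comparison leaves out power, cooling, and staffing for the purchased cluster, and it sets a one-time purchase against a yearly rental bill, so it is a rough sanity check rather than a full cost model.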

  6. 1.21 gigawatts. by Cammi · · Score: 2

    No relation in any way to Doc Brown ... must be another troll posting articles! 1.21 gigawatts.

  7. Re:High Throughput Computing not HPC by Yohahn · · Score: 3, Insightful

    The problem is that in a number of cases a researcher could easily use HTC, but they follow the fashion of HPC, using more specialized resources than necessary.
    Don't get me wrong, there are a number of cases where HPC makes sense, but usually what you need is a large amount of memory, or a large number of processors.
    HPC only makes sense where you need both.