1.21 PetaFLOPS (RPeak) Supercomputer Created With EC2
An anonymous reader writes "In honor of Doc Brown, Great Scott! Ars has an interesting article about a 1.21 PetaFLOPS (RPeak) supercomputer created on Amazon EC2 Spot Instances. From HPC software company Cycle Computing's blog, it ran Professor Mark Thompson's research to find new, more efficient materials for solar cells. As Professor Thompson puts it: 'If the 20th century was the century of silicon materials, the 21st will be all organic. The question is how to find the right material without spending the entire 21st century looking for it.' El Reg points out this 'virty super's low cost.' Will cloud democratize access to HPC for research?"
1.21 PetaFLOPS (RPeak)
Getting a high RPeak is simply a matter of having access to enough computers. They could be connected by TCP/IP over pigeons or PPP over two tin cans and a piece of wet string.
Basically getting a high RPeak on EC2 requires the following procedure:
1. Pay a fuck load of money
2. Create new instance.
3. Goto 2.
Basically this article translates to "Amazon has a lot of computers and this guy rented out a bunch of them at once".
Which I'm sure is good for his research, which must be of the very parallelizable type. I have done such stuff too in the past and it's nice when you have it.
SJW n. One who posts facts.
But can it run Crysis?
Boring PR exercise is boring. Yes Virginia, supers you can simply script together are boring. Especially because they're not super in the least if you throw anything but the most embarrassingly parallel of problems at them. As soon as you need any sort of communication among the nodes at all, it turns out the interconnect just isn't that great, and the efficiency goes through the floor. Whoops.
What'd it take to impress me? That lone guy with his lone desktop breaking the record for calculating digits of pi was rather impressive. Now that's been done, figure out something new. But this, throwing some cash at Amazon, this isn't it.
The 1.2 PFlop/s is a theoretical peak performance. In comparison, the numbers that you'll find on the Top500 list are all sustained performances.
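To make the peak-versus-sustained distinction concrete, here is a minimal sketch. Every input value is a hypothetical placeholder chosen for illustration, not a figure from the Megarun; the function names are made up too. Theoretical peak is just hardware multiplied out, while efficiency is the fraction of that peak a real benchmark actually sustains.

```python
def theoretical_peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak: every core retiring its maximum FLOPs on every cycle."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


def efficiency(sustained_flops, peak_flops):
    """Fraction of the theoretical peak that a measured run actually achieved."""
    return sustained_flops / peak_flops


# Hypothetical cluster, NOT the configuration described in the article.
peak = theoretical_peak_flops(
    nodes=1000, cores_per_node=16, clock_hz=2.5e9, flops_per_cycle=8
)

# Hypothetical measured result: assume the benchmark sustained half of peak.
sustained = 0.5 * peak

print(f"peak:       {peak / 1e12:.1f} TFLOPS")
print(f"sustained:  {sustained / 1e12:.1f} TFLOPS")
print(f"efficiency: {efficiency(sustained, peak):.0%}")
```

Note that the peak figure contains no term for the interconnect at all, which is why it says nothing about how tightly coupled codes would perform.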
Megarun's compute resources cost $33,000 through the use of Amazon's low-cost spot instances, we're told, compared with the millions and millions of dollars you'd have to spend to buy an on-premises rig.
Running somebody else's machines for 18 hours costs less than buying a machine that powerful for yourself to run 24/7...
NEWS AT 11!
How about let's not use the anti-science mouthbreathers at the Register as a source.
If you don't understand any of my sayings, come to me in private and I shall take you in my German mouth.
"Supercomputing applications tend to require cores to work in concert with each other, which is why IBM, Cray, and other companies have built incredibly fast interconnects. Cycle's work with the Amazon cloud has focused on HPC workloads without that requirement." While this is cool, can you really call something like this an HPC system if you are picking workloads that require little cross-node communication? The requirement for cross-node communication is pretty much the whole reason large-scale HPC machines like ORNL's Titan exist at all. Wouldn't this system be classified closer to HTC, because it is targeting workloads similar to those that can run on HTC Condor pools?
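For anyone unsure what a workload with little cross-node communication looks like, here is a minimal sketch of the high-throughput pattern. The scoring function is a made-up stand-in for one independent simulation, and a local thread pool stands in for separate machines; neither is taken from Cycle Computing's software or from Condor. The point is only that each task needs nothing from any other task, so the interconnect never matters.

```python
from concurrent.futures import ThreadPoolExecutor


def score_candidate(candidate_id):
    """Hypothetical stand-in for one independent simulation of one compound."""
    return (candidate_id * 37) % 101


def screen(candidates, workers=4):
    """Score every candidate independently; workers never talk to each other."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with candidates.
        scores = list(pool.map(score_candidate, candidates))
    return dict(zip(candidates, scores))


results = screen(range(10))
best = max(results, key=results.get)
print(f"best candidate: {best} (score {results[best]})")
```

A tightly coupled code, by contrast, would need the workers to exchange data at every step, and that exchange is exactly what a fast interconnect is bought for.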
So this ran for 18 hours, or about $1800/hour. That gives you just under $44,000 per day, or $16 million for a year.
Give me $16 million a year and I can build you a very kick-butt cluster - the one I'm just finishing up is 5000 cores at about $3 million.
EC2 is great if your needs are small and intermittent. But if you're part of a larger organization that has continual HPC needs, you're going to be better off building it yourself for a while.
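The arithmetic above can be checked with a short sketch. It uses the $33,000 and 18-hour figures quoted in this thread, and it assumes (this is an assumption, not something the article states) that the spot price stays constant and the machines run around the clock all year.

```python
RUN_COST_USD = 33_000  # compute cost quoted for the run
RUN_HOURS = 18         # duration quoted for the run


def extrapolate(run_cost_usd, run_hours):
    """Scale one run's cost to hourly, daily, and yearly rates.

    Assumes a constant price and 24/7 usage, which spot pricing
    does not guarantee.
    """
    per_hour = run_cost_usd / run_hours
    per_day = per_hour * 24
    per_year = per_day * 365
    return per_hour, per_day, per_year


per_hour, per_day, per_year = extrapolate(RUN_COST_USD, RUN_HOURS)
print(f"per hour: ${per_hour:,.0f}")
print(f"per day:  ${per_day:,.0f}")
print(f"per year: ${per_year:,.0f}")
```

Without rounding the hourly rate down to $1800, the daily figure comes out at $44,000 rather than just under it, and the yearly figure stays at roughly $16 million.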
1) Did they FIND any exceptional and useful photovoltaic behavior in the compounds tested?
2) How much will this sort of crunch make up of the revenue lost to the rest of the world's migration away from US-based cloud services, in the wake of Snowden's revelations?
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
No relation in any way to doc brown ... must be another troll posting articles!
1.21 gigawatts.
While this is a nice use of Amazon's EC2 to build a high-throughput system, it doesn't translate as nicely to what most High Performance Computing users need: high network bandwidth, low latency between nodes, and large, fast shared filesystems on which to store and retrieve the massive amounts of data being used or generated. The cloud created here is only useful to the subset of researchers who don't need those things. I'd have a hard time calling this High Performance Computing.
Look at XSEDE's HPC resources page. While each of those supercomputers has something special about the services it offers (GPUs, SSDs, fast access, etc.), they all spent a significant portion of their build budget on a high-performance network to link the nodes for parallel codes. They also spent money on high-performance parallel filesystems instead of more cores. Their users can't get their research done effectively on systems or clouds without those important elements.
I think that it's great that public cloud computing has advanced to the point where useful, large-scale science can be accomplished on it. Please note that it takes a separate company (CycleCloud) to make it possible for your average scientist to use Amazon EC2 in this way (lowest cost and webapp access), but it's still an advance.
Disclaimer: I work for XSEDE, so do your own search on HPC to verify what I'm saying.
The Internet has no garbage collection
Now all they need is a flux capacitor, and then they can... oh wait...
That's almost enough to run Vista.
...will find this the sort of thing they like. For people/groups who have SETI@home or Folding@home style workloads - the type that the HPC community call "embarrassingly parallel" - and some money, this is useful. But it's sad that there is no mention made in the article of Condor - a job manager for loosely coupled machines that has been doing the same kind of thing since the '80s - essentially, since there has been a network between a few sometimes-idle computers in a CS department. Cycle Computing itself has used Condor as part of its CycleServer product. Jupiter is their own task distribution system which goes to larger scales than Condor can reach.
It's cool that Cycle Computing have packaged up this cycle scavenging approach into infrastructure that lets people easily deploy and farm work out to EC2 spot instances. But as they make those instances easier to use, the demand will go up, and the spot price of compute capacity will likely go up too. Which is nice for Amazon, of course, but harder on groups that are trying to make a budget forecast of what their simulations will cost to run. The free market grid computing cheerleader types will be over the moon at the opportunity to write papers about spot instance futures markets on a service that actually got popular. But, as another poster points out, it's High Throughput Computing, not HPC, and the very thing that makes it amenable to spot markets, which is the fungibility of loosely coupled EC2 instances, also restricts it to loosely coupled workloads, especially ones that don't produce a huge amount of data for each separate run - although a couple of years ago Cycle were already looking at ways of improving this last restriction.
-Snorbert, somewhere in the antipodes
Amazon makes a killing renting computers. Certain kinds of enterprises really want to pay extra for the privilege of outsourcing some of their IT to Amazon - sometimes it really makes sense and sometimes they're just fooling themselves.
People who do HPC usually do a lot of HPC, and so owning/operating the hardware is a simple matter of not handing that fat profit to Amazon. Most HPC takes place in consortia or other arrangements where a large cluster can be scheduled to efficiently interleave bursty usage patterns. That is, of course, precisely what Amazon does, though it tunes mainly for commercial (Netflix, etc.) workloads, which are significantly different from computational ones. (Real HPC clusters often don't have UPS, for instance, and almost always have higher-performance, high-bisection, flat/uniform networks, since inter-node traffic dominates.)