The Problem With the Top500 Supercomputer List
angry tapir writes "The Top500 list of supercomputers is dutifully watched by high-performance computing participants and observers, even as they vocally doubt its fidelity to excellence. Many question the use of a single metric — Linpack — to rank the performance of something as mind-bogglingly complex as a supercomputer. During a panel at the SC2010 conference this week in New Orleans, one high-performance-computing vendor executive joked about stringing together 100,000 Android smartphones to get the largest Linpack number, thereby revealing the 'stupidity' of Linpack. While grumbling about Linpack is nothing new, the discontent was pronounced this year as more systems, such as the Tianhe-1A, used GPUs to boost Linpack ratings, in effect gaming the Top500 list."
Fortunately, Sandia National Laboratories is heading an effort to develop a new set of benchmarks. In other supercomputer news, it turns out the Windows-based cluster that lost out to Linux stumbled because of a bug in Microsoft's software package. Several readers have also pointed out that IBM's Blue Gene/Q has taken the top spot in the Green500 for energy efficient supercomputing, while a team of students built the third-place system.
Now that the Chinese are ahead, there's suddenly a problem with the list/benchmark.
I have always wondered why being on the Top500 list of supercomputers is that important to those on the list. I would be better served by being told what advantage or edge those on the list have gained since they got onto it. Thanks.
How exactly would one perform a sparse matrix product on 100,000 Android cell phones? Linpack isn't great, but it isn't terrible, either. Many supercomputing processor cycles go into large matrix operations--FFT, products, SVD/PCA/Eigenvectors. These can be run in parallel, but they don't break down into trivially parallel operations and the machines on the list really do perform well on them.
A bigger issue is that the rules allow operators to essentially optimize for the specific problem sizes and input, and that the machine doesn't have to be running stably or in the same configuration for all of the runs. It is OK if your machine is overclocked and melts one minute after it executes the benchmark. That isn't useful. The Green 500 is a better metric because it considers the power cost of the computations--that's what most supercomputing centers care about--but even it needs much more stringent rules.
I don't see a problem with using GPUs.
They do lots of parallel shit really fast.
It's no different than slapping on a math coprocessor, or adding a block of hardware to accelerate common encryption/decryption functions.
I want the one with the bigger GPUs and the WiFis.
Flexible bare-metal recovery for Linux/UNIX
As the article alludes, the big problem with ranking supercomputers via Linpack is that it doesn't advance supercomputer design. The net result is a pissing match over scalability, where winning is dependent upon who can cram the most cores into a single room. The real innovators should be recognized for their efforts to reduce space, power and cost, or for finding new algorithms to crunch the numbers in more efficient or useful ways.
When you have nothing left to burn you must set yourself on fire
int i = 0;
while (i < infinite)
{
i++;
}
---
Whatever computer finishes first is clearly the fastest supercomputer.
Is there a useful purpose for this list in the first place? It isn't likely to be useful to THAT many people...
If it can't handle a Starcraft 2, 8 player, full army of all zerglings rush without choking with max settings at a 1080p resolution, it's not a supercomputer.
The guide has this to say about supercomputers: "Supercomputers," it says, "are big. Really big. You just won't believe how vastly, hugely, mindbogglingly big they are. I mean, you may think your SGI Challenge DM is big, but that's just peanuts to supercomputers, listen..."
Redefine all variables in LINPACK to be higher precision than available from any graphics processor. 128 bit floats, for example. (Requires writing a library to handle the new floats, obviously.)
Contribute to civilization: ari.aynrand.org/donate
The Top500 has the problem that many of the systems on there aren't supercomputers, they are clusters. Now clusters are all well and good. There's lots of shit clusters do well, and if your application is one of them then by all means build and use a cluster. However they aren't supercomputers. What makes supercomputers "super" is their unified memory. A real supercomputer has high speed interconnects that allow direct memory access (non-uniform with respect to time but still) by CPUs to all the memory in the system. This is needed in situations where you have calculations that are highly interdependent, like particle physics simulations.
So while you might find a $10,000,000 cluster gives you similar performance to a $50,000,000 supercomputer on Linpack, or other benchmark that is very distributed and doesn't rely on a lot of inter-node communication, you would find it falls flat when given certain tasks.
If we want to have a cluster rating as well that's cool, but a supercomputer benchmark should be better focused on the tasks that make owning an actual supercomputer worth it. They are out there, that's why people continue to buy them.
Even with a hypothetical hyper-fast network, 100,000 Android phones won't get you anywhere near the top of the list.
Heck, even 100,000 Nehalem (Core i7) cores won't get you in the top 5.
So, Android phones? You'll need millions of them.
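The back-of-the-envelope arithmetic behind that can be sketched as below. The per-phone figure is an assumption made up for illustration (a hypothetical 50 MFLOPS sustained per handset), and so is the parallel efficiency; only the roughly 2.57 PFLOPS Linpack result for Tianhe-1A comes from the November 2010 list.

```python
import math

def devices_needed(target_flops, per_device_flops, efficiency=1.0):
    """Devices required to reach target_flops at a given parallel efficiency."""
    if per_device_flops <= 0:
        raise ValueError("per_device_flops must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return math.ceil(target_flops / (per_device_flops * efficiency))

TIANHE_1A_RMAX = 2.566e15  # FLOPS, Linpack result on the November 2010 list
PHONE_FLOPS = 50e6         # ASSUMPTION: hypothetical sustained FLOPS per handset

# Best case: a perfect network with no communication overhead at all.
ideal = devices_needed(TIANHE_1A_RMAX, PHONE_FLOPS)
# ASSUMPTION: a hypothetical 50% efficiency once the phones have to talk.
lossy = devices_needed(TIANHE_1A_RMAX, PHONE_FLOPS, efficiency=0.5)

print(f"phones needed, perfect network: {ideal:,}")
print(f"phones needed, 50% efficiency:  {lossy:,}")
```

Change PHONE_FLOPS to whatever you believe a handset sustains; unless it is tens of gigaflops, the answer stays far above 100,000.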
...is that *no* single-figure-of-merit benchmark is going to be worth anything. Sandia's "Graph 500" Johnny-come-lately isn't going to be any better than Linpack that way, and will just skew the results towards a different not-generally-useful architecture. A far better idea has been around for over five years: the HPC Challenge benchmark. It consists of seven different tests (Linpack is just one) which stress different aspects of system design. Anybody who knows anything about building big systems would identify some mix of these tests that best approximates their own workload, use that as a starting point for looking at likely alternatives, and then remember that it's just a starting point. The only benchmark that really matters is the one that you run yourself on your own application, but that can be a very expensive and time-consuming exercise so these lists can be a good way to figure out which systems deserve that more extended analysis. Linpack, on the other hand, isn't even useful for that. What's sad is that some people either didn't know (which says something about how we train engineers) or didn't care until a Chinese system found its way to the top (which says something even worse).
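One way to picture "some mix of these tests" is a weighted score over the individual results. This is only a sketch under stated assumptions: the function name, the weights and every number below are hypothetical, each result is assumed to be higher-is-better, and nothing here is part of the HPC Challenge suite itself.

```python
import math

def workload_score(results, baseline, weights):
    """Weighted geometric mean of per-test speedups over a baseline system."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    log_sum = 0.0
    for test, weight in weights.items():
        # Speedup of the candidate over the baseline on this one test.
        ratio = results[test] / baseline[test]
        log_sum += weight * math.log(ratio)
    return math.exp(log_sum / total_weight)

# ASSUMPTION: made-up weights for a memory-bound workload.
weights = {"HPL": 0.2, "STREAM": 0.5, "RandomAccess": 0.3}
# ASSUMPTION: made-up results in arbitrary higher-is-better units.
baseline = {"HPL": 100.0, "STREAM": 100.0, "RandomAccess": 100.0}
candidate = {"HPL": 400.0, "STREAM": 110.0, "RandomAccess": 90.0}

print(f"score vs. baseline: {workload_score(candidate, baseline, weights):.2f}")
```

With these made-up numbers, a machine that is four times faster on HPL scores well under 4 overall, because the weights say this workload mostly cares about memory bandwidth.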
Slashdot - News for Herds. Stuff that Splatters.
Good news, everyone! Our supercomputer OS only lost because it's buggy!
How can I believe you when you tell me what I don't want to hear?
In other supercomputer news, it turns out the Windows-based cluster that lost out to Linux stumbled because of a bug in Microsoft's software package.
As it should. That's not news; that's how the game is played. If your software is buggy, and those bugs drag your performance far enough down, you don't deserve a top500 spot.
If they fix their software, rerun the test, and perform better than Linux, then they will have won that battle (the battle for the top500 spot, not the battle for market share) fair and square.
Let q be a radix > 1. I am in ur base-q, killing 10 d00ds.
The HPC Challenge has been available for a long time now. It has never gotten real attention because people do want a single metric to rank computers and make a classification. If you really want to know if your computer is fit for a particular purpose, you can. Don't blame the Top500 for providing what people want to see. As a side note, I don't quite see why using accelerators would be "gaming" the Top500. This is a very stupid statement: accelerators have a wide range of applications for real science; it is not about getting the biggest number in HPL. This message written from the premises of the SC meeting.
You are allowed to use hardware-specific features and change the algorithm for this benchmark. That way, any optimization is used and innovation, as you call it, emerges. Besides, scalability *is* the most desired quality for a supercomputer that doesn't aim for space, power and cost... like the ones most likely to be in TOP500. You have Green500 for the other things you mentioned.
I rarely respond to comments. Also, don't ask for clarifications: a brain and Google are faster, believe me!
Yes, noticed that.
Here's the actual benchmark used for Top500: "HPL - A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers". It solves linear equations spread across a cluster. The clustered machines have to communicate at a high rate, using MPI 1.1 message passing, to run this program. See this discussion of how the algorithm is parallelized. You can't run this on a set of machines that don't talk much, like "Folding@home" or like cryptanalysis problems.
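For anyone who hasn't looked at it, what a Linpack-style run measures can be sketched in a few lines: solve a dense system Ax = b, then divide the conventional operation count by the elapsed time. This is a single-process, pure-Python toy, not HPL. There is no MPI and no distribution across nodes, the function names are made up for illustration, and the 2/3 n^3 + 2 n^2 count is the figure usually quoted for the benchmark.

```python
import random
import time

def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented copy, so row swaps carry the right-hand side along.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        # Partial pivoting: use the row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def linpack_flops(n):
    """Conventional operation count for an n x n Linpack solve."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2

def toy_benchmark(n=100, seed=0):
    """Time one random solve; return (FLOPS, max residual of a x - b)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return linpack_flops(n) / elapsed, residual

rate, residual = toy_benchmark()
print(f"{rate / 1e6:.2f} MFLOPS, max residual {residual:.2e}")
```

The real benchmark distributes the matrix across nodes, so every elimination step involves exchanging pivot rows and panel updates over the interconnect, which is the communication the comment above is talking about.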
Linpack is a reasonable approximation of computational fluid dynamics and structural analysis performance. Those are problems that are broken up into cells, with communication between machines about what's happening at the cell boundaries. Those are also the problems for which governments spend money on supercomputers. (The private market for supercomputers is very small.)
So, quit whining. China built the biggest one. Why not? They have more cash right now.
Superpack = number of years taken to compile a full Gentoo branch * number of hours taken to render slashdot home page
Still can't play Crysis!!! Literally...
I will not be pushed, filed, stamped, indexed, briefed, debriefed or numbered. My life is my own.
Speed is one thing, but how about normalizing the list by how well its owners are utilizing those transistors?
The single largest expense, over the lifetime of a supercomputer, is power consumption.
"To those who are overly cautious, everything is impossible. "
To quote the original article "Because Tsubame uses a KVM hypervisor and various cloud-like provisioning tools, it can run both Windows and Linux at the same time on different nodes, and offer users various types of processing configurations." As one of the commenters on the original article pointed out, KVM is virtual machine software designed for Linux, so is this benchmark comparing the performance of Linux and Windows virtual machines (running on a Linux host), rather than comparing the performance of Linux and Windows directly? Or is this comment relevant only in the sense that Tsubame is currently running KVM and totally irrelevant in the context of the testing performed?
>Fortunately, Sandia National Laboratories is heading an effort to develop a new set of benchmarks...
Bummer for you, Sandia. NASA already did that with the NAS Parallel Benchmarks. Here's a hint: you're funded by the US Government (just like NASA), and NPB died when the Japanese started kicking US butt on NPB.
>But the Benchmark is in FLOPs which uses FLUs.(Like ALUs)
So if they have problems then the only ones to blame are the Designers of the computer.
A FLOP, is a FLOP, is a FLOP.
The possible problems are:
A. They aren't using the Math Libraries correctly.
B. They aren't using the CPU's or GPU's FLU correctly.
C. They configured the system badly. (I.e. Networking, etc.)
If this phone cluster actually did outperform the fastest supercomputers, why would that make the benchmark stupid?
I mean, the idea of using 100,000 smartphones might be stupid when examined pragmatically, but I don't see how that affects the validity of the performance.
And nobody, so far, is claiming that using GPUs is an inherently stupid idea for any reason, so that should have no bearing on the Tianhe's victory.
When a foreign computer wins, the benchmark needs to be changed? Now that is gaming the system, American style.
That is incorrect. It was not in the software designed to do the rating. It was in the Microsoft designed software that ran on their OS, and that was rated. If I need to perform a task, and I pay a company to provide the hardware and software to achieve that task, I don't really care if it was a bug in their OS or their application that reduces my productivity.
It is true that the bug wasn't in the Windows OS, but it was still a bug in the supercomputer implementation, and therefore a Microsoft flaw in the supercomputer's performance, and not a flaw in the benchmarking tools or methodology.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
The Chinese computer with GPUs works with around 40% efficiency because of communication overhead and wasted SIMD ways, whereas most winners historically worked with around 90% (The inefficiency of 100k handsets would be even higher than the lameness of the joke). The silicon power-wall is the limit in scaling of the supercomputers that are built with GPUs.
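Efficiency here means sustained Linpack performance (Rmax) divided by theoretical peak (Rpeak), the two figures the Top500 list publishes for each system. A minimal sketch, with placeholder numbers rather than figures for any real machine:

```python
def linpack_efficiency(rmax, rpeak):
    """Fraction of theoretical peak actually sustained on Linpack."""
    if rpeak <= 0:
        raise ValueError("rpeak must be positive")
    if rmax < 0 or rmax > rpeak:
        raise ValueError("rmax must be between 0 and rpeak")
    return rmax / rpeak

# ASSUMPTION: hypothetical systems, both with a 1000 TFLOPS theoretical peak.
gpu_heavy = linpack_efficiency(rmax=400.0, rpeak=1000.0)
cpu_only = linpack_efficiency(rmax=900.0, rpeak=1000.0)

print(f"GPU-heavy system: {gpu_heavy:.0%} of peak")
print(f"CPU-only system:  {cpu_only:.0%} of peak")
```

The list ranks by Rmax alone, so a low-efficiency machine can still place first if its peak is large enough.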
So, giving examples of GPUs and handsets is not a good way to promote the use of other benchmarks, because even this simple benchmark is enough to fail them (assuming your concern is power in 2010 and existence in 2020). However, it would be a motivation to change this benchmark if there were proof that it is not representative of how these machines are used.
> one high-performance-computing vendor executive joked about stringing together 100,000 Android smartphones to get...
Now that's not entirely fair. I just looked at an Android smartphone at Best Buy, and the sales rep assured me that I should be buying the bigger, more expensive one, rather than the one I could fit in my pocket, because it was faster. It even has a snapdragon processor! (whatever that is) Surely it can't be that far away from a true supercomputer.
The reason they use GPUs is to run the simulation tasks much more efficiently. You have a good compiler that can split up the workload; I suspect something like an extended FORTRAN.
I'd define some standard 'real world' problems that can be run using open-source software, and simply take the total cumulative runtime to solve them to be the score. For one benchmark, I'd suggest OpenFOAM to run a Direct Eddy Simulation of turbulent flow around a bluff body. I'm sure others could come up with things in the fields of cryptography, protein folding, travelling salesmen, etc.
should be mandatory to be allowed to speak. "Stringing together 100,000 Android smartphones to get the largest Linpack number" is a stupidity. You'll have incredibly ridiculous linpack results. Computer science is a science ...
And since many supercomputer purchases fall into the political realm (in terms of congressional earmarks to pay for them, or national programs to improve "competitiveness"), you need something that your congressman or premier can handle easily.
No, no, no! This is not how it works at all. Here's how it works. Congress appropriates some amount of money for computing. Big labs across the country submit proposals to the funding agency (DARPA, NSF, etc.) to build a computer. Those proposals have to show how real science will get done. Thus simply putting a LINPACK score in the proposal will get you laughed out of the room and a lifetime ban from competing for computing dollars. The labs have to actually study how their codes will run on the proposed machine and then prove it once it's built. The process of proposing, building and running a supercomputer is complex and highly technical. Politicians aren't reviewing these things, scientists are.
When you consider how important times & speeds are to this sport, it was pretty revolutionary (and progressive) of them to do this. Maybe it is just time for the Supercomputer list people to do the same thing.
I come here for the love