The Problem With the Top500 Supercomputer List
angry tapir writes "The Top500 list of supercomputers is dutifully watched by high-performance computing participants and observers, even as they vocally doubt its fidelity to excellence. Many question the use of a single metric — Linpack — to rank the performance of something as mind-bogglingly complex as a supercomputer. During a panel at the SC2010 conference this week in New Orleans, one high-performance-computing vendor executive joked about stringing together 100,000 Android smartphones to get the largest Linpack number, thereby revealing the 'stupidity' of Linpack. While grumbling about Linpack is nothing new, the discontent was pronounced this year as more systems, such as the Tianhe-1A, used GPUs to boost Linpack ratings, in effect gaming the Top500 list."
Fortunately, Sandia National Laboratories is heading an effort to develop a new set of benchmarks. In other supercomputer news, it turns out the Windows-based cluster that lost out to Linux stumbled because of a bug in Microsoft's software package. Several readers have also pointed out that IBM's Blue Gene/Q has taken the top spot in the Green500 for energy efficient supercomputing, while a team of students built the third-place system.
Now that the Chinese are ahead, there's suddenly a problem with the list/benchmark.
I have always wondered why being on the Top500 list of supercomputers is that important for those on the list. I'd be better served by being told what advantage(s) or edge those on the list have gained since they got onto it. Thanks.
How exactly would one perform a sparse matrix product on 100,000 Android cell phones? Linpack isn't great, but it isn't terrible, either. Many supercomputing processor cycles go into large matrix operations--FFT, products, SVD/PCA/Eigenvectors. These can be run in parallel, but they don't break down into trivially parallel operations and the machines on the list really do perform well on them.
A bigger issue is that the rules allow operators to essentially optimize for the specific problem sizes and input, and that the machine doesn't have to be running stably or in the same configuration for all of the runs. It is OK if your machine is overclocked and melts one minute after it executes the benchmark. That isn't useful. The Green 500 is a better metric because it considers the power cost of the computations--that's what most supercomputing centers care about--but even it needs much more stringent rules.
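For what it's worth, the power-aware ranking mentioned above boils down to sustained performance divided by power drawn. Here is a rough sketch of that arithmetic; the two machines and all of their numbers are made up for illustration and are not real list entries.

```python
def mflops_per_watt(rmax_tflops, power_kw):
    """Sustained performance per unit of power, in MFLOPS per watt."""
    if power_kw <= 0:
        raise ValueError("power must be positive")
    mflops = rmax_tflops * 1e6   # 1 TFLOPS = 1e6 MFLOPS
    watts = power_kw * 1e3       # 1 kW = 1e3 W
    return mflops / watts


# HYPOTHETICAL machines, invented for this example only.
machines = {
    "big_and_hungry": {"rmax_tflops": 1000.0, "power_kw": 4000.0},
    "small_and_lean": {"rmax_tflops": 50.0, "power_kw": 50.0},
}

# Rank by efficiency instead of raw speed.
ranked = sorted(
    machines,
    key=lambda name: mflops_per_watt(**machines[name]),
    reverse=True,
)
for name in ranked:
    print(name, mflops_per_watt(**machines[name]), "MFLOPS/W")
```

Note that the raw-speed winner ends up last here, which is the whole point of ranking this way. It does nothing about the "melts one minute later" problem; that needs rules about how the run is done, not a different formula.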
I don't see a problem with using GPUs.
They do lots of parallel shit really fast.
It's no different than slapping on a math coprocessor, or adding a block of hardware to accelerate common encryption/decryption functions.
I want the one with the bigger GPUs and the WiFis.
As the article alludes, the big problem with ranking supercomputers via Linpack is that it doesn't advance supercomputer design. The net result is a pissing match over scalability, where winning is dependent upon who can cram the most cores into a single room. The real innovators should be recognized for their efforts to reduce space, power and cost, or for finding new algorithms to crunch the numbers in more efficient or useful ways.
When you have nothing left to burn you must set yourself on fire
int i = 0;
while (i < infinite)
{
i++;
}
---
Whatever computer finishes first is clearly the fastest supercomputer.
Is there a useful purpose for this list in the first place? It isn't likely to be useful to THAT many people...
If it can't handle a Starcraft 2, 8 player, full army of all zerglings rush without choking with max settings at a 1080p resolution, it's not a supercomputer.
The guide has this to say about supercomputers: "Supercomputers," it says, "are big. Really big. You just won't believe how vastly, hugely, mindbogglingly big they are. I mean, you may think your SGI Challenge DM is big, but that's just peanuts to supercomputers, listen..."
And if changing the laws don't work, chairs will be delivered to you in person by our chairman.
Redefine all variables in LINPACK to be higher precision than available from any graphics processor. 128 bit floats, for example. (Requires writing a library to handle the new floats, obviously.)
The Top500 has the problem that many of the systems on there aren't supercomputers, they are clusters. Now clusters are all well and good. There's lots of shit clusters do well, and if your application is one of them then by all means build and use a cluster. However they aren't supercomputers. What makes supercomputers "super" is their unified memory. A real supercomputer has high speed interconnects that allow direct memory access (non-uniform with respect to time but still) by CPUs to all the memory in the system. This is needed in situations where you have calculations that are highly interdependent, like particle physics simulations.
So while you might find a $10,000,000 cluster gives you similar performance to a $50,000,000 supercomputer on Linpack, or other benchmark that is very distributed and doesn't rely on a lot of inter-node communication, you would find it falls flat when given certain tasks.
If we want to have a cluster rating as well that's cool, but a supercomputer benchmark should be better focused on the tasks that make owning an actual supercomputer worth it. They are out there, that's why people continue to buy them.
Even with a hypothetical hyper-fast network, 100,000 Android phones won't get you anywhere near the top of the list.
Heck, even 100,000 Nehalem (Core i7) cores won't get you in the top 5.
So, Android phones? You'll need millions of them.
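A back-of-the-envelope sketch of that count, for anyone who wants to plug in their own numbers. Everything here is an assumption: the target is a round 2.5 PFLOPS (roughly where the top of the list sat at the time), the per-phone figure of 10 MFLOPS sustained is a guess rather than a measurement, and the efficiency values are picked out of the air.

```python
import math


def devices_needed(target_flops, per_device_flops, parallel_efficiency):
    """Number of devices needed to sustain target_flops at a given efficiency."""
    if not 0 < parallel_efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    if per_device_flops <= 0:
        raise ValueError("per-device rate must be positive")
    return math.ceil(target_flops / (per_device_flops * parallel_efficiency))


TARGET_FLOPS = 2.5e15   # ASSUMPTION: round figure near the top of the list
PHONE_FLOPS = 10e6      # ASSUMPTION: 10 MFLOPS sustained per phone (a guess)

# Efficiency values are illustrative; a phone network would likely be far worse.
for efficiency in (1.0, 0.5, 0.1):
    count = devices_needed(TARGET_FLOPS, PHONE_FLOPS, efficiency)
    print(f"efficiency {efficiency:.0%}: {count:,} phones")
```

Whatever per-phone number you believe, the count scales inversely with it and with the efficiency, so a slow network hurts just as much as a slow chip.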
The Tianhe system utilizes GPUs to increase its general purpose computing power. The system is not designed to only perform embarrassingly parallel computations -- it is going to be used for actual scientific research. Not only that, but the system beat out the old #1 in raw petaflops by a significant margin while using 40% less power. This efficiency gain is huge within the HPC community. Also note that the student-built third-place Green500 system utilizes GPUs to achieve its efficiency (5 of the top 12 on the list actually use NVIDIA GPUs).
...is that *no* single-figure-of-merit benchmark is going to be worth anything. Sandia's "Graph 500" Johnny-come-lately isn't going to be any better than Linpack that way, and will just skew the results towards a different not-generally-useful architecture. A far better idea has been around for over five years: the HPC Challenge benchmark. It consists of seven different tests (Linpack is just one) which stress different aspects of system design. Anybody who knows anything about building big systems would identify some mix of these tests that best approximates their own workload, use that as a starting point for looking at likely alternatives, and then remember that it's just a starting point. The only benchmark that really matters is the one that you run yourself on your own application, but that can be a very expensive and time-consuming exercise so these lists can be a good way to figure out which systems deserve that more extended analysis. Linpack, on the other hand, isn't even useful for that. What's sad is that some people either didn't know (which says something about how we train engineers) or didn't care until a Chinese system found its way to the top (which says something even worse).
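To illustrate the "pick a mix of tests that approximates your workload" idea, here is a minimal sketch using a weighted geometric mean of per-test speedups. The test names, the candidate systems, the scores and the weights are all hypothetical, and the sketch assumes every test is higher-is-better (a latency test would have to be inverted first). This is not how HPC Challenge itself reports results; it is just one way a buyer might combine them.

```python
import math


def workload_score(results, baseline, weights):
    """Weighted geometric mean of per-test speedups over a baseline system."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    log_sum = 0.0
    for test, weight in weights.items():
        speedup = results[test] / baseline[test]   # > 1 means faster than baseline
        log_sum += weight * math.log(speedup)
    return math.exp(log_sum / total)


# HYPOTHETICAL scores, normalised so the baseline system is 1.0 everywhere.
baseline = {"hpl": 1.0, "stream": 1.0, "random_access": 1.0, "fft": 1.0}
system_a = {"hpl": 4.0, "stream": 1.0, "random_access": 1.0, "fft": 1.0}
system_b = {"hpl": 1.0, "stream": 2.0, "random_access": 2.0, "fft": 2.0}

# Two HYPOTHETICAL workloads: one that only cares about Linpack, one memory-bound.
linpack_only = {"hpl": 1.0}
memory_bound = {"hpl": 1.0, "stream": 3.0, "random_access": 3.0, "fft": 1.0}

for label, weights in (("linpack only", linpack_only), ("memory bound", memory_bound)):
    a = workload_score(system_a, baseline, weights)
    b = workload_score(system_b, baseline, weights)
    print(f"{label}: A={a:.2f} B={b:.2f}")
```

With these made-up numbers the ranking flips depending on the weights, which is the argument against any single figure of merit.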
Slashdot - News for Herds. Stuff that Splatters.
Good news, everyone! Our supercomputer OS only lost because it's buggy!
How can I believe you when you tell me what I don't want to hear?
In other supercomputer news, it turns out the Windows-based cluster that lost out to Linux stumbled because of a bug in Microsoft's software package.
As it should. That's not news; that's how the game is played. If your software is buggy, and those bugs drag your performance far enough down, you don't deserve a top500 spot.
If they fix their software, rerun the test, and perform better than Linux, then they will have won that battle (the battle for the top500 spot, not the battle for market share) fair and square.
Let q be a radix > 1. I am in ur base-q, killing 10 d00ds.
The HPC Challenge has been available for a long time now. It has never gotten real attention because people do want a single metric to rank computers and make a classification. If you really want to know if your computer is fit for a particular purpose, you can. Don't blame the Top500 for providing what the people want to see. As a side note, I don't quite see why using accelerators would be "gaming" the Top500. This is a very stupid statement: accelerators have a wide range of applications for real science, and it is not about getting the biggest number in HPL. This message was written from the premises of the SC meeting.
It's like taking qualifying for a race and assuming that's going to be how your winners line up.
It may, by some sheer chance work out that way, but more often than not, even if nobody crashes or has another mechanical problem, there is a lot that can go on to change matters. Fuel consumption, tire wear, changes to the track, they all have an impact.
Even the winner of the race doesn't necessarily have to have been the best performer throughout the race. Stranger things have happened.
You are allowed to use hardware-specific features and change the algorithm for this benchmark. That way, any optimization is used and innovation, as you call it, emerges. Besides, scalability *is* the most desired quality for a supercomputer that doesn't aim for space, power and cost... like the ones most likely to be in TOP500. You have Green500 for the other things you mentioned.
I rarely respond to comments. Also, don't ask for clarifications: a brain and Google are faster, believe me!
One could argue that there should be more "breadth" in the test, but Linpack scores are like IQ scores: Both are just one particular interpretation of what it means to be weak/strong in a particular field. In other words, we *know* that there is really no one single measure of how either should be judged. Specialized processors which may not fare too well on Linpack may do a great job at other tasks, etc.
So why Linpack? I would argue that large scale linear algebra is still the bread and butter of supercomputing. Yes, you can use "supercomputers" for things other than, say, crunching linear(ized) differential equations to model nuclear explosions, but that wouldn't be the primary purpose of most of these systems.
There have been a number of attempts to come up with a replacement for LINPACK, some of which have gained traction in the industry. These deal with the fact that computer performance is multi-dimensional. Some deal with different technical metrics (memory bandwidth, disk bandwidth, memory latency), and others by measuring performance with applications or application kernels.
The problem, though, is political, not technical. It's a lot easier for a decision maker - often without a technical background - to deal with a single number and ranking, than it is to understand and deal with the complexities and tradeoffs that come with the more accurate multidimensional solutions. And since many supercomputer purchases fall into the political realm (in terms of congressional earmarks to pay for them, or national programs to improve "competitiveness"), you need something that your congressman or premier can handle easily. It's sad to see how many supercomputer procurements include LINPACK, and that some even ask for an estimate of where the new system will end up in the TOP500 list.
It'd be nice if this weren't the case, but I don't see anything that is likely to change it....
Yes, noticed that.
Here's the actual benchmark used for Top500: "HPL - A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers". It solves linear equations spread across a cluster. The clustered machines have to communicate at a high rate, using MPI 1.1 message passing, to run this program. See this discussion of how the algorithm is parallelized. You can't run this on a set of machines that don't talk much, like "Folding@home" or like cryptanalysis problems.
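For anyone who has never looked at what the benchmark measures, here is a toy single-machine sketch of the idea: solve a dense random system, time it, and divide a nominal operation count by the elapsed time. This is NOT HPL (no MPI, no blocking, no distribution across nodes), the function names are my own, and the operation count of 2/3 n^3 + 2 n^2 is the figure commonly quoted for Linpack, used here as an assumption.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (inputs are copied)."""
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: move the largest entry in column k onto the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_style_mflops(n, seed=0):
    """Time one dense solve of size n; return (MFLOPS, max residual)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    # Check the answer: a fast wrong answer should not count.
    residual = max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2   # ASSUMED nominal operation count
    return flops / elapsed / 1e6, residual


if __name__ == "__main__":
    rate, residual = linpack_style_mflops(120)
    print(f"{rate:.1f} MFLOPS, max residual {residual:.2e}")
```

The elimination loop is where the communication comes from in the distributed version: every step needs the pivot row, so all nodes have to keep talking to each other.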
Linpack is a reasonable approximation of computational fluid dynamics and structural analysis performance. Those are problems that are broken up into cells, with communication between machines about what's happening at the cell boundaries. Those are also the problems for which governments spend money on supercomputers. (The private market for supercomputers is very small.)
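A small sketch of the "cells with communication at the boundaries" pattern, simulated in a single process. It splits a 1D diffusion problem into chunks that swap only their edge values each step, then checks the result against the unsplit run. The function names and the message counter are my own illustration; a real code would use MPI and 2D or 3D cells.

```python
def step(u, left, right, alpha=0.25):
    """One explicit diffusion step on a chunk, given its two neighbouring boundary values."""
    padded = [left] + u + [right]
    return [
        padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def run_single(u, steps):
    """Reference run: the whole domain on one 'machine', zero at both ends."""
    for _ in range(steps):
        u = step(u, 0.0, 0.0)
    return u


def run_decomposed(u, steps, parts):
    """Same computation, split into chunks that exchange only their edge cells."""
    size = len(u) // parts
    chunks = [u[i * size:(i + 1) * size] for i in range(parts - 1)]
    chunks.append(u[(parts - 1) * size:])
    messages = 0
    for _ in range(steps):
        # "Communication" phase: each chunk receives one value from each neighbour.
        halos = []
        for r in range(parts):
            left = chunks[r - 1][-1] if r > 0 else 0.0
            right = chunks[r + 1][0] if r < parts - 1 else 0.0
            messages += (r > 0) + (r < parts - 1)
            halos.append((left, right))
        # "Computation" phase: every chunk updates independently of the others.
        chunks = [step(c, l, rt) for c, (l, rt) in zip(chunks, halos)]
    return [v for c in chunks for v in c], messages


initial = [0.0] * 16
initial[8] = 1.0   # a single hot cell in the middle
result, messages = run_decomposed(initial, steps=10, parts=4)
print("matches single-domain run:", result == run_single(initial, 10))
print("boundary messages exchanged:", messages)
```

The message count grows with the number of steps and chunks, not with chunk size, which is why these problems need a fast interconnect but still scale.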
So, quit whining. China built the biggest one. Why not? They have more cash right now.
Superpack = number of years taken to compile a full Gentoo branch * number of hours taken to render slashdot home page
Still can't play Crysis!!! Literally...
I will not be pushed, filed, stamped, indexed, briefed, debriefed or numbered. My life is my own.
Speed is one thing, but how about normalizing the list by how well its owners are utilizing those transistors?
....but are we really doing anything constructive with them at all?
It kinda seems like it's just the US and China partaking in some international dick size contest, while the real winners are the companies given beefy government contracts to build the computers in both countries: IBM et al.
You have to understand these cultures.
Most other places, exams are a measure of how well you know the stuff. At least they try to be. You are a certified something, which requires XXX years of practical experience and YY years of study. So once you have the experience and years of study, you take the exam, clear it, and therefore would be considered whatever it signifies.
In India and China, exams are seen as **The Goal**. Never mind the experience. Never mind the study. Just pass the exam before you have EVEN one day of practical experience. Mug up the question bank. Cheat in the exam. Bribe the examiner. Whatever. Just pass the exam, get the credential, and then expect to be the same as the people with the required experience etc.
This is how they try to measure up to the west.
So, got to make the numbers with this supercomputing measure. Say there's a way to tweak Linpack to do it. Just do it and make the numbers. Get the numbers, publish them. Does not matter if your computer cannot do anything else. It passed the exam!
(For India - Does not matter if your software developer has not coded a line till now - He has passed the Java certification exams, and therefore must be good!)
And so it goes. Nothing new here.
The single largest expense, over the lifetime of a supercomputer, is power consumption.
"To those who are overly cautious, everything is impossible. "
To quote the original article: "Because Tsubame uses a KVM hypervisor and various cloud-like provisioning tools, it can run both Windows and Linux at the same time on different nodes, and offer users various types of processing configurations." As one of the commenters on the original article pointed out, KVM is virtual machine software designed for Linux, so is this benchmark comparing the performance of Linux and Windows virtual machines (running on a Linux host), rather than comparing the performance of Linux and Windows directly? Or is this comment relevant only in the sense that Tsubame is currently running KVM and totally irrelevant in the context of the testing performed?
Too bad Microsoft isn't the one bitching, and too bad Windows beat NeckBeard Linux in performance on smaller loads and just failed on the larger load due to a (fixed) bug.
>Fortunately, Sandia National Laboratories is heading an effort to develop a new set of benchmarks...
Bummer for you, Sandia. NASA already did that with the NAS Parallel Benchmarks. Here's a hint: you're funded by the US Government (just like NASA), and NPB died when the Japanese started kicking US butt on NPB.
Seriously? This is what is keeping folks awake at night?
This is just as pointless as People Magazine's "500 Sexiest People List" -- but with a lot less cleavage. //TB
>But the Benchmark is in FLOPs which uses FLUs.(Like ALUs)
So if they have problems then the only ones to blame are the Designers of the computer.
A FLOP, is a FLOP, is a FLOP.
The possible problems are:
A. They aren't using the Math Libraries correctly.
B. They aren't using the CPU's or GPU's FLU correctly.
C. They configured the system badly. (E.g. Networking, etc.)
If this phone cluster actually did outperform the fastest supercomputers, why would that make the benchmark stupid?
I mean, the idea of using 100,000 smartphones might be stupid when examined pragmatically, but I don't see how that affects the validity of the performance.
And nobody, so far, is claiming that using GPUs is an inherently stupid idea for any reason, so that should have no bearing on the Tianhe's victory.
When a foreign computer wins, the benchmark needs to be changed? Now that is gaming the system, American style.
" one high-performance-computing vendor executive joked about stringing together 100,000 Android smartphones to get the largest Linpack number"
Would take a few hundred thousand.
is talking about how "Green" your supercomputer is. So there's a problem with pretty much the entire discussion.
That is incorrect. It was not in the software designed to do the rating. It was in the Microsoft designed software that ran on their OS, and that was rated. If I need to perform a task, and I pay a company to provide the hardware and software to achieve that task, I don't really care if it was a bug in their OS or their application that reduces my productivity.
It is true that the bug wasn't in Windows OS, but it was still a bug in the supercomputer implementation, and therefore a Microsoft flaw in the supercomputers performance, and not a flaw in the benchmarking tools or methodology.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Who manages to get Linpack to run anyhow? You can install it from yum on CentOS 5, but how do you actually RUN it? I ask seriously, as someone whose employer would be in the Top500 by my calculations if I could figure out how to run Linpack!
The Chinese computer with GPUs works with around 40% efficiency because of communication overhead and wasted SIMD ways, whereas most winners historically worked with around 90% (The inefficiency of 100k handsets would be even higher than the lameness of the joke). The silicon power-wall is the limit in scaling of the supercomputers that are built with GPUs.
So, giving examples on GPUs and handsets is not a good way to promote the use of other benchmarks because even this simple benchmark is enough to fail them (assuming your concern is power in 2010 and existence in 2020). However, it would be a motivation to change this benchmark if there was proof that it is not a representative set for the use of these machines.
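The efficiency figure being argued about here is sustained Linpack performance (Rmax on the list) divided by theoretical peak (Rpeak). A quick sketch of what the gap costs you, using the 40% and 90% figures from this comment as illustrative inputs and a made-up 1 PFLOPS target; the function names are my own.

```python
def linpack_efficiency(rmax, rpeak):
    """Fraction of theoretical peak sustained on Linpack (same units for both)."""
    if rpeak <= 0 or rmax < 0 or rmax > rpeak:
        raise ValueError("expected 0 <= rmax <= rpeak and rpeak > 0")
    return rmax / rpeak


def peak_needed(target_rmax, efficiency):
    """Theoretical peak you have to buy (and power) to sustain target_rmax."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return target_rmax / efficiency


TARGET_PFLOPS = 1.0   # HYPOTHETICAL sustained target
for efficiency in (0.4, 0.9):
    needed = peak_needed(TARGET_PFLOPS, efficiency)
    print(f"at {efficiency:.0%} efficiency: {needed:.2f} PFLOPS of peak hardware")
```

So at the lower figure you are paying for more than twice the silicon to get the same sustained number, which only works out if the GPU flops are cheap enough in dollars and watts.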
> one high-performance-computing vendor executive joked about stringing together 100,000 Android smartphones to get...
Now that's not entirely fair. I just looked at an Android smartphone at Best Buy, and the sales rep assured me that I should be buying the bigger, more expensive one, rather than the one I could fit in my pocket, because it was faster. It even has a snapdragon processor! (whatever that is) Surely it can't be that far away from a true supercomputer.
Clearly our computers are preparing themselves for the day they capture us all, and use our body heat for power while we live in some kind of virtual world, simulated by repeatedly solving some kind of large, dense linear equation system.
The reason they use GPUs is to run the simulation tasks much more efficiently. You have a good compiler that can split up the workload - I suspect something like an extended FORTRAN.
I think that somebody should develop a distributed computing project that works on cell / handheld devices. The biggest issue of course is battery life... but, make it so that it can (read "only") run while plugged in and that shouldn't be an issue. Make the problems a little less complex so you wouldn't have to worry about them never being completed and I think you'd get a pretty decent turn out to download.
should be mandatory to be allowed to speak. "Stringing together 100,000 Android smartphones to get the largest Linpack number" is a stupidity. You'll have incredibly ridiculous linpack results. Computer science is a science ...
I find this complaining a bit stupid. Linpack is not a benchmark, but a linear algebra package that is used in many computer simulations. So "cheating" to improve your Linpack rating actually means your simulation will run faster.
When you consider how important times & speeds are to this sport, it was pretty revolutionary (and progressive) of them to do this. Maybe it is just time for the Supercomputer list people to do the same thing.
I come here for the love