The Problem With the Top500 Supercomputer List

Posted by Soulskill on Friday November 19, 2010 @05:20AM from the nobody-cares-about-the-bottom-490-or-so dept.

angry tapir writes "The Top500 list of supercomputers is dutifully watched by high-performance computing participants and observers, even as they vocally doubt its fidelity to excellence. Many question the use of a single metric — Linpack — to rank the performance of something as mind-bogglingly complex as a supercomputer. During a panel at the SC2010 conference this week in New Orleans, one high-performance-computing vendor executive joked about stringing together 100,000 Android smartphones to get the largest Linpack number, thereby revealing the 'stupidity' of Linpack. While grumbling about Linpack is nothing new, the discontent was pronounced this year as more systems, such as the Tianhe-1A, used GPUs to boost Linpack ratings, in effect gaming the Top500 list." Fortunately, Sandia National Laboratories is heading an effort to develop a new set of benchmarks. In other supercomputer news, it turns out the Windows-based cluster that lost out to Linux stumbled because of a bug in Microsoft's software package. Several readers have also pointed out that IBM's Blue Gene/Q has taken the top spot in the Green500 for energy efficient supercomputing, while a team of students built the third-place system.

Quelle surprise! by Elbart · 2010-11-19 05:25 · Score: 4, Insightful

Now that the Chinese are ahead, there's suddenly a problem with the list/benchmark.
1. Re:Quelle surprise! by Macman408 · 2010-11-19 06:21 · Score: 4, Insightful
  
  +1.
  There's nothing that's inherently "cheating" about using GPUs in a supercomputer. If your problem maps well to the hardware they have (and many large scientific and engineering workloads do), they can provide a huge speedup at a relatively low cost and relatively high performance per watt. After all, a GPU's floating-point throughput can be around 20 times that of a CPU; GPUs are designed to do many things all at once (high throughput, high latency), while a CPU is designed to do one thing really really fast (lower throughput, lower latency). Recently, with multicore CPUs, the extra cores add performance very similarly to how a GPU would. If having a GPU is cheating, I'd surmise that having a multi-core CPU is cheating too.
  It is true that LINPACK doesn't measure everything - it doesn't put a heavy stress on the interconnect, for example. Though if your problem is compute bound, you'd probably do well to find a way to minimize interconnect use to begin with. In any case, LINPACK measures *something* - it's a place to start comparing speeds, not the absolute truth of who will always be the fastest.
  Besides, what's so important about the Top500? It gives somebody bragging rights for 6 months, or, if you're very lucky, a year or two, before something bigger comes along and squishes you. Not to mention, there are many supercomputers not on the list. If the NSA builds the world's largest supercomputer, they're probably not going to brag about how much compute power they have. They prefer the mystery, so outsiders have no idea what is within the realm of possibilities for them. I was at Cray once, and was told that they sometimes sell supercomputers into such secretive areas. The government (or whoever) will send a few guys to get trained about the computer, then it gets loaded onto trucks, and Cray never hears a thing about the computer ever again. No support calls, no upgrades, no idea where it even went to or what it's used for.
2. Re:Quelle surprise! by inhuman_4 · 2010-11-19 06:54 · Score: 5, Insightful
  
  The Linpack complaining has been going on for years. I remember this coming up with the NEC earth simulator, and other ASIC based systems.
  
  Here are some interesting numbers:
  AMD Radeon HD 4870X2 ~2.4 teraFLOPS
  Intel Core i7 980 XE ~107.55 gigaFLOPS
  
  According to this the AMD is 20x faster than the Intel; and this is true, but only in some cases. If what is needed is graphics processing, the AMD will crush the Intel. But if you need anything else (I am ignoring GPGPU for simplification) the AMD doesn't just lose, it doesn't run. This is a problem for all ASIC based systems; GPU ones are just the newest to come out.
  
  So this new Chinese supercomputer (and other ASIC based supercomputers) scores very high in Linpack because the ASICs are designed to be good at this type of task. This makes for a non-general purpose, but very cost effective solution.
  
  But this then means that a supercomputer that cannot use GPUs for its intended task scores very low because it is a general purpose machine. Because the Top500 is based on one benchmark (Linpack) you end up with a car to pickup-truck comparison; sure the car is faster, but what about towing capacity?
  
  The end result is the supercomputer analog of the megahertz myth: people like bigger numbers. A high score proves that it is faster at some things, but not that it is faster in general.
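As a quick check of the "20x" figure, here is a minimal sketch of the arithmetic. It uses only the two peak numbers quoted in the comment above (taken at face value, not independently measured); the variable names are illustrative.

```python
# Peak figures as quoted in the comment above (not independently measured).
gpu_flops = 2.4e12      # AMD Radeon HD 4870X2, ~2.4 teraFLOPS
cpu_flops = 107.55e9    # Intel Core i7 980 XE, ~107.55 gigaFLOPS

# Ratio of theoretical peaks; says nothing about real application speed.
ratio = gpu_flops / cpu_flops
print(f"GPU/CPU peak ratio: {ratio:.1f}x")
```

The ratio comes out at roughly 22x, consistent with the rounded "20x". It compares theoretical peaks only, which is exactly the limitation the comment goes on to describe.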
3. Re:Quelle surprise! by natet · 2010-11-19 07:16 · Score: 4, Insightful
  
  Agreed. It seems like the issue is "big enough" only now that other people are catching up.
  I call bullsh*t on this comment. Around 8 years ago, the top computer on the list was a Japanese machine, and it rode atop the list for 3 years straight. Those of us who have worked in high performance computing have known for years that the top 500 list was a load of crap. It's something to write a press release about so that the people that give us money to build the big computers feel like their money is well spent. I worked on a top 5 computer at one time, but our focus was always the science that we wanted to do on the computer. Running the linpack benchmark for the top 500 list was an afterthought (though it was a pleasant surprise to score as well as we did).
  
  --
  IANAL... But I play one on /.
4. Re:Quelle surprise! by KingMotley · 2010-11-19 08:05 · Score: 3, Informative
  
  That makes a nice headline, but everything the article is based on has been proven to be untrue and sensationalist. My 8 year old son, when he lost, used to also accuse others of cheating as well. Usually he was wrong as well, but I didn't take his word for it and then try to pass off a news article on it.
5. Re:Quelle surprise! by jgagnon · 2010-11-19 08:41 · Score: 3, Funny
  
  Japan isn't scary.
  Have you SEEN the kinds of porn that comes out of Japan???
  
  --
  Remember to maintain your supply of /facepalm oil to prevent chafing.
6. Re:Quelle surprise! by Profane+MuthaFucka · 2010-11-19 14:31 · Score: 2, Informative
  
  Blurry is frustrating, not relaxing. Unless you're talking about the relaxing man-made waterfalls of semen.
  What I would like to have is some Japanese porn where the actresses don't sound like a cat was set on fire. What the fuck is wrong with these people that they make those kinds of sex noises?
  
  --
  Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
Missing the Point by Lev13than · 2010-11-19 05:30 · Score: 4, Insightful

As the article alludes, the big problem with ranking supercomputers via Linpack is that it doesn't advance supercomputer design. The net result is a pissing match over scalability, where winning is dependent upon who can cram the most cores into a single room. The real innovators should be recognized for their efforts to reduce space, power and cost, or finding new algorithms to crunch the numbers in more efficient or useful ways.

--
When you have nothing left to burn you must set yourself on fire
1. Re:Missing the Point by timeOday · 2010-11-19 06:24 · Score: 4, Insightful
  
  The real innovators should be recognized for their efforts to reduce space, power and cost, or finding new algorithms to crunch the numbers in more efficient or useful ways.
  And NHRA should start awarding drag-racing championships on fuel efficiency rather than quarter-mile times.
  Look, the Top500 is about performance, as in speed. There are other metrics for flops/watt or flops/dollar, or whatever. If those were the lists that managed to draw competitors and eyeballs, then nobody would care about Top500 and we wouldn't have to quibble about whether Linpack is a representative benchmark of what it claims to measure: speed.
2. Re:Missing the Point by icebraining · 2010-11-19 07:09 · Score: 3, Funny
  
  Even if it is just a bunch of android phones hooked together. Actually that would be even sweeter. Please someone do it.
  Will an Arduino cluster do?
  
  --
  Dilbert RSS feed
New Benchmark by Monkeedude1212 · 2010-11-19 05:31 · Score: 2, Funny

int i = 0;
while(i infinite)
{
i++;
}
---
Whatever computer finishes first is clearly the fastest supercomputer.
1. Re:New Benchmark by Monkeedude1212 · 2010-11-19 05:32 · Score: 2, Informative
  
  Right. (Less than symbol didn't show up because I didn't choose plain text! Derr)
2. Re:New Benchmark by atmtarzy · 2010-11-19 05:42 · Score: 2, Funny
  
  We all know Linux is great... it does infinite loops in 5 seconds.
  - Linus Torvalds about the superiority of Linux at the Amsterdam Linux Symposium
3. Re:New Benchmark by KarmaMB84 · 2010-11-19 05:51 · Score: 4, Interesting
  
  Also, I have seen cases where compiler optimization is smart enough to remove the entire loop if there are no side effects to incrementing i, and it's not used outside the loop.
  Most compilers should be doing this. Hell, even IE9 is supposed to do it for JavaScript now. It gets great scores on SunSpider because of it (the JIT can throw away entire tests).
4. Re:New Benchmark by Anonymous Coward · 2010-11-19 08:24 · Score: 2, Informative
  
  Also, I have seen cases where compiler optimization is smart enough to remove the entire loop if there are no side effects to incrementing i, and it's not used outside the loop.
  Most compilers should be doing this. Hell, even IE9 is supposed to do it for JavaScript now. It gets great scores on SunSpider because of it (the JIT can throw away entire tests).
  Hey mods, I think the parent was going for funny.
  The IE9 JavaScript "optimization" is completely invalid because it eliminates branches that have side effects hidden by valueOf().
  For example, you can define a global valueOf() immediately prior to calling the function. If the branch was previously eliminated by an optimizer, then the valueOf() calls will not be made; therefore the final result of the computation will be incorrect.
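The side-effect argument can be illustrated with a rough Python analogue. This is a sketch under stated assumptions, not a reproduction of IE9's behavior: `__int__` stands in for a JavaScript `valueOf()` override, and `Noisy`, `run_loop` and `run_eliminated` are hypothetical names invented for the example.

```python
class Noisy:
    """Value whose numeric conversion has an observable side effect."""

    def __init__(self, value):
        self.value = value
        self.conversions = 0

    def __int__(self):
        # Stand-in for a valueOf() override: converting counts as a side effect.
        self.conversions += 1
        return self.value


def run_loop(step, iterations):
    """The loop as written: converts `step` on every iteration."""
    total = 0
    for _ in range(iterations):
        total += int(step)
    return total


def run_eliminated(step, iterations):
    """What an over-eager optimizer would leave behind: no work at all."""
    return None


kept = Noisy(1)
run_loop(kept, 5)

dropped = Noisy(1)
run_eliminated(dropped, 5)

# The two versions leave different observable state behind.
print(kept.conversions, dropped.conversions)
```

Even though the loop's return value is discarded in both cases, the program state afterwards differs. Removing the loop is therefore only safe when the optimizer can prove the conversion has no side effects.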
Mind-bogglingly complex by Yvan256 · 2010-11-19 05:43 · Score: 5, Funny

The guide has this to say about supercomputers: "Supercomputers," it says, "are big. Really big. You just won't believe how vastly, hugely, mindbogglingly big they are. I mean, you may think your SGI Challenge DM is big, but that's just peanuts to supercomputers, listen..."
Re:LINPACK isn't so bad by KenSeymour · 2010-11-19 05:49 · Score: 2, Interesting

I wonder why they don't use EISPACK?
That is for solving Eigen systems.
I remember in the early 1980's writing a program to check my linear algebra homework using Fortran and EISPACK.
This is why I love the fact that Bender likes to drink "Old Fortran" malt liquor.
I have to admit I don't know much about benchmarking but I remember using LINPACK and EISPACK on the VAX and later the Cray YMP.

--
"We can't solve problems by using the same kind of thinking we used when we created them." -- Albert Einstein
Good to hear by Sycraft-fu · 2010-11-19 05:50 · Score: 5, Informative

The Top500 has the problem that many of the systems on there aren't supercomputers, they are clusters. Now clusters are all well and good. There's lots of shit clusters do well, and if your application is one of them then by all means build and use a cluster. However they aren't supercomputers. What makes supercomputers "super" is their unified memory. A real supercomputer has high speed interconnects that allow direct memory access (non-uniform with respect to time but still) by CPUs to all the memory in the system. This is needed in situations where you have calculations that are highly interdependent, like particle physics simulations.
So while you might find a $10,000,000 cluster gives you similar performance to a $50,000,000 supercomputer on Linpack, or other benchmark that is very distributed and doesn't rely on a lot of inter-node communication, you would find it falls flat when given certain tasks.
If we want to have a cluster rating as well that's cool, but a supercomputer benchmark should be better focused on the tasks that make owning an actual supercomputer worth it. They are out there, that's why people continue to buy them.
1. Re:Good to hear by timeOday · 2010-11-19 06:36 · Score: 3, Insightful
  
  OK, so you think only algorithms requiring uniform memory access are valid benchmarks. How uniform does it have to be? Real world problems do have structure, they do have locality, and an architecture that fails to exploit that is going to lose out to those that do.
  Sure, your point is taken, otherwise you could say "my benchmark is IOPS" and "my computer is every computer in the world" and win. But Linpack is not that; you can't score well without a fast interconnect, because what it measures is a set of computations that are actually useful. (Which is why the quip about a beowulf cluster of Android smartphones is stupid... because it couldn't actually be done. Go ahead and try to get on Top500 with a seti@home-type setup.)
Heard at Microsoft headquarters: by pushing-robot · 2010-11-19 05:53 · Score: 3, Insightful

Good news, everyone! Our supercomputer OS only lost because it's buggy!

--
How can I believe you when you tell me what I don't want to hear?
1. Re:Heard at Microsoft headquarters: by MightyMartian · 2010-11-19 06:04 · Score: 2, Interesting
  
  Good news, everyone! Our supercomputer OS only lost because it's buggy!
  Leela: How is that good news, Professor?
  Professor Farnsworth: I still charge enough per seat to be feared.
  
  --
  The world's burning. Moped Jesus spotted on I50. Details at 11.
Well, there's a non-notable point! by icannotthinkofaname · 2010-11-19 05:54 · Score: 4, Insightful

In other supercomputer news, it turns out the Windows-based cluster that lost out to Linux stumbled because of a bug in Microsoft's software package.
As it should. That's not news; that's how the game is played. If your software is buggy, and those bugs drag your performance far enough down, you don't deserve a top500 spot.
If they fix their software, rerun the test, and perform better than Linux, then they will have won that battle (the battle for the top500 spot, not the battle for market share) fair and square.

--
Let q be a radix > 1. I am in ur base-q, killing 10 d00ds.
1. Re:Well, there's a non-notable point! by gerrywastaken · 2010-11-19 07:29 · Score: 2, Informative
  
  In this case it was the benchmark software that was buggy, not the OS.
  Yeah that's right, the LINPACK benchmark software that Microsoft strangely got to rewrite themselves. Yep that's apparently where the bug was.
  I wonder why MS was given the task of rewriting the benchmark in the first place. I guess it will always be a mystery... oh hold on, no wait... "It should be noted that Microsoft has helped fund the Tokyo Institute of Technology's supercomputing programs." Guess it helps to read the sentences they stick near the end of unrelated paragraphs at the end of the articles.
Re:Why is being on the Top500 important? by SnarfQuest · 2010-11-19 05:59 · Score: 2, Insightful

I have always wondered why being on the Top500 list of supercomputers is that important for those on the list.
I always choose my supercomputers from the Bottom500 list.

I will be better served by being told the advantage(s) or edge those who've been on that list have gotten since they got onto the list. Thanks.
At the price level these things cost, you can probably list your own requirements instead of accepting the vendor's.
If you are purchasing a SuperComputer, you are looking for something to do raw number crunching. You aren't worrying about how well it will run MicroSoft Word, or how many frames/second you'll get out of doom.

--
Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
You missed the point too, btw by elsurexiste · 2010-11-19 06:11 · Score: 3, Informative

You are allowed to use hardware-specific features and change the algorithm for this benchmark. That way, any optimization is used and innovation, as you call it, emerges. Besides, scalability *is* the most desired quality for a supercomputer that doesn't aim for space, power and cost... like the ones most likely to be in TOP500. You have Green500 for the other things you mentioned.

--
I rarely respond to comments. Also, don't ask for clarifications: a brain and Google are faster, believe me!
The actual benchmark does stress interconnects by Animats · 2010-11-19 06:14 · Score: 4, Informative

Yes, noticed that.
Here's the actual benchmark used for Top500: "HPL - A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers". It solves linear equations spread across a cluster. The clustered machines have to communicate at a high rate, using MPI 1.1 message passing, to run this program. See this discussion of how the algorithm is parallelized. You can't run this on a set of machines that don't talk much, like "Folding@home" or like cryptanalysis problems.
Linpack is a reasonable approximation of computational fluid dynamics and structural analysis performance. Those are problems that are broken up into cells, with communication between machines about what's happening at the cell boundaries. Those are also the problems for which governments spend money on supercomputers. (The private market for supercomputers is very small.)
So, quit whining. China built the biggest one. Why not? They have more cash right now.
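For readers who have not seen what a Linpack-style measurement involves, here is a toy single-machine sketch: solve a dense linear system, then divide an operation count by the wall time. It is not HPL (there is no MPI and no distribution across nodes). The names `solve_dense`, `leading_flops` and `measure` are hypothetical, the test matrix is an arbitrary diagonally dominant one chosen for the example, and only the leading-order (2/3)n^3 term of the operation count is used.

```python
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the caller's data is untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def leading_flops(n):
    """Leading-order operation count for an n x n dense solve."""
    return (2.0 / 3.0) * n ** 3


def measure(n):
    """Time one solve of an arbitrary n x n system; return flop/s."""
    # Diagonally dominant matrix, so the solve is well behaved.
    a = [[1.0 / (1 + abs(i - j)) + (n if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    b = [1.0] * n
    start = time.perf_counter()
    solve_dense(a, b)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return leading_flops(n) / elapsed


print(solve_dense([[2, 1], [1, 3]], [3, 5]))
print(f"{measure(60) / 1e6:.2f} Mflop/s (pure Python, one core)")
```

On a cluster the matrix is spread across nodes, so each elimination step needs data that lives on other machines. That is the communication the comment above refers to, and this sketch leaves it out entirely.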
Re:Why is being on the Top500 important? by 1729 · 2010-11-19 07:54 · Score: 4, Interesting

The advantage is that, contrary to the arguments of TFA, the test is very representative of scientific and engineering problems.
No, it really isn't. I work in HPC at a national lab, and our bureaucrats buy these computers based on these benchmark numbers and then expect us to adapt our codes to fit these machines, rather than buying machines that are better suited to the problems we are solving. For example, one of our machines peaked at #2 on the Top500 list, and was essentially useless for real codes. Another machine of ours held the #1 spot for quite a while, and worked well for a small class of problems, but was so limited in functionality that it couldn't even run many of our codes. I've heard similar stories from people using other machines near the top of the Top500.
Real science codes often do not look anything like LINPACK, and the computers that run these benchmarks fast aren't necessarily good for true HPC.
Re:Why is being on the Top500 important? by Jeremy+Erwin · 2010-11-19 08:50 · Score: 2, Insightful

And that's why Top500 should use another benchmark. If the beancounters use Top500 to allocate resources, and the supercomputing companies use the beancounters' allocations to determine the future direction of their products, the scientists lose out. It's not so much that Tianhe-1 gamed the benchmark, it's that this gaming could lead to a machine that's not very useful.
Re:Why is being on the Top500 important? by 1729 · 2010-11-19 14:10 · Score: 2, Informative

Codes? Is that what internets are programmed with?
I know you're just being sarcastic, but the HPC equivalent of a program or an application is a code. Google "hpc codes" or "multiphysics codes" for some examples. And for some trivia: the input script for a code is typically called a deck, a term that's been around since the days when the input was handed over to computer operators as decks of punch cards.