Big Mac Officially Ranks 3rd
An anonymous reader noted that according to Wired, it will be announced officially on Monday that the Big Mac supercomputer is the third-fastest supercomputer. The article also talks about some of the amazing supercomputers in the planning stages. The sort of stuff that will make Big Mac look like that old TI-85 collecting dust in your drawer.
...you insensitive clod!
It has to be said that Macs haven't been famous for their speed, always pushing the "it does more" or "there are 2 procs" arguments, but this gives them some serious ammunition. Perhaps they'll even get their advert on the air in the UK now :-)
Simon
Physicists get Hadrons!
Clicky for the official November list
Excuse me, but Big Mac is a registered trademark of McDonald's according to http://mcdonalds.com/legal/index.html
So what's going on here? Can they actually do that?
Given the basic benchmarks used to rank supercomputers, could a cluster of loosely coupled machines compete, or are the bandwidth demands of the benchmark set too high? I'm just curious how projects like those detailed at distributed.net compare: 1100 dual-processor Macs would be vastly outnumbered by the hundreds of thousands (or millions) of PCs taking part in distributed processing for various code-cracking or cancer-curing purposes.
Can the Big Mac play games like the WOPR?
How about a nice game of chess?
Friends help you move. Real friends help you move bodies.
That said, for what is provided, the Earth Simulator seems to be the current king by about 2x. (Corrections appreciated.)
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
The Top500 site lists two competing 64-bit clusters: the Integrity rx2600, with 1936 Itanium2 processors at 1.5 GHz (must be pricey), and a 2816-processor Opteron 2 GHz cluster that achieves only about three fourths of Big Mac's performance. Now that's a defeat for AMD.
Also, the Virginia Tech cluster is the only "self-made" supercomputer in the Top50 (the next one is ranked 63rd, based on SunFire V60). The original #3 slipped to the 7th position because of the new supercomputers. Competition for that third place was tough!
Now where's the G5 XServe? It was supposed to be out when OS X Server 10.3 was released.
Maybe we deserve this world ?
The sort of stuff that will make Big Mac look like that old TI-85 collecting dust in your drawer.
Cluster a billion TI-85s together and then we'll see who's collecting dust.
The coolest voice ever.
There are some other interesting new additions to the top 500, based on semi-commodity hardware, right under VT's #3 slot.
BigMac is certainly impressive, but even if these systems can't quite match its scores, they deserve a mention.
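| Rank | Site | Country/Year | Computer | Configuration / Processors | Manufacturer | Rmax (GFlop/s) | Rpeak (GFlop/s) |
|------|------|--------------|----------|----------------------------|--------------|----------------|-----------------|
| 4 | NCSA | United States/2003 | Tungsten | PowerEdge 1750, P4 Xeon 3.06 GHz, Myrinet / 2500 | Dell | 9819 | 15300 |
| 5 | Pacific Northwest National Laboratory | United States/2003 | Mpp2 | Integrity rx2600 Itanium2 1.5 GHz, Quadrics / 1936 | HP | 8633 | 11616 |
| 6 | Los Alamos National Laboratory | United States/2003 | Lightning | Opteron 2 GHz, Myrinet / 2816 | Linux Networx | 8051 | 11264 |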
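For anyone who wants to see how much of their theoretical peak these three actually sustain, here is a rough Python sketch. It only divides the Rmax by the Rpeak quoted above; the dictionary labels and the `linpack_efficiency` helper are names made up for illustration.

```python
# Rmax and Rpeak in GFlop/s, copied from the #4, #5 and #6 entries quoted above.
# The labels are shorthand invented for this sketch.
systems = {
    "Tungsten (Xeon, Myrinet)": (9819, 15300),
    "Mpp2 (Itanium2, Quadrics)": (8633, 11616),
    "Lightning (Opteron, Myrinet)": (8051, 11264),
}


def linpack_efficiency(rmax, rpeak):
    """Fraction of theoretical peak that was sustained on LINPACK."""
    return rmax / rpeak


efficiencies = {
    name: linpack_efficiency(rmax, rpeak)
    for name, (rmax, rpeak) in systems.items()
}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1%} of peak")
```

By this measure the Itanium2 system gets closest to its peak and the Xeon system is furthest from it, though this says nothing about price or about codes other than LINPACK.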
"The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
Interconnect is very important.
This is nothing like distributed.net.
For a problem that can be broken into millions of discrete, independent chunks, sure, distributed.net's model is fantastic, and works really well... (seti, folding, distributed.net, etc)
For something where you need lots of feedback from nodes, (like these benchmarks, and lots of simulation work), bandwidth is everything.
I mean, I understand reasonably well the benchmarks used... but my question is this:
In the past, we always looked to the DoE or DoD for who had the fastest computers... they had stuff we could only dream of.. huge, fast clusters of funky computers we've never heard of.
Now, a university built one out of Macs... and it competes on the same benchmarks.
What I wonder is, are there applications the old-style supercomputers are still better at, or has technology simply advanced since then? (Things like 10gig ethernet and ghz processors and memory busses, etc)... have we simply surpassed them? Don't just feed me some line about I/O either....
I used to run an Intel-based supercomputer, but then one night, I was modelling a nuclear explosion on it, and all of a sudden it went berserk, the screen started flashing, and the model just disappeared. All of it. And it was a good model of a nuclear explosion! I had to cram and remodel it really quickly. Needless to say, my rushed model wasn't nearly as good, and I blame that Intel supercomputer for the fact that DARPA yanked our funding.
Now this system is the cheapest of the top 10. It's cheaper than many it beat by a factor of ten (more than that, considering some of the building infrastructure costs are in that figure). Even more interesting, these were stock Macs at full price, loaded with DVD-ROMs, FireWire, Bluetooth, the OS, etc., not some stripped-down model.
It's a good bet too that this thing is going to have lower maintenance costs and higher uptime given the Macs' attention to cooling, the use of high-quality hard drives and power supplies, and high-end memory chips. (On our cluster a tenth that size we blew 60 hard drives in the first 6 months and had to replace 10% of the motherboards.)
Some drink at the fountain of knowledge. Others just gargle.
Imagine a beowulf cluster of..... oh, wait.
"'I pass the test,' she said. 'I will diminish, and go into the West, and remain Galadriel.'"
- JRR Tolkien.
Hmm, guess this means my submission a couple hours ago won't go through (dangit, Wired!)...
Here is the official press release and the list.
There are a lot of good points to note all around. The first is that the G5 Terascale cluster at Virginia Tech at #3 (10.28 Tflop/s, 2200 CPUs, Infiniband) is the first academic computer to break 10 Tflop/s. This extra performance was promised at the Mac OS X Developer's conference last month. Not too sure if the price is a testament to Infiniband ($1.5 million for cabling, cards, and routers) or the Macs ($4.2 million list).
Good thing too, because in a surprise move the NCSA cluster made the list at #4 (9.82 Tflop/s, 2500 CPUs, Myrinet). This cluster is built using Dells running Pentium 4 Xeons and Red Hat Linux! One subtle point to note is that they didn't get all the systems online in time (there should be 2900 CPUs, not 2500). I bet some programmer at PSC and an ex-Chief Scientist of SDSC are appreciating having a hand in edging out NCSA for #3, not to mention Apple beating Dell for #3.
The fastest Itanium cluster is at #5 (8.63 Tflop/s, 1936 CPUs, Quadrics), which is looking like the odd man out, boxed in by PC-based systems using Myrinet: the P4 Xeon above and the most powerful Opteron system at #6 (8.05 Tflop/s, 2816 CPUs, Myrinet). Another point of similarity: did I mention it's also using Linux?
And finally, it's easy to overlook #73, a single compute node of BlueGene/L (1.44 Tflop/s, 1024 CPUs). Imagine 128 of these connected together and you have something that will easily take #1 when it's completed, even if we handicap it 20-40%. As noted on Slashdot earlier, this will be running Linux.
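To put numbers on the "128 of these" estimate, here is a back-of-the-envelope Python sketch. It assumes the score scales linearly with the number of units and then applies the handicap. The Earth Simulator figure (35.86 Tflop/s) is not from this post; it is assumed here as the current #1 score.

```python
# From the post: one BlueGene/L unit at #73, and 128 of them in the full system.
UNIT_TFLOPS = 1.44
UNITS = 128

# NOT from the post: assumed Rmax of the current #1, the Earth Simulator.
EARTH_SIMULATOR_TFLOPS = 35.86


def projected_tflops(handicap):
    """Linear scaling of one unit's score, reduced by a fractional handicap."""
    return UNIT_TFLOPS * UNITS * (1.0 - handicap)


# No handicap, then the 20% and 40% handicaps mentioned in the post.
projections = {h: projected_tflops(h) for h in (0.0, 0.2, 0.4)}

for handicap, tflops in projections.items():
    ratio = tflops / EARTH_SIMULATOR_TFLOPS
    print(f"handicap {handicap:.0%}: {tflops:.1f} Tflop/s ({ratio:.1f}x the assumed #1)")
```

Even at the pessimistic end the projection stays well above the assumed #1 score, which is all the original estimate claims.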
I miffed some of my links in the parent post:
Here is the reference to the ex-SDSC scientist.
Here is the link showing that the Opteron cluster is using Linux Networx.
Finally, in the interest of full disclosure and to pre-empt the anti-Mac zealots, I should mention that the $4.2 million for the G5 machines is probably the education list price, because when you go to the Apple Store, putting 2GB of RAM into 1100 2x2GHz G5s will cost you $4.4 million (+ a little more for having some spare machines).
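A quick Python sketch of what those totals mean per machine, plus a rough hardware cost per GFlop/s. The Infiniband cost and the 10.28 Tflop/s score come from the parent post. Treating Macs plus Infiniband as the whole hardware bill is a simplifying assumption, since building, cooling and spares are left out.

```python
MACHINES = 1100                 # dual-processor G5s
EDU_LIST_TOTAL = 4_200_000      # dollars, figure from the press coverage
STORE_TOTAL = 4_400_000         # dollars, Apple Store estimate from this post
INFINIBAND_TOTAL = 1_500_000    # dollars, from the parent post
RMAX_GFLOPS = 10_280            # 10.28 Tflop/s, from the parent post


def per_machine(total, machines=MACHINES):
    """Average price of one machine for a given total."""
    return total / machines


edu_per_machine = per_machine(EDU_LIST_TOTAL)
store_per_machine = per_machine(STORE_TOTAL)

# Assumption: Macs plus Infiniband is the entire hardware cost.
hardware_total = EDU_LIST_TOTAL + INFINIBAND_TOTAL
dollars_per_gflops = hardware_total / RMAX_GFLOPS

print(f"education list: ${edu_per_machine:,.0f} per machine")
print(f"Apple Store:    ${store_per_machine:,.0f} per machine")
print(f"hardware cost:  ${dollars_per_gflops:,.0f} per GFlop/s of Rmax")
```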
Well now we know where they are being stockpiled. ;)
Now to get on with the research. It's a credit to them that this computer got from the drawing board to fruition in the tiny amount of time that it did. It's raised the bar for price/performance in the research computing world, and hopefully many less wealthy institutions (I'm looking at UK universities especially here) will follow. At the end of the day it's about the research they put into it and the results they get out of it.
Well, to get the ball rolling, here is a query on the top 500 supercomputers using Microsoft Windows. Corrections and insight are appreciated.
Yes, it might make some fast scientific calculations really really fast, but I want to know how fast it does some real world stuff. Give me some Quake framerates, or Photoshop gaussian blur benches.
The G5 is a cool processor, but it isn't the reason the VT cluster is so fast; the Infiniband interconnect is. The LINPACK benchmark that is used to determine position on the Top 500 list depends very strongly on the latency of the network connection.
Infiniband has ~8-12 µs latency (probably even less by now), while Ethernet is an order of magnitude slower. In real-life applications it's actually worse than this suggests.
We have tested a real-life application (socorro) using both gigabit ethernet and Myrinet (slightly slower than Infiniband), and gigE took 600 seconds to finish a run, while Myrinet took 4.
VT's cluster is using the largest Infiniband network yet built (or at least announced). The previous largest Infiniband network was O(100) machines. VT could have built the cluster using Xeons, Itaniums, or Opterons and arrived at roughly the same level of performance.
Run down the list and look at processor counts. We've got 5120 at the top (vector), but number 2 needed 8192 to get the job done. BigMac at #3 drops to 2200 and the processor counts hover in that 2000+ category. Until #19, when Cray's X1 jumps in at 252 processors.
Having a fast computer is cool and all, but if you can do it with 252 CPUs instead of 1024 (#22, P4 2.4), isn't that a win?
Besides, LINPACK doesn't stress interconnect latency and bandwidth, only cache and memory performance. When you run "real" codes on these Mac/Xeon clusters and get 5% efficiency, suddenly the Earth Simulator (and the small Cray X1s) look good when they blow well past the 50% efficiency mark.
Yes, you're quite right, the networking hardware is important.
But as researched by the VT folk, the G5 is significant: It was cheaper for their needs than the Xeons, Itaniums, and Opterons of similar performance and energy consumption!
So both component choices were critical to their achieving number 3.
GPL Deconstructed
And yet equally, if not more, important products like amd64 don't have their own icons?
Additionally, why does this CPU have a G5 icon, and not a PPC970 icon?
Has Slashdot sold out to Apple?
My TI-85 isn't collecting dust in my drawer. This summer I had a small accident with flavoured oatmilk. What started as a tilted backpack ended as a TI-85 with some IC connectors magically eroded away after 5 hours or so of traveling.
Kids, remember this: Oatmilk with salt = ionized water. Batteries = electricity. Ionized water + electricity isn't healthy for those small metal pieces of yours.
Whoa, do I smell an interesting+informative moderation?
Actually, you are horribly horribly wrong about two things.
You definitely could not do that with Opteron or Xeon systems. VT was in negotiations about price and delivery time with Dell and Apple. Apple beat out Dell's prices (shocking!!!).
Also, the G5 makes a great cluster computer. It comes standard with gigabit ethernet and has very easy access to parts (no screws required to install anything).
Finally, the Apples make a good cluster because in 5 years or so, when they disassemble it, they will have 1,100 really nice desktop machines. PCs need to be upgraded more often to serve as desktop computers (that's why Macs have awesome resale value compared with PCs).
Help I'm a rock.
The 1.5 GHz Itanium 2 costs over $3000 per chip, and even the 32-bit Xeon 3.06 GHz is about $1000, while the 2 GHz PPC 970 is about $300 or $400. In addition, VT wants 64-bit chips, so Xeon is a nonstarter.
Excluding the Earth Simulator, the 2 GHz G5 has the highest Flops per CPU, even 5% higher than the 1.5 GHz Itanium 2 and 10 times cheaper:
#2 Alpha 13880 / 8192 = 1.69
#3 G5 10280 / 2200 = 4.67
#4 Xeon 9819 / 2500 = 3.92
#5 Itanium 8633 / 1936 = 4.45
#6 Opteron 8051 / 2816 = 2.85
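The per-CPU figures above are easy to reproduce. Here is a small Python sketch that divides the Rmax (in GFlop/s) by the processor count for each entry, using only the numbers in the list; the variable names are made up for illustration. Rounding to two places gives values up to 0.01 higher than the truncated ones quoted above.

```python
# (rank, processor family, Rmax in GFlop/s, processor count), as listed above.
entries = [
    (2, "Alpha", 13880, 8192),
    (3, "G5", 10280, 2200),
    (4, "Xeon", 9819, 2500),
    (5, "Itanium", 8633, 1936),
    (6, "Opteron", 8051, 2816),
]

# Sustained LINPACK GFlop/s per processor.
per_cpu = {name: rmax / cpus for _, name, rmax, cpus in entries}

for rank, name, _, _ in entries:
    print(f"#{rank} {name}: {per_cpu[name]:.2f} GFlop/s per CPU")

# How far ahead of the Itanium 2 the G5 is on this measure.
g5_vs_itanium = per_cpu["G5"] / per_cpu["Itanium"]
print(f"G5 / Itanium: {g5_vs_itanium:.3f}")
```

The last ratio comes out just under 1.05, which matches the "5% higher" claim.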
can it run Maya AND Photoshop at the same time?
Jory
... you insensitive clod!
Etiquette is etiquette. He kills his mother but he can't wear grey trousers.
If the BlueGene/L interests you, take a look at the next member of the family BlueGene/P (the P means Petaflop). If I recall correctly, the Petaflop version is going to have more than a million processors in it. These computers are pretty much used for biological applications, and are going to benefit from some serious hardware, software, and networking.
Here is the project update from a while back; it talks a bit about each level of the Blue Gene project and about the biological motivations for supercomputing:
http://www.research.ibm.com/bluegene/BG_External_Presentation_January_2002.pdf
And more generally, the Blue Gene homepage: http://www.research.ibm.com/bluegene/
-SF
Dig around the Top500 list and you'll see that for this benchmark (LINPACK), Myrinet and Infiniband don't do much better than plain GigE. (Which is one reason why the Cray X1 systems aren't ranked higher).
In fact, there are some nearly-identical setups in which there is no difference between GigE and Myrinet.
LINPACK is a good benchmark for generating big numbers for clusters, but it's a pretty poor supercomputing benchmark in general. The faster your machine can multiply and add fp numbers, the better its LINPACK score. This isn't SPECfp_rate. (Notice I said SPEC rate, not SPEC base).
The five-year-old iMac that I am typing this on is running OS X 10.3 (Panther) very, very well, thank you. And, mind you, this is an all-in-one, blueberry iMac, not what was, at the time, a top-of-the-line PowerMac.
Your facts are quite off.
While I tend to agree with you that a rack-mounted cpu is generally easier to maintain than a typical PC, I am not so sure about the PowerMac. With the right rack mount you will get the same benefits that you would get from a dedicated rack-mount unit. Slide out the box, pull a switch and drop the side, do the work, raise the side, and slide it back in. The process is the same for both.
Now a typical Intel box set-up is rarely like that (there are exceptions). Their engineering sucks. Getting to parts and pieces is a real pain.
IBM makes the G5 for Apple. It also uses a similar processor in its own machines. And yes, they can cram a lot of them into a small amount of space and still deal with the heat. If you had read the article you might have noticed the following:
Meanwhile, IBM is working on a monster supercomputer that will easily rank as the world's fastest supercomputer when it comes online next year. Blue Gene/L will be capable of performing 360 trillion calculations per second, or 360 teraflops.
Commissioned by the Lawrence Livermore National Laboratory, Blue Gene/L will be based on 130,000 processors.
Not only will it be the fastest, but Blue Gene/L will also be the most compact, IBM said.
IBM has managed to cram 1,024 PowerPC 440GX processors into a slanted cabinet the size of a dishwasher. The unit -- described by IBM as a small-scale prototype of Blue Gene/L -- is already ranked 73rd in the new Top500 list.
When finished, Blue Gene/L will be about the size of half a tennis court. "That's very small considering how powerful it is," said IBM spokesman Adam Emery.
By contrast, the Earth Simulator's 5,120 processors would fill four tennis courts.
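To make the density comparison in that quote concrete, here is a rough Python sketch using only the figures Wired gives. It takes "half a tennis court" and "four tennis courts" at face value as floor areas, which is obviously a loose assumption, and the variable names are made up for illustration.

```python
# Figures quoted from the article.
BGL_TFLOPS = 360          # trillion calculations per second
BGL_PROCESSORS = 130_000
BGL_COURTS = 0.5          # "about the size of half a tennis court"
ES_PROCESSORS = 5_120
ES_COURTS = 4             # "would fill four tennis courts"

# Quoted speed spread evenly over the processors, in GFlop/s each.
bgl_gflops_per_cpu = BGL_TFLOPS * 1000 / BGL_PROCESSORS

# Processors per tennis court of floor space.
bgl_density = BGL_PROCESSORS / BGL_COURTS
es_density = ES_PROCESSORS / ES_COURTS
density_ratio = bgl_density / es_density

print(f"Blue Gene/L: {bgl_gflops_per_cpu:.2f} GFlop/s per processor")
print(f"Blue Gene/L: {bgl_density:,.0f} processors per court")
print(f"Earth Simulator: {es_density:,.0f} processors per court")
print(f"packing ratio: about {density_ratio:.0f}x")
```

So the point about cramming processors into a small space holds by a couple of orders of magnitude, although each Blue Gene/L processor is individually far slower than one of the Earth Simulator's vector processors.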
Lasers Controlled Games!