BlueGene/L Puts the Hammer Down
OnePragmatist writes "Cyberinfrastructure Technology Watch is reporting that BlueGene/L has nearly doubled its performance to 135.3 Teraflops by doubling its processors. That seems likely to keep it at no. 1 on the Top500 when the next round comes out in June. But it will be interesting to see how it does when they finally get around to testing it against the HPC Challenge benchmark, which has gained adherents as being more indicative of how a HPC system will peform with various different types of applicatoins."
Maybe this thing can keep the WoW service running.
...how do we slashdot it?
There is no sig
How much processing power does one need for any certain application? I know that projects like World Community Grid need massive amounts of computing power, but seriously, 135 TFlops?
...ok I couldn't resist
Imagine a beowulf cluster of these....
Does anyone else find the similarities between the computer hardware world and DragonballZ irritating? Right when you think it's finally over, the best is exposed and found worthy, yet another difficulty comes up, along with the standard unfathomed power increases and bizarre advances. Then it all happens again :/
this sig no verb
Imagine a Beowulf cluster of ... ouch my brain just exploded.
Help save the critically endangered Blue Iguana
..about overclocking it?
and what type of frame rate do you get with Quake?
Is it just me or is 135.3 * 2 < 360 / 2?
Obviously that number's based on an unrealistic, 100% efficient scaling factor. But still. The 137 TFlop is coming from 64,000 processors.
It's fun to think about what's just around the corner.
Now to get that thing folding now...
http://onticfusion.sytes.net/
Pity.
A..........Beowulf.......
*sigh*
I guess I'm just a tired old whore.
Cool! Amazing Toys.
...host a spell check for Slashdot! ...as being more indicative of how a HPC system will peform with various different types of applicatoins."
Oh man, I *so* wanna put Windows HPC on this thing!
Play pacman at full speed??? and render it in software mode??? at full screen???
No way!!!!!
Now THAT would be amazing, wouldn't it?
Your teraflops are belong to us... get it?
Have a good one.
===== "Every head is a different world so don't invade mine you FREAK!" smartSAGA said
1) Solving linear equations. SIMD Matrix math, check.
2) DP Matrix-Matrix multiplies. IBM added DP support to their VMX set for Cell (though at 10% the execution rate), check.
3) Processor/Memory bandwidth. XDR interface at 25.6 GB/s, check.
4) Processor/Processor bandwidth. FlexIO interface at 76.8 GB/s, check.
5) "measures rate of integer random updates of memory", hmmmm... not sure.
6) Complex, DP FFT. Again, DP support at a price. check.
7) Communication latency & bandwidth. 100 GB/s total memory bandwidth, check (though this could be heavily influenced by how IBM handles its SPE threading interface)
Obviously, I'm not saying they used the HPC Challenge as a design document, but clearly Cell is meant as a supercomputer first and a PS3 second.
1.) How many frames/sec is that in Counter-Strike?
2.) How about CS:S?
3.) If Apache 2 were installed on it, could it survive a slashdotting?
4.) How fast could it run Avida?
Silence is golden... and duct tape is silver.
I found it odd that there aren't any pics of the machine on those sites, so I looked around... Here are some pics of the prototype at top, and the finished version at bottom. It looks like it's going to be in classic "IBM black", like the 2001 monolith : )
Some more pics of the prototype.
For comparison, the Earth simulator and big mac.
Anyone know what kind of facilities blue gene will be housed at? The one for the earth simulator looks like something out of a movie, IBM better be able to compete on the 'cool factor'. : )
And does anyone else get the warm and fuzzy feelings from looking at these pics, even though there's nothing you could possibly use that much power for? Ahhh, power...
135.3 Teraflops sounds very nice, but the achievement won't mean anything to me until I hear how many LoCs it can wipe its ass with per second.
How do these compare to the Cray Supercomputers? Last I checked, Cray was top-dog and everyone else was fighting for second place. I mean, it's cool that you can get 135.3 Teraflops out of the BlueGene, but the Cray X1E delivers up to 147 TFLOPS in a single system. Am I just confused and lost?
Come on, that was funny. Moderators, give it up.
Anyway, it was insightful, as he said it would be moderated down.
Oh please, that was funny, not redundant.
Stop trolling with moderation points.
I'm posting anonymously to avoid a karma hit and blacklisting
-X
A graph would be neat (but I'd settle with a power of ten) :-)
It would give an idea of when we'll get that kind of power at home - and don't tell me we'll never know what to do with it...
so if i sit at my computer desk long enough ... ... i think ... or was it ...
i won't be able to reach my mouse anymore,
and ping times to slashdot are increasing.
one day i won't be able to reach the site
at all, since the server has moved or
accelerated to near light speed and is
moving away from my computer
anyway, what is intriguing for me i guess
is that atoms are made up of protons
and neutrons
molecules? but as far as anybody can tell
lone neutrons decay at a half-life of 12
minutes(?), while protons are indestructible.
so has anybody checked exactly if the whole
chain of changing electrons into neutrons
and neutrinos and protons into anti protons
and the like is actually balanced? yeah yeah,
overall in the universe one cannot create
or destroy charges (magnetic or something),
but if the whole "birth" and "death" cycle
of all particles evolves towards a certain
configuration, say in few million years
there will be less neutrons but more protons and
electrons, maybe after the big bang and its
evolution from basic particles to complex
atoms (like plutonium?) it will "de-evolve"
back to flimsy particles, say hydrogen...
maybe? anyone
I wonder how many decimal places this IBM Monster will be able to compute in 24 hours? A trillion? I hope more... It would be awesome if it figured the trillion ^ trillion th Prime Number in a day.
May
What would also be interesting is the power consumption and heat production figures of those systems when idle and under heavy load, and also the load statistics.
In other words what is the cost in the quest for performance?
...explain why that genetic research needs that much CPU power? What calculations take so long to process that they need to build the fastest computers? And also, are they sure that the programmers working at research labs are optimizing their code effectively, so that maybe the work done on those computers could be done with 1/4th of the current power?
So what do people think: assuming speeds continue to leap ahead in the desktop arena, will it simply encourage further sloppy programming? After all, if the choice is to optimise your product for a month to save a few Gigaflops or get it out into the market, and so what if it's a bit resource hungry, I imagine many teams will get pushed to release sooner rather than later.
Several decades ago, a computer filled an entire room, and the thinking was "I think there is a world market for maybe five computers".
A few decades ago, people thought Bill Gates was wrong when he reckoned there would soon be a time when there was a computer in every home.
Now, a supercomputer fills an entire room. So how long before someone reckons that there will come a time when there will be a supercomputer in every home?
"She's furniture with a pulse"
I think the whole point of using a machine of this size is that you write your custom application specifically with it in mind. I would be highly surprised if after leasing one, or a share on one, IBM doesn't provide documentation on how to create an application which takes advantage of the machine's architecture.
It could be that the competition for the top of the 500 slot is becoming less of a technological achievement and more of just who has the most $$$ to spend. Just like auto racing used to be about improvements in engines and transmissions etc., but after a point everybody could make a faster car just by buying more commonly available, well known technology than the other guys. So they put in limitations for the races: only so big a venturi, displacement, etc.
Anyway, my point is: it's becoming just "I can afford more processors than you can so I win" instead of the heyday of Seymour Cray, when you really had to be talented to capture the #1 spot from IBM.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
... what performance benchmark would be appropriate for workstations and small servers (2 to 4 processors)? Have you used some performance benchmark before? Which ones have you liked the most?
I was wondering if a test of loading OpenOffice.org Writer would be useful?
* Carthago Delenda Est *
Great! We can use it to get inaccurate weather forecasts twice as fast!
I could be wrong, but I bet the reason you get double the performance when you double the number of processors is that they are not adding on more slow processors. They keep adding faster chips. This might be why it seems to scale so well.
I am a viral sig. Please help me spread.
What's the scalar performance of one of these beasties?
Can an Athlon 64 / P4 beat it on scalar code? The whole HPC world has gotten boring since Cray died. Here's why I say that:
The Cray 1 had the best SCALAR and VECTOR performance in the world.
The Cray 2 was an ass kicker, the Cray 3 was a real ass kicker (if only they could build them reliably).
Cray pushed the boundaries, he pushed them too far at some points -- designing and trying to build machines that they couldn't make reliable.
So it'll be a cold day in hell before I get all fired up over the fact that someone else managed to glue together a bazillion 'killer micros' and win at Linpack...
Now if someone would bring back the idea of transputers, or we saw some *real* efforts at Dataflow and FP then I'd be excited. I'd love a PC with 8 small, simple, fast, in-order tightly bound cpus. Don't say CELL, all indications are that they will be a *real* PITA to program to get any decent performance out of.
Well, it comes down to a few different things.
First off, Opterons are pretty mediocre at double precision floating point benchmarks; it just isn't what they were designed for. Opterons effectively have only a single FPU (technically they have two, but one only does addition, while the other handles all multiplies), while most competing chips in the HPC arena have two full FPUs. They tend to get spanked by PPCs and Itanium2s, and even Xeons can do better.
Also, you should note that the modified PPC440s in BlueGene have a disproportionate amount of floating point resources, making them about equivalent to the 970 in that area MHz for MHz, despite being massively outclassed in integer and vector ops. And the floating point units on those 440s are full 64-bit units (as FPUs are on many other ostensibly 32-bit chips, since the bit width of an FPU has nothing to do with the integer units and MMUs being 32-bit). Plus the PPC has a fused multiply-add instruction, allowing it to theoretically finish 2 FLOPs/unit/cycle, instead of just one.
And finally, you should know that individual nodes' ram sizes matter very little for Linpack.
When you take all that together, it's not too surprising that 700 MHz PPC440s with 2 64-bit FPUs each finishing up to 2 FLOPs/cycle (at least 2 of which must be adds) would perform on par with 2.x GHz Opterons finishing a total of 2 FLOPs/cycle (at least one of which has to be an add).
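To put rough numbers on that comparison, here is a minimal sketch of the peak-rate arithmetic described above. The 2200 MHz Opteron clock is only an assumed stand-in for the "2.x GHz" in the comment, and the function name is made up for illustration; these are theoretical peaks, not measured Linpack results.

```python
def peak_mflops(clock_mhz, flops_per_cycle):
    """Theoretical peak in MFLOPS: clock rate times FLOPs retired per cycle."""
    return clock_mhz * flops_per_cycle

# PPC440 as described above: 2 FPUs, each finishing up to 2 FLOPs/cycle
# (fused multiply-add), at 700 MHz.
ppc440_peak = peak_mflops(700, 2 * 2)

# Opteron as described above: 2 FLOPs/cycle in total.
# ASSUMPTION: 2200 MHz is a placeholder for the unspecified "2.x GHz".
ASSUMED_OPTERON_MHZ = 2200
opteron_peak = peak_mflops(ASSUMED_OPTERON_MHZ, 2)

print(f"PPC440 peak:  {ppc440_peak} MFLOPS")
print(f"Opteron peak: {opteron_peak} MFLOPS (assumed clock)")
print(f"Ratio PPC440/Opteron: {ppc440_peak / opteron_peak:.2f}")
```

Under that assumed clock the two peaks land within a factor of two of each other, despite the roughly 3x gap in clock speed. Changing ASSUMED_OPTERON_MHZ shows how sensitive the comparison is to the "x" in "2.x".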
"The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
Ummm... I think BlueGene (here and here) is cooler than the Earth Simulator (here).
The reason that it can be true that 1+1 > 2 is that very peculiar nonzero value of the + operator
The 70.72 TF BlueGene/L that debuted on the November list is only 16 of 64 racks of the full machine (25%). BlueGene/L was to be delivered in stages and be a 131072 CPU system when complete (64 racks * 2048 CPUs per rack). The beasty will be well over 200 TF sustained Linpack when it is completed. Oh, and it is binary compatible with System X at Virginia Tech.
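The rack arithmetic above can be checked with a short sketch. It uses only figures quoted in this thread (16 of 64 racks, 2048 CPUs per rack, 70.72 TF in November, 135.3 TF after the doubling); the linear extrapolation is an assumed upper bound under perfect scaling, not a prediction, and the variable names are made up for illustration.

```python
RACKS_TOTAL = 64          # full machine, per the comment above
RACKS_NOVEMBER = 16       # racks behind the November result
CPUS_PER_RACK = 2048
LINPACK_NOVEMBER_TF = 70.72
LINPACK_DOUBLED_TF = 135.3   # figure from the story summary, after doubling

total_cpus = RACKS_TOTAL * CPUS_PER_RACK
fraction_installed = RACKS_NOVEMBER / RACKS_TOTAL

# ASSUMPTION: perfect (100% efficient) scaling, so this is an upper bound.
linear_tf = LINPACK_NOVEMBER_TF / fraction_installed

# How close the reported doubling came to perfect scaling.
doubling_efficiency = LINPACK_DOUBLED_TF / (2 * LINPACK_NOVEMBER_TF)

# Scaling efficiency the full machine would need to reach 200 TF.
needed_for_200 = 200 / linear_tf

print(f"Total CPUs: {total_cpus}")
print(f"Installed fraction in November: {fraction_installed:.0%}")
print(f"Linear extrapolation to 64 racks: {linear_tf:.2f} TF")
print(f"Efficiency of the doubling: {doubling_efficiency:.1%}")
print(f"Efficiency needed for 200 TF: {needed_for_200:.1%}")
```

So "well over 200 TF" only requires the full machine to hold on to roughly 70% of perfect scaling from the 16-rack result, and the reported doubling did considerably better than that.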
I remember seeing a news article on TV recently about NASA and their upgrades to computer horsepower for doing flight simulations and design work. The picture they showed? A late '80s Connection Machine. You know the beast: 8 black cubes glued together to make one big cube with hundreds of blinking LEDs over the faces, one for each of the 65536 simple processors. Sort of a Borg at Christmas time affair. Stock footage to be sure, and the news outlets trot it out every time the word supercomputer is used. At least they've quit showing IBM Model 726 Tape Units spinning reel-to-reel tapes back and forth as a show of awesome computing power.
Letter To Iran
What's an applicatoin! New industry standard? :-p
If you like what I've said here, and want to read more, go to http://www.krillrblog.com
...that by the time Duke Nukem Forever launches, this will be the level of computing power on every desktop? I can hardly wait for Windows mean-time-to-failure to be measured in femtoseconds.
If my grammar and spelling are off, I am [distracted/tired/careless] (take your pick)
Yes, all applications may be improved.. to a point. But the simple fact is that when you are trying to simulate a system where you must track billions of individual data points, you must have a good deal of processing power to do it in anywhere close to a reasonable time.
Ex: Simulating a nuclear blast over a person. The requirement of the system is to track each cell in the person. And you are tracking it at the nano-second level. This problem cannot be optimized away.
Applying the AC relevancy converter, and we get:
FFFFirst post!
Yes it does run Linux.
http://www.forbes.com/home/enterprisetech/2005/03/15/cz_dl_0315linux.html
My hyperlinks aren't worth the paper they're printed on.
Good sig, obscure reference, only one google entry, nice.
When broadband speeds increase significantly, and MS get their .NET "Applications for rent" system working, then each continent will need just one "computer", connected to your "home terminal".
The world will need just five "computers".
b3 4phr41d 0f my 4bov3-4v3r4g3 c0mpu73r kn0wI3dg3!
MadDwarf
it's just you. actually
135.3*2 > 360/2
270.6>180