SGI & NASA Build World's Fastest Supercomputer
GarethSwan writes "SGI and NASA have just rolled out the new world number one fastest supercomputer. Its performance test (LINPACK) result of 42.7 teraflops easily outclasses the previous mark set by Japan's Earth Simulator of 35.86 teraflops AND that set by IBM's new BlueGene/L experiment of 36.01 teraflops. What's even more awesome is that each of the 20 512-processor systems runs a single Linux image, AND Columbia was installed in only 15 weeks. Imagine having your own 20-machine cluster?"
Let's see them predict the weather.....
...when they hit the "TURBO" button on the front of the boxes they'll really scream.
I have one of those... in a spare room!
Who cares about a 20-system cluster, I want one 512-processor machine!
or 20, I'm not that picky
Just what I need to model my next H-bom... uhh... umm.... I mean render my next feature film. I call it "Kaboom."
I bet gentoo wouldn't be such a b**ch to get running with all of that compiling power behind it :)
According to the article it got 42.7 teraflops using only 16 of the 20 nodes, so the performance is going to be even better.
...they were *almost* able to get Longhorn to boot.
If the same software is used, it's not going to make weather predictions more accurate. It's just going to give them the wrong answer, faster.
This page contains images of the NASA Altix system. After reading the article I was curious as to how much room 10K or so processors take up.
http://www.busyweather.com/
1) This was fully deployed in only 15 weeks.
2) This number was using only 16 of the 20 systems, so a full benchmark should be larger too.
3) The storage attached holds 44 LoC's.
Your hair look like poop, Bob! - Wanker.
Seti@Home. They'll be in the Top 10 in no time!
Prof. Jack Dongarra of UTK is the keeper of the official list in the interim between the twice-yearly Top 500 lists:
http://www.netlib.org/benchmark/performance.pdf See page 54.
And here's the current top 20 as of 10/26/04...
Computer superclusters don't even have O-rings.
They don't carry schoolteachers.
They don't fly in the air.
This runs Linux, not Windows. It won't crash.
sigs, as if you care.
Wow, I didn't know the NewAdvancedSearchAgent had such an interest or budget for supercomputing. I'd think they'd be able to afford their own web server though, instead of being parked at domainspa.com and having to fill their entire page with advertisements.
Try NASA.GOV.
Why does it take so long to build a super computer and why do they seem to be redesigned each time a new one is desired?
It's a little like how Canada's and France's nuclear power plant systems are built around standardized power stations, cookie cutter if you will. The cost to reproduce a power plant is negligible compared to the initial design and implementation, so the reuse of designs makes the whole system really cheap. The drawback is that it stagnates the technology and the newest plants may not get the newest and best technology. Contrast this with the American system of designing each power plant with the latest and greatest technology. You get really great plants each time, of course, but the cost is astronomical and uneconomical.
So too, it seems, with supercomputers. We never hear about how these things are thrown into mass production, only about how the latest one gets 10 more teraflops than the last and all the slashbots wonder how well Doom 3 runs on it or whether Longhorn will run at all in such an underpowered machine.
But each design of a supercomputer is a massive success of engineering skill. How much cheaper would it become if, instead of redesigning the machines each time someone wants to feel more manly than the current speed champion, the current design were rebuilt for a generation (in computer years)?
The amazing thing about it is that it's built at a fraction of the cost/space/size of the Earth Simulator. If I remember correctly, I think they already have some of the systems in place for 36 teraflops. It's the same Blue Gene/L technology from IBM, just on a larger scale.
RAEM (redundant array of expensive machines) just doesn't ring right - too close to REAM.
This issue is a bit more complicated than you think.
NEC's is announced, this one is installed.
you had me at #!
Yes what is the point? We all know the resulting answer is going to be 42.
There's also a dark horse in the supercomputer race; a cluster of low-end IBM servers using PPC970 chips that is in between the BlueGene/L prototype and the Earth Simulator. That pushes the last Alpha machine off the top 5 list, and gives Itanium and PowerPC each two spots in the top 5. It's amazing to see the Earth Simulator's dominance broken so thoroughly. After so long on top, in one list it goes from first to fourth, and it will drop at least two more spots in 2005.
Whoever corrects a mocker invites insult;
whoever rebukes a wicked man incurs abuse.
--Proverbs 9:7
Does anyone know how much this system cost? It would be interesting to see how good of a teraflop per million dollar ratio they achieved.
For example, I know the Virginia Tech cluster (1,100 Apple Xserve G5 dual 2.3 GHz boxes) cost just under $6 million and runs at a bit over 12 teraflops, so it gets a bit over 2 teraflops per million dollars.
Other high-ranking clusters would be interesting to evaluate in terms of teraflops per million dollars, if anyone knows any.
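For anyone who wants to plug in other clusters, here is a minimal Python sketch of that ratio. It uses the rounded Virginia Tech figures from above (12 teraflops, $6 million), so it lands at exactly 2 where the real number is "a bit over 2". The function name is made up for illustration.

```python
def teraflops_per_million(teraflops, cost_usd):
    """Teraflops delivered per million dollars spent."""
    return teraflops / (cost_usd / 1_000_000)

# Rounded figures from the post: "a bit over 12 teraflops", "just under $6 million".
virginia_tech = teraflops_per_million(12.0, 6_000_000)
print(f"Virginia Tech: {virginia_tech:.2f} TF per $1M")
```

Feed it a LINPACK result and a price tag for any other machine to compare on the same footing.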
Seriously, am I on candid camera?
Emulating a Centris 650 running Mac OS X at 2.5 Ghz.
ZZ
Seti@home is currently reporting 70.93 TeraFLOPs/sec. It would be Number One if the list were a bit more inclusive.
Ok, so we have Linux doing tens of teraflops in processing, FreeBSD doing tens of petabits in networking,
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
uhm... Well, 2560 motherboards, 'cause they're quad-CPU... The Altix C-bricks that SGI used were built to house 4 IA64 CPUs per brick. otoh... no... it really is 20 machines with 512 processors each, because the memory is globally shared (all processors have access to all the memory, albeit at different latency and performance: NUMA, Non-Uniform Memory Access) and a single Linux kernel is running on the whole thing.
Really, given the fact that most popular computers have enough processing power to handle anything, and the fact that clustering technology has evolved and is usable in case they aren't...what is the point in the "super computer"?
The super computer is a cluster (10k+ processors in 20 nodes).
Not all applications/computations scale by just adding computers to the cluster.
An example would be solving for z: x=84+19, y=5*3, z=x+y
The ultimate solution z is limited by how fast x and y can be solved. You can have an individual computer solve for x and another for y in parallel. But no matter how many more computers you add, none of them can solve z until x and y are solved first, and none of them would speed up the computation of x and y.
After a certain scale, you do not get any more benefit from parallel processing, so the only way to speed things up is to make each individual computer faster.
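The limit described above is usually formalized as Amdahl's law. A minimal Python sketch follows. The 95% parallel fraction is an assumption picked purely for illustration, not a measured figure for any real workload.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of a job can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Assumption: 95% of the work parallelizes, 5% is strictly sequential
# (like computing z only after x and y are done).
for workers in (1, 2, 16, 512, 10240):
    print(workers, round(amdahl_speedup(0.95, workers), 2))
```

With 5% of the work stuck in sequence, the speedup can never exceed 1/0.05 = 20, no matter how many processors are added. That is why faster individual processors still matter.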
D6 63 0D 70 89 81 BB 8E 7B 7C 5F 5D 54 EA AB 73
They tried, but they ran out of blue.
...rubbing his hands whilst sitting in a dark corner amongst an ever-dwindling pile of Microsoft-donated cash, salivating at this.
"512 processors, 20 machines, $699 per processor. All that intellectual property, yes! No free lunch no, Linux mine, MIIIINE, BWAAAAHAHAHAHA!!!"
*dials*
"Hello, NASA? About that $7,157,760 you owe me...
I'm sorry, where do you want me to jump?"
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
All those teraflops and still no aliens? How many freaking teraflops do we need? Come on, folks, I just want one goddamned spaceman!!
Maybe they want to run PearPC at a decent speed.
The most amazing part of this development is that the fastest computer in the world runs Linux. All these TFLOPS increases are really evolutionary, incremental. That the OS is the popular, yet largely underground open source kernel is very encouraging for NASA, SGI, Linux, Linux developers and users, OSS, and nerds in general. Congratulations, team!
--
make install -not war
Curiously enough, we were talking about the future of computing at lunch today.
There was a time when different computers ran on different processors, and supported different OSes. Now what's happening? Itanic and Opteron running Linux seem to be the only growth players in the market; and the supercomputer world is completely dominated by throwing more processors together. Is there no room for substantial architectural changes? Have we hit the merging point of different designs?
Just some questions. Although it's not easy, I'm less excited by a supercomputer with 10k processors than I would be by one containing as few as 64.
"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
Weather prediction, it turns out, is *not at all* like playing chess. Chess is a deterministic linear process operating on rigid, unchanging rules. There is always a "best move" for every board state, which a sufficiently fast and capacious database could search for. Weather is chaotic, a nonlinear process. It feeds back its state into its rules, in that some processes increase the sensitivity to change of other simultaneous processes. Chaos cannot be merely "solved", like a linear equation; it must be simulated and iterated through its successive states to identify more states.
Of course, we're just getting started with chaos dynamics. We might find chaotic mathematical shortcuts, just like we found algebra to master counting. And studying weather simulation is a great way to do so. Lorenz first formally specified chaos math by modeling weather. While we're improving our modeling techniques to better cope with the weather on which we depend, we'll be sharpening our math tools. Weather applications are therefore some of the most productive apps for these new machines, now that they're fast enough to model real systems, giving results predicting not only weather, but also the future of mathematics.
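To make the "must be simulated and iterated" point concrete, here is a rough Python sketch of the Lorenz system with the classic parameter values (sigma = 10, rho = 28, beta = 8/3). The step size, run length, starting point and the size of the nudge are all assumptions chosen for illustration, and the hand-rolled fourth-order Runge-Kutta integrator is just one reasonable choice. It runs two simulations that start one part in a billion apart and prints how far they drift.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(base, deriv, scale):
        return tuple(b + scale * d for b, d in zip(base, deriv))
    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def separation(a, b):
    """Euclidean distance between two states."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def run(steps=3000, dt=0.01, nudge=1e-9):
    """Integrate two nearly identical starting points and log their distance."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + nudge, 1.0, 1.0)  # differs only by the tiny nudge in x
    history = []
    for step in range(steps + 1):
        if step % 500 == 0:
            history.append((step * dt, separation(a, b)))
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
    return history

for t, d in run():
    print(f"t={t:5.1f}  separation={d:.3e}")
```

The gap grows by many orders of magnitude over the run, which is why a forecast degrades with lead time even when the model itself is perfect.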
It was great. I needed to build the kernel so I typed
# make -j 10534 bzImag
and even before I could hit the e and enter, it was done.
I was gonna build X but on this box the possible outcomes of "build World" scared me!
What new private space industry? Spaceship One, for example, reached space. That's a long way from being able to do anything useful in space. They were nowhere near orbital velocity, for example. We're still many years, if not decades, away from private industry being able to take over NASA's near-earth space role.
The answer here is "complexity". I do some scientific computing (have done chemistry, then materials science, now doing photonic devices) and there's always more you want to be able to consider. Of course, the best I've used is an 8-processor SGI machine (although that one was a bit old - I think the 2-processor Opteron system I'm using now is actually better). But especially with the materials studies, ideally we wanted to do everything with full quantum-mechanical calculations, which turn into gigantic matrices, even for a system of 100 atoms or so. And even then we put strict limits on what orbitals we consider and all that good stuff.
Slightly more concrete example - right now with my photonics simulations (finite element) on my dual-Opteron rig the max I can handle is about 180,000 elements (which means a (4*180000)x(4*180000) matrix with complex elements needs to be diagonalized, among other things), and it takes about half an hour for a standing-wave calculation. To do any time propagation, repeat the same calculation in picosecond increments. And with the gridding I can do, for a 100 micron disc resonator in 2-D I have to use light at about 40 microns. To go to the 320nm wavelength these resonators are operating at, I'd need roughly 2 orders of magnitude more memory. There's also the time factor to be considered. As with any design process, one must iterate. Tweak a little here, run the program, rinse, repeat. How long are you willing to spend in this process before you feel something is "good enough"? The faster the computer spits the answer out, the more things you can try, and the more you can think things over and hopefully make it better.
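As a back-of-the-envelope check on those sizes, here is a small Python sketch. It assumes double-precision complex entries (16 bytes each) and works out what the (4*180000)x(4*180000) matrix would take if it were stored densely. The post doesn't say how the matrix is stored; finite-element matrices are normally sparse, so the dense figure is an upper bound for scale only, not a claim about the actual solver.

```python
elements = 180_000
unknowns_per_element = 4      # the "4*180000" from the post
n = unknowns_per_element * elements
bytes_per_complex = 16        # assumption: double-precision complex entries

# Storage if every entry of the n x n matrix were kept (dense upper bound).
dense_bytes = n * n * bytes_per_complex
print(f"matrix dimension: {n:,}")
print(f"dense storage: {dense_bytes / 1e12:.2f} TB")

# How much finer the grid must resolve to go from 40 microns to 320 nm.
wavelength_ratio = 40e-6 / 320e-9
print(f"wavelength ratio: {wavelength_ratio:.0f}x")
```

Stored densely, that matrix would run to about 8.3 TB, far more than a dual-Opteron box could hold, and the target wavelength is 125 times shorter than the one being simulated.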
And this is a single component in what can be a fairly complex integrated-photonics chip. [And might I mention again I've been working in 2-D this entire time instead of doing a full 3-D simulation?] You give me the computational power and I'll use it. And I'm an experimentalist doing fairly basic research who just wants to check some stuff in the computer before sinking a lot of time and effort into fabricating a test device.
On the other hand, I actually don't want to have one of the T100 supercomputers in our lab. That would mean I'd be spending all day writing code and designing complex simulations instead of in the lab getting my hands dirty.
And as for the commonality of problems requiring such computational power, I think almost any sort of simulation can easily use it. Consider more terms (everything I've done to date is horribly linearized - let's see some more terms in the Taylor expansion) to account for nonlinear behavior, grid things up finer to get more accurate results, consider more possibilities when dealing with chaotic behavior... I would hope any good scientist would find the possibilities endless.
1. I bet the CIA has something on the order of 10-100x more powerful. I mean, if you can afford to wire up 5 full office floors of computers, say 20*512 * 5 per floor * 5, that's a hell of a lot more. The CIA can afford to spend 200m on it, and have 10 super clusters of 1000 tf each.
2. I bet the CIA can also change the weather, go read HARP etc... if the Russians could do it in the 80s then the CIA can do anything.
Liberty freedom are no1, not dicks in suits.
Like what? Go out and look up SPEC results next time you're bored. I think you'll find that I2 is quite a bit more capable than you make out. IBM's dual-core POWER5 is just about the only thing out there that's even close to (a single-core) I2 in FP performance, and Opteron isn't even in the game at that level.
Is it a commercial failure? Probably, but so was Alpha - commercial success is not an indicator of actual performance.
ABSURDITY, n.: A statement or belief manifestly inconsistent with one's own opinion.