North America's Fastest Linux Cluster Constructed
SeanAhern writes "LinuxWorld reports that 'A Linux cluster deployed at Lawrence Livermore National Laboratory and codenamed 'Thunder' yesterday delivered 19.94 teraflops of sustained performance, making it the most powerful computer in North America - and the second fastest on Earth.'" Thunder sports 4,096 Itanium 2 processors in 1,024 nodes, some big iron by any standard.
And you thought I was going to say something else...
But why did they use Itanium processors? Were they acquiring parts before Opterons were available? Did they have a problem with Xeon processors? Or did they have too much cash lying around?
If my answers frighten you, stop asking scary questions.
...who gets the electric bill.
I cringe when I leave the A/C on for too long..
"Watch your cornhole, bud."
Look, any way you cut it the 100K computers Google is reputed to have is the most powerful Linux cluster anywhere in the world.
Is it fast enough to run all the latest spyware, adware, and viruses and not slow down your solitaire game?
People say my sig is the best thing about me.
That's amazing!
Now we can... uhh... what are we supposed to do with that much power again?
Can it run Windows?
LLNL built a supercomputer, and it's going to do things besides simulate nuclear weapons?
Quick, someone ring Satan and ask how the sno-cones are.
Please help metamoderate.
http://www.google.ca/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=cache:http%3A%2F%2Fwww%2Ellnl%2Egov%2Flinux%2Fthunder%2F
With the moo and the cow and the fish. Minesweeper Record: 7 sec
this thing should do doom 3 with a software renderer at a very playable 47 FPS...
This is probably a stupid question, but would anyone care to explain how this is different from a really large cluster? For example, if people estimate Google to approach 100K nodes, how does this compare?
That would be the Earth Simulator in Japan.
"But I'm still right here, giving blood and keeping faith. And I'm still right here."
...I can back up my brain
Did you know that "FTW" ("for the win") is a direct translation of "Sieg Heil"?
...there are basically three types of clusters: 1) shared nothing: in this, the computers are connected to each other only via a simple IP network. no disks are shared. each machine serves part of the data. these clusters don't work reliably when you have to do aggregations. e.g. if one of the machines fails and you try to do an "avg()" over data that is spread across machines, the query would fail, since one of the machines is not available. most enterprise apps cannot work in this config without degradation. e.g. an IBM study showed that a 2 node cluster is slower and less reliable than a 1 node system when running SAP. IBM on windows and unix and MS use this type of clustering (also called the federated database approach or shared nothing approach). 2) shared disk between two computers: in this case, there are multiple machines and multiple disks. each disk is connected to at least two computers. if one of the computers fails, the other takes over. no mainstream database uses this mode, but it is used by hp-nonstop. still, each machine serves up part of the data and hence standard enterprise apps like SAP etc cannot take advantage of clustering without a lot of modification. 3) shared everything: in this, each disk is connected to all the machines in the cluster. any number of machines can fail and yet the system would keep running as long as at least one machine is up. this is used by Oracle. all the machines see all the data. standard apps like SAP etc can be run in this kind of config with minor modification or no modification at all. this method is also used by IBM in their mainframe database (which outsells their windows and unix database by a huge margin). most enterprise apps are deployed in this type of cluster configuration. approach one is simpler from a hardware point of view. also, for database kernel writers, this is the easiest to implement. however, the user would need to break up data judiciously and spread it across machines. also adding a node and removing a node will require re-partitioning of data. mostly only custom apps which are fully aware of your partitioning etc will be able to take advantage. it is also easy to make it scale for a simple custom app and so most TPC-C benchmarks are published in this configuration. approach 3 requires a special shared disk system. the database implementation is very complex. the kernel writers have to worry about two computers simultaneously accessing disks or overwriting each other's data etc. this is the thing that Oracle is pushing across all platforms and IBM is pushing for its mainframes. approach 2 is similar to approach 1 except that it adds redundancy and hence is more reliable. so what type are we talking about here?
And only 55 people were needed to build it!
19.94 teraflops??
Gimme something I can grasp; what's this in BogoMips?
Also in completely unrelated news, Bill Gates announced the first fully installed test of Longhorn happened today.
Hey, with a Beowulf cluster of these, I can run Longhorn!
OK, I'm done. Sorry. Mod away!
Ce n'est pas un vrai mouvement de robot!
If I calculate right, they are claiming an Rmax of 19.94 teraflops with 4096 processors.
The Virginia Tech cluster for Apple had an Rmax of 10.28 teraflops with 2200 processors.
So, the Itanium 2 delivered 4.8 gigaflops per processor, while the G5 delivered 4.6 gigaflops per processor.
This seems like a pretty poor showing for Itanium 2, overall. It's a much hotter chip than the Opteron or the G5, so cooling and power costs are likely much higher than for a comparable Apple cluster. The Xserve G5 is also likely cheaper than a similarly equipped Itanium 2 server, given that the Itanium 2 is $1,398 per chip on Pricewatch, and a dual-processor Xserve G5 cluster node is $2,999 list. Even with 4 CPUs in a single box, I think the Itanium 2 server would easily top $6,000.
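For anyone who wants to check the arithmetic in this comment, here is a minimal sketch in Python. It uses only the figures quoted in the thread; the function and variable names are made up for illustration, and the cost lines deliberately ignore chassis, memory, interconnect, and any volume discounts.

```python
# Figures are the ones quoted in this thread; all names are illustrative.
def gflops_per_processor(rmax_teraflops: float, processors: int) -> float:
    """Sustained (Rmax) gigaflops delivered per processor."""
    return rmax_teraflops * 1000.0 / processors

thunder = gflops_per_processor(19.94, 4096)        # Itanium 2 cluster
virginia_tech = gflops_per_processor(10.28, 2200)  # G5 cluster

# Hardware cost for four CPUs, using only the two prices quoted above.
itanium_chips_only = 4 * 1398   # four Itanium 2 chips, nothing else
xserve_two_nodes = 2 * 2999     # two dual-processor Xserve G5 nodes, list price

print(f"Thunder:       {thunder:.2f} GFLOPS per processor")
print(f"Virginia Tech: {virginia_tech:.2f} GFLOPS per processor")
print(f"4 Itanium 2 chips alone:     ${itanium_chips_only:,}")
print(f"4 G5 CPUs as 2 Xserve nodes: ${xserve_two_nodes:,}")
```

On these numbers the four chips alone come to $5,592, so a chassis, memory, and board would only need to add about $400 to pass the $6,000 guess.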
But anyway, good game to Lawrence Livermore. I'll be curious to see if Apple has another volley to fire before the top500 list closes for this round.
- "When you want something with all your heart, the entire universe conspires to give it to you" -Paulo Coelho
"We sold the Inaniums! We sold the Inaniums!"
yeah, that we know about. I remember the article on Google a few weeks ago that made everyone wonder just what the hell they're running over there. I wouldn't be surprised if governments kept other supercomputing clusters secret. I don't mean anything tin-foil-hatish here, I'm just thinking that some governments have test facilities that they don't let the public know about.
4,096 Itanium 2 processors in 1,024 nodes
So THAT'S what's causing our heat wave!
Here's a picture: http://doc.quadrics.com/quadrics/QuadricsHome.nsf/DisplayPages/3A912204F260613680256DD9005122C7
Now you can't say you have the fastest "Thupercomputer" any more! You've been beat by Intel and Linux!
Best Buy can have you arrested
that they didn't build this just to win 2 grand from distributed.net.
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
Yes, they're hot as hell and eat power the way Oprah eats Twinkies, and yes, Intel has handled the Itanium line poorly, but the Itanium architecture is very interesting, and is actually very appropriate for an HPC environment. Not the part of the HPC market that clusters dominate, but the segment that Cray, SGI, HP Alphaservers, etc. have traditionally dominated. The segment that doesn't give a shit about cooling, power consumption, or price-performance, but just needs to get the job done as quickly as possible.
Some of the coolest features of the Itanium are also some of the reasons why a lot of people don't want to use it. The EPIC ISA, for example. It was designed ( along w/ the physical hardware ) to expose a lot of the internal workings of the processor to the user. But rather than recompile and re-optimize their code, people would rather bitch about migration. That's fine for workstations and servers, but in an HPC environment, you want the nifty features, you want to occasionally hand-tune code segments in assembler, etc.
Anyways, I'm not a fanboy ( well, maybe an AMD and MIPS fanboy ), just wanted to get in a few honest points before everyone started shooting holes in the Itanic.
PC moderators can suck my White pierced, tattooed dick. If you think pride == hate, s/dick/Aryan meat mallet/g.
do they have the nerve to go after this cluster?
after all, they are trying extortion by lawyer against other large Linux users
"We sold the Inaniums! We sold the Inaniums!"
"The Itaniums, however, remain unsold."
*hopes that was not an actual mistake but rather a poorly conceived pun on "inane"...*
I've got more mod points and GMail invi
Sorry to burst your bubble, but Itanium isn't x86.
It's hard to be religious when certain people are never incinerated by bolts of lightning.
Can it run WINE?
Some drink at the fountain of knowledge. Others just gargle.
Thunder sports 4,096 Itanium 2 processors in 1,024 nodes, some big iron by any standard.
If the government gets a hold of that, we're going to need some big tinfoil...
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
The NSA, on the other hand... I would guess that they have the most powerful cluster of machines in the world for breaking encryption. Though perhaps not as powerful as the article's supercomputer for other tasks.
Plus there are undoubtedly several other highly classified supercomputers designed to chew on other problems.
So it would seem that you'd have to caveat any claim regarding the "fastest computer" by saying it's the fastest known, non-secret computer. But then the headline loses some of its appeal.
... if you want a practically guided tour of LLNL, watch TRON sometime. They filmed it there (the science-lab live action stuff anyway).
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
Google's cluster isn't a computational cluster.
You have several types of clusters. Each is designed to do a specific task, although you can easily mix-n-match for different purposes.
1. Server clusters. Bunches of machines running together, providing services that complement each other.
For example, you have a file server that is mirrored to another that is hooked up to a different part of a LAN/WAN backbone in order to improve service. Lots of databases are clusters like this.
2. High availability clusters.
You have machines that are backups of other machines. If one machine fails, a backup is activated instantly and replaces the failed machine without ANY loss in services.
Sort of like a RAID hard drive setup. Hot-swappable computers, that sort of thing.
Google is the first 2 types. It has several clusters with nodes. Each node is made up of a few computers; if a node fails then a backup can take over instantly, giving the techs time to correctly fix the issue. The computers each take some of the burden, too, so that it seems that they would have to be running mega-machines to provide the performance when in reality they just run a bunch of PC-style computers.
3. Computational clusters. Clusters that are designed to pool their resources to create a single big computer that is used to process large amounts of data and intense mathematical functions.
2 types of these are Beowulf clusters and openMosix clusters.
An openMosix cluster is easy to set up if you're a little bit familiar with Linux, and there are even Knoppix cluster CD-ROMs that let you build one quickly and easily.
Beowulf is used for big number crunching, and programs that use it are generally written to run on a specific cluster, although libraries and tools are portable.
Used lots in astronomy, for example. 10-12 PCs in a college lab can make a nice number crunching machine.
There are some clusters that do all 3; lots can do only 1 or 2 of the types easily. Different types can complement each other.
"Big Iron" is a very vague term - server benchmarks behave very differently than scientific computation as far as performance is concerned; if you don't believe me I can easily point you to a couple of research papers analyzing them.
The humongous on-die caches make the Itanium perform well on servers, and definitely not the instruction-set architecture. So "WAS DESIGNED FOR" is only 50% true.
The Raven
Intel provides excellent Linux support for Itanium. Also, if you use the Intel compiler, which Lawrence Livermore does, you get a considerable speed boost on Intel CPUs.
See: http://www.llnl.gov/linux/linux_basics.html#compilers
Intel can afford to provide little niceties like this. Can AMD? I doubt it.
Ed Note: Unless the author wishes to narrow his/her audience to a small subset of Slashdot users, standard formatting and non-cutesy sentence case is always appropriate.
There are basically three types of clusters:
Shared Nothing: In this, the computers are connected to each other only via a simple IP network; no disks are shared, and each machine serves part of the data. These clusters don't work reliably when you have to do aggregations. For example, if one of the machines fails and you try to do an "avg()" over data that is spread across machines, the query would fail, since one of the machines is not available. Most enterprise apps cannot work in this config without degradation. For example, an IBM study showed that a 2-node cluster is slower and less reliable than a 1-node system when running SAP. IBM on Windows and Unix and MS use this type of clustering (also called the federated database approach or shared nothing approach).
Shared Disk Between Two Computers: In this case, there are multiple machines and multiple disks. Each disk is connected to at least two computers. If one of the computers fails, the other takes over. No mainstream database uses this mode, but it is used by HP NonStop. Still, each machine serves up part of the data and hence standard enterprise apps like SAP etc. cannot take advantage of clustering without a lot of modification.
Shared Everything: In this, each disk is connected to all the machines in the cluster. Any number of machines can fail and yet the system would keep running as long as at least one machine is up. This is used by Oracle. All the machines see all the data. Standard apps like SAP etc. can be run in this kind of config with minor modification or no modification at all. This method is also used by IBM in their mainframe database (which outsells their Windows and Unix database by a huge margin). Most enterprise apps are deployed in this type of cluster configuration.
Approach 1 is simpler from a hardware point of view. Also, for database kernel writers, this is the easiest to implement. However, the user would need to break up data judiciously and spread it across machines. Also, adding a node or removing a node will require re-partitioning of data. Mostly only custom apps which are fully aware of your partitioning etc. will be able to take advantage.
It is also easy to make it scale for a simple custom app, and so most TPC-C benchmarks are published in this configuration. Approach 3 requires a special shared disk system. The database implementation is very complex. The kernel writers have to worry about two computers simultaneously accessing disks or overwriting each other's data etc. This is the thing that Oracle is pushing across all platforms and IBM is pushing for its mainframes. Approach 2 is similar to approach 1 except that it adds redundancy and hence is more reliable.
So what type are we talking about here?
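The "avg()" example above can be sketched as a toy model in Python. This is only an illustration of the argument, not how any real database does it: the `Node` class, the `NodeDown` exception, and both functions are hypothetical, and a plain list stands in for a shared disk.

```python
class NodeDown(Exception):
    """Raised when a query needs a node that is unavailable."""

class Node:
    def __init__(self, name, rows=None):
        self.name = name
        self.rows = rows or []   # rows owned locally (shared nothing only)
        self.up = True

def avg_shared_nothing(nodes):
    """Each node owns one partition, so every node must answer."""
    total, count = 0.0, 0
    for node in nodes:
        if not node.up:
            raise NodeDown(f"{node.name} holds a partition and is down")
        total += sum(node.rows)
        count += len(node.rows)
    return total / count

def avg_shared_everything(nodes, shared_disk):
    """Every node sees all rows, so any surviving node can answer."""
    for node in nodes:
        if node.up:
            return sum(shared_disk) / len(shared_disk)
    raise NodeDown("no node is up")

data = [10, 20, 30, 40]
sn_nodes = [Node("a", data[:2]), Node("b", data[2:])]  # data partitioned
se_nodes = [Node("a"), Node("b")]                      # data on shared disk

print(avg_shared_nothing(sn_nodes))   # all nodes up: 25.0

sn_nodes[1].up = False                # lose one machine in each cluster
se_nodes[1].up = False
try:
    avg_shared_nothing(sn_nodes)
except NodeDown as err:
    print("shared nothing failed:", err)
print(avg_shared_everything(se_nodes, data))   # still answers: 25.0
```

The sketch leaves out everything that makes approach 3 hard in practice, namely coordinating concurrent writes to the shared disk.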
But Windows only has 50% less TCO if your time is worthless.
Karma: It's all a bunch of tree-huggin' hippy crap!
...you realise that it isn't a linear scale. A G5 cluster that matched Thunder's 19.94 teraflops would take more than 4,400 processors, and thus would easily take more than 300 more processors than the Itanium cluster uses.
300 processors. That's 150 dual-processor boxes. I can't be bothered working it out now, but how far that goes to eliminating the power & heat advantage the G5 has would be interesting to find out...
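As a rough floor for that estimate, here is a minimal Python sketch that assumes perfectly linear scaling, which real clusters do not achieve, so the true count would be higher. It uses the Rmax figures quoted elsewhere in this thread; the function name is made up for illustration.

```python
import math

def processors_needed_linear(target_tflops, ref_tflops, ref_processors):
    """Processors needed to hit target_tflops if per-processor
    throughput stayed exactly what the reference cluster achieved."""
    per_processor = ref_tflops / ref_processors
    return math.ceil(target_tflops / per_processor)

# G5 cluster: 10.28 teraflops on 2200 processors; Thunder: 19.94 on 4096.
g5_needed = processors_needed_linear(19.94, 10.28, 2200)
extra_vs_thunder = g5_needed - 4096

print(g5_needed)         # 4268
print(extra_vs_thunder)  # 172
```

So even the idealised case needs about 170 more processors than Thunder; the 4,400 figure appears to come from simply doubling the 2,200-processor cluster, and the gap only widens once scaling losses are counted.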
Game dev and music blog
That's thermodynamics. It's true for any fuel. It's even true for oil and nuclear energy - the difference being only that the energy wasn't put in during our lifetime. (And in the case of nuclear, that the pre-existing energy is all but inexhaustible.)
>It's all for reserved for Doom III on longhorn.
Sorry, I played Doom III yesterday at E3. That joke is (in your best Iron Chef voice) o-vah!
$2,863,104 in license fees going SCO's way!
I can see the investors now rubbing their 2 cents together....
This is the official Top 500 list of supercomputers (not updated yet, although Thunder is mentioned as '*possibly* the second-most powerful computing machine on the planet'). Linux moving up to second place (from fifth a bit ago, iirc), woohoo! Only one left to beat!
Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
It also has twice the processors, to generate the 2x speed that they claim. Something tells me, now that Virginia Tech is receiving the Xserve G5 cluster nodes, that they may want to add some more units. They can put 48 units in each rack now, rather than 12 of the full-size G5 desktop form factor. According to my primitive calculations, that would allow them to run 4 times as many machines in the same space (would be over 8000 CPUs). I figure that will likely kick the crap out of this new Linux cluster...
MacOSX, because making *NIX better is a lot better than waiting for Micro$loth to fix Windows