North America's Fastest Linux Cluster Constructed

Posted by CowboyNeal on Thursday May 13, 2004 @02:53PM from the where-was-the-lightning dept.

SeanAhern writes "LinuxWorld reports that 'A Linux cluster deployed at Lawrence Livermore National Laboratory and codenamed 'Thunder' yesterday delivered 19.94 teraflops of sustained performance, making it the most powerful computer in North America - and the second fastest on Earth.'" Thunder sports 4,096 Itanium 2 processors in 1,024 nodes, some big iron by any standard.

The way I see it... by blackula · 2004-05-13 15:04 · Score: 2, Redundant

...there are basically three types of clusters:

1) Shared nothing: the computers are connected to each other only via a simple IP network. No disks are shared, and each machine serves part of the data. These clusters don't work reliably when you have to do aggregations. E.g. if one of the machines fails and you try to do an "avg()" over data that is spread across machines, the query will fail, since one of the machines is not available. Most enterprise apps cannot work in this config without degradation; e.g. an IBM study showed that a 2-node cluster is slower and less reliable than a 1-node system when running SAP. IBM (on Windows and Unix) and MS use this type of clustering (also called the federated database approach).

2) Shared disk between two computers: there are multiple machines and multiple disks, and each disk is connected to at least two computers. If one of the computers fails, the other takes over. No mainstream database uses this mode, but it is used by HP NonStop. Still, each machine serves up part of the data, and hence standard enterprise apps like SAP cannot take advantage of clustering without a lot of modification.

3) Shared everything: each disk is connected to all the machines in the cluster. Any number of machines can fail and the system keeps running as long as at least one machine is up. This is used by Oracle. All the machines see all the data. Standard apps like SAP can run in this kind of config with minor or no modification. This method is also used by IBM in their mainframe database (which outsells their Windows and Unix database by a huge margin). Most enterprise apps are deployed in this type of cluster configuration.

Approach 1 is simpler from a hardware point of view. It is also the easiest for database kernel writers to implement. However, the user needs to break up the data judiciously and spread it across machines, and adding or removing a node requires re-partitioning the data. Mostly only custom apps that are fully aware of your partitioning will be able to take advantage. It is also easy to make it scale for a simple custom app, and so most TPC-C benchmarks are published in this configuration.

Approach 3 requires a special shared disk system. The database implementation is very complex: the kernel writers have to worry about two computers simultaneously accessing disks or overwriting each other's data. This is what Oracle is pushing across all platforms and IBM is pushing for its mainframes.

Approach 2 is similar to approach 1 except that it adds redundancy and hence is more reliable.

So what type are we talking about here?
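
The two shared-nothing problems described above (a global aggregate failing when one node is down, and re-partitioning when the node count changes) can be illustrated with a toy Python sketch. It is not how any real database works: the class and function names are hypothetical, and the modulo placement of keys is an assumption chosen only for simplicity.

```python
class NodeDown(Exception):
    """Raised when a query needs a partition whose node is unavailable."""


class SharedNothingCluster:
    """Toy model: each node owns exactly one partition; nothing is shared."""

    def __init__(self, node_count):
        self.partitions = [[] for _ in range(node_count)]
        self.up = [True] * node_count

    def owner(self, key):
        # Naive modulo placement (an assumption, not a real product's scheme).
        return key % len(self.partitions)

    def insert(self, key, value):
        self.partitions[self.owner(key)].append((key, value))

    def avg(self):
        # A global aggregate needs a partial (sum, count) from every node,
        # so a single unavailable node makes the whole query fail.
        total, count = 0, 0
        for node, rows in enumerate(self.partitions):
            if not self.up[node]:
                raise NodeDown(f"node {node} is down; avg() cannot complete")
            total += sum(value for _, value in rows)
            count += len(rows)
        return total / count


def moved_keys(keys, old_nodes, new_nodes):
    """Count keys whose owning node changes when the node count changes."""
    return sum(1 for k in keys if k % old_nodes != k % new_nodes)


if __name__ == "__main__":
    cluster = SharedNothingCluster(4)
    for k in range(100):
        cluster.insert(k, k)
    print(cluster.avg())      # all four nodes up: the aggregate succeeds

    cluster.up[2] = False     # simulate one failed node
    try:
        cluster.avg()
    except NodeDown as err:
        print(err)

    # How many of the 100 keys must move when a fifth node is added?
    print(moved_keys(range(100), 4, 5))
```

With this naive placement, going from 4 to 5 nodes relocates most of the keys, which is the re-partitioning cost mentioned above. Real systems may use other placement schemes that move less data.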
Comment removed by account_deleted · 2004-05-13 15:52 · Score: 1, Redundant

Comment removed based on user account deletion
but.... by charstar · 2004-05-13 15:58 · Score: 0, Redundant

...can i play warcraft3 on it?
A lot of modification by callipygian-showsyst · 2004-05-13 16:14 · Score: 0, Redundant

Pineapple on a lot of modification! Shared everything: in this, each disk is connected to all the data. standard apps like SAP etc can be used in other ways (you could run a web site off an Itanium 2), but the segment that Cray, SGI, HP Alphaservers, etc. have traditionally dominated.
The segment that doesn't give a shit about cooling, power consumption, or price-performance, but who just need to break up data judiciously and spread across machines. Also adding a node will require re-partitioning of data. Mostly only custom apps which are fully aware of your partitioning etc will be able to take advantage.
It is also likely cheaper than a similarly equipped Itanium 2 server would easily top $6,000. But anyway, good game to Lawrence Livermore. I'll be curious to see if Apple has another volley to fire before the top500 list closes for this round.
- "When you want to occasionally hand-tune code segments in assembler, etc."
Anyways, I'm not a fan of Intel lately, but the Itanium could have been a good choice. ALSO: Don't forget that the Itanium 2 is $1398 per chip on Pricewatch, and a dual processor Xserve.

--
Best Buy can have you arrested
Re:Very great and all... by medelliadegray · 2004-05-13 17:02 · Score: 0, Redundant

"If you want the fastest, in many cases you want the Itanium. If you want the best value (which still performs quite close to the fastest), you want an Opteron."

You make good points throughout your reply, but if you're clustering, the idea of buying the fastest available just doesn't make sense, unless it really is that much faster even in a cluster environment?

I'm guessing, as you also mentioned, that Intel probably cut them a sweet deal if they used Intel's flagship.

The only other option would be that they thought Intel would hold up better/be more stable. /shrug

--
Troll, Troll, go away and flame again some other day