North America's Fastest Linux Cluster Constructed
SeanAhern writes "LinuxWorld reports that 'A Linux cluster deployed at Lawrence Livermore National Laboratory and codenamed 'Thunder' yesterday delivered 19.94 teraflops of sustained performance, making it the most powerful computer in North America - and the second fastest on Earth.'" Thunder sports 4,096 Itanium 2 processors in 1,024 nodes, some big iron by any standard.
And you thought I was going to say something else...
But why did they use itanium processors? Were they acquiring parts before Opterons were availabel? Did they have a problem with Xeon processors? Or did they have too much cash lying around?
If my answers frighten you, stop asking scary questions.
It's all for reserved for Doom III on longhorn.
this thing should do doom 3 with a software renderer at a very playable 47 FPS...
If google's cluster is interconnected via ethernet, there is a whole range of computational problems it can't tackle. If you want to simulate a spatial phenomenon with lot of things going back and forth in a volume, you're bound to have a _lot_ of communications. The cost of the interconnect system in those simulation systems is often a substantial proportion of the total cost of the installation.
Also in completely unrelated news, Bill Gates announced the first fully installed test of Longhorn happened today.
"We sold the Inaniums! We sold the Inaniums!"
"The Itaniums, however, remain unsold."
*hopes that was not an actual mistake but rather a poorly conceived pun on "inane"...*
I've got more mod points and GMail invi
I think you've got that backwards, Quadrics is the performance leading, not the price/performance leader. Myrinet, SCI, and Infiniband all beat it in price/performance. Quadrics is faster, and scales to more nodes than the others.
According to Quadrics latest price list, the cards are $1200 each, $913 per port for a 64 node switch, and $185-$265 for a cable. That's $2300/node.
Myrinet cards are $595, the switch is $400 per port for 64 nodes, and the cables are ~$50. That's $1050/node.
Quadric's price for a 1024 node interconnect is $4,176,094. That's hardly chump change. The bandwith is about 10x higher than gigabit ethernet, and the latency about 100x lower.
I just coded some IA-64 assembly and from what I've seen, this comment is dead-on. They've got a lot of interesting features:
If you just have a simple sequence of operations, each dependant on the one before, you can't really take advantage of these capabilities. (My code was like this. Even though performance wasn't my reason for writing assembly, it was a little disappointing that I couldn't play with the new toys.) If you're expecting these features to make Word start faster, you'll probably be disappointed.
But if you're doing intensive computations in a tight loop, you can do amazing things. If you can get all the execution units working simultaneously, it will fly. And the features like rotating registers are designed to make that possible. You need a very good compiler or a very smart person to hand-tune it. You may need to recompile to tune if your memory latency changes (affecting how many iterations to run at once) or they come out with a new chip with more sets of execution units. But in a situation like this, none of that is a problem. They'll have applications designed to run as fast as possible on this machine. They may never be run anywhere else.