With Linux Clusters, Seeing Is Believing
Roland Piquepaille writes "As last month's release of the latest Top500 list reminded us, the most powerful computers are now reaching speeds of dozens of teraflops. When these machines run a nuclear simulation or a global climate model for days or weeks, they produce datasets of tens of terabytes. How can anyone visualize, analyze and understand such massive amounts of data? The answer is now obvious: by using Linux clusters. In this very long article, "From Seeing to Understanding," Science & Technology Review looks at the technologies used at Lawrence Livermore National Laboratory (LLNL), which will host IBM's BlueGene/L next year. Visualization will be handled by a 128- or 256-node Linux cluster. Each node contains two processors sharing one graphics card. Meanwhile, the EVEREST, built by Oak Ridge National Laboratory (ORNL), has a 35-million-pixel screen driven by a 14-node dual-Opteron cluster sending images to 27 projectors. Now that Linux superclusters have almost swallowed the high-end scientific computing market, they're building momentum in high-end visualization as well. The article linked above is nine pages long when printed and contains tons of information. This overview focuses more on the hardware deployed at these two labs."
This is how we nerds measure our penises. ;)
Virginia Tech's "System X" cluster cost a total of $6M for the asset alone (i.e., not including buildings, infrastructure, etc.), for performance of 12.25 Tflops.
By contrast, NCSA's surprise entry in November 2003's list, Tungsten, achieved 9.82 Tflops for $12M asset cost.
Double the cost, for performance that is lower by a Top 100 supercomputer's worth.
And it wasn't because Virginia Tech had "free student labor": it doesn't take $6M in labor to assemble a cluster. Even if we give it an extremely, horrendously liberal $1M for systems integration and installation, System X is still ridiculously cheaper.
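To make the comparison concrete, here is a minimal sketch of the price/performance arithmetic using only the figures quoted in this comment. The function name and the optional labor parameter are illustrative assumptions, not anything published by Top500 or either site.

```python
# Figures quoted in the comment above: asset cost in millions of USD, Linpack Tflops.
systems = {
    "System X (Virginia Tech)": {"cost_musd": 6.0, "tflops": 12.25},
    "Tungsten (NCSA)": {"cost_musd": 12.0, "tflops": 9.82},
}

def cost_per_tflop(cost_musd, tflops, extra_labor_musd=0.0):
    """Return millions of dollars per Tflop, optionally adding a labor estimate."""
    return (cost_musd + extra_labor_musd) / tflops

for name, s in systems.items():
    print(f"{name}: ${cost_per_tflop(**s):.2f}M per Tflop")

# Hypothetical: charge System X the "horrendously liberal" $1M for integration.
print(f"System X with $1M labor: ${cost_per_tflop(6.0, 12.25, 1.0):.2f}M per Tflop")
```

On these numbers System X comes out at roughly $0.49M per Tflop against roughly $1.22M for Tungsten, and adding the $1M labor estimate only raises System X to about $0.57M.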
I know there will be a dozen predictable responses to this, deriding System X, Virginia Tech, Apple, Mac OS X, linpack, Top 500, and coming up with one excuse after another. But won't anyone consider the possibility that these Mac OS X clusters are worth something?
Damn! What kind of paper stock are you printing on?
--- Ban humanity.
Roland Piquepaille and Slashdot: Is there a connection?
I think most of you are aware of the controversy surrounding regular Slashdot article submitter Roland Piquepaille. For those of you who don't know, please allow me to bring forth all the facts. Roland Piquepaille has an online journal (I refuse to use the word "blog") located at www.primidi.com. It is titled "Roland Piquepaille's Technology Trends". It consists almost entirely of content, both text and pictures, taken from reputable news websites and online technical journals. He does give credit to the other websites, but it wasn't always so. Only after many complaints were raised by the Slashdot readership did he start giving credit where credit was due. However, this is not what the controversy is about.
Roland Piquepaille's Technology Trends serves online advertisements through a service called Blogads, located at www.blogads.com. Blogads is not your traditional online advertiser; rather than base payments on click-throughs, Blogads pays a flat fee based on the level of traffic your online journal generates. This way Blogads can guarantee that an advertisement on a particular online journal will reach a particular number of users. So advertisements on high-traffic online journals are appropriately more expensive to buy, but the advertisement is guaranteed to be seen by a large number of people. This, in turn, encourages people like Roland Piquepaille to try their best to increase traffic to their journals in order to increase the going rates for advertisements on their web pages. But advertisers do have some flexibility. Blogads serves two classes of advertisements. The premium ad space that is seen at the top of the web page by all viewers is reserved for "Special Advertisers"; it holds only one advertisement. The secondary ad space is located near the bottom half of the page, so that the user must scroll down the window to see it. This space can contain up to four advertisements and is reserved for regular advertisers, or just "Advertisers". Visit Roland Piquepaille's Technology Trends (www.primidi.com) to see it for yourself.
Before we talk about money, let's talk about the service that Roland Piquepaille provides in his journal. He goes out and looks for interesting articles about new and emerging technologies. He provides a very brief overview of the articles, then copies a few choice paragraphs and the occasional picture from each article and puts them up on his web page. Finally, he adds a minimal amount of original content between the copied-and-pasted text in an effort to make the journal entry coherent and appear to add value to the original articles. Nothing more, nothing less.
Now let's talk about money. Visit http://www.blogads.com/order_html?adstrip_category=tech&politics= to check the following facts for yourself. As of today, December XX 2004, the going rate for the premium advertisement space on Roland Piquepaille's Technology Trends is $375 for one month. One of the four standard advertisements costs $150 for one month. So, the maximum advertising space brings in $375 x 1 + $150 x 4 = $975 for one month. Obviously not all $975 will go directly to Roland Piquepaille, as Blogads gets a portion of that as a service fee, but he will receive the majority of it. According to the FAQ, Blogads takes 20%. So Roland Piquepaille gets 80% of $975, a maximum of $780 each month. www.primidi.com is hosted by clara.net (look it up at http://www.networksolutions.com/en_US/whois/index.jhtml). Browsing clara.net's hosting solutions, the most expensive hosting service is their Clarahost Advanced (http://ww
Supercomputers have become so advanced we need more supercomputers just to understand them.
Does this mean that we don't have to just imagine a Beowulf cluster anymore?
It is easier to build strong children than to repair broken men. -Frederick Douglass
A machine that can compile a Stage1 Gentoo install in a reasonable amount of time.
"Joy is not in things; it is in us." Richard Wagner
So, if I've got this straight, Slashdot drives the banner ad traffic, real journalists write the content, and all Roland has to do is rip off a few articles, then sit in the middle and collect the checks. How do I get a sweet gig like that?
To reaffirm what the article said, building Linux clusters is very simple. In fact, certain distributions such as BCCD and ClusterKnoppix exist specifically for that. Although configuring clustering software such as PVM, MPI, LAM, and MOSIX wouldn't be a problem, I prefer something that has almost everything built into one package, which is why I like the above distros. In fact, I built a cluster (using BCCD) at home and used it to render images with POV-Ray. I used PVMPOV for the rendering-on-a-cluster part. Although there were only four machines, the speed difference was evident. And above all, making clusters is extremely cool and shows the paradigm shift towards parallel computing.
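For anyone wondering why rendering spreads so well across a handful of machines, here is a rough sketch of one simple way to divide an image among nodes: give each node a contiguous strip of rows. This is an illustration only and not necessarily how PVMPOV schedules its work; the function name and the 480-row image height are assumptions.

```python
def split_rows(height, workers):
    """Divide image rows 0..height-1 into `workers` contiguous strips of near-equal size.

    Each strip is returned as a (start, end) pair with `end` exclusive.
    """
    base, extra = divmod(height, workers)
    strips, start = [], 0
    for i in range(workers):
        # The first `extra` strips take one additional row to absorb the remainder.
        size = base + (1 if i < extra else 0)
        strips.append((start, start + size))
        start += size
    return strips

# Four machines, as in the home cluster described above; 480 rows is an assumed height.
for node, (first, last) in enumerate(split_rows(480, 4)):
    print(f"node {node}: rows {first}..{last - 1}")
```

Because each strip can be ray traced without reference to the others, the nodes rarely need to talk to each other until the strips are stitched back together.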
So now Monsieur Piquepaille has been shamed by scornful posters into including a link to the actual article (instead of harvesting page views), but he'd still really, really like you to click through to his page....
grammar-lesson free since 1999. (rescinded - 2005)
Now that Linux superclusters have almost swallowed the high-end scientific computing market...
While some simulations parallelize very well to cluster environments, there are still plenty of tasks that don't split up like that.
The reason clusters make up a lot of the Top 500 list is that they are relatively cheap and you can make them faster by adding more nodes, whereas traditional supercomputers need to be designed from the ground up.
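One common way to put numbers on the "doesn't split up" problem is Amdahl's law, which this comment does not name but which captures the same idea: whatever part of a job must run serially caps the speedup, no matter how many nodes are added. A minimal sketch follows; the parallel fractions and node counts are illustrative assumptions, not measurements from any real code.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal speedup on `nodes` processors when only `parallel_fraction` of the work can be split."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / nodes)

# Assumed workloads: half parallel, 95% parallel, and 99.9% parallel.
for p in (0.50, 0.95, 0.999):
    row = ", ".join(f"{n} nodes: {amdahl_speedup(p, n):.1f}x" for n in (16, 256, 1024))
    print(f"parallel fraction {p}: {row}")
```

A job that is only half parallel never gets past a 2x speedup, which is why adding nodes helps some workloads enormously and others hardly at all.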
Look here.
The speed you quoted is the theoretical peak, not the actual maximum achieved in a real world calculation (like the Top 500 organization's use of Linpack).
System X's equivalent theoretical peak is 20.24 TFlops.
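To show how far apart the two kinds of numbers are, here is a small sketch computing the share of theoretical peak that System X sustains on Linpack, using the two figures quoted in this thread. The function name is an illustrative assumption.

```python
def linpack_efficiency(rmax_tflops, rpeak_tflops):
    """Fraction of theoretical peak actually sustained on the benchmark."""
    return rmax_tflops / rpeak_tflops

# System X figures quoted in this thread: 12.25 Tflops measured, 20.24 Tflops peak.
eff = linpack_efficiency(12.25, 20.24)
print(f"System X sustains {eff:.1%} of its theoretical peak")
```

Comparing one machine's peak against another's measured result mixes the two scales, so any head-to-head should use the same kind of figure for both.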
I'm also not indicting Linux clusters in the least; they've clearly shown they can outperform traditionally architected and constructed supercomputers for many tasks, with the benefit of using commodity parts - at commodity pricing. All I'm saying is that there's a new player here, and it's a real contender, and has done a lot for very little money...which was the whole goal of Linux clusters in this realm in the first place.
(Also, as I said, the volunteer labor model is irrelevant: let's just pretend it was professionally installed for an additional $1M, or even $2M if that would satisfy you. It's still several million dollars cheaper, with roughly 2.4 Tflops more measured performance (12.25 vs. 9.82). These are BOTH rackmount clusters with similar numbers of nodes and processors, running a commodity OS with fast interconnects. There are differences, yes, and perhaps even differences in goals. But looking past that, price/performance for something like this is still an important metric.)
30 accepted stories since August 29th, 2004! Wtf is going on here?
E pluribus unum
...except get untold amounts of recognition, publicity, free advertising, news articles, and the capability to catapult themselves to the forefront of the supercomputing community overnight for a paltry sum of money, thus attracting millions of dollars of additional funding and grants to build clusters that WILL be doing real work, such as the one we're talking about now (which is more than capable now that it has ECC memory), and the several additional clusters they plan to build in the future, not to mention the benefit of proving that a new architecture, interconnect, and OS will perform well as a supercomputer, allowing more choice, competition, and innovation to enter the scene, which ultimately results in more and better choices for everyone.
Clusters are proven to be cost-effective, but they do require more labor to optimize code to get it to work in that environment. It's easier to have the system and the compiler do the work for you in a single-image system. This article addresses those issues and concerns: single image shared vs distributed memory in large Linux systems