Ask the Man Behind the NOAA's New Beowulf Cluster
Greg Lindahl sent in this story last September about a massive Alpha Linux cluster that's being built by HPTi for the NOAA's Forecast Systems Laboratories. What Greg forgot to mention when he submitted the original story is that he's the project's chief designer. What with all the Beowulf (and Alpha) interest around here, we figured he'd make a great interview guest, especially now that the project is well under way. Please post your questions below. Answers to 10 - 15 of the highest-moderated ones should appear within the next week.
I have recently become gainfully employed in a capacity which will require me to administer a Beowulf cluster. My question, Mr. Lindahl, is how you feel about the various competing technologies for distribution of computation. In particular, do you feel there is much to be gained from the work of the MOSIX project at The Hebrew University of Jerusalem? Traditionally tasks for Beowulf style supercomputers have required specific programming in MPI or PVM calls. MOSIX endeavors to provide adaptive load-balancing with process migration. Essentially this allows the programmer to forgo the hassle of parallelizing his code. Rather, he can now simply fork() or create SMP threads and the OS will automatically handle distribution of those processes over the cluster. Do you feel that this is a worthwhile avenue to pursue for scientific computation or are there issues which make MPI or PVM still a substantially better choice? Thank you for your time.
--
Brandon D. Valentine
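To make the fork()-based model in the question above concrete, here is a minimal sketch of a program that splits work across plain Unix processes with no MPI or PVM calls. It only illustrates the programming style. Nothing in it is MOSIX-specific, the function names (`partial_sum`, `forked_sum`) are made up for this example, and whether a given kernel would actually migrate these children to other nodes is an assumption taken from the question, not something the code shows.

```python
import os


def partial_sum(start, stop):
    """The work done by one child: sum of squares over [start, stop)."""
    return sum(i * i for i in range(start, stop))


def forked_sum(n, n_workers):
    """Sum the squares below n using forked children and pipes (Unix only)."""
    step = (n + n_workers - 1) // n_workers  # chunk size, rounded up
    children = []
    for w in range(n_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute one chunk, write the result, exit immediately.
            os.close(read_fd)
            result = partial_sum(w * step, min((w + 1) * step, n))
            os.write(write_fd, str(result).encode())
            os.close(write_fd)
            os._exit(0)
        # Parent: keep only the read end and go on to fork the next child.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())  # read until the child closes its end
        os.waitpid(pid, 0)  # reap the child
    return total
```

Calling `forked_sum(1000, 4)` gives the same answer as the serial loop. The point of contrast with MPI or PVM is that the code never names a node or sends an explicit message between workers; placement is left entirely to the operating system.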
This brings to mind a more fundamental and philosophical question - Does your computer (or any one that's possible to build) have enough horsepower to out-calculate that analog computer called reality that we all know and love so very much?
make world, not war
The raw performance of the hardware being used for scientific and parallel programming has improved by leaps and bounds in the past 10-20 years. However, most folks still program these supercomputers much the same way they did in the 80's: Unix, Fortran, explicit message passing, etc.
You have worked in research with Legion and in industry at HPTi. Do you think there is hope for some radical new programming technology that makes clusters easier for scientists to use? If so, what do you think the cluster programming environment of tomorrow might look like?
One of the weaknesses of Beowulfs seems to me to be a lack of decent (job) management software. How do you split the cluster's resources? Do you run one large simulation on all the CPUs, or do you run 2 or 3 jobs on 1/2 or 1/3 of the available CPUs?
Is there provision for shifting jobs onto different nodes if one of them dies during a run?
Most of the IS/IT trade publications and media do not fully comprehend the differences between massively multiprocessor systems with shared memory and clusters of systems and processors with their own local memory, or supercomputing clusters. This is quite evident in a recent article regarding TPC-D performance between clustered Compaq Wintel/MSSQL systems and a single, shared memory Sun/Oracle system, where the Compaq cluster outperformed the Sun solution in 2 of the 10 standard benchmarks. Basic laws of statistics negate those results because the two systems were not of the same class -- e.g., to be fair, Microsoft-Compaq should have compared performance to an equivalent cluster of lower-cost Sun systems (let alone a Lintel cluster!).
As you and I already know (and I hope everyone reading this now knows), there are several applications where lower-cost clusters cannot always do the job as efficiently as more costly shared memory systems (low-latency, real-time applications such as real-time simulations come to mind). That is why the Compaq Wintel cluster scored far below the shared memory Sun system in many of the other 8 benchmarks in the aforementioned study.
As such, I am interested in the considerations the NOAA has had to make in evaluating shared memory versus clustered systems. Specifically:
- What are some of the NOAA/NWS programs and software that will not be applicable for execution on this new cluster?
- What [estimated] percentage do these programs make up of the total applications the NOAA uses, both quantity and in time of execution?
- What [assuming] shared memory systems and solutions does the NOAA use for these applications?
Of course, the lower the number in the first two questions, the more advantageous the existence of a supercomputing cluster is to an organization. For example, in the aerospace industry, the quantity of cluster-efficient applications may be small, but the total execution time of a "run" of these select applications can greatly outweigh all others. Again, speaking from my aerospace background, applications like Monte Carlo, CFD, and 6DOF (six degrees of freedom) runs and simulations are extremely time consuming. Monte Carlo is an ideal application for clustering since each "run" result is completely independent of the others (almost linear performance improvement when distributed in a cluster). CFD is very close to linear (~90% efficient) and 6DOF, I would guess, could be as high as 60 or 70%, if it is written to take advantage of distributed computing systems.
The main reason why these engineering applications are so efficient on clusters is the nature of how they use data. They need little to start crunching, and return little. But during the run, they create and use massive amounts of data, all of which is "temporary." This is in stark contrast to databases (such as those targeted by the aforementioned TPC-D benchmarks), where data, not computational results, is the focus of the application. By using supercomputing clusters for computation-driven engineering apps, we can save both money on systems and the time of our engineers waiting on results.
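As a rough illustration of why independent Monte Carlo runs distribute so well, here is a minimal sketch that estimates pi from several runs that share no state. It uses Python's standard `multiprocessing.Pool` on a single machine as a stand-in for cluster nodes, which is an assumption made for brevity. The function names (`count_hits`, `estimate_pi`), the run count, and the sample count are all made up for this example.

```python
import random
from multiprocessing import Pool


def count_hits(args):
    """One independent run: count random points inside the unit quarter-circle."""
    seed, n_samples = args
    rng = random.Random(seed)  # private generator, so runs share no state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits  # a single integer is all that travels back


def estimate_pi(hit_counts, n_samples):
    """Combine the per-run hit counts into one estimate of pi."""
    return 4.0 * sum(hit_counts) / (len(hit_counts) * n_samples)


if __name__ == "__main__":
    n_runs, n_samples = 8, 100_000  # illustrative sizes only
    tasks = [(seed, n_samples) for seed in range(n_runs)]
    with Pool() as pool:  # one worker per CPU by default
        hit_counts = pool.map(count_hits, tasks)
    print(estimate_pi(hit_counts, n_samples))
```

Each run takes in two small numbers and returns one, while all the intermediate random points stay local to the worker. That matches the "need little to start, return little" pattern described above, and it is why the workers never have to talk to each other. On the efficiency figures quoted above: if efficiency means speedup divided by the number of nodes, a code that is 90% efficient on 16 nodes runs about 0.9 × 16 ≈ 14 times faster than on one node.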
As such, I am interested in the overall increase in efficiency you are seeing after the introduction of supercomputing clusters. Specifically:
[ I now work in the semiconductor design industry, and we are looking at acquiring some Linux supercomputing clusters to speed up the runs of EDA (electronic design automation) tools like those for IC layout and the like. ]
I appreciate your time and wish your organization and yourself the best in our Linux and OSS endeavors.
-- Bryan "TheBS" Smith
Independent Author, Consultant and Trainer
first post?
How do you think the new wave of Beowulf clusters will affect all of supercomputing, not just forecasting?
There are four boxes used in defense of liberty: soap, ballot, jury, ammo. Use in that order.
How did you come to be the project's chief designer? I'm curious to know the background of anyone who gets to work on such an interesting project.
Got Rhinos?
Are other government agencies going to duplicate your work? Have they already? If so, for what purposes?
www.alarmist.org
can you give us some information about what exactly is in this cluster? what alphas, etc?
I built a Beowulf-style cluster this past semester in college for independent study. One of the biggest hurdles we had was picking a message passing interface such as MPI or PVM. Configuring it across multiple platforms was even worse (we had a mixture of old Intels, Sun SPARCs and IBM RS/6000s). What do you see in the future for these interfaces in terms of setup and usage, and will cross-platform clusters become easier to install and configure?
Some people take their .sig way too seriously
Ok, a two parter:
As I understood it, weather models are a fairly hard thing to paralleliz (how the hell do you spell that?) because of the interdependence of pieces of the model. This would seem to me to make a Beowulf cluster a tough choice, as its inter-CPU bandwidth is pretty low, right? And I thought that's why most weather prediction places chose high-end supercomputers: for their custom and expensive inter-CPU I/O?
Second part: Is weather prediction getting any better? Everything I've read about dynamic systems says that prediction past a certain level of detail or timeframe is impossible. Is that true?
Disclaimer: I might be dumb.
Hotnutz.com - Funny
I am curious as to whether (no pun intended...:)) or not you have ever done any testing to see if a distributed.net type environment would be useful for your type of work?
It seems to me that there are more than a few people who are willing to donate spare CPU cycles for various projects. At a minimum, you could concentrate on the client-side binaries and not worry as much about hardware issues.
In the immortal words of Socrates, who said; 'I drank what?'
Before deciding on a Beowulf cluster, what different options did you explore (Cray? IBM?), and what motivated you to choose the Beowulf system?
Additionally, to what would you compare the system that you are planning to build, as far as computing power is concerned?
Thanks,
VVulfe
Having built a few small ones, I got to know quite a bit about Linux clusters, and about programming for them. Therefore, this question has nothing to do with clusters.
What was the biggest 'WTF was I thinking' on this project? I'd imagine there was a fair amount of lateral space allowed to the designers, and freedom to design also means freedom to screw up.
.sig: Now legally binding!
Seriously, what was the most challenging maintenance task you had to undertake? Do you anticipate a trade-off point where the number of machines makes maintenance impossible? Do you have any pearls of wisdom for those of us just getting involved in the initial design of such clusters, so that maintaining them in the future is less painful?
First off, from what I have gathered, it was not clear whether your background is in weather or not, so I am hoping it is. Here are a couple of questions:
1) Having just graduated with a BS in Atmospheric Sciences, I have had a chance to take numerical weather prediction courses over the last five years. With this new influx of processing power, where do you see numerical models going in the future?
2) Somewhat related to 1), with mesoscale models becoming more popular (MM5 quickly springs to mind), where do you see the balance of processor time going in these models: toward the ability to get a model out faster, or toward computing more variables to provide a more accurate forecast at the smaller scale?
3) Not knowing too much about the origins of these models, I was interested to find that a person could get the source to the MM5 and modify it as they see fit. Will models developed in the future follow this same trend? With powerful computers becoming affordable, it would not be that difficult for a university to build one and run a particular model for their area (I believe that Ohio State is doing it, again, with the MM5)?
Thanks!
Bryan R.
The price of freedom is eternal vigilance, or $12.50 as seen on eBay.....
Besides that, best of luck, and I can't wait to see the final product. ;^)
-legolas
i've looked at love from both sides now. from win and lose, and still somehow...
Why did you choose Alpha processors for the individual nodes? Why not something cheaper with more nodes, or something more expensive with fewer nodes? What other configurations did you consider, and why weren't they as good?