High-Performance Linux Clustering
An anonymous reader writes "High Performance Computing (HPC) has become easier, and two reasons are the adoption of open source software concepts and the introduction and refinement of clustering technology. This first of two articles discusses the types of clusters available, uses for those clusters, reasons clusters have become popular for HPC, some fundamentals of HPC, and the role of Linux in HPC."
You can't even run Java on them.
What do you mean? I thought 1.4.2 and up had support for Itanium. Check this white paper (search for Itanium). Are their claims false, or are you running an older version of the JRE?
I'm wanting to build one of these, but I really don't need it. Time may change that.
For some very good information on F/OSS-based clustering, check out aggregate.org. They have really neat ideas that are reasonably well documented and freely implementable/usable. I built a little cluster (AFAPI on a WAPERS switch) with them for my high school senior project, and it was a great experience.
Though MPPs are kind of like clusters, and the boundary between the two is vague, I think there's definitely a distinction. In many MPPs, nodes share access to memory, just at a performance penalty. Often the scientific binary is written using a message-passing tool like MPI, but the OS is often run with direct memory access. From a systems-administration point of view, an MPP is definitely different from a cluster. In an MPP you don't have 4000 root hard drives and 4000 power supplies to replace when they break. An MPP may be like a (fast) cluster from the programmer's point of view, but it is a lot simpler to deploy and manage. (Blue Gene, XT3, Altix)
I also contest some of the distinctions drawn about vector processor systems. The two vector systems currently on the market, the Cray X1 and the NEC SX-8, are clusters. Each node just happens to be a vector SMP. The Earth Simulator is a 640-node cluster of 8-way SMP boxes, where each of the processors in the SMP is a vector CPU. However, the predominant programming method even on these boxes is explicit message passing like MPI. Co-Array Fortran and Unified Parallel C are faster, but slow to catch on.
Good summary of the common case though.
Rocks has a great system for making high-performance clusters from similar machines. A Rocks cluster consists of a front-end ("master") node and a bunch of compute nodes (and I think special-purpose nodes).
The master gets a full Linux (RedHat-based) install. It's an NFS/DHCP/Kickstart server for the compute nodes, and runs whatever other services you want the compute nodes to use. The master has two network cards and acts as a firewall (NAT optional).
The compute nodes boot via DHCP and Kickstart, downloading their kernel and whatever other OS files you want to their local disk. You decide how much NFS or local disk to use.
Job queueing is handled by, e.g., Sun Grid Engine (an Open Source queueing package) or some other queueing software.
Here's the neat thing: to make a change to a compute node setup, you change the Kickstart config and reboot all the compute nodes (as they finish whatever queued work they're doing, or immediately if you want). That makes the sysadmin's life easy, while still maintaining the speed of having the OS on the local disk.
Do you watch DVDs? Do you dream of squeezing all your DVDs onto a harddrive and streaming them to a media PC attached to your TV?
;)
You could copy the DVDs at ~8GB each to some large hard drives, or you could transcode them to much smaller formats with all the garbage removed and go from ~8GB/movie to less than 4GB/movie. But to do this you need lots of processing power. A cluster works very well for this, and the software is already there for you:
http://www.exit1.org/dvdrip/doc/cluster.cipp
For the cost of some overpriced Dell crap video editing PC you could build a decent diskless cluster. Who needs harddrives, monitors, video cards, keyboards, mice, etc. At least more than one set.
That's a bugger, but still, web-browser applets != applications. If they offer the J2SE for Itanium, you should be good to go with anything other than browser applets. Java applications should run just fine, and even stand-alone applets should be runnable with the Java appletviewer.
MOSIX is really more of a halfway point between a traditional cluster and a "single system image" (SSI) sort of cluster. Unfortunately, some aspects of clustered computing are still extremely difficult to abstract away into an SSI type of implementation. I had hoped over the years that the MOSIX work would get folded in with mainstream Linux's NUMA scheduling and memory allocation, essentially treating non-local CPU and memory resources (other nodes) like a second layer of NUMA with even less connectivity than local NUMA nodes have. Throw in a truly distributed redundant filesystem (à la the Google filesystem that we don't know all the details about), and we could really begin to approach the concept of turning large-scale local and even distributed clusters into truer single system images. But the kind of fundamental work that needs to be done to get these things rolling hasn't even started, so I don't expect it to happen anytime soon. Aside from all that clustering and kernel work, there would have to be some evolutionary changes in how we write code, and in the languages we write it in, in order to smoothly take advantage of the dynamic availability and locality of resources.
For now, your best bet is to construct your HPC Linux cluster as a high-speed network of interconnected but independent Linux machines, using a network topology that suits the class of problems you face and how easily the problem can be broken into loosely coupled pieces, and then code the "clustering" aspect into your application code itself, usually working off of libraries like MPI.
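To make "code the clustering aspect into your application" a bit more concrete, here is a rough sketch of the usual split / compute / combine shape. It is plain Python that only simulates the nodes in one process; it does not use MPI at all. The names split_work, node_kernel and run_cluster_job are made up for illustration, and in a real job an MPI library would carry the chunks and partial results between machines.

```python
from typing import List, Sequence


def split_work(items: Sequence[int], n_nodes: int) -> List[List[int]]:
    """Block decomposition: contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_nodes)
    chunks = []
    start = 0
    for rank in range(n_nodes):
        # The first `extra` ranks take one additional item each.
        size = base + (1 if rank < extra else 0)
        chunks.append(list(items[start:start + size]))
        start += size
    return chunks


def node_kernel(chunk: Sequence[int]) -> int:
    """Work one node does on its own chunk, with no communication needed."""
    return sum(x * x for x in chunk)


def run_cluster_job(items: Sequence[int], n_nodes: int) -> int:
    """Simulated scatter / compute / reduce, all inside a single process."""
    chunks = split_work(items, n_nodes)           # "scatter" step
    partials = [node_kernel(c) for c in chunks]   # each node works independently
    return sum(partials)                          # "reduce" step on the master


if __name__ == "__main__":
    print(run_cluster_job(range(10), 4))
```

The more loosely coupled the problem, the more it looks like this: one scatter, a long independent compute phase, one reduce. Tightly coupled problems need messages in the middle of the compute phase, which is where the interconnect and topology start to matter.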
I would wager you've never used Java. I say that not as an insult, but because you simply have yet to realize that a change made back in 1996 made the average piece of Java code run as fast as the average piece of C/C++ code.
... suggesting that java performance is catching up to or even pulling ahead of gcc at least.
---
Five composite benchmarks listed below show that modern Java has acceptable performance, being nearly equal to (and in many cases faster than) C/C++ across a number of benchmarks.
1. Numerical Kernels
Benchmarking Java against C and Fortran for Scientific Applications
Mark Bull, Lorna Smith, Lindsay Pottage, Robin Freeman,
EPCC, University of Edinburgh (2001).
The authors test some real numerical codes (FFT, matrix factorization, SOR, fluid solver, N-body) on several architectures and compilers. On Intel they found that the Java performance was very reasonable compared to C (e.g., 20% slower), and that Java was faster than at least one C compiler (KAI compiler on Linux).
The authors conclude, "On Intel Pentium hardware, especially with Linux, the performance gap is small enough to be of little or no concern to programmers."
2. More numerical methods: SciMark2 scores
R. F. Boisvert, J. Moreira, M. Philippsen, R. Pozo,
Java and Numerical Computing,
Computing in Science & Engineering, 3(2):18-24, Mar.-Apr., 2001.
SciMark includes a number of numerical codes. On a PIII/500, MFlops (higher is better):
ibm jdk 1.3.0 84.5
linux2.2 gcc (2.9x) -O6 87.1
3. Still more numerical methods
From the book Object-Oriented Implementations of Numerical Methods by Didier Besset (Morgan Kaufmann, 2001):
Operation                           Units   C     Smalltalk   Java
Polynomial 10th degree              msec.   1.1   27.7        9.0
Neville Interpolation (20 points)   msec.   0.9   11.0        0.8
LUP matrix inversion (100 x 100)    sec.    3.9   22.9        1.0
4. Microbenchmarks (cache effects considered)
Several years ago these benchmarks showed Java performance at the time to be somewhere in the middle of C compiler performance: faster than the worst C compilers, slower than the best. These are "microbenchmarks", but they do have the advantage that they were run across a number of different problem sizes, and thus the results do not reflect a lucky cache interaction (see more details on this issue in the next section).
These benchmarks were updated with a more recent Java (1.4) and gcc (3.2), using full optimization (gcc -O3 -mcpu=pentiumpro -fexpensive-optimizations -fschedule-insns2...). This time Java is faster than C in the majority of the tests, by a factor of more than 2 in some cases...
These tests were mostly integer (except for an FFT).
5. Microbenchmarks (cache effects not considered)
In January 2004 OSNews.com posted an article, Nine Language Performance Round-up: Benchmarking Math & File I-O. These
[Donald's] work in making the "piles of PCs" approach to high performance computing a reality with Beowulf has been responsible for vastly expanding the construction and use of massively parallel systems. Now, virtually any high school, never mind college, can afford to construct a system on which students can learn and apply advanced numerical methods.
In retrospect, however, it would seem that the obvious cost benefits of Beowulf very nearly killed the development and use of large SMP and vector processing systems in the US. My understanding of the situation is this:
* Before Beowulf, academics had a very hard time getting time on hideously expensive HPC systems.
* When Beowulf started to prove itself, particularly with embarrassingly parallel problems using MPI, those academics who happened to sit on DARPA review panels pushed hard to choke off funding for other HPC architectures, promising that they could make distributed memory parallel systems all singing, all dancing, and cheap(er).
* They couldn't really deliver, but in the meantime, Federal dollars for large shared memory and vector processing systems vanished, and the product lines and/or vendors with it.... at least in the US.
* Eight years later, only Fujitsu and NEC make truly advanced vector systems [top500.org], and Cray is only now crawling back out of the muck to deliver a new product. Evidently someone near the Beltway needs a better vector machine, and Congress ain't paying for anything made across the pond.
Cutting to the chase, did [Donald Becker] advance a "political" stand among [his] peers within the public-funded HPC community, or [was he] just trying to get some work done with the budget available at NASA?
Some guy on an HP forum asks how to get Java code to run in his Web browser on Linux Itanium. Shock and awe follows as he's told you can't run an applet on his multi-thousand dollar 64-bit workstation.
This is because the JDK / JRE for Linux Itanium does not have a browser plugin yet. But it should run any Java apps seamlessly. The same thing happens on AMD64. That's not a big deal. AMD64-ers can look at the Blackdown JDK. But for IA64, you're SOL.
By components I mean software, since the hardware is basically just a bunch of servers (or desktops), optionally with a faster-than-commodity network and some stuff like that. The optional parts depend on what kind of applications you run.
The most important cluster components are a base operating system and a batch scheduler like torque or slurm. There are also communications libraries (MPI and friends) and optimised math routines (matrix calculations, FFTs, etc) for some application types.
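For anyone who hasn't used a batch scheduler, the core idea fits in a few lines. What follows is only a toy sketch in Python, not how Torque or Slurm work internally: it is strictly first-in-first-out, counts nodes and nothing else, and the names Job and FifoScheduler are invented for the example. Real schedulers add priorities, time limits, backfill and much more.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class Job:
    name: str
    nodes_needed: int


class FifoScheduler:
    """Toy batch scheduler: strict FIFO, tracks only free node counts."""

    def __init__(self, total_nodes: int) -> None:
        self.free_nodes = total_nodes
        self.queue: Deque[Job] = deque()
        self.running: Dict[str, Job] = {}

    def submit(self, job: Job) -> None:
        self.queue.append(job)

    def dispatch(self) -> List[str]:
        """Start queued jobs in order until the head of the queue no longer fits."""
        started = []
        while self.queue and self.queue[0].nodes_needed <= self.free_nodes:
            job = self.queue.popleft()
            self.free_nodes -= job.nodes_needed
            self.running[job.name] = job
            started.append(job.name)
        return started

    def finish(self, name: str) -> None:
        """Release the nodes held by a completed job."""
        job = self.running.pop(name)
        self.free_nodes += job.nodes_needed


if __name__ == "__main__":
    sched = FifoScheduler(total_nodes=4)
    for job in (Job("a", 2), Job("b", 3), Job("c", 1)):
        sched.submit(job)
    print(sched.dispatch())  # only "a" fits; "c" waits behind "b"
    sched.finish("a")
    print(sched.dispatch())
```

Note that the small job "c" sits idle behind "b" even though a node is free for it. That wasted capacity is the kind of thing the more elaborate scheduling policies exist to reclaim.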
Then we have the administrative side, which isn't that specific to HPC clusters but is a general matter for anyone handling a large number of similar machines. You want an automatic installation method, not to answer 25 questions on the console every time you need to reinstall or add a node. You want a convenient way of synchronising configuration and settings. You want a distributed shell to run one command on all/many/several nodes without lots of arrow-up and command-line editing.
This should be familiar to cluster admins and to admins of server farms or large desktop deployments alike. Automate repetitive tasks, choose tools that reduce the maintenance burden, etc.
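A distributed shell is simple enough to sketch. The Python below is a minimal illustration, assuming passwordless ssh to every node is already set up; ssh_command and run_everywhere are names I made up, and existing tools do this far more robustly (timeouts, host groups, output folding). The runner argument is there so the fan-out logic can be tried without touching real machines.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Result = Tuple[int, str]  # (exit status, captured stdout)


def ssh_command(host: str, command: str) -> List[str]:
    """Build the argv for running one command on one host over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, command]


def _run_local(argv: List[str]) -> Result:
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.returncode, done.stdout


def run_everywhere(
    hosts: Iterable[str],
    command: str,
    runner: Optional[Callable[[List[str]], Result]] = None,
    max_parallel: int = 16,
) -> Dict[str, Result]:
    """Run the same command on every host in parallel, keyed by hostname."""
    host_list = list(hosts)
    run = runner or _run_local
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(lambda h: run(ssh_command(h, command)), host_list))
    return dict(zip(host_list, results))


if __name__ == "__main__":
    # Dry run with a fake runner, so no ssh connection is attempted.
    fake = lambda argv: (0, "would run %r on %s" % (argv[4], argv[3]))
    nodes = ["node%02d" % i for i in range(1, 4)]  # hypothetical hostnames
    for host, (status, out) in run_everywhere(nodes, "uptime", runner=fake).items():
        print(host, status, out)
```

Collecting results per host instead of just streaming output makes it easy to spot the one node that answered differently from the rest, which is usually the reason you ran the command in the first place.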