Choosing the Right Cluster System
ckotso asks: "So I've read here and there about Linux clusters, and I am ready to set about creating one with some help from the educational institute I am working for. So far I've found out about Beowulf, SCI and MOSIX. I really wish I could get some help on this, since NT is gradually making its way into the University and I hate to see this. I want to give a cheap and robust alternative to this place, I simply have to change their minds!" Interested? There's more information inside.
"My questions are:
- Have I missed any other serious competitor in the cluster field?
- What are the pros and cons of these systems?
- Has anyone tried them all and written a report on how they compare?"
--
Here is some information you may want to consider before starting your own cluster:
So, some positive factors, some negative ones. If you want to convince your University, remind them that they can count on the support of other universities and research centres the world over that are using this technology right now.
Good luck!
The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
--
Second: SCI is orthogonal to the other two technologies - it is a special hardware network technology (Scalable Coherent Interface), originally made to support distributed shared memory. You may be thinking of the software Dolphin Interconnect Solutions provide with their SCI solutions, but as far as I know, that doesn't directly enter into the same space, either. Their web pages certainly do not indicate that it does, and my discussions with (one of?) their Linux developer(s) implied that it contained somewhat more (lock managers etc.), but still not in the same space. A technology that competes with SCI, though proprietary, is Myrinet. It has a longer history than SCI, and has been less plagued with problems (though SCI is supposedly quite stable now).
Third: There are a bunch of other technologies (some cross-platform, some single-platform) that compete in making it easy to build clusters. MOSIX and Beowulf are just two of them. If you give more details of what you want to achieve, I'll dig out references from my collection (made to support the development of FreeBSD-specific clustering improvements, so some types of references may be lacking, but I'll probably be able to come up with at least some starting points for whatever cluster workload you have in mind).
Eivind.
Doubting the existence of evolution is like doubting the existence of China: It just shows that you're uninformed.
I would wholeheartedly recommend that anybody interested in clustering read Greg Pfister's "In Search of Clusters", published by Prentice Hall - ISBN 0-13-899709-8. It is the seminal work in this area.
Other good resources include:
- IEEE Task Force on Cluster Computing
http://www.dgs.monash.edu.au/~rajkumar/tfcc/index
- Linux-HA http://linux-ha.org/
- some general links
http://www.tu-chemnitz.de/informatik/RA/cchp/inde
There are more clustering products out there than you can shake a stick at, and everybody seems to have a different take on what they mean by a cluster.
Does anyone have any information on what the Linux Cluster Cabal are up to?
Probably the best thought out cluster solutions are OpenVMS Clusters and UnixWare NonStop Clusters.
Zootlewurdle
Share and Enjoy
If you're looking to cluster for high performance, you need to decide which HPC paradigm you're going to go with and choose your clustering based on that.
Distributed shared memory (DSM) is great on the programmer side. You've got a big steaming chunk of memory shared among processors, and a bunch of parallel threads/processes (depending on OS) acting on that chunk. DSM makes a lot of sense for database servers, and is the prevalent HPC solution among the big server companies (Sun, SGI, etc.) since multithreaded code runs dandy without any modifications. MOSIX implements DSM.
The downside: vast memory bandwidth required for sharing and high overhead. In an educational environment (and IMHO), DSM is a Very Bad Thing, since programming DSM teaches you nothing about actually using parallelism--it's just like working in any other multithreaded environment.
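To make the "just like any other multithreaded environment" point concrete, here is a minimal sketch of the shared-memory style. It is only an illustration: plain Python threads on one machine stand in for processors sharing memory, and the function name shared_memory_sum and its parameters are made up for this example, not part of any DSM product.

```python
import threading


def shared_memory_sum(data, n_workers=4):
    """Sum `data` with workers that all update one shared total.

    Hypothetical helper: threads on a single machine stand in for
    the processors of a shared-memory system.
    """
    total = [0]              # shared state that every worker can see
    lock = threading.Lock()  # protects the shared total from races

    def worker(start, stop):
        partial = sum(data[start:stop])
        with lock:           # the only coordination the programmer writes
            total[0] += partial

    # Split the index range into roughly equal slices, one per worker.
    chunk = (len(data) + n_workers - 1) // n_workers
    threads = [
        threading.Thread(
            target=worker,
            args=(i * chunk, min((i + 1) * chunk, len(data))),
        )
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Nothing in the code says where the data lives or how it reaches a worker. That is the convenience of the model, and also why it hides the cost of the sharing.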
Students are better served by learning on a message-passing system, which is what Beowulf clusters are. You have a bunch of computers and a way to make them talk to one another (PVM or MPI)--"now implement some algorithms!" MP machines are [given equal-quality implementations, a big given] generally faster and more scalable than DSM machines, as well as being more "pure". Optimizing DSM programs is much easier if you have MP experience.
Downside: MP is a pain to program and even more of a pain to debug. But students could use more suffering, right? Language support is a little iffier for MP, too, with Fortran and C being prevalent.
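For contrast, here is a rough sketch of the message-passing style. It does not use PVM or MPI: to stay self-contained it uses Python threads with queues as stand-ins for nodes and the network, and the names message_passing_sum, inboxes and results are invented for this example. The point is only that every piece of data a worker sees has to be explicitly sent to it, and every result explicitly sent back.

```python
import queue
import threading


def message_passing_sum(data, n_workers=4):
    """Sum `data` by sending each worker its own chunk as a message.

    Hypothetical helper: queues stand in for the send/receive calls
    of a real message-passing library.
    """
    inboxes = [queue.Queue() for _ in range(n_workers)]  # one inbox per worker
    results = queue.Queue()                              # replies to the master

    def worker(rank):
        chunk = inboxes[rank].get()      # "receive" work from the master
        results.put((rank, sum(chunk)))  # "send" the partial sum back

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()

    # The master decides how the data is split and ships out copies.
    size = (len(data) + n_workers - 1) // n_workers
    for rank in range(n_workers):
        inboxes[rank].put(list(data[rank * size:(rank + 1) * size]))

    # Collect exactly one reply per worker, in whatever order they arrive.
    total = 0
    for _ in range(n_workers):
        _rank, partial = results.get()
        total += partial

    for t in threads:
        t.join()
    return total
```

The decomposition and the communication are spelled out by hand here, which is the extra pain, but it is also what carries over when the workers really are separate machines.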