Building Your First Cluster?

Posted by ryuzaki0 on Wednesday July 26, 2006 @12:50PM from the it's-been-a-while dept.

An anonymous reader asks: "I'm interested in building a DIY cluster using Linux and will be using conventional Linux software. However, the number of possible ways to do this is huge. Aside from Beowulf, there's Mosix, OpenMosix, Kerrighed, Score, OpenSSI and countless others. Therein lies the problem. There are so many ways of clustering, development seems to be in fits and starts, most won't work on recent Linux kernels and there's no obvious way to mix-and-match. What have other people used? How good are the solutions out there?"

Rocks by corvair2k1 · 2006-07-26 13:00 · Score: 5, Informative

Rocks is a great tool to build a cluster. It includes lots of monitoring tools and such, so you can see the status of nodes, etc. However, I'm not quite sure how large you're planning on going... May be overkill for a 4-noder. =)
1. Re:Rocks by kst · 2006-07-26 22:14 · Score: 4, Informative
  
  No, Rocks isn't overkill for a 4-node cluster; I'm running three 4-node Rocks clusters myself.
  
  I'm not familiar with other solutions, but Rocks is remarkably easy to install.
Find the problem before trying to solve it by linuxkrn · 2006-07-26 13:07 · Score: 4, Informative

Sounds like you're trying to solve a problem that you don't have.

Clusters need special software to take advantage of distributed computing. They are built with a specific task in mind. Or do you already have a need and just failed to tell us?

For me, I run my network with distcc (http://distcc.samba.org/), so all of my Gentoo boxes can compile using shared computing power. It cut a typical 33-minute build down to less than 2 minutes, and it works wonders for my slower laptop.

With distcc, all you need is the same toolchain on every box (glibc, gcc, coreutils, etc.). You can even specify how many threads per box you want running to fine-tune your network.
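
A minimal sketch of the per-box limits, assuming distcc's `host/LIMIT` syntax for the `DISTCC_HOSTS` variable. The hostnames and limits here are made up for illustration, and summing the limits to pick a `make -j` value is just one simple heuristic, not a distcc rule.

```python
def distcc_hosts(job_limits):
    """Build a DISTCC_HOSTS value from a {hostname: max_jobs} mapping."""
    return " ".join(f"{host}/{limit}" for host, limit in job_limits.items())


def total_jobs(job_limits):
    """Sum the per-box limits; used here as a rough starting point for make -j."""
    return sum(job_limits.values())


# Hypothetical boxes: a fast desktop, the local machine, and a slow laptop.
limits = {"desktop": 4, "localhost": 2, "laptop": 1}

print(f"export DISTCC_HOSTS='{distcc_hosts(limits)}'")
print(f'make -j{total_jobs(limits)} CC="distcc gcc"')
```

Running it prints two shell lines you could paste into a terminal on the box that starts the build; tune the numbers per machine from there.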

On the other hand, if you just want to learn, then you should try them all. They all suit different needs.
1. Re:Find the problem before trying to solve it by Sithgunner · 2006-07-26 15:15 · Score: 2, Informative
  
  He's apparently not asking for a compile farm...
  
  The author did fail to say what the purpose was, but here are some good starting points.
  
  Apache cluster
  MySQL cluster (should also refer to mysql.com resources)
  Ultra Monkey, heartbeat and the like can make a cluster as well.
LinuxClusters.com book is a good reference by Anonymous Coward · 2006-07-26 13:33 · Score: 2, Informative

You may want to look at this online book (free):

http://linuxclusters.com/compute_clusters.html

At least get to know various approaches at a high level before proceeding...
My Experience by PAPPP · 2006-07-26 13:46 · Score: 3, Informative

I have a stack of five original Pentium boxes with 32 MB of RAM and 2 GB hard drives (except for one, with a larger drive for a software repository). I originally built it to experiment with AFAPI-based clustering, but since AFAPI is a reasonably non-invasive setup, it works well for trying other techniques too, everything from simply running distcc on the nodes to speed up i586 software builds to briefly fiddling about with some of the other clustering options mentioned. Fiddling around with options on a real cluster (running cluster software on a single node really doesn't give a good impression) that can be reinstalled from scratch in a few hours, on machines that aren't worth enough to matter if they get physically damaged, is a great way to learn.
Re:Whatever happened to... by sophanes · 2006-07-26 14:07 · Score: 5, Informative

While a few early parallel computers used hypercube-based interconnects (e.g., CM-1), there hasn't been a lot of interest in hypercubes since then. The advantage of hypercubes is that their diameter only increases logarithmically with the number of nodes in the network. Their disadvantage is that the node degree increases with the dimension, meaning that each node must be configured with sufficient ports to support the maximum dimensionality of the network -- making hypercube-based networks either expensive or non-growable. (This can be solved to an extent by using cube-connected cycles.)
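
To make the trade-off concrete, here is a small sketch using the usual textbook model of a hypercube: nodes are labeled with d-bit integers and two nodes are linked when their labels differ in exactly one bit. The function names are my own, not from any particular library.

```python
def neighbors(node, dim):
    """Nodes whose label differs from `node` in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hops(a, b):
    """Shortest path length between two nodes: the Hamming distance of their labels."""
    return bin(a ^ b).count("1")


for dim in (2, 4, 8, 10):
    nodes = 2 ** dim
    # The hypercube is symmetric, so the farthest node from node 0 gives the diameter.
    diameter = max(hops(0, other) for other in range(nodes))
    ports = len(neighbors(0, dim))
    print(f"{nodes:5d} nodes: {ports} ports per node, diameter {diameter}")
```

Both the port count and the diameter equal the dimension d = log2(nodes): the hop count grows slowly, but every node needs one more port each time the machine doubles in size.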
Re:Whatever happened to... by SirLoadALot · 2006-07-26 14:09 · Score: 3, Informative

No, cluster network links are not nearly as elaborate as that, for the most part. The advent of high-speed crossbar links means that only the largest clusters need to even consider the network structure. You can get a 288-port InfiniBand switch that provides full crossbar connectivity across all data paths. Only if you outgrow the largest available switch do you need to consider how best to link multiple switches together. Generally this would involve minimizing the amount of IO that has to cross between the switches, which isn't always easy, or even possible.
Rocks Rocks! by Frumious Wombat · 2006-07-26 14:11 · Score: 4, Informative

No, seriously, if you're setting up a cluster where your work can be batch-queued, or intend to run MPI, then Rocks http://www.rocksclusters.org/ is the way to go. It also comes with tools such as SGE (Sun Grid Engine) or OpenPBS pre-configured, Intel compilers and libraries ready for you to drop a license onto (but of course the entire GNU suite is there as well, including Ada), has more monitoring tools (plus some nice web-based interfaces) than you can shake a stick at, and runs on IA-32/AMD-64/IA-64 (Itanium). It also has a Roll to help build a tiled display wall, which would be a really cool use of a small cluster.

They're also really great guys.

On the other hand, OSCAR is supposed to be good, and if you're not into the whole batch-mode thing, you can get OpenMosix up and running using ClusterKnoppix (http://clusterknoppix.sw.be//), and just fire jobs off into space and let them find their own unburdened node.

But still, Rocks is really an elegant and clean way to go, plus it will scale up in case you're going to deploy a huge one of these for real after you get your feet wet.

--
the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
java by Anonymous Coward · 2006-07-26 14:32 · Score: 1, Informative

Do a Google search on "Java heterogenous cluster". I have been playing around with some of them. Nice things: you can use any machine, and there's no lengthy build-out process. Downsides: slower than dedicated, not as flexible.
Depends on what you're doing by mrsbrisby · 2006-07-26 14:59 · Score: 3, Informative

Clustering means too many different things these days. I operate several clusters, but they're all so different that I can't say any two of them are the same.

I run ClamAV and SpamAssassin (both very slow programs) with cexec, which simply lets me farm regular unix tools across multiple (lots of) CPU servers. This lets me replace the clamscan and spamc programs with "wrappers" that use my farm. I like cexec because it doesn't make me create lists of clients and servers, but automatically load balances and fails over very nicely.

For my frontend web servers, I use fake/heartbeat and some custom proxy software for routing frontend requests to backend farms.

I haven't found a really reliable replicated directory; with one, I could use cexec as a filesystem... Maybe some day...
Diskless OpenMosix by rwa2 · 2006-07-26 15:02 · Score: 3, Informative

I worked for a Linux supercomputing startup way back when. The easiest time I had was by separating the components: one big machine for storage, and lots of little diskless machines for computation.

I'm a Debian fan, so that involves just setting up one large computer (or two for redundancy, using linux-ha) with a good RAID as a shared home directory. Then just install the "diskless" package. This will allow you to spawn off as many hosts that mount root off of NFS as you like. All you have to do is get the compute nodes to boot a kernel that supports NFS root (I used a single floppy, but you can do boot CDs or net-boot if you're more sophisticated).
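
For the NFS-root boot step, the kernel is told where its root filesystem lives through boot parameters. A rough sketch, assuming the standard `root=/dev/nfs`, `nfsroot=` and `ip=` kernel parameters; the server address and export paths below are hypothetical, and the directory layout the diskless package actually creates may well differ.

```python
def nfsroot_cmdline(server_ip, export_path):
    """Kernel command line for a node that mounts its root filesystem over NFS.

    ip=dhcp asks the kernel to configure the network itself before mounting root.
    """
    return f"root=/dev/nfs nfsroot={server_ip}:{export_path} ip=dhcp"


# Hypothetical storage server and per-node root directories.
SERVER = "192.168.1.1"
for node in ("node01", "node02", "node03"):
    print(node, "->", nfsroot_cmdline(SERVER, f"/srv/nodes/{node}"))
```

Whatever boot medium you use (floppy, boot CD, net-boot), this is the string its boot loader needs to pass to the kernel.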

I used a Mosix kernel at the time, but I suppose OpenMosix is a better bet today. Mosix pretty much makes the entire system look like a massive SMP, so you just launch a whole lot of batch scripts on one computer, and it automatically distributes the load out to idle machines, and ships the results back to the one you started on. You just choose a node to become the master diskless-image, and then use the diskless scripts to clone it as many times as you like.

The compute nodes could have local drives, but I just used them for swap and maybe local /tmp. It's damn convenient to be able to configure all the nodes from one place whether they are online or not.

The other nice thing about OpenMosix is that it's architecture agnostic, so you could conceivably join and remove nodes that were all different speeds, AMDs or Intels or maybe even 64-bit platforms in any combination. The faster processors would get the heaviest loads first, etc.

After you have this system up and running, you might start playing with more sophisticated stuff, like parallel distributed global filesystems and the like. But before that you could certainly make your NFS root server scale up by splitting it up across multiple machines (for /home, /usr, etc.). You'd be amazed at the performance you can get with a well-tuned NFS share... since one server can cache most of the disk access, it can even dish out files from one big high-performance RAID faster than if you had bothered to give all the nodes their own drive or two.

Anyway, it's the systems management that will get you... so I recommend using Debian, getting real cozy with aptitude, and searching the apt repository for all of the little command and monitoring thingies that will help you, like clusterssh and cfengine and nagios2 and stuff.

Burning a bunch of ClusterKnoppix CDs will pretty much get you on track with most of this, I'd imagine. Also check out the "KNOPPIX Remastering" howto so you can customize your own live CDs, should you choose that path instead of diskless nfsroot.

So that's the software approach; the hard part is really selecting, testing, and optimizing all of the hardware. The slowest component is always going to be storage (invest in lots of separate SATA cards using the Linux software RAID5 or RAID10 - reconfigure and test lots with hdparm -t and bonnie++ and format reiserfs), followed by network (gigabit NICs are cheap - you could afford separate ones for the NFS and the "real" network, though gigabit switches are still up there - Linksys and D-Link make some good 16-port ones for ~$300).

Um, if you're looking for parallel applications, povray is fun. And for a time we could sort of measure how many nodes I had up and running by monitoring my stats at distributed.net. But with OpenMosix, just about anything with lots of CPU-intensive parallel batch processing is fair game and works out of the box.

Have fun!
OSCAR by Odocoileus · 2006-07-26 15:18 · Score: 3, Informative

Last year I built a cluster using OSCAR (http://oscar.openclustergroup.org/).
I haven't tried any others, but OSCAR installs pretty easily. Just run the installer on the head node, and when it is done it feeds an image to each of the other computers that are part of the cluster. It includes the Ganglia monitoring tools and the Apache server so you can view them.
Re:Whatever happened to... by BigFootApe · 2006-07-26 16:17 · Score: 2, Informative

Some high-speed interconnects like SCI and Dolphin are designed to be deployed in ring based structures (hypercube is based on a bus). The multidimensional analog to the ring is the hypertorus, and many clusters based on SCI and Dolphin use a hypertorus topology.

For instance:
http://krone.physik.unizh.ch/~stadel/zbox/start
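
A quick sketch of what the torus wiring looks like, using the generic model where each node sits at a grid coordinate and every axis wraps around like a ring. This is only the abstract topology, not how any particular SCI installation is configured, and it assumes every axis has at least 3 nodes so the two neighbors along an axis are distinct.

```python
def torus_neighbors(coord, shape):
    """Neighbors of `coord` in a torus with shape[i] nodes along axis i."""
    result = []
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            nxt = list(coord)
            # Modulo makes the last node on an axis link back to the first (the ring).
            nxt[axis] = (coord[axis] + step) % size
            result.append(tuple(nxt))
    return result


# A corner node of a hypothetical 4x4x4 (64-node) 3D torus.
for neighbor in torus_neighbors((0, 0, 0), (4, 4, 4)):
    print(neighbor)
```

Each node gets two links per dimension no matter how long the rings are, so the port count per node stays fixed as the machine grows along any axis.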
warewulf-cluster by Imp- · 2006-07-26 18:08 · Score: 2, Informative

I have used http://www.warewulf-cluster.org/ for my Opteron cluster. Works with new kernels and with many different distros. Seems to be under good development.