What's The Best Linux Distribution For Clustering?
syn1 asks: "There has been a proliferation of Linux distros over the last couple years. Many are specialized for specific tasks or needs. In terms of Beowulf Clusters, there are a growing number of distros specialized for these clusters. Although the old favorite among specialized Beowulf distros is Extreme Linux, other distros such as Scyld Linux and Scali Linux are catching up in terms of user share. Additionally, more people are using conventional distros (Red Hat, Debian, Mandrake, SuSE, etc.) and adding Beowulf support. I am just wondering what fellow Slashdotters think about these various distros when it comes to Beowulf Clusters and which ones they think are best."
What about Mosix?
Everything is but a number spoken by itself.
Debian it is.
O this learning! What a thing it is - William Shakespeare
SUWAIN: Slashdot User Without An Interesting Name
You should have read the all-encompassing Linux-HOWTO!
Or better yet, the more specific, completely non-generic Beowulf-HOWTO!
Everyone knows that.
-------
CAIMLAS
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
one of the more clever tricks to lure the slashdot editors into posting yet another "my distro is beeger than yours" holy war.
(note: looking at how obviously the submitter tries to start a distro-war, it doesn't take too much cleverness to lure the editors)
Wanna bet there will not be any useful discussion in this thread?
I really wish Slashdot would start moderation on articles too, then this would be dismissed as Flamebait fast enough.
<grub> Reading
It's a second generation Beowulf, with some
very interesting features (see below). You can
download it for free or purchase it for cheap
(see link at http://www.scyld.com/ )
http://www.scyld.com/clustering_overview.html
[...]
Scyld Beowulf installation is easy. It's like loading Linux onto a single PC.
The Scyld Beowulf software provides the capability to start, observe, and control
processes on cluster nodes from the cluster's front-end computer.
Scyld Beowulf's cluster process control, BProc, decreases time to start processes
remotely. With process migration times of ten milliseconds, BProc provides an
order of magnitude improvement over other job spawning methods. Additionally
BProc provides insight into job and cluster performance.
Scyld Beowulf features Large File Summit (LFS) support via Scyld's Linux kernel
updates and GNU C library which support 64 bit file access on the ext2 filesystem.
Scyld Beowulf also includes utilities modified to take advantage of this (basic text
utilities, scp, ftp client and server).
Scyld Beowulf includes GUI-based cluster node configuration, control and status
tools.
Scyld Beowulf ships with a customized version of the popular MPICH message
passing library. This version is modified to take advantage of the unique process
creation and management facilities provided by BProc which makes running MPI
applications easier than before.
Scyld Beowulf includes MPI-enabled linear algebra libraries and Beowulf
application examples.
-- Eugen* Leitl leitl ICBM: 48.07100, 11.36820 http://molecu
There are two basic types of clustering:
/is/ a valid answer, but it simplifies more than it educates.
1) Process clustering - This is Beowulf; it is designed to rip every last shred of CPU time out of boxen. It is a VERY custom, machine-dependent thing. A good B-cluster will be so hand-tweaked as to be almost unrecognizable as whatever distro it started from.
2) Server clustering - this is failover stuff, and distros can do this much better. Most people call it something like High Availability. But you are still likely to tweak it.
This is not a very good question, because clusters tend to be so custom. It's like asking: "What's the best frame to base a kit car on?" There
-- Crutcher --
#include <disclaimer.h>
I've done this myself, and without starting a flame war, I've found that the easiest setup was achieved using RedHat. Their piranha tools make things easier, and since the servers came with RedHat, I didn't have to waste too much time, nor did I have to drop a couple thousand dollars on their cluster distro; it all comes in the general distribution. During research for this project I read quite a bit about the TurboLinux distribution. The internals aren't much more than LVS, but the price tag scares you away (not that you couldn't do it with a stock TL and LVS, but to use their special distro it costs ... just like RedHat's. You're not really paying for the software, but rather the tech support). Whatever you decide, keep in mind a few things:
1. Any distro can do it.
2. When you get the cluster up, do what you can to keep the distro/OS in the cluster the same. You'll save yourself a good bit of headaches in administration and make using the weighted algorithms a reality (ex: NT won't respond to the uptime or ruptime polling requests, so you're stuck with the static weight that you assigned; read the HOWTO for more).
3. If you are using LVS, use direct routing. It's fast.
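To make point 2 a bit more concrete, here is a rough Python sketch of the idea behind load-based weights. It is not how LVS or piranha actually compute anything: the formula, the `scale` parameter, the function names, and the host names are all made up for illustration. The only thing taken from the comment above is that a box which doesn't answer the load poll (the NT case) stays stuck on its static weight.

```python
def weights_from_load(loads, static_weight=1, scale=10):
    """Turn polled load averages into integer weights (hypothetical formula).

    loads maps host name -> load average, or None if the host did not
    answer the poll. Silent hosts keep the static weight they were assigned.
    """
    weights = {}
    for host, load in loads.items():
        if load is None:
            weights[host] = static_weight
        else:
            # Lower load gives a higher weight; never drop below 1.
            weights[host] = max(1, round(scale / (1.0 + load)))
    return weights


def weighted_schedule(weights):
    """Expand weights into one round of a naive weighted round-robin."""
    order = []
    for host, weight in sorted(weights.items()):
        order.extend([host] * weight)
    return order


# Illustrative poll results: two Linux boxes that answer, one box that doesn't.
loads = {"web1": 0.0, "web2": 1.0, "nt1": None}
weights = weights_from_load(loads)
schedule = weighted_schedule(weights)
```

With these made-up numbers the idle box gets the most connections per round, the busier one gets fewer, and the silent one only ever gets its static share, which is why a mixed cluster makes the dynamic weighting less useful.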
What is all this Beowulf crap? For highly-available systems, clustering usually means server fail-over. It means an active-standby configuration with a shared disk. If the active server dies, the standby mounts the disk, starts up the app, and carries on.
For examples of shrink-wrapped versions, see Sun Cluster, Veritas Cluster Server, and a Linux based one, Turbo Linux Cluster Server.
A lot of services have to be active-standby; only one server can be doing the job at a time. Any database falls into this category, including SQL-based, LDAP, and mail stores. This is where the above products would get used. For services that can be active-active, like web servers, DNS, mail relays, some form of load balancing is better and cheaper.
There are distributed databases on the horizon, but few of them are ready for primetime. These would feel more like a Beowulf cluster.
I'm not trying to tell you that calling Beowulf a cluster is wrong, but limiting clustering to just Beowulf is.
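The active-standby takeover described above can be sketched in a few lines of Python. This is only a toy model of the decision logic, assuming a heartbeat with a fixed timeout; the class name, the timeout value, and the `take_over` body are hypothetical, and none of it reflects how Sun Cluster, Veritas Cluster Server, or TurboLinux Cluster Server are implemented.

```python
class StandbyMonitor:
    """Toy standby node: takes over if the active node's heartbeat goes quiet."""

    def __init__(self, timeout):
        self.timeout = timeout    # seconds of silence tolerated (assumed value)
        self.last_beat = None     # time of the last heartbeat seen
        self.active = False       # True once this node has taken over

    def heartbeat(self, now):
        """Record a heartbeat received from the active server."""
        self.last_beat = now

    def check(self, now):
        """Take over if the active server has been silent for too long."""
        silent_too_long = (
            self.last_beat is not None
            and now - self.last_beat > self.timeout
        )
        if not self.active and silent_too_long:
            self.take_over()
        return self.active

    def take_over(self):
        # In a real cluster this is where the standby would mount the
        # shared disk and start the application. Here we only flip a flag.
        self.active = True


monitor = StandbyMonitor(timeout=3)
monitor.heartbeat(now=0)
still_standby = monitor.check(now=2)   # heartbeat is recent enough
took_over = monitor.check(now=4)       # more than 3 seconds of silence
```

Timestamps are passed in explicitly so the logic can be followed without waiting on a real clock. A real setup also has to make sure the old active server is truly dead before the standby mounts the shared disk, which this sketch ignores.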
Carpe Diem = Seize the day
;P
Caveat Emptor = Buyer Beware
It may look like I'm doing nothing, but I'm actively waiting for my problems to go away.
--Scott Adams
What are your goals, how many concurrent jobs will you be running (and with what priorities), and do you know where the bottlenecks reside?
Clustering, and high-performance computing in general, encompasses a huge number of problems and solutions. There are literally gobs of different routes one could take. Beowulf and Benchmarks, while easy to remember and look at, are not the solution to everything. Perhaps you need the vector performance of a Cray or maybe the cache-coherent shared-memory system of a Data General AViiON or Silicon Graphics Origin. It all depends on your needs. Do the research before assuming you need one exact solution.
FWIW, you may want to look at SGI's Advanced Clustering Environment for an all-inclusive, free, open-source solution. It's available for both SGI MIPS IRIX and IA-32/Intel Linux and works quite well with SGI's great Performance Copilot analysis software. They also know a thing or two about high performance computing. If you need more power you can build a warehouse of Linux boxes or buy a 512-processor Origin 3000 (w/ 1TB RAM and 714 GByte/sec bandwidth)... or a cluster of those!
My $0.02
Hvem skal i kloster?
I think the last version of Extreme Linux (searches for his Extreme Linux CD) is based on RedHat Linux 5.0. It's a little out of date now; the code has moved on considerably.
For you I would like to recommend some reading:
Building Linux Clusters by David HM Spector published by O'Reilly, (hmmm site seems to be down, come back later, or check Google cached version)
This book comes with a CD together with clustering software. It also comes with step-by-step instructions. I believe, however, that there are some errata, which means that some hacking will need to be done to get your cluster online.
It also goes through some aspects of choosing hardware etc...
A more in-depth resource, without step-by-step instructions, but with in-depth discussions on granularity of Beowulf systems and whether they are actually good for the tasks you have in hand is:
How to Build a Beowulf, A guide to the implementation and application of PC Clusters by the MIT Press
Also check the Beowulf Project Site and the Beowulf Underground Site
Have fun!
---
I really wish Slashdot would start moderation on articles too
You really wish you were looking at Kuro5hin. All logged-in users are always moderators at all times, and all logged-in users can vote +1 or -1 (remind you of [e2]?) on story submissions in the public queue.
Will I retire or break 10K?
You'll find it hard to find anyone who doesn't reply to that question (best Linux for clustering) with their current flavour of the month home Linux. We're too damn partisan.
I've used 4 distributions in the last 2 years. I only use Linux; at the moment, however, I rank them by which is least sucky.
My intention is to have my N machines properly clustered, so I read this thread with excitement. However, if the conclusions are XxxXxx or XxXX, then I'll give up right away.
FatPhil
(Xxxxxx is least sucky presently)
Also FatPhil on SoylentNews, id 863
TurboLinux Cluster Server provides High Availability functions that boost uptime for services such as Web serving, mail hosting, news, and FTP. TurboLinux also has a high-performance clustering product called EnFuzion.
Red Hat provides a package called High Availability Server that includes load balancing, fault tolerance, and improved scalability for IP-based applications.
--Loge
Suse 7.0 will soon be available for sparcs if it is not already. Suse comes with beowulf and pvmake. I cannot comment on how good it will be. At the moment I'd stick with Redhat 6.1 and install the clustering rpms from srpms. See if they build.
--- Justin Dearing http://www.justaprogrammer.net/ We're just programmers.
Actually, I think he was drunk at the time ;)
A deep unwavering belief is a sure sign you're missing something...
Apparently DIPC (Distributed IPC) can run with MOSIX, although DIPC a few months ago did not optimize migrated processes. It could work, but works better when DIPC realizes that processes are able to run on other systems.
The question was about which distros support linux clustering, and which people thought were good or not. It wasn't a call for a distro holy war, it wasn't a question about "How do I make a linux cluster?" or "What software packages are out there to cluster linux boxes?"
Personally, I find that while Red Hat is not my favorite of the linux distros, Red Hat offers Red Hat Professional Services, and this is a very nice thing for management, if the cluster in question is going to be in a production environment at a company or business somewhere. If it's for your home use, do what you like, but most PHBs tend to take extreme comfort in the fact that if something linux related breaks, they can call Red Hat if the cluster admin on-site can't fix it, and Red Hat will either try to help on the phone, or you can pay for RH Prof. Services to come out to your site and take a look.
Didn't they remove OpenGL support in BeOS 5? I'm sure you can put it back, but still.... Be is very professional, I suppose, and refuses to put something in a distribution before it's completely polished.
I can't really think of any others, unless you're a crazy mac user. But don't listen to me; I'm just a crazy mac user.
--
Lagos
Clustering takes on many forms. I would suggest Debian for distributed processing environments because of its stability. But for HA clustering, it is really up to you. Figure out which distro *YOU* are most comfortable installing, then check out http://oss.missioncriticallinux.com. Their Kimberlite cluster will run on *ANY* distro.
Well, SuSE and SGI are porting FailSafe for the linux-ha.org High Availability project, and SGI supposedly has much experience with that package. I'd look more at RedHat if their installation process didn't suck the big snarfborg.
(as the SuSE liker nevertheless ends up developing for RH..)
But missioncriticallinux.com's Convolo says "any deestro".
But for actual stability like trying to get the job done? The last two VA Linux boxes I bought had RedHat on them already, and hardware cost is a pretty big factor. Or did you want to start repartitioning that 50GB RAID array? Is there such a big difference between deestros after you shut everything down? How about which HA distros not which Linux distro?
Someone's going to say BSD or die, etc etc. I'd much rather see people with actual experience responding and backing up what they say, and hear people with experience using the HA tools.
Better yet screw the distro idea, someone just post a list of tools they like and ideas about compiling, resource management, and security.
... your choice should also depend on the hardware and the amount of time you want to spend tweaking the config. The Beowulf I help admin has bleeding-edge hardware that requires proprietary (closed-source, commercial) drivers that are usually packaged for RedHat, even tested against RH-specific kernels. Yes, I could probably take the RPM apart and install them on another distro, but then I couldn't really use the OEM's support as they would come back with 'we don't support that'.
So, in *practice* your best choices would be, in my experience, RedHat for Beowulf-type clustering (process distribution) and TurboLinux for high-availability clustering (fail-overs)...
Debian has all the beowulf stuff you need prepackaged, like MPI and I think it has some batch programs. Just makes it easier to maintain if you ask me.
# debian/rules
Blender can't cluster! Check the official docs!
;)
Fortunately for SkyWriter, MOSIX isn't "clustering", since it turns a cluster of machines essentially into a single very large SMP machine. All the program needs to know is how to thread or fork itself to use multiple processors. At least that's the theory. Never used it myself.
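For what "fork itself" looks like in practice, here is a minimal Python sketch, assuming a Unix box where `os.fork` is available. Nothing in it is MOSIX-specific: the code only forks ordinary processes, and whether a system like MOSIX would then move them to other nodes is the claim made above, not something this sketch tests. The function name and the work it does (summing chunks of numbers) are invented for illustration.

```python
import os


def parallel_sum(chunks):
    """Fork one child per chunk; each child sums its chunk and
    reports the result back to the parent through a pipe."""
    children = []
    for chunk in chunks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: do the work, write the result, exit without cleanup.
            os.close(read_fd)
            os.write(write_fd, str(sum(chunk)).encode())
            os.close(write_fd)
            os._exit(0)
        # Parent: keep only the read end and move on to the next chunk.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())
        os.waitpid(pid, 0)   # reap the child so no zombies are left
    return total


result = parallel_sum([[1, 2, 3], [4, 5], [6]])
```

The program has no idea where its children run; it just creates processes and collects answers, which is the whole appeal of the single-big-SMP-machine model.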
God Fucking Damnit
Is there any clustering technology available for *BSD? I mean, for someone who has assorted old hardware with many different processors and platforms, NetBSD seems ideal. And for a group of Intel boxes, FreeBSD has always been choice. (sorry:it's just more stable & higher performance than most linux distros--please don't take this as flame-bait, it's just my opinion; what i'm really looking for here is BSD alternatives to Beowulf clustering). And of course, anything that runs on OpenBSD is just kick-ass, although if you're running a cluster that's not connected to the Internet, securing everything is not really necessary, performance is more key. but i guess this might be a case of "just because i can."
t slash sites:infantililsm.org
------------------------------------------
bes
I have found a minimalist Slackware good for running the Linux-HA (heartbeat) software, with a bit of tweaking.
This may be out of your price range, but Sandia Labs has a nice little machine they call the Cplant (Computational Plant). It's a cluster of about 500 linux boxes with supercomputer power. It's ranked 84th in the list of TOP500 Supercomputers in the world as of November 3rd.
..a Beowulf cluster of the Best Linux Distribution for Clustering???
/.ers discussing the best Linux distro for a 420 node cluster to generate the worlds largest and most detailed ascii penis bird for use in sigs. Or viewing fake Natalie Portman porn.
OR a Beowulf cluster of
.sig wanted: Must be concise, funny, and display my cleverness.
Beowulf means parallel processing and distributed queueing mainly. Not at all HA, which is more suited to "new-economy" business types anyway. :)
For a real beo, go for Debian.
Those so-called "beowulf specific" distros just won't cut it.
Thanks,
--exa--
Sure, clustering ain't just Beowulf, and even then Beowulf is not the only High Performance solution.
But even in High Performance solutions, availability and scalability are things that are not to be forgotten. (Unless you don't mind your High Performance cluster crashing every week or so due to hard disk failures and overheating Pentium chips.)
To come back to the submission question: in my opinion, a distribution for a Beowulf cluster should have:
- a means to automate installation completely. If you miss this, it would take you a lot of time to install a new machine each time one of the machines in your (500 node) cluster crashes.
- an easy way to update your nodes (for the same reason)
- a clear and understandable filesystem layout that also protects your nodes from the clustered processes (you wouldn't want your /var to be filled up any time a job runs out of its limits)
- hardware support for the devices of your choice (which include hardware raid mirrored disks, gigabit ethernet cards, fast io devices)
- a good 'out of the box' security policy
Now I don't know what distro has these features, but I might have to know sometime soon, as I try to find alternatives to the really expensive O2000 10proc r10000 HPC cluster and the equally expensive SUN E3500 / A3500 HA cluster we use at my job.
Besides that, I think the points noted above are also true for a distribution that just has to provide a platform for a serious business server.
As far as I'm concerned, there are some key functions I miss in scalability for the linux solution (or they exist and I just haven't looked hard enough):
- a filesystem that is journaled, live-growable, has excellent performance at sizes over 1 terabyte, and can be attached to two or more machines, enabling failover (like Veritas VxFS) (could coda help?)
- an architecture that has the capability of real number crunching (like the O2000/O3000) while maintaining reliability and low prices (maybe Alphas are a solution here, or I just have to cool down and settle for 1000MHz Pentium IV machines)
sig not found
I'd be interested in finding out from an experienced Website Admin. just how much extra webserver load (if any) would result from letting all logged in experienced users moderate.
All experienced users can moderate on [Everything 2]. Each user who has [at least 50 XP] (like Karma but you also get one for each write-up) is given 10 to 100 or more points per day with which to vote +1 or -1 on a particular write-up and cannot see other users' write-ups' scores until after voting on them.
Will I retire or break 10K?
Has anyone tried using one of the tiny distros?
Whenever I hear anything like this, I think of the word 'clusterfuck'. I don't know why, but I do :)
Thanks to previous posters we now know the Beowulf flavor of clustering (supercomputing), the load balancing flavor of clustering (mostly web servers), and also the High Availability flavor of clustering (duplicated servers with or without load balancing). There is one area I have not seen addressed, and that is disk clustering, where the disk farm (usually a RAID array) has (hardware) access by two or more machines. Microsoft calls their version MSCS. It is useful for database servers because you do not need two copies of the database. One DB copy on disk can be read/written by more than one machine directly over a SCSI (or fiber) channel. (Disk redundancy is provided by the RAID-5.) Is there such a thing for Linux? (Not RAID-5. It works fine right out of the box. It's the dual porting software I have not seen.)