Slashdot Mirror


What's The Best Linux Distribution For Clustering?

syn1 asks: "There has been a proliferation of Linux distros over the last couple of years. Many are specialized for specific tasks or needs. In terms of Beowulf clusters, there are a growing number of distros specialized for these clusters. Although the old favorite among specialized Beowulf distros is Extreme Linux, other distros such as Scyld Linux and Scali Linux are catching up in terms of user share. Additionally, more people are using conventional distros (Red Hat, Debian, Mandrake, SuSE, etc.) and adding Beowulf support. I am just wondering what fellow Slashdotters think about these various distros when it comes to Beowulf clusters and which ones they think are best."

19 of 57 comments

  1. Obviously! by CAIMLAS · · Score: 2
    People should do a little more reading before they post things like this to slashdot! *huff*

    You should have read the all-encompassing Linux-HOWTO!

    Or better yet, the more specific, completely non-generic Beowulf-HOWTO!

    Everyone knows that.

    -------
    CAIMLAS

    --
    ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
  2. This is indeed... by Leto2 · · Score: 2

    one of the more clever tricks to lure the slashdot editors into posting yet another "my distro is beeger than yours" holy war.

    (note: given how obviously the submitter is trying to start a distro war, it doesn't take too much cleverness to lure the editors)

    Wanna bet there will not be any useful discussion in this thread?

    I really wish Slashdot would start moderation on articles too, then this would be dismissed as Flamebait fast enough.

    --
    <grub> Reading /. at -1 is like driving through Cracktown in a convertible that is stuck in 1st
  3. the best Beowulf distro? Scyld, of course. by eleitl · · Score: 3


    It's a second generation Beowulf, with some
    very interesting features (see below). You can
    download it for free or purchase it for cheap
    (see link at http://www.scyld.com/ )

    http://www.scyld.com/clustering_overview.html

    [...]
    Scyld Beowulf installation is easy. It's like loading Linux onto a single PC.

    The Scyld Beowulf software provides the capability to start, observe, and control
    processes on cluster nodes from the cluster's front-end computer.

    Scyld Beowulf's cluster process control, BProc, decreases time to start processes
    remotely. With process migration times of ten milliseconds, BProc provides an
    order of magnitude improvement over other job spawning methods. Additionally
    BProc provides insight into job and cluster performance.

    Scyld Beowulf features Large File Summit (LFS) support via Scyld's Linux kernel
    updates and GNU C library which support 64 bit file access on the ext2 filesystem.
    Scyld Beowulf also includes utilities modified to take advantage of this (basic text
    utilities, scp, ftp client and server).

    Scyld Beowulf includes GUI-based cluster node configuration, control and status
    tools.

    Scyld Beowulf ships with a customized version of the popular MPICH message
    passing library. This version is modified to take advantage of the unique process
    creation and management facilities provided by BProc which makes running MPI
    applications easier than before.

    Scyld Beowulf includes MPI-enabled linear algebra libraries and Beowulf
    application examples.

    --
    -- Eugen* Leitl leitl ICBM: 48.07100, 11.36820 http://molecu
    1. Re:the best Beowulf distro? Scyld, of course. by Coolfish · · Score: 3

      Wow, imagine a Beowulf cluster of these...

      oh wait.

  4. Clustering is way custom by Crutcher · · Score: 4

    There are two basic types of clustering:

    1) Process clustering - This is Beowulf. It is designed to rip every last shred of CPU time out of boxen. It is a VERY custom, machine-dependent thing. A good B-cluster will be so hand-tweaked as to be almost unrecognizable as whatever distro it started from.

    2) Server clustering - This is failover stuff, and distros can do this much better. Most people call it something like High Availability. But you are still likely to tweak it.

    This is not a very good question, because clusters tend to be so custom. It's like asking: "What's the best frame to base a kit car on?" There /is/ a valid answer, but it simplifies more than it educates.

    -- Crutcher --
    #include <disclaimer.h>

    1. Re:Clustering is way custom by Bender+Unit+22 · · Score: 2

      2) Server clustering - this is failover stuff, and distros can do this much better. Most people call it something like High Availability. But you are still likely to tweak it.
      I've never really seen number 2 as a cluster solution. But wasn't that the first NT "cluster"? I think it was because MS called their failover system a cluster at that time that people started calling a failover system a cluster. I might be wrong, but it's just my impression. Oh well, never mind.
      --------

  5. High Availability Clustering. by trexl · · Score: 2
    I won't touch on PVM or MPI clustering, but as far as High Availability clustering goes, most of the distributions will use some form of LVS. Since it uses nice command-line utilities, you can write your own scripts, or you could use the GUI they offer as well. Slap that software on any distro (make sure that the kernel's patched right) and you're ready to go.

    I've done this myself, and without starting a flame war, I've found that the easiest setup was achieved using RedHat. Their piranha tools make things easier, and since the servers came with RedHat, I didn't have to waste too much time, nor did I have to drop a couple thousand dollars for their cluster distro; it all comes in the general distribution. During research for this project I read quite a bit about the TurboLinux distribution. The internals aren't much more than LVS, but the price tag scares you away (not that you couldn't do it with a stock TL and LVS, but to use their special distro it costs ... just like RedHat's. You're not really paying for the software, but rather the tech support). Whatever you decide, keep in mind a few things:
    1. Any distro can do it.
    2. When you get the cluster up, do what you can to keep the distro/OS in the cluster the same. You'll save yourself a good many headaches in administration and make using the weighted algorithms a reality (ex: NT won't respond to the uptime or ruptime polling requests, so you're stuck with the static weight that you assigned; read the HOWTO for more).
    3. If you are using lvs, use direct routing. It's fast.
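
    To make the point about weighted algorithms concrete, here is a minimal Python sketch of weighted round-robin scheduling. It is only an illustration, not the scheduler LVS actually uses: the "smooth" weighted round-robin variant, the server names, and the load-to-weight formula are all assumptions made for the example. A host that never answers load polling simply keeps its static weight.

```python
from collections import Counter


def effective_weight(static_weight, load_avg=None, max_weight=10):
    """Pick a weight for one real server (hypothetical formula).

    If the host did not answer load polling (load_avg is None), fall back
    to the statically assigned weight. Otherwise a lightly loaded host
    gets a higher weight than a busy one.
    """
    if load_avg is None:
        return static_weight
    return max(1, round(max_weight / (1.0 + load_avg)))


def weighted_round_robin(weights, n):
    """Return n picks, each server chosen in proportion to its weight.

    This is the "smooth" weighted round-robin variant: every round each
    server's running score grows by its weight, the highest score wins,
    and the winner's score is reduced by the sum of all weights.
    """
    total = sum(weights.values())
    score = {name: 0 for name in weights}
    picks = []
    for _ in range(n):
        for name, weight in weights.items():
            score[name] += weight
        best = max(score, key=score.get)
        score[best] -= total
        picks.append(best)
    return picks


# Hypothetical pool: web1 should receive three requests for each one sent to web2.
weights = {"web1": 3, "web2": 1}
picks = weighted_round_robin(weights, 8)
counts = Counter(picks)
print(picks)
print(counts)
```

    With a mixed pool, the boxes that cannot report their load stay at whatever weight you typed in, which is why a uniform distro/OS makes the dynamic weighting actually useful.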

  6. Clustering ain't just Beowulf by garver · · Score: 2

    What is all this Beowulf crap? For highly-available systems, clustering usually means server fail-over. It means an active-standby configuration with a shared disk. If the active server dies, the standby mounts the disk, starts up the app, and carries on.
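
    As a rough illustration of that active-standby sequence, here is a small Python simulation. Everything in it is hypothetical (the `Node` class, the heartbeat threshold of 3, the node names); real products also have to fence the failed node and deal with split-brain, which this sketch only notes in a comment.

```python
class Node:
    """One server in a hypothetical two-node active-standby pair."""

    def __init__(self, name):
        self.name = name
        self.disk_mounted = False
        self.app_running = False

    def start_service(self):
        # Mount the shared disk first, then start the application on top of it.
        self.disk_mounted = True
        self.app_running = True

    def stop_service(self):
        # Stop the application before releasing the shared disk.
        self.app_running = False
        self.disk_mounted = False


def check_failover(active, standby, missed_heartbeats, threshold=3):
    """Return the node that should be serving after this heartbeat check."""
    if missed_heartbeats < threshold:
        return active
    # Assumption: the failed node is fenced so it cannot keep writing to the disk.
    active.stop_service()
    standby.start_service()
    return standby


primary, backup = Node("primary"), Node("backup")
primary.start_service()

serving = check_failover(primary, backup, missed_heartbeats=1)
print(serving.name)  # one missed heartbeat is below the threshold, so no failover

serving = check_failover(primary, backup, missed_heartbeats=3)
print(serving.name)  # threshold reached, so the standby takes over
```

    The key property is that only one node ever has the shared disk mounted at a time.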

    For examples of shrink-wrapped versions, see Sun Cluster, Veritas Cluster Server, and a Linux-based one, Turbo Linux Cluster Server.

    A lot of services have to be active-standby; only one server can be doing the job at a time. Any database falls into this category, including SQL-based, LDAP, and mail stores. This is where the above products would get used. For services that can be active-active, like web servers, DNS, mail relays, some form of load balancing is better and cheaper.

    There are distributed databases on the horizon, but few of them are ready for primetime. These would feel more like a Beowulf cluster.

    I'm not trying to tell you that calling Beowulf a cluster is wrong, but limiting clustering to just Beowulf is.

  7. Better Questions? ACE, others by green+pizza · · Score: 2

    What are your goals, how many concurrent jobs will you be running (and with what priorities), and do you know where the bottlenecks reside?

    Clustering, and high-performance computing in general, encompasses a huge number of problems and solutions. There are literally gobs of different routes one could take. Beowulf and benchmarks, while easy to remember and look at, are not the solution to everything. Perhaps you need the vector performance of a Cray or maybe the cache-coherent shared-memory system of a Data General AViiON or Silicon Graphics Origin. It all depends on your needs. Do the research before assuming you need one exact solution.

    FWIW, you may want to look at SGI's Advanced Clustering Environment for an all-inclusive, free, open-source solution. It's available for both SGI MIPS IRIX and IA-32/Intel Linux and works quite well with SGI's great Performance Copilot analysis software. They also know a thing or two about high performance computing. If you need more power you can build a warehouse of Linux boxes or buy a 512-processor Origin 3000 (w/ 1TB RAM and 714 GByte/sec bandwidth)... or a cluster of those!

    My $0.02

  8. Re:Beowulf Beowulf Beowulf... by psergiu · · Score: 2

    Mosix will be the best one once they finish implementing the almost-not-unlike-NUMA mmap wrapper.

  9. Extreme Linux is a little out of date by GC · · Score: 4

    I think the last version of Extreme Linux (searches for his Extreme Linux CD) is based on RedHat Linux 5.0. It's a little out of date now; the code has moved on considerably.

    For you I would like to recommend some reading:

    Building Linux Clusters by David HM Spector published by O'Reilly, (hmmm site seems to be down, come back later, or check Google cached version)

    This book comes with a CD together with clustering software. It also comes with step-by-step instructions. I believe, however, that there are some errata, which means that some hacking will need to be done to get your cluster online.
    It also goes through some aspects of choosing hardware etc...

    A more in-depth resource, without step-by-step instructions, but with detailed discussions of the granularity of Beowulf systems and whether they are actually good for the tasks you have in hand, is:
    How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters, published by the MIT Press

    Also check the Beowulf Project site and the Beowulf Underground site

    Have fun!
    ---

    1. Re:Extreme Linux is a little out of date by raulmazda · · Score: 2

      For you I would like to recommend some reading:

      Building Linux Clusters by David HM Spector published by O'Reilly, (hmmm site seems to be down, come back later, or check Google cached version)

      That book is not very good. I wouldn't recommend it to anyone building a Beowulf cluster. The examples are broken and there are a LOT of errata (a whole new ch2 from what I've heard?).

      Go get the Scyld Beowulf2 Beta... really, it's the way HPLC (High Performance Linux Clusters) are going... it's easy to admin, easy to set up, easy to understand. It's a big step in usability for Linux clusters.

      .laz

      oh, and how much will you pay for /. ID 87 ;)
      --
      My car is orange, my sig is not.

  10. Moderation on story submissions? by yerricde · · Score: 3

    I really wish Slashdot would start moderation on articles too

    You really wish you were looking at Kuro5hin. All logged-in users are always moderators at all times, and all logged-in users can vote +1 or -1 (remind you of [e2]?) on story submissions in the public queue.

    --
    Will I retire or break 10K?
  11. MOSIX DIPC by SEWilco · · Score: 2

    Apparently DIPC (Distributed IPC) can run with MOSIX, although as of a few months ago DIPC did not optimize for migrated processes. It could work, but it works better when DIPC realizes that processes are able to run on other systems.

  12. No one seems to be able to answer the question... by s.d. · · Score: 2

    The question was about which distros support linux clustering, and which people thought were good or not. It wasn't a call for a distro holy war, it wasn't a question about "How do I make a linux cluster?" or "What software packages are out there to cluster linux boxes?"

    Personally, I find that while Red Hat is not my favorite of the linux distros, Red Hat offers Red Hat Professional Services, and this is a very nice thing for management if the cluster in question is going to be in a production environment at a company or business somewhere. If it's for your home use, do what you like, but most PHBs tend to take extreme comfort in the fact that if something linux-related breaks, they can call Red Hat if the cluster admin on-site can't fix it, and Red Hat will either try to help on the phone, or you can pay for RH Prof. Services to come out to your site and take a look.

  13. Any distro can do it, but... by costas · · Score: 2

    ... your choice should also depend on the hardware and the amount of time you want to spend tweaking the config. The Beowulf I help admin has bleeding-edge hardware that requires proprietary (closed-source, commercial) drivers that are usually packaged for RedHat, even tested against RH-specific kernels. Yes, I could probably take the RPM apart and install them on another distro, but then I couldn't really use the OEM's support as they would come back with 'we don't support that'.

    So, in *practice* your best choices would be, in my experience, RedHat for Beowulf-type clustering (process distribution) and TurboLinux for high-availability clustering (fail-overs)...

  14. If it helps... by patreides · · Score: 2

    Debian has all the beowulf stuff you need prepackaged, like MPI and I think it has some batch programs. Just makes it easier to maintain if you ask me.

    --
    # debian/rules
  15. Re:Blender can't cluster! Check the official docs! by cxreg · · Score: 2

    Blender can't cluster! Check the official docs!

    Fortunately for SkyWriter, MOSIX isn't "clustering", since it turns a cluster of machines essentially into a single very large SMP machine. All the program needs to know is how to thread or fork itself to use multiple processors. At least that's the theory. Never used it myself ;)

  16. All experienced users moderate on E2 by yerricde · · Score: 2

    I'd be interested in finding out from an experienced Website Admin. just how much extra webserver load (if any) would result from letting all logged in experienced users moderate.

    All experienced users can moderate on [Everything 2]. Each user who has [at least 50 XP] (like Karma but you also get one for each write-up) is given 10 to 100 or more points per day with which to vote +1 or -1 on a particular write-up and cannot see other users' write-ups' scores until after voting on them.

    --
    Will I retire or break 10K?