Slashdot Mirror


Ask Slashdot: Best Linux Distro For Computational Cluster?

DrKnark writes "I am not an IT professional, even so I am one of the more knowledgeable in such matters at my department. We are now planning to build a new cluster (smallish, ~128 cores). The old cluster (built before my time) used Redhat Fedora, and this is also used in the larger centralized clusters around here. As such, most people here have some experience using that. My question is, are there better choices? Why are they better? What would be recommended if we need it to fairly user friendly? It has to have an X-windows server since we use that remotely from our Windows (yeah, yeah, I know) workstations."

3 of 264 comments (clear)

  1. Distro isn't the biggie, it's the scheduler by javanree · · Score: 5, Interesting

    I've worked with various clusters over the past year.
    The distro doesn't really matter, mostly it's what you feel most comfortable with. I'd slightly favor RedHat Enterprise or a respin of it, since it's easiest in terms of drivers for commercial cluster hardware and commercial software support, but Debian would be just as fine. I would choose a 'stable' distro though, so no Fedora, no Ubuntu (even their LTS isn't exactly enterprise grade compared to RedHat / Suse or even Debian stable) You don't want to have to update every week since this usually requires quite some work (making new images and rebooting all nodes)

    What I found out matters a lot more is the scheduler you will use; Sun Grid Engine, PBS, Torque or slurm to name a few. Every scheduler comes with it strong and weak points, be sure to look at what matters most to you.

    If you are unfamiliar with all of these things, pick a complete bundle like Rocks (it's based on RedHat Enterprise Linux), which makes setting up a cluster quite easy and still allows you to choose which components you want. That'll greatly improve your chance of success. But be warned; it's still a steep learning curve building and specially configuring a cluster. The most time is spent tuning queuing parameters to maximize the performance of your cluster.

  2. Re:NPACI Rocks by daemonc · · Score: 5, Interesting

    Seconded. I used Rocks to build clusters for the university for which I worked, and it made my life much, much easier.

    If you are already familiar with Redhat administration, you'll be happy to know Rocks can use either Redhat or CentOS as its base OS.

    It uses meta-packages called "rolls", which completely automate the installation and configuration of your computing nodes. There are rolls that include most of the commonly used commercial and Open Source HPC software out there, or you can "roll" your own. Basically you just configure your head node, and then adding a compute node is as simple as setting the BIOS to boot over PXE, plug it in, and done.

    Rocks, well, rocks.

    --
    All that we see or seem is but a dream within a dream.
  3. Re:None of them by cashman73 · · Score: 3, Interesting

    Actually, in the realm of biomedical supercomputing, "none of the above" has already been done. Check out the Anton supercomputer designed and built by D.E. Shaw Research. The entire supercomputer, right down to all of the processor cores themselves, were specially designed and built specifically for molecular dynamics research. The system has no operating system and, as such, no overhead. Every processor cycle goes straight into the calculations. It is capable of churning out simulations of 150,000+ atom protein complexes on the order of several microseconds long, using wallclock CPU time of a few days.