Slashdot Mirror


Linux Clustering Hardware?

Kanagawa asks: "The last few years have seen a slew of new Linux clustering and blade-server hardware solutions; they're being offered by the likes of HP, IBM, and smaller companies like Penguin Computing. We've been using the HP gear for a while with mixed results and have decided to re-evaluate other solutions. We can't help but notice that the Google gear in our co-lo appears to be off-the-shelf motherboards screwed to aluminum shelves. So, it's making us curious. What have Slashdot's famed readers found to be reliable and cost effective for clustering? Do you prefer blade server forms, white-box rack mount units, or high-end multi-CPU servers? And, most importantly, what do you look for when making a choice?"

13 of 201 comments

  1. Dual Opteron 1U rack units.... by Fallen+Kell · · Score: 5, Interesting

    For the size and performance, they are hard to beat. A dual Opteron setup in a 1U rack case is a very powerful setup in and of itself. The bonus of using off-the-shelf components with no need for proprietary hardware or software also makes them very affordable. The added bonus is that you can simply get replacement parts from regular retailers.

    --
    We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
    1. Re:Dual Opteron 1U rack units.... by FuturePastNow · · Score: 4, Informative

      It's no dual Opteron server, but this Dan's Data article reviews what are probably the cheapest 1U servers you can buy. Definitely something to consider if you're going for cheap.

      --
      Give a man fire, and you warm him for the night. Set a man on fire, and you warm him for the rest of his life.
  2. Check out Xserve by Twid · · Score: 5, Informative

    At Apple we sell the Xserve Cluster node which has been used for clusters as large as the 1,566 node COLSA cluster. We also sell it in small turn-key configurations.

    Probably the most interesting news lately for OS X for HPC is the inclusion of Xgrid with Tiger. Xgrid is a low-end job manager that comes built-in to Tiger Client. Tiger Server can then control up to 128 nodes in a folding@home job management style. I've seen a lot of interest from customers in using this instead of tools like Sun Grid Engine for small clusters.

    You can find some good technical info on running clustered code on OS X here.

    The advantage of the Xserve is that it is cooler and uses less power than either Itanium or Xeon, and it's usually better than Opteron depending on the system. In my experience almost all C or Fortran code runs fine on OS X straight over from Linux with minimal tweaking. The disadvantage is that you only have one choice: a dual-CPU 1U box - no blades, no 8-CPU boxes, just the one server model. So if your clustered app needs lots of CPU power it might not be a good fit. For most sci-tech apps, though, it works fine.

    If you're against OSX but still like the Xserve, Yellow Dog makes an HPC-specific Linux distro for the Xserve.

    --
    - "When you want something with all your heart, the entire universe conspires to give it to you" -Paulo Coelho
  3. Dual Core Opteron Blades by municio · · Score: 5, Insightful

    At the current time I would choose blades based on dual-core Opterons for many reasons. Some of the main ones are:

    - Price
    - Software availability
    - Power consumption
    - Density

    Brand depends on what your company is comfortable with. Some companies would want to have the backing of IBM, Sun or HP. Others will be quite satisfied with in-house built blades. These days it's quite easy to build your own blade; some motherboard makers take care of almost all components and complexity (for example Tyan). But again, maybe the PHBs at your gig will run for the hills if you mention the word motherboard alone.

  4. Read the Google paper ! by devitto · · Score: 5, Insightful

    The paper goes into tedious detail on the architecture and low-level operation of the application. Why do you think it does this? Because it is the application that *totally* dictates the solution. They chose lots of systems because of reliability, and they made those systems "desktop class" because they didn't get much extra from using super-MP/MC systems.

    It's a great article. I strongly suggest you read it properly, and do what they said they did - evaluate need against what's available.

  5. well... by croddy · · Score: 5, Informative
    At the moment we have a rack of Dell PowerEdge 1750s. They're very nice for our OpenSSI cluster, with the exception of the disk controller. Despite assurances by Dell that the MegaRAID unit is "linux supported", we're now stuck with what's got to be the worst SCSI RAID controller in the history of computing.

    we're hoping that upgrading to OpenSSI 1.9 (which uses a 2.6 kernel instead of the 2.4 kernel in the current stable release) will show better disk performance... but... yeah.

  6. Total Overkill by Anonymous Coward · · Score: 5, Funny
    We can't help but notice that the Google gear in our co-lo appears to be off-the-shelf motherboards screwed to aluminum shelves.

    That would be typical of a prima donna company like Google that's floating in cash from their IPO.

    Around here, we don't waste money on fancy designer metals like aluminum. Salvaged wooden shipping pallets work just fine for us; they're free. And screws!? No need to waste resources on high-end fasteners when you can pick up surplus baling wire for less than a penny per foot. A couple of loops of wire and a few twists are all you need to assemble a working server.

    The dotcom days are over. There's no reason to throw money around like there's no tomorrow.

  7. No one size fits all answer but here is mine :) by Anonymous Coward · · Score: 5, Informative

    My $.02 worth ...

    I build Linux and Apple clusters for biotech, pharma and academic clients. I needed to announce this because clusters designed for lifesci work tend to have different architecture priorities than say clusters used for CFD or weather prediction :) Suffice it to say that bioclusters are rate limited by file I/O issues and are tuned for compute farm style batch computing rather than full on beowulf style parallel processing.

    I've used *many* different platforms to address different requirements, scale out plans and physical/environmental constraints.

    The best whitebox vendor that I have used is Rackable Systems (http://www.rackable.com/). They truly understand cooling and airflow issues, have great 1U half-depth chassis that let you get near blade density with inexpensive mass market server mainboards, and they have great DC power distribution kit for larger deployments.

    For general purpose 1U "pizza box" style rackmounts I tend to use the Sun V20z's when Opterons are called for but IBM and HP both have great dual-Xeon and dual-AMD 1U platforms. For me the Sun Opterons have tended to have the best price/performance numbers from a "big name" vendor.

    Two years ago I was building tons of clusters out of Dell hardware. Now nobody I know is even considering Dell. They are no longer on my radar -- their endless pretend games with "considering" AMD-based solutions are getting tired, and until they start shipping some Opteron-based products they are not going to be a player of any significant merit.

    The best blade systems I have seen are no longer made -- they were the systems from RLX.

    What you need to understand about blade servers is that the biggest real savings you get with the added price comes from the reduction in administrative burden and ease of operation. The physical form factor and environmental savings are nice but often not as important as the operational/admin/IT savings.

    Because of this, people evaluating blade systems should place a huge priority on the quality of the management, monitoring and provisioning software provided by the blade vendor. This is why RLX blades were better than those from any other vendor, even big players like HP, IBM and Dell.

    That said though, the quality of whitebox blade systems is usually pretty bad -- especially concerning how they handle cooling and airflow. I've seen one bad deployment where the blade rack needed 12 inch ducting brought into the base just to force enough cool air into the rack to keep the mainboards from tripping their emergency temp shutdown probes. If forced to choose a blade solution I'd first grade on the quality of the management software and then on the quality of the vendor. I am very comfortable purchasing 1U rackmounts from whitebox vendors but I'd probably not purchase a blade system from one. Interestingly enough I just got a Penguin blade chassis installed and will be playing with it next week to see how it does.

    If you don't have a datacenter, special air conditioning or a dedicated IT staff then I highly recommend checking out Orion Multisystems. They sell 12-node desktop and 96-node deskside clusters that ship from the factory fully integrated and, best of all, they run off a single 110V electrical outlet. They may not win on pure performance when going head to head against dedicated 1U servers, but Orion by far wins the prize for "most amount of compute power you can squeeze out of a single electrical outlet..."

    I've written a lot about clustering for bioinformatics and life science. All of my work can be seen online here: http://bioteam.net/dag/ -- apologies for the plug but I figure this is pretty darn on-topic.

    -chris

  8. Re:XServe by jschottm · · Score: 4, Interesting

    To those who say Apple isn't targeting the enterprise, look no further.

    Let me know when they stop trying to force their iPod updater (you know, the one that breaks Real's compatibility DRM software) onto my servers. No matter how many times you put that update in the "Never update this" category, it shows back up the next time you run Software Update. Until they stop trying to play childish games on my production servers, I'll not consider them ready for the enterprise.

  9. Obviously by iamdrscience · · Score: 4, Funny

    Isn't it obvious that the best technology is blade servers? I mean, c'mon fucking BLADE servers! It's far and away got the coolest name of any of them. The only way you could beat them would be if some company came out with something cooler like ninja star servers, now that would be awesome.

  10. Mobos on Ikea shelves by astrojetsonjr · · Score: 4, Interesting

    Currently running 65 AMD mobos (1 master, 64 nodes) on Ikea shelves. Cheap, easy to swap out, good air flow around the hardware. The shelves are wood, so everything just sits on them. It would be nice to find power supplies with extra connections to power more than one system.

  11. SunFire Servers by PhunkySchtuff · · Score: 4, Informative

    SunFire v20z or v40z Servers.
    http://www.sun.com/servers/entry/v20z/index.jsp
    http://www.sun.com/servers/entry/v40z/index.jsp
    They're the entry-level servers from Sun, so they have great support. They're on the WHQL List, so Windows XP, 2003 Server and the forthcoming 64-bit versions all run fine.
    They also run Linux quite well, and as if that wasn't enough, they all scream along with Solaris installed.
    The v20z is a 1- or 2-way Opteron box in a 1RU case. The v40z is a two- or four-CPU box that is available with single- or dual-core Opterons.
    Plus, they're one of the cheapest, if not the cheapest, Tier 1 Opteron servers on the market.

  12. Great story re w/ build your own IBM cluster by MilesParker · · Score: 4, Interesting

    We wanted to set up a small 4-8 node cluster, mostly for testing and as a compute resource. For various political reasons we were looking at an IBM solution. At my urging we went for dual Opterons in the 1U format, and the price seemed right. Here's where it gets weird, *after* the IBM sales people step in. Going through it piece by piece, I thought I could put a decent system together -- with our substantial IBM discount -- for $14k. By the time we got the quote with all of the crap they thought we needed, it was $34k! Just to give the flavor, the rack and assorted pieces were $4k. But that's not the funny part. We were like, "Well, for this much money, we assume you are putting it together for us." "Um, no... didn't you see the services quote that went along with this?" We hadn't -- with the services/support, the quote came in at $60k! So at this point we asked, "Can't we just buy the individual pieces we need and put it together ourselves?" "Well, yes, but then it won't be an IBM e1350 cluster 'solution'..." "Yeah, well, we don't really care what it's called, it'll be just as fast and 75% cheaper..." At that time they were getting rid of their 325 servers for way cheap and we actually put that system together for as cheap as a whitebox and probably as cheap as if we'd tried to put it together ourselves. The moral, I guess, is that if you have to deal with the big vendors, have a very sharp pencil handy!
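
    For anyone who wants to check the "75% cheaper" figure in the comment above, here is a rough sketch of the arithmetic. It assumes the $60k figure is the total quote including services and support (the comment can be read either way), and the function and variable names are made up for illustration.

```python
def percent_cheaper(diy_cost: float, quoted_cost: float) -> float:
    """Return how much cheaper diy_cost is than quoted_cost, in percent."""
    if quoted_cost <= 0:
        raise ValueError("quoted_cost must be positive")
    return (1.0 - diy_cost / quoted_cost) * 100.0


# Figures taken from the comment, in US dollars.
DIY_ESTIMATE = 14_000         # piece-by-piece estimate with the IBM discount
HARDWARE_QUOTE = 34_000       # vendor quote for hardware alone
QUOTE_WITH_SERVICES = 60_000  # assumption: total quote including services/support

savings_vs_hardware = percent_cheaper(DIY_ESTIMATE, HARDWARE_QUOTE)
savings_vs_full = percent_cheaper(DIY_ESTIMATE, QUOTE_WITH_SERVICES)

print(f"vs. hardware-only quote: {savings_vs_hardware:.1f}% cheaper")
print(f"vs. quote with services: {savings_vs_full:.1f}% cheaper")
```

    Under that reading, the do-it-yourself estimate comes out roughly 77% below the full quote, which lines up with the "75% cheaper" remark. Against the hardware-only quote the gap is closer to 59%.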