Slashdot Mirror


Choosing the Right Cluster System

ckotso asks: "I've read here and there about Linux clusters, and I am ready to set about creating one with some help from the educational institute I work for. So far I've found out about Beowulf, SCI and MOSIX. I would really appreciate some help with this, since NT is gradually making its way into the University and I hate to see it. I want to offer this place a cheap and robust alternative; I simply have to change their minds!" Interested? There's more information inside.

"My questions are:

  1. Have I missed any other serious competitor in the cluster field?
  2. What are the pros and cons of these systems?
  3. Has anyone tried them all and written any report as to how they compete?
Thanks!"

2 of 106 comments

  1. A few ideas... by Noryungi · · Score: 5
    OK, here is my take on your question. Watch out, though, as I am not a Beowulf expert.

    Here is some information you may want to consider before starting your own cluster:
    • Beowulf clusters have to be useful for the kind of scientific projects your university undertakes. Large science (physics, astronomy) projects, usually coded in Fortran and involving lots of calculations that can be computed in parallel, are ideal applications for them. Other applications may benefit a lot less. A Beowulf cluster, despite its power, is not always the perfect solution.
    • If your University is short on cash, you may want to investigate the "Stone Soup" cluster -- recycled old Pentiums and 486s can find a second lease on life in a Beowulf cluster. Pros: cheap. Cons: requires a lot of labor and patience, and is less powerful than a Beowulf cluster using up-to-date CPUs and network connections.
    • To be truly effective, Beowulf clusters require at least a couple of very powerful servers and very advanced network hardware -- be sure to factor this into the total cost.
    • Beowulf clusters are not for the faint of heart. They require quite a lot of skill in network configuration, machine configuration and traffic optimization. It's not surprising the first Beowulf clusters were born at NASA -- it did take rocket scientists to make them work! =) Once they are up and running, though, their performance is close to or better than that of dedicated supercomputers -- for a small fraction of the price.
    • Another good side of Beowulf is the fail-safe behavior and room for growth of such a machine. If a "node" goes down, the machine does not crash, and the main server can assign that node's share of the task(s) to another machine. If you need a more powerful machine, simply add a dozen new PCs to your mix and watch those MIPS/Gigaflops go up!
    • Finally, never forget the one argument that wins them all: price, price, price, price! Linux is free, Intel PCs are dirt cheap, and all you need is a lot of space and a dedicated team to make it work. Oh, and lots of network cards & cables... =)
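
    To make the fail-safe point above a bit more concrete, here is a rough sketch in Python of a main server handing a dead node's share of the work to the machines that are still up. It is only a toy simulation under my own assumptions: the function name, the node names and the round-robin policy are all made up for illustration and are not part of any Beowulf software.

```python
from collections import deque


def run_with_reassignment(tasks, nodes, failed_nodes):
    """Toy model of a main server distributing tasks round-robin.

    All names here are hypothetical. A node listed in `failed_nodes`
    is treated as down: it is dropped from the pool and the task it
    was about to receive is handed to the next live node instead.
    Returns a dict mapping each task to the node that ran it.
    """
    pending = deque(tasks)
    live = list(nodes)
    assignment = {}
    turn = 0
    while pending:
        if not live:
            raise RuntimeError("no live nodes left to run the remaining tasks")
        task = pending.popleft()
        node = live[turn % len(live)]
        if node in failed_nodes:
            # The node is down: remove it and put its task back in the queue.
            live.remove(node)
            pending.appendleft(task)
            continue
        assignment[task] = node
        turn += 1
    return assignment


if __name__ == "__main__":
    plan = run_with_reassignment(range(6), ["n1", "n2", "n3"], {"n2"})
    for task, node in sorted(plan.items()):
        print(f"task {task} -> {node}")
```

    In this sketch the job as a whole still finishes when "n2" is down; the remaining nodes simply end up with more tasks each. A real cluster would detect failures at run time rather than from a list given up front.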

    So, some positive factors, some negative ones. If you want to convince your University, remind them that they can count on the support of other universities and research centres the world over that are using this technology right now.

    Good luck!
    --
    The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
  2. Depends on your needs. by SEWilco · · Score: 5
    That's quite an assortment. What you want depends on your needs and on the characteristics of the choices. As for NT, the availability of source for many of these things will be nice for research activities.
    • Beowulf is one of a family of parallel programming API tools. Programs must use the API to accomplish parallel programming.
    • SCI is fast hardware with support for distributed shared memory, messaging, and data transfers. Again, if you don't use the API, there is no gain.
    • DIPC is distributed System V IPC. Programs which use the IPC API can be converted to DIPC easily, often just by adding the DIPC flag to the IPC call.
    • MOSIX is the most general-purpose. Processes are scattered across a cluster automatically without having to modify the programs. No API needed other than usual Unix-level process use. Allows parallel execution of any program, although full use requires a parallel program design.
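
    As a rough illustration of the last point, here is a small Python sketch of a program that splits its work across ordinary Unix processes and uses no cluster API at all. The function names and the way the work is split are my own assumptions for the example. On a single machine it just runs several local processes; on a cluster that migrates processes transparently, as described for MOSIX above, whether and where they move is up to the cluster, not this code.

```python
import subprocess
import sys


def start_partial_sum(start, stop):
    """Start an ordinary child process that prints sum(range(start, stop)).

    Nothing cluster-specific is used here: it is a plain process launch.
    """
    code = f"print(sum(range({start}, {stop})))"
    return subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
    )


def parallel_sum(n, workers=4):
    """Sum the integers 0..n-1 by splitting the range across child processes."""
    step = max(1, -(-n // workers))  # ceiling division, never zero
    procs = [
        start_partial_sum(lo, min(lo + step, n))
        for lo in range(0, n, step)
    ]
    # Each child prints one integer; collect the outputs and add them up.
    return sum(int(proc.communicate()[0]) for proc in procs)


if __name__ == "__main__":
    print(parallel_sum(1000))
```

    Note that the program still had to be designed to split its work into several processes, which matches the caveat above: the migration is automatic, but full use of the cluster requires a parallel program design.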