Slashdot Mirror


Linux Clusters Explained

tramm writes: "As someone who works on massively parallel Linux clusters every day, I get tired of explaining why it is not 'Just another Beowulf'. Linux World has a good article on the four major types of Linux clusters. Our work is in supporting scientific codes that have a high degree of communication. This requires a very different system from the standard Beowulf-class machines that excel at the 'embarrassingly parallel' codes that do not require as much communication. The network interconnect for a high-performance cluster costs vastly more than that of a generic 100BASE-T system."

2 of 53 comments

  1. They didn't talk about... by rho · · Score: 5

    Nerd Clusters, which are more widespread than Beowulf, and more scalable.

    For example, Nerd Clusters using ChineseTakeOut messaging are often used in last-minute, panic-stricken Intranet roll-outs, yet each node of a Nerd Cluster can answer simple management questions such as, "Hey, my PC at home crashes all the time. How can I fix it?"

    Nerd clusters are, however, more dangerous to operate. If, for example, you say "Let's migrate our core applications from Solaris to NT", you run the risk of massive memory leakage as individual Nerd-nodes begin to prioritize jobs such as "update_resume" over your request queue.

    Nerd clusters need a "master" node as well. These can generally be identified by their bushy beards, or a long string of nodes queueing up to beg for static IPs.

    --
    Potato chips are a by-yourself food.
  2. They forgot Condor... by epaulson · · Score: 5

    Condor, from the University of Wisconsin, should have been listed on page two. Condor is a high-throughput computing system that runs on UNIX (virtually all flavors) and NT. We support MPI and PVM. You can run regular jobs as they are, or relink with our libraries and get transparent checkpointing and remote I/O. You can use sockets in your job.

    You don't need to have a dedicated cluster - Condor started life as a scavenger of idle workstations. We run Condor on every workstation here at CS, and routinely recover several thousand CPU-hours a day that otherwise would have been wasted. You can configure Condor to run with any policy you want on a per-workstation level - only run jobs at night, only run jobs from this group, only run jobs if the wind is blowing from the west - whatever makes sense to the workstation's owner.
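
The per-workstation policy idea in the comment above can be illustrated with a rough sketch. This is not Condor's configuration syntax or API; `Job`, `MachineState`, `only_at_night`, `only_from_group`, `owner_is_away`, and `may_start` are all hypothetical names, and the night-time hours and idle threshold are assumed values chosen for illustration. It only shows a policy modeled as a list of rules that must all agree before a job may start on a machine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Job:
    owner_group: str  # hypothetical: the group that submitted the job


@dataclass
class MachineState:
    hour: int                   # local hour of day, 0-23
    keyboard_idle_minutes: int  # minutes since the owner last touched the machine


# A rule inspects the job and the machine and returns True if the job may run.
Rule = Callable[[Job, MachineState], bool]


def only_at_night(job: Job, state: MachineState) -> bool:
    # Assumed definition of "night": 20:00 through 05:59.
    return state.hour >= 20 or state.hour < 6


def only_from_group(group: str) -> Rule:
    # Build a rule that accepts jobs from a single group.
    def rule(job: Job, state: MachineState) -> bool:
        return job.owner_group == group
    return rule


def owner_is_away(min_idle_minutes: int) -> Rule:
    # Build a rule that accepts jobs only once the keyboard has been idle long enough.
    def rule(job: Job, state: MachineState) -> bool:
        return state.keyboard_idle_minutes >= min_idle_minutes
    return rule


def may_start(job: Job, state: MachineState, rules: List[Rule]) -> bool:
    # The workstation owner's policy is satisfied only if every rule agrees.
    return all(rule(job, state) for rule in rules)


# One owner's (hypothetical) policy: night-time only, physics group only,
# and only after 15 minutes without keyboard activity.
policy = [only_at_night, only_from_group("physics"), owner_is_away(15)]
```

A scheduler built along these lines would call `may_start(job, state, policy)` for each workstation before placing a job there, and each owner would supply their own list of rules.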


    Best of all, we're free-as-in-beer.


    If you have any questions, send us mail at condor-admin@cs.wisc.edu