Slashdot Mirror


Why Does Current Clustering Require Recoding?

AugstWest asks: "I've been doing some research into what the available clustering options are for pooling CPU resources, and it looks like most of the solutions I've found require that programs be re-written to take advantage of the cluster. Since there are virtualization apps like Bochs and VMWare, where the applications just make use of a virtual CPU as if it was a real CPU, why aren't there clustering solutions that do this as well?"

6 of 75 comments (clear)

  1. latency? by Johnny+Mnemonic · · Score: 5, Insightful

    why aren't there clustering solutions that do this as well?

    Because it's a lot faster to address a local CPU than it is to send that info down the wire to a remote CPU? And because of that latency, it's a lot easier to keep 2 or more local CPUs in sync than it is to keep 2 or more remote CPUs in sync?

    You need to recode because you want to work around the latency, which is severe, of working via a network cable--so you design your apps to minimize messaging between CPUs. Some apps can do this well--they don't need results from other CPUs to complete their own information.

    Other applications require CPUs to work in tandem, and for each CPU to have to wait while the results are served out over GigE would suck some serious ass, even if it might be technically possible.

    --

    --
    $tar -xvf .sig.tar
  2. cooking lessons by xutopia · · Score: 3, Insightful
    imagine telling a group of 10 cooks to make a huge roast. You wouldn't cut the roast in 10 pieces and each make them cook it seperatly and then glue the pieces back together. It would be highly difficult to glue back the 10 pieces. Instead it would make more sense to ask all cooks to do a seperate tasks. A few could could cut vegetables while another would make a sauce, another a salad.

    As it stands today, an OS cannot easily share tasks. But there exists some tasks which are more easily shareable than others. I imagine within a century we'll be able to share tasks more easily and I think the CELL chip is meant to ease this transition but I could be wrong.

    1. Re:cooking lessons by swmccracken · · Score: 3, Insightful

      See, is this High Availablity clustering or performance clustering. The asker doesn't state, and it's a rather important distinction.

      If it's HA, you'd get 10 cooks each to make a roast. Sure, you'd end up with cooking extra meat but that doesn't matter - the goal here is to guarentee that a roast will be cooked no matter what. (I can imagine two copies of bochs running on seperate physical machines but linked to run in absoulte lock-step. Performance might be impared, but relability will be there.)

      If it's performance, then you're right, you can't magically glue two computers together and get twice the performance.

  3. Mosix by NitsujTPU · · Score: 4, Insightful

    You might want to try Mosix.

    http://www.mosix.org/

  4. TANSTAAFL by Julian+Morrison · · Score: 4, Insightful

    Clustering exposes complications regarding: shared data, latency, concurrency, transactions, central control, security, failovers, and so forth. It's hard because it's hard.

  5. Holy Grail by owlstead · · Score: 3, Insightful

    This will be a bit difficult to explain fully. The other posts have already lightly touched the problems involved (especially latency). But you are talking about the holy grail of parallel computing here; seeing one system while it is running all over the place. My best advice for you is to get a good book on parallel systems and get educated. This is something like asking a doctor why there are still diseases.