Why Does Current Clustering Require Recoding?
AugstWest asks: "I've been doing some research into what the available clustering options are for pooling CPU resources, and it looks like most of the solutions I've found require that programs be re-written to take advantage of the cluster. Since there are virtualization apps like Bochs and VMWare, where the applications just make use of a virtual CPU as if it was a real CPU, why aren't there clustering solutions that do this as well?"
why aren't there clustering solutions that do this as well?
Because it's a lot faster to address a local CPU than it is to send that info down the wire to a remote CPU? And because of that latency, it's a lot easier to keep 2 or more local CPUs in sync than it is to keep 2 or more remote CPUs in sync?
You need to recode to work around the severe latency of going over a network cable, so you design your apps to minimize messaging between CPUs. Some apps can do this well because they don't need results from other CPUs to finish their own share of the work.
Other applications require CPUs to work in tandem, and for each CPU to have to wait while the results are served out over GigE would suck some serious ass, even if it might be technically possible.
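To put rough numbers on that, here is a back-of-the-envelope sketch. The cost model and every figure in it are assumptions for illustration (about 100 ns to reach a local CPU, about 100 µs for a network round trip, messages handled one after another), not measurements of any real cluster.

```python
# Hypothetical cost model: the function name, the latencies and the
# workload sizes are illustrative assumptions, not measured values.

LOCAL_LATENCY = 100e-9    # assumed cost of talking to another local CPU
NETWORK_LATENCY = 100e-6  # assumed round trip over the wire


def run_time(compute_s, messages, latency_s, workers):
    """Estimated wall-clock seconds: compute is split evenly across the
    workers, and every message costs one full latency of waiting."""
    return compute_s / workers + messages * latency_s


# 80 seconds of work on one CPU, no messaging at all.
single = run_time(80.0, 0, 0.0, 1)

# Independent work on 8 remote CPUs: one "start" and one "result" each.
independent = run_time(80.0, 16, NETWORK_LATENCY, 8)

# Tightly coupled work: a million synchronisations between the CPUs.
chatty_local = run_time(80.0, 1_000_000, LOCAL_LATENCY, 8)
chatty_network = run_time(80.0, 1_000_000, NETWORK_LATENCY, 8)

print(f"single CPU:         {single:8.2f} s")
print(f"independent, wire:  {independent:8.2f} s")
print(f"chatty, local CPUs: {chatty_local:8.2f} s")
print(f"chatty, wire:       {chatty_network:8.2f} s")
```

Under these assumptions the independent job gets nearly the full 8x speedup over the wire, while the chatty job comes out slower on the cluster than on one CPU. That gap is what the recoding is for.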
--
$tar -xvf
As it stands today, an OS cannot easily share tasks. But there exist some tasks which are more easily shareable than others. I imagine within a century we'll be able to share tasks more easily, and I think the Cell chip is meant to ease this transition, but I could be wrong.
This is in addition to the handling of resources such as database connections and other shared resources across the distributed cluster. I'm not exactly sure what your specific needs are, but when you separate threads across different physical memory spaces, it creates significant problems to overcome. If you just want to virtualize the application (so one machine, many virtual machines, one physical memory), then the recoding should be trivial. And I agree, in this isolated case, no recoding should be necessary. But most of the time, clustering entails spanning multiple physical memories, and thus the application needs to be designed to handle these difficulties.
"Those that start by burning books, will end by burning men."
You might want to try Mosix.
http://www.mosix.org/
Clustering exposes complications regarding: shared data, latency, concurrency, transactions, central control, security, failovers, and so forth. It's hard because it's hard.
In my shameless search for a site to cite, I found this http://www-unix.mcs.anl.gov/dbpp/ which covers lots of problems that have to be solved.
I'd love to see a language (or language extension) cleanly define a way to let me attach attributes to a code block which could affect how and where it gets executed. The runtime library could then distribute that block as the environment best allows.
This is a boring sig
This is a basic systems question:
[Why must] programs be re-written to take advantage of the cluster.
The simple answer is that programs, in general, are written as single-threaded applications with shared state (memory). A cluster is the opposite of that: multiple parallel CPUs without shared state (or at least requiring one to be explicit about shared state, as opposed to simply declaring a variable).
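A minimal sketch of what "being explicit about shared state" looks like, assuming threads and in-process queues as stand-ins for cluster nodes and network links. The names `worker` and `cluster_sum` are made up for this example. The point is that no worker touches the caller's variables: data goes in as a message and results come back as a message.

```python
import queue
import threading


def worker(inbox, outbox):
    """Sum each chunk received. Nothing is shared with the sender;
    the queues are the only way data gets in or out."""
    while True:
        chunk = inbox.get()
        if chunk is None:        # sentinel: no more work
            break
        outbox.put(sum(chunk))   # send the partial result back


def cluster_sum(data, n_workers=4):
    """Split data into chunks, ship each to a worker, add up the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()

    chunks = [data[i::n_workers] for i in range(n_workers)]
    for chunk in chunks:
        inbox.put(chunk)         # explicit "send" of the input
    for _ in threads:
        inbox.put(None)          # one stop signal per worker
    for t in threads:
        t.join()

    # Explicit "receive": one partial sum per chunk that was sent.
    return sum(outbox.get() for _ in chunks)


print(cluster_sum(list(range(1000))))
```

A single-threaded program would just write `total += x` in a loop. Here the programmer had to decide how to split the data, what to send and how to combine the answers, and on a real cluster each of those puts and gets would cross the network.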
Usually a program's algorithm has to be completely redesigned in order to take advantage of the cluster while mitigating the problems. At minimum the program must be parallelized. If you don't change the program to successfully deal with the latency of sharing memory between machines, then the cluster ends up only about as powerful as a single fast computer running the program.
The reason you are asking this question is that you don't realize that a cluster is fundamentally different than a single (or dual or quad) CPU. The architecture is completely different. You can't expect to treat it like any old computer.
-Adam
This will be a bit difficult to explain fully. The other posts have already lightly touched the problems involved (especially latency). But you are talking about the holy grail of parallel computing here; seeing one system while it is running all over the place. My best advice for you is to get a good book on parallel systems and get educated. This is something like asking a doctor why there are still diseases.
The only way you'll have source code that compiles and runs unmodified on architectures of widely varying parallelism efficiently is for the language itself to know about parallelism, and make it the compiler's (and even runtime-linker and kernel's) job to parallelize your code for you. An inherently parallel language would have ways for you to specify in your source code what can and cannot be executed in parallel, and what code absolutely depends on the serial execution of some previous code. Even then, we're really only talking about the SMP case. When you start involving network latencies and bandwidth restrictions, the decisions on when and how to parallelize become more challenging for the compiler/runtime, possibly requiring either more intelligence on its part and/or more meta-information in your source code.
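As a toy illustration of that idea, here is a sketch in which the source states which steps depend on which, and a small runtime decides what can run at the same time. `run_graph` and the task table format are hypothetical, invented for this example. It only covers the local, SMP-style case using a thread pool, and it makes no attempt at the harder network-aware decisions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_graph(tasks):
    """tasks maps name -> (function, [names it depends on]).
    Every task whose dependencies are finished is submitted in the same
    wave, so independent tasks may run in parallel; dependent ones wait."""
    results = {}
    pending = dict(tasks)
    with ThreadPoolExecutor() as pool:
        while pending:
            ready = [
                name for name, (_, deps) in pending.items()
                if all(d in results for d in deps)
            ]
            if not ready:
                raise ValueError("circular dependency between tasks")
            futures = {
                name: pool.submit(
                    pending[name][0],
                    *(results[d] for d in pending[name][1]),
                )
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                del pending[name]
    return results


# "a" and "b" are declared independent; "c" must wait for both.
tasks = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "c": (lambda a, b: a * b, ["a", "b"]),
}

print(run_graph(tasks)["c"])
```

The dependency lists are the kind of meta-information the source has to carry. Without them the runtime would have no safe way to know that "a" and "b" can overlap but "c" cannot start early.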
Until you write code in a language like that, you can never expect to write code in a single-threaded mindset and then have it just magically take advantage of a parallel environment.
11*43+456^2