Why Does Current Clustering Require Recoding?

Posted by Cliff on Tuesday September 13, 2005 @09:07AM from the reinventing-the-wheel dept.

AugstWest asks: "I've been doing some research into what the available clustering options are for pooling CPU resources, and it looks like most of the solutions I've found require that programs be rewritten to take advantage of the cluster. Since there are virtualization apps like Bochs and VMware, where the applications just make use of a virtual CPU as if it were a real CPU, why aren't there clustering solutions that do this as well?"

5 of 75 comments

Compilers by Marillion · 2005-09-13 09:30 · Score: 3, Interesting

Most compilers/interpreters support languages designed for single thread execution. Fortran, COBOL, C, C++, Ruby, Perl, PHP, Java, ... Sure all these have API calls to make use of multiple threads, but the language itself isn't multi-threaded.
In my shameless search for a site to cite, I found this http://www-unix.mcs.anl.gov/dbpp/ which covers lots of problems that have to be solved.
I'd love to see a language (or language extension) cleanly define a way to attach attributes to a code block that affect how and where it gets executed. The runtime library could then distribute that block as the environment best allows.

--
This is a boring sig
1. Re:Compilers by Marillion · 2005-09-13 17:11 · Score: 2, Interesting
  
  No. Java is a perfect example of API-based threading, and in Java it's easy to do. Still, a class has to implement Runnable, and the programmer has to create a Thread and start it.
  The synchronized keyword was closer to where I was going. Suppose Java had a thread modifier keyword for looping operators. You could then write:
  public void renderImage(Image images[]) {
    thread for (int i = 0; i < images.length; i++) {
      render(images[i]);
    }
  }
  Each iteration of the looping block launches as a different task running in parallel, and the loop exits once all tasks are complete. Or an easier RPC, like:
  public double getSalary(EmpID id) {
    String idString = id.getID();
    double salary;
    remote("hr-system") {
      salary = HR.getSalary(idString);
    }
    return salary;
  }
  I certainly recognize that there are some very hard problems that would need to be solved in such a scenario. Thread synchronization, mutexes, and semaphores would need to be looked at. A clean way to integrate directory services and other ways of defining environmental resources is not trivial either, and it is critical for the success of such a language.
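For comparison, here is a rough sketch of what the rendering loop takes with the API-based approach, assuming a modern JDK with java.util.concurrent and lambdas. The class name, the pool size of 4, and the render method (which just measures a string) are hypothetical stand-ins for illustration, not anything from a real rendering library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelRender {
    // Hypothetical stand-in for render(images[i]); it only measures the name.
    static int render(String image) {
        return image.length();
    }

    // Roughly what a "thread for" would have to expand to: one task per
    // iteration, then wait until every task has finished.
    static List<Integer> renderAll(List<String> images) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String image : images) {
                tasks.add(() -> render(image));
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until all tasks are done and keeps task order.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints [5, 6, 7]
        System.out.println(renderAll(Arrays.asList("a.png", "bb.png", "ccc.png")));
    }
}
```

All of the pool setup, task wrapping, and waiting is the boilerplate a language-level keyword would hide. Note that this still only spreads work across threads on one machine, not across cluster nodes.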
  
  --
  This is a boring sig
Because bandwidth is scarce. by roystgnr · 2005-09-13 09:35 · Score: 2, Interesting

If your problem is so parallelizable that bandwidth isn't a limitation, then you don't need any special clustering software, you just need nfs and ssh: I do all my compiling in a flash with a short script and "make -j 16 CXX=sshcxx".

If your problem isn't that parallelizable and yet you need a whole cluster of computers to run it, odds are you need more efficiency than distributed shared memory can give you. You can access memory on your own node with orders of magnitude more bandwidth and less latency than on other nodes, and if your application doesn't take that into consideration it can run orders of magnitude slower.
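A back-of-the-envelope sketch of that slowdown, as a simple weighted average of access times. The numbers are assumptions picked for illustration (100 ns for a local access, 100 microseconds for an access that goes over the network, 1% of accesses remote), not measurements from any real cluster, and the class and method names are made up.

```java
class RemoteAccessCost {
    // Average cost of one memory access when a fraction of accesses
    // has to be served by another node.
    static double averageAccessNanos(double localNanos, double remoteNanos,
                                     double remoteFraction) {
        return (1.0 - remoteFraction) * localNanos + remoteFraction * remoteNanos;
    }

    public static void main(String[] args) {
        double local = 100.0;       // assumed local access time in ns
        double remote = 100_000.0;  // assumed remote access time in ns
        double avg = averageAccessNanos(local, remote, 0.01);
        System.out.printf("average %.0f ns, slowdown %.1fx%n", avg, avg / local);
    }
}
```

Under these assumed numbers, sending just 1 access in 100 to another node makes the average access roughly 11 times slower, which is why an application that ignores data placement can lose most of what the extra nodes were supposed to buy.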

Of course, that doesn't apply to every problem, and there are people trying to create exactly the cluster-as-computer architecture you'd like to see for ease of application programming. Check out OpenMosix and MigShm for one example - I haven't used the latter DSM patch myself but I know that for non-shared-memory programs, Mosix has had working process migration code for years.
My try by fm6 · 2005-09-13 10:48 · Score: 2, Interesting

Lots of good answers, but none that quite satisfy me. Here's mine:
The virtual machines you mention all run on a single existing system. You want a virtual machine that runs on multiple systems. That goes way beyond what the existing VMs do. They just implement the hardware instructions of a single system in software running on a single system. Taking that implementation and spreading it out among multiple systems means anticipating every clustering problem the code might raise, and solving it in advance.
Nobody knows how to do that. If they did, they'd implement it as the back end of a compiler rather than waste the overhead of using a VM.
(They say that there are no stupid questions. Not true. But there are lame stupid questions, and interesting stupid questions. My vocation is answering interesting stupid questions, which is why I'm grateful for this one!)
No, the hard part is ... by hummassa · 2005-09-13 11:56 · Score: 2, Interesting

cache consistency. When I modify a page that is in my processor cache, I now have to put the word out to the whole network, and I can't really commit that page until I know for sure that other threads in the cluster did not modify the same page (and, if someone did, I must decide how to merge their modifications and mine, notify them of the merge, etc, etc...). What was a quick (important for performance) operation becomes a dog-slow operation, and maybe puts the whole motive for using a cluster in jeopardy...
HTH,
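To put a number on the "put the word out" step, here is a toy sketch that counts the messages a single write costs. It uses a simple write-invalidate scheme (other nodes drop their copies) rather than the merge approach described above, and it simulates everything in one process. PageDirectory, the node ids, and page 42 are all hypothetical names for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PageDirectory {
    // For each page number, the ids of the nodes holding a cached copy.
    private final Map<Integer, Set<Integer>> holders = new HashMap<>();
    private int messagesSent = 0;

    // A node reads a page and keeps a cached copy.
    void read(int node, int page) {
        holders.computeIfAbsent(page, p -> new HashSet<>()).add(node);
    }

    // A node writes a page: every other holder must be told to drop its
    // copy before the write can commit. Returns the messages needed.
    int write(int node, int page) {
        Set<Integer> set = holders.computeIfAbsent(page, p -> new HashSet<>());
        int invalidations = 0;
        for (int other : set) {
            if (other != node) {
                invalidations++;
            }
        }
        set.clear();
        set.add(node); // the writer is now the only holder
        messagesSent += invalidations;
        return invalidations;
    }

    int messagesSent() {
        return messagesSent;
    }

    public static void main(String[] args) {
        PageDirectory dir = new PageDirectory();
        for (int node = 0; node < 8; node++) {
            dir.read(node, 42); // eight nodes cache the same page
        }
        System.out.println(dir.write(0, 42)); // prints 7
        System.out.println(dir.write(0, 42)); // prints 0
    }
}
```

On a single machine the first write would be one store instruction. Here it becomes seven network round trips before it can commit, and that cost grows with the number of nodes sharing the page.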

--
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048