Programming Environment When Mixing Beowulf And SMP?
mulcher asks: "In a Beowulf cluster I want to store (logically) one giant data structure across many nodes and have individual nodes compute in parallel. Each node has SMP, so I want to exploit that as well. I have seen C/C++ and the Message Passing Interface (MPI); however, I am also looking for Java and Haskell interfaces, or interfaces for special parallel languages. Any suggestions? Also, is there anyone out there who has attempted this and can recommend some URLs?"
I hope you use at least a switched 100 Mbit backbone; otherwise it's going to be a severe bottleneck. You might even want to consider gigabit Ethernet, though I wouldn't recommend running gigabit over copper.
What you need is an MPI implementation that
Does The Right Thing in handling communication
between interhost MPI processes and intrahost
MPI processes.
Such a beast is not that far off into
the future.
For Haskell parallel work check
out the GpH project at http://www.cee.hw.ac.uk/~dsg/gph/
You want speed? Then don't bother with
Java for now. If you are into functional
languages I'd also recommend taking a look
at Objective Caml, an ML variant from INRIA
in France. Nice language, excellent implementation
and there are both MPI and PVM libraries available.
For a cluster of SMPs like that, I've seen
people code in MPI for internode stuff, and use
threads within a node.
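To make that two-level split concrete, here is a minimal sketch in Python. It is an illustration of the structure only, under several assumptions: the names `split`, `node_work`, and `hybrid_sum` are made up for this example, a plain loop stands in for the MPI ranks (one per node), and the workload is a simple sum. In CPython the threads will not actually speed up CPU-bound work; in C you would use real MPI calls between nodes and pthreads or OpenMP within a node.

```python
from concurrent.futures import ThreadPoolExecutor


def split(seq, parts):
    """Split seq into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(seq), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks


def node_work(node_chunk, threads_per_node):
    """Work done on one node: fan the node's chunk out to its local CPUs via threads."""
    with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
        # Each thread sums one sub-chunk; the node then combines the partial sums.
        partials = pool.map(sum, split(node_chunk, threads_per_node))
        return sum(partials)


def hybrid_sum(data, num_nodes, threads_per_node):
    """Outer level: one chunk per node. Inner level: threads within each node."""
    # Assumption: on a real cluster each chunk would live on a separate host as an
    # MPI rank, and this final sum would be an MPI reduction. A loop stands in here.
    return sum(node_work(chunk, threads_per_node) for chunk in split(data, num_nodes))


if __name__ == "__main__":
    print(hybrid_sum(list(range(1000)), num_nodes=4, threads_per_node=2))
```

The point of the layout is that only the outer level ever crosses the network, so the per-node threads share memory and never pay message-passing costs.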
Do you have money for software? Check out
the Portland Group's compilers. They offer automatically
parallelizing compilers as well as explicitly parallel ones.
Check out the OpenMP stuff in particular.
You don't provide much detail on your code, but
if you are new to parallel work, expect an AWFUL
lot of work to get good performance. Unless
of course the problem is embarrassingly parallel.
Of course, a lot depends on the type of computing you are going to do. Since you're thinking of using a Beowulf cluster, I'm assuming the amount is 'fairly' limited and that you are able to partition your problem across several nodes without too much trouble.
The easiest solution is probably using MPI for all communication, as some suggested earlier. However, you would be completely ignoring the fact that your nodes have two tightly coupled CPUs, so you may not get the best performance from your system. Since this is a hybrid system (loosely coupled between nodes, tightly coupled between the two CPUs of a node), you might as well use this to your advantage.
Correction to my post above: "I'm assuming the amount is 'fairly' limited" should read "I'm assuming the amount of traffic between nodes is 'fairly' limited".
mpi-softtech produces MPI Pro, a commercial implementation of MPI. MPI Pro takes advantage of SMP machines automatically - if two tasks are on the same node, MPI Pro will use shared memory for communication between the two.
Although it is commercial software, it is free to download for unlimited use on Linux; on NT you are limited to 8 processes.
I have some very limited experience using it, but it might fill your needs.
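The idea of picking the channel by locality can be sketched in a few lines of Python. This is not MPI Pro's actual mechanism or API, just a hypothetical illustration: `pick_transport` and the `rank_hosts` table are names invented for this example, and a real MPI library makes this choice internally.

```python
def pick_transport(rank_hosts, src, dst):
    """Return 'shared_memory' if both ranks run on the same host, else 'network'.

    rank_hosts is a hypothetical table mapping rank number -> host name.
    """
    if rank_hosts[src] == rank_hosts[dst]:
        return "shared_memory"
    return "network"


if __name__ == "__main__":
    # Four ranks on two dual-CPU nodes (host names are made up).
    rank_hosts = ["node0", "node0", "node1", "node1"]
    print(pick_transport(rank_hosts, 0, 1))
    print(pick_transport(rank_hosts, 1, 2))
```

With this kind of placement, ranks 0 and 1 would talk through memory while ranks 1 and 2 would go over the wire, which is why it pays to put heavily communicating tasks on the same node.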
...if your program is anywhere near memory bandwidth limited, the extra processor (or three!) will buy you nothing. We see performance gains in the single-digit percentages when the problems we solve are maxing out the memory bandwidth.
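A back-of-the-envelope model shows why. Assume (and this is a simplification, not a measurement) that all CPUs in a node share one memory bus and that a single CPU running flat out already consumes some fraction of its bandwidth. Then the speedup from extra CPUs is capped by how much bus is left. The function name and the 95% figure below are illustrative assumptions.

```python
def bandwidth_bound_speedup(cpus, bus_fraction_per_cpu):
    """Upper bound on speedup when `cpus` processors share one memory bus.

    bus_fraction_per_cpu: fraction of total bus bandwidth that one CPU
    consumes when running at full speed (0 < f <= 1). Simplified model:
    the bus is the only shared resource and the work divides perfectly.
    """
    if not 0 < bus_fraction_per_cpu <= 1:
        raise ValueError("bus_fraction_per_cpu must be in (0, 1]")
    # Either the CPU count or the bus saturates first.
    return min(cpus, 1.0 / bus_fraction_per_cpu)


if __name__ == "__main__":
    for cpus in (1, 2, 4):
        print(cpus, bandwidth_bound_speedup(cpus, 0.95))
```

Under this model, a code that uses 95% of the bus on one CPU tops out at roughly a 5% gain no matter how many processors you add, while a code that uses 25% of the bus can still scale to four.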
The answer to your question is clearly based on what kind of problem you happen to be solving, though. If you're just banging registers together, then use MPI or one of the MPI w/SMP packages available (see other posts).
Of course, if you're interested in using Java or Haskell, you're probably not maxing out the memory bus or the processor's resources, so the extra local processors will probably buy you a lot. In the case of Java, use threads and RMI or somesuch.