A Fast Start For openMosix
axehind writes "Dr. Moshe Bar recently announced the creation of openMosix, a new Open Source project. The project has quickly attracted a team of volunteer developers from around the globe and is off to a very fast start. openMosix is a Linux kernel extension for single-system image clustering. openMosix is perfectly scalable and adaptive. Once you have installed openMosix, the nodes in the cluster start talking to one another and the cluster adapts itself to the workload."
Under some workloads, I can go along with the assertion that a MOSIX cluster is just like having a big machine with a lot of CPUs. It seems to be great for those workloads and I would love to try it out. Those loads tend to consist of multiple long-running jobs (more than a few seconds each) that are not multithreaded. For MOSIX to be most efficient, there also need to be fewer jobs than there are CPUs to run them.
Other workloads, however, will not benefit from MOSIX. These statements are based on reading the docs a couple weeks back, not on actual experience.
Under the MOSIX model, when a process forks, the child may run on the current machine or it may migrate somewhere else. If the job is short-lived (ls, echo whatever | sed s/blah/baz/, you get the point), MOSIX will perform poorly because it will spend more time trying to figure out where the process should run than it would have spent just running the program on the local host.
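To put rough numbers on that trade-off, here is a toy break-even model in Python. It is only a sketch and not how MOSIX actually decides anything: the function name, the fixed overhead figure, and the assumption that jobs on a node share its CPU evenly are all made up for illustration.

```python
def should_migrate(cpu_seconds, local_jobs, remote_jobs, overhead_s):
    """Toy model: migrate only if the job is expected to finish sooner remotely.

    cpu_seconds  -- CPU time the job needs on an otherwise idle CPU
    local_jobs   -- jobs sharing the local CPU, counting this one
    remote_jobs  -- jobs that would share the remote CPU, counting this one
    overhead_s   -- assumed fixed cost of choosing a node and moving the job
    """
    local_finish = cpu_seconds * local_jobs
    remote_finish = overhead_s + cpu_seconds * remote_jobs
    return remote_finish < local_finish


# A 10 ms "ls"-style job on a busy node: the overhead dwarfs the work.
short_job = should_migrate(cpu_seconds=0.01, local_jobs=2, remote_jobs=1, overhead_s=0.5)

# A one-minute number cruncher on the same busy node: migration pays off.
long_job = should_migrate(cpu_seconds=60, local_jobs=2, remote_jobs=1, overhead_s=0.5)

print(short_job, long_job)
```

With these invented numbers the short job stays put and the long job moves, which is the pattern described above.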
If you need more CPU time than one CPU can provide and your program is multi-threaded, a single multiprocessor machine will also work better. This is because MOSIX does not yet support threads running on different machines. A 128-node cluster of 386s is going to run Netscape slower than a single 486 because you will only be using one 386 CPU.
For cases where you just have too many jobs for the resources available (CPU or memory), you may be better off with something like Condor. It is great for submitting batch jobs, migrating those jobs around, and only running the number of jobs that the system can handle.
I tried (vanilla) MOSIX a while back. It was cool, but had some real-world drawbacks. If you start a process on a node and that process opens a socket, opens a file, or uses shared memory, then that process is stuck on that node. So if you start 10 dnet processes on one node, they won't migrate to idle nodes because they have open sockets (to the key server).
I don't know if this is still the case. I heard a rumor that all these things were going to be implemented, so it'll be an interesting project to watch.
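For what it's worth, the restriction as I understood it boils down to a check like the one below. This is a Python sketch of the rule as described in this comment, not code from MOSIX, and it may well be out of date; the Process class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Process:
    # Hypothetical stand-in for whatever the kernel tracks per process.
    name: str
    open_sockets: int = 0
    open_files: int = 0
    uses_shared_memory: bool = False


def can_migrate(p: Process) -> bool:
    """A process is pinned to its node if it holds a socket, a file, or shared memory."""
    return p.open_sockets == 0 and p.open_files == 0 and not p.uses_shared_memory


# Ten dnet clients, each holding a socket to the key server, plus one pure number cruncher.
dnet = [Process(f"dnet-{i}", open_sockets=1) for i in range(10)]
cruncher = Process("cruncher")

migratable = [p.name for p in dnet + [cruncher] if can_migrate(p)]
print(migratable)  # ['cruncher']
```

Under that rule all ten dnet processes stay on the node where they were started, and only the process with no sockets, files, or shared memory is free to move.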
Good Luck Open Mosix!
-The JungleBoy
"You never know when some crazed rodent with cold feet might be running loose in your pants."
-Calvin
The main difference is that Mosix doesn't work with threads. You can spawn a separate process on a node and it can migrate to different nodes. But if your application is threaded, all the threads will run on one node, or migrate between nodes together.