openMosix Is Shutting Down
jd writes "Despite having one of the largest user-bases of any clustering system for Linux, openMosix is to be shut down. Top developers have left and they lack the means or motivation to continue. Their official claim of multicore CPUs making clustering redundant is somewhere between highly improbable and totally absurd, as has been pointed out elsewhere. Why is this shutdown so important? Well, from a technical standpoint, the open-source bproc (the Beowulf process migration module) is ancient, MOSIX is very hard to obtain unless you're a student, and Kerrighed is (as yet) immature. From a user standpoint, openMosix is the mainstay of the Open Source clustering world and has by far the best management tools of any. The ability of this project to continue will likely have a major impact on the future of Open Source in the high-end markets: if the best of the best couldn't survive, people will be more careful about anything less."
There's a similar project named 'Open Single System Image'
http://sourceforge.net/projects/ssic-linux
I agree, there's a degree of optimism in my argument but the summary is plain flawed.
Its message and tone are that openMosix = dead, openMosix = OSS, therefore openMosix dying = OSS solutions are bad.
What it completely fails to address is that the situation would be no better, and in fact would be a lot worse, if this were a CSS tool. Indeed, the ray of light for openMosix users comes from the fact that it is OSS.
Bashing OSS solutions because one is dead/dying/in limbo/whichever way you want to look at it is patently ridiculous because it's not the openness of the code that's at fault here, or even the open source development model.
To put it bluntly, CSS projects that lose their core development teams don't exactly fare any better, do they?
"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
The pendulum has swung back now. In the days when 10Mbps Ethernet came onto the scene and our processors could barely keep up with their floppy drives (which is why a floppy used DMA), we collectively came up with the idea of using several computers to solve a problem by splitting the problem up among them. Since then, thanks to Moore's law, our processors now spend a lot of time waiting to fetch the next instruction from their on-chip L1 cache, for instance when the branch prediction step gets it wrong.
Our networks, however, have not kept up with this pace. Right now our very best effort for network speed is InfiniBand, which tops out at a theoretical limit of 96Gbps. The AMD Opteron page lists a bandwidth limit of 24GBps (that's 192Gbps) using three coherent HyperTransport links. See the problem?
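To make that comparison concrete, here is a quick sketch of the unit conversion. The two figures are the ones quoted above; the function and variable names are made up for illustration.

```python
def gigabytes_to_gigabits(gbytes_per_s: float) -> float:
    """Convert GB/s to Gb/s (8 bits per byte)."""
    return gbytes_per_s * 8


# Figures as quoted in the comment above, not independently measured.
INFINIBAND_GBPS = 96.0          # theoretical InfiniBand limit, in Gb/s
OPTERON_GBYTES_PER_S = 24.0     # Opteron figure, in GB/s

opteron_gbps = gigabytes_to_gigabits(OPTERON_GBYTES_PER_S)
ratio = opteron_gbps / INFINIBAND_GBPS

print(f"Opteron interconnect: {opteron_gbps:.0f} Gb/s")
print(f"That is {ratio:.1f}x the quoted InfiniBand limit")
```

With those numbers the on-board interconnect comes out at 192 Gb/s, twice the best-case network link.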
I see one of two things happening: either we'll find a magic-bullet technology to significantly increase our network speeds, or some limit will finally end Moore's law. Otherwise there's simply no reason to tie together multiple processors. Despite Windows' best efforts, our CPUs still spend most of their time waiting for something to do.
Dennis Dumont
Times like this make me realize that the end result of capitalistic software and open source software is really naught if you are on the losing end. If nobody likes your work, you are not going to be funded, and that's really what seems to be happening here.
The premise for shutting down the project is correct. Multiple cores all but eliminate the need for the most extreme clusters. Throw PCI Express graphics cards into the mix, and you have even that much more computing power. That's not to say that there aren't applications that require clustering, but those who make those applications are probably going to wind up writing their own distributed processing framework anyway, tailored to their needs.
Sometimes the turtles are just destined to be soup.
This is my sig.
What hurt it was the switch to 2.6. I personally was using it on 2.4; however, when 2.6 rolled out, openMosix wasn't going to support it for some time. This caused a lot of users to stop using it, because in the meantime there wasn't really a way to justify staying on 2.4 when the responsiveness of 2.6 was so much better.
I see now that they have an alpha version out for 2.6.
Note that 2.6.0 was released in 2003.
The article summary was certainly an eye-grabber, but the truth is that I've deployed quite a few Linux HA and load-balancing clusters, and I have also installed a couple of openMosix clusters. While it may be sad that openMosix is closing, the vast majority of clustering cannot be handled by openMosix. It is designed as a parallel processing cluster. I would say 99% of clusters are of the HA/load-balancing variety. For example, I've got 3 web servers and I want to distribute the load between them. openMosix cannot do this; it isn't designed for it. Or I have 5 DB servers and I want to distribute load or perform replication. Again, openMosix does not do this. It is a "processing" cluster: I have this huge data set, and an application which will split up that data set and do some processing on it. Think SETI@home, except you don't want to send it to people's homes; you just want to run a single process which will send jobs off to other nodes for computing. The only thing I ever successfully used openMosix for was a compile cluster, and for that it was nice, but even for regularly compiling KDE, the time it saved in compiling wasn't worth the effort of getting the cluster running.
At the time I used it, it couldn't migrate web server processes or DB server processes, so it was useless for what I do most of the time.
MOSIX and OpenMOSIX are separate projects. OpenMOSIX was started because MOSIX is so closed. Try to find an easy download link for MOSIX, go ahead. I'll wait...
I'm saddened by this development too. I've got a small network I've built over years of tinkering with Linux and I would have liked to explore what MOSIX and OpenMOSIX promised. I was hopeful that OpenMOSIX would release a stable branch for Linux 2.6, as that's what I prefer running on my machines. I might even have been able to contribute some after a while, but I'm no kernel hacker (which is what's required for a project like this), so I can't even bootstrap in now.
... And so it comes to this.
The guys who used to develop BProc are now focusing their efforts on http://xcpu.org/
Geezuus, it's not like OpenMOSIX is unusable as is, or that there aren't alternatives. For that matter, while coding one's own cluster controller isn't trivial, it isn't string theory either. Our shop has released (i.e., given away) two schedulers, and we've got another that's stayed in house. When I've strolled the booths at the SuperComputing conference, it seems that every other university is giving away their own cluster controller.
OpenMOSIX is neat, but it ain't the be-all and end-all, and it's been my experience that any shop that's serious about running a cluster manages to find/attract someone with the chops to get it up and running. Can just any elementary school pull one together for "free"? Maybe not. For them, there's Pooch or AppleSeed.
Luke, help me take this mask off
You mean links like these?
http://linux.softpedia.com/get/System/Operating-Systems/Kernels/MOSIX-7287.shtml
http://linux.softpedia.com/get/System/Operating-Systems/Kernels/MOSIX-Grid-and-Cluster-Management-23125.shtml
http://www.mosix.org/txt_cluster.html
http://www.tucows.com/software_detail.html?id=8473
http://www.icewalkers.com/Linux/Software/530140/MOSIX.html
That's just a few. I hope they helped out. BTW, my search terms in Google were "MOSIX download" without the quotation marks.
@Mindless Drivel: 100% of Twitter posts ever Tweeted.
Most Beowulf clusters run parallel codes written to use the Message Passing Interface (MPI). MPI programs really don't want to be migrated to different nodes while they're running, so load management is achieved through schedulers such as Grid Engine, TORQUE, and others. These schedulers avoid the need for process migration by preallocating the resources (compute nodes) in advance, which prevents the load imbalance from happening in the first place. openMosix waits for the imbalance to slow down the computation before it migrates a process to relieve the problem.
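A toy sketch of the difference, for anyone who wants to see it spelled out. This is not how Grid Engine, TORQUE, or openMosix are actually implemented; the function names, the FIFO policy, and the "differ by more than one process" threshold are all assumptions made for illustration.

```python
from collections import deque


def preallocate(jobs, free_nodes):
    """Batch-scheduler style: a job starts only once it can be given every
    node it asked for; otherwise it (and everything behind it) waits."""
    free = list(free_nodes)
    queue = deque(jobs)            # (job name, nodes requested), FIFO order
    running = {}
    while queue and len(free) >= queue[0][1]:
        name, wanted = queue.popleft()
        running[name] = [free.pop() for _ in range(wanted)]
    return running, list(queue)


def migrate_once(load):
    """Migration style: the imbalance already exists; move one process
    from the busiest node to the idlest. Returns True if a move happened."""
    busiest = max(load, key=load.get)
    idlest = min(load, key=load.get)
    if load[busiest] - load[idlest] > 1:
        load[busiest] -= 1
        load[idlest] += 1
        return True
    return False


# Hypothetical workload: three jobs competing for four nodes.
jobs = [("a", 2), ("b", 2), ("c", 1)]
running, waiting = preallocate(jobs, ["n1", "n2", "n3", "n4"])

# Hypothetical cluster where four processes all started on node n1.
load = {"n1": 4, "n2": 0}
migrations = 0
while migrate_once(load):
    migrations += 1

print(running, waiting)
print(load, migrations)
```

In the first case job "c" simply waits in the queue and no node is ever oversubscribed. In the second, the load only evens out after the fact, at the cost of two migrations.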
If you check the archives of the Beowulf mailing list, you'll see that while the Beowulf community knows about openMosix, very few Beowulfers use it.
Well, let me clarify the announcement:
...
...
The project will be shut down in March 2008, not before.
Actually, it's only Moshe who will stop "leading" the project (as a reminder, he didn't really 'lead' much in the 2.6 version).
After March, we will see who gets the 'leader' position, but I don't think that is really an important change (call that politics if you want). The fact is that for now, oM 2.6 has 3 core devs (me, risc, and g4saa) and we are all quite busy elsewhere. Anyway, if I can make interesting progress this year on the oM 2.6 code, I'll take over the project.
Don't fear, the oM project is not buried yet.
Anyhow, if any of you guys feel like doing kernel/user cluster dev, please feel free to contact me (or the list directly, I'll answer it).
WE NEED MORE DEVS !! (as always anyway).
"has had clustering for ~15 years"
I always laugh when people talk about "clusters" in which the members have no independent I/O path to the disk drives. Back in the days when I worked with VMS clusters (1985!), we had independent CI and later SCSI connections from each CPU to the disk farm. Even the disk controllers were redundant with failover. If any machine and any disk controller was up, you still had access to everything. There were even some military-grade installations where the cluster members were a few miles apart and connected by fiber. Nothing short of an ICBM would take down the cluster, and even then it would need multiple warheads to wipe out all the locations. Anything less, and the whole thing would still run as if nothing happened.
Installing software on one system meant it was installed everywhere because all of the machines booted from the same system disk. The remote help desk consisted of a person who would occasionally deal with telecomm issues and the very rare swapout of nearly-indestructible VT-100 terminals.
In today's world, I see too many "imitation" clusters in which a single failure takes down the whole thing. It's hard to believe that Digital lost the war to Intel and Microsoft.
IT was a fabulous career choice back in the 1980s. The high cost of everything forced people to make serious commitments to vendors and [gasp!] employees who ran the data center. Today's world of disposable commodity hardware/software/people may be faster and cheaper, but the bugs and downtime would be unacceptable by 1985 standards. Here in modern times, we expect much more than 1980s IT could deliver. But we certainly don't expect it with the same consistency.
The main reason projects like this are floundering is that the ROCKS project is becoming the de facto standard for cluster setups.
Also, companies like IBM and HP love to push their own proprietary setups.
As well, there are some good commercial products that add lots of well-supported tools.
For example, MOAB.
Maurice W. Hilarius Voice: (778) 347-9907