Domain: mosix.org
Stories and comments across the archive that link to mosix.org.
Comments · 73
-
Re:There are other options for DynDNS only routers
Not that I know of. The three links below can give you dynamic failover, and if you install the MOSIX kernel (free for 4 nodes) you've automatic load balancing, but these all work at the node level on a single WAN. I can find nothing similar that would give you extranet failover/load balancing.
http://www.linux-ha.org/wiki/Releases
http://clusterlabs.org/wiki/Main_Page
http://www.corosync.org/doku.php -
Earlier
.. was Mosix http://www.mosix.org/
It allowed mosix-running linux computers to distribute their load across other connected mosix-running linux computers.
Processes migrate to other nodes transparently. No programming changes were needed. -
Re:orly?
You mean links like these?
http://linux.softpedia.com/get/System/Operating-Systems/Kernels/MOSIX-7287.shtml
http://linux.softpedia.com/get/System/Operating-Systems/Kernels/MOSIX-Grid-and-Cluster-Management-23125.shtml
http://www.mosix.org/txt_cluster.html
http://www.tucows.com/software_detail.html?id=8473
http://www.icewalkers.com/Linux/Software/530140/MOSIX.html
BTW, that's just a few. I hope they helped out. BTW, my search term in Google was "MOSIX download" without the quotation marks. -
Re:orly?
MOSIX and OpenMOSIX are separate projects. OpenMOSIX was started because MOSIX is so closed. Try to find an easy download link for MOSIX, go ahead. I'll wait...
I'm saddened by this development too. I've got a small network I've built over years of tinkering with Linux and I would have liked to explore what MOSIX and OpenMOSIX promised. I was hopeful that OpenMOSIX would release a stable branch for Linux 2.6, as that's what I prefer running on my machines. I may have even been able to contribute some after a while, but I'm no kernel hacker (Which is what's required for a project like this), so I can't even bootstrap in now. -
Re:Linux 3.0.0
> * Transparent clustering. Run this process somewhere else with as much or as little user control as is required
Oh boy!!! This is how SMP kernels work when you run them on a multiprocessor system. But not when you have multiple discrete single CPU systems. Or multiple discrete multiprocessor systems.
Imagine all your machines automatically acted as a single box when they were connected to the LAN. Or other low latency interconnect like Infiniband.
I think the closest thing is Mosix:
http://www.mosix.org/txt_about.html
Then you're on to the network queueing systems like NQS, PBS, Torque, Sun Grid Engine, Condor ... -
Re:Depends on what you mean by real world.
Thank you for the compliment. It's equally nice to know that there are active questioners on Slashdot determined to stretch the quality to the limits. In the spirit of providing information, though, I'll add a few links for the perusal and amusement of all. I'm hard on some of the software, but that's not because I could do better. If anything, it's because I have confidence the authors could.
Let's start with a Slashdotting of NASA...
- Scalable Dynamic Chimera Methods for Unsteady Aerodynamics is one of those packages mere mortals like us will have either no use for or will have to just drool over.
- Fully Unstructured Navier-Stokes 3D is a nice Fortran-based CFD, requires some hefty paperwork to obtain, and may need you to use G95 rather than GCC's GFortran, due to compiler bugs.
- OVERFLOW and related CFD software.
- Three Dimensional Multi-block Advanced Grid Generation System is the component that actually lets you do a lot of the necessary grid work for CFDs.
- Viscous Upwind ALgorithm for Complex Flow ANalysis is the hardest of the CFD codes at NASA to obtain, but if you want to work on anything hypersonic, it's the best place to start. Do Not Use hypersonic airflows for CPU cooling.
- Astrophysical Thermonuclear Flash Simulator - well, you never know.
- Geant4, for the subatomic nuclear physicist in your life...
- Open Field Operation and Manipulation is a nice open-source CFD package.
- Parallel Basic Local Alignment Search Tool gives you a parallelized search engine for nucleotides and proteins.
- Stanford Exploration Project provides some nice parallel geophysics applications and tools.
- Tachyon Parallel Raytracer is a nice example of what you can do with parallelism and graphics.
- Kerrighed is an up-and-coming clustering system for Linux. I saw it demonstrated at SC|05 - and was less than impressed. It needed a lot of work at that point. However, it looks like it has improved a lot since then, and it would be unreasonable to not mention it.
- MOSIX is the second-oldest clustering technology to gain a fan following to rival Star Trek. It's very good, though hard to get if you're not in academia. Arguably for entirely fair reasons.
- OpenMOSIX was originally a fork from MOSIX but is now essentially its own clustering technology. Development is nowhere near the speed I'd like, it does need far more eyes, but is well-known and highly regarded. Moshe Bar is also one of the coolest developers I've encountered.
- DAKOTA is a program for profiling parallel applications and should be useful in telling you where you are gaining and losing.
- HPC Toolkit is another toolkit for profiling HPC applications.
- is yet another profiler for parallel software. Between this and the others I've listed, you should have more information than sequential programmers ever get to work with.
- Performance API is a facility used by most of the profiling software to provide an architecture-independent view of performance counters. I have it on good authority that some (now former)
-
Re:our brains aren't wired to think in parallel
Correct me if I am wrong here, most parallel programming needs are
greatly alleviated by this project => http://www.mosix.org/txt_about.html -
terminals + cluster
Install the clients as linux terminals. http://www.ltsp.org/
Connect them to a mosix cluster http://www.mosix.org/
Use rdesktop for those apps which still need windows. http://www.rdesktop.org/ -
MOSIX License
Actually, I'd recommend openMosix.
Agreed.
Granted Mosix is the original and is open source now as well,
Not by OSI/DFSG/FSF standards. The license is still very restrictive. I think the kernel patches might be under GPL, but certainly not the user tools.
it still seems like openMosix is more actively developed.
This is certainly true. Most talent jumped ship & openMosix does have a higher number of active developers (and is somewhat backed by AMD (though I think AMD can and should give more developers to the project)). -
openMosix
-
Mosix
You might want to try Mosix.
http://www.mosix.org/ -
I wonder if Professor Amnon Barak knows about this
This sounds like a fork of the original MOSIX written by Professor Amnon Barak (Hebrew University, Israel).
IANAL, but the MOSIX License Agreement is very specific in terms of how the code can be used. Here are some quotes:
- THE CODE is the intellectual property of the Copyright owner (currently, Amnon Barak or Amnon Shiloh or both).
- You may make copies of THE CODE or its derivatives for yourself or for your company or for your organization.
- Any use not specifically permitted by the Copyright owner is hereby excluded from this License Agreement. In particular, this also implies that you are not allowed to redistribute THE CODE or its derivatives outside your premises, your company or your organization.
- Reverse-engineering of parts of THE CODE (if any) that are provided in binary and/or assembly form is not allowed.
-
Re:High Innovation Rate?
According to the history of MOSIX
http://www.mosix.org/faq/output/faq_q0003.html, MOSIX has been around since 1977.
An interesting, but not very strong example... -
Re:High Innovation Rate?
What high innovation rate? Software is doing the same shit today that it was doing back in '95, we just have prettier interfaces now. I'd hardly call that innovation.
We all see what we want to see, I suppose. How about:
- Konqueror's KIO abstracted protocol interface
- Extensively reliable plug-in based software (IM, firefox, etc). Do you remember what generic software extensions were like in 1995?
- Dancing tree filesystems
- MPEG4/divx/ogg vorbis, theora
- Hashing-based multipart/swarming P2P clients
- Freenet
- New Linux VMs and schedulers
- Mouse gestures
- Bayesian spam filters
- MOSIX
No innovation? Open your eyes.
-
Mosix
Some years back EdLUG received a visit from Amnon Barak, one of the founders of the Mosix project. He mentioned a couple of things relevant to the current discussion.
The project was started due to US export controls preventing the purchase of an off the shelf solution.
GNU/Linux was chosen over *BSD, at least in part, due to the GPL. The team wanted to share their work, even though it freaked the military out.
-
MOSIX...
-
Re:Hyperthreading
Just use MOSIX to get it working. To quote the about "Just fork and forget..."
-
A Dose of Reality
I've been part of the Sun beta software program, and one of my colleagues just returned from SunOne. We've been looking at this roadmap for some time, as it promises a set of features that we can exploit rather effectively should they come to fruition.
Behind the marketing-speak, N1 is merely Sun's attempt to provide the same types of services as MOSIX. They want to provide a utility computing resource that spans servers. This, combined with a set of management tools for creating/administering rulesets, can be used to automagically reconfigure Solaris domains with additional resources in order to meet spot demand.
The end result would be a cluster of x800 Starfire-class boxen linked with Sun's Wildcat bus interconnect. The domaining featureset will be extended so that individual Solaris domains could span physical servers. Memory and CPU could be held in reserve to be brought online to meet demand, or domains could be dynamically re-sized (a swamped domain could steal processors from one with lots of idle time).
Storage would be provided to the virtualized operating environments via SAN or NAS services. Solaris 9's network QoS and Mobile IP facilities could be leveraged to make the network interfaces transparent.
The phrase "eliminating sysadmins" is loaded with hyperbole, as the cluster would still need to be monitored, backed up, have its hardware maintained, have filesystems cleaned out and reorganized, etc. There's no "A.I." involved. There's no "click here to create a secure ordering and inventory system" button that magically translates a spec into an application, at least as far as I've seen.
-
RAID
Who says that RAID can't run across multiple machines?
-
Try this?
Take one of these
Add a little traveling salesman
and a few neurons
-
I saw something on freshmeat
Found it
'MOSIX'
You can use Mosix to cluster those 486's together and get high performance file data transfer.
using
"The experimental MOSIX Parallel I/O (MOPI) package can read over 1,600 MB/S using 60 nodes. "
-
Re:Wow... good thing they chose linux...
1368 wasted CPU's
Are you kidding? With the possible exception of Cray and maybe the Hitachi (can't find info on that one... they seem to be out of the supercomputer business), nobody builds single-unit supercomputers any more. The scalability with clustering and shared memory over high speed networking overcomes the contention problems with massively parallel processors, though the Numa-Q may let us put more CPUs in a single box. Check out the AlphaServer SC and the RS/6000 SP, both supercomputers in the top 10 (with the SP dominating... awesome switch for the shared mem), and Beowulf or (no shared memory here) Mosix clustering. If I were building a Linux supercomputer, I think I'd rather go with a pile of dual-CPU units anyway, cutting down on resource contention.
With distributed processing like these systems, adding another unit adds about another unit worth of processing power, whereas adding cpus to a SMP configuration gives diminishing returns. As long as you build the infrastructure along with the growth (add switch capacity), the sky's the limit. -
Re:Wow
As my previous post sounds like a troll: I just intended to joke about the practical need for an AtheOS or anything similar.
I was a fan of BeOS mostly because of its clustering abilities, and I became enthusiastic about the idea of running a server w/ 32 clustered boxes. Then I found MOSIX, and I'm just happy with Linux (although I still have only a few boxes).
-
Re:Yeah, right
This isn't Beowulf clustering. You don't have to write or modify applications specifically for the cluster. MOSIX migrates individual processes to the least used or best system available in the cluster. Think of it like SMP processing.
Has anyone in this thread even bothered to check out the MOSIX site? It doesn't seem like it because it sounds like everyone is thinking beowulf.
Do yourselves a favor and see:
The MOSIX Site
What is MOSIX? -
Hey, check this out...
For you Mandrake users, I head a project to include LTSP and Mosix on a Mandrake configured kernel; to package and explain in very easy terms the whole process, and then eventually release a stripped-down Mdk, geared towards education (edu-tech is pretty much my field) ala K12 LTSP. We call it The Mandrake Mosix Terminal Server Project. Check it out and lend a hand if interested. Thanks.
-
Some facts, anyone?
As far as I can tell, MOSIX is still freely downloadable from mosix.com. However, the statement on the download page does seem kind of odd (you acknowledge "THAT MOSIX IS THE PROPERTY OF AMNON BARAK").
I really appreciate the work that Barak has done with and on Mosix. But I also find it kind of odd that Mosix could be the "property" of one individual. I would assume that it was developed with public research grants and while the author was employed at a university. Graduate students probably have also contributed, and there probably have been bug fixes as well. So, maybe it isn't bad if there is a GPL'ed distribution of Mosix after all. The GPL regulates issues of ownership rather well.
As for a user-space implementation of Mosix, I think that makes sense, although it has its drawbacks as well. One of the problems with user-space implementations is that they tend to be less than transparent in practice. It also strikes me as somewhat redundant, since Condor has already gone the user space route. A userland Mosix would only make sense if it were free and open source (as opposed to Condor).
Altogether, I hope people won't get too upset at each other over this. Mosix is great stuff and Barak and his university have been generous in making it available freely up to this point.
-
Clusters kick ass!
Does anyone else get a hard-on when they look at stuff like http://www.mosix.org/pics/index.html this?? OMG. I wish I had 40 boxes in my basement to do my evil bidding. Mwhahaha...
-
I never heard of mosix,
and since information is a bit lacking at the link provided, here's a link to the regular mosix FAQ.
-
wow!
imagine a mosix cluster of dual-gigahertz g4s!
-
one word
"Mosix"
You may need to configure one machine separately from the others to be the "master." You'll probably need either some kind of disk space available or use some sort of ramdrive.
From the Mosix web page:
How to configure MOSIX clusters with a pool of servers and a set of (personal) workstations:
Single-pool = all the servers and workstations are used as a single cluster: install the same "mosix.map" in all the computers, with the IP addresses of all the computers.
Advantage/disadvantage: your workstation is part of the pool.
Server-pool = servers are shared while workstations are not part of the cluster: install the same "mosix.map" in all the servers, with the IP addresses of only the servers.
Advantage/disadvantage: remote processes will not move to your workstation. You need to login to one of the servers to use the cluster.
Adaptive-pool = servers are shared while workstations join or leave the cluster, e.g. from 5PM to 8AM: install the same "mosix.map" in all the computers, with the IP addresses of all the servers and workstations, then use a simple script to decide whether MOSIX should be activated or deactivated.
Advantage/disadvantage: remote processes can use your workstation when you are not using it. -
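The adaptive-pool rule above leaves the "simple script" unspecified. As a rough sketch only, the time check such a script might make could look like this in C. The function name is hypothetical, the window simply mirrors the "5PM to 8AM" example, and the commands that actually activate or deactivate MOSIX are left out because the quoted text does not give them.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when a workstation should be part of
 * the cluster. The 17:00-08:00 window mirrors the "5PM to 8AM" example
 * in the quoted text; hour is the local hour of day, 0..23. */
bool workstation_should_join(int hour)
{
    if (hour < 0 || hour > 23)
        return false;               /* reject out-of-range input */
    return hour >= 17 || hour < 8;  /* evening, night and early morning */
}
```

A cron job or login/logout hook could call a check like this and then run whatever command the local MOSIX installation provides for joining or leaving the pool.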
If it was really important not to lose it...
Then why didn't you and some colleagues lift the server and the UPS together (without switching it off), put it in a car, and drive to the nearest place with electricity before the UPS ran out?
Or if the server was too heavy/big to move, then run to the hardware store and buy a generator
Other answer:
If it's a linux machine, then build a mosix cluster, then you can migrate the process to a system that still has power (assuming not all systems in the cluster lost power...) -
Re:fdsa
MOSIX is clustering software that has run with Bell UNIX, BSD, and Linux, and originally ran on PDP-11s.
-
I'll be going with MOSIX
A housemate of mine and I decided that we wanted to build a pathetic little Supercomputer out of the various PCs laying around in our little Geek House. We've decided to give MOSIX a run. It sounds like a fairy tale solution...especially when it comes to automatic process migration node to node. Anybody here have any positive experiences or harsh words regarding this?
-
Re:A wierd idea I had
You want to run Mosix.
-
mosix for rendering
have you considered using a mosix cluster
-
Why IBM is so important?
Clustering is such a fascinating area on its own and the article is so shallow that I'm curious why they've published it at all. They could mention the potential benefit of cluster computing as well as examples of some working clusters like Beowulf or Mosix or even the famous fact that there is a cluster among top 500 supercomputers.
-
Re:Machines..
If I understand correctly (IIUC? Maybe I can start a new acronym trend here...) anything parallelizable must be custom-written with the Beowulf libraries in mind.
Nah. Check out MOSIX - it can migrate processes across nodes of a cluster automatically. Like on SMP, parallelism is just a fork() away. -
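To make "just a fork() away" concrete, here is a minimal sketch in plain POSIX C. Nothing in it is MOSIX-specific: it only forks ordinary worker processes, which is the kind of program the comment says a process-migrating cluster could spread across nodes without code changes. The function names and the toy workload are made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical CPU-bound job: sum the integers 1..n. */
static unsigned long long busy_sum(unsigned long long n)
{
    unsigned long long total = 0;
    for (unsigned long long i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Fork `count` children, each running busy_sum(n), then wait for them.
 * Returns the number of children that exited successfully, or -1 if a
 * fork() failed. Children already started are always reaped. */
int run_workers(int count, unsigned long long n)
{
    int started = 0, ok = 0;

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* could not create more workers */
        if (pid == 0) {
            /* Child: do the work and report success via the exit status. */
            unsigned long long expect = n * (n + 1) / 2;
            _exit(busy_sum(n) == expect ? 0 : 1);
        }
        started++;
    }

    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return started == count ? ok : -1;
}
```

Calling run_workers(4, 100000000ULL) from main() starts four independent processes. On a single SMP box the scheduler spreads them over CPUs; whether and where a cluster would migrate them is up to the clustering layer, not this code.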
Below is the article copied from Byte...
Byte.com is pretty non-responsive from my part of the world. It's a good read if you have time...
Linux Kernel Pillow Talk
By Moshe Bar
October 29, 2001
And you thought the netherworlds of dry kernel engineering were free of politics, egos, and prima-donnas? Guess again. The events of the last four to six weeks and the e-mails flying to and from the Linux kernel mailing list show how Byzantine and complex the dynamics of decision finding, features design, and implementations can be. Go to http://www.tux.org/lkml/ to subscribe to the kernel mailing list, but be careful: This is a very high-traffic list. Subscribe only if you really want to follow every single detail of the Linux kernel, or instead read the weekly digest at Linux Kernel Cousin at http://kt.zork.net/kernel-traffic.
Sure, the lively debates have always existed. In the past there have been disputes about the Linux firewalling code, networking code, scheduler, installer, driver model, and many more. One recurrent theme has always been the Virtual Memory (VM) manager. Nothing determines the peculiar behavior, the feel -- even the ultimate success or failure of an operating system -- like its virtual memory design. Sometime during the development cycle leading up to the Linux 2.4.0 kernel, in other words in 2.3.xx times, Rik van Riel (http://www.surriel.com), a Dutch kernel hacker working for Brazil-based Conectiva (one of the smaller Linux distributions), introduced radically new VM code. It was based on what seemed to be new and advanced algorithms for efficient finding, allocation, and disposal of virtual memory pages requested by programs. Rik later introduced an interesting new kernel feature called the "OOM killer." OOM stands for Out Of Memory. The OOM killer attempts to locate a killable process when memory runs out in the system. Without such a feature the whole machine can go nuts or enter a vicious cycle of swapping out a few pages, realizing immediately after that those pages are needed, and searching again for swappable page candidates, keeping the kernel busy doing only this instead of letting user processes run.
Rik is a gifted hacker, and among other things he has been trying to improve the efficiency and speed of maintenance of those lists in the kernel responsible for managing all the virtual memory pages in the system. One of the main questions to address in every operating system VM code is: "How do you choose which page to steal next when there is a RAM shortage?"
In the 2.4.0 release, the Linux kernel scans the process page tables and decides which page to remove. The problem with this approach is that sometimes a lot of process tables have to be scanned to free just one page, or very few pages. Also, this approach does not guarantee that the pages stolen are only those that will not be needed again very soon. Some UNIXes introduced the notion of the working set; that is, the minimum amount of memory required by a process to function efficiently. This solution is, however, limited to per-process pages only and does not consider other kinds of pages, such as filesystem caching. Stealing from these pages might in some cases even prove counter-productive. Very often in VM theory, a solution to one problem can worsen another; that's why kernel programming is difficult.
Rik van Riel and I have variously discussed another approach, called "reverse mapping," which implements a reverse-lookup between the page and process table. Once you have reverse-mapped pages, the VM can simply scan the pages for the ones to be freed. Naturally, some extra fields need to be added to the appropriate control tables to allow this reverse mapping. My own implementation has an overhead of 14 bytes and is therefore certainly a lesser solution than Rik's -- his overhead is just 8 bytes.
Other extremely talented kernel hackers such as Marcelo Tosatti and Ben LaHaise have made other important contributions to the Linux VM.
However, even though all these intelligent people tried hard to make the Linux VM fast, efficient, and powerful, user reports since the 2.4.0 release indicated poor Linux kernel performance and erratic and unstable behaviors. Up to kernel 2.4.7, for instance, on machines with little memory (less than 40 MB of RAM), sudden swap storms could erupt which would virtually freeze the system while it inexplicably started swapping pages in and out like crazy. In some cases, the aforementioned OOM Killer would choose the wrong process to kill; I have seen the all-important init process killed erroneously. Many fringe kernel projects, like my own Mosix project or others such as Win4Lin, suffered because users accused these projects of unstable operation, assuming that a released kernel like 2.4.0 must be free of such nasty bugs. Even though the kernel had gradually evolved from 2.4.0 to 2.4.9, it was evident that the VM design was more of a liability than an advantage.
Linus himself said in a recent kernel list mailing that he wasn't happy yet with the VM. These problems were enough for many Linux shops to resist the migration to the 2.4 kernels and instead continue using the 2.2.19 kind of kernels. Obviously, compared to 2.4, the 2.2 series has many shortcomings -- like no zero-copy networking, the division of page cache and buffer-cache in filesystem operations, big spinlocks (serializations of kernel execution paths for computers with more than one CPU) for many parts of the kernel, and so on.
A simple C program like the one below shows how kernels up to 2.4.9 had problems dealing with stress workloads on the VM system. If, after running this program, you turned the swap partition off with swapoff, your server or workstation would become totally unresponsive for up to 15 minutes.
/* Based on code originally proposed by Andrew Tanenbaum, later by
   Derek Glidden and many others since. */
#include <stdlib.h>   /* calloc, free */
#include <unistd.h>   /* sleep */

int main(void)
{
    /* In the next line we allocate 200MB, but since a virtual memory page
       is not actually allocated by the kernel until we use it, we also
       have to create an access to it. The amount of allocated pages should
       reflect the total RAM on your computer. This test runs well with
       machines of, say, 256MB. */
    void *p = calloc(50000000, sizeof(int));
    if (p == NULL)
        return 1;   /* allocation failed, nothing to stress */
    /* In the next line we let the system calm down a bit after
       allocating pages. */
    sleep(12);
    /* And now we release it all again. */
    free(p);
    return 0;
}

Back in February 2001, I ran an informal and unscientific benchmark comparing FreeBSD 4.1.1 to kernel 2.4.0 (visit http://www.byte.com/documents/s=558/byt20010130s0010/) on exactly the same hardware and with exactly the same subsystems versions (MySQL, Sendmail, Apache, and others). The results clearly showed that, indeed, there were major problems with the efficiency and speed of the early 2.4 kernels.
A New VM
Then, on September 24, with the kernel standing at version 2.4.9, everything suddenly changed. Andrea Arcangeli, an Italian kernel hacker (read my interview with him two years ago at http://www.byte.com/documents/s=287/byt20000229s0008/) and a very prolific contributor, decided that enough was enough. He sat down and in one of those marathon hacking bouts completely rebuilt the VM from scratch. In short succession he sent to Linus Torvalds over 150 patches to the 2.4.9 kernel, to implement a new VM engine. This is an extremely remarkable feat. A VM is a major piece of software and by nature very complex. One needs to satisfy many opposed objectives: Simultaneously efficient handling for server-type loads and interactive-type loads; ease of implementation and at the same time, optimized use of every last and small feature of the CPU. The VM must also be able to run well on Intel CPUs spanning 4 or 5 generations, as well as on AMD chips, Alphas, MIPSes, Sparcs, ARMs, and what have you. Andrea, by the way, does all his development on a Compaq AlphaServer with two 500-MHz CPUs and 3-GB RAM.
Out of the blue, Linus accepted the new VM and incorporated it into the official Linux kernel tree.
Recently, I spent two days with Andrea giving speeches. During the two days, over many bottles of beer, we had plenty of time to discuss his new VM. I was mainly interested in how the new VM affects Mosix. Because Mosix must migrate virtual memory pages belonging to the program's address spaces between cluster nodes, it is important to correctly understand the VM and interface efficiently to it.
Specifically, Andrea took exception to the following problems in the 2.4 VM:
- kswapd looping forever on DMA or NORMAL class-zones.
- swap+ram will be almost all available address space (modulo when the swap cache serves to avoid swapin of shared anonymous memory after a fork).
- swapout storms.
- benchmarks, when run repeatedly, gradually slow down.
The new VM is much simpler and faster. Let me explain how it works.
The old 2.4 VM had a major design problem that manifested itself mainly when freeing physically dirty pages (remember dirty pages are the frames of 4-KB memory in the RAM whose contents have been modified by one of the virtual memory pages residing in it). The last owner of the page (usually the VM, except in swapoff) has to clear the dirty flag before freeing the page. When being swapped off in swapoff it may be a little more complicated -- we may need to grab the pagecache_lock to ensure nobody starts using the page while we clear it.
So, Andrea went and did the following: All physical pages are now divided into active and inactive pages. These two are further divided into dirty and clean for both active and inactive. When the active dirty pages become about 66 percent of the total number of pages, the VM starts to scan them for the oldest ones to be put into inactive dirty and then, later still, from there to the swap when memory becomes tight. This part is very central to the new VM and its simplicity is...well, simply stunning.
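The list scheme in the paragraph above can be sketched in a few lines of C. This is an illustration only, not the kernel's code: the array-based page table, the names, and the choice to stop as soon as the active-dirty share falls back under 66 percent are all assumptions made for the example (the article gives the threshold but not the stop condition).

```c
#include <stddef.h>

/* The four lists described above, modeled as a per-page state. */
enum page_state { ACTIVE_CLEAN, ACTIVE_DIRTY, INACTIVE_CLEAN, INACTIVE_DIRTY };

struct page {
    enum page_state state;
    unsigned long last_used;    /* smaller value = older page */
};

/* Count the pages currently in state s. */
static size_t count_state(const struct page *pages, size_t n, enum page_state s)
{
    size_t c = 0;
    for (size_t i = 0; i < n; i++)
        if (pages[i].state == s)
            c++;
    return c;
}

/* While active dirty pages make up at least 66% of all pages, move the
 * oldest one to the inactive dirty list. Returns how many pages moved. */
size_t age_active_dirty(struct page *pages, size_t n)
{
    size_t moved = 0;

    while (n > 0 && count_state(pages, n, ACTIVE_DIRTY) * 100 >= n * 66) {
        size_t oldest = n;      /* n means "none found yet" */
        for (size_t i = 0; i < n; i++)
            if (pages[i].state == ACTIVE_DIRTY &&
                (oldest == n || pages[i].last_used < pages[oldest].last_used))
                oldest = i;
        if (oldest == n)
            break;
        pages[oldest].state = INACTIVE_DIRTY;
        moved++;
    }
    return moved;
}
```

With 10 pages of which 8 are active dirty, two calls through the loop move the two oldest pages, leaving 6 of 10 (60 percent) active dirty. Writing inactive dirty pages out to swap when memory gets tight would be a separate step not shown here.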
This elegant mechanism totally changes the behavior of the 2.4.10 kernel under heavy load and also makes for much better predictability of the system. Another very important change is that the swap is now additional to the RAM, just like in 2.2 times. All earlier 2.4 kernels (since 2.3.12) needed at least as much swap as RAM, and then more on top of that, before swap gave you any additional virtual memory. This meant that on an 8-GB server, you needed to put aside almost a full 9-GB disk just to be able to swap, similar to some versions of Solaris or other UNIXes.
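The swap arithmetic can be made concrete with a small sketch. The two functions below are a simplified model written for this example, not kernel code: one treats swap as purely additional to RAM, the other assumes that under the earlier 2.4 behaviour swap only helps once it exceeds RAM, so the usable total is roughly the larger of the two. The function names are hypothetical.

```c
/* Usable virtual memory in MB when swap is simply added to RAM
 * (the 2.2 and 2.4.10 behaviour described above). */
unsigned long vm_additive_mb(unsigned long ram_mb, unsigned long swap_mb)
{
    return ram_mb + swap_mb;
}

/* Simplified model (an assumption) of the earlier 2.4 behaviour: swap
 * duplicates what is in RAM, so only swap beyond the RAM size adds
 * anything. The usable total is then the larger of the two. */
unsigned long vm_early24_mb(unsigned long ram_mb, unsigned long swap_mb)
{
    return swap_mb > ram_mb ? swap_mb : ram_mb;
}
```

For the article's 8-GB server with a 9-GB swap disk, vm_early24_mb(8192, 9216) gives 9216 MB, only 1 GB more than RAM alone, while vm_additive_mb(8192, 9216) gives 17408 MB.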
Finally, the page scanner doesn't page scan if there are theoretically no freeable pages, whereas before it did. Oh, and the OOM killer never really worked, so Andrea disabled it, as I did for all my kernels. In 2.4.12 it is enabled again; this time, however, it works much better. Try it with the above program to see it in action.
Arcangeli's VM is stable, acts predictably -- something that the old VM never really achieved -- and it makes the swap space look like it did in 2.2 days. Additionally, the design is much simpler and easier to understand. People will catch up fast with it.
However, many kernel hackers disagree. Upon the release of kernel 2.4.10, a virulent and sometimes aggressive debate flamed up, with many people trying to show why one of the two was a good VM and the other not. Some comments got a bit out of control, and only in the last two weeks or so has some calm been restored.
However, one nasty side effect stays. Alan Cox, the number two man after Linus Torvalds, does not yet like the new VM and in his own kernel tree (called the "ac tree") he still continues to use and patch the old VM. As a consequence, users and system administrators now find themselves facing two very different kernel trees to choose from: the official Linux tree and the Alan Cox tree. Quite often, latest patches to drivers and new features are only in Alan Cox's tree. Those who want to go with the official Linux source code may find themselves unable to apply the patches due to the different VM code all over. It is acceptable for the two trees to be different for a few days on such important subsystems like VM, but it is not acceptable to have them different for months and across many kernel versions.
Nobody has yet dared to speak of a Linux source fork, but this is dangerously close to one.
It became obvious that the VM up to 2.4.10 was a design liability. You can try to fix something that was designed badly, but it will never become a beauty. I think Linus' decision to scrap the old VM and go with the Arcangeli VM was courageous and right. Having a functioning and stable Linux box should not be deferred to 2.5 when we can do it already with 2.4.
Kernel Preemption
But apart from the VM issues, there are other lively debates in the kernel community. There was an interesting interview at http://kerneltrap.com/article.php?sid=328&mode=thread&order=0 with Robert Love, who is leading one of two projects trying to make the Linux kernel fully preemptible. Making the kernel preemptible means making it possible to interrupt whatever the kernel is doing (say, executing a system call) to process some other outstanding task and then return to its original task. Linux, as a multiprocessing OS, obviously always did that for user-land processes. However, many, just like Robert Love, feel that the fact that Linux up to now would not let itself be interrupted contributed to poor latency. Latency describes how quickly you can expect a response from your kernel when you actually need something from it. Note that Linux is not designed as a real-time OS (though there is at least one Linux real-time implementation somewhere), and therefore does not explicitly guarantee latency. User-land programs must be aware of this as, especially with kernel preemption, latencies can be very unpredictable.
Theoretically, an OS will answer faster if it can be interrupted. What does suffer from kernel preemption is the global throughput. If you have a task that needs n seconds within the kernel to complete (let's say executing a given system call takes 0.005 seconds), then all the interruptions add some overhead to switch from one kernel task to another. So, finishing the execution of that system call (in our example) will finally require n + o*p, where p is the frequency of switching and o the static overhead for one switching operation. Notice that kernel context switching does not invalidate the CPU cache, and is therefore not as expensive as process switching. However, kernel preemption will surely lead to a higher rate of switching from kernel space to user space, because upon preemption the scheduler might decide to give higher priority to a user process.
In other words, kernel preemption does decrease latency but slows down overall throughput. It's the math: nothing to be done against it.
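As a rough illustration of the n + op estimate above, here is a minimal Python sketch. Only the 0.005-second system call comes from the text; the per-switch overhead and the switch counts are made-up assumptions, and p is read here as the number of switches during the call.

```python
def preempted_duration(n, o, p):
    """Time to finish a kernel task that needs n seconds of work,
    is switched out p times, and pays o seconds per switch: n + o*p."""
    return n + o * p


n = 0.005      # seconds of kernel work (the figure used in the text)
o = 0.000005   # assumed cost of one switch, 5 microseconds (hypothetical)

for p in (0, 10, 100):   # assumed switch counts (hypothetical)
    total = preempted_duration(n, o, p)
    extra = 100 * (total - n) / n
    print(f"p={p:3d}  total={total:.6f}s  overhead={extra:.1f}%")
```

The point is only that the extra cost grows linearly with the number of switches, so a task that is preempted often pays proportionally more.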
Furthermore, in his interview, Robert Love heavily criticized Linus Torvalds for adopting Andrea Arcangeli's new VM in 2.4.10 and dropping the old van Riel VM.
Well, I did try the patch with kernel 2.4.12 and with pre13. While accurate measurement (which Robert Love provides with the preemption kernel patches) does indeed report an improvement in latency, for the life of me I have not noticed it on an empirical basis.
I really do appreciate Love's work, but I do not fully agree with some of his comments in the interview. First, as Linus himself said, if latency sucks in the kernel then we should check why it sucks, with or without preemptive scheduling. If the latency is bad in the stock kernel, then it should be fixed anyway.
The preemptive kernel 2.4.12 worked fine on my laptops and on my SGI 550 workstation where I do interactive work. The MP3 player very rarely skipped beats when doing heavy background work such as kernel compiling or opening large files in the editor. But for my servers and clusters, the decrease in performance and the unpredictability of latency is a problem. Also, some important patches will not apply to a Love-patched kernel. Mosix, the clustering kernel extension, does not patch correctly, and neither do some versions of the LIDS intrusion detection system.
It is up to each individual user to decide whether or not to use the patch, but it is important to understand the implications of using it.
Linux and FreeBSD Revisited
Upon returning home the other week after meeting with Andrea, I went to my lab and searched for the disk images of the server comparison I ran back in January of this year (of FreeBSD 4.1.1 versus Linux 2.4.0). I took the Compaq ML500 server I have been reviewing (2x 1-GHz CPUs, 2-GB RAM) and upgraded both the FreeBSD disk image to 4.4-Stable and the Linux version to 2.4.12. Then, I changed the memory down to 192-MB RAM so as to stress the VM system more. I also upgraded to the latest stable versions of Sendmail (8.12.1) and MySQL (version 3.23.42). Finally, I compiled everything with the latest version of gcc, 3.0.2, and tuned the two instances to the best of my knowledge (softupdates and increased maxusers for FreeBSD, and untouched default values for Linux).
The results were very interesting indeed. Since this benchmark is too much to be handled in this article, Byte.com will post it here soon for you to read.
The story of this article is that the 2.4 kernel has finally grown up with the 2.4.10 release. Not many users outside the relatively small kernel community realize that. Now you know about it, too. Spread the good news and immediately install 2.4.12 on your busy server. The server will thank you for it.
Moshe Bar is a systems administrator and OS researcher who started learning UNIX on a PDP-11 with AT&T UNIX Release 6, back in 1981. Moshe has an M.Sc. and a Ph.D. in computer science and writes UNIX-related books.
For more of Moshe's columns, visit the Serving With Linux Index Page.
-
desktop cluster
this is small beans but we've currently got around 12 linux boxen on the desktop at my company. for fun we turned them into a mosix cluster which is rather nice whenever i need to do any compiling.. and you never know when you might need to do some after hours 3d rendering!
-
This is MOSIX
Distributed computing? Automatically deciding if a program should run locally or on a remote machine? Fault-tolerance? Dynamic load-balancing? Resource controls? Near-infinite scalability?
Sorry Microsoft, but you're the one playing catch-up here. Linux already has, working, today, 98% of your vision.
It's called MOSIX.
Frankly it's the most jaw-dropping bit of Linux development I've ever seen. On a local network, create your own supercomputer out of idle workstations. Across the internet - well, .NET should go hang its head in shame. As a programmer, all you have to do is write ordinary, threaded applications, and magically benefit from the processing power of tens, hundreds, even thousands of machines. MOSIX does all the hard stuff.
Truly an amazing piece of work.
-
Um, this isn't new...
OS research has been pursuing these goals for years. There's nothing there that's really very interesting or new. It sounds like they've just browsed the web for a little while and summarized what the various projects are striving for.
One project that's come pretty far is Mosix (I think they're planning to integrate bits into Linux 2.5, but I'm not sure). Then of course there's Plan 9 and Inferno from the fine folks who brought you Unix. And let's not forget Tanenbaum's Amoeba.
-
Re:What do you run on the darn thing ?
If you just want to tinker, and ball-busting computing power is unnecessary, you really shouldn't be building a Beowulf cluster.
Have a look at the info at MOSIX.org. They provide kernel patches && apps to build clusters too, but these clusters allow - depending on your configuration (how/who/what of each computer in the cluster) - processes to migrate to 'more idle nodes'.
From what i understand, you could set up your 'head or workstation' machine to let procs dribble off onto other nodes.
have a look - much more useful for geek exploration than beowulf... even if beowulf gets 'all the /. mentions'
-
What cluster?
-
Re:this brings up something..
This link describes how MOSIX can be best applied.
The best solution for any distributed computing problem depends on that problem. How CPU intensive is the job? How much data will need to be distributed to nodes. Do intermediate processing steps require intermediate answers from other nodes? How fast is the CPU? How fast is the network?
Basically, if you have a lot of processing that could be manually distributed to a bunch of hosts, via rsh or rlogin, then MOSIX can be used to easily manage/monitor that work with no coding. For harder problems that couldn't be manually distributed you might need MPI or PVM with special code in order to do the equivalent of "threaded" distributed computing.
Manual distribution is often easiest when you have network shared filesystems (NFS, etc.), and so is MOSIX.
MPI is short for Message Passing Interface. You can use MPI libs to do interprocess/interhost messaging and I/O on non-homogeneous networks. MPI does not require shared filesystems, though your own project might use them. MPI is certainly easier to manage when you have shared filesystems. One must be careful to consider the I/O time involved in network reads/writes. Also note that multiple network nodes will clobber each other's data if they all try to write to the same file over the network.
MPI or PVM is often used when the problem of breaking the job into pieces, or putting the results back together, is non-trivial. For instance, if you were doing very processor-intensive image processing on a large file, you may need to break the image into pieces, or tiles, and then distribute the processing of the tiles, with some processors sharing intermediate results, then stitch the results back into one file and finally write that file.
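To make the split/process/stitch idea concrete, here is a small plain-Python sketch. It is not MPI or PVM code: the tiles are processed one after another in a single process, the function names are made up for illustration, and the "processing" is a trivial pixel inversion standing in for the expensive work. It also leaves out the harder part mentioned above, where tiles need intermediate results from their neighbours.

```python
def split_into_tiles(image, tile_h, tile_w):
    """Cut a 2D list of pixels into ((row, col), tile) pairs."""
    tiles = []
    for r in range(0, len(image), tile_h):
        for c in range(0, len(image[0]), tile_w):
            tile = [row[c:c + tile_w] for row in image[r:r + tile_h]]
            tiles.append(((r, c), tile))
    return tiles


def process_tile(tile):
    """Stand-in for the expensive per-tile work: invert 8-bit pixels."""
    return [[255 - px for px in row] for row in tile]


def stitch(tiles, height, width):
    """Copy each processed tile back to its original position."""
    out = [[0] * width for _ in range(height)]
    for (r, c), tile in tiles:
        for i, row in enumerate(tile):
            out[r + i][c:c + len(row)] = row
    return out


def run(image, tile_h=2, tile_w=2):
    tiles = split_into_tiles(image, tile_h, tile_w)
    # On a cluster each tile would go to a different node;
    # here they are simply handled in a loop.
    done = [(pos, process_tile(tile)) for pos, tile in tiles]
    return stitch(done, len(image), len(image[0]))


if __name__ == "__main__":
    print(run([[0, 1, 2], [3, 4, 5], [6, 7, 8]]))
```

Each tile carries its own (row, col) origin, so the results can come back in any order and still be stitched correctly, which is what you want once the work is spread over several hosts.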
The approach that Unlimited Scale is using only makes sense in limited cases, i.e., when computers are "getting bogged down in processing interrupt requests from peripherals." In general, multithreaded processing, or even distributed processing, only makes sense when I/O time is dwarfed by CPU time. SMP machines have an advantage in that they have really fast communication between nodes, compared to Ethernet. Beowulf clusters have relatively slow communication between nodes. Beowulf clusters can only really be effective in the more CPU-bound and less I/O-intensive jobs. But if you have a job that can be run faster on a Linux cluster, you can save the big bucks on your initial hardware purchase. The more CPU-intensive the processing is, the slower the network you can put up with. SETI is a good example of this. If it takes 12 hours to process a packet of data, then I/O over a 56K modem is OK.
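A quick back-of-the-envelope check of the SETI example, as a Python sketch. The 12 hours and the 56K modem are from the comment above; the packet size is a made-up assumption (350,000 bytes) purely to have a number to plug in, and the modem is assumed to deliver its full nominal rate.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


def io_fraction(compute_seconds, size_bytes, link_bits_per_second):
    """Share of the total job time that is spent on the network."""
    t = transfer_seconds(size_bytes, link_bits_per_second)
    return t / (t + compute_seconds)


PACKET_BYTES = 350_000        # hypothetical packet size, not from the text
MODEM_BPS = 56_000            # 56K modem at its nominal rate (optimistic)
COMPUTE_SECONDS = 12 * 3600   # 12 hours of processing

if __name__ == "__main__":
    t = transfer_seconds(PACKET_BYTES, MODEM_BPS)
    share = io_fraction(COMPUTE_SECONDS, PACKET_BYTES, MODEM_BPS)
    print(f"transfer: {t:.0f}s, I/O share of job: {share:.2%}")
```

Under these assumptions the transfer is well under one percent of the job, which is why a slow link is tolerable. Shrink the compute time to a few seconds and the same packet over the same modem becomes mostly I/O.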
-
Re:this brings up something..
Try a MOSIX Cluster. This type of cluster spreads processes out to the machine with the least load. A Beowulf can be done, but to take advantage of it, you have to run custom software that is capable of parallel processing.
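A toy Python sketch of "send it to the machine with the least load". This is only a greedy placement loop with made-up node names and load numbers; it is not MOSIX's actual algorithm, which migrates already-running processes inside the kernel and weighs more than a single load figure.

```python
def pick_node(loads):
    """Return the name of the node with the lowest current load."""
    return min(loads, key=loads.get)


def place_processes(loads, costs):
    """Assign each process (given by its estimated load) to the least
    loaded node at that moment. Returns (placement, final_loads)."""
    loads = dict(loads)   # work on a copy, leave the caller's dict alone
    placement = []
    for cost in costs:
        node = pick_node(loads)
        loads[node] += cost
        placement.append(node)
    return placement, loads


if __name__ == "__main__":
    # Hypothetical cluster: node names and load values are illustrative.
    start = {"ws1": 0.2, "ws2": 1.5, "ws3": 0.9}
    placement, final = place_processes(start, [1.0, 1.0, 1.0, 1.0])
    print(placement)
    print(final)
```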
-
Re:Who does what?
Yes, MOSIX does it and is appropriately calling the creating machine ``home-node''.
-
Re:Who does what?
Oh. So, basically, it's just like Mosix? Doesn't sound too different to me.
-
consider Mosix
Load balancing via IP routing tricks is kind of nice. Mosix goes one step further and allows live processes to migrate across a cluster. Experimental add-ons also will do socket migration and have distributed file system support. I think that's the kind of approach to clustering you are going to see in the long run.