Rik van Riel on Kernels, VMs, and Linux
Andrea Scrimieri writes "
An interesting
interview with Rik van Riel, the kernel developer, in which he talks
about Linux's VM, particularly his own implementation (which was
recently adopted in Alan Cox's tree). With some controversy towards Linus
Torvalds.
"
>> (which was recently adopted in Alan Cox's tree).
As I understand it, the Rik VM is what we started the 2.4 series with.
The Andrea VM was adopted in 2.4.10 amidst much controversy, and Alan has kept
the Rik VM as part of the -ac kernels.
I'm wondering why both VMs can't be included in a distro, allowing the end user to select the one he/she wishes to compile into the kernel. Are the two implementations THAT mutually exclusive?
BTW, this kind of bashing between the high priests of Linux is not good. You can bet your bottom dollar that MS is going to use this conflict to fuel their propaganda machine, saying Linux is a fractious OS run by a bunch of young upstarts who can't agree on anything.
In the end they will lay their freedom at our feet and say to us, Make us your slaves, but feed us. - Fyodor Dostoyevsky
I'm not so sure I agree with him -- if you want to make a dent in the market shares of Solaris and NT/2000/XP you have to keep up with their innovations (Async-I/O, better SMP, etc.). As a user of Linux as our OS of choice for our database and web servers I am feeling a lot of pressure to switch to Solaris because of their better handling of higher-load environments (OLTP databases, web servers, etc.). If Solaris wasn't so damn expensive we'd probably be using SunFire 280's. So I'm pleading to keep up with the big dogs so that I can be reassured that Linux has what it takes (it's handling things fine now but as he said in the article, everyone needs more RAM, CPU, etc.).
Thanks,
--
Matt
I think it was an excellent decision of Linus to remove Rik's VM from the mainline kernel. If not for technical reasons then for political reasons.
Rik's VM obviously needed to be fixed and/or tuned, but apparently lacked the necessary attention from Rik. If Linus had not removed the VM, that would probably have remained the situation for a while. Instead we now have TWO VMs which are rather stable, and Rik working full speed to make his VM the best.
Competition is good! Which VM will be the best for the future will be determined by Survival Of The Fittest(tm)
It can be argued, though, that it was not the right time during 2.4, but Andrea's VM seemed to stabilise rather quickly with the high level of attention to the problem. Sometimes it takes drastic measures to get results...
I have a lot of respect for Rik van Riel, but I think that Linus made a good decision to "cut bait" on his VM implementation for 2.4.
It was not that Rik's ideas were bad, it was just that their complexity and implementation were going to take too long - they should have been hashed out in 2.3 instead of 2.4.10.
I'm looking forward to having Rik prove his reverse mapping technology implementation in 2.5.
May the best ideas ultimately win, and may the giants of the kernel not take offense at each other. It would be a real shame if something stupid like Linus' lossy source code control system put off Rik so much the Linux community at large lost his wonderful contributions.
Here's to hoping that Linus gets more sensitive in some cases, and that Rik gets less sensitive in some cases.
"Provided by the management for your protection."
I saw a post on the linux kernel news groups (you can search for it) about 2-3 weeks ago where Linus says something like "that's why I don't consider you a kernel developer". He always seems to be whining about something. But hey, what do I know, I'm still trying to get xscreensaver to do a mozilla -remote openurl (some url) for a kiosk :)
Anyways, an enlightening, no-holds-barred interview. Enjoyable.
Recall that "Linux" is owned by Linus. It's not inconceivable to envision a pissing match of the egos ending in "Cox' "rogue" kernel isn't true Linux. Rename it." one day.
Trolling is a art,
Then again, how much is RAM? I just bought 512MB for less than $40. I NEVER swap (running ~800MB RAM / server) and never want to. I ran my first e-commerce web site w/ SSL and mSQL on a 486SLC-20MHz with 5MB of 110ns DRAM. Yeah, it swapped. But these days, most machines are way overkill for serving web pages, files, and queries.
----- Refactoring is the reason why man does not mistake himself for a god.
Open Source's biggest PR dilemma is this sort of argument.
Make no mistake, every company has developers that do this. There are two differences in the Open Source world: 1) you can't just fire an Open Source developer who won't "play ball" with management's edict 2) it's usually public.
These are actually both really good things. The fact that you can't silence someone leads to repeated analysis of a problem. OSS' biggest benefit is that it brings massive peer review to bear not just on the code, but on the process.
The fact that it's public feeds into that, and is equally good.
The problem is PR. The Linux kernel is starting to look like anarchy to non-developers. I suggest that the process works, so we should all take a deep breath and leave it be. However, we all need to take the front lines on PR. Spin is all-important. This is not a "spat" or a "fight", this is "parallel development" and "peer review". The joy of this kind of spin is that, unlike most spin, it's TRUE! This guy is pissed at Linus. Linus has dumped his code. Yet, the two of them keep working hard to meet their customers' demands and producing what they feel is the best possible product.
Please, don't foster the idea that we're a bunch of anarchists producing code that's any less functional than the rest of industry, because quite the opposite is true.
I strongly feel that honesty wins in the end, because people aren't stupid. No one believes that IBM or Microsoft is one happy camp singing "we are the world."
It's great there is a lot of attention on the VM and intense effort to make it better. I have no doubt Linus and Rik are professionals and have no problems putting politics aside to get the job done. That is after all part of being a professional. Rik makes some good arguments, and given enough time and money he'll build the VM of his design. Will it matter 10 years from now? Most likely not. Development will continue and Linux will get better. Butting heads is part of the fun, because without conflict people tend to stagnate.
I know that it has nothing to do with the topic...
But while waiting for the page to load, I noticed that the extension was "htm", which led me to look up "linux.html.it" using netcraft and discover it running IIS. Go figure?
Label this a flame if you want but I was absolutely disgusted by the tone and tenor of Rik's responses in that article. Regardless of the technical merits of his code or algorithms, Rik's repeated attacks on Linus will certainly not move the operating system forward.
When the author of the article ended with "Thank you for your kindness and the opportunity to get to know you better", I almost fell out of my chair laughing. The only thing that stopped me was that Rik's behavior really isn't funny. It isn't professional and it has no place in the open source or any other community. It speaks volumes about Rik's emotional maturity or more accurately his lack thereof.
An honest environment -- such as fostered by "free" software -- is both good and bad. On one hand, I (as a programmer) am comforted to read the kernel mailing list and other resources that let me know exactly what is happening with my tools. I don't need to wonder what's happening with "free" software -- and this is more comforting to an engineer like myself than is the closed-door, silence-is-golden, hide-the-bugs policy of a Microsoft.
On the other hand: Show this interview to an MIS manager who needs 24/7/365 reliability, and she is going to be very nervous about deploying a Linux-based solution. You can talk until you're blue in the face about reliable distros and the open road to software quality -- what the MIS/corporate person sees is chaos and feels a lack of COMFORT.
"Out of sight, out of mind" is a philosophy many people adhere to, especially when dealing with complex issues they can not or do not want to grasp. From waste storage in Nevada to the war in Afghanistan, most people lack the time and initiative to understand what is really happening; they go on appearances and marketing, and ignore complex and disturbing facts.
Technology is no different. The MIS manager doesn't want to hear about VM conflicts or file system bugs or different kernels -- such issues are beyond their capability and desire to understand. Buying Microsoft is (or was, until recently) comforting, because no one ever saw the internal debates and code battles and what-not that any development team expresses. Even recent security disclosures about WinXP are unlikely to shake the faithful -- but those same people will run in fear from the blunt honesty of Linux.
Ignorance may be bliss, but it can also get you killed. I know people whose lives depend on cars, but they have no knowledge of how to check the oil. Most MIS managers simply want to drive software; if it looks good (like a Jeep Liberty), they don't pay attention to whether it is safe (the Liberty performs poorly on crash tests).
I doubt, however, we're going to change human nature -- and I'd rather have spirited debate and even some nasty contention if it means that people are striving to make Linux the "best" it can be.
All about me
Rik is an extremely bright (and likeable) guy, but his adherence to the OOM killer concept is disappointing. I've seen a lot of dumb ideas gain currency in the computing community or some part of it; OOM killer is the dumbest. If your process was allowed to exist in the first place, it should not be killed by the VM system. The worst that should happen is that it gets suspended with all of its pages taken away. If that doesn't free up any memory then neither would killing it (modulo some metadata - read on). If there are other processes waiting for the one that's suspended, then eventually they'll go to sleep, their pages will be released, and the suspended process will wake up - which won't happen if you killed it. There are only two differences between the two approaches:
The usual whine from OOM-killer advocates is that you can still get into a situation where all of that retained metadata clogs up the system and essential system functions can't allocate pages. However, that's preventable too. All you need to do is preallocate a special pool of memory that's only available for use by those essential system processes - either individually or collectively. The size of that pool and the exact details of how it gets allocated (e.g. which processes are considered essential) could be treated as site-specific tuning parameters. The same idea can then be further generalized to allow definition of multiple private pools, creating a semi-hard barrier between different sets of tasks running on the system (if you want one; the default pool is still there otherwise). This actually fits in very nicely with other things like processor affinity and NUMA-friendly VM, which I know because I once worked on a kernel that had all of these features.
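The reserved-pool idea above can be sketched in a few lines. This is only a toy model under stated assumptions: the `PagePool` class, its `essential` flag, and the page counts are hypothetical names invented for illustration, not anything taken from Linux or from the kernel the poster worked on.

```python
class PagePool:
    """Toy page allocator with a reserve that only essential tasks may use.

    Hypothetical illustration: pages are just counters, and freeing is
    left out to keep the sketch short.
    """

    def __init__(self, total_pages, reserved_pages):
        if not 0 <= reserved_pages <= total_pages:
            raise ValueError("reserve must fit inside the total")
        self.general_free = total_pages - reserved_pages
        self.reserved_free = reserved_pages

    def allocate(self, pages, essential=False):
        """Return True if the request was satisfied, False if the caller must wait."""
        if pages <= self.general_free:
            # Ordinary case: everyone draws from the general pool first.
            self.general_free -= pages
            return True
        if essential and pages <= self.general_free + self.reserved_free:
            # Essential tasks may drain the general pool and then dip into the reserve.
            from_reserve = pages - self.general_free
            self.general_free = 0
            self.reserved_free -= from_reserve
            return True
        # Non-essential tasks are refused (they would be suspended, not killed).
        return False


pool = PagePool(total_pages=100, reserved_pages=10)
print(pool.allocate(90))                  # ordinary task takes the whole general pool
print(pool.allocate(1))                   # next ordinary request has to wait
print(pool.allocate(5, essential=True))   # essential task is served from the reserve
```

The reserve size is the site-specific tuning parameter the comment mentions; several such pools would give the "semi-hard barrier" between sets of tasks.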
In short, there's no need for the OOM killer. Plenty of systems, many of which handle extreme VM load much better than Linux, have been implemented without such a crock. Rik contends that a lot of people make suggestions without actually understanding the problem, and he's right, but I also submit that sometimes he also rejects suggestions from people who do know what they're talking about. This row has been hoed before, and Rik's smart enough that he should know to avoid the NIH syndrome that afflicts so many of the other Linux kernel heavyweights.
Slashdot - News for Herds. Stuff that Splatters.
Actually windows will do an OOM kill.. I've seen it.
As for overcommitting: it's needed; in fact Solaris and BSD both do it. The main problem is that it's hard to fix the OOM killer without getting the VM down to an art.
It started out with Rik's VM in the kernel, since it was a promising new development. However, once it was in Linus's kernel, the fact that Rik's development style was not compatible with Linus's source control style became an issue, because the VM wasn't getting updated in Linus's tree.
So Linus switches to the other VM, which is based more on the original. This means that Rik can do his development without dealing with Linus and the Linus tree can have an up-to-date VM. When Rik's is to the point where he's really happy with it and he doesn't think he'll have to make a lot of patches (and it does all the things he wants), it will probably go back in.
Since then, Rik and Linus have figured out (hopefully) how their interaction failed to work, and what Rik has to say along with his patches to make Linus know they're worth looking at. It turns out that it is possible to automate this process, such that a script will send the patches when appropriate, with the right assurances of freshness (having actually tested them, of course).
Linus wants to be able to ignore any patch that isn't for the part he's thinking about at the time (e.g., non-block-i/o patches around the beginning of 2.5). When it becomes interesting again, however, the original patch may not be right any more. Having not looked at the patch at the time when it was sent, Linus can't determine whether it is still good, since the author may have found bugs, and he doesn't know exactly what the patch was supposed to do. He wants the author to make any updates needed and resend it. It may be, of course, that the patch doesn't need to be changed, and the author doesn't have a new and better patch, but Linus can't tell unless the author sends it again with a note that it's still good.
So Rik's patchbot will test whether the patch still applies and still works, and has not been replaced by a new version, and then will send it again until Linus actually looks at it. This seems to me like a good plan, since it doesn't require Linus to test everyone's old patches and have a complicated mail system. And Linus won't accidentally apply the wrong version of a patch or be unable to find a patch.
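The decision the patchbot has to make each time it runs can be sketched roughly as follows. This is a guess at the logic from the description above, not Rik's actual script: the `Patch` record, the `next_action` function, and the `applies` / `passes_tests` callables are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Patch:
    name: str
    version: int
    diff: str


def next_action(
    patch: Patch,
    latest_version: int,
    applies: Callable[[Patch], bool],
    passes_tests: Callable[[Patch], bool],
    acknowledged: bool,
) -> str:
    """Decide what to do with a pending patch on this run of the bot."""
    if acknowledged:
        return "done"            # the maintainer has looked at it; stop resending
    if patch.version < latest_version:
        return "superseded"      # the author has a newer version; never resend the old one
    if not applies(patch) or not passes_tests(patch):
        return "needs-update"    # the tree moved on; the author must fix it first
    return "resend"              # still fresh and still good: send it again


patch = Patch(name="vm-fix", version=2, diff="...")
print(next_action(patch, 2, lambda p: True, lambda p: True, acknowledged=False))
```

Passing the "does it apply" and "does it work" checks in as callables keeps the sketch independent of any particular patch tool or test suite.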
...how a lot of kernel developers seem to talk mainly about how well their patches help a server withstand slashdotting? ;)
I mean... look at that bloke who posted a scheduler patch in Kernel Traffic, and now Rik... both mention a certain website. I dunno if this is a good or bad thing
A VM is basically a small thing: a list of pages, every page has a set of properties and an interface on top of that to get things done with the pages (claim/free/mark dirty etc). I wrote one on an MSX2 in 1986 for having 256KB roms in 128K ram + 128K vidram (and 32K disk :)). Of course, modern OSes need a VM that can take decisions, is scalable on different hardware, and can handle the requests fast.
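Taken literally, the "list of pages plus a small interface" description above might look something like this. It is a deliberately naive sketch: `Page`, `PageList`, and the method names are hypothetical, and it has none of the decision-making or scalability a real VM needs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    owner: Optional[int] = None   # pid of the owning process, None when free
    dirty: bool = False           # True once the page has been written to


class PageList:
    """A fixed list of pages with claim / mark_dirty / free operations."""

    def __init__(self, count: int) -> None:
        self.pages: List[Page] = [Page() for _ in range(count)]

    def claim(self, owner: int) -> Optional[int]:
        """Give the first free page to `owner`; return its index, or None if full."""
        for index, page in enumerate(self.pages):
            if page.owner is None:
                page.owner = owner
                return index
        return None

    def mark_dirty(self, index: int) -> None:
        if self.pages[index].owner is None:
            raise ValueError("page is not claimed")
        self.pages[index].dirty = True

    def free(self, index: int) -> None:
        # Reset the slot so the page can be claimed again.
        self.pages[index] = Page()


memory = PageList(count=2)
first = memory.claim(owner=1)
memory.mark_dirty(first)
memory.free(first)
```

Everything hard about a real VM (which page to evict, when to write dirty pages back) lives in the policy that this sketch leaves out.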
A lot of research has been done on virtual memory and the manager code for this type of memory. Also, a lot of different types of VMs are implemented in different OSes, all with pros and cons in different situations. It's therefore not hard to dig in and get the knowledge you need.
For example: the rmap stuff is a no-brainer. If you let the VM handle every request to share/allocate a mempage, that VM can keep a set of pids per page. IIRC NT's VM (VMM32) does this. That the current VM in Linux doesn't already have this feature is beyond me.
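The "set of pids per page" bookkeeping described above can be sketched like this. It illustrates the comment's description only, not the actual rmap patch; the `ReverseMap` class and its method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set


class ReverseMap:
    """Track, for every physical page, which processes currently map it."""

    def __init__(self) -> None:
        self._users: Dict[int, Set[int]] = defaultdict(set)

    def map_page(self, page: int, pid: int) -> None:
        self._users[page].add(pid)

    def unmap_page(self, page: int, pid: int) -> None:
        users = self._users.get(page)
        if users is None:
            return
        users.discard(pid)
        if not users:
            # No process uses the page any more, so drop the entry.
            del self._users[page]

    def users_of(self, page: int) -> FrozenSet[int]:
        """Answer 'who maps this page?' without scanning every process."""
        return frozenset(self._users.get(page, ()))

    def is_free(self, page: int) -> bool:
        return page not in self._users


rmap = ReverseMap()
rmap.map_page(page=7, pid=100)
rmap.map_page(page=7, pid=200)
print(sorted(rmap.users_of(7)))
```

The cost is the extra bookkeeping on every map and unmap, which is the usual argument against keeping such a structure.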
Never underestimate the relief of true separation of Religion and State.
First off, an honest account of a person's feelings is not a personal attack. In fact, it has nothing to do with the other person at all. Nobody can "make" another person feel a particular way. A feeling is simply what a person has. The question is what the person does with that feeling. Rik van Riel seems to be doing what any dedicated, driven, psycho-geek would do - he's making his VM the best he possibly can.
Second, there are MANY possible resolutions to the purported conflict. The idea of having modular VMs is extremely sound, and likely to be implemented at some point. In the same way that networking QoS code supports multiple methods, in parallel, with rule-matching to determine the "probably best" solution, I could easily see a "meta-VM" engine which used a similar system to drive multiple VMs which could "steal" memory off each other, as needed, to run the most programs closest to optimally.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
The problem is that there isn't a decent multi-patch versioning system out there
Uh, yes there are. Perforce, aide-de-camp, bitkeeper, and others all do this just fine. I haven't used squeak much, but I think this is also how the built-in version control in their smalltalk image works as well. Every change management system that uses changesets works pretty much exactly this way.
CVS basically sucks, which is why some people are trying to replace it. It only gets used because it is popular and free, not because it is technically superior. The only thing it is better than is RCS/SCCS. Every other possible solution is no worse, and usually much better, than CVS.
"Will it matter 10 years from now? Most likely not."
And I really hope that the reason it doesn't matter is because our systems will have nearly unlimited memory. You don't really need a VM if you don't have limitations on memory.
Think I am crazy? Think back to what kind of system you were using 10 years ago, and how quickly we have gotten to where we are. What do you think will happen in another 10 years?
Hopefully, we won't even need a VM in 10 years.
My beliefs do not require that you agree with them.
> there is indeed a /proc entry (/proc/sys/vm/overcommit_memory)
That setting doesn't work properly. Linux will just overcommit slightly less.
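For anyone who wants to see what their own box is set to, a small sketch for reading that /proc entry follows. The `describe_overcommit` helper is a hypothetical name, and the mode descriptions follow current kernel documentation, which may not match how the 2.4-era kernels discussed here behaved.

```python
from pathlib import Path

OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

# Meanings as described in current kernel documentation (an assumption for
# the kernels discussed in this thread, which may have behaved differently).
MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "strict accounting (no overcommit)",
}


def describe_overcommit(path=OVERCOMMIT_PATH):
    """Read the overcommit setting and return (mode, description)."""
    mode = int(Path(path).read_text().strip())
    return mode, MODES.get(mode, "unknown mode")


if OVERCOMMIT_PATH.exists():
    print(describe_overcommit())
```

The path is a parameter so the function can be pointed at a copy of the file on systems that have no /proc.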
It's not so much "mutually exclusive," it's more like "they both rewrite the same chunks of code." Maybe I'm splitting hairs there. AFAICT, the amount of common code between the two isn't enough to make this worth it.
The result is that the kernel hackers aren't concerned about, say, code size, as much as they're worried about readability and maintainability. The number of #ifdef's scattered throughout the VM code would be incredible, the resulting total code would look like Your Favorite Form of Pasta[tm], and fixing bugs would be difficult.
There are other ways to do it besides #ifdef, of course, but they all detract from maintainability. And it all becomes vastly difficult to scale as soon as a third VM implementation comes along...
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Actually windows will do an OOM kill.. I've seen it.
So have I. It's called "Blue Screen of Death".
Fascism starts when the efficiency of the government becomes more important than the rights of the people.
Think back to what kind of system you were using 10 years ago, and how quickly we have gotten to where we are.
Ten years ago I was running a 486/66 with 16M of RAM, and a 340M hard drive that eventually filled up. Right now I'm on an Athlon 950 with half a gig of ram, and ~50 gigs of hard drive that's just about full. The programs you run are always going to drive development of the equipment you run it on. It doesn't matter how much space you have- you'll fill it.
"If he thinks he can hide and run from the United States and our allies, he's sorely mistaken." Bush on bin Laden
Yes you are trolling! Especially since the original poster could not get the quote right.
..."
... who is being immature now?
Linus wrote: "...Which, btw, explains why I don't consider you a kernel maintainer, Rik,
See for yourself!
The reason was that Rik didn't care about everybody else if his bugfixes were not applied. Would YOU like a maintainer that didn't care about the rest of the world?
"I would tell Linus to fuck off"
Am I the only one who's spent more time reading the Linux Kernel Mailing List than slashdot recently *because* of the feuding and flaming that's going on? All the patches, bug reports, insults, ideas, and philosophical asides are like a soap opera (with diffs). Okay, I admit I'm addicted to reading through diffs that I have no idea what they're doing, but it makes me *feel* smarter.
About the only thing I didn't like was Linus' rambling evolution thread. Personally, I'm on Andrea's side in the VM wars, but I think it's because he had a clever flame or two a while back. Plus, I've had to build kernels for two friends with 2.4.13 & 15 who were having problems with memory with older 2.4.x's (probably Red Hat's problems), but since Rik's siding with Red Hat, that's another strike against him. I don't run a data warehouse, and I hate xinetd, and am still bitter over the RPM incompatibilities between 3 & 4.
There's a lot of back-and-forth discussion, not only on the VM, but on the feature (un)freeze of 2.4/2.5, and on how Linus is a lousy patch control system. But maybe that's not the most important thing here.
Way back when, the purpose of a development kernel was to feed things in to a stable kernel tree. Now part of the problem has to be that Linus started 2.4 way before 2.3.X was ready for it, but it looks like history is repeating itself. 2.4 isn't all that stable, even now, but Linus is happily accepting lots of new goodies to play with in 2.5.
Something is not working right here. Is Linus less demanding of quality now, since he's willing for somebody else to come in and fix up the allegedly stable kernel tree? Or is he accepting too many things to allow a development tree to stabilize?
I suspect it's a combination of too much stuff and too big a kernel. Instead of the heady days of 2-3 kernels per week in the development tree, and the stable tree gets another kernel every week or two, now we have a development kernel every week or two and a stable patch every month or three. And the kernel size is 10x bigger than in the 1.0 days.
Look how long it took the USB stuff to filter through the development into the stable tree.
It seems obvious the Linus Linux development process is not scaling. I'm not sure what the answer will turn out to be, but it may be some combination of the following:
(1) More "boutique" kernels like Alan Cox's ac series, feeding into the "stable development" kernels that Linus has been generating.
(2) More formal check-in methods, a la CVS commit. This may take some developer training in how to use CVS -- does anyone want to offer Linus a course and set up a server for him? I bet he'd take a complimentary Geek Cruise!
(3) Some kind of more rigorous control in the stable kernel tree. I suppose you could say Redhat and SuSE are doing this informally now; if they start coordinating their efforts, and get IBM involved, the kernel will be incredibly stable. And even more incredibly slow to update.
(4) More beta testers to crack the newer kernels. This is going to get harder, as more of us need to get work done on our Linux boxes. It used to be a hassle when Linux crashed; now it's not acceptable any more!
(5) Better ways for these users to track down problems and report bugs. This last week I heard myself say, "Try rebooting your Linux box and see if the problem goes away." I just don't have the time, energy, knowledge, and skills to deal with lusers' "I've got a problem" whines any more.
(6) Is the quality of kernel patches too low? Do we need to develop some regression tests for the kernel, which a patch would have to pass before it would be accepted? (And how do you do a regression test program of this magnitude without Microsoft's beta testers, AKA customers?)
Anybody want to contribute more ideas to the list? We can spam Linus with them until he agrees!