Rik van Riel on Kernels, VMs, and Linux
Andrea Scrimieri writes "
An interesting
interview with Rik van Riel, the kernel developer, in which he talks
about Linux's VM, particularly about his own implementation (which was
recently adopted in Alan Cox's tree), with some controversy directed at Linus
Torvalds.
"
I'm wondering why both VMs can't be included in a distro, allowing the end user to select the one he/she wishes to compile into the kernel. Are the two implementations THAT mutually exclusive?
BTW, this kind of bashing between the high priests of Linux is not good. You can bet your bottom dollar that MS is going to use this conflict to fuel their propaganda machine, saying Linux is a fractious OS run by a bunch of young upstarts who can't agree on anything.
In the end they will lay their freedom at our feet and say to us, Make us your slaves, but feed us. - Fyodor Dostoyevsky
This talk of "Alan's tree" and "Linus's tree" is kind of foreboding. A de facto fork has already taken place.
What would Alan call his version of the kernel? His last name already ends with an "X" so... I dunno where that would leave us.
Yeah, better off to just keep referring to them as "Alan's tree" and "Linus's tree."
It's about time rmap VM was developed and integrated into the kernel. Together with the O(1) scheduler and low-latency patches, it will be a great advance for the 2.5 kernel.
But sooner or later OOM due to memory overcommit will have to be solved properly (by not overcommitting). The OOM killer is just a hacky solution (even Windows doesn't have such a hack).
CaptnMArk (forgot my password right now)
I know that it has nothing to do with the topic...
But while waiting for the page to load, I noticed that the extension was "htm", which led me to look up "linux.html.it" using Netcraft and discover that it is running IIS. Go figure?
I think you are right that Linus could probably have taken more patches from Rik than he did.
On the other hand, you could argue that too many patches are a sign of the VM not being stable enough. Therefore the VM should probably be matured in a separate tree (as Rik himself suggests) instead of flooding Linus with bug fixes and tweaks. Then, when the VM is stable and can be proven to perform better, I am sure even Linus can be persuaded.
And yes, a one-man control system IS lossy, but that is not a bug but a feature, because it ensures consistency. In every project of this scale coordination is essential, and the individual developers MUST be more thorough with their work before committing it!!!
Somebody has to speak up against Linus. Linus is not a god. The man makes mistakes. And over the last few years it has become increasingly a problem that "Linus doesn't scale".
Linus however continues to develop the kernel pretty much the same way he started doing it ten years ago. And not many people think that's a problem. Rik does (AFAIK). And I tend to agree with Rik: the current system just isn't working very well. It's not very bad, but it certainly isn't optimal, IMHO.
However, remaining silent doesn't solve the problem. Somebody has to speak up.
This is your sig. There are thousands more, but this one is yours.
Or did you mean that "the process's pages will be swapped?" Even if you did mean that, my understanding is that the OOM killer only takes effect when there is no memory space left - including swap. In this scenario, there isn't much to be done should the system need more memory to continue - you either kernel panic or you find some process to kill and kill it. In an extreme circumstance like OOM on a kernel alloc, I see nothing wrong with deciding to kill a process. I really don't see how "suspending" on a process solves memory issues since it still needs its pages somewhere...
My understanding was that the idea behind the OOM killer was to prevent the kernel from panicking and instead leave a working system which needs to have its memory problems worked out. I could be wrong since I haven't really looked into the OOM killer and when it's invoked.
You are in a maze of twisty little relative jumps, all alike.
There is indeed a /proc entry (/proc/sys/vm/overcommit_memory) to disable VM overcommit. In that case, it's impossible to reach the scenario where the OOM killer is needed (some process gets a null from malloc instead).
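For anyone who wants to check what their own box is set to, here is a minimal sketch that reads the /proc entry mentioned above. The function names are made up for illustration, and the meanings in the table are the ones documented for current kernels; older kernels may interpret the value differently.

```python
from pathlib import Path

# Assumption: these are the meanings documented for current kernels;
# older kernels may treat the value differently.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the integer stored in the file, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_overcommit(mode):
    """Map a mode number to a short description (hypothetical helper)."""
    return OVERCOMMIT_MODES.get(mode, "unknown or unavailable")


if __name__ == "__main__":
    mode = read_overcommit_mode()
    print(mode, "->", describe_overcommit(mode))
```

On a system without /proc the reader simply returns None rather than raising.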
However, as it stands, Linux by default is willing to overcommit (via copy-on-write). This is a good and beneficial thing: when one forks, the pages don't need to be allocated and copied until they are changed (as most never are). This saves memory, saves time, and vastly improves scalability of many tasks. Ditto for many, many other situations. But it means that when everything goes to hell in a handbasket, you have promised memory to processes that you simply do not have, and you've already told them they can have it. So you have to produce something, and that means someone gets tossed.
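A rough sketch of the fork behaviour described above, assuming a Unix-like system where os.fork() is available. It only shows the visible effect (the child's write never reaches the parent's copy); the lazy page copying itself happens inside the kernel and is not directly observable from here. The function name is hypothetical.

```python
import os


def fork_and_modify(size=1_000_000):
    """Fork, let the child overwrite one byte, and report what each side sees."""
    data = bytearray(b"a" * size)   # after fork, the child starts out sharing these pages
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the first write to a shared page is what forces a private copy.
        os.close(read_fd)
        data[0] = ord("b")
        os.write(write_fd, bytes(data[:1]))
        os._exit(0)
    # Parent: read what the child saw, then reap it.
    os.close(write_fd)
    child_first = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data[:1]), child_first


if __name__ == "__main__":
    parent_first, child_first = fork_and_modify()
    print("parent sees", parent_first, "child saw", child_first)
```

The point for the overcommit argument is that the fork itself succeeds without a second megabyte being set aside; the bill only comes due for pages that actually get written.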
As far as reserving special memory, the mlock call does just that. It tells the VM that these pages can't be messed with; they need to be ready immediately.
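A small sketch of calling mlock from user space, here through ctypes. It assumes a Unix-like system where the C library's symbols are reachable via ctypes.CDLL(None); the helper name lock_buffer is made up. The call can legitimately fail (for example when the locked-memory limit is too low), so the sketch reports the errno instead of assuming success.

```python
import ctypes


def lock_buffer(size=4096):
    """Try to mlock() a freshly allocated buffer; return (locked, errno)."""
    # Assumption: libc symbols are visible in the running process.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int
    libc.munlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.munlock.restype = ctypes.c_int

    buf = ctypes.create_string_buffer(size)
    addr = ctypes.addressof(buf)
    if libc.mlock(addr, size) != 0:
        return False, ctypes.get_errno()
    # Unlock again so the sketch leaves nothing pinned behind.
    libc.munlock(addr, size)
    return True, 0


if __name__ == "__main__":
    print(lock_buffer())
```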
You can't just suspend, because you already did that - OOM doesn't occur until you are also out of swap. OOM is a last-ditch, we have *nowhere* left to put this. If you ever see OOM, you need more swap. Simple as that.
(Now, one thing that would be very nice would be dynamically resizing swapfiles, so that if you had disk space left not currently being used for swap, the swap area could grow. But even then, there is such a thing as out of disk as well. The only way to completely avoid OOM is to avoid overcommit/copy-on-write and allocate any pages that could potentially be used by a call every time (even when, as in fork/exec, they very rarely are). That way you could make the calls fail in this worst of worst cases and the applications could respond.)
The Matrix is going down for reboot now! Stopping reality: OK. The system is halted.
RTI - Read The Interview.
...Rik's repeated attacks on Linus will certainly not move the operating system forward.
... Yes, though I guess I have to add that I have a lot of respect for Linus. He is a very user unfriendly source management system, but he is also very honest about it.
Rik was interviewed in order to get insight into how he thinks/sees things, no? So if he doesn't like the way Linus does things, is he not at liberty to say so? (also, see quote below about still having respect for Linus in spite of their disagreements/conflicts)
Rik's behavior really isn't funny... It speaks volumes about Rik's emotional maturity or more accurately his lack thereof.
Rik says:
With Linus out of the way, I can make a good VM. I no longer have to worry about what Linus likes or doesn't like.
I don't quite think that qualifies as immature - granted, there is a lot of conflict going on, but they still have respect for each other, even if Rik doesn't like to work with him, and there's not really anything showstopping about it. The VM situation wasn't pretty, but it's being resolved.
You might (or might not) be overcommitting. Up to you. However, even if you are, you're not waiting until the last second and then going postal instead of taking concrete steps sooner to avoid total memory exhaustion. For example you could say that, once you start dipping into the overcommit pool, fork() will start failing but existing processes can continue. You could say that only certain processes that are being allowed to run to completion will be able to allocate new swap space; anyone else will just get suspended the first time they try. Once you have set a high watermark somewhere short of total exhaustion, you can do any number of things, even if you're overcommitting. Some of those measures are pretty drastic, but still better than the OOM killer.
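To make the high-watermark idea concrete, here is a toy model of one such policy: new processes are refused once committed memory would pass the watermark, while existing processes may keep growing until memory is truly gone. The class name, method names, and page counts are all invented for illustration; this is not how any particular kernel does its accounting.

```python
class WatermarkAccountant:
    """Toy model: refuse new processes once commitments pass a high watermark."""

    def __init__(self, total_pages, high_watermark):
        if not 0 < high_watermark <= total_pages:
            raise ValueError("watermark must be within (0, total_pages]")
        self.total_pages = total_pages
        self.high_watermark = high_watermark
        self.committed = 0

    def fork(self, pages):
        """Admit a new process only if it keeps us at or below the watermark."""
        if self.committed + pages > self.high_watermark:
            return False
        self.committed += pages
        return True

    def grow(self, pages):
        """Existing processes may allocate until total memory is exhausted."""
        if self.committed + pages > self.total_pages:
            return False
        self.committed += pages
        return True


if __name__ == "__main__":
    acct = WatermarkAccountant(total_pages=100, high_watermark=80)
    print(acct.fork(70), acct.fork(20), acct.grow(20), acct.committed)
```

The gap between the watermark and the total is the headroom that lets running work finish instead of getting shot.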
To a certain extent, perhaps, these "softer" approaches just slow down what might be an inevitable march to OOM. In theory, you could still reach the total-exhaustion deadlock that the OOM killer is supposed to deal with, although it really doesn't, because it doesn't in any way guarantee that your system will really be any more usable than if the deadlock had occurred. In practice, though, you'd be hard pressed to find a system that (a) allows overcommit, which is only necessary with VM systems that are broken (wrt how much swap they allow) to start with, (b) takes such drastic measures before going OOM, (c) does in fact hit OOM anyway, and (d) would benefit from an OOM killer if it had one. Without such an existence proof, claims that an OOM killer is necessary are pretty bogus.
As I've said, these aren't new ideas just off the top of my head. These are approaches that are proven to work. Ask yourself: how is it that so many systems get by just fine without an OOM killer? There are answers out there.
Actually I didn't. I accused other Linux kernel hackers of NIH, and tried to warn Rik about becoming more like them. I know Rik's smarter than that, but sometimes even smart people submit to "common nonsense".
Slashdot - News for Herds. Stuff that Splatters.