Slashdot Mirror


Rik van Riel on Kernels, VMs, and Linux

Andrea Scrimieri writes "An interesting interview with Rik van Riel, the kernel developer, in which he talks about Linux's VM, particularly his own implementation (which was recently adopted in Alan Cox's tree), with some controversy directed at Linus Torvalds."

17 of 233 comments

  1. Minor nit... by FauxPasIII · · Score: 3, Informative

    >> (which was recently adopted in Alan Cox's tree).

    As I understand it, the Rik VM is what we started the 2.4 series with.
    The Andrea VM was adopted in 2.4.10 amidst much controversy, and Alan has kept
    the Rik VM as part of the -ac kernels.

    --
    25% Funny, 25% Insightful, 25% Informative, 25% Troll
    1. Re:Minor nit... by FauxPasIII · · Score: 2, Informative

      Hate to follow up to myself, but I went and reread some old stuff; looks like Alan has actually moved -from- the Rik VM now, and is using the same Andrea VM that's in the Linus kernel.

      --
      25% Funny, 25% Insightful, 25% Informative, 25% Troll
    2. Re:Minor nit... by Rik+van+Riel · · Score: 5, Informative
      Both Alan's and Michael's kernels are including my -rmap VM now.

      This is quite interesting since I haven't begun tuning -rmap for speed yet ;)

  2. Rik a prime developer Linus don't think so by mab · · Score: 2, Informative

    I saw a post on the Linux kernel newsgroups (you can search for it) about 2-3 weeks ago where Linus says something like "that's why I don't consider you a kernel developer". He always seems to be whining about something. But hey, what do I know, I'm still trying to get xscreensaver to do a mozilla -remote openurl (some url) for a kiosk :)

    1. Re:Rik a prime developer Linus don't think so by xphase · · Score: 2, Informative

      See this link and scroll down a bit for this quote from Linus:

      "Which, btw, explains why I don't consider you a kernel maintainer, Rik, and I don't tend to apply any patches at all from you. It's just not worth my time to worry about people who aren't willing to sustain their patches."

      --xPhase

      --
      The following sentence is TRUE. The previous sentence is FALSE.
  3. Patch bot is the answer? by imrdkl · · Score: 2, Informative
    Rik seems pretty hot on this idea, but I don't see how it could help much. I mean, won't repeated email be ignored nearly as much as the initial submission? I recall in the interview with Marcelo that he did not plan to use either public CVS or a tracking system, but rather planned to keep things close to the vest as in the past. Perhaps this is the persona of a kernel-master, or perhaps openness and publicity lead to more interruptions, I dunno.

    Anyways, an enlightening, no-holds-barred interview. Enjoyable.

    1. Re:Patch bot is the answer? by Rik+van+Riel · · Score: 5, Informative
      The problem is simple: maintainers of any parts of the kernel get flooded by email, maintainers of the whole kernel (Linus, Alan, Marcelo) get flooded even worse.

      You really cannot expect these people to read all their email all the time, so patches and bugfixes get lost and may need to be resent various times before they get noticed.

      Add to that the fact that many of the people writing these patches are also extremely busy and may not get around to resending the patch all the time (I know I don't).

      The solution here would be to have the patch re-sent automatically as long as it still works ok with the latest kernel version ... this can all be checked automatically.
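
      A rough sketch of that automatic check, not an existing tool. It assumes GNU patch is installed and treats "applies cleanly in a dry run" as a stand-in for "still works ok", which is weaker than actually building or booting the result. The names dry_run, next_action and check_patch, and the "notify-author" outcome, are made up for illustration.

```python
import subprocess
from pathlib import Path


def dry_run(tree: Path, patch_file: Path, reverse: bool = False) -> bool:
    """Ask GNU patch whether patch_file would apply inside tree; nothing is modified."""
    cmd = ["patch", "-p1", "--dry-run", "--force"]
    if reverse:
        cmd.append("-R")  # a clean reverse apply suggests the patch is already merged
    cmd += ["-i", str(Path(patch_file).resolve())]
    return subprocess.run(cmd, cwd=tree, capture_output=True).returncode == 0


def next_action(applies_cleanly: bool, already_merged: bool) -> str:
    """Decide what the (hypothetical) bot does with one pending patch."""
    if already_merged:
        return "drop"            # nothing left to resend
    if applies_cleanly:
        return "resend"          # mail it to the maintainer again
    return "notify-author"       # no longer applies; a human has to rework it


def check_patch(tree: Path, patch_file: Path) -> str:
    """Run both dry runs against the latest tree and pick an action."""
    merged = dry_run(tree, patch_file, reverse=True)
    applies = dry_run(tree, patch_file)
    return next_action(applies, merged)
```

      Run check_patch against a freshly unpacked tree each time a new kernel version comes out; only the "resend" result would generate mail to the maintainer.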

    2. Re:Patch bot is the answer? by _Quinn · · Score: 3, Informative

      cvs co -r 2.5.2
      # patch mjc-1
      cvs tag -b 2.5.2-mjc
      cvs tag mjc-1
      cvs commit
      # elsewhere/when
      cvs co -r 2.5.2-mjc
      # patch mjc-2
      cvs tag mjc-2
      cvs commit

      cvs co -r 2.4.13
      # patch ac-1
      cvs tag -b 2.4.13-ac
      cvs tag ac-1
      cvs commit
      # elsewhere/when
      cvs co -r 2.4.13-ac
      # patch 2.4.13-ac2
      cvs tag ac-2
      cvs commit

      # assuming that Rik's VM patches are independent
      cvs co -r 2.5.2
      cvs co rvr-VM
      # patch rvr-VM
      # or, maintain Rik's VM patches as their own
      # files:
      # cvs co rvr-VM
      # cvs update # forces merge
      cvs tag -b 2.5.2-rvr-VM
      cvs tag rvr-VM-1
      cvs commit
      # elsewhere
      cvs co -r 2.5.2-rvr-VM

      Why wouldn't something like this work? You could even wrap everything up in a nice GUI if you wanted to. :)

      -_Quinn

      --
      Reality Maintenance Group, Silver City Construction Co., Ltd.
  4. Re:Multi-proc 'big iron'.. by Rik+van+Riel · · Score: 5, Informative
    Indeed, it is important to optimise the VM to work right on such large machines. I guess what I wanted to say is that the VM isn't just optimised for high-end machines, but also for machines on the low end.

    To be honest though, optimising for machines of different sizes really is a no-brainer compared to having to make the VM work with really diverse workloads ;)

  5. On developer spats and high drama by ajs · · Score: 5, Informative

    Open Source's biggest PR dilemma is this sort of argument.

    Make no mistake, every company has developers that do this. There are two differences in the Open Source world: 1) you can't just fire an Open Source developer who won't "play ball" with management's edict 2) it's usually public.

    These are actually both really good things. The fact that you can't silence someone leads to repeated analysis of a problem. OSS' biggest benefit is that it brings massive peer review to bear not just on the code, but on the process.

    The fact that it's public feeds into that, and is equally good.

    The problem is PR. The Linux kernel is starting to look like anarchy to non-developers. I suggest that the process works, so we should all take a deep breath and leave it be. However, we all need to take the front lines on PR. Spin is all-important. This is not a "spat" or a "fight", this is "parallel development" and "peer review". The joy of this kind of spin is that, unlike most spin, it's TRUE! This guy is pissed at Linus. Linus has dumped his code. Yet, the two of them keep working hard to meet their customers' demands and producing what they feel is the best possible product.

    Please, don't foster the idea that we're a bunch of anarchists producing code that's any less functional than the rest of industry, because quite the opposite is true.

  6. OOM Killer must die by Salamander · · Score: 5, Informative

    Rik is an extremely bright (and likeable) guy, but his adherence to the OOM killer concept is disappointing. I've seen a lot of dumb ideas gain currency in the computing community or some part of it; OOM killer is the dumbest. If your process was allowed to exist in the first place, it should not be killed by the VM system. The worst that should happen is that it gets suspended with all of its pages taken away. If that doesn't free up any memory then neither would killing it (modulo some metadata - read on). If there are other processes waiting for the one that's suspended, then eventually they'll go to sleep, their pages will be released, and the suspended process will wake up - which won't happen if you killed it. There are only two differences between the two approaches:

    • Suspension does not take irrevocable action; the suspended process can still be restarted.
    • Suspension bears the cost of retaining the metadata for the suspended process so it can be restarted.

    The usual whine from OOM-killer advocates is that you can still get into a situation where all of that retained metadata clogs up the system and essential system functions can't allocate pages. However, that's preventable too. All you need to do is preallocate a special pool of memory that's only available for use by those essential system processes - either individually or collectively. The size of that pool and the exact details of how it gets allocated (e.g. which processes are considered essential) could be treated as site-specific tuning parameters. The same idea can then be further generalized to allow definition of multiple private pools, creating a semi-hard barrier between different sets of tasks running on the system (if you want one; the default pool is still there otherwise). This actually fits in very nicely with other things like processor affinity and NUMA-friendly VM, which I know because I once worked on a kernel that had all of these features.
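
    To make the reserved-pool idea concrete, here is a toy model only; it is not code from any real kernel. The class name PagePool, the essential flag and the page counts are all invented for the example. Ordinary requests can only draw from the general pool, while requests marked essential may also dip into the reserve.

```python
class PagePool:
    """Toy page accounting with a reserve that only essential tasks may use."""

    def __init__(self, total_pages: int, reserved_pages: int):
        if not 0 <= reserved_pages <= total_pages:
            raise ValueError("reserve must fit inside the total")
        self.free_general = total_pages - reserved_pages
        self.free_reserved = reserved_pages

    def allocate(self, pages: int, essential: bool = False) -> bool:
        """Return True if the request was satisfied, False if it must wait."""
        if pages <= self.free_general:
            self.free_general -= pages
            return True
        if essential and pages <= self.free_general + self.free_reserved:
            # Use up the general pool first, then take the rest from the reserve.
            self.free_reserved -= pages - self.free_general
            self.free_general = 0
            return True
        return False


pool = PagePool(total_pages=100, reserved_pages=10)
pool.allocate(90)                   # ordinary work fills the general pool
pool.allocate(1)                    # refused: only the reserve is left
pool.allocate(5, essential=True)    # succeeds out of the reserve
```

    The size of the reserve and the rule for who counts as essential are exactly the site-specific tuning knobs mentioned above.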

    In short, there's no need for the OOM killer. Plenty of systems, many of which handle extreme VM load much better than Linux, have been implemented without such a crock. Rik contends that a lot of people make suggestions without actually understanding the problem, and he's right, but I also submit that sometimes he also rejects suggestions from people who do know what they're talking about. This row has been hoed before, and Rik's smart enough that he should know to avoid the NIH syndrome that afflicts so many of the other Linux kernel heavyweights.

    --
    Slashdot - News for Herds. Stuff that Splatters.
    1. Re:OOM Killer must die by Salamander · · Score: 4, Informative
      So we know before a process gets to execute exactly what its memory usage profile is?

      Please don't construct strawmen. Oh wait, that's not just a strawman, it's also circularity. You're assuming that the OOM killer exists, then using that to "prove" that an alternative approach is impossible to implement. Well yeah, an alternative system that both does and does not incorporate the OOM-killer concept is impossible. Congratulations. Well done.

      What I'm really saying is that the VM system should ensure that it has other means to deal with memory exhaustion. Disallowing overcommit altogether is one approach, and that has proven quite acceptable for many systems, but there are plenty of other approaches as well. I've briefly sketched out only one; look up the others yourself (the information is available in plenty of places including some OS textbooks).

      "taken away"? Where do they go?

      The phrase "suspended with all of their pages taken away" (which is what I said) includes the case where the pages have already been taken away. English 101.

      As for where they go, the obvious answer is not the general swap area, because that's already full. However, that doesn't preclude the existence of a secondary (actually tertiary) swap area that exists only for this purpose. It could also be a percentage of the general swap area, which starts to look very much like the memory-pressure code in the very highly regarded FreeBSD VM, or Solaris, etc. The point is that there's a middle ground between "no overcommit at all" and "if you allow overcommit we might shoot you in the head just because we feel like it".

      Color me skeptical.

      Skepticism is one thing; strawmen and circularity are another. I'm skeptical about the need for an OOM killer.

      --
      Slashdot - News for Herds. Stuff that Splatters.
    2. Re:OOM Killer must die by Elladan · · Score: 3, Informative
      Your scheme won't work. Think about it.

      The OOM killer is triggered when the system is completely out of all memory, including swap, and a process (any process, not just some ram hog) tries to allocate more. That allocation request cannot complete, so the kernel needs to do something else. Note that it can't fail the request, because it already passed it due to overcommit.

      The OOM killer approach is to find a process that looks ripe and get rid of it. Thus, stability is restored.

      Your approach is to freeze the process that wanted a little bit more ram. What do you hope to gain by this? Well, presumably, you think that some other process is going to release some memory and allow the first one to complete. As should be obvious, this may not happen. In fact, it probably won't. What you'll end up with is a dead system with a lot of frozen processes. If you're careful, root might still be able to get in on the console to kill some stuff or link in more swap, but that's about it.

      For all practical purposes, the system is hosed.

      Your scheme has the advantage for a computation server that the administrator might be able to link in some more swap to complete a computation, but for normal uses, it's just a hang. The OOM killer approach is to attempt to blow away the memory hog and keep the system operational, without administrator intervention.

      The other approach, of course, is to get rid of overcommit entirely. People wouldn't like this too much, since they'd need a lot more swap space.
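
      A toy model of the difference, with made-up names (ToyVM, reserve, touch) and page counts; it only illustrates the accounting and is not how any real kernel is written. With overcommit the promise is made at reserve time and the shortage only shows up when the pages are touched, by which point the request can no longer be refused.

```python
class ToyVM:
    """Toy accounting: 'reserve' is the malloc-time promise, 'touch' is first use."""

    def __init__(self, backing_pages: int, overcommit: bool):
        self.backing_pages = backing_pages  # RAM plus swap
        self.overcommit = overcommit
        self.promised = 0
        self.touched = 0

    def reserve(self, pages: int) -> bool:
        """Strict mode refuses promises it cannot back; overcommit never refuses."""
        if not self.overcommit and self.promised + pages > self.backing_pages:
            return False
        self.promised += pages
        return True

    def touch(self, pages: int) -> str:
        """First write to promised pages; 'oom' means something has to give."""
        if self.touched + pages > self.promised:
            raise ValueError("touching pages that were never reserved")
        if self.touched + pages > self.backing_pages:
            return "oom"
        self.touched += pages
        return "ok"


strict = ToyVM(backing_pages=100, overcommit=False)
strict.reserve(80)   # True
strict.reserve(40)   # False: the failure is reported up front

loose = ToyVM(backing_pages=100, overcommit=True)
loose.reserve(80)    # True
loose.reserve(40)    # True: promised 120 pages with only 100 to back them
loose.touch(80)      # "ok"
loose.touch(40)      # "oom": too late to fail the original request
```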

  7. Re:Good decision to remove Rik's VM from mainline. by docwhat · · Score: 3, Informative

    Rik's a really smart guy, but he isn't (or, rather, wasn't) so good at keeping the mainline kernel moving forward. His comment about Linus dropping patches is true, but what he didn't mention is that he never resubmitted the patches. He tried once and then dropped them.

    I think Rik's VM will become really, really good as he maintains a branch for himself. When it's 95% of the way done, he can then work on merging it into the mainstream (i.e., the Linus kernel). Then we'll have a really kick-ass VM.

    But Rik wasn't working well with the established method for dealing with the Linus kernel. Linus then made the choice to go with a VM from someone who *did* know how to work with the Linus Kernel.

    It's not a technical issue, it's a maintenance issue.

    Read up on the kernel cousin stuff with Rik and Linus talking about this.

    Ciao!

    --
    The Doctor What (KF6VNC)
  8. Re:CVS isn't decent by jslag · · Score: 2, Informative
    Every other possible solution is no worse, and usually much better, than CVS.


    That's a little strong. True, I haven't used anything but CVS for the last couple years, but last time I tried common alternatives (namely MS VSS and PVCS), they were major PITAes - slow, unreliable, and not helpful when more than one developer was working on the same file. Not to say that CVS doesn't have its problems, of course, but for a number of years it has been the logical choice for anyone who doesn't want to plunk down hundreds of dollars per seat for a closed-source tool.

  9. Re:MEMORY OVERCOMMIT must die by CaptnMArk · · Score: 2, Informative

    > there is indeed a /proc entry (/proc/sys/vm/overcommit_memory)

    That setting doesn't work properly. Linux will just overcommit slightly less.
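
    For anyone who wants to check what their own box is set to, a small sketch that just reads the entry quoted above. The helper names parse_mode and read_overcommit_mode are invented here, and the code deliberately does not interpret the number, since what each value means depends on the kernel version.

```python
from pathlib import Path
from typing import Optional

OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")


def parse_mode(text: str) -> Optional[int]:
    """Turn the file contents (for example "0\n") into an int, or None if malformed."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_overcommit_mode(path: Path = OVERCOMMIT_PATH) -> Optional[int]:
    """Return the current setting, or None if the entry is missing or unreadable."""
    try:
        return parse_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("vm.overcommit_memory =", read_overcommit_mode())
```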

  10. IRIX by Pussy+Is+Money · · Score: 1, Informative
    Here's an interesting tidbit by Steve Lord on the linux-xfs mailing list as well:

    ... [malloc] does not fail can also mean does not return for a VERY long time. Plus the memory system on Irix has a mechanism where various subsystems which consume memory can register callouts which the memory system can call to ask them to release memory. Linux does not have the latter except for the explicit calls in page_launder or what ever it is called this week.
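
    A minimal sketch of what such a callout mechanism could look like, purely as an illustration of the idea in the quote; it is not the Irix or Linux interface, and MemoryPressureRegistry, Cache and shrink are hypothetical names. Subsystems register a function, and the memory system calls them in turn until enough pages have been released.

```python
from typing import Callable, List


class MemoryPressureRegistry:
    """Subsystems register callouts; the memory system calls them to get pages back."""

    def __init__(self) -> None:
        self._callouts: List[Callable[[int], int]] = []

    def register(self, callout: Callable[[int], int]) -> None:
        """callout(wanted) must return how many pages it actually released."""
        self._callouts.append(callout)

    def reclaim(self, pages_needed: int) -> int:
        """Ask each subsystem in turn until enough pages are freed; return the total."""
        freed = 0
        for callout in self._callouts:
            if freed >= pages_needed:
                break
            freed += callout(pages_needed - freed)
        return freed


class Cache:
    """Example memory consumer that can give pages back on request."""

    def __init__(self, pages: int) -> None:
        self.pages = pages

    def shrink(self, wanted: int) -> int:
        released = min(wanted, self.pages)
        self.pages -= released
        return released


registry = MemoryPressureRegistry()
inode_cache, buffer_cache = Cache(3), Cache(10)
registry.register(inode_cache.shrink)
registry.register(buffer_cache.shrink)
registry.reclaim(5)   # takes 3 pages from the first cache and 2 from the second
```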
    --
    Pushin' 'n dealin', shovin' 'n stealin'