Slashdot Mirror


Debate on Linux Virtual Memory Handling

xturnip sent us a good piece running over at Byte about Linux's VM. Somewhat more technical then the stuff we usually see online, this one talks about different VM systems, and the egos in the kernel. Its worth a read.

25 of 330 comments (clear)

  1. His favorite? by LinuxGeek8 · · Score: 4, Interesting

    He seems to think a lot in favor of the Andrea VM.
    That's ok to me, but he might want to take notice of the fact that linus didn't accept Rik's patches a lot and that 2.4.9 still had actually the VM of 2.4.5. The -ac tree was more up to date.
    So for a good comparison you'll need to compare the linus and the ac tree.

    --
    Well, don't worry about that. We can get you back before you leave. (Dr. Who)
    1. Re:His favorite? by mr3038 · · Score: 5, Insightful
      I'm aware that this doesn't mean they've met in person, but it shows that Moshe has discussed things with Rik before AA's VM was written. So I think he holds nothing agains Rik, he just likes aa's VM better.

      In addition to this it seems that he has implemented VM with reverse mapping also. Therefore it should be clear that he previously thought this was the best method. I've understood that the issue between Rik's and AA's VMs is that Rik's is optimized for normal swapping and AA's for OOM case. Because VM performance really matters only when OOM happens I think AA's should be superior. The real difference depends on benchmark, of course.

      Both systems seem to be somewhat equal. AA's needs less swap but Rik's is claimed to be better performer. If AA's system is simpler then that's what should be used. Select maintainability over questionable performance increase. This is like quicksort - there's a point when you usually get better performance bubble sorting the little pieces quicksort generates during the whole sort. The smart version isn't always the best. Nowadays CPUs can easily do a bunch of dumb operations faster than one smart operation.

      --
      _________________________
      Spelling and grammar mistakes left as an exercise for the reader.
  2. Re:The failure of Open Source by Anonymous Coward · · Score: 4, Insightful

    Oh please. Have you ever worked on a commercial software project? I've seen just as much if not more ego in moronic engineering team meetings at my enterprise software company. Without a single strong technical leader OR a group of smart people who all equally respect each other's opinions, the SAME THING happens on a commercial project. I've watched a Director of Engineering call meetings almost every day for 3 weeks in a row because he didn't know how to solve exactly this sort of problem. In the end he just decided to go with what the person with the most years of experience said and to get the CEO to give him blanket license to make that technical decision, though none of the other engineers agreed with it - they were all too conflict averse to speak up and too afraid about losing there jobs just as the economy was tanking (he made a bad decision indeed and the project suffered greatly for it, getting delayed by 3-4 months and even then never delivering a large portion of the promised features because this architectural decision made them impossible). That company (mine, unfortunately) is most likely going out of business soon. So don't give me this crap that ego only adversely affects Open Source projects.

  3. Re:It should all be configurable. by pwagland · · Score: 4, Interesting

    Sadly, no.

    While it is nice to be ultra configurable that leads to two seperate problems:
    1) Code maintainability
    2) User maintainability

    1) Is a serious problem. If you have to test the impacts on two different VM systems, and fully understand the impact that any change will have is a mammoth task.

    2) Users are not all technically literate anymore. Look at the recent slashdot story on microsoft losing there grip in Asia...

  4. AC kernels are not a fork by rakarnik · · Score: 5, Interesting

    Moshe Bar seems to indicate that Alan Cox is creating some kind of fork of the Linux kernel. Actually, -ac kernels are alwasys different from Linux kernels to some extent, since they include slightly more experimental code (e.g. ext3), or code that Linus has not had a chance to review yet. This way, the experimental code gets more testing before going into official Linus kernels. You can read more about -ac kernels at KernelNewbies.Org.

    As anyone following LKML knows, Alan thinks that drastic VM changes should be reserved for 2.5, and so continues to keep Rik's VM going. This actually helps quite a bit as both VMs get tested and there have been several comparative tests conducted leading to improvements in both VMs. Competition in this case is certainly helping Linux.

    Oh and for all you fork conspirators, here's another fact: Andrea Arcangeli also releases his own kernel releases, called -aa. I don't think any of these are considered forks; everyone understands that this way pacthes get more testing, "crosstalk" between the different flavors is a given.

    Much ado about nothing, IMHO...

    -Rahul

    1. Re:AC kernels are not a fork by bwt · · Score: 5, Interesting

      I don't think any of these are considered forks; everyone understands that this way pacthes get more testing, "crosstalk" between the different flavors is a given.

      Well, I disagree -- they are ALL forks. Any time you create a patch you are forking. The open source development model relies on perpetual fork and merge to accomplish its development. Most projects are forked this way into a development and a stable branch. I call this a "constructive fork". The AC kernels are perpetually different, but importantly they are generally about the same "distance" away, and "crosstalk" as you call it keeps it that way.

      As the "distance" increases, tension increases, and if it isn't resolved it will divide the development camp. If the crosstalk stops, and the idea of eventual merge is abandoned, you have a "true" fork. Developers have to pick sides, and the split can become permanent.

      I think the AC kernels have always been the former kind of constructive fork. If he never adopts the new VM, then his kernel will begins to diverge since developing for two VM's is hard. In this way, a small perterbation can become a full blown deviation that divides developer resources. I really doubt that the VM issue will divide the linux kernel team permanently. As AC's kernel gets farther away from the main line, the tension on everyone will increase. Eventually, I predict, the team will force one solution, but there is no guarantee.

    2. Re:AC kernels are not a fork by Tumbleweed · · Score: 4, Insightful

      Actually, I'd say they're more like 'sporks' than 'forks'. Nobody who makes them intends for them to take over from the main Linus kernel tree.

  5. Re:Make it a build option by iamsure · · Score: 5, Interesting

    This was actually answered on the list, and summarized in a Kernel Traffic. As Alan Cox put it "It would be horribly difficult".

    While it sounds simple enough, as they said in the KT, the "replacement" of the VM was no small feat. It took 170 patches, which touched a very large percentage of the kernel.

    Imagine doing so TWICE (or more) and trying to code 'around' the issues for each.

    No.

    This way madness lies. While it is a nice idea, the simple truth is that it doesnt belong in 2.4.

    2.5 should have branched the second that the patches were considered. Linus didnt want to deal with bitching about 2.4 not being "good enough" and was impatient.

    So be it. The differences between Linus' and Alan's kernel trees (other than the VM) is growing VERY small this week, and will probably be 'close-enough' for a handoff within the next two weeks.

    The only question is which VM will end up in the 2.4 series. (NOT when Linus hands it over, but when Alan begins his releases of it).

    I would not be shocked to see Alan disagree with Linus, and stay with the 2.4.x (x10) VM, and I also wouldnt be shocked to see him agree with Linus and use the new VM.

    As to the patch on install idea, it is actually also discussed for kbuild in the 2.5 series.

    2.5 will be very excited, if we can only get Linus to get working on it, instead of muddying the stable-series water!!

  6. Re:It should all be configurable. by Flower · · Score: 4, Interesting
    While the idea is interesting I don't think it is practical. From what I've read on KT and in this article changing the VM forces design considerations on userland programs. It's additional complexity that most developers (and especially companies like Oracle) wouldn't appreciate. I also think it would raise support costs. At the very least I'd want some variable in /etc that would clearly state which VM was being used. For me at least, the issue is simplicity in favor of flexibility

    I think the biggest bone of contention in the community is Linus replaced the VM in the current stable version instead of pushing it into 2.5. Again, not being a kernel hacker and only going from everything I've read, this was a radical change. I'd almost be willing to say the latest kernels should be labeled 2.6 but that's just me.

    Oh, and finally, to paraphrase an old saying, give any tech-savvy user enough rope and they will hang themselves.

    At least that's what I think. :)

    --
    I don't want knowledge. I want certainty. - Law, David Bowie
  7. Alan will be switching VMs soon... by rakarnik · · Score: 5, Informative
  8. Re:To fork, or not to fork by ethereal · · Score: 4, Informative

    Well, drivers eventually do get from the -ac tree into the Linus tree, you know - the whole point is that AC tries them out until they are stable enough for Linus. Not to mention that Mr. Cox does have some responsibility to provide RedHat with the best kernel he can, no matter what Linus thinks of it. The only weird thing here is that as far as the VM goes, Linus has picked up the more experimental code first. So people who always recompile the Linus kernel when they install a new distro may find that their kernel operates very differently after that.

    My naive thought is that the best way to do it would be to somehow modularize the two VMs so that it can be a compile-time or boot-time option, and let users try both on the same box to see which is better. However, I imagine this would be a ton of work to set up.

    --

    Your right to not believe: Americans United for Separation of Church and

  9. Re:It should all be configurable. by jacobito · · Score: 4, Informative
    That's not going to happen in the 2.4 series. The kernel hackers think that making the VM policy configurable would be a nightmare:
    Michael T. Babcock asked how ugly it would be to make Rik van Riel's and Andrea Arcangeli's Virtual Memory subsystem code into a compile-time option, so folks could try each one out as they pleased. Alan Cox replied simply, "Too ugly for words." Mike Fedyk suggested that it might be feasible in 2.5, and asked if there were a way to make it non-ugly. Marcelo Tosatti replied, "Even if its non-ugly, its non-easy. Way too much overhead. For 2.5 we'll probably be able to get people working together."
    This is from Kernel Traffic #139.
  10. ok, here's the thing by Velex · · Score: 5, Interesting

    I don't care if you want to swear by the Linus kernel, but it gets killed by IO. I mean, come on, I'm using 2.4.12, and I can't rip a CD an play an MP3. Under the AC series, I can rip CDs, play MP3s, watch divx movies, surf the web, untar a file, and have a compile job going at the same time. Even for more usual setups, like viewing a video without doing anything else, the Linus kernel drops frames left and right, whereas the AC series laughs at it. Don't tell me I need to use mplayer with SDL, because I do.

    Because I treat my Linux box as though it were a Windows box (one of the reason I switched over to Linux for everything is that the widgets in GTK are prettier than the widgets in Windows -- it's nice to have people ask me how to get their desktops to look like mine and tell them they have to install linux) and I expect it run at least as well as a Windows machine, I must use the AC series. While I'm sure that the Linus kernel has it's applications, it is simply unacceptable for replacing the Windows kernel.

    Mod me flamebait or troll if you want, but I speak the truth. I have a Thunderbird-750 with 224 MB of ram, and I find it simply unacceptable when I can't run Quake or view movies under linux because of the Linus kernel. When mp3s skip because I'm moving some data around, it tells me that something is wrong with the Linus kernel. I'm glad that I had a friend who introduced me to the AC series, or I would have given up on linux. Plain and simple, politics aside, the end user doesn't care that he's being loyal to Linus the Great, he just cares that he can view that movie. If Windows outperforms linux in multimedia, he'll use Windows.

    --
    Join the Slashcott! Stay away entirely Feb 10 thru Feb 17! Close all tabs to prevent autorefresh!
    1. Re:ok, here's the thing by choward · · Score: 5, Interesting

      I use the Linus stock kernel on a _very_ similar setup (Duron 700, 384MB ram) and I don't have the problem you mention. One thing I've noticed is that with the Linus kernel, DMA is _never_ turned on by default, you must use hdparm explicitly at startup. Once you do that, skipping mp3s are a thing of the past.

      Running the hdparm tests,
      w/out DMA: 4.01MB/sec
      with DMA: 34.96MB/sec

      Quite a change.

      Craig Howard

      --
      -- Craig Howard
  11. Re:To fork, or not to fork by battjt · · Score: 4, Insightful

    Look at the success of EGCS and GCC. That was a successful split and merge. It led to a better GCC in the end while supporting both stable and advance versions of gcc in the interim.

    Joe

    --
    Joe Batt Solid Design
  12. Re:OSS Power by Xzzy · · Score: 5, Interesting

    > so Linus used Arcangeli's new VM code. Problem solved. Stable as ever.

    This is actually a wad of baloney. In normal applications (ie, running xmms, reading slashdot and maybe running gimp, with your glitzy desktop of choice), sure the VM works fine.

    In any SERIOUS situation though, 2.4 simply falls apart crying because the kernal handles memory so badly. One would like to think that in a low memory situation the kernel would start hacking off whatever was causing the problem so that it could survive. Well, it doesn't. It just freezes. This has been a situation I've been forced to deal with over the past month.. so while I'm not a guru on the subject, I have pieced together some bits of the story.

    Basically at my job we have a programming group that has mountains and mountains of source that they have to compile. Lazy as programmers tend to be, they also try to compile it over nfs on the machines with the biggest specs. To give a sense of scope, the resulting executable clocks in at around 500 megs. So basically, their build really stresses out the machine they're compiling on.

    The machine freezes EVERY time because of memory shortages. The kernel can't allocate pages for incoming network traffic, causing a backlog, causing processes to hang, causing further backlog.. then powie an unresponsive machine. The obvious solution would be to slim down the build but if anyone's ever worked with a developer suggesting that would be as useful as suggesting Hitler was a saint.

    From what I've gathered of the story, the 2.4 kernel was supposed to have this new grand VM that made dorking with the freepages file obsolete.. to the point where you can't even tweak the kernel with the freepages file anymore. The kernel was supposed to have this feature that would let it detect what processes were stealing all the memory and kill them off.

    NEWS FLASH they took this feature out because it was buggy.

    So what happens? The kernel just paints itself into a corner until the machine freezes. Only way to recover is to power cycle. This is why damn near every patch in the 2.4 line has the line "VM tweaks" in the changelog. Quite frankly the 2.4 VM is garbage, and only functions suitably well in non-intensive applications.

    It's been getting better with each dot release but it's still nothing you'd want to bet money on.

  13. Compound errors by Salamander · · Score: 5, Interesting

    IMO both Rik's code (RVM) and Andrea's (AVM) were accepted prematurely, and Linus's ADD is the root of the problem here. Everyone thought the 2.2 VM was broken, so he jumped on RVM when it really hadn't received adequate testing with various workloads. Then, when that didn't work out, he did something even worse by jumping on AVM in the middle of a "stable" kernel series when it was totally undocumented and even less thoroughly tested than RVM. That's just bad software engineering, regardless of the quality of Rik's or Andrea's work.

    Ideally, an "old-fashioned" alternative to RVM would have been maintained throughout the 2.3 process, as a fallback in case RVM turned out not to be ready for 2.4 - which was in fact the case. But this wasn't done, there was no alternative, and so RVM became the basis for 2.4. Once that decision was made it should not have been unmade by replacing RVM with AVM. Andrea's work should have been in the 2.5 tree, which should have been opened a long time ago to deal with precisely this sort of situation. 2.4 is not the last Linux kernel that will ever exist. We don't need to make it perfect. It would be far better to admit its imperfections, band-aid them as best we can, and try to get a head start on creating something better for 2.6. What we have instead is error on top of error, "not ready" replaced with "even less ready".

    To clarify, I have nothing but the highest regard for both Rik's and Andrea's work. Obviously they have different ideas and attitudes. Rik has drawn on many sources in his design, resulting in a system that is both very advanced and very complicated. The process of reining in the complexity is still incomplete, but I still have hope that some day Rik will be able to come up with something that's really awesome, and he has always documented his ideas thoroughly. Andrea, by contrast, is much more pragmatic; he wants something that works now even if it's somewhat more limited in scope (e.g. by being almost impossible to reconcile with NUMA). The dark side of that "pragmatism" is that Andrea has skimped on non-code activities such as documenting or explaining the basic ideas on which his system is based. Nonetheless, both have done great work and should continue to do great work...in the 2.5 tree.

    --
    Slashdot - News for Herds. Stuff that Splatters.
    1. Re:Compound errors by puetzk · · Score: 4, Informative

      FWIW, I think that Andrea's setup is modeled after the 2.2 VM (which he did a fair amount of work on tuning). So this is really more of a pragmatic revive-the-old-approach than it might initially seem.

      We all know this simplistic setup had scalability problems (like much of 2.2) but at least it worked right. Hopefully given some more time, Rik can really get his to go, since it seems more sophisticated/scalable long-term.

      --
      The Matrix is going down for reboot now! Stopping reality: OK. The system is halted.
  14. Bring yourself up-to-date by marm · · Score: 5, Interesting

    The machine freezes EVERY time because of memory shortages. The kernel can't allocate pages for incoming network traffic, causing a backlog, causing processes to hang, causing further backlog.. then powie an unresponsive machine.

    This was a common problem with kernels from about 2.4.1 up to 2.4.9 - the machine would gradually eat into swap further and further, failing to release no-longer-used swapspace, until it would go Out Of Memory (OOM) and attempt to kill the process that was eating all the memory. Frequently it would pick the wrong process to kill (sometimes even killing init) or would end up deadlocking.

    I agree with you - that is no way for a virtual memory system to behave.

    However, the Linux development process moves quickly once people get annoyed enough to actually do something about it, and that's precisely what has happened. Starting with 2.4.10, a new, simpler VM system has been used in the official Linus kernels, and I can say with some confidence that it has solved all the major problems with the 2.4 VM system, and continues to get significantly faster with every release.

    If you haven't actually tried a new kernel yet (and from your problems it seems that you haven't), I suggest that you do - it's made the world of difference for me.

    At the same time, the old 2.4 VM has lived on in the -ac series of kernels, and has become a great deal better there - some competition has made a big difference. Almost all of the major areas where it behaved badly have been fixed. However, my own impression is that it is still somewhat slower than the new VM.

    The choice is yours which you want to run - my own recommendation would be for the new VM in the official Linus kernels, but others may disagree.

    [OOM Killer]
    NEWS FLASH they took this feature out because it was buggy.

    Umm, no they didn't - it continues to exist in both the new VM in 2.4.13 and the old VM in the most recent 2.4.13-ac kernels. It does, however, now work correctly in both VMs. There are some philosphical arguments over whether killing processes is the best way of handling an Out Of Memory situation, but it is surely better than deadlocking the box, which is what most VM systems (including the famed FreeBSD's) do when OOM occurs.

    It's been getting better with each dot release but it's still nothing you'd want to bet money on.

    All I can say is that the new VM works great for me and lots of other people, even under extreme load. I can certainly understand your pain if you're using an older 2.4 kernel, but please try a recent one - the difference is astounding.

    If you're still having problems with recent kernels, then I'm sure linux-kernel@vger.kernel.org would love to hear from you - and would certainly be a lot more useful to you than ranting on Slashdot. Getting the VM right is now priority number 1 for the kernel hackers.

  15. Re:Make it a build option by Rik+van+Riel · · Score: 5, Informative
    I wonder what Rik has to say about the new "blessed" VM? If he thinks it's a better all-around VM, then the debate can stop pretty quickly I would think.

    Well, since you wanted to know ;)

    First let me explain that most of the time in the beginning of 2.4 was spent making the VM stable, stopping it from chrashing on highmem machines, etc... Speed improvements were a secondary thing, to do later on. Secondly, Linus is a very busy man and didn't seem to have the time even to apply critical bugfixes at times, so his kernel has had a big disadvantage over Alan's kernel.

    Around the time where the VM in Alan's kernel got stable, I was finally getting the time to work on speed improvements and Linus still lagged a few patches, suddenly Andrea surprised us all by posting the first version of his new VM online. An even bigger surprise was that Linus integrated this into the kernel within 24 hours, without even asking Andrea!

    As to why Andrea's VM is faster for desktop use ... it was optimised for speed on low to medium loads in exactly the same way the 2.2 kernel was. Note that this also means the server falls over quicker under high load and it is basically impossible to tune the system to run decently under all loads ... just like 2.2.

    My VM was slower for desktop loads, but since the thing stabilised I put in some time to make things faster and I seem to have mostly caught up with Andrea on the speed front now. The benchmark results posted on the linux-kernel mailing list seem to indicate that Andrea's VM is faster for some things, while my VM is faster for some other things.

    Personally, I think it is easier to make a solid VM fast than it would be to make a fast VM solid. This opinion was formed because of the living hell of the Linux 2.2 VM, which was undocumented and horribly subtle.

    In the future, I know I'll always be optimising for (1) maintainability, (2) correctness/stability and (3) performance, in that order...

  16. Why VM is bad by Animats · · Score: 4, Interesting
    Virtual memory is way overrated, and probably should be phased out, both on servers and desktops.

    In Peter Denning's classic paper, The Working Set Model of Program Behavior, Denning concluded that paged virtual memory was, at best, good for an effective 2X increase in memory size. When he wrote that paper in 1968, memory cost about a million dollars a megabyte, so a 2X increase was worth the headaches of a VM system. Today, with memory at a few hundred dollars a gigabyte, it looks less attractive. It's not that expensive to double the size of RAM today. It can be cheaper than adding a fast disk drive just for paging. Uses less power, too.

    Disk as backing store gets worse as RAM gets faster. When Denning wrote that paper, the fastest backing devices (drums) rotated at around 10,000 RPM, for a 6,000 microsecond access time, and core memory cycle times were around 4us. So main memory was 1,500 times faster than backing store. Today, RAM cycle times have dropped to around 0.020us, but disks still top out around 10,000 RPM, making main memory 300,000 times faster than backing store. Thus, the relative cost of a page fault has increased by a factor of 200. This makes VM far less attractive today than it used to be. It's not getting any better, either.

    The price of having virtual memory is terrible performance once paging between active processes starts. That's called "thrashing". On a server which is processing short transactions, you're much better off throttling at the transaction launch point (as, for example, where CGI programs launch) than going into thrashing. This requires some coordination between applications and memory allocation, but where most of the memory is used by Apache and its child processes, that's a viable option.

    The main value of VM today is getting rid of dead code at run-time. A basic problem with shared libraries is that you load in the whole library, needed or not, when you need any function from it. This wastes memory, but after a while, the VM system will notice the unused pages and quietly release them. On a larger scale, the same problem is seen with dormant applications, a problem which has gotten totally out of hand in the Windows world, where far too much unwanted stuff launches at startup. VM ejects them from memory. That's what VM is really used for today.

    So if you're actually page-faulting, VM is hurting, not helping.

    I'd argue that it's time to go back to a swapping model - all of an app has to be in before it runs. That's where UNIX started; virtual memory didn't come in until 4.1BSD. But in support of this, apps need more information about the current memory situation. And they should be able to designate parts of their space as pageable, at least at the shared object/DLL level. Only a few apps (web servers, window managers) need much memory awareness, so that's feasible. Throttling needs to occur at a smart place, just before allocating substantial resources, such as CGI process launch or connection opening. By the time the VM system becomes involved, it's too late; resources are already overcommitted.

    The big win from this is repeatable latency at the memory level. With all the interest in reducing kernel latency at the CPU level, it's time to address it at the memory level too.

    QNX, the real-time OS, is worth looking at in this regard.

    1. Re:Why VM is bad by DaveWood · · Score: 4, Informative

      Alright, I'll bite. What you say is interesting, and I believe your comments regarding the changing relative costs of traditional VM paging algorithms make sense. The problem is that I suppose I don't understand the alternative you are proposing. I am certain this is due to my own ignorance; please give tolerance to my questions, and don't let my inquisitiveness be mistaken for criticism.

      You say, "The price of having virtual memory is terrible performance once paging between active processes starts." Assuming the VM algorithm is working correctly (big assumption lately), this means basically that you are trying to run more than your memory can handle, and have reached a load-shearing point with respect to RAM. From this I surmise that we might be talking about a "smarter" VM system that would shear better, perhaps by identifying the condition, and perhaps by better communication with higher levels - in other words, a different/better application-level interface to the VM system.

      And, indeed you say, "On a server which is processing short transactions, you're much better off throttling at the transaction launch point [than thrashing]... This requires some coordination between applications and memory allocation." So I think I understand so far.

      Then you say: "A basic problem with shared libraries is that you load in the whole library, needed or not, when you need any function from it." This is where I perhaps display my ignorance of the kernel, but that's not what I have understood was going on. My impression of things was that an application was loaded into memory by mapping its data on the disk into "virtual" memory, and that the VM subsystem arbitrated between real and virtual memory by retrieving from the disk only what blocks were "necessary" (i.e. being referenced by the executing code), and that this process naturally extended to libraries, and especially shared libraries (which need only exist in "real" memory in one location, despite being mapped into multiple "virtual" memory environments). Then again, perhaps it is a minor point - if the whole SO image is loaded and then unused pieces are unloaded or vice versa, it seems less important than the contention problem already on my mind...

      You say "VM ejects [unused bits of libraries and applications] from memory. That's what VM is really used for today." Absolutely! But regardless of the relative differences, isn't this process of migrating data between different "tiers" of data storage in the computer (each with a different latency, throughput, and cost/availability) always going to be necessary? While I can certainly see a major advantage in creating/improving ways for the application to communicate with the memory management system, is there really some fundamental alternative to the block-based VM "guesswork" that takes place in absence of directives set at compile time?

      You say: "So if you're actually page-faulting, VM is hurting, not helping." I am wondering if the VM is either hurting or helping per se, since the real problem is that you don't have enough RAM even for the "active" blocks you want to run. Of course, the quality of your VM will determine how close you can get to "perfect" utilization of your RAM.

      Then you say, "I'd argue that it's time to go back to a swapping model - all of an app has to be in before it runs." This is where you lose me, I suspect because I do not understand what you are really proposing. You go on to say "in support of this, apps need more information about the current memory situation. And they should be able to designate parts of their space as pageable, at least at the shared object/DLL level. Only a few apps (web servers, window managers) need much memory awareness, so that's feasible.Throttling needs to occur at a smart place, just before allocating substantial resources, such as CGI process launch or connection opening. By the time the VM system becomes involved, it's too late; resources are already overcommitted."

      At first it sounds as though you are saying that you want to eliminate swap altogether. I do not doubt that for some situations this is preferable - you want to have consistent performance and a sharp failure rather than the long thrash in the case where you use up your resources (and you mention QNX). However for general-purpose computing, I'm not so sure this is a good idea, even with RAM as cheap as it is. Depending on what you're trying to do, the slight loss in predictability and overall performance is vastly preferable to sharp failures for many, I would even say, "most" applications, even on the server.

      But moving on, it seems you are saying that what you dislike about the VM is that data is broken into arbitrary blocks - and so we should rely on application programmers to designate what it would be a good idea to swap out in case of memory contention ("designat[ing] parts of their space as pageable"). The problem I see with this is that you are relying on the programmer to do something that, if they do not do it, their program will appear to run anyway.

      This is therefore automatically classified a frivolous expense by commercial software developers, and even OS people working for the love of the game may be tempted into the same pitfall. This is superficially similar to the argument between malloc/free proponents and garbage collector advocates. Giving the programmer another "lower-level" thing to worry about gives them an opportunity to optimize it, but in practice we often find that on the balance we get more mistakes and the quality of the user experience suffers.

      The compiler probably could be coaxed to do it for you. But the various tradeoffs between compile time "pre-blocking" and runtime blocking might leave compile-time computations, whether in the compiler or even in the developer's head, looking inferior to what a good VM system can do while observing actual behavior in real-time.

      Your point about throttling occuring "at a smart place" is not lost - obviously many applications could benefit from more transparency by the memory management system in managing their affairs - apache users really don't want to have to guess how many processes/concurrent users should be allowed, they want apache to determine it for them based on what the system can handle. But most application programmers are not going to do this extra work or do it right, and a VM seems like what you need as a "default behavior," even if its benefits (and its audience - those who have enough RAM that they never need fear swap) are lessening over time.

    2. Re:Why VM is bad by RelliK · · Score: 5, Insightful
      huh? what? This is the most uninformed garbage I have ever read. I don't have time to refute all of the nonsense, so I'll just take on the biggies.

      The price of having virtual memory is terrible performance once paging between active processes starts.

      When that happens, you are running a lot more processes that can fit into memory. Without VM you would not be able to do that at all.

      A basic problem with shared libraries is that you load in the whole library, needed or not, when you need any function from it.

      False. Any decent VM does demand paging. Only the pages that are needed are loaded from the executable. The parts of the program that are never executed are never loaded from disk, notwithstanding read-ahead optimization. A shared library is just an extention of the executable so the same rules apply. Further, a shared library can be used by multiple processes and only *one* copy of it is loaded into memory.

      I'd argue that it's time to go back to a swapping model - all of an app has to be in before it runs.

      That would be absolutely stupid. It would slow down the system tremedously. Se above about demand paging.

      Without VM, you would need to increase the memory requirements by a factor of N, where N is the number of processes running concurrently. Further, the startup time of each process would always be slower since all of the code would have to be read in memory. With VM part of it is already there (shared libraries), and the code is loaded on demand.

      In short, this is the biggest pile of uninformed garbage. You *really* need to take an OS course before you can talk about OS design.

      --
      ___
      If you think big enough, you'll never have to do it.
  17. unstable kernel, sigh.. by TheGratefulNet · · Score: 4, Insightful
    as a 5+yr linux vet, I'm horrified at this turn of events. I've always counted on linux to be rock stable, yet the last few months have been anything BUT stable.

    I really hate to say this, but I'm wondering if jumping ship to freebsd (etc) makes sense. I've been a major linux supporter for quite a long time, but I know that the *bsd guys have had their act together (good smp, good networking under load, etc) for a long time.

    would it be all that crazy to adopt the VM system from the 'establishment' (bsd)? frequently the linux codebase DOES borrow from bsd. why is the VM system all that different?

    --

    --
    "It is now safe to switch off your computer."
  18. What about the AIX VM? by Sara+Chan · · Score: 5, Interesting
    The discussion so far has focussed mainly on Rik's and Andrea's VMs. For the 2.4.x series, that's fair. For 2.5, though, what about considering the AIX VM?


    IBM has said that they will open source any part of AIX that we would like. The AIX VM works well under high stress. Obviously it could not just be put as-is into Linux, but there must be a lot of good ideas/algorithms in it that could--arguably should--be moved to Linux. Why isn't anyone looking at doing this?