Tuning Linux VM swapping
Lank writes "Kernel developers started discussing the pros and cons of swapping to disk on the Linux Kernel mailing list. KernelTrap has coverage of the story on their homepage. Andrew Morton comments, 'My point is that decreasing the tendency of the kernel to swap stuff out is wrong. You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful.' Personally, I just try to keep my memory usage below the physical memory in my machine, but I guess that's not always possible..."
"You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful."
I absolutely despise the way that XP swaps out applications in order to make the disk cache larger. I have 1GB of RAM on my machine precisely so I don't have to wait two minutes for it to swap my web browser back in after it's swapped out... yet if I copy a 2GB file from one drive to another, the stupid operating system will swap out all the applications it can just to make the cache larger.
Please, please, don't take Linux down the same braindead route as Microsoft has done for XP. It's utterly insane to swap out my browser so that a 2GB file can be copied two seconds faster when I then have to wait two minutes for the browser to swap back in. Or at least provide some kind of '#define STOP_VM_SWAPPING_STUPIDITY' so that I can disable it.
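For what it's worth, 2.6 and later kernels do expose a runtime knob along these lines: /proc/sys/vm/swappiness (historically 0 to 100, default 60, where lower values make the kernel prefer shrinking the page cache over swapping out application pages). A minimal Python sketch for reading it follows; the function name and the `path` parameter are made up for illustration, and changing the value requires root.

```python
from pathlib import Path

# Present on Linux 2.6 and later; absent on other systems.
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value, or None if it cannot be read.

    `path` is a parameter only so the function can be pointed at a test file.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None
```

Calling `read_swappiness()` on a stock install would typically return 60. As root, writing a smaller number to the same file (or running `sysctl vm.swappiness=N`) makes the kernel less eager to swap applications out.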
She had just procured a new Sun machine with 2 GB of RAM. Mind you, disk space hadn't grown all that significantly and you could still get machines with 9 GB drives.
The original practice was to make swap 2xRAM. So when the student she had putting the machine together came to her and said, "What do I make swap?" she responded, "Twice the RAM."
He said, "Are you sure? That's like almost half the boot drive."
She thought about it for a second and said, "Oh, yeah. I guess just make it the same as the RAM."
So this raises the questions: What do you make your swap now? When does your rule of thumb change? And remember when you could run a "fast" Linux box on a P100 with 64MB of RAM and 128MB of swap?
"Personally, I just try to keep my memory usage below the physical memory in my machine, but I guess that's not always possible..."
No, it isn't possible. With today's RAM prices I almost always have more physical RAM than the system requires. But due to aggressive VM swapping there are still hundreds of megs swapped out to disk when there is no need at all. This means that those applications, when their time does finally come, are slow because they must be retrieved from disk first. It's really annoying sometimes. Yet, even with excess RAM, turning off swap is disastrous.
I think developers could do more at a library level. For example... dare I suggest using common sub-libraries within libraries? That is, people like KDE and GTK get their heads together and say, "Are there functions we include in our libraries that could just as well be linked to an underlying library?"
Another reason to gradually and pro-actively swap things out, is that when another program later needs a lot of memory, your system doesn't come to a grinding halt because suddenly a lot of stuff has to be swapped out at once (followed by zeroing all that memory, since you don't want to have one program leaking data to another).
At least, that's the rationale I've read behind OS X's strategy of swapping things out long before all physical memory is used (and of keeping a pool of zeroed memory pages ready to fulfill most requests). Note that this does not require superfluous swap-ins if your reuse strategy is balanced properly, as the fact that something is swapped out doesn't mean that the memory which contained that data will be cleared/reused immediately (i.e., if it's needed again shortly afterwards, that page can be reactivated without having to go to disk).
Under most desktop OSes, programs can even give the system some hints about how they will use a memory region, e.g. via the madvise() system call.
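A rough sketch of what such a hint looks like from user space, using Python's `mmap.madvise()` (Python 3.8+, on platforms that provide the call). The function name `advise_region` and the choice of hints are only for illustration; the advice is a request, and the kernel is free to ignore it.

```python
import mmap


def advise_region(size=1 << 20):
    """Map `size` bytes anonymously and hint the expected access pattern.

    Returns the names of the hints the kernel accepted and the first bytes
    of the region, to show the data is untouched by the advice.
    """
    region = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file
    region[:5] = b"hello"
    applied = []
    for name in ("MADV_SEQUENTIAL", "MADV_WILLNEED"):
        advice = getattr(mmap, name, None)  # constant may be missing on some platforms
        if advice is None or not hasattr(region, "madvise"):
            continue
        try:
            region.madvise(advice)  # advisory only
            applied.append(name)
        except OSError:
            pass  # the kernel refused this hint for this mapping
    data = bytes(region[:5])
    region.close()
    return applied, data
```

MADV_SEQUENTIAL tells the kernel the region will be read front to back, and MADV_WILLNEED asks it to bring the pages in ahead of time.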
In modern Unices (including Linux) last I heard, the sticky bit is ignored since everything is simply demand paged.
Could not sticky bit be revived with some similar meaning? As in, "don't be too keen on paging these out?"
I don't mind the kernel swapping out "old" stuff to grow a huge disk cache. Really, that's OK; it makes things faster for disk-hungry processes, all right.
However, what I mind is the fact that the pages that are swapped out STAY there!
Why not age the disk cache the same way the RAM pages are aged? On an idle machine, the disk cache would gradually decay and be replaced by pages coming back from swap, and the machine would be all responsive again.
It means that if the user leaves for lunch and a cron wants to eat all the disk, with some luck, when the user gets back, his machine is as responsive as it was when he left.
I have a laptop with 192MB of RAM, and I always hate it when 2/3 of the RAM is "free" while it takes 10 seconds for the kmail window to move to the front, even if the machine has been idle for hours.
I even regularly do a "swapoff -a; swapon -a" to claim back the cache!
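One caveat with the swapoff trick: everything currently in swap has to fit back into RAM, or the swapoff will fail or push the box into an out-of-memory situation. A small Python sketch of a pre-flight check that parses /proc/meminfo text; the function names and the `margin_kb` parameter are hypothetical, and the fallback for kernels without MemAvailable is only a crude estimate.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def safe_to_swapoff(text, margin_kb=0):
    """True if the swap currently in use would fit into available RAM."""
    info = parse_meminfo(text)
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    # MemAvailable needs Linux 3.14+; older kernels get MemFree + Cached instead.
    fallback = info.get("MemFree", 0) + info.get("Cached", 0)
    available = info.get("MemAvailable", fallback)
    return swap_used + margin_kb <= available
```

Typical use would be `safe_to_swapoff(open("/proc/meminfo").read())` before running `swapoff -a` as root.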
AIX uses LRU today, so when you do a backup, the system tries to keep all filesystems in cache (well, whatever was read last!), and will happily swap your apps out to disk in order to do so (with the default tuning parameters).
I fondly remember the days when I was running Linux with no swap, none whatsoever...
Unfortunately the current crop of best guess VM managers end up denying the end user the experience of their computer's peak performance. Coupled with the horrible state of application bloat, modern 'state of the art' hardware and software combine to give us less and less in terms of overall performance. Software developers throw more code at the cpu to add functionality with little or no concern for performance. And hardware manufacturers add more and more 'special instructions' and 'pipelining' which the majority of software is completely unable to access. If anything it's more like a bunch of dysfunctional co-dependents than an industry that is cogent as to what really needs to be going on. If the folks dealing with processors and the application software could take a page from the gamers (look at the high levels of integration between game engines and video cards) for example, and more effort put into consolidating functionality in dlls and shared libraries; we would be amazed at how truly fast these machines could perform.
Actually, I haven't been very impressed by the whole swapping thing under Linux lately. I'm running 2.4.22 with a 400MB swapfile.
Some apps _can_ make the system unresponsive enough to ignore keystrokes, which is *very* annoying. At other times, xmms will stop playing while the disk goes crazy... Switching from emacs to Firefox after 10 minutes usually takes an extra 5 seconds to redraw the window and load all the stuff again.
Running GNOME2 on this laptop is also quite noisy on the disk. It swaps all the time...
This point is useful, but only if free RAM is at a premium. For the most part, on servers, there will be sufficient RAM to support the on board applications, and the amount of free RAM remaining will be able to handle the variable load of a standard workday, if the server has been sized properly ahead of time.
On desktops, however, where the number and type of apps can vary much more widely, the need for free RAM is much higher. However, pushing all apps off to swap space by default to keep as much RAM as possible free isn't necessarily an effective solution, since the OS will spend much more time performing swap operations than it necessarily should.
Perhaps a better solution might be an application where certain portions could be swapped out earlier, as they are less used, or not even loaded at all, while maintaining the core of the application in RAM, with "hooks" to the swapped-out areas to let the OS know that swap procedures are required. If the app could tell the OS "I probably won't need functions c, d & e, so go ahead and put them in the swap file", it might lead to a good balance of swapping and performance.
Your server apparently believed that it was accessing that cache and buffer more often than that half gig of random pages. Do you have real reason to believe that it was wrong, or does that just "seem" bad?
In other words, do you have actual numbers to demonstrate that your kernel was making poor decisions, or are you only fairly sure that it was?
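One way to get such numbers on 2.6 and later kernels is to sample the pswpin/pswpout counters in /proc/vmstat (pages swapped in and out since boot) before and after the operation that feels slow. A minimal Python sketch follows; the helper names are made up for illustration.

```python
def swap_counters(text):
    """Extract pswpin/pswpout from /proc/vmstat text as a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two samples."""
    return {
        key: after.get(key, 0) - before.get(key, 0)
        for key in ("pswpin", "pswpout")
    }
```

Read /proc/vmstat, bring the sluggish application to the front, read it again, and pass both samples through `swap_counters` and `swap_delta`. A large pswpin delta means the delay really was the application being paged back in from swap.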
I know what you mean, but in this case, it seems like your machine is making a reasonable guess: you haven't used Kmail in hours, so the odds of you wanting to resume using it at any particular instant are pretty low. On the other hand, reading from a drive is quite a bit faster than writing, so the penalty for incorrectly swapping out old pages when the system is idle is significantly less than incorrectly not swapping out old pages before users launch giant processes that want to allocate a lot of RAM very quickly.
> Without a swap file, the kernel has no place to
> stick memory segments that are rarely used.
Anyone who runs Mozilla on Windows 2000 knows that if you minimize Mozilla for half a day, despite having 756 MB RAM and not using more than 300-400 MB of it at any given time, bringing Mozilla back to the foreground takes anywhere from 2-6 seconds (depending on the speed of your disk), which is just idiotic on a 2 GHz home machine with that much RAM.
There is no reason whatsoever that the OS should be swapping out a userland application when there's tons of RAM. Sure as hell not to make room to disk cache all these freaking multimedia files I'm moving around; I need them in the disk cache like I need a hole in my head. I can put up with a tenth of a second delay in starting to play my mp3, and using disk cache on a 100MB Simpsons episode is just dumb. But I refuse to wait for a 30MB process to swap back in.
I've heard before that "you can NOT turn swap off on Windows 2000", but to hell with it, I think I'll try it when I get home tonight. I've got 756 MB of RAM, if the system crashes when I "hit the wall" so what, I don't think I will hit the wall. Any comments? Will Win2000 let me turn them all to zero min/max size? Anyone tried that before and know what the real actual implications are?
Hmmm, here's another idea: does anyone have a little Firefox extension whose job is to exercise the browser itself to prevent it from being swapped out?
I have only 256MB of memory in my Linux box, and I use no swap at all.
I run Mozilla, Gimp, Quake3Arena, etc, without a problem. I use Gkrellm to monitor the system, and I see the memory meter rarely goes above the middle.
But I don't run KDE nor Gnome, just Fluxbox.
The old rule of swap size = 2 times memory size is stupid. You just have to consider how much memory the system is likely to require at max load, and also how much swap it makes sense to have. For example, it's insane to have 1GB swap on a 128MB machine, because if it ever uses all that swap, it will surely be thrashing and slow as hell. Or slower.
I've heard that on a Windows box you want to set your max page file size to 1.5x RAM. So if you're running 512MB of RAM you want your page file to be 768MB. From what I've noticed with Linux, most distros seem to default to 1x RAM. I notice my Linux box hitting the swap file more than my Windows system. However, the Windows box is always using the page file, even when it has been idle all night. Albeit very lightly, it's still using it. When I switch over to my Linux system, no swap is being used until I open up an app. Is this consistent with what everyone is saying?
One time, I had disk corruption in the swap partition. When I booted the machine, everything went well, until I started opening applications. The machine swapped out more and more data, until it reached the first bad sector in the swap. It crashed quite spectacularly.
Once I figured out what happened, I replaced the disk.
That was in the days of the 2.0 kernel. My machine had 16 MB RAM IIRC.
Best performance improvement I ever got with the 2.4 series kernels was shutting off swap. My machine immediately became more responsive. From that point forward, I wouldn't come back to the machine after an hour away and encounter a jerky X mouse cursor because the instant I turned off the screensaver the kernel had to page all 128MB of my applications back into the 512MB RAM because it decided buffer cache was more important than code.
The 2.4 VM changes causing this behavior were awful, and it's too bad that I have to sacrifice a large (disk-based) physical address space, but I'm not going to put up with my applications being paged out when I have 4x as much RAM as code I'm running. Just allowing the system admin to put a limit on the size of the buffer cache would probably solve most of my problems, but instead I have to turn off swap. Too bad.
As I write this, my process stack (ps) lists 181 processes and KDE's window list contains 50 entries, many of which are Konqueror and Konsole each with multiple tabs open. Among the "bloatier" apps currently running are OpenOffice, Acroread, VMWare Workstation (though, a VM is not currently running), Tomcat 5/JWSDP 1.3, Umbrello (KDE's UML modeller), jEdit, and 2 database systems (Firebird and IBM U2). This, incidentally, represents a typical "snapshot" of my two workstations (both AMD XP 2200+, w/1GB), at most times. Despite all of this, my current memory usage is 990MB w/ 180MB in buffers/cache, and only 320MB in swap (which is mostly holding my WinXPPro dormant VM!).
:-p
The bottom line: "swapping" got you down? Buy some more RAM! Prices may be on the rise, but it's still likely cheaper than your time.
The universal IT answer of "It depends" applies here as well. Yes, having Mr. Bloaty App glob onto scads of memory that are then not referenced for long periods of time can have a negative impact on other apps if the system becomes memory constrained. And, yes, if the memory manager swaps a bunch of unreferenced memory out to disk and Mr. User has to wait a long time for Mr. Bloaty App to become responsive because it was his memory that got swapped out, that's a problem, too. The ideal is to be able to address this (haha, bad pun) at the application level and not simply at a global level. This has been the standard on the mainframe (MVS, OS/390, z/OS) operating systems for a long time, where there is a very sophisticated virtual memory manager. If there are, say, 100 apps, 2 of them are very sensitive to response time, most of them aren't, and 10 are just dead dogs you couldn't care less about, how nice is it to be able to actually tell the system that? The 2 "loved ones" then receive preferential storage treatment at the expense of the other, "less loved ones", and the dead dogs are always first in the pecking order of who to steal storage from. The memory manager then is acting to maintain the responsiveness of the applications (the reasons we run OSes in the first place) to meet the needs and expectations of the user(s) (the reasons we run the apps). Without that ability, arguing over "more swappy" vs. "less swappy" when it's only applied at a global, default level is not especially productive, except within the context of attempting to establish, perhaps, where the best general-use default happy setting is, for the general-use default system we all use (is that you? I know it's not me).
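As a toy illustration of the "pecking order" idea (not how the mainframe memory managers actually do it; the class names and the `Page` record are invented here), storage could be stolen from the least important class first, and within a class from the least recently used page:

```python
from dataclasses import dataclass

# Hypothetical importance classes; lower numbers are stolen from first.
DEAD_DOG, NORMAL, LOVED = 0, 1, 2


@dataclass
class Page:
    owner: str        # application that owns the page
    importance: int   # one of DEAD_DOG, NORMAL, LOVED
    last_used: float  # time of last reference


def steal_order(pages):
    """Order pages for eviction: least important class first, then oldest."""
    return sorted(pages, key=lambda page: (page.importance, page.last_used))


def pick_victims(pages, count):
    """Return the first `count` pages that would be pushed out."""
    return steal_order(pages)[:count]
```

With a policy like this, a "loved" application keeps its working set until every dead dog and every ordinary application has already been squeezed, no matter how long the loved one has sat idle.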
And I'd like to point out that valgrind has patent-related legal problems which have (frustratingly) kept Red Hat from including it in their distribution.
Can someone please describe any adaptive algorithms that could be used? Specifically, I'm thinking of:
- dirty-marking unreferenced pages when swapped. If these memory pages are not used after the swap-out, there is no need to swap them in again. I'm pretty sure this already occurs.
- for processes with high swap demands, increasing their weighted priority for pages, with a window average of swaps. So then my database process could hog under load while my less-used apps may swap because they're used less often. This could be tailored differently for code versus data segments.
- page-image comparisons to avoid holding duplicate code segment pages in memory. This plays with the concept of shared libs a bit, but could avoid duplicate pages, especially if this information is saved in a precalculated hash table that is stored.
just ideas.
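The last idea can be sketched in a few lines of Python: hash every page-sized chunk and report the ones that show up more than once. This is only a toy under assumed names (`find_duplicates`, a 4096-byte page). A real kernel would also have to compare the bytes on a hash match and mark merged pages copy-on-write.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size


def split_pages(blob, page_size=PAGE_SIZE):
    """Cut a memory image into page-sized chunks."""
    return [blob[i:i + page_size] for i in range(0, len(blob), page_size)]


def find_duplicates(images, page_size=PAGE_SIZE):
    """Map page digest -> [(image name, page index)] for pages seen more than once."""
    seen = {}
    for name, blob in images.items():
        for index, page in enumerate(split_pages(blob, page_size)):
            digest = hashlib.sha256(page).hexdigest()
            seen.setdefault(digest, []).append((name, index))
    return {digest: where for digest, where in seen.items() if len(where) > 1}
```

Each entry in the result is a set of pages that could, in principle, be backed by a single physical page.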
It is a difficult dilemma, but that's because an overly complicated scheme is used.
There is a simpler and more powerful scheme that unifies swapping and disk caches, while allowing applications to persist between reboots, all with better performance than current systems!
EROS implements such a system. Generally it is referred to as "Orthogonal persistence", and functionally it behaves as though the computer is "always on", and returns to the exact state it was in after a reboot. The thing is, with orthogonal persistence, the structure on the disk is not a file system, but just the application data.
Since applications no longer work with the disk explicitly (open/read/write) but only with one type of memory (persistent memory), the OS manages all of the disk I/O, and it allows it to eliminate almost completely the largest delay in disk-work - the seek time in all writes. Since all application memory is just mapped to disk transparently, all RAM is just considered a "disk cache", and the kernel does not have to make nasty tradeoffs between disk caches (of explicit open/read/write calls) and virtual memory.
Of course there is still a problem if large work-areas of unimportant applications "swap out" smaller areas of important applications. I suggest solving that by prioritizing pages to the memory manager. In a system like *nix it is not a problem. In more secure systems however (EROS, for instance), it may create additional covert channels between applications so it was avoided.
My approach has been to start all the needed services and then run this small Perl script (which I named memhog.pl) to create a process that hogs quite a bit of memory:
#!/usr/bin/perl -w
use strict;
# Build one huge string (10 bytes x 131*1024*1024, roughly 1.3 GB) to push idle pages out to swap.
my $hog = "xxxxxxxxxx" x (131 * 1024 * 1024);
This is just a quick hack; you may want to adjust the size to suit your memory size. The server this script was copied from has 2GB of memory. Essentially I want to page out all the stuff that doesn't get used after starting the server and the related server processes. Of course, given enough time the server would swap out those pages anyway, but this method just does it quicker. After the script has been run, the server will gradually swap in those pages it really needs. OK, doing this may be pointless but I don't care ;)
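For anyone adjusting the size: the script's string is 10 bytes repeated 131 * 1024 * 1024 times, which works out to about 64% of the 2GB in that server. A small Python sketch of the arithmetic follows; the 0.64 default is just that observed ratio, not a recommendation, and the function names are made up.

```python
def original_hog_bytes():
    """Bytes allocated by the Perl one-liner: 10 * 131 * 1024 * 1024."""
    return 10 * 131 * 1024 * 1024


def repeat_count(mem_total_bytes, fraction=0.64, chunk_len=10):
    """Repeat count for a chunk_len-byte string that fills `fraction` of RAM."""
    return int(mem_total_bytes * fraction) // chunk_len
```

`repeat_count(2 * 1024**3)` gives the multiplier to use in place of `131 * 1024 * 1024` on a 2GB machine. Pass a different total, or a smaller fraction, for other boxes.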
Swapping out any semi-active process, especially an interactive one, is utter foolishness which should be subject to termination grounds (unless of course you work for a big iron computer company that needs to boost revenue by purposefully slowing systems down). Any time you decide to clear memory and page a task out ... you are creating two I/O's ... one out, one back in ... and doing it with shit poor locality since the swap area is probably at the other end of the disk from any other active I/O .... just discarding entries in the file cache only requires a single I/O to recover it. .... so tell me ... who is the idiot that believes two I/O's to kill the performance of an interactive process is better?
Here's a solution to the whole debate -- make the sticky bit have meaning under Linux like it does on other UNIXen -- if the sticky bit is set on the executable, do not swap it. If it is not set, the executable is free to be swapped. This solves the entire debate (for instance, if you don't want the 'interactive' mozilla process swapped, set the sticky bit on the executable).
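Short of reviving the sticky bit, a process can already ask for some of its own memory to stay resident with mlock(2) (or mlockall(2) for the whole address space). A hedged Python sketch via ctypes follows; `lock_buffer` is a made-up helper, it pins only the one buffer it allocates, and the call can be refused when RLIMIT_MEMLOCK is too low.

```python
import ctypes
import ctypes.util


def lock_buffer(size=4096):
    """Allocate `size` bytes and try to pin them in RAM with mlock(2).

    Returns (buffer, locked). `locked` is False when mlock is unavailable
    or the kernel refuses the request.
    """
    buf = ctypes.create_string_buffer(size)
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
        libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
        libc.mlock.restype = ctypes.c_int
        locked = libc.mlock(ctypes.addressof(buf), size) == 0
    except (OSError, AttributeError):
        locked = False
    return buf, locked
```

This is opt-in by the application rather than a flag the admin sets on the executable, so it does not give you the "never swap mozilla" switch by itself, but it shows the mechanism the kernel already offers.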
We either need some way of expressing this in code in a way that's exposed to the OS so it can avoid paging it out (and don't say 'well, just lock the pages containing it', think about how GUIs code & data are arranged--i.e. OO), or, perhaps instead of LRU, paging should pay attention to when a page was last used relative to an interaction--if you use some menu to bring up a dialog that triggers a long, memory-intensive process, that menu may have been used longer-ago than stuff happening as a result, but it's going to be used again first--LRU is the wrong model. (Also consider the interactive parts of other apps that you have open windows for.) Maybe it should still get swapped out, but swapped back in when some memory is freed up--'this thing was used very soon after an interaction, so it should be in memory if possible'. I don't know if any OSes attempt to page-in from swap before something is requested. Aren't some processors these days trying to prefetch memory requests based on patterns of memory access (not just prefetching code memory)? Same sort of idea.
Tuning for interaction isn't new to OSes; the VMS operating system's process scheduler treated 'interactive' apps differently from 'batch' apps (where, if I remember correctly, an app was interactive if it paused for I/O before using up its timeslice). I dunno if Unixen or Windows do things like that.