Tuning Linux VM swapping
Lank writes "Kernel developers started discussing the pros and cons of swapping to disk on the Linux Kernel mailing list. KernelTrap has coverage of the story on their homepage. Andrew Morton comments, 'My point is that decreasing the tendency of the kernel to swap stuff out is wrong. You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful.' Personally, I just try to keep my memory usage below the physical memory in my machine, but I guess that's not always possible..."
First swap!
"You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful."
I absolutely despise the way that XP swaps out applications in order to make the disk cache larger. I have 1GB of RAM on my machine precisely so I don't have to wait two minutes for it to swap my web browser back in after it's swapped out... yet if I copy a 2GB file from one drive to another, the stupid operating system will swap out all the applications it can just to make the cache larger.
Please, please, don't take Linux down the same braindead route as Microsoft has done for XP. It's utterly insane to swap out my browser so that a 2GB file can be copied two seconds faster when I then have to wait two minutes for the browser to swap back in. Or at least provide some kind of '#define STOP_VM_SWAPPING_STUPIDITY' so that I can disable it.
Memory access vs. disk access I mean?
Back when P90s were the norm, was RAM access about as fast as disk access is today?
-- jaf
She had just procured a new Sun machine with 2 GB of RAM. Mind you, disk space hadn't grown all that significantly and you could still get machines with 9 GB drives.
The original practice was to make swap 2xRAM. So when the student she had putting the machine together came to her and said, "What do I make swap?" she responded, "Twice the RAM."
He said, "Are you sure? That's like almost half the boot drive."
She thought about it for a second and said, "Oh, yeah. I guess just make it the same as the RAM."
So this raises the questions: What do you make your swap now? When does your rule of thumb change? And remember when you could run a "fast" Linux box on a P100 with 64MB of RAM and 128MB of swap?
I talk about stuff.
Personally, I just try to keep my memory usage below the physical memory in my machine, but I guess that's not always possible..."
No, it isn't possible. With today's RAM prices I almost always have more physical RAM than the system requires. But, due to aggressive VM swapping, there are still hundreds of megs swapped out to disk when there is no need at all. This means that those applications, when their time does finally come, are slow because they must be retrieved from disk first. It's really annoying sometimes. Yet, even with excess RAM, turning off swap is disastrous.
Personally, I just try to keep my memory usage below the physical memory in my machine, but I guess that's not always possible..."
No, it isn't really. Unless you don't use your computer.
In some cases it makes sense to use your physical memory as disk cache rather than for unused applications.
Swap out that sshd, and give the database server more memory. Swap out that screensaver and email client, give quake more.
So, what... I want my apps paged out to disk so that I can wait for them to be loaded back in when I switch over from Mozilla to Open Office?
Government of the people, by corporate executives, for corporate profits.
I think developers could do more at a library level. For example... dare I suggest using common sub-libraries within libraries? That is, people like KDE and GTK get their heads together and ask, "Are there functions we include in our libraries that could just as well be linked to an underlying library?"
And if you thought that was boring you obviously haven't read my Journal ;-)
Personally, I just try to keep my memory usage below the physical memory in my machine, but I guess that's not always possible..."
I keep my memory usage much below the total RAM on the servers, but in real life, the machine still swaps. This is because even though the machine NEVER needs more RAM than is available at any given time, over a period of days it will use more than the available RAM. It pages out the old data that was used 12 hours ago.
Unless you reboot every day (as in a client machine) you will use swap on just about any machine. Using swap is not bad. Using swap for a currently running application is not so good. This isn't a bug, it's a feature. Reading data from swap after it has been accessed is still faster than reading new data from the drives, especially if it's a network drive.
Tequila: It's not just for breakfast anymore!
You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine...
Why not? BloatyApp, if it's that bloaty, is probably an object oriented program with template instantiation (or is by Micro$oft); these programs are notoriously huge, but also have notoriously poor locality of reference. The user will get better perceived response if you can keep more of BloatyApp resident.
If there's space in memory, I don't see the point of pre-emptively ejecting as many LRU pages of BloatyApp as possible. (Of course, I haven't RTFA, but this is /. so you're not supposed to!)
Ah yes. It's all the fault of bloaty apps. Apps like database daemons and high-traffic httpd daemons. We've turned swapping off on our servers because we were sick of seeing almost a GB of cache/buffer memory, while it was swapping 500MB of shit to disk. Want a bloaty app? How about the linux Kernel? I love the thing, but Jesus Tapdancing Christ it would rather swap our starting DB process to disk, than free up the fucking buffers and cache. Is there something wrong with wanting it to give precedence to not swapping?
With read-only & demand code-page loading and copy-on-write even bloatware really doesn't eat memory. And bloatware has to be frequently restarted to recover the memory it leaks.
Sure, there are some jobs that need swap -- lots of seldom used memory pages.
But not mine. I prefer to save myself the complexity and performance headaches.
Yeah there is nothing I love more than coming to an idle X console session on box I haven't touched in a while and watching it grind itself into oblivion because everything has been paged out.
At what point does VM stop meaning Virtual Machine and start meaning Virtual Memory.
Or is it just the Virtual "M"?
Unless I'm using something that is very memory intensive (e.g., VMware), I just turn my swap partition off. It allows me the freedom to run Evolution/XMMS/etc. quite quickly when I'm not also using VMware, and still run VMware at a good speed just by turning the swap partition back on when I need it.
What about disk corruption? You swap your memory to disk, it gets mis-written, and you read it back in and now your memory is corrupted!
About 2 years ago I discussed this issue with an OS guru. He was of the mindset that you should always have swap space = 10x memory.
I find that Linux just isn't that good at paging. I never use a significant portion of my 2GB swap partition, and memory contention is still high sometimes. Hmm... Maybe I do need to adjust the swappiness number.
Where law ends, tyranny begins -- William Pitt
Another reason to gradually and pro-actively swap things out, is that when another program later needs a lot of memory, your system doesn't come to a grinding halt because suddenly a lot of stuff has to be swapped out at once (followed by zeroing all that memory, since you don't want to have one program leaking data to another).
At least, that's the rationale I've read behind OS X's strategy of swapping things out long before all physical memory is used (and of keeping a pool of zeroed memory pages ready to fulfill most requests). Note that this does not require superfluous swap-ins if your reuse strategy is balanced properly, as the fact that something is swapped out doesn't mean that the memory which contained that data will be cleared/reused immediately (i.e., if it's needed again shortly afterwards, that page can be reactivated without having to go to disk).
Under most desktop OSes, programs can even give the system some hints about how they use a memory region, e.g. with the madvise() system call.
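For anyone curious what such a hint looks like in practice, here is a minimal sketch in Python (3.8 or later), which exposes madvise() on mmap objects. The helper name give_hint is made up for illustration, and the sketch assumes a platform where the chosen MADV_* constant exists; where it does not, the helper simply reports that no hint was given. The hint is advisory only: the kernel is free to ignore it.

```python
import mmap


def give_hint(advice_name: str = "MADV_SEQUENTIAL",
              length: int = 4 * mmap.PAGESIZE) -> bool:
    """Map anonymous memory, touch it, and pass one usage hint to the kernel.

    Returns True if the hint was passed on, False if this platform
    (or Python build) does not offer madvise() or the requested constant.
    """
    buf = mmap.mmap(-1, length)          # anonymous mapping, not backed by a file
    try:
        buf[:5] = b"hello"               # touch the first page so it is resident
        advice = getattr(mmap, advice_name, None)
        if advice is None or not hasattr(buf, "madvise"):
            return False                 # hint not available here
        buf.madvise(advice)              # advisory: the kernel may ignore it
        return True
    finally:
        buf.close()


if __name__ == "__main__":
    print("hint given:", give_hint())
```

MADV_SEQUENTIAL and MADV_RANDOM only describe the access pattern; other values such as MADV_DONTNEED have stronger, platform-specific effects, so check the local man page before using them.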
Gotta love this guy. In articles people usually try to impress you with their brilliance. This man is a kick in the pants! I love Andrew Morton's style.
Harpo Tunnel Syndrome--my wrist feels funny.
Under the performance tab you can use the slider to tune the machine for 'foreground' or 'background' apps.
WURD!!
echo 0 > /proc/sys/vm/swappiness
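That one-liner needs root and takes effect straight away. To look at the current value before changing it, something like the following Python sketch should do. The helper name read_swappiness is made up for illustration, and it assumes a kernel that exposes /proc/sys/vm/swappiness; on anything else it just returns None.

```python
from pathlib import Path

# Location of the tunable on kernels that provide it (assumption: Linux 2.6+).
SWAPPINESS = Path("/proc/sys/vm/swappiness")


def read_swappiness(path: Path = SWAPPINESS):
    """Return the current swappiness as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected contents.
        return None


if __name__ == "__main__":
    value = read_swappiness()
    if value is None:
        print("swappiness is not available on this system")
    else:
        print("vm.swappiness =", value)
```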
In modern Unices (including Linux) last I heard, the sticky bit is ignored since everything is simply demand paged.
Could the sticky bit not be revived with some similar meaning? As in, "don't be too keen on paging these out"?
So, for standard desktop usage (i.e. XMMS + Mozilla + Vim in KDE), will most users with normal systems (say, 256MB RAM) experience better performance with a swap drive, or with swap deactivated, as some posters here advocate?
You could always just tune the cache down to bugger all. It's one of the kernel parameters.
Government of the people, by corporate executives, for corporate profits.
Which kernel and what parameter? We've tried sysctl settings up the wazzu, but nothing seemed to actually change how it handled swapping.
is to do something like AIX does, where I can use "vmtune" to customize the percentages of memory I devote (hard or soft limit) to filesystem pages or computational pages. This way I can tune for my Bloatware, tune for file copying a la XP, or tune for my DBMS, whatever suits me.... The developers could take it one step further and provide a simple, understandable (as opposed to AIX's) interface for configuration......
We do understand that paging is different than swapping, and that Solaris has changed the memory allocators and algorithms multiple times across releases right?
..
That said, you might want to look into a recent Solaris Internals book or course, and also look into the history of things like priority_paging and page coloring
/proc/sys/vm/swappiness
Unlike, say, the latest versions of AIX, where the OS gives higher priority to the cache than to apps, and you have to go deep into the tuning manual to read that the default probably doesn't suit many situations, please adjust maxperm etc...
I don't mind the kernel swapping out "old" stuff to grow a huge disk cache. Really, that's OK; it makes things faster for disk-hungry processes all right.
However, what I mind is the fact that the pages that are swapped out STAY there!
Why not age the disk cache the same way the RAM pages are aged? On an idle machine, the disk cache would gradually decay and be replaced by the pages back from the swap, and the machine would be all responsive again.
It means that if the user leaves for lunch and a cron wants to eat all the disk, with some luck, when the user gets back, his machine is as responsive as it was when he left.
I have a laptop with 192MB of RAM, and I always hate it when 2/3 of the RAM is "free" while it takes 10 seconds for the kmail window to move to the front. Even if the machine has been idle for hours.
I even regularly do a "swapoff -a; swapon -a" to claim back the cache!
Without a swap file, the kernel has no place to stick memory segments that are rarely used. They stay in resident memory la-la land until the process is terminated. Those segments add up over time and erode the memory available to the page cache.
Page caches are wonderful. When you load an application (like Firefox), you're not just getting the web browser. You're firing up a large chain of shared objects/DLLs that support the widgets, I/O, and components of the application. All of these components must be read into memory anyhow for program operation, so the kernel tends to just leave it in there for future use (the page cache).
When you shut down Firefox, you also release the need for those libraries (provided nothing else is using them). Their mappings go away with the process, but the pages stay in the page cache. If you load another application (like Thunderbird) that uses the same type of libraries, the kernel will not have to go to disk in order to fetch those libraries. It will instead opt for the page cache contents.
Turning off the swap file in the historic era of VM infancy was the best way to remove the hard drive bottleneck from system. The operating systems of yester-year did not have good page cache schemes that took advantage of all that unused memory. It is a little different now.
Applications are so modularized that they are broken up into billions of smaller libraries so that code can be shared. This increases memory efficiency by keeping a shared library resident for multiple processes. These libraries are frequently accessed, more often than many people realize. Getting THOSE into memory is better than making sure my 500+ Linux applications stay resident.
Notice that on a web server with 1GB of RAM the Linux kernel is still putting things out to swap. These processes that stay asleep for long periods of time do not need to waste the memory that page cache is currently using (892309504 bytes or 753.7MB). What would be stored in that 753.7MB of memory? The database that drives the website (instead of having to seek the disk). The entire web page hierarchy used to display pages on the web site. All the scripts that are used to display dynamic content on the web site (etc. etc.)
Now, if we subtracted from the page cache the amount of memory that was stored in the swap file, we would have over 200MB less that we could keep cached in memory. That could be an entire database that the kernel would then waste needless CPU cycles to fetch from disk.
The only advantage to turning off a swap file on these modern machines would be for a machine that runs only a select few applications, and not having a lot of processes in the background doing things.
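To put numbers on that trade-off for your own box, here is a rough Python sketch that compares page cache plus buffers against swap in use. It assumes the usual /proc/meminfo layout of "Key: value kB" lines; the function names and the SAMPLE figures are invented for illustration and are not the numbers from the web server above.

```python
def parse_meminfo(text: str) -> dict:
    """Turn /proc/meminfo-style 'Key:   value kB' lines into {key: bytes}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields or not fields[0].isdigit():
            continue                      # skip anything that is not 'Key: number'
        value = int(fields[0])
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024                 # meminfo reports most values in kB
        info[key.strip()] = value
    return info


def cache_vs_swap(info: dict) -> tuple:
    """Return (bytes in page cache + buffers, bytes of swap in use)."""
    cache = info.get("Cached", 0) + info.get("Buffers", 0)
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return cache, swap_used


# Hypothetical figures, only here so the sketch runs on any OS.
SAMPLE = """\
MemTotal:  1048576 kB
Buffers:     51200 kB
Cached:     819200 kB
SwapTotal: 2097152 kB
SwapFree:  1892352 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE                     # fall back to the made-up sample
    cache, swap_used = cache_vs_swap(parse_meminfo(text))
    print(f"cache+buffers: {cache / 2**20:.1f} MiB, swap used: {swap_used / 2**20:.1f} MiB")
```

If swap used is small next to cache plus buffers, the kernel has traded a little idle process memory for a lot of cached file data, which is the situation described above.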
Ayup
AIX uses LRU today, so when you do a backup, the system tries to keep all the filesystem data in cache (well, that's what was read last!!), and will happily swap your apps out to disk in order to do so (with the default tuning parameters).
I fondly remember the days when I was running Linux with no swap, none whatsoever...
Unfortunately the current crop of best guess VM managers end up denying the end user the experience of their computer's peak performance. Coupled with the horrible state of application bloat, modern 'state of the art' hardware and software combine to give us less and less in terms of overall performance. Software developers throw more code at the cpu to add functionality with little or no concern for performance. And hardware manufacturers add more and more 'special instructions' and 'pipelining' which the majority of software is completely unable to access. If anything it's more like a bunch of dysfunctional co-dependents than an industry that is cogent as to what really needs to be going on. If the folks dealing with processors and the application software could take a page from the gamers (look at the high levels of integration between game engines and video cards) for example, and more effort put into consolidating functionality in dlls and shared libraries; we would be amazed at how truly fast these machines could perform.
"Can there be a Klein bottle that is an efficient and effective beer pitcher?"
I don't know if Linux does this at all, but it seems that one useful VM strategy would be to copy to disk rather than swap to disk. That way you can continue to run bloaty app without swapping it back in, but interactivity is still good since you can reuse memory immediately without needing to swap stuff out at the point the demand occurs. Of course it'd need to be tunable and/or smart (no point copying highly volatile areas of memory for a start).
Actually, I haven't been very impressed by the whole swapping thing under Linux lately. I'm running 2.4.22 with a 400MB swapfile.
Some apps _can_ make the system unresponsive enough to ignore keystrokes, which is *very* annoying. At other times, xmms will stop playing while the disk goes crazy... Switching from emacs to Firefox after 10 minutes usually takes an extra 5 seconds to redraw the window and load all the stuff again.
Running GNOME2 on this laptop is also quite noisy on the disk. It swaps all the time...
You probably want to try Andrew Morton's kernel patches. I've got some servers running his kernel patches and it's nearly impossible to make the damn things swap. Machines with uptimes of 4 months and more, and top shows swap file usage at 0 kilobytes. His kernel patches generally give quite a performance gain. I don't know how they would react under an environment of older machines though.
This point is useful, but only if free RAM is at a premium. For the most part, on servers, there will be sufficient RAM to support the on board applications, and the amount of free RAM remaining will be able to handle the variable load of a standard workday, if the server has been sized properly ahead of time.
On desktops, however, where the number and type of apps can vary much more widely, the need for free RAM is much higher. However, pushing all apps off to swap space as a default to keep as much RAM as possible free isn't necessarily an effective solution, since the OS will spend much more time performing swap operations than it necessarily should.
Perhaps a better solution might be an application where certain portions could be swapped off earlier, as they are less used, or not even loaded at all, while maintaining the core of the application in RAM, with "hooks" to the swapped out areas, to let the OS know that swap procedures are required. If the app could tell the OS "I probably won't need functions c, d & e, so go ahead and put them in the swap file", it might lead to a good balance of swapping and performance.
Am I way off in lala-land if I humbly suggest that perhaps an application-dependent swap strategy should be implemented?
Perhaps I'm describing a 'per application/library swappiness setting'
So people can specify things like:
-I want the webbrowsers main components to live in main memory forever, or at least only swapped out if a starting app really really needs it
-I want 'tar' and any of its data to have less (not 0, less) priority.
-This app I only run occasionally still has personal priority for me
-This game (on linux? eh..) has number one priority and is allowed to choose its own swapping desires..
It seems that the original thread was not about swapping in or out, but about the amount of cache that is used by the kernel.
I just have the same problem. I have 2G RAM, and I run my KDE desktop, some standard server programs and some UML instances.
When I create UML instances (eg. 8G image) then my memory gets full and is not easily reclaimed.
I agree with the philosophy of the buffers and the cache, to speed up IO operations for recently accessed files, but I do not agree with the time that they are in memory.
I do not know if the VM subsystem knows about the three kinds of memory in use, but I think it should, and I also think that it should first of all free up buffers and cache memory before starting to swap out applications.
For me this just means that if you look at memory usage with any application, that after some time you should see the amount of memory used as filesystem cache gradually decrease over time (ie. when not written to, the system should just mark those pages as free).
Your server apparently believed that it was accessing that cache and buffer more often than that half gig of random pages. Do you have real reason to believe that it was wrong, or does that just "seem" bad?
In other words, do you have actual numbers to demonstrate that your kernel was making poor decisions, or are you only fairly sure that it was?
Dewey, what part of this looks like authorities should be involved?
You can mitigate it with the ulimits to some degree, but it would also be useful to have more fine-grained control of swap usage. "Swappiness" indeed.
How about max-swap-bytes-per-process or max-cache-demand-swap-bytes?
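For the ulimit part, a minimal Python sketch using the standard resource module (Unix only). It caps the address space of the current process and anything it launches afterwards; that limits how far a runaway program can grow, but it is not a per-process swap limit, which is the thing being asked for above. The helper names are made up for illustration.

```python
import resource


def clamped_limit(requested: int, hard: int) -> int:
    """Never ask for a soft limit above what the hard limit allows."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def cap_address_space(requested_bytes: int) -> None:
    """Lower the soft RLIMIT_AS for this process and its future children."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    new_soft = clamped_limit(requested_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))


if __name__ == "__main__":
    # Show the current limits; call cap_address_space(...) before launching
    # the program you want to fence in.
    print("RLIMIT_AS (soft, hard):", resource.getrlimit(resource.RLIMIT_AS))
```

Allocations beyond the cap fail with an out-of-memory error inside the offending process instead of pushing everyone else out to disk.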
I've seen several OS X applications run away into swap in my time. The whole machine is rendered pretty much unusable until it either exhausts the disk or its address space. For interactivity's sake, I think it could be suggested that they are not taking the right approach or there is more work to be done and the status quo severely hurts the user experience.
A separate issue is that OS X does (or at least did in historical versions) bad bad things when it ran out of disk. And dynamic_pager's behavior tended to have the result of eating the whole disk if anything went wrong, leading to even larger problems. Serious bad mojo time.
No, no, a TLA is a Three Letter Acronym!
Whatever swapping scheme is used in Windows, I do not know, and I don't care what it's called either.
What I can't stand is the fact that I've got >300MB of free physical memory, and 20MB of the kernel is still swapped. Result? Do this, do that (any minor thing) and you have to wait for it to swap in.
In the end, I have never ever seen a Windows-system without a partially swapped kernel, even with tons of free RAM available.
This is just plain stupid, or is there some sort of "smart" explanation for this?
I, for one, would hate having to turn off virtual memory just to have the system kernel loaded at all times... And GOD BE DAMNED if Linux takes the same stupid design decision.
Not Buzzword 2.0 compliant. Please speak english.
I know what you mean, but in this case, it seems like your machine is making a reasonable guess: you haven't used Kmail in hours, so the odds of you wanting to resume using it at any particular instant are pretty low. On the other hand, reading from a drive is quite a bit faster than writing, so the penalty for incorrectly swapping out old pages when the system is idle is significantly less than incorrectly not swapping out old pages before users launch giant processes that want to allocate a lot of RAM very quickly.
Dewey, what part of this looks like authorities should be involved?
So what you want to do is:
So if the guy goes to lunch leaving a big make running, it gradually pushes the big apps out while it runs. But if the big make completes, the apps start crawling slowly back in. If it hasn't finished when he comes back from lunch, he probably wants it to carry on running the make: since the CPU is at 100% load, he is probably not surprised it is sluggish.
Consciousness is an illusion caused by an excess of self consciousness.
I have only 256MB of memory in my Linux box, and I use no swap at all.
I run Mozilla, Gimp, Quake3Arena, etc, without a problem. I use Gkrellm to monitor the system, and I see the memory meter rarely goes above the middle.
But I don't run KDE nor Gnome, just Fluxbox.
The old rule of swap size = 2 times memory size is stupid. You just have to consider how much memory the system is likely to require at max load, and also how much swap it makes sense to have. For example, it's insane to have 1GB swap on a 128MB machine, because if it's to use all that swap, sure it will be thrashing and slow as hell. Or slower.
http://00f.net/item/14/ describes why swapping is _good_.
Remember, boys and girls, that stuff like virtual memory, swap files and disk caches are workarounds and hacks for limited hardware resources, whether it be size constraints or limits on data throughput.
When these limits no longer apply then it's time to rethink our needs for these hacks and THEY are hacks.
The fact that lots of people do something doesn't make it right.
I've heard that on a Windows box you want to set your max page file size to 1.5x RAM. So if you're running 512MB of RAM you want your page file to be 768MB. From what I've noticed with Linux, most distros seem to use 1x RAM as the default. I notice my Linux box hitting the swap file more than my Windows system. However, the Windows box is always using the page file, even in idle situations when it's been doing nothing all night. Albeit very little, it's still using it. When I switch over to my Linux system... no swap is being used until I open up an app. Is this consistent with what everyone is saying?
See Sig! See Sig Zig! Zig Sig Zig!!!!!
Yeah there is nothing I love more than coming to an idle X console session on box I haven't touched in a while and watching it grind itself into oblivion because everything has been paged out.
What's wrong with that? The machine shouldn't sit and be ready for every possibility. If you've not logged in in X hours, it shouldn't expect you to suddenly log in. It grinds for a minute or so, and then it's fast again cos it's got what you want to use in RAM. Which is the whole point of swap, stuff you're not using gets paged out.
Why would it leave it in swap after it exited? It's just as fast to read the program from its usual place on disk as to read it from swap.
I think I'm right in saying that programs don't get swapped out as such anyway (probably for that reason) and all that gets swapped out is the memory allocated by it, and the sticky bit meant "don't swap out this program's data".
Best performance improvement I ever got with the 2.4 series kernels was shutting off swap. My machine immediately became more responsive. From that point forward, I wouldn't come back to the machine after an hour away and encounter a jerky X mouse cursor because the instant I turned off the screensaver the kernel had to page all 128MB of my applications back into the 512MB RAM because it decided buffer cache was more important than code.
The 2.4 VM changes causing this behavior were awful, and it's too bad that I have to sacrifice a large (disk-based) physical address space, but I'm not going to put up with my applications being paged out when I have 4x as much RAM as code I'm running. Just allowing the system admin to put a limit on the size of the buffer cache would probably solve most of my problems, but instead I have to turn off swap. Too bad.
As I write this, my process stack (ps) lists 181 processes and KDE's window list contains 50 entries, many of which are Konqueror and Konsole each with multiple tabs open. Among the "bloatier" apps currently running are OpenOffice, Acroread, VMWare Workstation (though, a VM is not currently running), Tomcat 5/JWSDP 1.3, Umbrello (KDE's UML modeller), jEdit, and 2 database systems (Firebird and IBM U2). This, incidentally, represents a typical "snapshot" of my two workstations (both AMD XP 2200+, w/1GB), at most times. Despite all of this, my current memory usage is 990MB w/ 180MB in buffers/cache, and only 320MB in swap (which is mostly holding my WinXPPro dormant VM!).
:-p
The bottom line: "swapping" got you down? Buy some more RAM! Prices may be on the rise, but it's still likely cheaper than your time.
"LinuX - Dropping the c u r t a i n on Windoze." -- Vee Schade, vschade at mindless dot com
- The process needs to read the page - no problem, one copy is in RAM just read it, and keep both copies.
- The process needs to write the page - no problem, you can modify the copy in RAM and discard the copy on disk. Notice that discarding the copy on disk doesn't require any disk access, as the list of swap allocations will typically be in RAM (it is much smaller than the swapspace).
- You actually need memory - no problem, discard a not recently used RAM page, you still have a copy on disk.
The only problem is that you need to make the page read-only, so you can trap the write and discard the on-disk copy. In other words, don't do this for pages that are frequently changed. But usually you don't have many pages that are frequently changed, and you certainly don't want to swap out those you have. And should you occasionally happen to swap one out, it is not really a major problem. It will cost you a page fault, but no disk I/O. And a page fault is cheap compared to a disk I/O. A system that behaves like I have described here would use a lot more space than Linux typically does, but still it should be faster. I wonder why this isn't done more often; it is not like the idea hasn't been known for years.
Another problem that many have noticed, and that isn't easy to deal with, is heavy disk access causing the cache to grow and stuff getting swapped out. Yes, even some Linux versions suffer from this problem. A Red Hat 9 system I had running for months was really slow in the morning, because all the programs had been swapped out while cron jobs were running during the night. But you never know when it is a good idea to swap the stuff out and when it is not. When the disk access is going on, the process page might not have been used for hours. But still you might want it to be kept in RAM. File pages that have been accessed just once shouldn't be kept in cache for a long time. But of course you shouldn't remove them unless the memory is needed for something else. Removing the pages too early is also bad, because you wouldn't notice that this was really a page that was going to be accessed frequently.
Some people are fanatic, and don't want process pages to ever get swapped out to make room for cache. That isn't a good idea either. You can really have process pages that may not be needed even once; do you want such a page to be kept in RAM for months just in case? And notice how disabling swap is not going to solve the problem.
You still have to think about memory mapped files, that in many ways must be treated like anonymous mappings.
Do you care about the security of your wireless mouse?
In fact, this is already done that way since the 2.0 days IIRC (it may be 2.2).
And also the other way round - if a page is paged in (into RAM), it is not immediately deleted from swap, so if that page is NOT MODIFIED it doesn't need to be paged out again if memory is required.
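The bookkeeping discussed in this sub-thread (keep a copy in both places, trap the first write, evict without I/O when a valid disk copy exists) can be sketched as a toy model in Python. This is an illustration of the idea only, not how any kernel implements it; the Page class and its counters are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    in_ram: bool = True
    on_disk: bool = False    # True while swap holds an up-to-date copy
    disk_reads: int = 0
    disk_writes: int = 0
    write_faults: int = 0

    def copy_out(self) -> None:
        """Write the page to swap ahead of time, but keep it in RAM (read-only)."""
        if self.in_ram and not self.on_disk:
            self.disk_writes += 1
            self.on_disk = True

    def read(self) -> None:
        """Reading needs the page resident; the swap copy stays valid."""
        if not self.in_ram:
            self.disk_reads += 1
            self.in_ram = True

    def write(self) -> None:
        """The first write after copy_out traps and invalidates the swap copy."""
        self.read()                     # page must be resident before writing
        if self.on_disk:
            self.write_faults += 1      # page fault only, no disk I/O
            self.on_disk = False

    def evict(self) -> None:
        """Reclaim the RAM frame; costs a disk write only if swap has no valid copy."""
        if not self.in_ram:
            return
        if not self.on_disk:
            self.disk_writes += 1
            self.on_disk = True
        self.in_ram = False


if __name__ == "__main__":
    p = Page()
    p.copy_out()   # one disk write, done while the system is idle
    p.evict()      # memory needed: frame is freed with no further I/O
    p.read()       # swapped back in, swap copy kept
    p.evict()      # still clean, so again no disk write
    print(p)
```

The interesting case is the second evict(): because the page was read but not modified after coming back in, giving up its frame costs nothing.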
The universal IT answer of "It depends" applies here as well. Yes, having Mr. Bloaty App glom onto scads of memory that are then not referenced for long periods of time can have a negative impact on other apps if the system becomes memory constrained. And, yes, if the memory manager swaps a bunch of unreferenced memory out to disk and Mr. User has to wait a long time for Mr. Bloaty App to become responsive because it was his memory that got swapped out, that's a problem, too.
The ideal is to be able to address this (haha, bad pun) at the application level and not simply at a global level. This has been the standard on the mainframe (MVS, OS/390, z/OS) operating systems for a long time, where there is a very sophisticated virtual memory manager. If there are, say, 100 apps and 2 of them are very sensitive to response time, most of them aren't, and 10 are just dead dogs you couldn't care less about, how nice is it to be able to actually tell the system that? The 2 "loved ones" then receive preferential storage treatment at the expense of the other, "less loved ones", and the dead dogs are always first in the pecking order of who to steal storage from. The memory manager then is acting to maintain the responsiveness of the applications (the reasons we run OS's in the first place) to meet the needs and expectations of the user(s) (the reasons we run the apps).
Without that ability, arguing over "more swappy" vs. "less swappy" when it's only applied at a global, default level is not especially productive except within the context of attempting to establish, perhaps, where the best general-use default happy setting is - for the general-use default system we all use (is that you? I know it's not me).
"The bigger the lie, the more they believe." - Det. Bunk
I've seen a number of posts echoing this point, overlooking one of the key reasons for swapping. It's not just because you're out of memory for applications, it's because sometimes there are better things to be doing with your memory. Mainstream operating systems use otherwise unused memory to cache disk access, dramatically speeding things up. If you've got a process that hasn't been run for a while it may actually be more efficient to swap it to disk. This frees up memory to cache data that may be being hit quite frequently. inetd hasn't been needed for a while? Swap it out so that your disk cache is larger, benefitting your heavily used web server.
To be fair, when to make that trade off is very tricky and will never work perfectly 100% of the time. Inevitably you'll occasionally be burned by a bad decision. But there are real benefits. The real question is not how to turn it off, the question is how to improve it and perhaps how to allow users to tune it for their needs.
"DisablePagingExecutive"=dword:00000001
"LargeSystemCache"=dword:00000001
The combo will keep your kernel in RAM (DisablePagingExecutive) and enforce a minimum reserve (4MB) amount of memory for it (LargeSystemCache), although that amount can grow dynamically. LargeSystemCache may cause "Delayed Write Failed" errors; if it does, reboot in Safe Mode and undo the damage.
More goodies here: many adjustments that can affect how your system divides memory between swap, cache and currently-running apps:
HKLM/System/CurrentControlSet/Control/Session Manager/Memory Management
"ClearPageFileAtShutdown" "DisablePagingExecutive" "IoPageLockLimit" "LargeSystemCache" "NonPagedPoolQuota" "NonPagedPoolSize" "PagedPoolQuota" "PagedPoolSize" "PagingFiles" "SecondLevelDataCache" "SystemPages" "PhysicalAddressExtension" "WriteWatch"
HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Session Manager\Memory Management\PrefetchParameters
"VideoInitTime" "AppLaunchMaxNumPages" "AppLaunchMaxNumSections" "AppLaunchTimerPeriod" "BootMaxNumPages" "BootMaxNumSections" "BootTimerPeriod" "MaxNumActiveTraces" "MaxNumSavedTraces" "RootDirPath" "HostingAppList" "EnablePrefetcher"
..that machine should have at least 2GB swap (same as RAM) for a very simple reason: /var/crash/`hostname`/*.
When your kernel panics, it [basically] copies memory to your swap partition; on boot, it stuffs it back in /var/crash in a format useful for debugging.
I've heard before that "you can NOT turn swap off on Windows 2000", but to hell with it, I think I'll try it when I get home tonight. I've got 756 MB of RAM, if the system crashes when I "hit the wall" so what, I don't think I will hit the wall. Any comments? Will Win2000 let me turn them all to zero min/max size? Anyone tried that before and know what the real actual implications are?
I've tried this on NT 4. The system gets very pissy about booting, and a lot of stuff doesn't start.
It *is* an interesting case that many programs are unlikely to be useful in predicting future disk accesses, but that the OS cannot use such data.
Take updatedb on my system. When updatedb runs, masses of cached directory data chew up memory on my system. This happens once a day. Now, I'm very unlikely to actually use that data -- in general, updatedb is a poor predictor. However, Linux can't figure out that updatedb shouldn't be trusted for prediction.
It might be an interesting project for an enterprising CS student with an interest in Linux to try producing a "learning" VM system -- logging which program is responsible for each page getting cached in memory, and then determining whether that page is actually used before being replaced. Store such a profile on disk ("/var/vmlog/bin/ls" or something), and you have a bright adaptive VM system.
Such a system doesn't even have to log all the data, only a randomly selected small percentage, so that it slowly gets smarter.
Such data would also provide valuable tuning data for folks who might want to tweak the VM subsystem (or, in the case of the end user, determine whether to buy more memory...)
If you *really* want to get elaborate, you could even learn to use different eviction algorithms with different programs...
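A rough sketch of what such a per-program profile could look like, in Python rather than kernel code. Everything here (the `CacheProfile` class, its method names, the sampling knob) is hypothetical; it only illustrates the bookkeeping described above: remember who caused a page to be cached, and record at eviction time whether anyone used it.

```python
import random
from collections import defaultdict


class CacheProfile:
    """Toy bookkeeping for 'was this program's cached data ever reused?'."""

    def __init__(self, sample_rate=1.0, rng=None):
        self.sample_rate = sample_rate      # fraction of cached pages we bother to track
        self.rng = rng or random.Random()
        self.owner = {}                     # page id -> program that caused the caching
        self.used = set()                   # tracked pages touched since being cached
        self.stats = defaultdict(lambda: [0, 0])  # program -> [evicted used, evicted unused]

    def cached(self, page, program):
        # Only log a random sample so the overhead stays small.
        if self.rng.random() < self.sample_rate:
            self.owner[page] = program

    def touched(self, page):
        if page in self.owner:
            self.used.add(page)

    def evicted(self, page):
        program = self.owner.pop(page, None)
        if program is None:
            return  # page was not in the sample
        self.stats[program][0 if page in self.used else 1] += 1
        self.used.discard(page)

    def usefulness(self, program):
        """Fraction of this program's evicted pages that were used; None if no data."""
        used, unused = self.stats[program]
        total = used + unused
        return used / total if total else None


profile = CacheProfile()
for page in range(4):                 # updatedb drags in pages nobody reads again
    profile.cached(page, "updatedb")
    profile.evicted(page)
print(profile.usefulness("updatedb"))
```

A program whose score stays near zero (updatedb in this made-up run) is a poor predictor, and its pages would be candidates for early eviction. Persisting the stats to something like the /var/vmlog path suggested above is left out.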
Would you rather have a user process wait for all pages to be brought into memory, or would you rather have your kernel wait until a page is brought into the cache from the disk?
Kernel preemption can't solve everything. If your system call has to read a page from disk, it might have to wait until the page is brought in from disk. During this time the kernel could be locked.
Your application, on the other hand, will be put onto a wait queue until data is available, allowing other applications to run.
Of course, it's a known fact that DB people generally do not like kernel people :) -- primarily because of page caching. All of these nice read-aheads are absolutely useless for a DB system. But do not despair! The 2.6 kernel has a much improved read-ahead algorithm.
That seems to be EXACTLY the stupid behavior that you see with Redhat 9 and Redhat Enterprise version 3. The disk buffers start taking up a huge chunk of memory and there's no memory left for other things which ought to get it instead (databases, email, etc). It's absolutely infuriating to be trying to tune a redhat9/enterprise3 system's memory usage and have no control over how much is consumed by the disk buffers.
I don't care how smart you think the OS is about deciding when disk buffers are more important than application memory - there are always going to be times (many of them) when the OS can't make a good decision.
Reasonable OSes (HP-UX) at least have a way to limit the maximum amount of memory used for disk buffers. It's been in their performance tuning guides for many years now too.
Redhat 7.2 did NOT have this problem - the same hardware running rh7.2 will never swap whereas it thrashes under rh9 because of the memory given to disk buffers.
It's amazing it's taking so long for redhat to realize that, and that the problem has made it into redhat enterprise version 3 is really disappointing.
Once, I turned off the swap space in my computer. (Actually, I forgot to turn it on after some experiments.) It worked great, except in some rare circumstances when it needed memory really badly. It would then reap the biggest application running, which was always the X server. Moral: just set swappiness to 0, but make sure you have plenty of RAM + swap space.
I thought---Damn... This is true. My machine would be SO much more responsive if it didn't swap.
cat /proc/sys/vm/swappiness
60
hmm...let's bump that down to 0
echo 0 > /proc/sys/vm/swappiness
let's check the swap we are using....
top
Swap: 0k total, 0k unused, 0k free
DoH! My machine botched a software suspend, and my swap is screwed.
Oh well....Most 'swapping' is perceptual. Pick a moderate swappiness value, and your system WILL actually be faster. The 1/4 second or so you lose upon bringing OpenOffice back from the disk is more than made up for by all the cache you regain.
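For anyone who would rather check the setting from a script than with cat, a minimal sketch in Python. The function name `read_swappiness` is made up; the only thing taken from the posts here is the /proc/sys/vm/swappiness path, which does not exist on kernels that predate the knob, so the function returns None in that case instead of blowing up.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current swappiness as an int, or None if the file is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except OSError:
        # Missing file (older kernel, non-Linux system) or no permission to read it.
        return None


value = read_swappiness()
if value is None:
    print("no swappiness knob on this system")
else:
    print("swappiness is", value)
```

Changing the value still needs root and a write to the same file, as in the echo line above.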
IMHO, the only disk 'thrashing' that is annoying has little to do with swap, and much more to do with webbrowser cache to disks, etc. . .
Can someone please describe any adaptive algorithms that could be used. Specifically, I'm thinking of:
- dirty marking unreferenced pages when swapped. if these mem pages are not used after the swap out, no need to swap them in again. i'm pretty sure this already occurs
- for processes with high swap demands, increase their weighted priority for pages, window-averaged over recent swaps. so then, my database process could hog under load while my less-used apps may swap because they're used less often. could be tailored differently for code versus data segs.
- page-image comparisons to avoid holding duplicate code segment pages in memory. this plays with the concept of shared libs a bit, but could avoid duplicate pages, especially if this information is saved in a precalc'd hash table that is stored.
just ideas.
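The last idea (spotting duplicate pages through a hash table) can be sketched in a few lines of Python. This is only an illustration over a byte string, not anything a kernel does as written; `PAGE_SIZE` and `find_duplicate_pages` are names invented here, and a real implementation would confirm a hash match with a byte-for-byte compare before sharing a page.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size for the example


def find_duplicate_pages(memory: bytes):
    """Group page numbers whose contents hash to the same value.

    Returns a list of groups; each group lists page numbers with identical content.
    """
    groups = defaultdict(list)
    for offset in range(0, len(memory), PAGE_SIZE):
        page = memory[offset:offset + PAGE_SIZE]
        groups[hashlib.sha256(page).digest()].append(offset // PAGE_SIZE)
    return [pages for pages in groups.values() if len(pages) > 1]


# Three pages: the first and third have the same content.
memory = b"A" * PAGE_SIZE + b"B" * PAGE_SIZE + b"A" * PAGE_SIZE
print(find_duplicate_pages(memory))
```

Each group of duplicates could in principle be backed by a single physical page, marked copy-on-write.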
It's a difficult dilemma, but that's because an overly complicated scheme is used.
There is a simpler and more powerful scheme that unifies swapping and disk caches, while allowing applications to persist between reboots, all with better performance than current systems!
EROS implements such a system. Generally it is referred to as "Orthogonal persistence", and functionally it behaves as though the computer is "always on", and returns to the exact state it was in after a reboot. The thing is, with orthogonal persistence, the structure on the disk is not a file system, but just the application data.
Since applications no longer work with the disk explicitly (open/read/write) but only with one type of memory (persistent memory), the OS manages all of the disk I/O, and it allows it to eliminate almost completely the largest delay in disk-work - the seek time in all writes. Since all application memory is just mapped to disk transparently, all RAM is just considered a "disk cache", and the kernel does not have to make nasty tradeoffs between disk caches (of explicit open/read/write calls) and virtual memory.
Of course there is still a problem if large work-areas of unimportant applications "swap out" smaller areas of important applications. I suggest solving that by prioritizing pages to the memory manager. In a system like *nix it is not a problem. In more secure systems however (EROS, for instance), it may create additional covert channels between applications so it was avoided.
~: ls /proc/sys/vm/swappiness
ls: /proc/sys/vm/swappiness: No such file or directory
What kernels support this?
I don't know the specifics either -- I ran into this when trying to track down why valgrind suddenly wasn't packaged by any of the big third party Fedora RPM packagers.
If this *does* get resolved so that valgrind can go into Fedora Core 2, I have to say that that would be *awesome*.
Also, someone mentioned an autoswappiness patch. Does that "solve" the issue?
My approach has been to start all the needed services and then run this small perl script (which I named memhog.pl) to create a process that hogs quite a bit of memory:
#!/usr/bin/perl -w
use strict;
# 10 bytes repeated 131*1024*1024 times: roughly 1.3GB held in one string
my $a = "xxxxxxxxxx" x (131 *1024*1024);
This is just a quick hack; you may want to adjust the size to suit your memory size. The server from where this script was copied has 2GB of memory. Essentially I want to page out all the stuff that doesn't get used after starting the server and the related server processes. Of course, given enough time the server would swap out those pages anyway, but this method just does it quicker. After the script has been run, the server will gradually swap in those pages it really needs. OK, doing this may be pointless but I don't care ;)
I've got XP and 1.5 GB of RAM, and XP has worked great ever since I turned off the swap file. I was told that "some" apps needed it. Bullshit. The only one I can think of is Adobe Photoshop, and it only gives you a slight warning when you run it; just hit continue and it continues loading. Therefore it doesn't "need" it. Although my RAM usage is usually at 700+ megs, that's fine by me. Now my swap is no longer a bottleneck since I don't have one :) Shows you how much RAM Windows actually uses. It's usually cut in half, because half of it goes into the swap if you have it turned on.
My Gawd WTF...
There's a registry key that purports to reduce the swapping of the 'executive' components of the kernel. I think it's in: HKLM/system/control/ccurcset/sessionmanager/memory
(abbreviated for sanity)
There will always be SOME of the kernel paged out, because I guess some stuff loads into the kernel at boot and can't be 'jettisoned'. I think my w2k box dropped from 32MB of paged kernel to about 20MB after I disabled PagingExecutive.
It's a good idea for laptops or high memory systems.
Well, after three years of compulsively staring at top output while I do evil shit to my boxes, I've decided to move to swapfiles instead of partitions.
The MOST swap I've ever tapped was 16M. I run all my boxes with 512MB-or-more RAM, so there's really no need to have a sizeable swap.
Right now I have three boxes running with 768MB RAM, and each has a 64MB swapfile, and with over a week uptime on all of them I'm using...
3MB on the file/print server
0MB on the workstation
0MB on my friend's machine
The best part about a swapfile is that the size isn't set in stone. If I want to move up or down I just:
# swapoff -a
# dd bs=1M if=/dev/zero of=/swap.img count=
# mkswap /swap.img
# swapon -a
I realize the start of this thread was about server machines, but I'll still talk about my desktops :)
I use a laptop at home. It has 500MB of RAM and runs Linux (2.4), KDE, Opera, Eclipse, OpenOffice, xmms, gaim plus whatever. This box has no swap space. I've been using it for about 6 months and I've never had any problems with too little mem. Of course, after playing OGGs for a couple of hours, whatever memory isn't used by apps is used by cache. I don't see that as a problem. I would have really hated it if my apps were swapped out in this situation.
At work I have a kickass machine with 2GB physical and 4GB swap. It runs mostly the same apps as my home machine plus some really memory-hungry server software (our own product). I can't remember ever having noticed any swap activity. I'm running the ksysguard applet all the time so I have a pretty good idea of its memory usage. My previous machine was a Windows box with 500MB. It was swapping *all the f***ing time*. Leave Eclipse alone for a couple of minutes, and you had to wait for ages while it copied all 100MB of it back into memory...
My point is that you don't really need any swap partition at all on a desktop box. "But what if you use it all?", you may ask. "So what?" is my answer. With a swap partition you also have a hard limit on how much mem you can use. It just gets a lot more painful to use it all, with all that swapping. If you have been advised to have a 1:1 size between your physical mem and swap partition, you might as well just buy twice the amount of RAM, and be a twice as happy geek :)
I have a headless file/print/shell/kerberos/distcc server and it boots all services in under 40MB RAM, it's got 768MB under the hood and uses 40, that leaves a LOT for buffers and cache. I've got no use for swap on it, so I made a token 64MB swapfile (not a partition) and it's almost never been touched.
Servers often don't need much RAM; adding or disabling swap on my server wouldn't make ANY difference at all.
Give a pseudofile for what gets RAM priority (root's apps, disk cache, user apps, whatever else), one for how aggressively to free RAM, how long to wait before a page counts as old, and let people customise.
It was added somewhere in 2.5.x (this from other /. posts, grain of salt and all)
The funny thing about the whole thread is this: it eventually turned out that the guy who originally complained about swapping didn't have any actual swapping going on at all. What is usually happening is that pages from an *executable* are being dropped, and people incorrectly refer to this as "swapping". In Linux, pages from an executable aren't "swapped out" but are simply dropped and read back from the original executable as needed. The trouble is that once you then access some of those pages again -- say, when you're exercising the repaint code from some bloatware app for the first time in a while, because it's been in the background for a while -- they have to be faulted in one by one, and that takes a lot of time per page.
What seems like it would be a *better* method would be to do both. Keep the running apps in core, but as spare I/O cycles permit, write them out to disk. When you then load a large app, the other app is already swapped out. So long as the swap was very low priority when the system *had* enough RAM, about the only downside would be writing to disk all the time. In fact, you could even prefetch the app back as chunks of memory became available.
Because the soldiers are sold by Bush for the oil companies.
The soldiers did want $$$ for their jobs.
As far as my understanding goes, the reason Java has this problem is because it uses a mark-and-sweep garbage collector. What this means is that every time the garbage collector runs, it has to "touch" every single live object in the system -- that is, the way it works is, it basically looks at every single object accessible in the system and marks it as "YEP, STILL ALIVE!". And essentially, any objects that don't have an "alive" marking afterward can be considered inaccessible and therefore reclaimable space. That's just how it works, and this necessarily requires it to access the memory of every reachable object in the Java system, hence entirely reordering whatever your virtual memory/processor cache thought it was doing just previously. There isn't a lot that can be done about this.
The generational garbage collector used in newer Java virtual machines will probably help with this problem greatly. Generational garbage collectors assume that anything that survives for a couple of collections is probably going to be around for a very long time, so if something survives a collection or two, further collections (unless space REALLY starts getting low) just ignore it and assume it to be still-live without actually checking to see if it is. This can significantly reduce the amount of memory that has to be touched, and so it would probably decrease the number of memory pages the virtual machine needs swapped in on each GC run.
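To show why that matters for paging, here is a toy mark phase in Python that simply counts how many objects get touched. It is a sketch of the general technique, not of any real JVM: the object graph, the `mark` function and the `skip` set standing in for the old generation are all invented, and a real generational collector also needs a remembered set to track old-to-young references, which is ignored here.

```python
def mark(roots, refs, skip=frozenset()):
    """Return the set of objects touched by a mark phase starting at `roots`.

    `refs` maps each object to the objects it references. Objects in `skip`
    (standing in for an old generation) are assumed live and never visited.
    """
    touched = set()
    stack = [r for r in roots if r not in skip]
    while stack:
        obj = stack.pop()
        if obj in touched:
            continue
        touched.add(obj)
        stack.extend(c for c in refs.get(obj, ()) if c not in skip and c not in touched)
    return touched


refs = {
    "root": ["old1", "young1"],
    "old1": ["old2"],
    "old2": [],
    "young1": ["young2"],
    "young2": [],
}
full = mark(["root"], refs)                          # plain mark-and-sweep
minor = mark(["root"], refs, skip={"old1", "old2"})  # generational-style minor pass
print(len(full), len(minor))
```

Every object in `touched` is memory the collector had to read, so anything that was swapped out gets dragged back in. The minor pass touches fewer objects, which is the effect described above.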
Sorry but this is not a complex equation and I think these guys are getting wrapped up in too many details and missing the big picture.
The hard drive is really, really, really, really fscking slow. In comparison, RAM is really really really fast. As a result, you want to interact with the hard drive as little as you possibly can, and interact with RAM instead as much as you possibly can (the only thing which beats that is interacting with only the CPU registers and avoiding RAM and hard drive altogether).
As is, Linux doesn't even begin touching the disk until there is only enough RAM left to turn on VM. Now this has a negative impact when that limit is reached, because there is overhead turning it on. This impact is negligible and tweakable, since you can wait and see if you're hitting the limit, add more memory, see again and reevaluate until you simply aren't swapping. This is a good thing.
One of the worst things Windows does is swap constantly. In fact, beyond a certain point (read: enough RAM to run an XP desktop) the system swaps MORE if you have more RAM. You boot the system with all unneeded services turned off, no startup processes, and all the eye candy turned off. And you've got 4GB of RAM in the system. Guess what: it's already using VM.
Maybe VM management itself could be tweaked more, but it certainly shouldn't be used unless it absolutely has to (and if you don't have enough ram and it has to all the time then it's not like you suffer that performance hit more than once).
The only exception to this I've found is a Linux desktop running KDE or GNOME with about 256MB of RAM. At that point the numbers seem to work out just about right (or wrong, I should say) and the system is constantly turning VM on and off, encountering the performance hit again and again and again, with pretty much every operation you perform.
Swapping out any semi-active process, especially an interactive one, is utter foolishness which should be subject to termination grounds (unless of course you work for a big iron computer company that needs to boost revenue by purposefully slowing systems down). Any time you decide to clear memory and page a task out ... you are creating two I/O's ... one out, one back in ... and doing it with shit poor locality since the swap area is probably at the other end of the disk from any other active I/O .... just discarding entries in the file cache only requires a single I/O to recover it. .... so tell me ... who is the idiot that believes two I/O's to kill the performance of an interactive process is better?
Not a problem actually; that permanently swapped kernel "Nonpaged" is support for 16-bit APIs, OS/2, POSIX and other stuff that you will never ever use.
And it isn't really swapped either, as it isn't data. Swapped pages are data only, code pages are overwritten and just paged in again when needed.
Here's a solution to the whole debate -- make the sticky bit have meaning under Linux like it does on other UNIXen -- if the sticky bit is set on the executable, do not swap it. If it is not set, the executable is free to be swapped. This solves the entire debate (for instance, if you don't want the 'interactive' mozilla process swapped, set the sticky bit on the executable).
We either need some way of expressing this in code in a way that's exposed to the OS so it can avoid paging it out (and don't say 'well, just lock the pages containing it', think about how GUIs code & data are arranged--i.e. OO), or, perhaps instead of LRU, paging should pay attention to when a page was last used relative to an interaction--if you use some menu to bring up a dialog that triggers a long, memory-intensive process, that menu may have been used longer-ago than stuff happening as a result, but it's going to be used again first--LRU is the wrong model. (Also consider the interactive parts of other apps that you have open windows for.) Maybe it should still get swapped out, but swapped back in when some memory is freed up--'this thing was used very soon after an interaction, so it should be in memory if possible'. I don't know if any OSes attempt to page-in from swap before something is requested. Aren't some processors these days trying to prefetch memory requests based on patterns of memory access (not just prefetching code memory)? Same sort of idea.
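A tiny simulation of the menu/dialog complaint, in Python. It models plain LRU over a handful of made-up page names and a made-up capacity of 4, nothing more; no real kernel uses strict LRU like this.

```python
from collections import OrderedDict


def run_lru(accesses, capacity):
    """Replay page accesses through a strict LRU cache.

    Returns (number of page faults, pages resident at the end in LRU order).
    """
    cache = OrderedDict()
    faults = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)        # mark as most recently used
        else:
            faults += 1
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return faults, list(cache)


# The user clicks a menu, a dialog opens, a long job streams through four
# data pages, and then the user goes back to the menu.
accesses = ["menu", "dialog", "f0", "f1", "f2", "f3", "menu"]
faults, resident = run_lru(accesses, capacity=4)
print(faults, resident)
```

The streamed pages f0 to f3 are each used exactly once, yet they push out both interactive pages, so the final click on the menu faults. A policy that weighted "used right after an interaction" would have kept it resident.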
Tuning for interaction isn't new to OSes; the VMS operating system's process scheduler treated 'interactive' apps differently from 'batch' apps (where, if I remember correctly, an app was interactive if it paused for I/O before using up its timeslice). I dunno if Unixen or Windows do things like that.
The logical way to allocate swap is to figure out how much virtual memory you need (1GB, 512M, 2G, whatever) then allocate enough swap that SWAP+RAM is at least that big.
Think of RAM as a cache for swap and you won't go too far wrong. More RAM is better, but if your system is stable with 160M RAM and 320M SWAP there's no point in allocating another 1G of swap when you add a 512M DIMM unless you at the same time increase the workload on the machine.
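The arithmetic is trivial, but spelled out as a Python sketch (the function name is made up, and the numbers are the ones from the example above):

```python
def swap_needed_mb(virtual_need_mb, ram_mb):
    """Minimum swap so that RAM + swap covers the virtual memory you need."""
    return max(0, virtual_need_mb - ram_mb)


# Stable with 160M RAM + 320M swap, so the workload needs at most 480M.
need = 160 + 320
print(swap_needed_mb(need, 160))        # before the upgrade
print(swap_needed_mb(need, 160 + 512))  # after adding a 512M DIMM, same workload
```

After the upgrade the same workload fits entirely in RAM, so the existing 320M of swap is already more than enough.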
I totally agree that users should have more control over how and when things get swapped. For instance a mozilla process tree, I'd be searching stuff on the net, then go back to coding, which can go on for hours, then I quickly need to check something in the mozilla window, and it would be damn slow to reload everything from swap. This is related to swapping stuff that's been idle for x time. The sticky bit idea, or something similar would be nice...