Swap Performance in Linux

Posted by michael on Tuesday March 12, 2002 @08:27AM from the RAM-is-cheap dept.

GizmoDuck writes "I'm working in a computational chemistry lab, and we find ourselves using memory and CPU hogs like Amber and Gaussian. The CPU hogging isn't a problem, thanks to Condor, but when submitting one of the jobs that request (and pretty much require) all the physical RAM in the machines, Linux promptly starts swapping so hard that the mouse pointer in X stops moving, NFS and NIS halt, and things don't get back to normal for five minutes. I've tried toying a bit with the settings in /proc/sys/vm/kswapd to no avail. I've done some poking around on the 'net looking for answers. Faster disks and swap partitions at the beginning of the drive aren't really an option at this point. I haven't found a good solution yet. I was wondering if the /. community has any input on how to keep the system from locking during periods of necessarily high swap activity?"

The answer can be summed up in a math equation by Anonymous Coward · 2002-03-12 08:42 · Score: 2, Informative

2.4.x

I still have no idea why Linus used 2.4 as a development tree. Go back to 2.2.x, no swapping problems going on there.

By the way, does anyone know the command to flush the swap partition?
preempt ? by raulmazda · 2002-03-12 08:45 · Score: 4, Informative

Maybe try out the preemptible kernel patch?
My personal experience is that it has helped my workstation's interactive performance noticeably for big ass c++ compiles and periods of lots of disk activity (big apt-get dist-upgrades). Thankfully, I'm no longer doing the big ass c++ compiles, so it's not as big of an issue as it used to be :)
Try preemptible kernel patch... by PaulBu · 2002-03-12 08:46 · Score: 5, Interesting

It should improve interactive performance (i.e., your mouse will start moving again :) ) when load is high. Also, running your background process nice'ed will be helpful.

You might also consider the crazy idea of putting a swap file on NFS -- you'll get (if your network is decent) almost the same bandwidth as you get when accessing an (older) disk, but much higher latency (this will put your background process at a disadvantage compared to your interactive processes).

Hope this helps.

Paul B.
1. Re:Try preemptible kernel patch... by PaulBu · 2002-03-12 10:55 · Score: 2, Interesting
  
  A neat idea, but wouldn't that just migrate the problem to the NFS host? I'm too lazy to try it myself.
  
  Sure it would, but:
  
  interactive performance on the NFS server might not be that important
  
  it might have faster disks
  
  and, finally, the swap hog program will slow down due to network latency, creating less load on NFS server than it would on the workstation.
  
  Paul B.
Pushing the limits of RAM by Skapare · 2002-03-12 08:48 · Score: 4, Interesting

If your program(s) push Linux to the point where it actually runs out of available RAM faster than it can free it up, then "all hell breaks loose". It has to swap something out, and just about every program is eligible to be swapped out. That includes GPM (if you are on a virtual console) or X (if you are in X Windows). You need to account for all of these things to determine your RAM needs. Add up the memory usage of all your active programs, plus the buffer demands they have doing disk I/O, plus the kernel, and you need that much RAM. If the program is doing a LOT of disk/file writes, you can expect the buffer demands to be the majority of this, too (because the kernel believes what you just wrote you might soon want to read back, so it tries to keep lots of it in RAM even if that means swapping out GPM and X).
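
A rough sketch of that sum in Python, for anyone who wants to put numbers on it. The function names and the figures in the usage note are made up for illustration; only the "Key: value kB" line format of /proc/meminfo is assumed, and the per-process sizes would have to come from something like top or ps.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:  value kB' lines into a dict of integers."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip header lines and anything whose first field is not a plain number.
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def ram_shortfall_kb(process_rss_kb, buffer_kb, kernel_kb, mem_total_kb):
    """Return how far (in kB) the estimated demand exceeds physical RAM, or 0 if it fits."""
    # Demand = all active programs + disk buffers + the kernel itself.
    demand = sum(process_rss_kb) + buffer_kb + kernel_kb
    return max(0, demand - mem_total_kb)
```

For example, feeding it the contents of /proc/meminfo for MemTotal and a list of resident sizes gives the amount that would have to live in swap; anything above zero means something (possibly X or GPM) gets pushed out.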

--
now we need to go OSS in diesel cars
FreeBSD by paul.dunne · 2002-03-12 08:54 · Score: 3, Interesting

Is switching to FreeBSD an option? The virtual memory management there is much better than in Linux under stress.
1. Re:FreeBSD by Aaaaaargh! · 2002-03-12 09:59 · Score: 4, Interesting
  
  Is switching to FreeBSD an option? The virtual memory management there is much better than in Linux under stress.
  
  I'd have to agree. The author should look into using FreeBSD. A GIS project I'm currently working on allocates 3GB of RAM at startup. Until we get the rest of the funding for our SunFire solution, we're using what we have available, which is (was, actually: we've replaced the OS with FreeBSD) a P4 Linux box with 2GB of RAM, a 9GB SCSI drive for swap partition and a 36GB SCSI drive for everything else.
  
  I'm not a Linux expert, but the techs in the department are. After a few weeks of their tinkering, it did pretty much the same thing as you're experiencing. I have a small development system at home (P3, 1GB RAM, 4GB SCSI swap, 40GB IDE for all else) running FreeBSD. Installed the software, and it runs like a charm. X works beautifully, Apache still serves up pages (of course, it doesn't get much traffic at home) and the program never chokes the system. Granted, with only a gig of real memory, it spends a fair amount of time accessing the disk (about 30 seconds every 2 minutes), and it steals almost all the cycles from dnetc!
  
  --
  Give them an inch and they'll take a foot. Much more than that, you won't have a leg to stand on.
2. Re:FreeBSD by Strog · 2002-03-12 10:23 · Score: 2, Insightful
  
  How many CPUs?
  FreeBSD on a dual processor box will match a dual Linux box any day. I didn't say beat, but they can go back and forth depending on exact applications. If you go to more CPUs then performance will start dropping off in a hurry. It shouldn't do as well on paper but real world applications show that it performs very well with 2 CPUs. It's kind of like how a micro-kernel *should* be better than monolithic but in the real world it isn't.
  I'd go with a dual FreeBSD box any day especially if it is going to be under high loads. I have more linux boxes than anything else right now but their performance under load has been an issue. If you would rather stick with linux then look at some of the alternative VMs out there. I would stick with linux if you have more than 2 cpus unless you really want to go with a commercial Unix (Solaris x86 maybe??).
Preempt + Ingo Scheduler by haplo21112 · 2002-03-12 08:58 · Score: 4, Interesting

The best way to handle this (or at least the best way I handled a similar situation) is to combine Robert Love's Preempt patch and Ingo's Scheduler.
They will significantly increase high-load user performance and keep the system from running away with itself. If you're feeling really adventuresome you could also throw in Rik's Rmap VM... I have done very little testing with it, but I hear a lot of reports that it helps.
These are all available in the authors' respective directories on kernel.org: riel, rml, mingo.

--
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
1. Re:Preempt + Ingo Scheduler by ChadN · 2002-03-12 09:12 · Score: 2
  
  In particular you might try patch-2.4.18-pre9-mjc2.bz2, which includes the O(1) scheduler, preempt, and Rik's Rmap VM (among other things), and has been working solidly for a number of people. At least, it is worth testing out to see if it helps any.
  To build it, get the linux-2.4.17.tar.gz kernel, patch it to linux-2.4.18-pre9, then patch again with patch-2.4.18-pre9-mjc2. Then build and use the kernel. Check recent (i.e., 2002) kernel archives to read discussion of this and other related patches, if desired.
  
  --
  "It's overkill, of course. But you can never have too much overkill." - Anonymous Slashdot Coward
You're out of luck by afay · 2002-03-12 09:07 · Score: 5, Interesting

Unfortunately, you're out of luck. The current linux VM (in later 2.4 series) is fine for low to medium load systems but falls apart on high load systems. The previous VM (early 2.4 series) is a good design but isn't really ready for production.

I would suggest buying more RAM (it's cheap) if you aren't already maxed at 4 gigs (x86). Alternatively switch to FreeBSD which has a very stable efficient VM. Any source should recompile without too much trouble and it can run linux binaries at almost full speed!

--
Best slashdot comment
1. Re:You're out of luck by Skapare · 2002-03-13 20:06 · Score: 2
  
  The VM in the early 2.4 kernels would grossly lock up when it was out of memory. I was told this was due to the fact that the design assumed you had at least as much swap space as RAM. It could not handle the case of (memory need > swap) even though (memory need < swap + ram). I have several systems which have lots of ram and no swap at all, and they would die quickly. And it wasn't because I was overusing memory with the processes. This would happen even if the ram got used up when writing data to a file larger than ram space. The later 2.4 VM fixed that. Hopefully when Rik's VM is cleaned up, it should solve the problem with lack of (or small) swap.
  
  --
  now we need to go OSS in diesel cars
A couple of suggestions by ThatComputerGuy · 2002-03-12 09:10 · Score: 2

First of all, I would recommend trying the preemptible kernel patch and even the low-latency patch. It seems like an obvious enough suggestion, but some will tell you that these patches should not be used in servers where throughput is important, and that is correct... in some cases. It has been shown, however, that in most cases the preemptible patch increases performance and throughput. I have not heard of any such testing on the low-latency patch, as I am new to it.

In my testing, these two patches have been a big help, especially on my P166 system with 48MB RAM.

Also, you say "faster drives" and repartitioning are not feasible ATM, but how about multiple small drives? As shown in this howto, the linux kernel has support for striping data to swap disks, just by specifying multiple swap entries in fstab.

Then again, if you're not on SCSI, trying to stripe to the swap drives won't be much help anyway, as RAID over IDE for _speed_ usually is just crap.

That last suggestion may not be for you, but definitely try the two patches. It should also be noted that preempt is a compile-time option, and there is also a compile-time option to control the low-latency patch through /proc/sys/kernel/lowlatency. An additional patch may be required to allow these two to work together, but I am unable to locate it currently.

--
XML is like violence. If it doesn't solve the problem, use more.
can you "nice" the applications? by BroadbandBradley · 2002-03-12 09:11 · Score: 2

just a shot in the dark here, but can you just give a lower priority to those applications in order to keep the workstation usable while doing this work?

I can't recall the command line option off the top of my head but I know using Gtop, you right click the app, and pick renice, then set it to 1 instead of 0.

--
"The Most Fun Possible on 4 wheels" is at SunBuggy in Las Vegas
1. Re:can you "nice" the applications? by CMiYC · 2002-03-13 05:02 · Score: 2
  
  That probably isn't going to help matters. Each process needs a certain amount of memory space. No matter how nice it is, it will still require that much memory. The only thing nice will do is maybe cause a small change in how often the programs get data swapped in and out. It would be very small since the data is going to have to be swapped in/out eventually anyway.
  
  BTW, the command is "nice", or "renice" if it's already running. Pretty tough to figure out.
Linux 2.4.x VM by Trevelyan · 2002-03-12 09:20 · Score: 3, Insightful

Did you miss all the 2.4 Linux VM Stories?

I suggest building/installing the latest kernel with the aa VM (the default VM since 2.4.10). If you still have VM (swap) problems then go get the latest rmap VM patch and try that.

The kernel VM (Virtual Memory subsystem) is what manages memory and swap, btw.

And if you did miss all the VM stories, a summary:
At the start of 2.4 a fancy new VM was put into action, using something known as reverse mapping. This was very clever, but it wasn't quite ready and there were teething troubles. Then suddenly (2.4.10) Linus switched the VM to one similar to that of 2.3 (with some updates and a few features from the previous 2.4 VM). This started a big fight, which caused concerns (such as that it might split the Linux community).

Which is better I don't know; some swear by one, others swear by the other. But unless you're using the RH 2.4.9 kernel I would not recommend a pre-2.4.10 kernel.

However, you may need to experiment to find which is best, the VM now in 2.4 (to stay) or rmap; you should try both and see.

steps
Install 2.4.[17,18,19]
try it
if it fails, try the rmap patch
Shut down X? by swillden · 2002-03-12 09:41 · Score: 5, Interesting

Do you really have to be using the machine while it's calculating? If not, what about shutting down X and any other memory-hogging system components? Unlike on Windows you do have the option of turning off that expensive GUI.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
My system never seems to give swap BACK by tzanger · 2002-03-12 09:55 · Score: 2, Interesting

What I'd like to see is something along the lines of some kind of LRU which gently starts swapping data back into memory from swap when memory becomes free. There's nothing like having VMWare sitting in swap since you stopped using it an hour ago to do some other work and then jumping back and having to wait the 5-10 seconds of heavy disk activity to resume work there.

As for those saying "don't use swap at all" -- that's crazy talk. I'd rather have an app or two go to swap instead of being outright killed by the VMM when it needs an extra meg or so. If I'm not mistaken Linux tends to pick the big memory eaters to dump to swap over the little guys so if you start a compile... there goes VMWare... or your IM client... or Konqueror... lots of fun. :-)
Re:Without Swap by fonebone · 2002-03-12 10:35 · Score: 2

if you have enough RAM, it won't even use the swap. so if it is using the swap, and you take away the swap, you'll likely just run out of memory.

--
when the rain comes, they run and hide their heads. they might as well be dead.
Computational Chemistry by leastsquares · 2002-03-12 10:37 · Score: 3, Informative

Why don't you submit this query to the computational chemistry mailing list (see CCL)?

Those people may be able to give you some sensible suggestions, especially with respect to those particular pieces of software.

I believe that you can restrict the amount of memory that Gaussian uses via its keywords. When it requires more, it will handle the dumping of data to disk itself. Read the manual - I haven't used Gaussian since g94 was the current version so I can't remember.

How big is your AMBER simulation? I think I would run a smaller system... or even better... buy some more RAM given that it is dirt cheap nowadays.

AMBER's memory use is a bit heavy - you may have better luck with another MD package. Maybe NAMD? (Although I'd still vote for the "buy more RAM" option)
Not mentioned yet -- go lean... by Spoing · 2002-03-12 11:07 · Score: 4, Insightful
I'll second the other comments already made. In addition, sometimes the simplest ideas are the most valuable, though I'll assume you can't just drop in more RAM.
With that as a given, if your app needs all available memory, run top and lsmod to see what's using your memory and remove everything you don't need (usually by deleting the links to those processes in the /etc/???/rc5.d directory).
If you can't remove it, scale it down. For example /etc/inittab lists off the different virtual terminals that appear when you press ctrl-alt and a function key. If you never use this feature, try reducing this down to 1 or 2 terminals. Leave some behind just in case you need them later. To do this, just comment the higher numbered lines that look like this;
6:2345:respawn:/sbin/mingetty tty6
(NOTE: Removing these lines might not make any difference -- it all depends on the distribution.)
As for X (assuming you need it and are using XFree), try removing any Load lines in the modules section that you don't need and scaling down the display size, background images, and color depth. Another big area of savings is changing the window manager. FVWM usually is installed, and while it is ugly it is also fairly lightweight when compared to KDE, Gnome, and other popular full-featured WMs.
While these steps alone won't eliminate the speed problems -- the other comments might solve that -- the time you spend waiting might be cut way down.
--
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
Just get more ram by I_redwolf · 2002-03-12 11:56 · Score: 2

If you already are maxed out on RAM then "nice" is your friend. You can try to squeeze out as much performance as you want, but if you don't have enough RAM, or more RAM is not an option, then you just have to deal. Also make your swap partition bigger if it's getting full too quickly; if 8 people are running Gaussian and Amber, then obviously you need more than the "recommended" swap even with large amounts of memory. 4 gigs of memory seems like a lot until you start doing heavy shit. That's where you get a nice cheap 40-60-100 gig drive and make the whole thing primarily for swap. Problem solved. You might also want to write yourself a daemon that nices based on order, from -20 to 19. First in gets the highest priority; subsequent processes get a lower priority and get reniced to a higher priority as the first process finishes. This way, if it's only a dual system, even though Linux is pretty good with multi-processor support, you get even more efficient scaling. If worst comes to worst, lobby for a couple of blades or Netras or something and stack 'em.
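
A minimal sketch of the nice-by-arrival-order idea in Python. Everything here is illustrative: the function names are invented, the default range starts at 0 rather than -20 because negative nice values need root, and a real daemon would still have to discover the job PIDs (from Condor or ps, say) and loop.

```python
import os


def nice_by_arrival(job_pids, best=0, worst=19):
    """Map PIDs (in arrival order) to nice values.

    The first job gets `best`; each later job gets a value one higher
    (i.e. lower priority), capped at `worst` (19 is the usual maximum).
    """
    return {pid: min(best + i, worst) for i, pid in enumerate(job_pids)}


def apply_nice(assignments):
    """Renice each PID. Lowering a nice value normally requires root."""
    for pid, value in assignments.items():
        os.setpriority(os.PRIO_PROCESS, pid, value)
```

When the first job finishes, calling nice_by_arrival again on the remaining PIDs moves everyone up one step, which is the "reniced to a higher priority" part. As the earlier reply about nice points out, this only changes CPU scheduling; it does not shrink what the jobs need in memory.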
unmask interrupts by Wills · 2002-03-12 12:44 · Score: 3, Interesting

You could try hdparm -u 1 which unmasks interrupts when the disk interrupt service routine is active. This often allows your mouse to continue moving even if the disk is busy dealing with swap. It's not perfect but it helps a lot. As others have suggested, also try the preemptible kernel patch but keep backups!

--
Scroogle
Some details for the curious by GizmoDuck · 2002-03-12 17:19 · Score: 2, Interesting

1) I can't seem to get on the CCL list. I couldn't find automated instructions and when I sent an e-mail to chemistry-request, nothing happened.

2) We're already nice-ing things up the yin yang and using the 2.4.18 kernel with pre-empt patch with no noticeable results.

3) The machines must stay useable as they are also analysis and server machines in addition to computational boxes.

4) Machines are dual P3 1400s. Unfortunately, disks are EIDE and RAM is 256MB in the process of being upped to a gig. However, this doesn't change the fact that we'll be running some calculations that will use all of that.

5) We're not so anxious to buy 4GB of RAM for each machine until we're sure what kind of Beowulf cluster we're constructing and hence how much of our money goes to it.
FreeBSD by DiSKiLLeR · 2002-03-12 17:36 · Score: 2, Interesting

The memory manager in Linux has lots of problems (as previous posters have pointed out).

Have you tried FreeBSD? Apart from being a better OS all round, the 4.x series has a brand new revamped VM subsystem that handles high memory loads very efficiently. I never have a problem with swapping on any of my machines (which range from 32mb, 64mb, to 512mb ram machines).

This isn't a troll. Sometimes a certain OS isn't the best solution for a job, and a different OS should be used. I use Linux for GUI/X type things, FreeBSD for heavily loaded servers (since it handles much better), and even Windows 2000/XP for other things. If those programs you use are linux binaries, FreeBSD can easily run them. If you have source, all the better. Recompile with all the specific optimizations for your hardware. (-O3, -mcpu=pentiumpro, -march=pentiumpro, etc)

D.

--
You can tell how powerful someone is by the magnitude of the crime they can commit and be able to get away with.
Maximum swappage by Peter+H.S. · 2002-03-13 17:25 · Score: 2

IMHO, your box is underspecced (RAM, IDE hard disks) for
the job you are doing.
Of course, you can try reading:
/usr/src/linux[name]/Documentation/sysctl/vm.txt
for some tunable /proc parameters (e.g. /proc/sys/vm/overcommit_memory should be 0 (zero))

Since you are using ide disks, 'man hdparm' is your friend.
Check your kernel config for dma support of your mobo chipset.

Daniel Robbins (from gentoo linux) has written an interesting
article "Maximum swappage" http://www-106.ibm.com/developerworks/library/swaptip2.html

Linux allows you to parallelize swap, just like a RAID 0 stripe:

/etc/fstab:
/dev/hda2 none swap sw,pri=1 0 0
/dev/hdb2 none swap sw,pri=3 0 0
/dev/hdc2 none swap sw,pri=3 0 0

E.g.: spread your swap across two disks with equal priority.
That way you should, in theory, double read/write speed for the
swap. Some further gains could also be had if the swap partitions
were moved off the disks that the OS and apps write to.
But read the article.
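
To check which swap areas the kernel will actually stripe across, here is a small Python sketch that groups fstab swap entries by their pri= option. The function names are invented for illustration; the behaviour it models is that areas with the same priority are used round-robin, and higher-priority areas are used before lower ones.

```python
from collections import defaultdict


def swap_priority_groups(fstab_text):
    """Group swap devices in fstab text by their pri= value (None if unset)."""
    groups = defaultdict(list)
    for line in fstab_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split columns
        if len(fields) < 4 or fields[2] != "swap":
            continue
        priority = None
        for opt in fields[3].split(","):
            if opt.startswith("pri="):
                priority = int(opt[4:])
        groups[priority].append(fields[0])
    return dict(groups)


def striped_sets(groups):
    """Return the device lists that share an explicit priority (used round-robin)."""
    return [devs for pri, devs in groups.items() if pri is not None and len(devs) > 1]
```

Run against the fstab lines above, this reports /dev/hdb2 and /dev/hdc2 as one striped set at priority 3, with /dev/hda2 left at priority 1 as the overflow area.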
Re:4K here, 2K there, it's in the 3rd decimal plac by Spoing · 2002-03-14 08:12 · Score: 2

Good point, if it were true!
Any change in available memory can have a drastic effect. The sum total of the changes should add up to a minimum of 10MB on an untuned system. (One example: Bonobo on Gnome uses ~3.5MB by itself, while a few Gnome terms with a large history buffer chew up an additional 10MB -- not all of it shared. Just switching from a heavyweight WM to a lightweight one and smaller helper apps would recover the bulk of this space. Other changes would only add to the savings.)
That minimum of 10MB might be just enough to cut disk swapping down -- by how much really depends on the application. If it's a single block of data, and no calculations are being done, no speed improvement will be noticed. If it's an in-memory array, the savings could be substantial.
Without giving it a try, or knowing the application's demands, nobody can say for certain.

--
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
Re:4K here, 2K there, it's in the 3rd decimal plac by Spoing · 2002-03-14 08:26 · Score: 2

These comments might be relevant on a 16MB RAM system, but I doubt you'd notice any difference on a 2GB RAM system.
A new post points out that the systems had 256MB, so recovery of 10MB should make a substantial difference.
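
For a sense of scale, a quick Python sketch of what 10MB means as a share of physical RAM on the three machine sizes mentioned in this exchange. The helper name is invented; the 10MB figure is the minimum savings estimated above.

```python
def recovered_share(recovered_mb, total_ram_mb):
    """Return recovered memory as a percentage of physical RAM."""
    return 100.0 * recovered_mb / total_ram_mb


for total_mb in (16, 256, 2048):
    share = recovered_share(10, total_mb)
    print(f"{total_mb:>5} MB RAM: 10 MB is {share:.1f}% of memory")
```

Whether a few percent matters depends on how close the job sits to the edge: if the working set overshoots RAM by less than what you recover, the swapping can stop entirely.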

--
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.