The State of Linux IO Scheduling For the Desktop?
pinkeen writes "I've used Linux as my work & play OS for 5+ years. The one thing that constantly drives me mad is its IO scheduling. When I'm copying a large amount of data in the background, everything else slows down to a crawl while the CPU utilization stays at 1-2%. The process which does the actual copying is highly prioritized in terms of I/O. This is completely unacceptable for a desktop OS. I've heard about the efforts of Con Kolivas and his Brainfuck Scheduler, but it's unsupported now and probably incompatible with the latest kernels. Is there any way to fix this? How do you deal with this? I have a feeling that if this issue were fixed, the whole desktop would become much snappier, even if you're not doing any heavy IO in the background."
Update: 10/23 22:06 GMT by T : As reader ehntoo points out in the discussion below, contrary to the submitter's impression, "Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use." Thanks to ehntoo, and hat tip to Bill Huey.
This issue got so bad for me I switched to FreeBSD.
On an IO-intensive server this is also a real issue: 20-30% of processors and cores are stuck at 99% iowait for hours while the rest try to cope. Total CPU load does not go above 20%. No solution yet after months of study and experimenting. Linux is indeed really bad at IO scheduling in general, it seems.
Now think of that situation on a heavy database system. A no-no.
I've wondered on occasion if this problem is really only due to scheduling. After all, most of us still write our file access code more or less as follows:
    x = fopen('somefilename');
    while (!eof(x)) {
        print readln(x, 1024); /* ---- */
    }
    fclose(x);
Point being, there's nothing that tells the marked line that the process should gracefully go to sleep while the drive is doing its thing, and there's no callback vector defined either: nothing that indicates we're dealing with non-blocking I/O. I'd like to think that our compilers have silently been improved to hide those implementation details from us, but I have no proof that this is the case. Unless the system functions use some dirty stack manipulation voodoo to extract the return address of the function and use that as a callback vector?
There's a bug in Chrome that usually makes it unable to paste into Slashdot's comment box once you've placed a < character in the box. (Slashdot, specifically. It does fine on all sorts of other sites with even fancier ajaxy textareas, like the Stack Overflow sites.)
How does this happen? Every year it seems I read about how this problem has been fixed in the latest kernel, and then it's like those fixes mysteriously vanish.
This problem is highly visible in VMs. When you have one VM doing write-heavy disk IO, the other VMs suffer.
I don't think it's a Linux problem as much as a general problem of the compromises that must be made by any scheduling algorithm.
What about you Linux mainframe guys? You have unbeatable IO subsystems. Do you see the same problems?
I ran into the same problems and ended up switching to the "deadline" scheduler. Haven't had a single problem since. I changed it via the "elevator=deadline" on the kernel boot prompt, but you can change it on the fly for individual devices. See Configuring and Optimizing Your I/O Scheduler to see how.
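For anyone who wants to try the on-the-fly route, here is a minimal sketch in Python. It assumes the usual sysfs layout, where /sys/block/<device>/queue/scheduler lists the available schedulers with the active one in square brackets. The device name "sda" and the helper names are made up for illustration, and which schedulers show up depends on how your kernel was built.

```python
from pathlib import Path


def parse_scheduler(text):
    """Split sysfs text such as 'noop deadline [cfq]' into (available, active)."""
    available, active = [], None
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # the bracketed entry is the active scheduler
            active = name
        available.append(name)
    return available, active


def scheduler_path(device):
    """Path of the per-device scheduler file (device name is an assumption)."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def set_scheduler(device, name):
    """Switch the scheduler for one device; writing the file needs root."""
    path = scheduler_path(device)
    available, _ = parse_scheduler(path.read_text())
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    path.write_text(name)


if __name__ == "__main__":
    # Harmless demo on a sample string; nothing on the system is changed.
    print(parse_scheduler("noop deadline [cfq]"))
```

Calling set_scheduler("sda", "deadline") as root would be the per-device equivalent of the boot parameter, but the change does not survive a reboot.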
I remember using OS/2 (IBM's desktop OS) and I was always amazed that you could format a floppy and do other tasks like nothing else was going on. I never did understand why that never seemed to make it into the mainstream.
This is not a case of Linux IO schedulers being unsuitable for the desktop, but more a case of desktop applications being written in a horrendous way in terms of data access. The general pattern is to open a file object, load a few hundred kilobytes, process them, then ask the operating system for more. This is a small inefficiency when the disk is otherwise idle, but if the disk is actually busy, it will probably be doing something else by the time you ask it to read a little more. Not to mention the habit of reading through a few hundred resource files one at a time in seemingly random order, and blocking on every read, because the application programmer is too lazy to think about what resources the app is using.
Linux has such a nice implementation of mmap, which works by letting Linux actually know ahead of time what files you are interested in and managing them itself, without the application programmer worrying his pretty little head over it. Other options are running multiple non-blocking reads at the same time and loading the right amount of data and the right files to begin with.
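A rough sketch of the mmap approach, in Python rather than C for brevity. The function name and the newline-counting task are made up for illustration. The madvise hint is optional and guarded, since it is only present on newer Python versions and some platforms.

```python
import mmap
import os


def count_newlines_mmap(path):
    """Count newline bytes by mapping the file instead of looping over read()."""
    if os.path.getsize(path) == 0:
        return 0  # an empty file cannot be mapped
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the kernel pages data in on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Optional hint that access will be sequential, where supported.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            count = 0
            pos = m.find(b"\n")
            while pos != -1:
                count += 1
                pos = m.find(b"\n", pos + 1)
            return count
```

The point is that the application states up front which file it wants and leaves the paging decisions to the kernel, rather than asking for one small chunk at a time.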
The best thing about a simple CSCAN algorithm is that it gives applications what they asked for and if the application doesn't know what it wants, well, that's hardly a system issue.
It's been a big issue for me. Go to a directory with a couple of large files (say a dvd rip) and do a "cat * > newfile". Watch your system come to a crawl.
You can disable swapping in Windows if you have sufficient RAM. The poster raises a very good point, but it's actually more important on servers than on clients (isn't Linux dead on the desktop anyway...?).
This is actually one of the very reasons (the other being multithreaded performance) why many of us use Windows Server 2003/2008 sometimes in preference to Linux.
I had been wondering about this myself; for some reason I was under the impression that BFS was no longer being maintained.
It turns out there is an up-to-date package for Ubuntu (I'm running 10.10) as well: http://launchpad.net/~chogydan/+archive/ppa
I thought I'd try it out as the installation was much more straightforward than I'd expected.
'uname -r' now reveals "2.6.35-22ck-generic" and, while this is just my subjective assessment, a few of the quirks I had noticed before on my own system, where things would get sluggish when switching between apps or opening/closing apps while running things that read/write to the disk, seem to have been ironed out.
I would love to test this in a more empirical manner, as I can now boot into either kernel to do comparisons, but I don't know of any software that would allow me to benchmark performance in a way that is sensitive to the optimizations the BFS allegedly implements.
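One crude way to put numbers on "sluggish" is to time small synced writes while a big copy runs in the background, once per kernel. This is only a sketch: the function name is made up, and treating fsync latency as a proxy for desktop responsiveness is an assumption, not an established benchmark.

```python
import os
import tempfile
import time


def fsync_write_latencies(directory, samples=20, size=4096):
    """Time `samples` small write+fsync pairs in `directory`, in seconds."""
    payload = b"x" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data out to the device
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    results = fsync_write_latencies(tempfile.gettempdir(), samples=5)
    print("worst: %.4fs, mean: %.4fs" % (max(results), sum(results) / len(results)))
```

Run it against a directory on the disk being hammered, and compare the worst case between the two kernels rather than the mean, since the stalls are what you notice on a desktop.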
Sorry dude, it looks like it's a hardware specific problem. I did that on nearly 700G of large files and then fired up the flight sim while it was still going. The only slow down was on file related activity, which is totally what you'd expect. I had it running full screen across two monitors without any drop in frame rate. AND I'm using economy hardware.
I've encountered situations where I'm trying to do something online and a task starts up due to a cron job that builds some kind of index. The index building should be in the background but somehow takes priority over what I'm doing on the desktop. Those kinds of cron jobs should be default scheduled in the background, not take priority over what is happening on the desktop.
That's great that you post your experiences with server scheduling in a topic about desktop scheduling. It's so relevant. No wait, it's not.
The boundary between the desktop space and the server space is rather fluid, and many of the problems visible on servers are also visible on desktops - and vice versa.
For example 'copying a large amount of data' on a server is similar to 'copying a big ISO on the desktop'. If the kernel sucks doing one then it will likely suck when doing the other as well.
So both cases should be handled by the kernel in an excellent fashion - with an optimization/tuning focus on desktop workloads, because they are almost always the more diverse ones, and hence are generally the technically more challenging cases as well.
Thanks,
Ingo
I often note that multiple simultaneous low-priority file copies implemented as:
run faster than multiple simultaneous high-priority copies implemented as:
If the copies are run one at a time, the higher priority rsync runs faster. For multiple copies, often the lower priority rsyncs run faster. Also, desktop usability is much improved with the lower priority rsyncs.
I suspect a priority inversion occurs inside the file system's write-back cache. At regular priority levels, data is not written back to disk in a timely manner. The ionice -c 3 gives the disk caches a higher priority than the rsync I/O commands, preventing the I/O commands from filling the cache and creating a priority inversion.
The Gnome GUI in Ubuntu is particularly vulnerable to this priority inversion, as by default it does multiple copies simultaneously inside a separate window. Ubuntu usually performs better than Windows, however. Between the A-V software in Windows and the tendency to swap applications out of memory to maximize disk cache, Windows usually performs the same copy operations more slowly than Ubuntu and with less system responsiveness.
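For reference, a minimal sketch of launching a copy in the idle I/O class from Python. The rsync options and the paths are placeholders, not the commands from the post above, and the helper name is made up.

```python
import shlex


def idle_io_command(command):
    """Prefix a command with `ionice -c 3` (the idle I/O scheduling class)."""
    return ["ionice", "-c", "3"] + list(command)


if __name__ == "__main__":
    # Hypothetical source and destination paths, for illustration only.
    cmd = idle_io_command(["rsync", "-a", "/path/to/src/", "/path/to/dst/"])
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the argument list separately makes it easy to wrap any other bulk job, such as a cron-driven indexer, the same way.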