The State of Linux IO Scheduling For the Desktop?

Posted by timothy on Saturday October 23, 2010 @06:38AM from the in-and-out-and-in-and-out dept.

pinkeen writes "I've used Linux as my work & play OS for 5+ years. The one thing that constantly drives me mad is its IO scheduling. When I'm copying a large amount of data in the background, everything else slows down to a crawl while the CPU utilization stays at 1-2%. The process which does the actual copying is highly prioritized in terms of I/O. This is completely unacceptable for a desktop OS. I've heard about the efforts of Con Kolivas and his Brainfuck Scheduler, but it's unsupported now and probably incompatible with latest kernels. Is there any way to fix this? How do you deal with this? I have a feeling that if this issue were to be fixed, the whole desktop would become way snappier, even if you're not doing any heavy IO in the background."

Update: 10/23 22:06 GMT by T: As reader ehntoo points out in the discussion below, contrary to the submitter's impression, "Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use." Thanks to ehntoo, and hat tip to Bill Huey.

have you tried ionice? by larry bagina · 2010-10-23 06:43 · Score: 5, Informative

have you tried ionice?

--
Do you even lift?
These aren't the 'roids you're looking for.
1. Re:have you tried ionice? by atrimtab · 2010-10-23 07:01 · Score: 5, Informative
  
  ionice works great in a terminal window, but isn't integrated into any of the Desktop GUIs.
  I suppose you could prefix the various file transfer commands used by the GUI with an added "ionice -c 3", but I haven't bothered to look.
  Using ionice to lower the i/o priority of various portions of MythTV like mythcommflag, mythtranscode, etc. can make it quite snappy.
  
  --
  Facebook is billions of individual "Skinner Boxes." And if you use it you are the pigeon!
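A minimal sketch of the prefixing idea from the comment above, in Python: it prepends `ionice -c 3` (the idle I/O class quoted there) to an arbitrary command. The helper names `with_idle_io` and `run_idle` are hypothetical, and the sketch assumes the `ionice` utility is installed and on the PATH.

```python
import subprocess

def with_idle_io(cmd):
    """Return cmd (a list of arguments) prefixed with `ionice -c 3`."""
    return ["ionice", "-c", "3", *cmd]

def run_idle(cmd):
    """Run cmd at idle I/O priority; raises if the command fails."""
    return subprocess.run(with_idle_io(cmd), check=True)
```

A call such as `run_idle(["cp", "big.iso", "/mnt/backup/"])` (paths made up for illustration) would start the copy in the idle class. Whether that helps depends on the active IO scheduler honoring I/O priorities.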
BFS Isn't Unsupported by ehntoo · 2010-10-23 06:44 · Score: 5, Informative

Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. http://ck.kolivas.org/patches/bfs/ He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use. http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/
Re:what about servers? by Anonymous Coward · 2010-10-23 06:55 · Score: 5, Informative

There are some interactive-response fixes queued up for 2.6.37 that may help (a lot!) with this stuff.
Start reading here: http://www.phoronix.com/scan.php?page=news_item&px=ODU0OQ
Re:Is it really only a matter of scheduling? by Anonymous Coward · 2010-10-23 07:06 · Score: 5, Informative

The kernel will preempt the process calling "readln", in other words putting it to sleep.
The kernel will make sure the I/O happens, allowing other processes to work at the same time.
You only need non-blocking code if your own process needs to do other things at the same time.
Probably not the IO scheduler by crlf · 2010-10-23 07:13 · Score: 5, Informative

This is almost certainly not the IO scheduler's problem. IO scheduling priorities are orthogonal to CPU scheduling priorities.
What you are likely running into is the dirty_ratio limits. In Linux, there is a memory threshold for "dirty memory" (memory that is destined to be written out to disk), that once crossed, will cause symptoms like you've described. The dirty_ratio values can be tuned via /proc, but beware that the kernel will internally add its own heuristics to the values you've plugged in.
When the threshold is crossed, in an attempt to "slow down the dirtiers", the Linux kernel will penalize (in rate-limited fashion) any and every task on the system that tries to allocate a page. This allocation may be in response to userland needing a new page, but it can also occur if the kernel is allocating memory for internal data structures in response to a system call the process made. When this happens, the kernel will force that allocating thread (again, rate-limited) to take part in the flushing process, under the (misguided) assumption that whoever is allocating a lot of memory is the same thread that is dirtying a lot of memory.
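To see which thresholds a given machine is using, the tunables can be read from procfs. This is only a sketch: the comment says "via /proc" without naming files, so the paths `/proc/sys/vm/dirty_ratio` and `/proc/sys/vm/dirty_background_ratio` are filled in here as an assumption about what is meant, and the function names are hypothetical.

```python
import os

def read_vm_tunable(name, base="/proc/sys/vm"):
    """Read one integer VM tunable (e.g. dirty_ratio) from procfs."""
    with open(os.path.join(base, name)) as f:
        return int(f.read().strip())

def dirty_thresholds(base="/proc/sys/vm"):
    """Return the background and hard dirty-memory percentages as a dict."""
    names = ("dirty_background_ratio", "dirty_ratio")
    return {name: read_vm_tunable(name, base) for name in names}
```

The `base` parameter exists only so the reader can be pointed at a different directory for testing. As the comment notes, the kernel applies its own heuristics on top of these values, so they are a starting point rather than the exact limit.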
There are a couple of ways to work around this problem (which is very typical when copying large amounts of data). For one, the copying process can be fixed to rate limit itself, and to synchronously flush data at some reasonable interval. Another way that a system administrator can manage this sort of task (if automated of course) is to use Linux's support for memory controllers which essentially isolates the memory subsystem performance between tasks. Unfortunately, its support is still incomplete and I don't know of any popular distributions that automate this cgroup subsystem's use.
Either way, it is very unlikely to be the IO scheduler.
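The first workaround (a copier that rate limits itself and flushes synchronously at intervals) could look roughly like the following Python sketch. The function name, chunk size, flush interval and optional byte-rate cap are all illustrative assumptions, not values taken from the comment.

```python
import os
import time

def throttled_copy(src, dst, chunk_size=1 << 20,
                   flush_every=16 << 20, max_bytes_per_sec=None):
    """Copy src to dst, fsync-ing every flush_every bytes.

    If max_bytes_per_sec is given, sleep as needed to stay under it.
    Returns the number of bytes copied.
    """
    copied = 0
    since_flush = 0
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
            since_flush += len(chunk)
            if since_flush >= flush_every:
                # Push dirty pages to disk now, so they never pile up
                # toward the system-wide dirty threshold.
                fout.flush()
                os.fsync(fout.fileno())
                since_flush = 0
            if max_bytes_per_sec:
                # Sleep until elapsed time matches the target rate.
                target = copied / max_bytes_per_sec
                delay = target - (time.monotonic() - start)
                if delay > 0:
                    time.sleep(delay)
        fout.flush()
        os.fsync(fout.fileno())
    return copied
```

The periodic `fsync` bounds how much dirty data this one process can have outstanding, at the cost of lower peak throughput for the copy itself.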
Re:what about servers? by joaosantos · 2010-10-23 07:34 · Score: 5, Informative

I just did it and didn't notice any slowdown.
Re:Is it really only a matter of scheduling? by Ingo Molnar · 2010-10-23 07:55 · Score: 5, Informative

Yes. Here there is another problem at play: cp reads in the whole (big) file and then writes it out. This brings the whole file into the Linux pagecache (file cache).
That, if the VM is not fully detecting that linear copy correctly, can blow a lot of useful app data (all cached) out of the pagecache. That in turn has to be read back once you click within Firefox, etc. - which generates IO and is a few orders of magnitude slower than reading the cached copy. That such data tends to be fragmented (all around on the disk in various small files) and that there is a large copy going on does not help either.
Catastrophic slowdowns on the desktop are typically such combined 'perfect storms' between multiple kernel subsystems. (for that reason they also tend to be the hardest ones to fix.)
It would be useful if /bin/cp explicitly dropped use-once data that it reads into the pagecache - there are syscalls for that.
And yes, we'd very much like to fix such slowdowns via heuristics as well (detecting large sequential IO and not letting it poison the existing cache), so good bugreports and reproducing testcases sent to linux-kernel@vger.kernel.org and people willing to try out experimental kernel patches would definitely be welcome.
Thanks,
Ingo
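The comment above does not name the syscalls it has in mind. One plausible candidate is `posix_fadvise` with `POSIX_FADV_DONTNEED`, and the Python sketch below shows a copy routine using it under that assumption. The function name `copy_dropping_cache` is hypothetical, and the call is only advice that the kernel is free to ignore.

```python
import os

def copy_dropping_cache(src, dst, chunk_size=1 << 20):
    """Copy src to dst, then advise the kernel to drop both files' cached pages.

    Returns the number of bytes copied.
    """
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
        # Dirty pages cannot be dropped, so write them out first.
        fout.flush()
        os.fsync(fout.fileno())
        if hasattr(os, "posix_fadvise"):  # not available on every platform
            for f in (fin, fout):
                try:
                    # offset 0, length 0: the advice covers the whole file
                    os.posix_fadvise(f.fileno(), 0, 0,
                                     os.POSIX_FADV_DONTNEED)
                except OSError:
                    pass  # advice is best-effort; the copy already succeeded
    return copied
```

One trade-off: this also evicts the source file's pages if another application had them cached for its own use, so it fits large use-once copies better than general-purpose copying.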
Re:what about servers? by Ingo Molnar · 2010-10-23 08:33 · Score: 5, Informative

I think the Phoronix article you linked to is confusing the IO scheduler and the VM (both of which can cause many seconds of unwanted delays during GUI operations) with the CPU scheduler.
The CPU scheduler patch referenced in the Phoronix article deals with delays experienced during high CPU loads - a dozen or more tasks running at once and all burning CPU time actively. Delays of up to 45 milliseconds were reported and they were fixed to be as low as 29 milliseconds.
Also, that scheduler fix is not a v2.6.37 item: I have merged a slightly different version and sent it to Linus, so it's included in v2.6.36 already: you can see the commit here.
If you are seeing human-perceptible delays - especially in the 'several seconds' time scale, then they are quite likely not related to the CPU scheduler (unless you are running some extreme workload) but more likely to the CFQ IO scheduler or to the VM cache management policies.
In the CPU scheduler we usually deal with milliseconds-level delays and unfairnesses - which rarely rise to the level of human perception.
Sometimes, if you are really sensitive to smooth scheduling, you can see those kinds of effects visually via 'game smoothness' or perhaps 'Firefox scrolling smoothness' - but anything on the 'several seconds' timescale on a typical Linux desktop has to have some connection with IO.
Thanks,
Ingo