The State of Linux IO Scheduling For the Desktop?
pinkeen writes "I've used Linux as my work & play OS for 5+ years. The one thing that constantly drives me mad is its IO scheduling. When I'm copying a large amount of data in the background, everything else slows down to a crawl while the CPU utilization stays at 1-2%. The process which does the actual copying is highly prioritized in terms of I/O. This is completely unacceptable for a desktop OS. I've heard about the efforts of Con Kolivas and his Brainfuck Scheduler, but it's unsupported now and probably incompatible with the latest kernels. Is there any way to fix this? How do you deal with this? I have a feeling that if this issue were fixed, the whole desktop would become way snappier, even if you're not doing any heavy IO in the background."
Update: 10/23 22:06 GMT by T : As reader ehntoo points out in the discussion below, contrary to the submitter's impression, "Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use." Thanks to ehntoo, and hat tip to Bill Huey.
Isn't this also relevant when using Linux on a server? I mean, if one process or thread is copying a large file, you don't want your server to come to a crawl.
It doesn't sound like just a "desktop" issue to me.
If Pandora's box is destined to be opened, *I* want to be the one to open it.
..download and compile the 2.6.36 kernel. A feature article on the changes can be found at http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-36-1103009.html
A very very easy to follow guide can be found at http://kernel.net/articles/how-to-compile-linux-kernel.html
Sidenote - What is up with not being able to paste links? That's annoying.
Perhaps if Con Kolivas named his scheduler something else, it might gain more traction ...
If the CPU utilization is that low, it's an I/O scheduling problem. See Linux I/O scheduling.
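Before blaming a particular I/O scheduler it helps to know which one a disk is actually using. Here is a minimal sketch, assuming the usual sysfs layout where /sys/block/<device>/queue/scheduler lists the available schedulers and marks the active one in square brackets. The device name "sda" and both function names are illustrative assumptions, not something from this thread.

```python
from pathlib import Path


def active_scheduler(sysfs_line: str) -> str:
    """Return the scheduler name that is wrapped in [brackets] in a sysfs line."""
    for token in sysfs_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active scheduler marked in {sysfs_line!r}")


def read_active_scheduler(device: str = "sda") -> str:
    """Read the active scheduler for a block device ('sda' is an assumed name)."""
    path = Path(f"/sys/block/{device}/queue/scheduler")
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    # Example line as it might appear on a 2.6-era desktop kernel.
    print(active_scheduler("noop deadline [cfq]"))
```

The parsing is kept separate from the file read so it can be checked on a machine that has no such device.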
The CFQ scheduler is supposed to be a fair queuing system across processes, so you shouldn't have a starvation problem. Are you thrashing the virtual memory system? How much I/O is going into swapping? (Really, today you shouldn't have any swapping; RAM is too cheap and disk is too slow.)
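One rough way to answer the swapping question is to compare the swap counters in /proc/vmstat before and after a big copy. The sketch below assumes the pswpin and pswpout lines (pages swapped in and out since boot) are present; the function names are made up for illustration.

```python
def swap_counters(vmstat_text: str) -> dict:
    """Extract the pswpin/pswpout page counters from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before: dict, after: dict) -> dict:
    """Pages swapped in/out between two snapshots of the counters."""
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}


if __name__ == "__main__":
    # Take one snapshot; a second one after the copy gives the delta.
    with open("/proc/vmstat") as handle:
        print(swap_counters(handle.read()))
```

If both deltas stay at zero during the copy, swapping is probably not what is making the desktop crawl.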
We switched our dedicated web servers from Linux to FreeBSD and OpenSolaris. When we uploaded videos (usually 10GB or larger) to the server over our 100Mbps internet connection, or when a client was downloading the videos, people accessing the web server complained that it took seconds to serve each web page. The videos were on magnetic hard drives; the OS and web server were on SSDs (which were mirrored in RAM). Server logs were fine, CPU utilisation was low, and the servers have a 1Gbps connection. We put it down to I/O scheduling. Switching the OS solved the problem.
Then there are programs like Firefox, which continually write to sqlite databases, which causes multiple fsync() calls, which will flush the disk cache each time if you're running on an ext3 filesystem. All because NTFS used to eat your bookmarks file if Windows crashed.
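To get a feel for what each fsync() costs on a given filesystem, something like the following can be run once on an idle disk and once while a large copy is in progress. It is only a sketch: the function name, the 4 KiB payload and the default count are arbitrary assumptions, and the numbers will depend heavily on the filesystem and mount options.

```python
import os
import tempfile
import time


def time_fsyncs(directory: str, count: int = 5, payload: bytes = b"x" * 4096) -> list:
    """Write small chunks to a temp file and time the fsync() after each write."""
    timings = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # block until the kernel reports the data as on disk
            timings.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return timings


if __name__ == "__main__":
    for seconds in time_fsyncs(tempfile.gettempdir()):
        print(f"fsync took {seconds * 1000:.2f} ms")
```

Point the directory argument at the filesystem you care about; the temp directory is often on a different one.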
FYI, the IO scheduler and the CPU scheduler are two completely different beasts.
The IO scheduler lives in block/cfq-iosched.c and is maintained by Jens Axboe, while the CPU scheduler lives in kernel/sched*.c and is maintained by Peter Zijlstra and myself.
The CPU scheduler decides the order of how application code is executed on CPUs (and because a CPU can run only one app at a time the scheduler switches between apps back and forth quickly, giving the grand illusion of all apps running at once) - while the IO scheduler decides how IO requests (issued by apps) reading from (or writing to) disks are ordered.
The two schedulers are very different in nature, but both can indeed cause similar-looking bad symptoms on the desktop - which is one of the reasons why people keep mixing them up.
If you see problems while copying big files then there's a fair chance that it's an IO scheduler problem (ionice might help you there, or block cgroups).
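As a minimal sketch of the ionice suggestion: the helper below prefixes a command with "ionice -c 3", which asks for the idle I/O class so the copy only gets disk time when nothing else wants it. The helper names are hypothetical, and whether the class is honored depends on the active I/O scheduler (CFQ does honor it).

```python
import subprocess


def idle_io_command(command: list) -> list:
    """Prefix a command so it runs in the idle I/O scheduling class."""
    return ["ionice", "-c", "3", *command]


def copy_with_idle_io(src: str, dst: str) -> subprocess.CompletedProcess:
    """Run 'cp' under ionice; raises CalledProcessError if the copy fails."""
    return subprocess.run(idle_io_command(["cp", "--", src, dst]), check=True)


if __name__ == "__main__":
    # Show the command line that would be executed for a hypothetical copy.
    print(" ".join(idle_io_command(["cp", "--", "big.iso", "/mnt/backup/"])))
```

The same prefix works for any long-running job (rsync, tar, a backup script), not just cp.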
I'd like to note for the sake of completeness that the two kinds of symptoms are not always totally separate: sometimes problems during IO workloads were caused by the CPU scheduler. It's relatively rare though.
Analysing (and fixing ;-) such problems is generally a difficult task. You should mail your bug description to linux-kernel@vger.kernel.org and you will probably be asked there to perform a trace so that we can see where the delays are coming from.
On a related note I think one could make a fairly strong argument that there should be more coupling between the IO scheduler and the CPU scheduler, to help common desktop use cases.
Incidentally there is a fairly recent feature submission by Mike Galbraith that extends the (CPU) scheduler with the ability to group tasks more intelligently: see Mike's auto-group scheduler patch.
This feature uses cgroups for block IO requests as well.
You might want to give it a try, it might improve your large-copy workload latencies significantly. Please mail bug (or success) reports to Mike, Peter or me.
You need to apply the above patch on top of Linus's very latest tree, or on top of the scheduler development tree (which includes Linus's latest), which can be found in the -tip tree.
(Continuing this discussion over email is probably more efficient.)
Thanks,
Ingo
I would definitely ditch an OS that fucked up a file copy because I used the computer for something else while I was waiting.
The malloc/new that fails causing a process to crash might not be the process that is consuming huge amounts of memory in the first place.
will force everything back into memory
Exactly, and it is a very good solution when things get out of hand. At least once or twice a month I recycle swap in order to prevent an imminent freeze. Maybe every 4 or 5 months I lose the system state to I/O scheduling problems on my desktop. On the other hand, my servers usually reboot only for power maintenance and/or fried hardware...
That's great that you post your experiences with server scheduling in a topic about desktop scheduling. It's so relevant. No wait, it's not.
-- Linux user #369862
The mind boggles: you're suggesting a manual tweak for each large copy operation. It's not the user's job to make the computer more efficient.
You are joking, right (OP)? I'm using Debian, and I routinely copy TB(s) of data from hard drive to hard drive via SATA and/or USB 2.0, and though the USB transfer speed is fairly slow, my system doesn't slow appreciably at all.
Weird, I've been using Linux for 10 years now, and one thing Linux does really well is move large amounts of data around without killing the system (usability).
Lotsa RAM is your friend; also make sure your / filesystem isn't on the HDD that you're moving the data from/to. That does slow access down a bit.
jaz
Life is what happens to you while you are busy making other plans. No-one sees motorcycles
Renaming is O(filename), usually a single table entry. Deleting is probably more along the lines of O(filesize).
You have to keep track of the free blocks, too.
"Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
Cite? What exactly is the difference between "public" and "kernel"?
If all processes see the same 1G the distinction isn't meaningful, especially in this context.
You're insane. If your computer ever silently drops "a few bits" while copying stuff, there's something seriously wrong with your OS or your hardware, and things will break whether or not you're using the computer while copying. You might as well sacrifice a chicken to make sure the data transfer works; it'll have about the same effect.
Switch back to Slashdot's D1 system.
Generally Windows runs badly without a swap. Don't listen to people who tell you to disable it. You should have a swap file on Windows no matter how much memory you have.
Tweakers who don't really understand anything about Windows paging often conclude turning off the swap is a good idea, because they only run trivial applications and don't experience certain memory-backed I/O operations failing with it off. They do see an initial speed boost though. The reason is NT is very pessimistic about memory. Windows assumes you will need to page out to disk. It therefore flushes the set of static pages to disk almost right away. This is why there is so much more disk thrashing on Windows than on, say, Linux when you start an application and plenty of memory is free. It will do its best to keep the working set out of the page file of course. This does give Windows a performance advantage under memory pressure however. When there is not enough memory to start a new application, Windows can just drop the pages of the application being paged out from memory without the need to flush them to disk, because they are already there; Linux will need to write those pages.
Windows boxes (desktops anyway) tend to have large numbers of processes running in the background, so they usually are under that memory pressure.
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
No, massive unfairness is just as bad on the server as it is on the desktop - in all but a few select batch processing situations.
Replace 'desktop' with 'database', 'Apache', 'Samba' or 'number crunching job' and you get the same kind of badness.
There's not much difference really. If it sucks on the desktop then it sucks on the server too: why would it be a good server if it slows down a DB/Apache/Samba/number-crunching-job while prioritizing some large copy operation?
I have 8GB of RAM, why do I need swap?
So you can use that 8 gigs for something else, or not buy 8 gigs in the first place. In particular, so when a program is using large amounts of memory for no good reason, it can be swapped out, maybe even just for disk cache.
Also, hibernate.
Don't thank God, thank a doctor!
Now put ten million files on it and try again. This is about actually used filesystems, not your stash of 500 ISO DVD rips, your bookmarks and your resume.
I was promised a flying car. Where is my flying car?
Yeah, and I hope it continues but I don't know any of our critical vendors who develop for it. Forgive me for marginalizing the open community on this particular item but a lot of the steam behind Solaris (and by reflection OpenSolaris) was the industrial use of the o/s. Not just web servers but all kinds of specialized applications tied to command and control of the internet. Companies like Siemens, Alcatel-Lucent, Cisco etc. all have or had Solaris-based platforms. Our company has not installed a Solaris application system in two years. Everything is Linux, Windows, AIX and any vendor who comes to us with Solaris might as well save their breath. As well, we have not bought any Sun h/w in two years. Our h/w criterion is basically: how cheaply and reliably can we run VMWare?
I fear that Solaris/OpenSolaris is becoming at best a niche operating system. Sun h/w and Solaris are the walking dead and I don't see Oracle being able to do anything about it.
Too bad IBM didn't buy Sun. Solaris really would have had a chance to grow if IBM wanted to push it.