The State of Linux IO Scheduling For the Desktop?
pinkeen writes "I've used Linux as my work & play OS for 5+ years. The one thing that constantly drives me mad is its IO scheduling. When I'm copying a large amount of data in the background, everything else slows down to a crawl while the CPU utilization stays at 1-2%. The process which does the actual copying is highly prioritized in terms of I/O. This is completely unacceptable for a desktop OS. I've heard about the efforts of Con Kolivas and his Brainfuck Scheduler, but it's unsupported now and probably incompatible with latest kernels. Is there any way to fix this? How do you deal with this? I have a feeling that if this issue were to be fixed, the whole desktop would become way snappier, even if you're not doing any heavy IO in the background."
Update: 10/23 22:06 GMT by T : As reader ehntoo points out in the discussion below, contrary to the submitter's impression, "Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use." Thanks to ehntoo, and hat tip to Bill Huey.
This issue got so bad for me I switched to FreeBSD.
Isn't this also relevant when using Linux on a server? I mean, if one process or thread is copying a large file, you don't want your server to come to a crawl.
It doesn't sound like just a "desktop" issue to me.
If Pandora's box is destined to be opened, *I* want to be the one to open it.
have you tried ionice?
Do you even lift?
These aren't the 'roids you're looking for.
..download and compile the 2.6.36 kernel. A feature of the changes can be found at http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-36-1103009.html
A very very easy to follow guide can be found at http://kernel.net/articles/how-to-compile-linux-kernel.html
Sidenote - What is up with not being able to paste links? That's annoying.
Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. http://ck.kolivas.org/patches/bfs/ He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use. http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/
Perhaps if Con Kolivas named his scheduler something else, it might gain more traction ...
If the CPU utilization is that low, it's an I/O scheduling problem. See Linux I/O scheduling.
The CFQ scheduler is supposed to be a fair queuing system across processes, so you shouldn't have a starvation problem. Are you thrashing the virtual memory system? How much I/O is going into swapping? (Really, today you shouldn't have any swapping; RAM is too cheap and disk is too slow.)
Using Mandriva 2010.0 (or any earlier build, for that matter). Not sure if their stock kernel is using scheduling patches or not, but the only time I've ever seen slowdowns on my wimpy P4 machine is with really serious oversubscribing of memory, which obviously will turn it into a dog. IO seems to have little to no effect, however.
So maybe you just need a better desktop distribution? A newer one perhaps? Don't expect that if you slap just any old distro on a machine and call it a workstation that you get something beyond garbage. I'd expect Suse and/or Fedora to work equally well. Ubuntu is probably doing OK but I wouldn't know. Most of the smaller/less mainstream distros however are quite random, and running something like CentOS on a desktop is just asking for a crappy desktop.
"Malo periculosam, libertatem quam quietam servitutem." -- Jefferson
The 2.6.36 kernel supposedly has a fix for this issue. I haven't been able to test it yet myself, but it sounds like they finally tracked it down. See here for more information.
I've wondered on occasion if this problem is really only due to scheduling. After all, most of us still write our file access code more or less as follows:
x = fopen('somefilename');
while (!eof(x)) {
    print readln(x, 1024); /* ---- */
}
fclose(x);
Point being, there's nothing that tells the marked line that the process should gracefully go to sleep while the drive is doing its thing, and there's no callback vector defined either- nothing that indicates we're dealing with non-blocking I/O. I'd like to think that our compilers have silently been improved to hide those implementation details from us, but I have no proof that this is the case. Unless the system functions use some dirty stack manipulation voodoo to extract the return address of the function and use that as callback vector?
Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
"I've heard about the efforts of Con Kolivas and his Brainfuck Scheduler, but it's unsupported now and probably incompatible with latest kernels."
I don't know what you're talking about: http://users.on.net/~ckolivas/kernel/
It's updated for the latest kernel which came out just yesterday.
It is being worked on: http://kernelnewbies.org/Linux_2_6_36#head-738bffb3415051b478ecdfd2eabb0294e35146a9 and http://lkml.org/lkml/2010/10/19/123
Did Con unleash some of his trolls on Slashdot?
Yeah, I think he just did ...
Supposedly the 2.6.36 kernel addresses this issue. I don't know if the problem has been completely fixed, or mostly fixed, or what, since I haven't tried that kernel yet (too bad there isn't an easy way to install kernels in a cross-distro fashion!).
Read the bullet points here, particularly the ones in the middle, as there have been multiple things done to this kernel to improve performance:
http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-36-1103009.html?page=6
Promote true freedom - support standards and interoperability.
I ask this question with utmost sincerity. Folks over here believe it is indeed dead. I am afraid I agree with them. I hear so little about desktop Linux these days. It's all about iOS, Android and RIM. The future does not appear to be on track to change anytime soon. Now tell me I am wrong and why.
i just run it and let it own the computer for whatever time it takes = anywhere from 10 to 30 minutes, and just walk off, maybe go get a fresh cup of coffee or cold beer depending on where i am and what time of day it is. one thing i dont want is a borked copy because i was too impatient to let it do its job.
Politics is Treachery, Religion is Brainwashing
You mean an OS like Windows which will swap out the web browser you're using when you copy a big file from one disk to another even though it's far too large for the entire file to fit in the disk cache?
I can remember seeing this issue with Linux as far back as 1999. This is bad not only for the desktop, but also for the server. I also have experience with Solaris workstations and servers, and they usually don't behave this way.
I ran into the same problems and ended up switching to the "deadline" scheduler. Haven't had a single problem since. I changed it via the "elevator=deadline" on the kernel boot prompt, but you can change it on the fly for individual devices. See Configuring and Optimizing Your I/O Scheduler to see how.
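For the "on the fly for individual devices" part, here is a minimal Python sketch. It assumes the usual sysfs file /sys/block/<device>/queue/scheduler, whose contents list the available schedulers with the active one in square brackets (something like "noop deadline [cfq]"). The device name sda and the helper names are only illustrative, and writing the file needs root.

```python
from pathlib import Path


def parse_schedulers(text):
    """Split sysfs scheduler text into (active, available)."""
    active = None
    available = []
    for name in text.split():
        # The active scheduler is shown in brackets, e.g. "[cfq]".
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def scheduler_path(device):
    # Assumed sysfs layout: /sys/block/<device>/queue/scheduler
    return Path("/sys/block") / device / "queue" / "scheduler"


def set_scheduler(device, name):
    """Select an I/O scheduler for one device (requires root)."""
    path = scheduler_path(device)
    _, available = parse_schedulers(path.read_text())
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    path.write_text(name)


if __name__ == "__main__":
    path = scheduler_path("sda")  # hypothetical device name
    if path.exists():
        print(parse_schedulers(path.read_text()))
```

Unlike the elevator= boot parameter, a change made this way does not survive a reboot.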
This is not a thread scheduling issue, it's a disk scheduling issue. If CPU utilization is only 1-2% and things aren't snappy then the issue is because the foreground process's I/Os aren't given higher (high enough?) priority. Easy enough to believe too, a whole lot of writes get cached and then queued up. With an elevator algorithm they'll likely all get performed before any reads required by the foreground process.
"noop scheduler: just service next request in the queue without any algorithm to prefer this or that request."
Yes, because everybody knows that kernel scheduling algorithms are far more tunable on Windows than on Linux.
-fb Everything not expressly forbidden is now mandatory.
I remember using OS/2 (IBM's desktop OS) and i was always amazed that you could format a floppy and do other tasks like nothing else was going on. I never did understand why that never seemed to make it into the mainstream.
This is not a case of Linux IO schedulers being unsuitable for the desktop, but more a case of desktop applications being written in a horrendous way in terms of data access. The general pattern is to open up a file object, load in a few hundred kilobytes, process this, then ask the operating system for more. This is a small inefficiency when the resource is doing nothing, but if the disk is actually busy, then it will probably be doing something else by the time you ask for it to read a little bit more. Not to mention the habit of reading through a few hundred resource files one at a time in seemingly random order, and blocking every time it reads, because the application programmer is too lazy to think about what resources the app is using.
Linux has such a nice implementation of mmap, which works by letting Linux actually know ahead of time what files you are interested in and managing them itself, without the application programmer worrying his pretty little head over it. Other options are running multiple non-blocking reads at the same time and loading the right amount of data and the right files to begin with.
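To make the mmap suggestion concrete, here is a small Python sketch, not taken from any particular application: it maps a whole file read-only and scans it, leaving paging and read-ahead to the kernel instead of issuing one blocking read per chunk. The function name and the newline-counting task are just placeholders.

```python
import mmap
import os


def count_newlines_mmap(path):
    """Count newline bytes by mapping the file instead of looping over read()."""
    if os.path.getsize(path) == 0:
        return 0  # an empty file cannot be mapped
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the kernel pages it in on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            count = 0
            pos = m.find(b"\n")
            while pos != -1:
                count += 1
                pos = m.find(b"\n", pos + 1)
            return count
```

The application never chooses a buffer size here; how much to read ahead and what to keep cached is the kernel's decision.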
The best thing about a simple CSCAN algorithm is that it gives applications what they asked for and if the application doesn't know what it wants, well, that's hardly a system issue.
When Argumentum ad Hominem falls short, try Argumentum ad Matrem
This is almost certainly not the IO scheduler's problem. IO scheduling priorities are orthogonal to CPU scheduling priorities.
What you are likely running into is the dirty_ratio limits. In Linux, there is a memory threshold for "dirty memory" (memory that is destined to be written out to disk), that once crossed, will cause symptoms like you've described. The dirty_ratio values can be tuned via /proc, but beware that the kernel will internally add its own heuristics to the values you've plugged in.
When the threshold is crossed, in an attempt to "slow down the dirtiers", the Linux kernel will penalize (in rate-limited fashion) any and every task on the system that tries to allocate a page. This allocation may be in response to userland needing a new page, but it can also occur if the kernel is allocating memory for internal data structures in response to a system call the process did. When this happens, the kernel will force that allocating thread (again, rate-limited) to take part in the flushing process, under the (misguided) assumption that whoever is allocating a lot of memory is the same thread that is dirtying a lot of memory.
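A rough way to see how close a machine is to that threshold, as a Python sketch. It assumes the Dirty, Writeback and MemTotal fields of /proc/meminfo and the /proc/sys/vm/dirty_ratio tunable. It measures against MemTotal, whereas the kernel uses its own notion of dirtyable memory plus the heuristics mentioned above, so treat the number as a ballpark only. The function names are made up for illustration.

```python
import os


def parse_meminfo(text):
    """Return {field: value in kB} from /proc/meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_pressure(meminfo_text, dirty_ratio_percent):
    """Rough fraction of the dirty limit in use (1.0 means at the limit)."""
    info = parse_meminfo(meminfo_text)
    # Approximation: percentage of MemTotal, not of "dirtyable" memory.
    limit_kb = info["MemTotal"] * dirty_ratio_percent / 100
    return (info["Dirty"] + info.get("Writeback", 0)) / limit_kb


if __name__ == "__main__":
    if os.path.exists("/proc/sys/vm/dirty_ratio"):
        with open("/proc/meminfo") as f:
            meminfo = f.read()
        with open("/proc/sys/vm/dirty_ratio") as f:
            ratio = int(f.read())
        if ratio > 0:  # 0 means a byte-based limit is configured instead
            print(f"dirty_ratio={ratio}% pressure={dirty_pressure(meminfo, ratio):.1%}")
```

Running it in a loop during a large copy should show the figure climbing toward the limit just before the stalls start, if this explanation applies to your machine.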
There are a couple of ways to work around this problem (which is very typical when copying large amounts of data). For one, the copying process can be fixed to rate limit itself, and to synchronously flush data at some reasonable interval. Another way that a system administrator can manage this sort of task (if automated of course) is to use Linux's support for memory controllers, which essentially isolates the memory subsystem performance between tasks. Unfortunately, its support is still incomplete and I don't know of any popular distributions that automate this cgroup subsystem's use.
Either way, it is very unlikely to be the IO scheduler.
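The first workaround (a copier that rate limits itself and flushes synchronously at intervals) could look roughly like this in Python. It is a sketch only: the chunk size, flush interval and rate cap are arbitrary illustration values, not tuned recommendations.

```python
import os
import time


def throttled_copy(src, dst, chunk=1 << 20, flush_every=16 << 20,
                   max_bytes_per_sec=None):
    """Copy src to dst, calling fsync() periodically so dirty pages never pile up."""
    copied = 0
    since_flush = 0
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk)
            if not block:
                break
            fout.write(block)
            copied += len(block)
            since_flush += len(block)
            if since_flush >= flush_every:
                fout.flush()
                os.fsync(fout.fileno())  # synchronous flush to disk
                since_flush = 0
            if max_bytes_per_sec:
                # Sleep until we are back under the target average rate.
                delay = copied / max_bytes_per_sec - (time.monotonic() - start)
                if delay > 0:
                    time.sleep(delay)
        fout.flush()
        os.fsync(fout.fileno())
    return copied
```

Because the copier waits for its own writes every flush_every bytes, it is the one that pays for the writeback rather than whichever task happens to allocate a page next.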
It seems to me you don't really know what you're talking about. As much as one wants one's file copying to be finished sooner, it should never EVER impair responsiveness of a workstation. That's what multitasking operating systems with GUIs are all about. It has nothing to do with Linux or its kernel knowing where its priorities should be.
Have you ever heard of the BitTorrent client Transmission? Whenever the thing has anything to do (downloading / uploading) I have an almost useless desktop. Now, I don't really care how fast it writes to disk as long as it keeps a decent pace, but I do care about being able to be productive while it's doing its job, otherwise I could resort to MS-DOS or something.
You are obviously confusing concurrency with latency. Latency has been proved time and again to be THE decisive factor for desktop users. It's all on the Internet, just do some googling and you'll find case studies which show what is more important to users - whether their 700Mb DivX file copy finishes in 3 minutes instead of 5 or whether their computer keeps being as responsive as they are used to, during those minutes. Users always prefer latency, which is why Con Kolivas' work is appreciated, regardless of what you may have heard (in particular from Linus.) His whole argument with the opponents of a plugin-scheduler (and scheduler plugin system) revolved around the fact that what works for servers doesn't always work for desktops.
Gee, most of us *nix people - what did that guy call us, something about smoking roosters over small pieces of wood - know that when you need to copy a few gigabytes in background, you use "nice" and crank the priority way down. This has been around since something like 1975 or so.
Don't take life too seriously; it isn't permanent.
Hey, I'm all for grabbing a beer any time of the day, but surely you don't think watching a YouTube video, sending emails, playing chess, or shopping online on your machine as it is copying a file in the background will "bork" the copy. I would toss any O/S that would do such thing.
Is that a roll of dimes in your pocket or are you happy to see me?
There's another massive problem with I/O scheduling on Linux: all of the schedulers are designed for physical disks. With solid state drives as opposed to physical spinning platters, an elevator algorithm is useless and only serves to reduce performance. With solid state drives, the best scheduler is currently noop, which doesn't implement priorities. I prototyped a lottery based scheduler for a class that would allow ionice to be used in a sensible way on solid state drives, but never got it into a state where it didn't crash the kernel. The whole system does seem a little massively out of date.
OP didn't state that he wants to have a fine tunable kernel scheduling algorithm. He stated a problem and is looking for a solution. So, if the problem has already been fixed in another system (So you don't need to tune the kernel scheduling algorithm there... It just works), it is irrelevant whether you actually could tune the kernel scheduling algorithm if you wanted to.
Not saying, that GP wasn't a troll or a flamebait - he obviously was. But just noting that your answer didn't really refute his post in any way.
You can disable swapping in Windows if you have sufficient RAM. The poster raises a very good point, but it's actually more important in servers than clients (isn't Linux anyway dead on the desktop...?).
This is actually one of the very reasons (the other being multithreaded performance) why many of us use Windows Server 2003/2008 sometimes in preference to Linux.
On my current openSUSE 11.3 install I've only observed severe slowdown whenever I read/write large amounts of data from/to NTFS partitions. Similar operations that only involve ext4 largely remain unnoticed. My best guess would be, that the NTFS-3G driver was written around a spec that was for one thing closed and, perhaps more importantly, not designed with the Linux kernel in mind.
If you are doing something non-interactive that uses a lot of I/O, use ionice. Experiment, but I find
ionice -p [pid] -c 2 -n 7
to produce reasonable results.
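If you want to apply that from a script, a thin Python wrapper around the same command might look like the sketch below. It assumes the ionice binary from util-linux is on the PATH and uses only the -c, -n and -p options shown above; the function names are made up.

```python
import subprocess


def ionice_argv(pid, io_class=2, level=7):
    """Build the ionice command line for an already running process."""
    if io_class not in (1, 2, 3):
        raise ValueError("class must be 1 (realtime), 2 (best-effort) or 3 (idle)")
    argv = ["ionice", "-c", str(io_class)]
    if io_class != 3:
        # Priority levels run from 0 (highest) to 7 (lowest); idle takes none.
        if not 0 <= level <= 7:
            raise ValueError("level must be between 0 and 7")
        argv += ["-n", str(level)]
    return argv + ["-p", str(pid)]


def set_io_priority(pid, io_class=2, level=7):
    """Run ionice on pid; raises CalledProcessError if the command fails."""
    subprocess.run(ionice_argv(pid, io_class, level), check=True)
```

With the defaults this runs exactly the command above: best-effort class, lowest priority within it.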
You can disable swapping in Windows if you have sufficient RAM.
I tried that once on XP and several programs barfed. For example I seem to remember that Premiere simply wouldn't run if you didn't have a swap file, because it did some wacky things with virtual memory allocations; perhaps the newer versions aren't so braindead.
My absolutely puny hardware (all 5+ years old, or netbooks) does not experience this problem at all running different releases of Ubuntu. I did notice that Transmission sometimes chewed up too much processor when I had 10+ torrents going, but my bulk drive was NTFS. After I formatted it to ext4, even that went away. I routinely copy multiple GB files intra-drive, inter-drive, and intranetwork while browsing, youtubing, etc.
Maybe you're using an NTFS filesystem that isn't as efficient?
Again, my hardware is majorly obsolete. My only "multicore" setup is on a hyperthreading Atom.
--why?
FYI, the IO scheduler and the CPU scheduler are two completely different beasts.
The IO scheduler lives in block/cfq-iosched.c and is maintained by Jens Axboe, while the CPU scheduler lives in kernel/sched*.c and is maintained by Peter Zijlstra and myself.
The CPU scheduler decides the order of how application code is executed on CPUs (and because a CPU can run only one app at a time the scheduler switches between apps back and forth quickly, giving the grand illusion of all apps running at once) - while the IO scheduler decides how IO requests (issued by apps) reading from (or writing to) disks are ordered.
The two schedulers are very different in nature, but both can indeed cause similar looking bad symptoms on the desktop though - which is one of the reasons why people keep mixing them up.
If you see problems while copying big files then there's a fair chance that it's an IO scheduler problem (ionice might help you there, or block cgroups).
I'd like to note for the sake of completeness that the two kinds of symptoms are not always totally separate: sometimes problems during IO workloads were caused by the CPU scheduler. It's relatively rare though.
Analysing (and fixing ;-) such problems is generally a difficult task. You should mail your bug description to linux-kernel@vger.kernel.org and you will probably be asked there to perform a trace so that we can see where the delays are coming from.
On a related note i think one could make a fairly strong argument that there should be more coupling between the IO scheduler and the CPU scheduler, to help common desktop usecases.
Incidentally there is a fairly recent feature submission by Mike Galbraith that extends the (CPU) scheduler with a new feature which adds the ability to group tasks more intelligently: see Mike's auto-group scheduler patch
This feature uses cgroups for block IO requests as well.
You might want to give it a try, it might improve your large-copy workload latencies significantly. Please mail bug (or success) reports to Mike, Peter or me.
You need to apply the above patch on top of Linus's very latest tree, or on top of the scheduler development tree (which includes Linus's latest), which can be found in the -tip tree
(Continuing this discussion over email is probably more efficient.)
Thanks,
Ingo
If it works there's no real need to tune it.
Con Kolivas released a patch set against the 2.6.36 kernel just a few days ago. Check lkml.org.
C|N>K
Sorry, I can only comment on server solutions that we wrote. I'm sure that some flaky desktop programs have problems.
i have never seen such a problem... i play music off of a mounted drive, read/write to another mounted partition all the time... move files from here to there... absolutely no noticeable slowdowns coz of that...
but hands down, windows 7 beats the pants off any linux distro (or even mac for that matter)... love it completely! very professional work...
Sometimes I see my system get bogged down doing copies and even lock up for a few seconds. And when I see this I always become very nervous because usually it means I have a hard disk failure of some sort (sometimes it can be just a bad connector, but still a hard failure).
I just copied about 5GB from my 5 disk raid5 to my main partition WHILE I copied about the same amount of data back TO that same raid WHILE watching a video from that raid and didn't see much of an issue. I saw one little "stutter" for about 100 ms and that was it.
I am using LVM. And the braindead way ubuntu configures LVM to put swap in the same LVM partition as your home and system directories DOES cause all sorts of nastiness. This used to drive me nuts before I fixed it simply by disabling swap. I have 8GB of RAM, why do I need swap?
Edit: this "new and improved" page formatting SUCKS. Now the buttons that were right there are hidden behind DUMBASS popup menus. Fucking "engineers" thinking they know how to improve shit. How do I turn this bullshit off?
it doesn't do much for I/O. ionice, a much newer program does something similar to what 'nice' does for CPU for I/O intensive tasks. It's pretty good, not as good as nice is for cpu-bound tasks, but eh.
It seems to me you don't really know what you're talking about either. Linux is the kernel. It has everything to do with Linux knowing where its priorities should be, which for impatient people like you, had better be making sure everything always seems fast. You might also like to set CONFIG_HZ to 20000, everything will feel much more responsive after that. But thank you for informing me about a load of studies which tell me what I want.
You could get a better bittorrent client, or alternatively, have everyone else fix everything else for you. Are you using Transmission to share Linux ISOs over gigabit ethernet or something equally believable?
Your base Ubuntu distro is about as much a server OS as is WinXP Tablet Edition. Get over it. If you're running a non-server version of Linux and using whatever box you have like a server you're either kidding yourself just to look cool or you're really not using it in a real production environment. Can you now stop being such a little fanboi and start to recognize the difference between server and desktop OSs instead of coming off like a dumb ass who thinks that one OS can really do it all?
I would definitely ditch an OS that fucked up a file copy because I used the computer for something else while I was waiting.
In Arch Linux I installed the 2.6.35 kernel with BFS enabled, and I found that it gave too much priority to user input. When playing a standard definition xvid file, it would literally pause for a second while opening Firefox. The default scheduler might have problems, but it will keep playing the same type of file just fine, while also opening something like a web browser.
Here's my experience with this issue:
I develop a camera surveillance system. So, I see machines with constant I/O. 25fps (PAL) on several cameras. So, an average system configured in full-recording will be saving 200fps to disk. All the cameras are shown on the screen through SDL on X11. When WD rolled out its new 4k sector sized disks, I had to figure out how to make them work properly on GNU/Linux, since they came with a special format utility for XP, but no docs on what was required on GNU/Linux. They report to the machine a 512b sector size, but I know they are 4k. So, I managed to align partitions with cylinders properly, and the disk performance spiked. If you didn't do this, partitions and cylinders would end up unaligned, and disk performance would suck. The whole machine would slow down badly, even the SDL windows showing video. That video comes straight from V4L, and disk-output happens later in another thread (i.e., not the same one doing the SDL visuals). SDL windows would only show 4-5 fps for several seconds (after every disk flush), then go back to normal.
There are serious I/O issues. Actually, the IO model is ok for servers, but for anything realtime, it fucking sucks.
WTF am I doing replying to an AC at 5 A.M on a Friday night?
If by 'sufficient RAM' you mean 'enough memory to allocate every single running process their 4GB of addressable memory (assuming 32bit arch)', then yes, you can turn off swapping. Otherwise, don't be surprised when the kernel starts to randomly kill processes when it runs out of memory.
There is nothing interesting going on at my blog
By 'sufficient RAM', I mean 16GB and above depending on your requirements (we routinely run both Linux and Windows with 32GB). I am talking about servers here because that's what we run and we have observed the problem in TFA.
I see your mistake. Linux is not a "desktop OS", it is an advanced OS for production systems or people who like to tinker. Pretending it is otherwise will only cause grief in the long term.
what you need is http://code.google.com/p/pagecache-mangagement/ This tool allows the user to limit the amount of pagecache used by applications under Linux. This is similar to nice, ionice etc. in that it usually doesn't make an application go faster, but does reduce the impact of the application on other applications performance. This is especially useful for applications that walk sequentially through data sets larger than memory, as discarding their pagecache does not reduce their performance (although this tool does add overhead of about 2%). See http://lwn.net/Articles/224653/ Although it is little more than a proof-of-concept it seems to be fairly useful. When running pagecache-management.sh dd if=100-mb-file of=foo or pagecache-management.sh cp -a /usr/src/linux-2.6.20 /usr/src/foo
That's great that you post your experiences with server scheduling in a topic about desktop scheduling. It's so relevant. No wait, it's not.
-- Linux user #369862
Cooooonnnnnn!
My point was only that a system which is totally bogged down by disk IO is somehow not exactly optimum and that a lot of the less mainstream distros don't exactly seem to know a heck of a lot about tuning or what patches to use, which CAN be an issue. You want a desktop specific distribution from a reputable mainstream source.
People expect that some guy in his garage somewhere knows how to put together a well built OS, but that is largely a vain hope. The Linux kernel is a wonderful piece of software, but that doesn't mean you can't horribly misuse it and get bad results.
No doubt many older releases work fine too. I can't recall having this sort of problem in any version of Mandriva for instance, at least not in years. I don't think my particular system is special in that respect, I just use a reasonably high quality distro and things work!
Of course the OP could be dealing with some horrible bad specific device driver or hardware too. That can make a significant difference. I think the flaw in his thinking is generalizing this as a Linux desktop issue and not a hardware or OS specific issue that HE has.
"Malo periculosam, libertatem quam quietam servitutem." -- Jefferson
If you just change your I/O scheduler to anticipatory this should go away. I think the simplest in Ubuntu is to add "elevator=anticipatory" to your kernel command line arguments. This is done differently in GRUB and GRUB2, so fgi.
Error 404 - Sig Not Found
A process only gets 2 GB of addressable memory on Windows; the other 2 GB is allocated to the kernel. (Sure, you can enable a switch to make this 3/1 GB instead, but that's fairly uncommon.) You only need to have enough RAM to cover memory that's actually used, though, and things like memory-mapped files and zero pages still work as normal.
It's not necessarily a good idea, but you can get away with running quite a few processes even with no pagefiles.
It's extremely relevant since Linux has far more importance as a server platform than a desktop one and the OS used in either is essentially the same.
The problem is this: Let's say you do an action that reads 5 blocks on the disk. While the system is idle it has nothing else to do so your 5 blocks are read immediately, super fast.
While the system is doing some other I/O intensive job, it might be doing 500 block reads at the same time. Everything goes in the same queue, so your task is only 1% of the requests that have to be done in a set time. Result: Your task takes 100 times longer.
This is the problem that all the schedulers are trying to solve: trying to be fair so that every task gets a reasonable share of priority, while keeping performance at an optimum level.
For example, some O/S researchers have tried to implement a multi-tiered system where every I/O is tagged with flags that indicate if the call came from an interactive user action, or was generated by non-interactive jobs (daemons, lower-level layers, etc...) and then give higher priority to the user requests. Two problems with that approach are that it can be very hard to differentiate the two, and that any heavy user task may prevent system tasks from working in a timely fashion while the user tasks may depend on the system tasks to complete their jobs in order to proceed; vicious circles and race conditions.
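The arithmetic above can be checked with a toy model in Python. This is a deliberately simplified sketch, not how CFQ or any real scheduler is implemented: every request is assumed to cost one time unit, and "fair" here just means taking turns between one queue per process.

```python
from collections import deque


def fifo_finish_time(background, interactive):
    """Time until the interactive task's last request completes when it is
    queued behind all background requests (1 time unit per request)."""
    return background + interactive


def round_robin_finish_time(background, interactive):
    """Same workload, but the disk alternates between one queue per process."""
    queues = {"bg": deque(range(background)), "ui": deque(range(interactive))}
    clock = 0
    while queues["ui"]:
        for name in ("bg", "ui"):
            if queues[name]:
                queues[name].popleft()
                clock += 1
                if name == "ui" and not queues["ui"]:
                    return clock
    return clock


if __name__ == "__main__":
    print("idle disk:  ", fifo_finish_time(0, 5))
    print("shared FIFO:", fifo_finish_time(500, 5))
    print("round robin:", round_robin_finish_time(500, 5))
```

With 500 background requests ahead of it, the 5-request task finishes after 505 units in the shared queue instead of 5, which is the roughly 100-fold slowdown described above; taking turns brings it back to 10 units at the cost of seeking back and forth between the two workloads.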
I'm glad I'm not trying to code a kernel scheduler, they're very hard problems and figuring one out that can be fair for all types of uses is nigh impossible.
The great thing about the open source O/Ss is that everything's done in the open, there are intense discussions going on in the field, and there are multiple solutions being worked on and tested.
To me, Linux has always felt like it gave much higher priority to I/O than the "user experience". It's something I've come to expect. If I copy gigabytes from a disk set to another I gladly accept that my web browser's going to be sluggish for a time, all the while feeling content that at least it's going to be done so efficiently that it's going to last for the shortest amount of time possible.
Other O/Ss that I won't name may "feel" better, but have nowhere near the same I/O throughput that Linux has.
Funny, I do that constantly and have never seen any corruption even possibly attributable to it on any system besides Windows.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
I'll chime in too about not having this problem. I've used Linux since late 2004. I've used Fedora 3, 4, 6 and 12, Ubuntu (since Edgy) and Gentoo (my main Linux distro), with Ubuntu in VMs sometimes -- all these across six or seven machines. I've never had the desktop grind to a halt with heavy I/O. In fact, I was relieved that unlike Windows, the harddrive can be grinding away and I can still actually use my computer and start programs in a reasonable amount of time. So when I first started hearing about this issue recently, I was quite surprised and thought maybe it was just a localized config problem or something specific to certain people's hardware. Now that everyone on Slashdot seems to be having this problem, I can't help but wonder how I managed to be so lucky all these years. WTF is going on?
Obligatory
http://xkcd.com/612/
The Cloud - because you don't care if your apps and data are up in the air.
You are joking, right? (OP) I'm using Debian, and I routinely copy TB(s) of data from hard drive to hard drive via SATA and/or USB 2.0, and though the USB transfer speed is fairly slow, my system doesn't slow appreciably at all.
Weird, I've been using Linux for 10 years now, and one thing Linux does really well is move large amounts of data around without killing the system (usability).
Lotsa RAM is your friend, also make sure your / filesystem isn't the hdd that you're moving the data from/to or vice-versa. That does slow access down a bit.
jaz
Life is what happens to you while you are busy making other plans. No-one sees motorcycles
My experience is that when IO affects "feel", it's mostly because of increased competition for memory. Someone else mentioned tuning the vm.dirty_* sysctls, which certainly helps, but what surprises me is that we don't use O_DIRECT and splice/vmsplice more. "cp" is still a loop of 32k read/writes, with only the obvious O_ modes on either descriptor (and needless to say, no fadvise either.)
In short, the kernel offers fine mechanisms to resolve these problems - it's user-space that isn't taking advantage of them.
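As an illustration of the fadvise part only (not O_DIRECT or splice), here is a Python sketch of a cp-style loop that tells the kernel it is reading sequentially and that already-copied ranges need not stay in the page cache. It relies on os.posix_fadvise, which Python exposes on Linux; the 32k chunk mirrors the loop described above, while the 8 MB window is an arbitrary illustration value.

```python
import os


def copy_with_fadvise(src, dst, chunk=32 * 1024, window=8 << 20):
    """Read/write copy loop that hints the kernel about page cache use."""
    have_fadvise = hasattr(os, "posix_fadvise")
    copied = 0
    dropped = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        if have_fadvise:
            # Length 0 means "to end of file": expect one sequential pass.
            os.posix_fadvise(fin.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            block = fin.read(chunk)
            if not block:
                break
            fout.write(block)
            copied += len(block)
            if have_fadvise and copied - dropped >= window:
                fout.flush()
                os.fsync(fout.fileno())  # only clean pages can be dropped
                for fd in (fin.fileno(), fout.fileno()):
                    os.posix_fadvise(fd, dropped, copied - dropped,
                                     os.POSIX_FADV_DONTNEED)
                dropped = copied
    return copied
```

The hints are advisory, so the kernel is free to ignore them; the intent is that a multi-gigabyte copy stops evicting everything else from the cache.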
Wrong.
2G private address space, 1G public address space, and 1G kernel address space.
Lose a few bits? Are you being serious? Despite what you may believe, during the file copy which you walked away from, several hundred processes continue to run in the background and do what they're supposed to, doing things way more complex than your file copy. You think X stops doing what it's doing just so the file can be copied? Or the kernel drops everything it's doing except process the file copy? The very simple movement of a mouse fires hundreds of IRQs each second which are serviced by a service routine. Watching a YouTube video is no different.
Is that a roll of dimes in your pocket or are you happy to see me?
You're looking at a bug that was fixed some time ago but those patches didn't make it into a stable kernel yet. It should be fixed for everyone in 2.6.37 (about 3 months from now as 2.6.36 came out 3 days ago). In the meantime, you can grab those patches and compile your own kernel.
when the kernel accesses the slow disk, it is aggressive in trying to cache the read. if there's free memory this is obviously the correct thing to do, since if the memory is needed the cache can be dropped. but if memory is full, the kernel needs to decide whether to drop some file cache, or swap out a process. the default settings tend to favor disk cache, meaning every time you try to access anything on the desktop, the application has been swapped out and it has to wait for disk access to swap back in (often several seconds on my machine)
setting /proc/sys/vm/swappiness to a low value, eg 0, tells the kernel to favor processes at the expense of caching disk reads. this helps a lot in keeping the desktop snappy. kernel trap has a good summary of the issue and the developers' motivations
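for anyone who wants to check what their box is currently set to before changing anything, here is a minimal Python sketch. the only thing taken from the comment above is the /proc/sys/vm/swappiness path; the function name is made up for illustration, and it only reads the value (writing it needs root).

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an integer.

    The path argument is only there so the function can be pointed
    at a different file (for example in a test).
    """
    return int(Path(path).read_text().strip())
```

calling read_swappiness() with no arguments on a Linux machine returns whatever the kernel is currently using; a low number means the kernel prefers dropping file cache over swapping out processes.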
swappiness doesn't help with applications that want to access a file repeatedly, but rely on the disk cache instead of an internal cache. eg, an IDE might have 10 source files in tabs, but instead of keeping the files in memory, it could just reread them each time a tab is switched. as long as the file remains in cache, this works fine. but when you copy a huge file, the source file gets dropped from cache, and the tab takes forever to refresh
not sure if there's an easy way for the kernel to know the difference between an application just copying a file, and actually reading it. but if there is, it would make sense to favor reads
My blog
Oh, I understand how "awesome" lunix is. That's why I use FreeBSD. But don't stop believing that anyone who thinks linux is shit on technical merits must be a stupid Windows or Mac fanboy!
nah, most processes don't try to allocate the maximums, nor does Linux swap out the whole process (true swapping is very, very rare) but just pages.
Most of the time my swapfile never gets used on my 4GB desktop. and yes I made my swap partition 4GB just because I never know what I might be doing in the future. but it's been a waste of space thus far, even running web server, database and middleware for projects with a nice 768MB RAM virtual windows machine in vmware on the side.
i just run it and let it own the computer for whatever time it takes = anywhere from 10 to 30 minutes, and just walk off, maybe go get a fresh cup of coffee or cold beer depending on where i am and what time of day it is. one thing i dont want is a borked copy because i was too impatient to let it do its job.
I see you're a long-time Windows 95 user.
- Tell me about multitasking, daddy!
- Hold on, boy, the floppy didn't finish formatting yet.
Cite? What exactly is the difference between "public" and "kernel"?
If all processes see the same 1G the distinction isn't meaningful, especially in this context.
I've encountered situations where I'm trying to do something online and a task starts up due to a cron job that builds some kind of index. The index building should be in the background but somehow takes priority over what I'm doing on the desktop. Those kinds of cron jobs should be default scheduled in the background, not take priority over what is happening on the desktop.
I ditched Windows for this reason, among many others.
I'd go mad.
Mad i say. Mad!!
Better to die, than live under slavery: "Yes, Massa Balmer, no suh, please don' throw that chair suh."
--Head like a hole, black as your soul, i'd rather die than give you control-- NIN
soylentnews.org Go there to enjoy the people!
You're insane. If your computer ever silently drops "a few bits" while copying stuff, there's something seriously wrong with your OS or your hardware, and things will break whether or not you're using the computer while copying. You might as well sacrifice a chicken to make sure the data transfer works, it'll have about the same effect.
Switch back to Slashdot's D1 system.
i bet you're not really scottish either.
Eh. I've had Transmission's UI freeze when it's doing heavy IO-work like verifying local data, but it has never affected the whole system. They seem to have fixed that issue, too, since I just did a verify and the UI remained totally responsive.
Switch back to Slashdot's D1 system.
That's great that you post your experiences with server scheduling in a topic about desktop scheduling. It's so relevant. No wait, it's not.
The boundary between the desktop space and the server space is rather fluid, and many of the problems visible on servers are also visible on desktops - and vice versa.
For example 'copying a large amount of data' on a server is similar to 'copying a big ISO on the desktop'. If the kernel sucks doing one then it will likely suck when doing the other as well.
So both cases should be handled by the kernel in an excellent fashion - with an optimization/tuning focus on desktop workloads, because they are almost always the more diverse ones, and hence are generally the technically more challenging cases as well.
Thanks,
Ingo
The only way NTFS would "eat" a file is if you were writing a new file, and the system crashed before it was completed. In that case, to make the FS consistent, the file will not be there as having it there would be inconsistent. However, in the case of updating an existing file, that isn't what happens. If the system crashes during updating a file, the file will be rolled back to the state before the write.
As you say, this is how a journaled system works. Only once a write is complete, once things are consistent, is it applied in a permanent fashion. If a crash happens and the disk would be in an inconsistent state, the journal is used to roll things back so that everything is consistent.
My guess is he's confusing ext4 with NTFS in a case of wishful anti-MS thinking. Ext4 had a case of the "nom noms" with regards to files because it could delay writes for so long. Because of the way some programs choose to update things like bookmarks and config files, they could vanish in a crash. This has been fixed, of course, but it was a problem initially; you can search Slashdot for it. That is perhaps what he was thinking of.
A modern OS should be able to deal with being asked to do more than one thing with its disk. There will, of course, be a slowdown as disks are not good at random access, but it should not bring the system to a halt. At work all the time I copy large data files around. 10-100GB videos and VMs. I copy them between local drives, and to servers on the net and so on. System works fine when this is going on. Webbrowsing is fast and responsive, e-mail has no problems, everything works as normal. Only time you notice a slowdown is if you try and do something else disk intensive. Copy a VM on a drive and then boot another VM from that drive and both the boot and the copy slow down as the drive jumps back and forth. However it still works just fine, and the system is still responsive.
This is not too much to ask, this is how it should work.
But it fucked it up with highest possible bandwidth, and in a way that's O(1) scalable up to 1024 processors in a NUMA cluster. Don't you care about server fuckup performance?
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
I've been seeing something similar for some time too but thought it might be an isolated case. Now that I've searched, I notice there is a report, "Slashdot comment paste issues in Chrome", which describes what I see (very slow pasting that doesn't necessarily succeed).
If you are using the CFQ I/O scheduler on Linux a process' nice value also impacts its default I/O priority. From the ionice man page:
For kernels after 2.6.26 with CFQ io scheduler a process that has not asked for an io priority inherits CPU scheduling class. The io priority is derived from the cpu nice level of the process
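As a rough illustration of that derivation: the same man page gives the mapping as io_priority = (cpu_nice + 20) / 5 with integer division, inside the best-effort class. The Python below is just that arithmetic written out; the function name is hypothetical and nothing here talks to the kernel.

```python
def derived_io_priority(cpu_nice):
    """Best-effort I/O priority CFQ derives from a CPU nice level.

    Follows the formula from the ionice man page:
    io_priority = (cpu_nice + 20) / 5, using integer division.
    Nice levels run from -20 (most favored) to 19 (least favored).
    """
    if not -20 <= cpu_nice <= 19:
        raise ValueError("nice level must be between -20 and 19")
    return (cpu_nice + 20) // 5
```

So a process at the default nice 0 lands on best-effort priority 4, and one reniced to 19 lands on 7, the lowest best-effort level. That is still not the idle class, which has to be requested explicitly.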
Generally Windows runs badly without a swap. Don't listen to people who tell you to disable it. You should have a swap file on Windows no matter how much memory you have.
Tweakers who don't really understand anything about Windows paging often conclude turning off the swap is a good idea, because they only run trivial applications and don't experience certain memory backed I/O operations failing with it off. They do see an initial speed boost though. The reason is NT is very pessimistic about memory. Windows assumes you will need to page out to disk. It therefore flushes the set of static pages to disk almost right away. This is why there is so much more disk thrashing on Windows than, say, Linux when you start an application and plenty of memory is free. It will do its best to keep the working set out of the page file of course. This does give Windows a performance advantage under memory pressure however. When there is not enough memory to start a new application, Windows can just drop the pages from memory of the application being paged out without the need to flush them to disk because they are already there; Linux will need to write those pages.
Given that Windows boxes (desktops anyway) tend to have large numbers of processes running in the background, they usually are under that memory pressure.
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
My poor rhetorics regardless, the OP is wrong - there is neither a need for Linux to know that a file copy need be done in background (no need does not mean a user can't hint the system about his preference), nor should the users be dependent on 'ionice' to have responsive systems - Windows gets it right ENOUGH without user intervention.
Same goes for CONFIG_HZ - I think we all know deep down inside that nobody is going to bother resetting variables to fix the symptoms of a problem lying somewhere else entirely. Also, Transmission was just an example - I am sure its program is fairly straightforward, certainly straightforward enough to not make it a culprit - a 'dd if=/dev/sda of=/dev/sdb' will starve the system in much the same manner.
I did not make up the studies, in case you are ironic about them telling you what you want:
http://portal.acm.org/citation.cfm?id=339420
http://www.sapdesignguild.org/community/design/perc_perf.asp
and of course the closely related term http://en.wikipedia.org/wiki/Perceived_performance
I'll admit it's not like googling Angelina Jolie but the information is out there
I would definitely not let a monkey like you get near my computers if some intense file copy was going on and they wanted to start doing other things while that was going on, sure you can do it but that does not make it a prudent thing to do, and the file may copy over just fine, and it may lose a few bits without even reporting any errors and that can happen on any OS, BSD, Linux, Winders & etc...etc...etc...
You sir, are a perfect specimen of a BOFH. You only have a dim notion of what actually goes on inside those mysterious boxes that are unfortunately left under your care. And yet, by some curious accident of nature, you've been entrusted with root passwords for said boxes. You use phrases like "intense file copy" like they mean anything. You place every idiotic restriction that you can think of on the users of said boxes (who, incidentally, are almost always smarter and more qualified than you in whatever field of work they're in) by using words like "prudent" and "safety"... or god forbid... "security". You actually think that because I run a second program along with your "intense" copy, it can result in loss of "a few bits without even reporting any errors" due to what ? The magical fairies that dance inside those little chips getting angry ? Tired ? Can you do everybody a favor and reduce the amount of utter nonsense emanating out of that tiny, befuddled brain ?
No intention from my side to badmouth the application. I like it a lot because of its simplicity and function. But I haven't fired it up for months exactly because I am still afraid it will eat my CPU as the torrents are churning :-) You may be exactly right about the issue having been fixed, what do I know. My version is 1.92 (10621) that was bundled with Ubuntu 10.04. Mind you, it's not Transmission's own GUI that suffered, it was any other process that I wanted to work with through the GUI.
I meant version 1.93, not 1.92.
Assuming your SSD is detected correctly, the Linux block layer maintainer is proposing changes to improve SSD performance. The idea of waiting for requests (so as to be able to reorder them in a ladder fashion) is not used on SSD devices since 2.6.28 though.
And of course (poor rhetorics again / trigger happy on the keyboard) the correct description of the behavior would not be Transmission eating my CPU. What I was observing however is that with enough torrents/seeds to apparently saturate the disk bandwidth (no big deal if you got a 10mbps Internet line both ways like I do) the rest of the system GUI really stopped being responsive enough. Having 'ionice'd Transmission all the way to the background would make a considerable improvement to the point where things would be just acceptable. But I would always know when I had left it running in the background :-)
I would guess the Slashdot article about painful fsync behaviour on ext3 was "Kernel Hackers On Ext3/4 After 2.6.29 Release".
(And wow - a developer who still reads and posts to Slashdot! I've got to ask, which tech news site did you all migrate to in the end?)
That's certainly not what any of the Windows internals books say. Do you have a reference for that?
I tried using a USB hard drive to back up about 30 gigs of data from my CentOS 4.8 server. The IO wait on the system shot up to 80% and I had to kill the process since it brought the machine to a crawl and the other processes were taking forever to complete. Something as simple as copying a file to a USB drive should not cause the system to slow to a crawl and become no longer functional.
AFAIK there are only two I/O schedulers remaining in recent Linux (and if you squint you might say that RHEL 5's kernel could have been related to 2.6.34 at one point right? :) - CFQ and deadline (three if you count noop I guess). The anticipatory scheduler was removed in 2.6.33...
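To see which of these a given disk is actually using, the kernel exposes /sys/block/<device>/queue/scheduler, which lists the available schedulers on one line with the active one in square brackets (for example "noop deadline [cfq]"). A small Python sketch of parsing that line follows; the function name is made up for illustration and it works on the text only, so it makes no assumption about which device you have.

```python
def parse_scheduler_line(text):
    """Split a line like 'noop deadline [cfq]' into (active, available).

    The active scheduler is the entry wrapped in square brackets;
    active is None if no entry is bracketed.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available
```

Feeding it the contents of /sys/block/sda/queue/scheduler (sda being just an example device) tells you whether you are on CFQ, deadline or noop before you start tuning anything.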
Somewhere, a guy just won a bet, that he could get slashdot to print the word 'brainfuck' on its front page. And not, mind you, on April Fool's Day.
His elaborate scheme has paid off, and it might even slip under the radar, and remain undiscovered. I'm watching you. /* uses 2 finger gesture for 'watching you'
WARNING: Smartphones have side effects--most of them undocumented.
Is that megabyte or megabit/s? Shouldn't matter, though, since even 10 megabyte/s should not saturate the disk. I guess doing random read/writes that way could do it, but that shouldn't be an issue with a torrent app. When I was verifying that torrent for the previous post, iotop reported transmission reading in data at 60 to 70 mbyte/s. The 1.3 mbyte/s down at which my connection usually tops out has no effect on the system.
If you need to use the app again, I'd recommend adding the transmission PPA repository and getting the most recent version (currently 2.11; Ubuntu 10.10 ships with 2.04): sudo add-apt-repository ppa:transmissionbt/ppa && sudo apt-get update && sudo apt-get upgrade
Switch back to Slashdot's D1 system.
You've got it. Why are you fsync()ing so often for a userland app with trivial data? There are so many better ways to do this.
The right one is... why am I still bothering with crappy desktop Linux in the first place?
A point to consider is that after about 1995 everybody realised they were better off on a server OS on the desktop anyway.
splice/vmsplice only work if the source or target is a pipe. Completely useless for file-to-file copy operations.
I wanted to write a lengthy rebuttal here explaining how computers work, but my computer is busy with a seriously *intense* copy right now so I don't want to chance it.
Did you even read the summary? He specifically points out where desktop I/O has different requirements from server I/O: "When I'm copying a large amount of data in the background, everything else slows down to a crawl while the CPU utilization stays at 1-2%." So I think he's talking about things like video playback, web browsing, and general UI responsiveness--things that 100% do not matter on a server.
I've noticed this myself--start a complex task and all of a sudden the UI becomes really jerky. If I'm trying to multitask and some mundane task is making the whole UI slow, that's bad. If it takes me 10 seconds to do something with an unresponsive UI instead of 5 just so a bunch of files can copy in 1:00:00 instead of 1:00:01, that's bad.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
"When I'm copying a large amount of data in the background, everything else slows down to a crawl while the CPU utilization stays at 1-2%"
maybe the network card is crap or the wiring is faulty? maybe the HDD is near death?
No, massive unfairness is just as bad on the server as it is on the desktop - in all but a few select batch processing situations.
Replace 'desktop' with 'database', 'Apache', 'Samba' or 'number crunching job' and you get the same kind of badness.
There's not much difference really. If it sucks on the desktop then it sucks on the server too: why would it be a good server if it slows down a DB/Apache/Samba/number-crunching-job while prioritizing some large copy operation?
Nothing specific I can recall but I do remember poking around a bit with SoftICE.
"have you tried ionice?" - by larry bagina (561269) on Saturday October 23, @02:43PM (#33997928)
http://sourceforge.net/projects/ultradefrag/forums/forum/709672/topic/3690136
Find the analog to that in Linux' API's (thread and process level priority control code), & integrate it into your own open sourced code for projects you think needs it (which I imagine MOST of you around here don't operate fluently at sourcecode levels, some do I wager, but most not & even if you do, it takes time to study the flow of any project first).
Still, per your idea? Well - That's how I'd go about it I suppose at this level, and you open source crowd do have that much going for you, which could work out nicely at times.
APK
P.S.=> See, I figure it this way, especially to those of you that code: You've got threads now in Linux (since what, kernel 2.2 or so?? All I know was that around 1998/1999, the Linux kernel wasn't preemptible or re-entrant, & that meant no threads @ the kernel level (usermode "round-robin to kernel" cooperative threads do NOT count, ala Windows 3.x)), vs. forks only, thank goodness...
So, that all said & aside? Well, for "the industrious" & skilled amongst you, yes, this IS doable, because if I can do it in Windows, along with others now as shown above in the UltraDefrag64 project (& even show others also as I have for a 64 bit defragger in Windows) you can in Linux as well is my guess, even across languages (Object Pascal to C/C++ &/or Win32/64 API calls) and you additionally have your "Fresh Meat" sites etc. & Open code to work with... apk
Well, I don't even recall something like that in NT branches of windows.
If I need to move around huge amounts of data, I use a commandline. Be it Windows or Linux, it makes it easier on the system.
Tomorrow is another day...
My experience for Ubuntu on desktop was EXACTLY vice-versa, any I/O activity and even sounds started to jitter. I tested different schedulers, read ahead values etc. and that remained, nothing helped. I/O performance was crappy at best for desktop usage.
Pulsed Media Seedboxes
0-2G is local memory
2-3G is global memory, for stuff like GDI and USER and KERNEL. Every process sees the same window between 2 and 3G
3-4G is kernel memory.
Really, I'm right with you. I've been using both sides for a long time. I've transferred huge amounts of data around. The only time the system slows down is if I start maxing out the IO. If he had shitty drivers (ancient distro/kernel, modern hardware), sure the performance is going to suck. I had a whole flock of servers, where when we booted to the install disk (using ISOLinux), it took seemingly forever to do a big file transfer. It was a 5 minute job that would take about 30 minutes. Once we booted into a real running environment, we could (and did) do similar transfers on the same hardware in the expected time (about 5 minutes).
Running terabyte transfers in the background shouldn't slow down the rest of the OS, unless it is reaching the IO capacity of the OS drive. Sure, transferring from sda1 to sda2 (booted to sda1) would be slow. Transferring from sdb1 to sdc1 should be fine.
The same applies to other OS's too, except sometimes there is extra overhead, where the OS does get slow. Those were non-*nix OS's though.
I have no problems with speed on USB 2.0 ports under Linux (current Slackware64). On the same hardware, I switch between Win7 and Slack64, and it's always faster under Slack64. Sometimes it's necessary to go in with Slack64 just to fix problems induced in Windows. The last one was a Cygwin install. There were things added in the directory, and I was housecleaning. Windows couldn't remove some files because of the filenames, and Cygwin couldn't even remove them. A quick boot into Linux, and then a 'rm -rf' did the trick.
Serious? Seriousness is well above my pay grade.
Your post makes no sense, what has lunix being awesome got to do with linux? They're totally different operating systems.
So you're saying that disabling the page file is a bad idea if you use most of the memory you have? AFAIK that's the most obvious trade off with disabling the paging file. Hence why everyone says "if you have enough memory".
OTOH, if your computer has 4 GB of RAM and you rarely, if ever, use over 800 MB of physical memory, then there is never a reason to page anything to disk. Windows will still preemptively page stuff there, in some idiotic fear that those 3.2 GB are going to get used sometime soon, so there's extra disk activity. While generally innocuous, that extra activity has an annoying habit of cutting into battery life (let's periodically spin up the HDD for no useful reason!), or happening when you're already disk IO saturated.
It all comes down to your usage, hence why it's a "tweak" and not the default. I'm sure people make the mistake of not examining their needs (especially if they fail to account for the disk cache), but, yeah, that's what happens if you do cargo cult tweaks. At the other extreme, there still seem to be a lot of people that think "free memory" makes your computer (or phone) run faster. So they install extra RAM, memory defraggers, decrease disk cache size, and increase swap size and swappiness so they live at 2-3 GB of free memory rather than 1 GB. Free memory is wasted memory, and until RAM is filled up there's no logical reason to use the (very slow) disk for short term storage.
What do you use? Genuinely curious.
I'm using ext3 on a 1TB WD Green. I used to experience huge fsck times in Karmic (hours), but ext3 in Lucid seems fine except that it takes 45 seconds to create a directory if I haven't created one in the last 5 min.
Are the "exotic" filesystems good for normal use and low fsck times? Is Reiser dead? Has btrfs reached a fork in the road?
I'm not a lawyer, but I play one on the Internet. Blog
Just to mention a very bad app, I'll say Evolution. If there is any heavy disk I/O it becomes ultra sluggish. And you're asking for a coffee break especially when you're changing from one mail folder to another or deleting messages. I was practically forced to use ionice with any disk activity to maintain acceptable performance with Evolution. (2.6.32-25-generic #45-Ubuntu SMP x86_64 GNU/Linux / GNOME evolution 2.28.3)
http://www.redhat.com/magazine/008jun05/features/schedulers/
I'd like to buy homeland for our 10 million people. http://twitter.com/mahadiga
"When I'm copying a large amount of data in the background, everything else slows down to a crawl while the CPU utilization stays at 1-2%."
I think what you describe is due to kernel bug 12309 and it looks to be fixed in 2.6.36. See https://bugzilla.kernel.org/show_bug.cgi?id=12309 and git commit http://git.kernel.org/linus/e31f3698cd3499e676f6b0ea12e3528f569c4fa3
Is this also an issue under FreeBSD? I'm thinking of switching to one of the BSD desktop OSes.
Did you even run a search on the person you responded to? If you know just a bit about Linux kernel development you should recognize his name as one of the people that is a Linux kernel veteran that works for RedHat and is involved in kernel schedulers. There's a rather large chance your Linux systems are running on the scheduler he helped develop.
Parent is right. You are wrong.
... i switched to FreeBSD, years ago.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
I'm 48, but you sound to me like an old dude. Why would I NOT buy 8GB of ram? I use a big chunk of it for my thumbnail directory and other temp data that would otherwise go to a real hard disk, and it's NOT expensive. Well, maybe it is NOW but when I bought it it was cheap. Funny to think I could actually have MADE money by investing in RAM back then, as the 4GB "kits" I use are now about twice what they were when I bought them.
Anyway, who uses hibernate? Did you not see the comment about my five disk raid? This isn't a notebook. Who uses hibernate when they have torrents, freenet, and other p2p channels to make use of? You never get a good priority when you're constantly going on and off line.
I don't care if you can do oodles of HTML, if you can't paste the bloody link into the bloody white box, then all that HTML isn't worth diddly squat.
One of the 2.6.36 patches explicitly mentions addressing poor responsiveness when doing IO on slow (e.g. USB) devices. The CentOS 4 kernel seems to be a heavily patched 2.6.9 though...
I don't know how the default conditions apply, but with CFQ, you should learn how to use 'ionice'. When an I/O bound process is assigned the "idle" class, it goes completely idle whenever any other process wants I/O. So complaining about I/O processes swamping userland shows a configuration problem, not a problem with the scheduler.
What more do you want? This is the whole reason different I/O schedulers were added.
If you don't want the I/O to go into cache, then make sure your I/O heavy processes use posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
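A minimal sketch of what that could look like for a plain file copy, in Python since os.posix_fadvise wraps the same call. The function name and the 1 MiB chunk size are arbitrary choices for this example, not anything cp or rsync actually does. One assumption worth spelling out: the destination is fsync()ed first, because dirty pages that have not been written back yet cannot be dropped by the hint.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary size for this sketch


def copy_without_caching(src_path, dst_path):
    """Copy src to dst, then hint that the cached pages are not needed.

    Returns the number of bytes copied. The fadvise call is only a
    hint, so failures are ignored rather than treated as copy errors.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(CHUNK)
            if not block:
                break
            dst.write(block)
            copied += len(block)
        dst.flush()
        os.fsync(dst.fileno())  # write back dirty pages so they can be dropped
        if hasattr(os, "posix_fadvise"):
            for f in (src, dst):
                try:
                    # offset 0 with length 0 covers the whole file
                    os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
                except OSError:
                    pass  # advisory only; some filesystems reject it
    return copied
```

The hint does not slow the copy down or change its priority; it only asks the kernel not to keep the copied data around at the expense of whatever was cached before.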
So what's the problem. You have a way to control priority as well as usage. What more do you want?
A refinement, of possible benefit, would be a param to limit the cache-pages/process in memory at any point. That could be another way to address the issue. Hmmm...
I would definitely ditch an OS that fucked up a file copy because I used the computer for something else while I was waiting.
Really? +4 Insightful for what is obviously a trollish post with a poor choice of curse words? Okay, let's run with your hypothesis; consider the following...
Then you have already ditched Windows, and if not, you could not help but note that all new Windows OS development is using Linux development strategies, philosophies, etc.; heck, they even use the term kernel now too.
Not just Windows either, the Mac is built on Unix/Linux as well, they just call it OS X.
Pretty soon you will not have an operating system to use, based on your own definition, that is unless you step up and start helping to develop open source and Linux specifically. That way all the great new stuff will eventually get ported over whatever non-Linux operating system you are using, be it Windows 7 or OS X. Regardless Linux will have it first!
Of course that would mean you could not just complain, but you would have to have a solution as well...now that would NOT be Troll-like...that would be my suggestion to you. Don't complain unless you are going to provide a solution. No solution, then you must be a troll.
Read up on Antony Flew. It might broaden your education somewhat.
That's Adobe's fault, although still a Windows problem. You can create a RAM drive and set swap to that partition as a fix.
Don't know how easy it is to do that in Seven. That's one of the reasons I like XP... it is really easy to abuse (i.e., make things work MY way, instead of Ballmer's way)
Ubuntu is an African word meaning 'I can't configure Debian'
How about switching to Solaris 10? It has no scheduling issues: it is lightning fast, even during very heavy I/O.
The trouble is that in server workloads you generally don't see ONE LARGE I/O operation - lots of small ones instead. There are very very few server workloads that involve transferring >100MB data at a time (even when it comes to DB snapshotting). On the desktop this is common (all your AVI files).
Scheduler has to find a balance between allocating large enough slices of CPU time so that system isn't slowed down to a crawl due to context switching/missed cache hits/spinning between actually idle tasks while keeping them short enough so that somebody using the system doesn't notice the latency when his process has a chance to draw pretty terminal texts/pictures.
It's a tricky balance for a large number of technical and psychological (ooh, a benchmark) reasons.
There's lots of server workloads that involve large IO requests:
- backups
- DB startup/shutdown
- DB traffic that generates or reads a lot of new data (say report generation)
- HPC workloads that work with huge data sets
- animation farms that work with huge images/movies
- web servers streaming out big files
- fsck
- virtual desktop servers where the desktops are virtual instances running on the server. There any IO load within that 'desktop' runs on the server.
etc. As there is a fair number of server workloads that are IO heavy but which use small IO requests.
If you have those big files in networked storage or if you are backing them up to some network host then you've already transformed those kinds of IO requests into big IO requests on the server side as well: the big file you read or write on the desktop the network file/backup server will read/write from its own disks, etc.
Really, "interactivity sucks during big IO" kind of bugs can hurt servers just as much as they can hurt desktops. The boundary between desktops and servers is very fluid.
He's already using a real OS. How else do you think he posted this? The C64 BASIC interpreter?
I am not devoid of humor.
you all trust and have much more confidence in computers than I do, I surely don't trust them and have very little confidence in them
Politics is Treachery, Religion is Brainwashing
Sometimes poor I/O scheduling is because the I/O scheduler can't see the requests at all. /sys/block/sda/queue/nr_requests, it is by default some low number like 128.
Look at
Considering a few I/O requests, and their associated read-aheads, it can quickly fill up.
At that point no further requests are even *seen* by the I/O scheduler until it empties some of the queue.
Did you try setting it to some higher number like:
echo 4096 > /sys/block/sda/queue/nr_requests
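If you'd rather check and set this from a script, here is a minimal Python sketch of the same idea. The function names and the sysfs_root parameter are made up for illustration (the parameter is only there so the code can be pointed at a scratch directory), and sda is just the device from the example above. Writing to the real file needs root.

```python
from pathlib import Path


def nr_requests_path(device="sda", sysfs_root="/sys/block"):
    """Build the path to a block device's nr_requests tunable."""
    return Path(sysfs_root) / device / "queue" / "nr_requests"


def read_nr_requests(device="sda", sysfs_root="/sys/block"):
    """Return the current request queue depth as an integer."""
    return int(nr_requests_path(device, sysfs_root).read_text().strip())


def write_nr_requests(value, device="sda", sysfs_root="/sys/block"):
    """Set a new request queue depth (needs root on a real system)."""
    nr_requests_path(device, sysfs_root).write_text(f"{value}\n")
```

Calling read_nr_requests() before and after write_nr_requests(4096) shows whether the kernel accepted the new value.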
http://linux.die.net/man/1/ionice
Believe me, that'll just make your system less responsive.
I wish I were kidding.
I am not devoid of humor.
I often note that multiple simultaneous low-priority file copies implemented as:
run faster than multiple simultaneous high-priority copies implemented as:
If the copies are run one at a time, the higher priority rsync runs faster. For multiple copies, often the lower priority rsyncs run faster. Also, desktop usability is much improved with the lower priority rsyncs.
I suspect a priority inversion occurs inside the file system's write-back cache. At regular priority levels, data is not written back to disk in a timely manner. The ionice -c 3 gives the disk caches a higher priority than the rsync I/O commands, preventing the I/O commands from filling the cache and creating a priority inversion.
The Gnome GUI in Ubuntu is particularly vulnerable to this priority inversion, as by default it does multiple copies simultaneously inside a separate window. Ubuntu usually performs better than Windows however. Between the A-V software in Windows, and the tendency to swap applications out of memory to maximize disk cache, Windows usually performs the same copy operations more slowly than Ubuntu and with less system responsiveness.
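For anyone who wants to script the ionice -c 3 approach described above, a rough Python sketch follows. The helper names are made up, and the rsync arguments in the usage note are placeholders, since the exact commands from the post are not shown. Class 3 is ionice's "idle" class; whether it has any effect depends on the I/O scheduler in use.

```python
import subprocess


def idle_io_command(command):
    """Prefix a command (list of strings) so it runs in the idle I/O class."""
    return ["ionice", "-c", "3", *command]


def run_at_idle_io_priority(command):
    """Run the command under ionice -c 3 and raise if it exits non-zero."""
    return subprocess.run(idle_io_command(command), check=True)
```

Usage would look something like run_at_idle_io_priority(["rsync", "-a", "src/", "dst/"]), where the paths are hypothetical.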
Didn't even know it was there. Thanks, works perfectly.
I've seen similar issues on certain hardware... I did a lot of tweaking on my older platform (Tyan Tiger MPX with a Trident 4D Wave NX audio card) but never really got audio jitter to go away completely until I simply bit the bullet and upgraded my computer (to some AMD760 chipset with onboard audio).
But if you've got a reasonably modern system, I'd suggest disabling PulseAudio on Ubuntu and running your audio apps on straight ALSA drivers.
And lest I lose my Linux fanboi cred, I might want to add that the Tyan Tiger MPX was a real pain under Windows... while audio worked fine, nVidia 6800 was a mess due to the crappy Windows AGP drivers... mis-tessellated triangles showing up everywhere!
Ah, so the solution is just to not run the most common of all user desktop scenarios (not to mention laptops), the single hard disk PC.
I see what you're getting at, very clever.
You actually think that because I run a second program along with your "intense" copy, it can result in loss of "a few bits without even reporting any errors" due to what? The magical fairies that dance inside those little chips getting angry?
Actually, he's not wrong. If you have broken hardware (whether broken-by-design or just ordinary faulty stuff), there are a lot of failure modes which might not show up under light load but will show up under heavy load. Examples of this kind of thing really do show up in the field. However, the solution to it is not to live in fear, but instead to identify hardware which is solid under any load condition and buy that hardware rather than cheap shitty junk which breaks if you look at it crosseyed.
Here's just one example of how hardware might be able to handle light loads but not heavy: power distribution. As you load a chip more, it consumes more power. The supply voltage is ideally held constant, so more power implies greater current (since P = current * voltage). However, the regulator typically can only regulate the voltage at the chip's connections to the circuit board. Inside the chip, there are "wires" distributing power, and they have resistive losses like any other type of wire. The more current that's flowing, the greater the voltage drop. If there's too much drop, the supply voltage for some circuits inside the chip might dip below the point where they can operate reliably.
So in other words, it's quite possible to design a chip which functions very reliably under light load, but under heavy load some of its circuits don't get sufficient voltage and start malfunctioning.
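To put rough numbers on the voltage drop argument above, here is a back-of-the-envelope Python sketch. Every figure in it is invented for illustration (a 1.2 V supply, 5 milliohms of on-chip distribution resistance, a 1.0 V minimum for reliable operation), and it simplifies by computing the current from the nominal supply voltage.

```python
def on_die_voltage(power_watts, supply_volts, rail_resistance_ohms):
    """Estimate the voltage left after resistive loss in the power rails."""
    current_amps = power_watts / supply_volts           # I = P / V
    drop_volts = current_amps * rail_resistance_ohms    # V_drop = I * R
    return supply_volts - drop_volts


# Hypothetical chip: all of these values are assumptions, not measurements.
SUPPLY_VOLTS = 1.2
RAIL_RESISTANCE_OHMS = 0.005
MIN_RELIABLE_VOLTS = 1.0

light_load = on_die_voltage(12.0, SUPPLY_VOLTS, RAIL_RESISTANCE_OHMS)
heavy_load = on_die_voltage(60.0, SUPPLY_VOLTS, RAIL_RESISTANCE_OHMS)
```

With these made-up values the light load stays above the minimum while the heavy load dips below it, which is the failure mode being described.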
OP is not joking. I'm using Ubuntu on an eee900 netbook, and I can go make a coffee if I copy a 700MB movie from SD to internal storage. Lotsa RAM is not an option, and the system is a default install of one partition for everything. It's pretty obvious if you have some overkill hardware the performance drop won't be nearly enough to affect user experience. No, "Buy More RAM" is what I'd expect to hear from Microsoft support... I chose Linux primarily because of abysmal Windows performance on that netbook.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
That doesn't seem to be true.
http://stackoverflow.com/questions/1580923/how-can-i-use-linuxs-splice-function-to-copy-a-file-to-another-file
This is something I was wondering. As I understand it splice/sendpage just marks a page belonging to a block of one filesystem as a page of a block of another filesystem. Thus no memory copying: the same RAM the data was read into from one disc is used to write that data to another disc. So I think markhahn has a point, the problem is userspace, not the kernel. The cp code could be faster if it used the splice/sendpage zero copy stuff when dealing with filesystems where that is possible (i.e. non-FUSE).
BusyBox certainly doesn't http://git.busybox.net/busybox/tree/libbb/copyfd.c
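For anyone who wants to try this, here is a rough Python sketch of a copy that keeps the data inside the kernel. Note the assumptions: it uses os.sendfile rather than splice itself (sendfile takes two regular files directly, and needs Linux 2.6.33 or later for a file destination), and it falls back to an ordinary read/write loop if the call is missing or fails. The function name and chunk size are made up; this is not how cp or BusyBox do it.

```python
import os


def copy_file(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path and return the number of bytes copied."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(src).st_size
            copied = 0
            in_kernel = hasattr(os, "sendfile")
            while copied < size:
                want = min(chunk, size - copied)
                if in_kernel:
                    try:
                        # Data moves from src to dst without entering user space.
                        sent = os.sendfile(dst, src, copied, want)
                    except OSError:
                        in_kernel = False  # unsupported here: use read/write
                        continue
                    if sent == 0:
                        break  # source shrank underneath us
                    copied += sent
                else:
                    # Fallback: the classic copy through a user-space buffer.
                    data = os.pread(src, want, copied)
                    if not data:
                        break
                    copied += len(data)
                    while data:
                        written = os.write(dst, data)
                        data = data[written:]
            return copied
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

Timing this against a plain cp on the same files would be one way to see whether the user-space buffer really is the bottleneck.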
Still, a server task slowed down by 50% will take 2x the time to finish, that's all. A GUI slower by 50% disrupts the user's workflow and will make the task 3-4x longer because the user wastes time elsewhere, performs unnecessary tasks waiting for the GUI to respond, overlooks appearing reply and reacts much slower...
Think of a network that runs at half the speed vs one with 50% packet loss.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
The GNU cp doesn't either:
http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/copy.c
Could be an interesting test.
I've had systems where it slows down for large file transfers, and others where it doesn't. The pattern I've noticed: Intel integrated video chipsets. It may be any integrated video, but the only integrated video I've had and used with Linux is Intel.
Windows/Linux/FreeBSD: None slow down appreciably with a large file transfer and dedicated video card, though Windows has the most slowdown in this case.
Linux, by far, handles large file transfers the worst with integrated cards (at least G31 and earlier chips), and FreeBSD is the only one of the three where I don't see much performance degradation.
I *suspect* the issue is related to shared memory, although I would think that DMA would fix that.
Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
I've found that IDE controllers tend to get bogged with heavy I/O. Some of the better controllers like SATA or SCSI tend to totally release the overhead of the I/O.
I also found some CPU/motherboard issues with I/O. For example, one system had an ATA-66 hard drive, and a 64 bit AMD processor (Turion, I believe?). When the ATA was pegged, the AMD processor was totally slapped in the face, total time skews and everything.
It wound up being a weird timing issue in the processor where the fair queue got smacked around because of the timing bugs.
So sometimes, the hardware as well as the exact kernel revision you're using can impact performance.
One of the things I find that helps performance is to disable hyperthreading in your bios. For bogged down I/O, that seemed to improve a lot of performance.
Secondly, make sure you have a swap partition specified. Even if you have enough memory to choke a horse, you still should have some swap specified as the kernel may decide to swap out memory that's not been accessed in a long while to disk.
I hope this helps things.
I don't thrash a lot in Windows. I thrash more in OpenSUSE than I did in XP (I only put Linux on that PC when I try it). It has 2GB RAM which can run Vista and XP with little to no thrashing (with Swap set to Automatic). My other PCs run Windows 7 with 4+ GB RAM and a Dedicated ReadyDisk that is = in size to the RAM. They don't thrash at all :P
I experience these issues on Linux. It seems to happen most when the updaters run. So, whilst a Linux Desktop MAY boot up faster than XP, the Updater Applet immediately checks for updates, and it completely bogs down KDE. The end result is that I have to wait about as long as I do in XP to use the System. Also makes the desktop more jerky than necessary.
I find Linux to thrash more than Windows XP or 7 because it is more aggressive IRT caching. It's more similar to Windows Vista (which was extremely cache-happy).
If you have enough RAM Windows doesn't need a SWAP file. It will just cache things to RAM. Premiere can run fine if you have lots of RAM and disable the SWAP file. By lots of RAM... 4GB RAM is not enough to run Premiere that way if you are editing Videos of a decent size. I'm thinking more of 8-16GB RAM.
Windows can run out of Virtual Memory as well (just like most other OSes), so the fact that you have a SWAP file will not stop you from running out of Memory - especially with a high class Video Editing application... 4GB RAM is too little, IMO.
It was enough in 2004 or so, though...
try dd if=/dev/zero of=/tmp/delme bs=4K
with my sid kernel it hoses X, but with some patience you can ctrl-z out of it or pause the process from a text console.
But usually you don't dd stuff, and usually Windows is slower than Debian. Where I work, all issues are Windows related: too much time to reinstall (aptosid from a USB stick gives a complete desktop, able to print everywhere and scan from everywhere, in 4 damn minutes, while an unattended XP clean install from SP2 takes hours), which Office license was installed where, newer Office seems less compatible with OpenOffice, etc. All machines are dual booting.
IIRC kanotix ships with the Con Kolivas kernel so one can easily try out which scheduler is right for him. Then bother patching if it really makes a difference.
---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
Oh great, I tried my own suggestion and it doesn't hose the system. Maybe newer kernels corrected the problem? I'm on 2.6.36 and ext3 (on encrypted LVM) and doing the dd right now, as root. Load is 4, X is only a lil jerky at times. It was with earlier kernels and ext4 maybe...
---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
I question your experience of average desktop users.
The most common scenario is a single hd and lots of usb drives with data stored and retrieved from that.
Average users don't even know how to make a copy, they usually move files around if same filesystem and copy it around if different filesystem no matter if they wanted to move or copy stuff in the first place.
Back to topic, transfers on the same filesystem are slow because the heads are switching back and forth from source to destination blocks. If that hoses the system because of scheduling problems i dunno since i can't reproduce the problem anymore :D
---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
Because copying over a big file and doing other things is too taxing for a modern machine? Seriously?
Most shocking I/O behaviour like this I've suffered from has been a result of crap RAID controllers. Big cache, shoddy drivers/firmware/hardware, and Linux I/O scheduling doesn't get a look in. Potentially you find your one critical read stuck behind 512 MB of poorly performing writes, all within the confines of the card.
jh