The ~200 Line Linux Kernel Patch That Does Wonders
An anonymous reader writes "There is a relatively minuscule patch to the Linux kernel scheduler being queued up for Linux 2.6.38 that is proving to have dramatic results for those multi-tasking on the desktop. Phoronix is reporting the ~200 line Linux kernel patch that does wonders with before and after videos demonstrating the much-improved responsiveness and interactivity of the Linux desktop. While compiling the Linux kernel with 64 parallel jobs, 1080p video playback was still smooth, windows could be moved fluidly, and there was not nearly as much of a slowdown as there was without the patch. Linus Torvalds has shared his thoughts on this patch: So I think this is firmly one of those 'real improvement' patches. Good job. Group scheduling goes from 'useful for some specific server loads' to 'that's a killer feature.'"
They aren't compiling the kernel to see how long it will take (which, as you say, is rarely of all that much interest: few people do it, and a fast build-box isn't going to break the budget of a serious project). They are using a multithreaded kernel compilation as an easy way to generate lots of non-interactive system load, to see how much that degrades the performance, actual and perceived, of the various interactive tasks of interest to the desktop user.
This isn't about improving the performance of any one specific program, but about making a reasonably heavily-loaded system much more pleasant to use. Compiling the kernel is just a trivial way to generate a large amount of non-interactive CPU load and a bit of disk thrashing...
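For anyone who wants to try the same idea without a kernel tree handy, here is a minimal sketch of that kind of test: start some CPU hogs in place of the compile jobs, then measure how late a short sleep wakes up, as a rough proxy for interactive lag. The function names and the sleep-based measurement are my own assumptions for illustration, not what Phoronix actually ran.

```python
import multiprocessing as mp
import time


def burn_cpu(stop_event):
    """Spin until told to stop: a stand-in for one non-interactive compile job."""
    x = 0
    while not stop_event.is_set():
        x = (x * 31 + 7) % 1000003


def measure_wakeup_delay(samples=50, interval=0.01):
    """Sleep for `interval` repeatedly and record how late each wake-up is (seconds)."""
    delays = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(interval)
        late = time.perf_counter() - start - interval
        delays.append(max(0.0, late))  # clamp tiny negative values from timer rounding
    return delays


def run_experiment(n_hogs, samples=50, interval=0.01):
    """Measure wake-up delay while `n_hogs` busy processes compete for the CPU."""
    stop = mp.Event()
    hogs = [mp.Process(target=burn_cpu, args=(stop,)) for _ in range(n_hogs)]
    for p in hogs:
        p.start()
    try:
        return measure_wakeup_delay(samples, interval)
    finally:
        # Always shut the hogs down, even if the measurement is interrupted.
        stop.set()
        for p in hogs:
            p.join()
```

Comparing max(run_experiment(0)) against max(run_experiment(2 * mp.cpu_count())) gives a crude before/after number for how much the background load hurts wake-up latency on a given kernel.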
Arch Linux: Already in core.
I exaggerate, but it's not far from the truth - the kernel releases are imported into the testing repository as soon as they come out.
Yet Another Tech Blog
(but so much more, including game and movie reviews)
http://yanteb.peasantoid.org
They mention the "Con Kolivas" scheduler in TFA, but they don't seem to want to refer to it by its real name:
http://en.wikipedia.org/wiki/Brain_Fuck_Scheduler
It doesn't scale well past 16 cores, which is why Linus doesn't want to include it in the main kernel. But it's included in custom kernels for mobile devices, such as CyanogenMod for my Android phone.
Typically when you get this sort of speedup, it's by rewriting a tiny piece of code that gets called a lot. Sometimes you can get this sort of thing from a single variable, or from doing something odd like making a variable static.
Or whenever YOU choose to install the patched kernel. Freedom at work.
This is not the scheduler that the grandparent would be referring to though. BFS has been around for about a year, and has as far as I know never actually been pushed for inclusion.
The previous scheduler that Con wrote was rejected in favor of CFS which is currently in use by the kernel. CFS is at least partly based on ideas from Con, and he was also credited for them.
Yep. For those that haven't tried it without the patch, a multithreaded kernel compile will typically peg a modern multicore CPU at 100% and will even give you drunken mouse syndrome. Just being able to scroll around a browser window while doing a lengthy make -j64 is impressive. Being able to watch 1080p video smoothly is ... astounding. Especially when you consider the minimum CPU requirement for 1080p H.264 playback is a 3 GHz single core or a 2 GHz dual core.
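A back-of-the-envelope sketch of why grouping helps so much here. It assumes a single CPU, equal weights, every task always runnable, and that the build and the video player land in separate groups; the group names and helper functions are made up for illustration and are not taken from the patch.

```python
from fractions import Fraction


def per_task_shares(groups):
    """Every runnable task gets an equal slice, whatever group it belongs to."""
    total = sum(groups.values())
    return {name: Fraction(1, total) for name in groups}


def grouped_shares(groups):
    """Each group gets an equal slice, which its own tasks then split evenly."""
    n_groups = len(groups)
    return {name: Fraction(1, n_groups * count) for name, count in groups.items()}


# Hypothetical workload: 64 compile jobs in one group, one video player in another.
workload = {"build": 64, "video": 1}
print(per_task_shares(workload)["video"])  # 1/65 of the CPU
print(grouped_shares(workload)["video"])   # 1/2 of the CPU
```

Under these assumptions the player goes from being entitled to 1/65 of the CPU to 1/2 of it, while each compile job drops from 1/65 to 1/128. The player only uses what it needs, so the build barely notices.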
Or rather, that there are different priorities.
The problem was that with finer-grained scheduling you wound up sacrificing server performance for desktop responsiveness. Linux does not really want to sacrifice performance.
To be fair, I've been using Linux on my desktop for over ten years now, and even without this patch I mostly fail to see an issue.
So no, it doesn't have to be a physical RS232 serial line like in the seventies :-)
Linux has had I/O scheduling for a long time, which ought to address this. In practice, I/O scheduling is actually very hard to get right for nontrivial use cases, so your mileage may vary. The real problem is identifying the programs that are 'important' (for some value of important). Windows has (had?) a simple heuristic, where the program with the currently active window got a scheduling boost, but this was problematic when you were running Word in the foreground and Windows Media Player in the background - Word's background spell check would get priority over WMP's video decoder. FreeBSD's ULE scheduler tries to prioritise processes that spend a lot of time blocking for I/O, but again this doesn't help with things like H.264 playback which use lots of CPU, lots of I/O, and really don't want to drop frames.
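To make the failure mode concrete, here is a toy sketch of both heuristics. These are deliberately simplified stand-ins, not the formulas Windows or ULE actually use, and the task names and timing numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    foreground: bool  # owns the currently active window
    run_ms: int       # CPU time used during the sampling window
    blocked_ms: int   # time spent waiting on I/O in the same window


def foreground_boost(task, base=10, boost=5):
    """Active-window heuristic: a higher number means scheduled sooner."""
    return base + (boost if task.foreground else 0)


def blocking_fraction(task):
    """Sleep-ratio heuristic: share of the window the task spent blocked."""
    total = task.run_ms + task.blocked_ms
    return task.blocked_ms / total if total else 0.0


# Made-up numbers for a one-second sampling window.
spellcheck = Task("word-spellcheck", foreground=True, run_ms=900, blocked_ms=100)
decoder = Task("video-decoder", foreground=False, run_ms=600, blocked_ms=400)
terminal = Task("terminal", foreground=False, run_ms=20, blocked_ms=980)

# The spell checker outranks the decoder purely because its window is active.
print(foreground_boost(spellcheck) > foreground_boost(decoder))
# The decoder blocks far less than a mostly idle terminal, so it looks "batch".
print(blocking_fraction(terminal) > blocking_fraction(decoder))
```

Either way the decoder loses: it is neither in the foreground nor idle enough to look interactive, even though it is the one task with a hard deadline every frame.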
I am TheRaven on Soylent News
It's explained in the BFS FAQ: http://ck.kolivas.org/patches/bfs/bfs-faq.txt
Why "Brain Fuck"?
Because it throws out everything about what we know is good about how to design a modern scheduler in scalability.
Because it's so ridiculously simple.
Because it performs so ridiculously well on what it's good at despite being that simple.
Because it's designed in such a way that mainline would never be interested in adopting it, which is how I like it.
Because it will make people sit up and take notice of where the problems are in the current design.
Because it throws out the philosophy that one scheduler fits all and shows that you can do a -lot- better with a scheduler designed for a particular purpose. I don't want to use a steamroller to crack nuts.
Because it actually means that more CPUs means better latencies.
Because I must be fucked in the head to be working on this again.
I'll think of some more becauses later.
But the fact that Windows and OS X and many other lesser OS have fixed this problem years ago could also be used to show that Open Source is not quite as up on the times as you would like to think.
Seriously? OS X has absolutely terrible scheduling. A process that causes some swapping will cause beachballs everywhere and can freeze the windowserver for several seconds (well, the mouse keeps moving, but nothing else does). Hell, I have a FreeBSD VM on my Mac that responds better under load than stuff outside of it does.
BFS makes the classic trade-off, which Linus and almost all others absolutely agree is a bad decision for core inclusion. BFS trades performance for latency. Basically you get really good interactive performance in exchange for massive losses in overall throughput and efficiency. Furthermore, BFS sits on a horribly slippery slope in that the level of *inefficiency* scales wonderfully with core count. Basically, the more cores you add, the more time BFS spends BF*cking around.
Basically BFS only makes sense on single or dual core systems which will primarily be used as a desktop. Beyond that, it had better be a dedicated desktop system, because compared to the stock kernel, performance is going to royally suck, despite latency being fairly good. But then again, BFS was never really designed to address scalability - it's always focused on latency and a superior interactive desktop experience. By most measures it works, but at a heavy cost.
Was I the only one thinking about Amigas and Alphastations?
Amigas had so much offloading that you could pull the CPU out and still move the mouse pointer.
Did you mount a military-grade, variable-focus MASER on an unlicensed artificial intelligence?
Not all of us had money to keep upgrading our equipment. I was running on a 2400 baud modem until 1995. Of course, I installed Linux on my home box in those days by downloading Slackware to a ton of 3.5" floppy disks at the computer lab at the local university and bringing them all home. If one of the floppies was corrupted, I had to wait until the next day to go back and re-download and copy it.
I also had to walk 10 miles, in the snow. Uphill both ways.
Actually, even with no swap you will jam Linux when you run out of memory. Things like system libraries get thrown out of the memory cache, but are soon needed again and read back from the disk. This kind of circus can go on for half an hour until the actual OOM killer gets into the game.
It's been fairly well documented but you still seem to ignore the reality of what happened.
http://kerneltrap.org/node/14008
Read all that then tell me that Linus has an ego here. It seems to me that Linus is the only level headed guy and you're just trying to distort what really happened.
- He chose CFS over SD because SD's behaviour was inconsistent among users' computers
- Con would argue with people sending him problems rather than accept them as problems with his code
- Linus didn't want code in the kernel that would only work well for certain users
- Linus didn't want code maintained by someone that was so hostile to others' critique
- Linus states that he believes the desktop is an important platform