The ~200 Line Linux Kernel Patch That Does Wonders
An anonymous reader writes "There is a relatively miniscule patch to the Linux kernel scheduler being queued up for Linux 2.6.38 that is proving to have dramatic results for those multi-tasking on the desktop. Phoronix is reporting the ~200 line Linux kernel patch that does wonders with before and after videos demonstrating the much-improved responsiveness and interactivity of the Linux desktop. While compiling the Linux kernel with 64 parallel jobs, 1080p video playback was still smooth, windows could be moved fluidly, and there was not nearly as much of a slowdown compared to when this patch was applied. Linus Torvalds has shared his thoughts on this patch: So I think this is firmly one of those 'real improvement' patches. Good job. Group scheduling goes from 'useful for some specific server loads' to 'that's a killer feature.'"
Considering that UI lag was always my big problem with anything Linux-based (hell, it even seems to affect Android phones), this might be one small patch for the kernel, one giant leap for userspace...
I used to. For 3 years. But I wanted my time back.
Of course, how many years from now will that filter into the distros? My guess:
Gentoo: soon
Debian Unstable: 2Q 2011
Ubuntu, Fedora: 1Q 2012
Debian Stable: 2015
RHEL: 2020
Oh, no! You have walked into the slavering fangs of a lurking grue!
That's not a kernel patch... That's a bash script that forcibly installs BeOS!
http://xkcd.com/619/
"Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
They aren't compiling the kernel to see how long it will take(which, as you say, is rarely of all that much interest, few people do it and a fast build-box isn't going to break the budget of a serious project), they are using a multithreaded kernel compilation as an easy way to generate lots of non-interactive system load to see how much that degrades the performance, actual and perceived, of the various interactive tasks of interest to the desktop user.
This isn't about improving performance of any one specific program; but about making a reasonably heavily-loaded system much more pleasant to use. Compiling the kernel is just a trivial way to generate a large amount of non-interactive CPU load and a bit of disk thrashing...
Compiling the kernel doesn't prove true userspace improvements, but it does show an improvement with scheduling.
I see. It creates "groups" based on the tty and then tries to even out the CPU utilization between groups. This helps if there is a crazy background process eating up CPU and it might even help control flash crushing system performance a bit.
MidnightBSD: The BSD for Everyone
Isn't it awesome when a new version of your OS performs *better* than the last one on the same hardware?
# cat
Damn, my RAM is full of llamas.
They mention the "Con Kolvias" scheduler in TFA, but they don't seem to want to refer to it by its real name:
http://en.wikipedia.org/wiki/Brain_Fuck_Scheduler
It doesn't scale well past 16 cores, which is why Linus doesn't want to include it in the main kernel. But it's included in custom kernels for mobile devices, such as CyanogenMOD for my Android phone.
No matter how many different flavors of Linux I installed, it just never seemed as snappy as Windows. There was always a sluggishness about it, nothing I could really put my finger on, but it was definitely there and it bothered me. I'm very glad to hear that a solution is in sight.
I hope the people at Ubunto get this out as an update as soon as possible.
Typically when you get this sort of speedup, it's by rewriting a tiny piece of code that gets called a lot. Sometimes you can get this sort of thing from a single variable, or for doing something odd like making a variable static.
The patch being talked about is designed to automatically create task groups per TTY in an effort to improve the desktop interactivity under system strain.
I guess this works because the task loading up the machine will have been launched from a konsole, and thus be tied to a specific tty (the make -j64 example given later), so a tty-based scheduler can appropriately downgrade it.
But what if the "loading" task has been launched directly from the GUI (such as dvdstyler)? It won't have a tty attached to it, and will be indistinguishable from all the other tty-less tasks launched from the GUI (such as the firefox where you browse your webmail), or worse, Firefox itself creating the load (such as happens when you've got too many Facebook or Slashdot tabs open at once...)
This has been brought up by others on slashdot before, but Linux tends to be either (A) fine and happy, or (B) pushed into a thrashing state from which it can never recover - like it takes 8 or 10 minutes to move the mouse cursor across the screen. Since there is relatively no warning before this happens, it makes a hard reboot just about the only option.
Will this patch help with that issue? Like the threads below say, once a modern (KDE/Gnome) desktop Linux starts swapping, there is so much new data produced per second by the GUI that it's basically game over. I'd like to see a fix for this: it's the single biggest cause on Linux that makes me do a hard reboot. I just don't have the patients to see if the thing will recover in half an hour or so, even though it might.
http://ask.slashdot.org/comments.pl?sid=1836202&cid=33998198
http://ask.slashdot.org/comments.pl?sid=1836202&cid=33999108
http://www.linuxquestions.org/questions/linux-kernel-70/swap-thrashing-can-nothing-be-done-612945/
This is not the scheduler that the grandparent would be referring to though. BFS has been around for about a year, and has as far as I know never actually been pushed for inclusion.
The previous scheduler that Con wrote was rejected in favor of CFS which is currently in use by the kernel. CFS is at least partly based on ideas from Con, and he was also credited for them.
....would make it 2x faster! LOC is the #1 metric for programming.
With make -j64, this is ok: The make will have been started from a terminal (it's not important whether it's a Linux (console) terminal or within an X window), whereas most other GUI programs won't. So they can be distinguished from each other.
But in case of dvdstyler, it won't have a tty if it has been started directly through the menus, and so the scheduler cannot put it in a different group than all the other GUI apps (which also lack a tty).
Of course, if you start dvdstyler from the command line (from a konsole), this won't apply.
You can easily check which tty (if any) a program has associated by doing a ps auxww and looking at the 7th column (where it will say ? for "no tty" or name the tty (such as pts/9)
The answer is very very very badly.
This is a "NERD Feature" patch which does very little to the improve the way Joe Average Luser uses his desktop. In fact it leads to some seriously goofy allocation artefacts.
What it does (if I read it right) is that it puts all processes with the same controlling TTY into the same group. Well, anything launched in X has no controlling TTY. So it all gets lumped into one group. Now you open a xterm and you launch something from there. Miracle, halleluyah, that actually got into a separate schedule group which can now hog a CPU while the rest of apps will fight for the other one on a two core machine. So what am I supposed to do? Start every one of my X applications from a different Xterm so they have a different controlling TTY (and do not close any of them)?
Screw that for laughs.
Process grouping is too difficult to be done based on such simplistic criteria. It is best to provider an interface through which a user can group all of the processes with his UID and leave the Desktop environment do the grouping. Or put something on the dbus which listens and follows who talks to whom to do the same. This will provide much better results than putting yet another simplistic euristic in the kernel.
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
Wow, I think my K6 pentium clone could compile the kernel faster than that!
Con Kolivas worked on his staircase scheduler and various performance patches for years. They were routinely demonstrated to be a major improvement. Linus kept saying he was concerned the tradeoff of desktop performance would come in other environments, even though that wasn't true.
Since benchmark after benchmark showed staircase was vastly superior to what was in mainline, Linus then went to go after the guy rather than the code. He said Kolivas couldn't be trusted to support his code, and thusly it would never be accepted mainline. Reality was that Kolivas had been responding to criticism and updating his patchset for over 3 years, constantly improving it. In addition to the LKML, he also maintained his own support mailing list.
I'm a Linus fan 95% of the time, but it was a really shitty move, and it drove Kolivas away from contributing. He quit coding for a while. Then after Linus argued for years how this was a bad idea, suddenly the mainline kernel developed the Completely Fair Scheduler overnight, which was very similar to Kolivas' Staircase scheduler. Linus never admitted he had been a dick for years arguing against the thery of the scheduler rewrite. He then pushed the brand new, untested scheduler into mainline.
CFS is better than what we had before, but it still lost in benchmarks to Staircase, and it was new, untested code.
Now, Kolivas came out of retirement with a new scheduler called Brainfuck that is even faster, but he has no intention of ever tryint to get it in the mainline kernel.
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
Yep. For those that haven't tried it without the patch, a multithreaded kernel compile will typically peg a modern multicore CPU at 100% and will even give you drunken mouse syndrome. Just being able to scroll around a browser window while doing a lengthy make -j64 is impressive. Being able to watch 1080p video smoothly is ... astounding. Especially when you consider the minimum CPU requirement for 1080p H.264 playback is a 3 GHz single core or a 2 GHz dual core.
My blog
Obviously you've never compiled a kernel passing -j64 to the make process. If you had, you would know that all your CPUs would be pegged (indeed -j7 pegs all 8 cores on my laptop.) Of course that is not the benchmark part. The point is that with an "all your processors are belong to kernel build" condition, group scheduling allows there to be essentially no perceived slowdown for interactive GUI/Window manager based computing. You probably should have figured out that you were missing something when Linus Torvalds didn't object to the benchmark and the results. Seriously, this is a major point to understand: When it comes to the kernel, in a fight between your knowledge and Linus', Linus wins.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
If it takes less than 3 years, you're doing it wrong.
This post contains benzene, nitrosamines, formaldehyde and hydrogen cyanide.
recompile the Gentoo kernel using --mph=88
Why not just compile it yourself?
1) There are no 1200 kbps dialup modems. The fastest ones do 56 Kbps under specialized conditions.
2) If you meant to say 1200 bps modems, well, by the Linus wrote and released the first versions of the Linux kernel, 1200 bps modems and the 386SX were both well obsolete. The most common systems of that day were 486DXs running at 33 MHz at the sweet spot and 50 MHz at the high end. Most people had modems capable of greater than 9600 BPS (many around 14.4Kbps and 28.8Kbps)
3) Now quit making fun of us real old-timers and get off my lawn .
My blog
Actually Linus lost track of many such things because too self centered or ego driven (which happens to most of us when you such success and things to deal with but anyways)
This very patch is a prime example, if it goes *default* in the kernel.
It's a patch that favors *only* Linus's usage of linux: Browse the web while his kernel compiles. Now imagine you start your video from a tty (mplayer blah_1080p.avi) and it takes 95% cpu to be smooth, then you start your browser.. uho.
In BeOS I could^H^H^H *can* (haiku, too) start 5 videos on and old PC, browse the web and:
- the browser is smooth like butter
- the 5 videos are a bit choppy (no miracles but that the point) but they all run at the same speed
Now _that_ is what I want a desktop scheduler to be like.
With more criticism he could see that some would want this auto grouping option, but the majority wouldn't. Now what does that tell us?
It tell us that it's _either_:
A/ Not the best solution (i.e. our scheduler sux)
B/ Grouping should be smarter or more easily configurable (it's currently configurable in the previous kernel version and can do just what this kernel patch does!)
So no, it doesn't have to be a physical RS232 serial line like in the seventies :-)
Occassionally, yes. But usually when Linus flames someone on the LKML he is entertaining, and is right. This was an instance where Linus just seemed to be a dick for no reason.
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
Linux has had I/O scheduling for a long time, which ought to address this. In practice, I/O scheduling is actually very hard to get right for nontrivial use cases, so your milage may vary. The real problem is identifying the programs that are 'important' (for some value of important). Windows has (had?) a simple heuristic, where the program with the currently active window got a scheduling boost, but this was problematic when you were running Word in the foreground and Windows Media Player in the background - Word's background spell check would get priority over WMP's video decoder. FreeBSD's ULE scheduler tries to prioritise processes that spend a lot of time blocking for I/O, but again this doesn't help with things like H.264 playback which use lots of CPU, lots of I/O, and really don't want to drop frames.
I am TheRaven on Soylent News
Its explained in the BFS FAQ http://ck.kolivas.org/patches/bfs/bfs-faq.txt
Why "Brain Fuck"?
Because it throws out everything about what we know is good about how to design a modern scheduler in scalability.
Because it's so ridiculously simple.
Because it performs so ridiculously well on what it's good at despite being that simple.
Because it's designed in such a way that mainline would never be interested in adopting it, which is how I like it.
Because it will make people sit up and take notice of where the problems are in the current design.
Because it throws out the philosophy that one scheduler fits all and shows that you can do a -lot- better with a scheduler designed for a particular purpose. I don't want to use a steamroller to crack nuts.
Because it actually means that more CPUs means better latencies.
Because I must be fucked in the head to be working on this again.
I'll think of some more becauses later.
BFS makes the classic trade offs which Linus and almost all others absolutely agree is a bad decision for core inclusion. BFS trades performance for latency. Basically you get really good interactive performance in exchange for massively losses in over all throughput and efficiency. Furthermore, BFS sits on a horribly slippery slope in that the level of *inefficiency* scales wonderfully with core count. Basically, the more cores you add, the more time BFS spends BF*cking around.
Basically BFS only makes sense on single or dual core systems which will primarily be used as a desktop. After that, it better be a dedicated desktop system because compared to the stock kernel, performance is going to royally suck, despite latency being fairly good. But then again, BFS was never really designed to address scalability - its always focused on latency and a superior interactive desktop experience. By most measures it works, but at a heavy cost.
Its been fairly well documented as to what happened. Linus is an egotist; as are many smart people, including others on the LKML. It seems Con has a fair ego himself with somewhat gruff interpersonal skills. The combination put him at odds with a lot of people. Generally speaking, people with large egos, don't like others who know more than they do, especially when its their pet project. Furthermore, Linus already had some people he trusted, who also had their ego's hurt by Con's work.
So the line was drawn. Linus and his trusted lieutenants on one side and Con on the other. Linus with his lieutenants, all with hurt egos from a slightly abrasive developer who clearly understood the problem domain better than all of them, simply decided he preferred his pride and undue ego over that of Con's potential contributions.
Not surprisingly, this type stuff happens ALL THE TIME amongst developers. I can't tell you how many times I've seen personal pride and ego take priority over extremely well documented and well understood solutions. Not to mention, frequently, the same documentation which explains why its a good decision, also explains why the decision of pride is an incredibly bad decision; with a long history of being very dumb. Despite that, egos frequently rule the day. So its hardly surprising that Linus isn't immune.
Indeed, I find that Linux has an almost miraculous ability to convert users into developers. I've encouraged several friends over the years to use it, and out of those that have actually done it, most have not just stuck with it but have even surpassed me in some ways. It can take a teenager or young adult who knows barely anything about coding or even much about computers and can turn them into a coder or admin in no time. My theory is that people follow a sort of a path:
1. They use a bloated (but more immediately user-friendly) distribution like Ubuntu.
2. They hear about other users with lighter software and try it out. It requires a bit more configuration to be the way they want it to be. In the process, they learn more about their system and they learn many useful things.
3. The drive for improved performance and usability leads them to learning more and more and doing deeper and deeper configuration until they know a great deal and are very comfortable with their system.
4. The "scratch that itch" (ESR) effect comes into play; they see deficiencies in existing software and go out to make their own.
5. Before long, they are contributing to several projects, have a technical blog, and are advanced users.
It seems to be some sort of natural progression and it's interesting to see it happen over the period of several years. More significantly, Linux seems to turn a higher percentage of basic users (even those of the luser variety) into developers and advanced users. This seems to be why Linux progresses at such a fast pace; its users actively encourage other users to involve themselves on a deeper level.
Yet Another Tech Blog
(but so much more, including game and movie reviews)
http://yanteb.peasantoid.org
Back in the 80's and 90's many DOS users followed the same path. BYTE, PC Magazine, and even PC Computing provided essays on "how it works" and source listings for 8086 ASM and BASIC programs. But somewhere around the time of Windows 95, things seemed to change and it became expected that the "average user" had no interest or time in learning how anything works. I remember that desire to Not Know Or Care as one of the major friction points between the DOS user base and the Windows (and Mac for that matter) user base.
Actually even with no swap you will jam Linux when you run out of memory. Things like system libraries get thrown out of memory cache, but are soon needed again and read from the disk. This kind of circus can go on for half a hour until the actual OOM killer gets into the game.
It's been fairly well documented but you still seem to ignore the reality of what happened.
http://kerneltrap.org/node/14008
Read all that then tell me that Linus has an ego here. It seems to me that Linus is the only level headed guy and you're just trying to distort what really happened.
- He choose CFS over SD because SD behaviour was inconstant among users' computers
- Con would argue with people sending him problems rather then accept them as problems with his code
- Linus didn't want code in the kernel that would only work well for certain users
- Linus didn't want code maintained by someone that was so hostile to others' critique
- Linus states that he believes the desktop is an important platform