Con Kolivas Returns, With a Desktop-Oriented Linux Scheduler
myvirtualid writes "Con Kolivas has done what he swore never to do: returned to the Linux kernel and written a new — and, according to him — waaay better scheduler for the desktop environment. In fact, BFS appears to outperform existing schedulers right up until one hits a 16-CPU machine, at which point he guesses performance would degrade somewhat. According to Kolivas, BFS 'was designed to be forward looking only, make the most of lower spec machines, and not scale to massive hardware. i.e. [sic] it is a desktop orientated scheduler, with extremely low latencies for excellent interactivity by design rather than 'calculated,' with rigid fairness, nice priority distribution and extreme scalability within normal load levels.'"
Why would the summary omit this precious bit of information?
Great news :-) Now, will the kernel people, with Mr. Torvalds at their head, restart the whole debate on pluggable schedulers? Since his scheduler, as he says, degrades beyond 16 CPUs, better options already exist for servers, where I am guessing CFS is used. So, he may be back, but the road ahead is still as steep?
May I be the first to say "amen"? I've been very dissatisfied with the 2.6 kernel and its schedulers on the desktop, CFS in particular. CFS seems entirely braindead for desktop use compared to the older schedulers in 2.4 and yes, even 2.2.
A desktop machine needs to be, first and foremost, responsive. If it isn't, it's comparable to the cursor freezing and input taking several seconds to appear: on today's hardware, one might start to think "hey, did it freeze on me?" - completely unacceptable.
Maybe it can be chalked up to the non-priority of X and video at the kernel level; I don't know. Whatever it is, it used to be better, on very pathetic (133MHz) hardware, while doing a lot more (and when such hardware was not all that powerful anymore, as well).
My question is: is it in the kernel tree yet? Is this that 2.6.31 scheduler change I heard about earlier yesterday, or is it something Completely Different?
Oh yeah, and which other schedulers, if any, did this guy write?
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
I smell another LKML flamewar coming....
This comment is fully compliant with RFC 527.
Clearly, Desktop Linux and Server Linux have some things in common, but they also have different needs. I'm not intimately familiar with any kernel programming, but I do have some basic understanding of how it all works, and even I find it relatively easy to understand that the needs of a good and snappy desktop and those of a reliable server are going to have some differences.
I think it is past time that some sort of kernel operating-mode optimizations, like this scheduler for the desktop, were made available, even if the defaults are for servers.
Took me a while to figure out what "forward looking" means in this context, since "forward-looking scheduler" doesn't seem to be common terminology, and I assumed he wasn't talking about his grand forward-looking vision for schedulerdom.
Based on some previous arguments he's had, it sounds like he opposes the common heuristic of upping interactive process priority by keeping track of how long processes sleep--- processes that sleep a lot are probably I/O bound, and should get a priority boost so they can run on the (less frequent than for CPU-bound processes) occasions when they're ready. Kolivas wants schedulers to be forward-looking in the sense that they decide how to schedule without looking at process run history, by looking purely at who's ready to run, available timeslices, priorities, etc.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
He means something different by it--- that the scheduler should only look forward, not look back to per-process history in making its scheduling decisions. A common hack/heuristic to improve interactive performance is to boost the priority of processes that sleep a lot, since CPU-bound jobs sleep rarely, while interactive processes sleep a lot. Kolivas thinks that's a hack that obscures the real problems with interactive performance, and leads to unpredictable performance since it doesn't fix the underlying issues. So he wants to design schedulers with good interactive performance that make decisions based purely on the current set of running processes and priorities, and the upcoming timeslices.
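To make the distinction concrete, here is a toy sketch of the two approaches. It is not BFS or any real kernel code; the `Task` fields, the bonus formula, and the numbers are all made up for illustration. The first picker consults sleep history, the second looks only at static priority and an upcoming deadline.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    nice: int       # static priority; lower value runs first
    slept_ms: int   # recent time spent sleeping (per-process history)
    deadline: int   # hypothetical time by which the next timeslice is due


def pick_with_sleep_bonus(ready, max_bonus=5, ms_per_bonus_step=100):
    """History-based heuristic: the more a task slept, the bigger its boost."""
    def effective_priority(task):
        bonus = min(max_bonus, task.slept_ms // ms_per_bonus_step)
        return task.nice - bonus
    return min(ready, key=effective_priority)


def pick_forward_looking(ready):
    """No history: decide from static priority and the upcoming deadline only."""
    return min(ready, key=lambda task: (task.nice, task.deadline))


ready = [
    Task("compiler", nice=0, slept_ms=0, deadline=10),
    Task("editor", nice=0, slept_ms=200, deadline=8),
    Task("backup", nice=2, slept_ms=900, deadline=5),
]

print(pick_with_sleep_bonus(ready).name)  # backup
print(pick_forward_looking(ready).name)   # editor
```

In this contrived case the sleep bonus lets a niced, I/O-heavy background job outrank the editor, which is the kind of surprise the history-free approach is meant to avoid.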
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
Haven't run Linux as my personal OS since 2003 but I had a lot of time (pun intended) for CK's schedulers. Now a whole new generation of youngsters can finally learn what a _REAL_ LKML flamewar looks like ;-)
Still some grudge towards Torvalds and Molnar? From the FAQ:
Are you looking at getting this into mainline?
LOL.
No really, are you?
LOL.
Really really, are you?
No. They would be crazy to use this scheduler anyway since it won't scale to their 4096 cpu machines. The only way is to rewrite it to work that way, or to have more than one scheduler in the kernel. I don't want to do the former, and mainline doesn't want to do the latter. Besides, apparently I'm a bad maintainer, which makes sense since for some reason I seem to want to have a career, a life, raise a family with kids and have hobbies, all of which have nothing to do with linux.
Reminds me of this XKCD.
I don't have 4096 CPUs, good job Con Kolivas!
I've yet to be impressed by any of them, for any use, with any hardware.
I've yet to be impressed by your comment, which contains no reason for your opinion.
Care to give us some examples of your uses & hardware?
My pics.
CFS can't even cope with a CPU-bound application.
Who here runs Linux on anything with more than 16 cores? Why should everyone else get the shitty end of the stick just because of maybe a dozen institutes with deep pockets?
16 sounds like a ridiculously high number for a desktop but is it?
Already we have 4-core processors which have "soft" additional threads (Intel's HT for instance), and some people already have dual-CPU desktop machines, meaning they are already at the 16 CPU limit.
Roll on 12-18 months and we'll be seeing 8 core CPUs with 8 soft-cores as coming in on top end desktops. Roll forwards 3 years and you'll be seeing 32 core CPUs with 32 soft-cores which is where the scheduler breaks down.
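The arithmetic behind those figures, as a quick sketch. The function name and the constant are illustrative; the 16-CPU figure is the point from the summary where Kolivas guesses performance would start to degrade, not a hard limit.

```python
# Assumed threshold, taken from the story summary (a guess, not a hard limit).
GUESSED_DEGRADE_POINT = 16


def logical_cpus(sockets, cores_per_socket, threads_per_core=2):
    """Logical CPUs the scheduler sees: sockets x cores x hardware threads."""
    return sockets * cores_per_socket * threads_per_core


today = logical_cpus(sockets=2, cores_per_socket=4)   # dual 4-core with HT
soon = logical_cpus(sockets=1, cores_per_socket=8)    # 8 cores + 8 soft-cores
later = logical_cpus(sockets=1, cores_per_socket=32)  # 32 cores + 32 soft-cores

for label, count in (("today", today), ("soon", soon), ("later", later)):
    status = "past" if count > GUESSED_DEGRADE_POINT else "within"
    print(f"{label}: {count} logical CPUs, {status} the guessed degrade point")
```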
So the problem here is that this is a brilliant optimisation for today and for pieces like the netbook market but won't be good for the desktop market long term.
With Linux looking to be strong in the netbook market, however, it does suggest that having a more efficient scheduler for that market would be a better idea than just optimising everything for the server side.
An Eye for an Eye will make the whole world blind - Gandhi
Musical Schedulers? Let me guess, when the music starts to skip, a random process gets killed.
Almost all runaway processes are due to bugs in the end applications, not some situation created by the kernel.
Compiling with SSD vs. mechanical HD:
http://anandtech.com/storage/showdoc.aspx?i=3631&p=25
Compiling is CPU bound.
Hurd is not unsuccessful because it is a microkernel, it is unsuccessful because it is run by perfectionists. Every time they get something quite good, they realise that a complete rewrite could make it even better and they throw away a lot of good code.
Xen seems to be doing quite well as a microkernel, but until everyone is using multiprocessor machines there is a performance penalty for using a microkernel. When everyone is using multicore, they still have the disadvantage that monolithic kernels have been under active development for the last thirty years (more in a few cases) while microkernels have been largely ignored.
A modern OS kernel, however, often has a lot more in common with microkernel designs even if it's all running in a single address space. Take a look, for example, at the OpenSolaris network stack. Every component runs in a separate thread and communicates with those above and below via message passing. It would be trivial to separate these out into different userspace processes, but there's no real advantage to doing so.
I am TheRaven on Soylent News
During testing (on the Windows platform!) I guess it's safe to assume that everything was handled by filesystem cache.
The comparison with compiling the kernel on Linux on a machine with not too much RAM doesn't hold.
Also, compilation is not all that I/O bound, it is more CPU bound.
Depends a lot on what you're compiling. A typical program on OS X, for example, begins with #import <Cocoa/Cocoa.h>. This includes a header which brings in around a hundred other headers for a total of about 3MB of preprocessed source. Most of the time you'll be using a precompiled header for this, but you still often get a spike of read activity at the start of a compilation, then a CPU-bound chunk, then a write-bound part as it generates the object code. This is why, when you use -j, you are recommended to use a few more processes than you have cores, so you can overlap the I/O-bound parts in one compile with the CPU-bound parts in the next.
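As a rough sketch of that rule of thumb: the `extra=2` default below is an assumption, since "a few more" isn't a precise number, and the helper names are made up.

```python
import os


def suggested_jobs(cores=None, extra=2):
    """Return a -j value slightly above the core count.

    `extra` is an assumed value; the advice is only "a few more" than cores,
    so that I/O-bound phases of one compile overlap CPU-bound phases of another.
    """
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() may return None
    return cores + extra


def make_command(cores=None, extra=2):
    """Build the argument list for a parallel make invocation."""
    return ["make", f"-j{suggested_jobs(cores, extra)}"]


print(make_command(cores=4))  # ['make', '-j6']
```

The right amount of oversubscription depends on how I/O-heavy the build is, so it's worth timing a couple of values rather than trusting any fixed number.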
I am TheRaven on Soylent News
Welcome back Con! I wonder how long it is before Ingo "Kudos Con" Molnar rips off the new design? The kernel team has developed a very bad case of "not invented here." http://kerneltrap.org/node/8059
an ill wind that blows no good
If you're interested, the clang team have done a lot of profiling of exactly what takes time when compiling. It's particularly interesting how much of a bottleneck preprocessing is with gcc and, more importantly, distcc (which sends the preprocessed sources over the network for compilation). Most of the results are on the web site, with a few in the mailing list archives.
I am TheRaven on Soylent News