Torvalds Explains Scheduler Decision
Firedog writes "There's been a lot of recent debate over why Linus Torvalds chose the new CFS process scheduler written by Ingo Molnar over the SD process scheduler written by Con Kolivas, ranging from discussing the quality of the code to favoritism and outright conspiracy theories. KernelTrap is now reporting Linus Torvalds' official stance as to why he chose the code that he did. 'People who think SD was "perfect" were simply ignoring reality,' Linus is quoted as saying. He goes on to explain that he selected the Completely Fair Scheduler because it had a maintainer who has proven himself willing and able to address problems as they are discovered. In the end, the relevance to normal Linux users is twofold: one is the question as to whether or not the Linux development model is working, and the other is the question as to whether the recently released 2.6.23 kernel will deliver an improved desktop experience."
Hasn't Linus already said he's very interested in adding it into the Linux kernel (going so far as to say that it could be a reason to consider going GPLv3 in the future if Sun releases Solaris under it), but right now it's tied up in closed source?
What do CFS and SD stand for in this case? The summary and linked articles do not describe this.
Is CDDL compatible with GPLv2? I may have misspoken in my first post, but I think the gist of what I was trying to say is still there. ZFS is behind a license which isn't compatible with the license Linux is under. Or am I completely misunderstanding things? I'll admit I'm far from an expert on these matters...
which kernel scheduler is pretty low on the list of factors affecting what the Linux desktop experience is all about...
Frankly, really high quality experiences take organizational planning and leveraging the experience of huge groups in a way that the "bazaar" model of software development in open source does not do well. Would someone please just build a mutual benefit corporation for open source users and maintainers? Let's start paying for project managers and the other experienced professionals required to make a "desktop experience" and you will see Linux take over.
Having lurked on http://www.lkml.org/ for several years, I find Linus to be rather rude. Maybe it's because English is not his first language, so his words are not well chosen. I must say though, that I excuse him because he produces [or helps produce] a very useful product in the world today. That is the Linux kernel.
You can be pretty damn sure that the desktop experience is improved with Ingo Molnar's scheduler! If you have done any serious audio work on any platform you know that Linux kernel + Ingo Molnar's IO scheduler = the best platform for serious audio work. This combination has the lowest latencies. Linux kernel + Ingo Molnar's IO scheduler + Ardour currently offers the lowest latencies and, best of all, it's completely free! It is pretty amazing - really. Every true professional audio engineer will agree with me.
Why can't there be a flag that determines what scheduler is used at runtime, with both schedulers built into the kernel? I thought the whole point of Linux is that it is customizable and modular - I know this doesn't necessarily apply to the kernel, but why not?
;)
I know very little about operating systems, schedulers, and maintaining large projects, so please excuse any ignorance in my post
As long as it's not anything like the CFQ IO elevator, which has turned out to slow down and increase critical latencies on every system I've tested it on, compared with deadline and anticipatory schedulers. This seems to be especially true for hard working (i.e. underpowered) systems, where the need for a good IO scheduler is higher.
I know, IO and job scheduling are two different things, but I still hope the "completely fair" naming part is coincidental, and not a promise of similarity.
Have you improved it to the point that when the system is borderline out of main memory or has a moderately high load average it actually *works* as a desktop system?
E.g. when Firefox is consuming 65-70% of main memory and slower than #%#$ and you know it is waiting on swapped out pages and your swap rate is measured in the dozens to hundreds rather than hundreds to thousands (on vmstat)? (I mean really, how can one take an operating system seriously when only memory is at 100% and not CPU + memory + Disk I/O?)
The real issue, for those who have read comments that Con has made in interviews, seems to be the lack of concern on the part of most of the "in-crowd" Linux developers for performance on the desktop. In part this seems driven by the fact that the people who actually get paid to maintain Linux, benchmark it and "improve" it only care about its performance in server farms and *not* on the individual desktop. I will weigh in on the side of the desktop user out there (who wants the Linux sitting beneath their desk to devote its every waking minute to making *them* happy) by saying that if my mplayer "hangs" in the middle of a song (only to continue with a loud burst of noise 10 seconds later) when the CPU is busy with "nice -19" processes, my Firefox browser takes half a minute to scroll a page (or open a screen) when memory is tight, and it takes minutes to bring up a tab or minimized program I haven't touched in 3 days and return it to a functional state, then the operating system *Has a PROBLEM*.
Con was very clear in his interviews that the problem is the lack of caring about *desktop* performance. Given my comments in the previous paragraph -- some of these areas may be very difficult to benchmark and as a result one is left with nothing but handwaving and loud voices when it comes to discussions about whether the problem exists and how it should be fixed.
I will say this, in the mid-'90s I used X-windows under Unixware on *Pentium 1s* as a desktop machine. I now use X-windows under Linux on a Pentium 4 (with 5-10x more main memory) as a desktop machine. I would argue that my desktop user experience is as problematic now as it was then *despite* the hardware improvements. That IMO is what Con felt was the problem he was trying to address. That is what it would appear the core Linux developers may be failing to address. Con's points raised my awareness level to the extent that I actually went investigating to see whether there were open source distributions of the BeOS and/or Darwin (which is based on Mach) available since they are based on different OS architecture models and might be more end-user friendly [1]. I was hoping to find something I could run in a VM under Linux on my current hardware without major file system surgery. But I have little confidence that such an approach would fix core problems with the Linux scheduling and paging systems. I would *love* to see a real side-by-side comparison of Linux vs. FreeBSD for desktop users with an emphasis on how BSD scheduling, paging and swapping may be different (better?).
(And as a side note, I could care *zero* about the performance of Linux in file server applications!)
1. I did use both Nextstep (on Pentiums) and IRIX (on a R4000) for a while and found both to provide better end-user experience than Unixware (X) or Windows (95-98). I am disappointed that Linux barely manages to match those experiences given the hardware available nearly a decade later.
1. Games run just perfectly on Linux, often with better speed than in Windows. Every single game I have played in Linux has worked perfectly and smoothly. What Linux needs for gaming is more normal users who buy games. If there is a market, someone will fill it quickly. I will also strongly refute that games are essential to desktops. There are in fact people who use their computers to actually do something useful.
2. This decision has nothing to do with egos. This guy just happens to believe that it's essential to speed up the kernel when the only slow apps in Linux are non-native apps like Java, OpenOffice or Firefox. Those are not slow because of any scheduler but because they are written with slow toolchains. More work on making the kernel more responsive won't help at all at this stage.
[emphasis mine]
I'm gonna have to go with Linus on this one.
The -rt patch will:
- Make latencies deterministic
- Reduce latencies if used correctly
- Add slight overhead to overall throughput
I think once the -rt patch is merged into the mainline it will work wonders for games and all sorts of other important things, like industrial automation,
automated stock trading, and other high-speed data acquisition and processing.
So there is a road map to improve scheduling. In fact it's actually a broader and
more appealing plan than just scheduling for the desktop, IMHO. I think this is what
Linus is trying to get at in terms of why he doesn't want a perfect desktop scheduler.
Money is the root of all evil?
And what's with the massive ego? It's as if suddenly Linus thinks he invented compilers or something. I think he needs to take a vacation and regain some perspective.
This post expresses my opinion, not that of my employer. And yes, IAAL.
The CDDL was designed to be GPL incompatible, Solaris didn't want its crown jewels to be in Linux.
Linus pointed out problems with Con dismissing other people's problems, and nobody says "oh, that didn't happen" - except you. In fact, the general response was "well, that didn't happen very often."
Similarly, even people who are attacking Linus say that Ingo acted the way Linus says he acted in the same situation - in other words the main reason Linus rejected SD is confirmed by the very people who are arguing for SD.
"then Con will get publicly slammed by people like you who think it's fine to comment on what they don't know about"
So it's OK for you to do it about Linus, but it's not OK for someone to do it about Con? Somehow the words "Pot. Kettle. Black." come to mind.
Ehmm... I'm not a 'gamer'.. Maybe once every 2 months or so if I don't have anything better to do... But most people I know below 30 are playing games, and without good support for that in GNU/Linux they will never switch away from Windows.
The problem the Linux distributions have today is that they have a bad reputation, and this scares people away from them.. (remember that all unknowns are scary for most people)
Now back to the topic... First of all I do want to say that I have not had the chance to try the CFS scheduler yet, but the thing here is that the SD scheduler was great on lots of other tasks too. With the plain O(1) scheduler I had lots of problems when playing video streams under high CPU load, and the issue was that the scheduler gave background jobs too large timeslices, which caused the player to skip frames and such. With the SD scheduler there were some really nice things you could do too, like setting the background jobs to SCHED_ISO, which caused the process to use ONLY unused CPU time, and this worked great. I could have 5-6 gcc's loading the CPU without even a frame dropped, and I did try out an OpenGL game (enemyterritory) during compilations just to see how it worked, and no problems there either.. Then after switching back to the O(1) scheduler I could not even have 2-3 gcc's running without losing LOTS of frames, and did not even bother with trying out a game..
So you see, games are just a part of it.. All types of low-latency applications like video, music, games and some daemons like nfsd and such need a good scheduler to give the best performance/interactivity.
I think I'm gonna give the CFS scheduler a try during the week, but I don't have any high hopes after what people have reported about it so far. And I do think it would be a better idea to implement multiple schedulers that you could choose between instead of having 'one size fits all'.
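On the point about pushing background jobs onto a policy that only gets leftover CPU time: mainline Linux has something in that spirit, the SCHED_IDLE policy, which runs a task at extremely low priority (lower even than nice +19). Below is a minimal sketch in Python, assuming a Linux box and Python 3.3 or later. The name `demote_to_idle` is made up for illustration, and this is not the SCHED_ISO interface from the SD patches.

```python
import os

def demote_to_idle(pid: int = 0) -> bool:
    """Try to move `pid` (0 means the calling process) to SCHED_IDLE.

    Returns True if the policy is in effect afterwards, False if the
    platform lacks the call or the kernel refuses the request.
    """
    policy = getattr(os, "SCHED_IDLE", None)
    if policy is None or not hasattr(os, "sched_setscheduler"):
        return False  # not Linux, or Python built without sched support
    try:
        # Non-realtime policies require a static priority of 0.
        os.sched_setscheduler(pid, policy, os.sched_param(0))
    except OSError:
        return False
    return os.sched_getscheduler(pid) == policy

if __name__ == "__main__":
    print("running under SCHED_IDLE:", demote_to_idle())
```

A batch job could call this first and then start its heavy work, so the compile only soaks up cycles the player or game is not asking for. How well that holds up under real load is something you would have to measure on your own machine.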
This sort of crap really pisses me off.
I don't disagree that a story about the scheduler is potentially interesting, and that it's what Slashdot used to be about. But as you can see from the above comments, it's not exactly stuff that "nerds" these days care to Google when they don't understand what it's about. Personally, I started reading this site because I ran (and run) Linux, not the other way around. So I've got nothing against this particular story, even though most of the discussion is very poor, and, sadly, typical for this site.
That said, I do disagree that Slashdot has ever been a science site. There's a lot more to science than what interests the nerd crowds (the proper nerd crowds, not the Apple fanboys and gadget freaks, who could care less about how stuff works). Science, as it concerns Slashdot, is mainly science that concerns technology. Throw in a few dinosaurs and volcanoes for good measure. It's boys' stuff. Science fiction stuff. You won't find anything about the travel patterns of herring or anything that takes the social sciences seriously (plenty of people here would even claim "it's not science", demonstrating their own lack of insight into what the scientific method(s) constitute).
OMG people even have to steal comments these days?
I wrote this comment on kerneltrap.
Christ, that's incredible. Nice one "bconway".
Free Software games list and commentary
My read is different, but it'll take a few paragraphs to explain why, so please bear with me. The O(1) scheduler's so-called "improved interactivity" was just a hack upon a hack upon a hack that never worked. If you ever read the code you know what I'm talking about. It was a complete mess. The mess was the result of trying to reach a goal that is unattainable: very roughly speaking, Ingo (and others) believed that it is possible to deduce how "important" a process is, based on the frequency of the process' sleep events (not the duration of the sleep). I can point you to a few research papers that show that this can simply *not* be done. Briefly, it has been shown that for each "important" application you can find an "unimportant" application such that the CPU usage pattern of the two is very similar; hence, you would never be able to distinguish the important from the unimportant (if you are basing your decision only on CPU usage). All you can do is (1) divide the CPU equally, and (2) provide good response time to applications that sleep for long periods of time (like editors). According to several of his emails, Ingo is not reading scheduling papers, and proud of it.
I suspect Con doesn't read such papers either. But he knew this is the case nevertheless. For years Ingo and his followers attacked Con on this. Note that this is a purely technical issue, and Con was *completely* right, e.g. take a look at CFS, or at Linus's original scheduler that lasted until 2.2, or, Solaris, *BSD (with the exception of ULE), HPUX, etc. They are all much closer to the SD's philosophy than to the O(1)'s. (BTW, the only other OS that tried to do what Ingo wanted is... the Windows family).
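For anyone wondering what "divide the CPU equally" looks like in practice, here is a toy sketch in Python. It is an assumption-laden illustration, not code from CFS or SD: the function name, the weights and the 1 ms slice are all made up, and a heap stands in for whatever structure a real kernel uses. The idea is simply to always run the task that has received the least weighted CPU time so far, without guessing at importance from sleep patterns.

```python
import heapq

def simulate_fair(weights, slice_ms=1, total_ms=600):
    """Toy fair scheduler: always run the task with the least virtual runtime.

    weights: dict mapping task name -> positive weight (hypothetical values).
    Returns a dict mapping task name -> milliseconds of CPU received.
    """
    # Each heap entry is (virtual runtime, name); ties break on the name.
    heap = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(heap)
    cpu = {name: 0 for name in weights}
    for _ in range(total_ms // slice_ms):
        vruntime, name = heapq.heappop(heap)
        cpu[name] += slice_ms
        # A heavier weight makes virtual time advance more slowly,
        # so that task is picked proportionally more often.
        heapq.heappush(heap, (vruntime + slice_ms / weights[name], name))
    return cpu

if __name__ == "__main__":
    print(simulate_fair({"editor": 1, "compile": 1, "player": 1}))
```

With equal weights every task ends up with the same share, no matter how the tasks are named or ordered. Point (2) above, good response time for tasks that sleep a lot, is not modeled here; a sleeper would simply come back with a small virtual runtime and get picked quickly.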
The discussion between Con and others regarding this issue was not about minor details. The view of Ingo and his followers towards Con was plain and simple: "your design is crap, it doesn't do what schedulers are supposed to do". Con resisted. And I agree, after years in the business, he was occasionally emotional about it. But this is really understandable and reasonable. Especially when you compare it to the way Ingo and Linus express themselves from time to time (talk about "drama queens"). For example, notice how Linus talks to Ingo when Ingo doesn't understand something immediately.
The bottom line is that most of them are "drama queens" from time to time. Nobody is perfect. In absolute terms, however, I think Con usually expresses himself in a calm, quiet, and relatively humble manner. Certainly in comparison to Ingo/Linus.
Out of curiosity, were there ever any benchmarks between the two schedulers, or is their comparison completely theoretical? And why does it matter if Con can maintain it or not? Why couldn't Ingo maintain Con's scheduler? He's getting paid, right?