Slashdot Mirror


Debating the Linux Process Scheduler

An anonymous reader writes "The Linux 2.6.23 kernel is expected around the end of the month, and will be the first to include Ingo Molnar's much debated rewrite of the process scheduler called the Completely Fair Scheduler. In another Linux kernel mailing list thread one more developer is complaining about Molnar and his new code. However, according to KernelTrap a number of other Linux developers have stood up to defend Molnar and call into question the motives of the complaints. It will be interesting to see how the new processor really performs when the 2.6.23 kernel is released."

7 of 232 comments (clear)

  1. Can someone provide some insight? by Opportunist · · Score: 5, Interesting

    Is someone who does understand the differences able to explain, in non-kernel-developer terms, what the big differences will be for the average user, developer or administrator? I mean, I'd love to discuss it, but first of all I'd want to know what we're discussing.

    --
    We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    1. Re:Can someone provide some insight? by Jeff+DeMaagd · · Score: 5, Funny

      I'd love to discuss it, but first of all I'd want to know what we're discussing.

      That's not the Slashdot way. We're supposed to have an unfounded opinion based on insufficient facts and preexisting prejudices.

    2. Re:Can someone provide some insight? by Zephiris · · Score: 5, Interesting

      Essentially, the difference is how well processor resources are divided up, how evenly, and how big the pieces are each process or task gets. Most anyone who has used Linux has had the dreaded moment where you're trying to multitask a bit, and are compiling a program while listening to music, or waching a video, and then...that's terrific, video frames are dropped, or the audio skips. Even if intermitant, it's quite annoying, at the very least. The 'Completely Fair Scheduler' is an attempt to have more fair, sane, and generally less complex scheduling. This also happens to reduces the worst case latencies, averaging from (at least on the tests with my computer) 120+ms on vanilla 2.6.22 scheduler, to ~2.6ms with CFS.

      It's largely a drastic improvement over the old scheduling mechanisms that Linux has relied on, although other OSes have largely worked through such problems some time ago.

      While it's not exactly THE most scientific, I had a few rounds of testing over which did better on load vs. things still behaving exactly the way they should. I ran all of them with audio playing through KDE artsd, video player, glxgears, etc, loaded, plus inducing a CPU load via 'stress'. Linux, even with CFS, it's still fairly easy to 'upset' it by just producing a fairly large (2-4) amount of load. Solaris did notably better. While it seemed to have a few quirks with scheduling in general, it could sustain a load of around 8-12 without producing video/audio frame drops. FreeBSD, with the experimental SCHED_ULE 2.0 scheduler (as of March 2007) could sustain a load of over 80 with no problems, frame drops, or even glxgears slowing down to a complete crawl (although you wouldn't want to especially use OpenGL at such, it was still getting the speed of software glxgears), and even at 120+ load, the mouse wouldn't respond, while everything else kept going fine. This seems purely useless, but it really comes in handy if trying to do one or more KDE compiles while watching video, on Linux, this tends to be prevented. For the uninitiated, load averages like that are basically a multiplier vs. how much actual work your computer can do in real time. Eg, a 0.5 load would mean you're doing 50% of what you could in realtime. A 2.0 load means you're trying to handle twice what you can do in realtime, it is weighted against how many processors you have (I have one), but other things like disk access can also contribute to the load average, depending on OS.

      So, longer story short, a superior CPU scheduler can make a world of difference in how things behave when your system's something else with the CPU(s) at the same time.

      --

      "A Goddess rarely smiles for she is forced by others to be an island unto herself." - Zephiris
    3. Re:Can someone provide some insight? by ciroknight · · Score: 5, Informative

      Kolivas apparently publicly announced his decision to stop working on the kernel, which would include the current scheduler. That means finding another maintainer for his code, should any problems surface. If you've got 2 pieces of code that test the same in speed (as they do according to some), and 1 has a dev that's willing to keep working on it, and the other doesn't... Which would you pick?

      Wow, not even a full year has past and we're already getting revisionist historians trying to change the situation.

      Kolivas quit because of the scheduler debacle, because nobody would listen to Kolivas but were apt to follow Linus and his cronie Ingo around when they drum up more-or-less the exact same thing. Instead of critically listening to Kolivas' points, Linus and Ingo attacked Kolivas' merits. Under that kind of personal attack, I couldn't say I wouldn't have quit just to shut them up. Not all of us are stubborn mules and jackasses.

      --
      "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
    4. Re:Can someone provide some insight? by SEAL · · Score: 5, Interesting

      Whenever Linux or another kernel developer would bring up a point of failure in Kolvias's scheduler instead of Fixing the problem Kolvias would lash out and say it wasn't broken.

      CFS won not because it was a better scheduler at the time, but because Inglo worked with the developers to make it better, instead of fighting everyone who questioned anything about it.


      I believe Linus was the claiming Ingo worked better with the developers (or at least I saw him write to that effect on the LKML). By contrast, Kolivas had many individuals helping out with his branch who were quite pleased with his progress.

      The problem is that Kolivas was working to help desktop, and particularly 3D game users. He'd say it wasn't broken if it was optimizing for those particular platforms at a cost of a fraction of a percent Oracle performance. Most of the kernel devs don't care 2 cents about 3D or desktop users, and many are employed by large enterprise businesses, so you can see who won.

      Personally I think Kolivas should've been given full access to merge his code. Many of us found his work very useful and it's sad to see him driven away by a few people who are oblivious to what everyday users need. Even if his scheduler was not the default one, it still would've been nice to have as a mainline kernel option.

      - SEAL

    5. Re:Can someone provide some insight? by GooberToo · · Score: 5, Interesting

      Average? Probably nothing.

      I don't think that's really fair. Average these days actually has a large spread. Average web user? Average game player? Average video watcher? Average musician? Averaging what? And that's the point of the CFS. Each of those users have different expectations and each reflect a different type of scheduler workload. The old scheduler has lots and lots of code to handle various performance problems for corner cases for each type of "average user" depicted above. This in turn means the code is hard to read, hard to understand, and even harder to maintain. Worse, some code which help some corner cases actually make things worse for others. This in turn means more special case coding and even more complex testing.

      The CFS is designed to simplify most everything the O(1) schedule does without tons of complex heuristics and special code to address various corner cases. In other words, in stead of trying to guess what you really want, the CFS simply tries to be as fair as possible for everyone, thusly ensuring many categories of corner cases are simply no longer an issue with the CFS.

      This does not mean CFS is always better than O(1), but early results indicate they are traveling down a good road. In fact, the last round of patches I read about, actually establish CFS ~12% faster then the old O(1) schedule. Of course, it's yet to be seen how well it will be received and how quickly it will prove it self with a large user base. The devil is in the details with schedulers and sometimes real world user workloads don't reflect anything which has been tested/validating during development.

      Nonetheless, development on CFS continues and it certainly is providing hope it will be smaller, faster, provide lower latency, be easier to maintain, and address a wider range of workloads, including many corner cases, without requiring complex code to track and analyze heuristics.

      Long story short, yes, if all goes well, even the average user may notice an improvement depending on the particulars of their workload. Wouldn't you notice a more responsive desktop? ;)

  2. Re:Ugh bring back 2.7 please by luciofm · · Score: 5, Informative

    There was the 2.4 schuduler, the old O(1) 2.6 scheduler and now the new 2.6 CFS scheduler...
    This doenst seem to me to be ripped every 6 months, unless the 2.6 tree is just about 6 months older...