Slashdot Mirror


Debating the Linux Process Scheduler

An anonymous reader writes "The Linux 2.6.23 kernel is expected around the end of the month, and will be the first to include Ingo Molnar's much debated rewrite of the process scheduler called the Completely Fair Scheduler. In another Linux kernel mailing list thread one more developer is complaining about Molnar and his new code. However, according to KernelTrap a number of other Linux developers have stood up to defend Molnar and call into question the motives of the complaints. It will be interesting to see how the new processor really performs when the 2.6.23 kernel is released."

19 of 232 comments (clear)

  1. Can someone provide some insight? by Opportunist · · Score: 5, Interesting

    Is someone who does understand the differences able to explain, in non-kernel-developer terms, what the big differences will be for the average user, developer or administrator? I mean, I'd love to discuss it, but first of all I'd want to know what we're discussing.

    --
    We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    1. Re:Can someone provide some insight? by Jeff+DeMaagd · · Score: 5, Funny

      I'd love to discuss it, but first of all I'd want to know what we're discussing.

      That's not the Slashdot way. We're supposed to have an unfounded opinion based on insufficient facts and preexisting prejudices.

    2. Re:Can someone provide some insight? by Aladrin · · Score: 4, Informative

      Average? Probably nothing. But for devs/admins that are worried about certain processes taking more time than others, it -should- be more fair and keep things running smoother.

      It's possible for programs right now to exploit how the current schedule dishes out time. As far as I know, they currently only do so out of ignorance, rather than malice. The new scheduler just corrects the problem.

      It's not something a user can really see unless they know exactly what they are looking for, and unless a dev/admin has a program that's behaving unfairly, it's not really going to matter to them, either.

      There is another invisible effect as well... Kolivas apparently publicly announced his decision to stop working on the kernel, which would include the current scheduler. That means finding another maintainer for his code, should any problems surface. If you've got 2 pieces of code that test the same in speed (as they do according to some), and 1 has a dev that's willing to keep working on it, and the other doesn't... Which would you pick?

      The new code also has the added advantage of being a really really neat idea, which encourages people to work on it as well.

      --
      "If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
    3. Re:Can someone provide some insight? by Zephiris · · Score: 5, Interesting

      Essentially, the difference is how well processor resources are divided up, how evenly, and how big the pieces are each process or task gets. Most anyone who has used Linux has had the dreaded moment where you're trying to multitask a bit, and are compiling a program while listening to music, or waching a video, and then...that's terrific, video frames are dropped, or the audio skips. Even if intermitant, it's quite annoying, at the very least. The 'Completely Fair Scheduler' is an attempt to have more fair, sane, and generally less complex scheduling. This also happens to reduces the worst case latencies, averaging from (at least on the tests with my computer) 120+ms on vanilla 2.6.22 scheduler, to ~2.6ms with CFS.

      It's largely a drastic improvement over the old scheduling mechanisms that Linux has relied on, although other OSes have largely worked through such problems some time ago.

      While it's not exactly THE most scientific, I had a few rounds of testing over which did better on load vs. things still behaving exactly the way they should. I ran all of them with audio playing through KDE artsd, video player, glxgears, etc, loaded, plus inducing a CPU load via 'stress'. Linux, even with CFS, it's still fairly easy to 'upset' it by just producing a fairly large (2-4) amount of load. Solaris did notably better. While it seemed to have a few quirks with scheduling in general, it could sustain a load of around 8-12 without producing video/audio frame drops. FreeBSD, with the experimental SCHED_ULE 2.0 scheduler (as of March 2007) could sustain a load of over 80 with no problems, frame drops, or even glxgears slowing down to a complete crawl (although you wouldn't want to especially use OpenGL at such, it was still getting the speed of software glxgears), and even at 120+ load, the mouse wouldn't respond, while everything else kept going fine. This seems purely useless, but it really comes in handy if trying to do one or more KDE compiles while watching video, on Linux, this tends to be prevented. For the uninitiated, load averages like that are basically a multiplier vs. how much actual work your computer can do in real time. Eg, a 0.5 load would mean you're doing 50% of what you could in realtime. A 2.0 load means you're trying to handle twice what you can do in realtime, it is weighted against how many processors you have (I have one), but other things like disk access can also contribute to the load average, depending on OS.

      So, longer story short, a superior CPU scheduler can make a world of difference in how things behave when your system's something else with the CPU(s) at the same time.

      --

      "A Goddess rarely smiles for she is forced by others to be an island unto herself." - Zephiris
    4. Re:Can someone provide some insight? by Opportunist · · Score: 4, Insightful

      That sounds sensible. With the increase of Linux boxes run by "average" people (who will not directly notice a difference), the threat of malware for Linux is going to be on the rise, too. And those people usually know how to exploit even the smallest flaws in a system.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    5. Re:Can someone provide some insight? by ciroknight · · Score: 5, Informative

      Kolivas apparently publicly announced his decision to stop working on the kernel, which would include the current scheduler. That means finding another maintainer for his code, should any problems surface. If you've got 2 pieces of code that test the same in speed (as they do according to some), and 1 has a dev that's willing to keep working on it, and the other doesn't... Which would you pick?

      Wow, not even a full year has past and we're already getting revisionist historians trying to change the situation.

      Kolivas quit because of the scheduler debacle, because nobody would listen to Kolivas but were apt to follow Linus and his cronie Ingo around when they drum up more-or-less the exact same thing. Instead of critically listening to Kolivas' points, Linus and Ingo attacked Kolivas' merits. Under that kind of personal attack, I couldn't say I wouldn't have quit just to shut them up. Not all of us are stubborn mules and jackasses.

      --
      "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
    6. Re:Can someone provide some insight? by mamer · · Score: 4, Interesting

      Another question: I've been waiting for OpenMP support in gcc, it seems to be coming soon. In the meantime, I tried to parallelize my code by hard coding my threads using pthreads. I tested it by running several matrix-matrix multiplications "in parallel" on a multi-core CPU, and what I've got was all threads running on the same processor. Only after increasing my number of threads to a couple of dozens, I get some of them to run on the second processor. So basically, I am not getting any performance gain. I asked a number of people an they tell me "this is an old problem, basically the Linux kernel scheduler is stupid and nobody has bothered to fix it". Now, is Ingo's new scheduler fixing this? if gcc-openmp relies on the kernel scheduler, should we expect that open-mp will basically work-but-not-really-work on shared memory multi processor machines? I think this is an important issue to address, especially in an era where high-performace computing has become the driving force behind the hardware. BTW, how are otehr commercial compilers overcoming this scheduling problem?

    7. Re:Can someone provide some insight? by TheRaven64 · · Score: 4, Interesting

      Well damn. That's just... embarassing It's more embarrassing than the grandparent states, actually. ULE 2 has been improved a lot in the last few months, and the newer ULE 3 performs a lot better, particularly on SMP systems. From what I've seen, the new Linux scheduler has roughly the same shape (and size) performance curves on 1-8 CPU systems as the old 4BSD scheduler that ULE replaces.

      Why's everyone using linux if it sucks so much? Because Linux sounds cool, while BSD sounds geeky. I was recently reminded of how badly Linux sucks when I went over some old code I'd written to get the CPU name and speed. The FreeBSD and OpenBSD implementations of this code each called a single sysctl for each result. The Linux version had to read /proc/cpuinfo and parse it. Actually, it had to parse it in two different ways, because it turns out the format of cpuinfo is different on x86 and all other platforms. Reading the battery life was even worse. On FreeBSD it's just a matter of reading the hw.acpi.battery.life sysctl (one line of code). For Linux, it involves parsing some messy procfs stuff with a format that has a habit of changing between releases. I don't understand how a developer could prefer Linux to any other UNIX.
      --
      I am TheRaven on Soylent News
    8. Re:Can someone provide some insight? by peragrin · · Score: 4, Informative

      What goes around comes around.

      Revisionist history is working both ways I see. Whenever Linux or another kernel developer would bring up a point of failure in Kolvias's scheduler instead of Fixing the problem Kolvias would lash out and say it wasn't broken.

      CFS won not because it was a better scheduler at the time, but because Inglo worked with the developers to make it better, instead of fighting everyone who questioned anything about it. FOSS projects are about helping everyone, and listening to new Ideas. Something Kolvias was having a hard time doing.

      That is at least how i read the whole debate.

      --
      i thought once I was found, but it was only a dream.
    9. Re:Can someone provide some insight? by SEAL · · Score: 5, Interesting

      Whenever Linux or another kernel developer would bring up a point of failure in Kolvias's scheduler instead of Fixing the problem Kolvias would lash out and say it wasn't broken.

      CFS won not because it was a better scheduler at the time, but because Inglo worked with the developers to make it better, instead of fighting everyone who questioned anything about it.


      I believe Linus was the claiming Ingo worked better with the developers (or at least I saw him write to that effect on the LKML). By contrast, Kolivas had many individuals helping out with his branch who were quite pleased with his progress.

      The problem is that Kolivas was working to help desktop, and particularly 3D game users. He'd say it wasn't broken if it was optimizing for those particular platforms at a cost of a fraction of a percent Oracle performance. Most of the kernel devs don't care 2 cents about 3D or desktop users, and many are employed by large enterprise businesses, so you can see who won.

      Personally I think Kolivas should've been given full access to merge his code. Many of us found his work very useful and it's sad to see him driven away by a few people who are oblivious to what everyday users need. Even if his scheduler was not the default one, it still would've been nice to have as a mainline kernel option.

      - SEAL

    10. Re:Can someone provide some insight? by GooberToo · · Score: 5, Interesting

      Average? Probably nothing.

      I don't think that's really fair. Average these days actually has a large spread. Average web user? Average game player? Average video watcher? Average musician? Averaging what? And that's the point of the CFS. Each of those users have different expectations and each reflect a different type of scheduler workload. The old scheduler has lots and lots of code to handle various performance problems for corner cases for each type of "average user" depicted above. This in turn means the code is hard to read, hard to understand, and even harder to maintain. Worse, some code which help some corner cases actually make things worse for others. This in turn means more special case coding and even more complex testing.

      The CFS is designed to simplify most everything the O(1) schedule does without tons of complex heuristics and special code to address various corner cases. In other words, in stead of trying to guess what you really want, the CFS simply tries to be as fair as possible for everyone, thusly ensuring many categories of corner cases are simply no longer an issue with the CFS.

      This does not mean CFS is always better than O(1), but early results indicate they are traveling down a good road. In fact, the last round of patches I read about, actually establish CFS ~12% faster then the old O(1) schedule. Of course, it's yet to be seen how well it will be received and how quickly it will prove it self with a large user base. The devil is in the details with schedulers and sometimes real world user workloads don't reflect anything which has been tested/validating during development.

      Nonetheless, development on CFS continues and it certainly is providing hope it will be smaller, faster, provide lower latency, be easier to maintain, and address a wider range of workloads, including many corner cases, without requiring complex code to track and analyze heuristics.

      Long story short, yes, if all goes well, even the average user may notice an improvement depending on the particulars of their workload. Wouldn't you notice a more responsive desktop? ;)

    11. Re:Can someone provide some insight? by rbanffy · · Score: 4, Interesting

      "The problem is that Kolivas was working to help desktop, and particularly 3D game users."

      I don't know for sure, but I think it should be trivial to have a kernel switch activated on boot to set the preferred scheduler. This way, 3D gamers would be happy to set the -ks (I fail to remember its correct name) while the rest of us would be happy to leave it alone and get the CFS. Maybe by having a modular scheduler architecture would allow to have kernels with the -openvz or -oraclesbest or whatever other schedulers one may want (and could get support from the kernel developers).

      A slightly more clever approach would be to let one switch schedulers on the fly, but I think it's asking too much.

    12. Re:Can someone provide some insight? by AnyoneEB · · Score: 4, Informative

      You are right. It would be easy. In fact, someone wrote a patch for it called plugsched. It was not accepted into the kernel due to the fact that it would supposedly discourage the idea of simply making a scheduler which worked well for everyone.

      --
      Centralization breaks the internet.
  2. Re:Snooze by Reason58 · · Score: 4, Funny

    Seriously has Slashdot got such a boner for Linux that they'll post absolutely every fucking little boring thing they find? You must be new here.
  3. Re:In a blind taste test.... by MyLongNickName · · Score: 4, Interesting

    As a windows user who has very little experience with Windows: This is one of the strengths of open source. if you have a large enough base of contributors, these "little" details are brought out into the open, and you can really understand how things work. I've read a bit on the subject, and it is interesting to see the different approaches that can be taken to something that most of us do not even think about.

    With Windows, how does this work? I will never know for sure. if MS doesn't choose to make it known, it isn't known. If they choose to make it known, then I just have to trust they are telling the truth (Windows Update anyone).

    With a project like this, you are much more likely to get the best approach to the situation.

    --
    See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
  4. Esoteric Discussions by Archangel+Michael · · Score: 4, Insightful

    More importantly, if there are more than one Scheduler, and if someone could tell the difference, why isn't s/he using the ALTERNATE Scheduler and compiling their own custom, tweaked and totally tuned kernel?

    Seriously, most people aren't going to notice, and those that do notice, ought to be able to compile their own kernel, and ought to do exactly that. This is nothing short of an esoteric discussion and shouldn't extend beyond kernel developers. Most people don't know, and don't care which scheduler is implemented.

    I'm one of those somewhere between caring and not. I only care about the supposed differences in approach to scheduling, and quite frankly, from what little I understand, the various schemes to scheduling have their advantages and disadvantages. I seriously doubt that ONE is better in all circumstances compared to all the others.

    --
    Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.
  5. Ugh bring back 2.7 please by maelstrom · · Score: 4, Insightful

    I know they've changed the model of development for the kernel, but how many new schedulers have we gone through between 2.4 and 2.6 now? Maybe it is just me, but the scheduler seems like a pretty important piece of the kernel.... Ripping it out every 6 months and calling it "stable" seems a bit off to me.

    Oh well. I guess I'm just getting cranky in my "old" age.

    --
    The more you know, the less you understand.
    1. Re:Ugh bring back 2.7 please by luciofm · · Score: 5, Informative

      There was the 2.4 schuduler, the old O(1) 2.6 scheduler and now the new 2.6 CFS scheduler...
      This doenst seem to me to be ripped every 6 months, unless the 2.6 tree is just about 6 months older...

  6. Re:more important things by Hatta · · Score: 4, Insightful

    There isn't much more important work for a kernel than improving performance under load. They could probably do better by focusing on I/O scheduling than CPU scheduling. My CPU spends an inordinate amount of time waiting for IO. But kernel performance is of the utmost importance to kernel developers. What does it matter if it runs on your hardware if it doesn't run well?

    --
    Give me Classic Slashdot or give me death!