Debating the Linux Process Scheduler
An anonymous reader writes "The Linux 2.6.23 kernel is expected around the end of the month, and will be the first to include Ingo Molnar's much debated rewrite of the process scheduler called the Completely Fair Scheduler. In another Linux kernel mailing list thread one more developer is complaining about Molnar and his new code. However, according to KernelTrap a number of other Linux developers have stood up to defend Molnar and call into question the motives of the complaints. It will be interesting to see how the new scheduler really performs when the 2.6.23 kernel is released."
Average? Probably nothing. But for devs/admins that are worried about certain processes taking more time than others, it -should- be more fair and keep things running smoother.
It's possible for programs right now to exploit how the current scheduler dishes out time. As far as I know, they currently only do so out of ignorance, rather than malice. The new scheduler just corrects the problem.
It's not something a user can really see unless they know exactly what they are looking for, and unless a dev/admin has a program that's behaving unfairly, it's not really going to matter to them, either.
There is another invisible effect as well... Kolivas apparently publicly announced his decision to stop working on the kernel, which would include the current scheduler. That means finding another maintainer for his code, should any problems surface. If you've got 2 pieces of code that test the same in speed (as they do according to some), and 1 has a dev that's willing to keep working on it, and the other doesn't... Which would you pick?
The new code also has the added advantage of being a really really neat idea, which encourages people to work on it as well.
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
Have you considered the fact that different people work on different areas of the kernel? The people that work on the scheduler generally have very little to do with hardware drivers.
Each process running on Linux has a "niceness" value which you as the user can set. The value indicates which ones you want to have more access to CPU power. The numbers range from 19, meaning roughly "only use the CPU when no one else needs it", to -20 meaning "all your CPU are belong to it".
The new scheduler will make those values behave more like they're supposed to relative to one another, and hopefully use fewer resources for itself in doing so.
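For anyone who wants to poke at niceness directly, here is a minimal Python sketch (Unix only). It reads the current process's nice value and then raises it by a few steps, which an ordinary user is allowed to do; lowering it generally needs root. The helper name `be_nicer` is made up for illustration, and nothing here is specific to the new scheduler.

```python
import os

def be_nicer(increment=5):
    """Raise this process's nice value; return (before, after)."""
    # who=0 means "the calling process"
    before = os.getpriority(os.PRIO_PROCESS, 0)
    # os.nice() adds the increment and returns the resulting nice value;
    # the kernel caps the result at 19
    after = os.nice(increment)
    return before, after

if __name__ == "__main__":
    before, after = be_nicer(5)
    print(f"nice went from {before} to {after}")
```

From a shell, the `nice` and `renice` commands do the same job for other programs.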
25% Funny, 25% Insightful, 25% Informative, 25% Troll
If there is more than one scheduler, and if someone can tell the difference, why aren't they using the ALTERNATE scheduler and compiling their own custom, tweaked and totally tuned kernel?
Someone (Con Kolivas?) suggested a "pluggable" scheduler API. I think this was even backed up by patches to provide this functionality. Linus Torvalds rejected the proposal - I think he said that the benefits would be outweighed by the need to maintain multiple schedulers. My opinion is that the kernel could have included a single scheduler in the kernel.org tarballs, but by providing the pluggable API it would lower the bar for those who wish to develop or play with different schedulers.
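To make the "pluggable" idea concrete, here is a toy sketch in Python. It is not the plugsched patch or anything like the real kernel interface; the `Scheduler` class, `pick_next`, and `simulate` are all hypothetical names invented for this example. The point is only that the core loop talks to one small interface, so different policies can be swapped in and compared on the same workload.

```python
from abc import ABC, abstractmethod

class Scheduler(ABC):
    """Hypothetical plug-in interface: the core only ever calls pick_next()."""

    @abstractmethod
    def pick_next(self, runtimes):
        """Given {task_name: cpu_ticks_used}, return the task to run next."""

class RoundRobin(Scheduler):
    """Hand out ticks in a fixed rotation, ignoring past usage."""

    def __init__(self):
        self._turn = 0

    def pick_next(self, runtimes):
        names = sorted(runtimes)
        choice = names[self._turn % len(names)]
        self._turn += 1
        return choice

class LeastRuntimeFirst(Scheduler):
    """Always run whichever task has used the least CPU so far."""

    def pick_next(self, runtimes):
        # sorted() makes tie-breaking deterministic (alphabetical)
        return min(sorted(runtimes), key=runtimes.get)

def simulate(scheduler, tasks, ticks):
    """Core loop: knows nothing about the policy, only the interface."""
    runtimes = {name: 0 for name in tasks}
    for _ in range(ticks):
        runtimes[scheduler.pick_next(runtimes)] += 1
    return runtimes

if __name__ == "__main__":
    for policy in (RoundRobin(), LeastRuntimeFirst()):
        print(type(policy).__name__, simulate(policy, ["a", "b", "c"], 9))
```

Swapping the policy object is all it takes to benchmark one scheduler against another, which is roughly the argument made for the pluggable API.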
The standard scheduler, without those patches, is just about completely useless for realtime audio recording and editing, even with nothing more than the necessary apps, JACK, X, a lightweight window manager (openbox), HAL, syslogd, anacron, and 6 gettys running. Even taking anacron out of the situation didn't help.
There was the 2.4 scheduler, the old O(1) 2.6 scheduler and now the new 2.6 CFS scheduler...
That doesn't look to me like being ripped out every 6 months, unless the 2.6 tree is only about 6 months old...
I know it's not easy getting info on wireless chips, but time would be better spent working on something like that.
I'll ignore for a moment the fact that you're essentially making the same argument as "Why aren't all scientists (from solid-state physicists to cognitive neuroscientists) working on a cure for cancer instead of [perceived frivolous research in the news]?" You're ignoring the different kinds of expertise that go into a complex field of work like kernel development.
Instead, I'm just going to focus on your assertion that support for a few more wireless chipsets than the abundant choices we have today is more important than fixing problems in the most central and fundamental task of the kernel -- a task that even the most minimalist microkernels consider necessary to put into the microkernel.
This is simply hogwash. Scheduling affects every single part of the system, and it's a major factor in the perceived and real performance of a system. Fixes to the scheduler will affect how a user enjoys their system over the entire life of the system whereas a missing wireless driver affects them once -- at purchase time.
Furthermore, not all Linux systems have wireless networking. Adding more wireless drivers is going to be useless in nearly all server and most embedded uses. You seem to be under the mistaken impression that the purpose of Linux is to provide a good desktop or laptop experience. There are considerably more application domains that Linux operates in.
And frankly...
Just look at all the live CDs out there and how many can connect to wifi? Ubuntu and not much else.
This is not the kernel developers' problem. They've provided the functionality as evidenced by the fact that Ubuntu can do it. This is up to the distro developers to work on. Again, you make the mistake of assuming that all developers are equal and interchangeable and that they all have the same responsibilities in bringing the product to you, the unpaying customer.
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
Kolivas apparently publicly announced his decision to stop working on the kernel, which would include the current scheduler. That means finding another maintainer for his code, should any problems surface. If you've got 2 pieces of code that test the same in speed (as they do according to some), and 1 has a dev that's willing to keep working on it, and the other doesn't... Which would you pick?
Wow, not even a full year has passed and we're already getting revisionist historians trying to change the situation.
Kolivas quit because of the scheduler debacle: nobody would listen to Kolivas, yet they were apt to follow Linus and his crony Ingo around when those two drummed up more or less the exact same thing. Instead of critically listening to Kolivas' points, Linus and Ingo attacked Kolivas' merits. Under that kind of personal attack, I couldn't say I wouldn't have quit just to shut them up. Not all of us are stubborn mules and jackasses.
"Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
Average user? Multimedia tasks will not skip or stutter while the system is under load. The opposite of Vista's network performance taking a nose dive while playing MP3s, Linux systems with the new scheduler will see little/no impact from background/normal operations on their gaming, music, and video. Your mouse won't skip around while the system is loaded, and responsiveness will remain high except in situations involving super-heavy I/O usage (I/O starvation is more difficult to solve than CPU starvation).
It actually makes a substantial difference, and the system is much more fun to use.
There are some informal test results (LKML) from KernelTrap:
here's an update: checking whether Wine could be a factor in your
problem i just tested latest CFS against latest SD with a 3D game
running under Wine: v2.6.22-ck1 versus v2.6.22-cfsv19 (to get the
most comparable kernel), using Quake 3 Arena Demo under Wine (0.9.41).
Here are the results in a pretty graph:
http://people.redhat.com/mingo/misc/cfs-vs-sd-wine-quake.jpg
or, in text:

Quake3-under-Wine behavior under SD/-ck: framerate breaks down massively
during any kind of load. The game is completely unusable with 1 CPU loop
running already!
Quake3-under-Wine behavior under CFS: framerate goes down gently with
load, gameplay remains smooth. Framerate is still pretty acceptable and
the game is playable even with a 500% CPU overload. The graph looks good
and the framerate reduction goes roughly along the expected 1/n
'fairness curve' - so it all looks pretty healthy. [Note: quake3 keeps
its fully 41 fps even with 1 competing loop running on the CPU due to
"sleeper fairness".]
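To put rough numbers on the "1/n fairness curve" mentioned in that mail, here is a back-of-the-envelope sketch. It assumes every runnable task is purely CPU-bound and gets an equal slice of one CPU, and that framerate scales linearly with CPU share. Neither assumption comes from the mail (which notes the game actually does better than this with one competing loop, thanks to sleeper fairness), and the function name is made up.

```python
def ideal_fair_fps(baseline_fps, runnable_tasks):
    """Framerate if the game gets exactly 1/n of the CPU.

    runnable_tasks counts the game itself plus every competing loop.
    This is the idealized curve only; a game that sleeps part of the
    time can stay above it.
    """
    if runnable_tasks < 1:
        raise ValueError("need at least the game itself")
    return baseline_fps / runnable_tasks

if __name__ == "__main__":
    baseline = 41.0  # unloaded framerate quoted in the mail
    for loops in range(6):
        fps = ideal_fair_fps(baseline, loops + 1)
        print(f"{loops} competing loops: {fps:.1f} fps")
```

A scheduler that tracks this curve degrades gently as load is added, which is the behavior the mail describes for CFS.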
WhiteWolf666 an exBush supporter. All you new-school,compassionate,save the children Republicans can rot in hell
What goes around comes around.
Revisionist history is working both ways, I see. Whenever Linus or another kernel developer would bring up a point of failure in Kolivas's scheduler, instead of fixing the problem Kolivas would lash out and say it wasn't broken.
CFS won not because it was a better scheduler at the time, but because Ingo worked with the developers to make it better, instead of fighting everyone who questioned anything about it. FOSS projects are about helping everyone and listening to new ideas, something Kolivas was having a hard time doing.
That is at least how I read the whole debate.
i thought once I was found, but it was only a dream.
Roman is a long time kernel dev and is the maintainer of AFFS, HFS, M68K and kconfig. He's hardly new to the scene. New to the scheduler code, perhaps.
All of this started with Roman doing a code review of CFS a month or two ago. Roman asked some questions to clarify what certain parts of the code were doing, and Ingo asked Roman to provide more info so he could see where CFS was falling short on Roman's test cases. Both sides kept talking past each other. Eventually, Roman got frustrated and provided a new scheduler/patch with some mathematical proofs behind it to make the scheduling better. Roman and Ingo continued to talk past each other. Ingo picked up some of Roman's patches. Roman feels slighted, as if he's being ignored, but he was just as guilty of ignoring Ingo (and other CFS devs) along the way. Factor in the bad taste left in people's mouths from earlier this year with Con, and the lines were already drawn before Roman got involved.
It's a clash of egos all around and nothing atypical for a large open source project.
Don't leave your mind so open that your brain falls out. Don't close it so much that you cut off the blood.
The discussion is not as bad as it sounds (almost normal for LKML!), it's just that Roman wants to talk about the maths and Ingo works with patches... as Willy Tarreau pointed out "I know for sure that the common language here on LKML is patches".
Beyond the heated discussion with Roman Zippel, there are still a few workloads which can trigger regressions, one of which I found running some unit tests.
This is covered in this thread, and although there is now a version of CFS which does not exhibit the problem (see graph of combo3-yield patch) it is not the one that is meant to be merged in 2.6.23 (these patches are 2.6.24 material) so Ingo is getting me to test patches until this regression can be solved.
One slightly annoying thing is that the current fix involves using sysctl to switch back (at least partially) to the old scheduler mechanism!
TODO: 753) write sig.
As I started reading the comments on here I noticed that many were quick to put Ingo down for his transgressions, and it's quite obvious from the comments that no one has bothered to read the exchange on LKML in order to become familiar with what is going on. I have read it, I have 0 bias for either Zippel or Molnar and I can say without any reservation that Zippel is a wank and Molnar is borderline saintly.
A recap of what I have read and understood about the entire situation:
Ultimately I think Zippel is purposefully trying to provoke Molnar throughout all of this. His wild accusations are nothing more than games that he is playing. The guy has a chip on his shoulder, and if Linux were my toy, I would have blocked him from the mailing lists.
You are right. It would be easy. In fact, someone wrote a patch for it called plugsched. It was not accepted into the kernel due to the fact that it would supposedly discourage the idea of simply making a scheduler which worked well for everyone.
Centralization breaks the internet.
You shouldn't say "someone did", it was Kolivas himself that first offered the pluggable scheduler patch so that his patch could be used alongside any new future schedulers and offer a concrete way to benchmark the changes caused by scheduling. And this was done years ago, circa 2004: http://ck.kolivas.org/patches/plugsched/
Of course, Linus and Ingo rejected those patches as well.
Boffoonery - downloadable Comedy Benefit for Bletchley Park