The Completely Fair Scheduler
hichetu writes "Kernel trap has a nice summary of what is going on behind the scenes to change the Linux Scheduler. The O(1) Linux scheduler is going to be changed so that it is fair to interactive tasks. You will be surprised to know that O(1) is really too good not to have any side-effects on fairness to all tasks."
I thought Linux used Cron as a scheduler ?
Obviously O(1) is a bad choice. Nice to see Linux learning something from Windows here.
If scheduling was completely fair, this would have been a frist ps0t.
Free software: because some processes are more equal than others.
Life is not fair and neither are kernels.
History has shown that the truly successful kernels use a form of Darwinian scheduling that assign more processor time to the most productive tasks along with light-weight memory protection to act as a safety net for tasks that fall through the cracks.
You will be surprised to know that O(1) is really too good not to have any side-effects on fairness to all tasks.
No I won't, because I don't know what the hell it means.
Hah! In your face, Taco!
Will the new scheduler be more efficient at scheduling single processes across a multi-core system?
On another not[e], when are we going to be able to build a default kernel capable of low latency I/O for A/V work?
Linux really doesn't need a new process scheduler. What it could really do with is I/O prioritisation. Windows now has it, so there's no excuse. CPU power is fairly abundant these days so managing its usage is less of an issue than it used to be, but I/O bandwidth is often in short supply and I/O-bound applications can choke a system and make interactive processes a pain in the ass to use. I'd like to see some way of reserving and limiting bandwidth to particular devices for particular processes. And an equivalent of "top" for monitoring processes' I/O activity would also be extremely handy... as far as I know, the system calls don't even exist in the kernel to do this yet.
This new scheduler may have 'Ingo Molnar' written all over it but I'll be giving the credit to ck!
It's a very interesting phenomenon.
After enough number of iterations trying to optimize a software program to do everything very well compared to the base "naive" solution, you end up with an OS that does everything poorly.
It's counterintuitive, but we see it it every day around us.
Doing a constant time reevaluation for n processes is still O(n) whether you do it individually or at once. Granted you don't have a pause as all processes are reevaluated but the amount of work the scheduler does is NOT independent of the # of processes.
I would love to know how the algorithm actually works. Is this just a new data structure for organizing the processes, or is there other stuff involved?
Inventions have long since reached their limit, and I see no hope for further development.-- Frontinus, 1st cent. AD
On Multics, tasks that used up their CPU were put in a lower priority run queue and tasks that didn't use up their CPU time were put into a higher priority run queue.
So interactive tasks naturally floated to the top and compute bound tasks naturally sank.
(Anyone remember Multics?)
I want a nice-like utility that works for more than just CPU time. I want to be able to do hddnice -5 firefox and networknice -7 pr0ndownloader. Why can't the principles of nice work for various forms of I/O scheduling as well as CPU scheduling? Does anyone know of a Unix that has these sorts of things. I just hate it when my system is unresponsive because some process, which is niced into all get out, is swapping like crazy, and is allowed to do so because "it's not using CPU time". Bah!
Because it doesn't.
We want small, simple and efficient. Without comparing the respective patch sets I believe Ingo had valid concerns about fragmentation. There's nothing personal in it, sometimes you have to stand back from a proposed solution to accurately assess the problem.
Another developer could work on Tux and rewrite or extend it by having a fresh perspective and identifying an area where Ingos code could be improved. It's just the way it goes and in the end it's not bruised egos but technical merit that matters. Let's see how the code develops.
Would have modded up myself if I had the points to do it.
FTA:
>> [...] the core scheduler got smaller by more than 700 lines:
Someone who I respect tremendously once said
"Perfection is attained when you remove everything that is not essential"
I have to agree
I don't know the meaning of the word 'don't' - J
GP: Can't we just give the processes weapons and let them decide which follows?
P: That is actually the kind of question that my Operations Research professor (who also did a lot of work in CPU simulation and performance estimating) used to throw onto final exams as the "separate the B+ from the A" question. If your answer was interesting enough he would send you over to one of his Masters candidates to see if it could be taken any further. So I wouldn't count your suggestion out from the start!
Behold: The Mother of All Possible Comp Sci Flame Wars: The Darwinistically Selected Genetic Algorithms -vs- the Intelligently Designed Algorithms.
Bumper Stickers $4.95; T-Shirts $19.95:
No, I think I'll wait for the unbelievably fair scheduler, or perhaps the ridiculously fair scheduler.
Somebody into the kernel devs community care to read the user comments below TFA and explain what's going on to us mere mortals?
You are incorrect because the act of scheduling the next process to run requires constant time regardless of how many processes there are. The O(1) refers to this, correctly.
The reason this is important, and the reason they are worried about the act of scheduling the next process rather than time complexity over all N processes, is that if scheduling the next process were not constant time, the percentage of time spent scheduling the next process would grow larger as you added more processes. That's fine if you're on a desktop system and you go from 100 to 200, but as the number starts getting large on (say) big servers, you start running into a situation where all your CPUs are perpetually tied up trying to figure out what process to run next.
O(n) time over N processes is not a problem; you've either got the CPUs or you don't. If you don't, then your performance will suck for reasons that are your own fault. If you do have enough CPUs, then the time spent scheduling will remain in step with the time spent running the processes, and this is fine. However, if the time spent scheduling grew, every time a process was scheduled, across all CPUs, then your $50000 server would be worthless because it wouldn't be able to handle the workloads you intended for it. All those expensive CPUs would sit there figuring out what process to run rather than running it.
So, it is NOT a misnomer. It accurately describes the portion of the problem that the developers are concerned with. It's O(n) over n processes, and that's great because it means you can get to n without breaking down.
I rarely criticize things I don't care about.
Microsoft have been outclassed.
For this reason, I've been using Con Kolivas' patches to replace the scheduler. http://members.optusnet.com.au/ckolivas/kernel/ - very helpful especially if you don't have the fastest computer around. Also seems to help a bit with I/O - if my hard drive is trashing for whatever reason, interactive stuff still remains reasonably responsive. Or at least it doesn't make my mouse cursor skip...
Even so, I'd prefer to have IO better scheduled - ionice doesn't really seem to work at least for me.
There's a huge thread about broken disk IO in kernels > 2.6.17.
http://forums.gentoo.org/viewtopic-t-482731.html
How about fixing that first?
I've noticed all kinds of new schedulers coming out of the woodwork in 2.6. Optional kernel preemption. And also changes to the default timer (250 instead of 100, etc.). These are choices like I've never had in Linux before. Should be good, right?
But at a higher level there's been such brokenness.
K3B CD burning broke in 2.6.8 and wasn't fixed until 2.6.10.
Firewire performance (for me) started fluctuating after ~2.6.7. Although grabbing DV over firewire doesn't take much CPU time, for some reason having video preview enabled during capture may/may not cause dropped frames (on a modern amd64! pathetic!) depending on kernel version.
And the promise of 2.6 being responsive and feeling interactive under heavy load? Also ended after 2.6.7 for me.
Recent kernels, like 2.6.18, seem to take minutes to balance tasks out. Playing video during heavy CPU usage? For me it gives priority to everything else BUT the new video process for about 30 seconds. (CFM or anticipatory scheduler)
So I'm not sure if we should be welcoming all this new "choice" with scheduler-du-jour or try to give us a good default choice that actually works right first.
To be smarter the new Linux scheduler will use artificial neural networks and for speed-up it will be written in Lisp.
There's no development tree which has been a blessing and a curse. With practically everyone using 2.6 now, the devs should work on regressions and stability and then branch 2.7 for development of 2.8.
I'd also like to see ingos RT patches merged into mainline and some of the more esoteric stuff moved out into patchsets.
pity more people use linux than macs then. Also: google trends is a pile of poo.
What's the vertical axis represent?
Good job sending all those /.ers over there, it will be the mother of all process fights on that poor server now. Sysadmins battling their way through hordes of zombies and monster processes, with ammo (ehm.. mem,cpu) running lower and lower until they're out, just as another wave of uglies comes out of nowhere...
http://linux.inet.hr/iostat_utility.html
Klingon multitasking systems do not support "time-sharing". When a Klingon program wants to run, it challenges the scheduler in hand-to-hand combat and owns the machine.
(from here)
pity that pity's the only way linux weenies ever get sex.
> pity more people use linux than macs then.
If by "people" you mean "servers," and "use" you mean "administer," then you may be right.
Great, now people are going to ignore my Completely Just Sceduler.
It's just not fair.
I'm impressed. I did some work on CPU dispatching on a mainframe system in the distant past, and we were never able to beat O(log n) on an ordered dispatch queue. The obvious implementation is O(n); getting to O(log n) requires a tree, and I want to see how they got to O(1).
This stuff really matters now. If we're going to have 80-CPU multiprocessors, all the main operating system algorithms have to be near O(1), or the scale-up is a lose.
Possible explanation: More people know about Linux now, so there's less need to Google to learn about it.
Alternative explanation: People have less problems now using Linux, so they google less for solutions on Linux problems.
Third explanation: Linux documentation got substantially better, so people have less need to use Google as a substitute.
Fourth explanation: The larger density of Linux installations comes with a larger density of Linux experts, so people are more likely to consult their local Linux guru than Google.
Pick your favorite choice or make up yet another explanation.
Yes, those explanations are all completely made up, but so was the explanation you had in mind.
The Tao of math: The numbers you can count are not the real numbers.
Looks like the vertical axis is the relative volume of searches. So, you cannot say that linux is fading. All you can see is that linux as a search term is having less share. the actual volume may be in fact increasing.
O(1) can be much slower than O(log(n)) for large values of 1.
In other words, if it takes a day to schedule each process, then that is still O(1), but if you have to schedule a billion processes for the log(n) algorithm to take a day, then the O(log(n)) wins for the average case.
There is something like that to balance ethernet traffic: http://www.linux.com/article.pl?sid=07/04/06/15162 52
You can do the same with iptables and route2, see the advanced routing howto on tldp.org.
Excuse me, but please get off my Pennisetum Clandestinum, eh!
said with his fine wit: 'A thing should be as simple as possible, but no simpler'.
Excuse me, but please get off my Pennisetum Clandestinum, eh!
Of course there's every possibility that your O(1) scheduling algorithm takes as long to run as the O(log(n)) algorithm, or even the O(n^2) algorithm for the typical number of processes running on your system... Ideally, you'd use the best scheduling algorithm for the job and the number of processes, and you'd change on the fly when the number of processes increased or decreased such that the current algorithm was no longer the most efficient.
We rarely see the charts that compare the O(1) scheduler with the O(whatever) schedulers that get proposed over a range of process counts though... Maybe that's because linux_kernel is an all text forum.
Honestly, if there weren't so many credit whores in the non-commercial linux kernel development world, you'd think there would be hundreds of schedulers in the kernel. There's really no reason not to have many, dynamically loaded schedulers. They're easy to write, easy to test, and every CS student writes one at some point.
You can't buy a Big Linux hamburger.
Tux and I will drive over with large quantities of booze and herrings. You sneak round the back with the schedulers.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
There is OQO, but it is by no means the only such product. Poqet PC, an early PDA, had an 8088 processor. Bandai WonderSwan is a handheld video game system with an 8086 clone. Tablet PCs, including the compact XO laptop/tablet designed by OLPC, have newer CPUs using the x86 architecture. But even if handhelds do tend to prefer ARM CPUs over x86 CPUs, why aren't the same Windows Mobile apps available recompiled for traditional Windows OS?
You can choose the amount of bloat. Take Debian base, and add a lightweight Window Manger (like fluxbox, WindowMaker and many others) and just the applications you need.But can one choose the level of bloat on the Windows OS that came preinstalled on a machine? Or do you have to switch operating systems and run the risk of losing compatibility with paid-for hardware? I guess I know why Dell plans to reintroduce Windows XP Service Pack 2 as an option on newer machines, as it can be slimmed down further than Microsoft's latest offering.
In Soviet Russia processes schedule YOU!
I will never live for sake of another man, nor ask another man to live for mine.
This reminds me of a movie called Tron.
s/obligatory/obvious/g
try to be a little more creative with it next time!
You pesky young folk who think yer flash ram is better than good old reliable ferrite donuts have at with ye! We had disk drives that could pull yer filings out from two feet away on a max seek. Do ya get technology like that nowdays? Do ya? Nuuuuu.....
Mind ye what we had for an OS scheduler was a smiling polish gennelmun who wore too much after shave, and he was a bit much, but we knew which program ran after which we did! Off with ye now, come back when you have a *real* algorithm.
(Walks away muttering at his shoes)
Do not mock my vision of impractical footwear
Bah, it's so easy...
If at first you don't succeed, skydiving is not for you
In addition a lot of Ubuntu users do straight to the Ubuntu wiki or forums and search there first. The same probably happens with other distros that have good communities
What part of that did I understand?
maybe people just search by the name of the distro.
Badass Resumes
Why does this seem so much like the Fair Share Schedular in Solaris 10? I think I know where they are getting there clues from. http://www.princeton.edu/~unix/Solaris/troubleshoo t/schedule.html
Fair scheduling is only relevant on machines that runs at 100% most of the time. For desktop PCs and embedded systems, that spends lots of the time in the idle loop, fair scheduling is academic.
If you want to improve the scheduler, improve the support for time critical tasks.
Third explanation: Linux documentation got substantially better, so people have less need to use Google as a substitute.
Documentation? We don't need no stink'n documentation! Real men don't read documentation!
Have you ever met someone who actually read the documentation before asking questions about a software product? Rare, that is.
Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
It's not a similar implementation, but it's very much an elegant instance of a very similar idea.
--dave
davecb@spamcop.net
Wait, I mean NetBSD did it, Neural Net based process scheduler: http://nnsched.sourceforge.net/
That's what I do with Gentoo... When there's not much drama!
"Chinese Amazons, power armor, laser swords.... things just meant to be." - Shampoo, A Very Scary Bet