Deadline Scheduling Proposed For the Linux Kernel
c1oud writes "At the last Real-Time Linux Workshop, held in September in Dresden, there was a lot of discussion about the possibility of enhancing real-time capabilities of Linux by adding a new scheduling class to the Linux kernel. According to most kernel developers, this new scheduling class should be based on the Earliest Deadline First (EDF) real-time algorithm. The first draft of the scheduling class was called 'SCHED_EDF,' and it was proposed and discussed on the Linux Kernel Mailing List (LKML) just before the workshop. Recently, a second version of the scheduling class (called 'SCHED_DEADLINE,' to meet the request of some kernel developers) was proposed. Moreover, the code has been moved to a public git repository on Gitorius. The implementation is part of a FP7 European project called ACTORS, and financially supported by the European commission. More details are available."
because i'm using Earliest Deadline First scheduling!
The EU-funded projects are somewhat interesting in my experience. They tend to fund both academics and researchers from industry to do stuff and the projects tend to be more focused on practical results than a normal project funded by a research council. They can still generate research papers, etc, but there's more of an emphasis on producing new code that can actually be *used* to do stuff that wasn't available before. Whereas more academic research normally focuses on getting the code sufficiently robust that papers can be published about it, then it's often forgotten.
I think the more practically focused work of this kind is valuable and would like to see more. It is less "valuable", academically and as such I suspect academics are less inclined to attribute prestige to those who have worked on it. It would be nice to see a bit more glory given to folks who work on these projects (disclaimer, I have done a *very* small amount of work on one myself) as a valid direction vs industry or academia. Also, this mode of development does remind me a little of some of RMS's writings about how Free Software development could be funded - here we have effectively a government body giving money to worthy causes, as represented by a team of interested experts, to enhance open source software for everyone involved in reasonably directed ways. Ideally it'd be nice to see "get stuff upstream" be a completion goal for these projects, I'm not sure to what extent that is already true.
Whether or not this scheduler is better depends on what you're trying to do.
The proposed scheduler intends to work better for real-time apps, where the correctness of the algorithm depends on how timely the data gets processed. Low-latency audio is a good example of this, as a dropped or late packet of audio results in a nasty audible pop.
Dangerous, sexy, turing complete: Femme Bots
Remember Con Kolivas.
These algorithms will produce substantially worse overall performance in all workloads. However, they allow absolute deadlines to be set for certain tasks. This is mostly useful for embedded devices -- if you're creating a medical device, or a subsystem for a plane, a 20% performance hit to guarantee you don't delay critical tasks for a couple seconds and get people killed isn't even a decision worth thinking about.
This would make Linux a legitimate real time ("RT") kernel. There are RT Linuxes already, but they suck to work with -- I believe RTLinux (one of the RT variants), for example, requires all RT tasks to be in kernel-space.
The upshot is that this is huge for Linux in certain business areas (and other RT OSes are currently quite pricey), but totally useless for your desktop or home server.
I've not read TFA yet but will try not to spout rubbish ;-)
Deadline-based scheduling is good for realtime processing and not necessarily better for your desktop or server. "Realtime" doesn't mean fast / high throughput and doesn't *necessarily* mean low latency. What it really means is "predictable", with low latency potentially being important. A server or desktop scheduling algorithm often does all kinds of crazy scaling of priorities according to process behaviour in the past, etc - it aims to keep processes running on the CPU as much as possible so that your overall performance is good. The desktop scheduler may also be tweaked to try and make sure interactive tasks respond quickly
Typically a realtime scheduling algorithm is more "stupid" and therefore more predictable, so applications that need regular helpings of CPU can run without the behaviour of the scheduler disrupting their operation. Linux currently supports realtime scheduling through "round robin" and "first-in-first-out" classes, which are extremely "stupid" scheduling algorithms but useful in some cases. Realtime tasks run according to the chosen algorithm, then the normal desktop/server scheduler handles other tasks. It sounds to me like the proposal is to add a slightly more intelligent realtime scheduler allowing administrators and applications to control realtime behaviour differently when necessary. It doesn't sound like they're proposing replacing the main scheduling algorithm, although some OSes have played with using deadline-based scheduling exclusively.
An EDF algorithm assigns each task a deadline and tries to always schedule the task whose deadline will expire soonest. Using a periodic deadline you can effectively specify stuff like "I need to be scheduled every 50ms, if at all possible" and the scheduler will try to make sure this happens. If you are, for instance, doing video streaming or interacting with a hardware buffer or controlling a robot arm you might have these requirements. In these cases, getting the CPU regularly is much more important than getting lots of CPU on average, which is why just renicing isn't sufficient.
Deadline scheduling is well established and has been done many times, in many flavours on other OSes. It's probably even been done on Linux before. But if this one gets upstream with the blessing of the kernel community, it would enhance Linux for everyone rather than just those running particular kernel patches.
Linux seems to be having a lot of realtime-related work (see PREEMPT-RT, a somewhat separate but related area of work) done, which is interesting - I would have said that the conventional wisdom was that large, general-purpose OSes cannot be realtime-ified. It seems like certain parties are determined to prove this wrong - and it's looking to me somewhat like getting to "good enough" realtime behaviour will make large segments of users happy even though it's perhaps unlikely to ever replace ground-up realtime OS designs.
I think the question he is posing is:
Is this type of scheduler perfect? Capable of real time, batch jobs and the mixed fruit bowl of jobs on a typical desktop or server?
The answer is, probably not. There really is no such thing as a perfect scheduler. Different kinds of workloads have different requirements.
A busy real time scheduler will tend to starve low priority jobs. This can become an issue if those low priority jobs manage to grab limited resources they are never given the time to use. As those resources dwindle, real time jobs will be harder to satisfy and the low priority jobs must be terminated or given time to release those resources. I can't really say for certain, but it looks like EDF would have these same kinds of issues.
Interesting story a professor of mine told: An old university mainframe was brought offline after decades of operation. A core dump was performed and investigation revealed that there was a process that had been waiting to run for close to 30 years. Somehow, its priority was set to be lower than the idle process and this particular machine did not have automatic escalation of priority in its scheduler.
True - those are IO schedulers, though, whereas the scheduler described in the article is a CPU scheduler. Not that IIRC the summary or myself went so far as to point that out explicitly, which was a mistake!
For CPU scheduling Linux has a general purpose scheduler (which is CFS, in recent Linux releases) and two realtime scheduling classes (round robin and FIFO), if memory serves.
These options are for the IO scheduler called "Deadline". TFA is about CPU scheduler.
That's probably why he's asking jackass.
"linux is just DOS with a UNIX like syntax" -- Galactic Dominator (944134)
20% overall hit on system performance, as in the peak throughput of your system may take a 20% or greater hit due to scheduler decisions that guarantee deadlines get met, even if that means not allowing work to proceed on tasks that are ready to run.
For example, task A may be ready to run, but if the scheduler picks it, then task B might miss its deadline because A's time quanta is too long. This can be true even if B isn't yet ready to run, but is about to be. Ready-to-run task A may take 40ms with a deadline way in the future. Task B might be ready to run in 20ms, and when it does run, it'll run for 70ms. Now suppose B's deadline is 100ms away. If task A can't be preempted, we can't run task A, because then we'd start task B too late. A real-time scheduling algorithm might therefore choose to idle 20ms until B is ready if it's not possible to preempt task A.
(Note to pedants out there: What I just described is not EDF, but a more generic off-the-top-of-my-head example of how realtime scheduling decisions, perhaps even just made by a human making a static schedule, can lead to idle time.)
These sorts of statements tend to apply though when all tasks use a real-time scheduling algorithm, or only to the subset of tasks that are classified "real-time". Linux's scheduling algorithm will remain hybrid, with some real time tasks and the rest using the CFS scheduler (SCHED_OTHER).
As a result, it seems highly unlikely to me that SCHED_EDF (or SCHED_DEADLINE or however it gets spelled) will make Linux a true hard real-time OS. It will just add a new, more traditional real time scheduler to the existing set of soft-real time schedulers (SCHED_RR, SCHED_FIFO). A system with predominately real-time tasks will probably meet all of its real time guarantees with high probability. I wouldn't run my medical equipment with it just yet though, particularly if any of the SCHED_OTHER tasks are doing disk I/O and/or thrashing the VM.
Re: your sig... (0x2B | ~0x2B) = ~0, regardless of your bit length.
Program Intellivision!
Interesting story a professor of mine told: An old university mainframe was brought offline after decades of operation. A core dump was performed and investigation revealed that there was a process that had been waiting to run for close to 30 years. Somehow, its priority was set to be lower than the idle process and this particular machine did not have automatic escalation of priority in its scheduler.
Actually, I think EDF would fix that, at least to an extent.
Currently, prioritization is done based on, well priority. So if you took a job and set its priority to lower-than-idle, as you stated above, it will obviously never run.
However, with EDF,you assign (if I understand it correctly) a deadline to each task rather than a priority. "Task X really should get done in the next 50 msec, but Task Y can wait up to 12 hours and no one's going to scream". On a normal peak-and-valley loaded machine, it'll work on the first deadline first, and if things queue up the low-"priority" ones will simply have longer deadlines. Ideally, the system has enough capacity to accommodate the longer deadline stuff "early" when it has time.
However, on a heavily-peaked system, eventually those long-deadline jobs are going to get priority simply because their deadlines have been reached. So if you have a 50ms deadline job running every 20ms and those jobs start queuing up, then you submit a 12hour deadline job, in 12 hours that job will get the full attention of the system until it's done, because it has the earliest deadline.
As with all things, this has its advantages and disadvantages. If that job that runs every 20ms is truly critical, then you're not going to like this scheme - eventually the low-"priority" stuff is going to come due and take precedent over your "critical" stuff because it's all based on what you asked for first, not how important each task was. On the other hand, this does prevent the very problem you describe - jobs running at such a low priority that they never get any lovin' at all.
"This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
Oh shit! Processor nerd SMACKDOWN!! This is better than wrestling!