Robert Love, Preemptible Kernel Maintainer Interviewed

Re:Why do we need so many different kernels? by Cyph · 2002-01-18 06:37 · Score: 3, Informative

I do believe that what Love is working on is in fact a patch, and not a fork.

Re:Why do we need so many different kernels? by Nothinman · 2002-01-18 06:38 · Score: 2, Informative

Love doesn't maintain a completely separate kernel tree, just a patch that applies to the main Linus tree.

Re:Why do we need so many different kernels? by jonbelson · 2002-01-18 06:50 · Score: 3, Informative

>the Linux operating system will soon end up like
>*BSD, with several mutually incompatible,
> infighting factions. We can't let this happen.

Where on Earth did you get this nonsense from? There are three open-source BSDs, each with a different focus. Admittedly a few OpenBSD and NetBSD developers aren't the best of chums, but most of the developers are happy to share code between projects (e.g. dirhash, SMP, USB, etc.).

--Jon

Re:At the moment, yes by Nothinman · 2002-01-18 06:52 · Score: 3, Informative

If you read the interview you would have seen: "Linus said at ALS this year he was interested in the preempt-kernel patch. That doesn't mean anything to me until we are in, though, but it is a good sign."

And right now Linus hasn't let this and several other patches into 2.5 officially because he's focusing on the bio changes, and it's much harder to debug if you have multiple subsystem changes going on at once. Linus has also shown interest in Ingo's new O(1) scheduler, which the preempt patches have only recently become compatible with.

this one gets my vote by spongman · 2002-01-18 07:01 · Score: 4, Informative

I've been running this patch since the early -ac days on a machine that doubles as a development desktop and a medium-load server and I've had absolutely no problems. I don't have any numbers but X/Gnome seems much more responsive even when other procs are busy. I see no reason why this isn't in the main tree, especially since it's a kbuild-time option.

a quick nice -n -5 /usr/bin/X11/X helps, too.

Re:Kernel Maintainer by cduffy · 2002-01-18 07:28 · Score: 3, Informative

There are only so many developers, what happens if you run out?

If the supply of Linux kernel developers runs out (as unlikely as that seems), then kernel developers get hired away from different kernels.

Seriously... there are folks who, for commercial reasons, need a lot of this work to be done; and a developer with real-time experience from a different UNIX kernel can become familiar with Linux within a reasonable amount of time.

For instance, quite a few of the folks MontaVista has hired to do kernel work come not from a Linux background but from working on other Unices -- one fellow we hired to work on preemptability originally did the same variety of thing for IRIX, IIRC.

Re:MS Windows? by gmack · 2002-01-18 07:42 · Score: 2, Informative

I think some parts of NT are and some parts are not. Windows 9x/Me are definitely not: to prove this, just watch your whole system slow to a crawl while writing to the floppy drive.

I don't think that even with this the kernel will be good enough for some forms of real-time process control. Real-time process control needs guaranteed sub-millisecond response times, and everything else, including overall throughput, is sacrificed for the cause.

It will however make the UI a tad more responsive and make it harder for things like web browser loading to make the mp3 player skip.

Re:MS Windows? by psavo · 2002-01-18 07:46 · Score: 2, Informative

Is the MS windows kernel preemptible? Does anyone know or is it top-secret?

AFAIK, it is not. I know that in Windows minimum latency is about 10ms, same as in default Linux kernel.

If a kernel is preemptible, can it be used for real time process control?

For what? Do you mean for something like robotics control, or shuttle device control?
The answer is a bit complicated. Even this pre-empt patch doesn't help when there are 'long' locks in the kernel. There are situations where the kernel disables all interrupts and does its thing for a long time (like 5 to 20ms). There are not many of these, but they do exist. A realtime kernel is one that answers any interrupt 'right away' and can be interrupted 'anywhere'. These are a LOT hairier than the pre-empt patch.
If one doesn't have hard time limits, the pre-empt patch delivers enough certainty on time bounds.

--
fucktard is a tenderhearted description

Re:Why do we need so many different kernels? by mattdm · 2002-01-18 07:50 · Score: 3, Informative

...Alan Cox's own fork which is steadily separating from Linus' core...

Well, Alan isn't really in the business of doing this anymore. But even when he did, that's not the way things worked -- the Alan and Linus trees stayed fairly in sync with each other in many ways.

Re:Preemptive Kernel? by Nothinman · 2002-01-18 08:04 · Score: 2, Informative

In a preemptive kernel, code running in kernel mode can be stopped in favor of another, higher-priority task; in the default kernel only userspace apps can be preempted. This means that if an app is reading from the disk (the syscall for block I/O runs in kernel mode on the app's behalf) and the disk read is taking a long time (many milliseconds), that kernel path can be paused to allow other processes to run while the disk finishes its read.

Re:Preemptive kernel looks good by Lozzer · 2002-01-18 09:32 · Score: 4, Informative

I think you'll find that Linux has preemptive multitasking too. What it can't do (without the preempt patch) is preempt a task that is currently running kernel code (e.g. through a syscall). I've no idea whether the Amiga's "kernel" (exec?) was preemptible in this sense or not.

--
Special Relativity: The person in the other queue thinks yours is moving faster.

Crash-course in preemptivity by slashdot.org · 2002-01-18 10:23 · Score: 5, Informative

For those that don't really understand the importance of preemptive multitasking (and from reading some comments, there are a few of you out there :-O), let's explain this through an example.

Consider application (a) that wants to read 128MB of data from the disk and application (b) that wants to read 1KB of data.

Let's say that the disk transfers @ 1MB/sec and let's assume application (a) issues the read 1 second before application (b) is started.

The sequence of calls for each app will look something like this:
1) program calls read
2) read is handled by the top level file system and is handed down to the proper file system
3) the file system calls the block device
4) the block device driver breaks up the request into the maximum block size the device can handle per request (for example 256 sectors for IDE)
5) for every request, the block device sends request to physical drive
6) drive transfers data to host
7) drive indicates 'done'
8) goto 5 until done
9) file system returns
10) read returns

It's important to know that there may be limitations on the number of requests any stage can handle simultaneously. For example, an IDE drive can only handle one request at a time. Some Operating Systems however introduce even tighter restrictions, because for example the block device driver was written, assuming only one request at a time would be allowed.

Take for example an OS where the kernel assures that no more than one request is pending between steps 3) and 9).

This would have the following effect on our apps: app (a) is allowed to call step 4) because it is the first request. One second later, app (b) arrives at step 3) and is blocked. It is not allowed to enter step 4) until app (a) is done and passes step 9). Effectively this means that app (b) has to wait for 127 seconds before it gets access to step 4).

Now consider an OS where the file system and drivers can handle multiple requests. It still has to assure that the physical drive receives only one request at a time, so it permits only one app at a time between steps 5) and 7). App (a) arrives at step 5) first, so it gets to start sending requests to the drive. One second later, app (b) arrives at step 5) and has to wait for app (a) to finish its current request to the drive. As soon as this request is done, though, app (a) gets to step 8). Now we have both app (a) and app (b) wanting access to step 5). Depending on the scheduler, either one will be granted access. There's a good chance that app (b) will get access, and thus it only had to wait the time it took for app (a) to finish its outstanding request, which takes at most 1/8th of a second (256 sectors = 128KB) in this example.
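The arithmetic behind the two cases can be checked with a small sketch. The transfer rate, read size, head start, and 256-sector request size come from the example above; the 512-byte sector size and all of the names are assumptions made for illustration only.

```python
# Figures from the example; SECTOR_BYTES = 512 is an assumption.
DISK_RATE_KB_PER_S = 1024        # 1 MB/sec
APP_A_READ_KB = 128 * 1024       # app (a) reads 128 MB
HEAD_START_S = 1.0               # app (a) starts 1 second before app (b)
SECTOR_BYTES = 512
MAX_SECTORS_PER_REQUEST = 256    # IDE example from step 4)


def wait_with_coarse_lock():
    """Lock held across steps 3) to 9): app (b) waits for app (a)'s whole read."""
    total_read_s = APP_A_READ_KB / DISK_RATE_KB_PER_S
    return total_read_s - HEAD_START_S


def worst_case_wait_with_fine_lock():
    """Lock held across steps 5) to 7): app (b) waits for at most one drive request."""
    request_kb = MAX_SECTORS_PER_REQUEST * SECTOR_BYTES / 1024
    return request_kb / DISK_RATE_KB_PER_S


coarse_wait = wait_with_coarse_lock()
fine_wait = worst_case_wait_with_fine_lock()
print(f"coarse lock: app (b) waits {coarse_wait} s")
print(f"fine lock:   app (b) waits at most {fine_wait} s")
```

This ignores seek time and scheduler decisions, so it only reproduces the 127-second and 1/8-second figures, not real disk behaviour.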

Btw, you may notice that allowing app (b) to interrupt the transfer of app (a) makes an (expensive) seek more likely.

You can see how moving the 'lock' deeper into the OS improves responsiveness. I'm not going to start a flame war about which OS is better, all I will say is that MY OS locks steps 5) through 7) :-)

Re:Crash-course in preemptivity by nevets · 2002-01-18 11:17 · Score: 5, Informative

This is fine, except that it isn't how the Linux kernel (or most others) works. When app (a) calls the IDE driver to get the data, it goes to sleep (calls schedule) and won't wake up until the data has been received. When schedule is called, the CPU can go back to user land. App (b) will now make its call, and its request will be queued and blocked until app (a) gets its data. Then app (b) gets a turn, and so on, so app (b) will not be affected by the big read of app (a). Your situation would happen only if the kernel didn't call schedule while waiting for data, and that would really hurt the performance of the kernel.

The real problem is if you have a periodic program that needs to make a deadline. What a non-preemptive kernel does is introduce too much variability. You can't guarantee that the program will make its deadlines when you have other lower-priority tasks running.

Say you have a period of 10 ms and your task needs 3 ms to do its job, and it must be done within the first 5 ms. So your app starts and gets priority, does what needs to be done in the first 3 ms, and easily makes its 5 ms deadline. Now you may run other tasks for the next 7 ms. Come the next period, you start your 3 ms task and repeat.

But let's say you have a system call that takes 4 ms, and one of your non-priority tasks calls it at the 9 ms mark. Without being able to preempt it, your priority task will not start until the 3 ms mark of the next period and won't finish until the 6 ms mark, and thus you have missed your deadline.
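The timeline can be worked through in a few lines. This is only a sketch of the numbers in the example: the function and variable names are made up, and the preemptible case assumes the system call holds no lock and can be preempted the instant the priority task is released.

```python
PERIOD_MS = 10    # the priority task is released every 10 ms
WORK_MS = 3       # CPU time it needs per period
DEADLINE_MS = 5   # it must finish within 5 ms of its release


def finish_offset_ms(syscall_start_ms, syscall_len_ms, kernel_preemptible):
    """When the priority task finishes, measured from the start of its period.

    syscall_start_ms is measured from the start of the previous period,
    so 9 means 1 ms before the next release.
    """
    release = PERIOD_MS
    syscall_end = syscall_start_ms + syscall_len_ms
    if kernel_preemptible:
        # Assumption: the syscall can be preempted at once.
        start = release
    else:
        # The priority task has to wait until the syscall returns.
        start = max(release, syscall_end)
    return start + WORK_MS - release


non_preempt_finish = finish_offset_ms(9, 4, kernel_preemptible=False)
preempt_finish = finish_offset_ms(9, 4, kernel_preemptible=True)

for label, finish in (("non-preemptible", non_preempt_finish),
                      ("preemptible", preempt_finish)):
    verdict = "makes" if finish <= DEADLINE_MS else "misses"
    print(f"{label}: finishes at {finish} ms, {verdict} the {DEADLINE_MS} ms deadline")
```

With the 4 ms call starting at 9 ms, the non-preemptible case finishes 6 ms into the period and misses the deadline, while the idealized preemptible case finishes at 3 ms.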

The reason you have preemptive kernels is so that priority tasks are not affected by lower-priority tasks, not the reason you gave above.

--
Steven Rostedt
-- Nevermind

Re:Pre-Emptable Kernel & MicroKernels by be-fan · 2002-01-18 10:50 · Score: 4, Informative

(Aside from the fact that all of the Linux kernel, drivers, etc. is in 'kernel' mode, and a MicroKernel has only the message-passing and task-scheduling in 'kernel' mode, and everything else (drivers, etc.) run in 'user' mode.)
>>>>>>>>>>.
That's basically everything that distinguishes a microkernel from a monolithic kernel :) The preemptive bit is orthogonal to micro/macro kernel. There are non-preemptible microkernels (like MINIX, I think) and preemptible monolithic kernels (Solaris).

--
A deep unwavering belief is a sure sign you're missing something...

Re:low-latency patch by pthisis · 2002-01-18 21:37 · Score: 3, Informative

Then don't compile it in while you're debugging. Most *users* don't debug their kernels.

If a user gets an oops and submits a bug report to linux-kernel while running preempt, the bug report is a lot harder to decipher.

True, but the SMP support introduced these to begin with. The preemptive patches just bring that danger to UP machines.

Not true. preempt introduces new hangs. Read the threads on linux-kernel, especially "Re: [2.4.17/18pre] VM and swap - it's really unusable".

Some things that are broken by preempt:
* Network drivers which disable IRQs to avoid spinlocking on uniprocs (major performance win)
* Drivers which use per-CPU data to avoid spinlocking at all
* Drivers which disable individual interrupts for long periods of time
* Drivers which depend on consecutive lines of code executing near each other in time, especially serial drivers

That thread has details on all of these.

And there's the priority inversion scenario:
SCHED_OTHER process 1 acquires a semaphore in kernel mode
SCHED_FIFO realtime process needs the semaphore, blocks on it

Now the rt process is stuck pending progress from the SCHED_OTHER process. Without preemption, the SCHED_OTHER process would have done whatever required the semaphore and released it. Now, SCHED_OTHER process 2, 3, ... n may be scheduled, leaving the realtime process twisting in the wind until the system eventually gets around to rescheduling process 1. If process 1 is e.g. SETIatHome or another nice 19 CPU hog, and process 2 is mozilla, you can get huge latencies on the realtime process, way bigger than without preempt. And even without SCHED_OTHER you run into the same problem, it's just not quite as easy to illustrate.

[Note that I said semaphore, not spinlock, so the lock-break code won't help.]
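A toy model of the inversion scenario above, purely for illustration. It is not how the Linux scheduler works: the round-robin order, the single tick given to process 1 per round, and every name here are assumptions chosen to make the effect visible.

```python
def rt_wait_ticks(holder_ticks_needed, other_slices, preemptible):
    """Ticks the realtime task waits for the semaphore held by process 1.

    holder_ticks_needed: CPU ticks process 1 still needs before it
        releases the semaphore.
    other_slices: timeslice lengths (in ticks) of the other runnable
        SCHED_OTHER processes (2, 3, ... n).
    """
    if not preemptible:
        # Process 1 cannot be descheduled in kernel mode, so it runs
        # straight through and releases the semaphore.
        return holder_ticks_needed

    waited = 0
    remaining = holder_ticks_needed
    while remaining > 0:
        # Hypothetical round: every other process runs its full slice...
        waited += sum(other_slices)
        # ...then the nice 19 holder gets a single tick.
        waited += 1
        remaining -= 1
    return waited


without_preempt = rt_wait_ticks(3, [10, 10], preemptible=False)
with_preempt = rt_wait_ticks(3, [10, 10], preemptible=True)
print(f"no preempt: realtime task waits {without_preempt} ticks")
print(f"preempt:    realtime task waits {with_preempt} ticks")
```

The realtime task's wait grows with the number and size of the other processes' slices, which is the point of the scenario; priority inheritance would instead let process 1 run at the realtime task's priority until it releases the semaphore.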

Do you not remember all the problems with priority inversion and the SCHED_IDLE patches? This is exactly the same problem; it's not something new and mysterious that people are making up as FUD to stop preempt from getting into the kernel. Any introductory OS textbook discusses it, and priority inheritance is the only robust way to eliminate it, with all the problems that come along with priority inheritance.

4. It doesn't improve the worst-case latency.
>>>>>>>>>>>
It's not designed to. Love has already started work on a lock-breaking patch to get rid of long-held locks
It won't help. The fundamental problem is that preempt in interrupts is impossible with this scheme (and AFAIK nobody has proposed a scheme that could even theoretically work), and the worst-case latencies are in interrupts. The secondary problem is that a lot of these issues are highly hardware sensitive; different hardware has different timing requirements, and the only way to find them all is to audit every driver and deal with it appropriately (as, e.g., the LL patches do).

As Alan Cox wrote,

The pre-emption patch doesn't change the average latencies. Go run some real benchmarks. Its lost in the noise after the low latency patch. A single inw from some I/O cards can cost as much as the latency target we hit.

Its not a case of the 90% of the result with 10% of the work, the pre-empt patch is firmly in the all pain no gain camp

Sumner

--
rage, rage against the dying of the light
