Robert Love, Preemptible Kernel Maintainer Interviewed

Re:Why do we need so many different kernels? by Cyph · 2002-01-18 06:37 · Score: 3, Informative

I do believe that what Love is working on is in fact a patch, and not a fork.

True geek by gorillasoft · 2002-01-18 06:40 · Score: 5, Funny

Like all true geeks, Love doesn't forget to include in his comments that, despite being a computer nerd, he does, in fact, have a girlfriend.

RL: Approximately how much time per week do you spend working on your kernel patch for Linux?

Love: My girlfriend would probably say too much. Anywhere from a couple hours a week to many hours a day.

Re:Kernel Maintainer by Nothinman · 2002-01-18 06:46 · Score: 4, Insightful

Obviously, you want someone that knows the kernel really well and can maintain every part of it.

That person doesn't exist, not even Linus knows every part of the kernel inside and out.

You have to trust the maintainers of their parts of the kernel because, as good as Linus, Marcelo, Alan, etc. are, they can't know all the gotchas of all the drivers and different kernel subsystems.

Aren't you taking a good developer (who can maintain every part of the kernel) away from the newer versions of the kernel?

They don't have to do it if they don't want to or don't have the time. But Alan's recent wish to stop maintaining 2.4.x so he can work on other things seems to say how much time is really required for the maintenance of a kernel tree.

There are only so many developers, what happens if you run out?

I highly doubt there will ever be that many currently maintained kernel versions.

You just can't win by wowbagger · 2002-01-18 06:48 · Score: 5, Insightful

If you maintain different kernels, people say "OHMYGOD we are forking we will all DIE"

If you roll changes into a kernel and make it unstable, people say "OHMYGOD production kernel's not stable we will all DIE"

RTFA and lighten up. The patches are being considered for 2.5. They haven't been ruled out.

--
www.eFax.com are spammers

Re:Why do we need so many different kernels? by jonbelson · 2002-01-18 06:50 · Score: 3, Informative

>the Linux operating system will soon end up like
>*BSD, with several mutually incompatible,
> infighting factions. We can't let this happen.

Where on Earth did you get this nonsense from? There are three open-source BSDs, each with a different focus. Admittedly a few OpenBSD and NetBSD developers aren't the best of chums, but most of the developers are happy to share code between projects (eg. dirhash, smp, usb etc)

--Jon

Re:At the moment, yes by Nothinman · 2002-01-18 06:52 · Score: 3, Informative

If you read the interview you would have seen: "Linus said at ALS this year he was interested in the preempt-kernel patch. That doesn't mean anything to me until we are in, though, but it is a good sign."

And right now Linus hasn't let this and several other patches into 2.5 officially because he's focusing on the bio changes and it's much harder to debug if you have multiple subsystem changes going on at once. And Linus also has shown interest in Ingo's new O(1) scheduler, which the preempt patches have only recently become compatible with.

Because Linux is a Macrokernel architecture OS by ithmus · 2002-01-18 06:53 · Score: 3, Insightful

Remember when you had to compile your device drivers into the kernel yourself instead of using a module? The idea here is that the vital OS features are part of the kernel.

The open source movement is about modifying your software and sharing it. Anyone with the ability can modify a vital OS feature and share it. voila! Many, many kernels.

But the problem here is that real time processing does not belong in a macrokernel architecture. Look at the commercial RTOSes (Real Time Operating Systems) like QNX and you will see that a microkernel architecture -- a kernel that provides a minimal feature set -- is favored. This is because if you are depending on time constraints, all you want in your kernel is message passing and task synchronization.

--
I'm supposed to be working right now.

Re:Because Linux is a Macrokernel architecture OS by renehollan · 2002-01-18 07:11 · Score: 3, Insightful

But, the problem here is that real time processing does not belong in a macrokernel architecture.

I'm inclined to agree, but I see no reason to exclude an attempt at "poor-man's" real-time processing in a macrokernel, by making the kernel preemptible. So long as one does not have serious hard real-time constraints (e.g. manned aircraft auto-pilot) and is concerned with snappier interactive response and media streaming, this seems appropriate. This is certainly the case if it can be compiled in or out at will.

--
You could've hired me.

Re:Because Linux is a Macrokernel architecture OS by Cato the Elder · 2002-01-18 08:46 · Score: 3, Insightful

It's not that "real time processing does not belong in a macrokernel architecture", it's that "macrokernel architecture does not belong in a hard real time system".

I don't see any problem with making the Linux kernel preemptible to be able to make better real time guarantees. Sure, I don't think you'll ever get hard realtime, but that doesn't mean that you won't get benefits from being able to respond to interrupts even when the system is running the kernel.

("hard realtime" -- maximum interrupt latency of xxx nanoseconds. "soft realtime" -- runs fast enough, usually)
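To make the distinction concrete, here is a minimal sketch of how the two definitions differ when applied to a set of measured interrupt latencies. The sample values, the 1000 ns bound, and the 99% threshold standing in for "usually" are all made up for illustration.

```python
def meets_hard_bound(latencies_ns, bound_ns):
    """Hard realtime: every single latency must be within the bound."""
    return max(latencies_ns) <= bound_ns

def meets_soft_bound(latencies_ns, bound_ns, percent=99):
    """Soft realtime: at least `percent` of the latencies are within the bound.

    The 99% default is an arbitrary stand-in for "fast enough, usually".
    """
    within = sum(1 for x in latencies_ns if x <= bound_ns)
    return within * 100 >= percent * len(latencies_ns)

# Hypothetical measurements: 99 quick responses and one long stall.
samples = [800] * 99 + [50_000]

print(meets_hard_bound(samples, 1000))   # one stall breaks the hard guarantee
print(meets_soft_bound(samples, 1000))   # but the system is still "usually" fast
```

A single outlier is enough to fail the hard check, which is why worst-case latency, not average latency, is what matters for hard realtime.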

this one gets my vote by spongman · 2002-01-18 07:01 · Score: 4, Informative

I've been running this patch since the early -ac days on a machine that doubles as a development desktop and a medium-load server and I've had absolutely no problems. I don't have any numbers but X/Gnome seems much more responsive even when other procs are busy. I see no reason why this isn't in the main tree, especially since it's a kbuild-time option.

a quick nice -n -5 /usr/bin/X11/X helps, too.

Re:low-latency patch by steveha · 2002-01-18 07:11 · Score: 5, Interesting

I have been following these two patches a bit, because I want my desktop to be as snappy and responsive as possible.

The current low-latency patches work by finding "hot spots" in the kernel code where something is taking a long time, and then putting in a hack to make the code yield. The good part is that you can get a really low-latency kernel; the bad part is that you have to touch the kernel code in hundreds of places, and the kernel code gets really ugly. I remember reading that Ingo Molnar, who wrote a giant low-latency patch that worked this way, agreed with Linus that his low-latency patch was just too ugly and huge and should not be included in the main source tree.

The preemption patch is comparatively small and elegant. It leverages the work that has already been done to make SMP work correctly. I'm using it on my Linux desktops, and I like it.

On one of the mailing lists, Linus said that he wants the Linux kernel to gain low latency the cleanest way: find all parts that are slow, and instead of hacking them to yield, re-write them so they are faster (but still clean code that is easy to understand). This is of course the ideal, but when will it be finished? The preemption patch is available now, and works now, and I am using it now.

steveha

--
lf(1): it's like ls(1) but sorts filenames by extension, tersely

Re:Kernel Maintainer by cduffy · 2002-01-18 07:28 · Score: 3, Informative

There are only so many developers, what happens if you run out?

If the supply of Linux kernel developers runs out (as unlikely as that seems), then kernel developers get hired away from different kernels.

Seriously... there are folks who, for commercial reasons, need a lot of this work to be done; and a developer with real-time experience from a different UNIX kernel can become familiar with Linux within a reasonable amount of time.

For instance, quite a few of the folks MontaVista has hired to do kernel work come not from a Linux background but from working on other Unices -- one fellow we hired to work on preemptability originally did the same variety of thing for IRIX, IIRC.

Resumable Pre-emptable OS calls by leighklotz · 2002-01-18 07:47 · Score: 4, Funny

The ITS operating system (the world's second timesharing system, and the system for which RMS and others first developed EMACS) had a concept of state checkpoints in OS calls, called PCLSRing.

Alan Bawden wrote a paper on it, and it's quite a good read. His web site has a compressed .gz version, but I found an HTML version of the PCLSR paper and I quote from its abstract here:

Under any timesharing operating system there will be occasions when a process must access the state of another process. A process may need to start, stop, debug, load, dump, create, or destroy another process. There is also one occasion when a process must access its own state: an interrupt handler needs access to the state of the running process prior to the arrival of the interrupt, so that the process may continue after the interrupt has been dealt with.

"PCLSRing" is a mechanism the ITS operating system uses to enforce a kind of modularity when a process must access the state of another process. The modularity principle is very simple: no process ever catches another process (including itself) in the act of executing a system call. System calls thus behave as if they were directly implemented in hardware. A process can no more catch another process in the middle of deleting a file than it can catch another process in the middle of a multiply instruction.

There was also a way to put the system into a PCLSR test mode that exercised all these control points within the system calls, to help debug them. See the SYSDOC TEST documentation extracted from the now decommissioned AI PDP-10 that originally served it up as ftp://ftp.ai.mit.edu/pub/alan/its/sysdoc.tgz (yes, ITS was on the Arpanet and the Internet and ran TCP/IP as well).

Re:Why do we need so many different kernels? by mattdm · 2002-01-18 07:50 · Score: 3, Informative

...Alan Cox's own fork which is steadily separating from Linus' core...

Well, Alan isn't really in the business of doing this anymore. But even when he did, that's not the way things worked -- the Alan and Linus trees stayed fairly in sync with each other in many ways.

Re:Impressive. by John Whitley · 2002-01-18 07:51 · Score: 4, Insightful

Don't they have to study at all?

What in the heck do you think this IS? Many of the best students treat developing their coding and CS skills more like a musician or artist practicing and performing for many hours a day. It's a creative act, and can be very involving. Moreover, that level of involvement hones problem solving and practical skills that the "just a job" students can never hope to achieve.

I always wondered about students who didn't have any passion for the field. From what I've seen in both academia and industry, that "just a job" mentality reduces one's skills to "programming fodder", and would seem to be a pretty unenjoyable career.

Re:Impressive. by soulsteal · 2002-01-18 07:55 · Score: 3, Funny

Studying comes after Kernel Hacking and right before Binge Drinking. If we're not lucky, then the Kernel Hacking would come after or during the Binge Drinking.

Re:Preemptive Kernel? by paulbd · 2002-01-18 09:02 · Score: 3

that's complete nonsense. the behaviour you're describing is what any multitasking kernel does. it has nothing to do with the preemptive kernel patch, which ensures that when a task becomes runnable (for example, due to a device interrupt that tells the CPU that a condition the task was waiting for is met), it can start running ASAP, rather than waiting till the next regular re-scheduling point. ASAP here means within at most a millisecond or two. There will always be places where the kernel has to prevent this from happening; the goal is to minimize (and document) those places.

Re:Preemptive kernel looks good by Lozzer · 2002-01-18 09:32 · Score: 4, Informative

I think you'll find that Linux has preemptive multitasking too. What it can't do (without the preempt patch) is preempt a task that is currently running kernel code (e.g. through a syscall). I've no idea whether the Amiga's "kernel" (exec?) was preemptible in this sense or not.

--
Special Relativity: The person in the other queue thinks yours is moving faster.

Crash-course in preemptivity by slashdot.org · 2002-01-18 10:23 · Score: 5, Informative

For those that don't really understand the importance of preemptive multitasking (and from reading some comments, there are a few of you out there :-O), let's explain this through an example.

Consider application (a) that wants to read 128MB of data from the disk and application (b) that wants to read 1KB of data.

Let's say that the disk transfers @ 1MB/sec and let's assume application (a) issues the read 1 second before application (b) is started.

The sequence of calls for each app will look something like this:
1) program calls read
2) read is handled by the top level file system and is handed down to the proper file system
3) the file system calls the block device
4) the block device driver breaks up the request into the maximum block size the device can handle per request (for example 256 sectors for IDE)
5) for every request, the block device sends request to physical drive
6) drive transfers data to host
7) drive indicates 'done'
8) goto 5 until done
9) file system returns
10) read returns

It's important to know that there may be limitations on the number of requests any stage can handle simultaneously. For example, an IDE drive can only handle one request at a time. Some operating systems, however, introduce even tighter restrictions because, for example, the block device driver was written assuming only one request at a time would be allowed.

Take for example an OS where the kernel assures that no more than one request is pending between step 3) and 9).

This would have the following effect on our apps: app (a) is allowed to call step 4) because it is the first request. One second later, app (b) arrives at step 3) and is blocked. It is not allowed to enter step 4) until app (a) is done and passes step 9). Effectively this means that app (b) has to wait for 127 seconds before it gets access to step 4).

Now consider an OS where the file system and drivers can handle multiple requests. It still has to assure that the physical drive receives only one request at a time though, so it permits access by only one app between step 5) and 7). App (a) arrives at step 5) first, so it gets to start sending requests to the drive. One second later, app (b) arrives at step 5) and has to wait for app (a) to finish its current request to the drive. As soon as this request is done though, app (a) gets to step 8). Now we have both app (a) and app (b) wanting to have access to step 5). Depending on the scheduler, either one will be granted access. There's a good chance that app (b) will get access, and thus it only had to wait the time it took for app (a) to finish its outstanding request, which takes max 1/8th of a second (256 sectors = 128KB) in this example.

Btw, you may notice that an (expensive) seek is more likely to be introduced by allowing app (b) to interrupt the transfer of app (a).

You can see how moving the 'lock' deeper into the OS improves responsiveness. I'm not going to start a flame war about which OS is better, all I will say is that MY OS locks steps 5) through 7) :-)
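The waiting times in the example above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function names are made up, 512-byte sectors are assumed, and seek time and scheduler behaviour are ignored.

```python
DISK_RATE_KB_PER_S = 1024            # 1MB/sec, as in the example
SECTOR_KB = 0.5                      # assumes 512-byte sectors
MAX_REQUEST_KB = 256 * SECTOR_KB     # 256 sectors = 128KB per drive request

def wait_with_coarse_lock(big_read_kb, head_start_s):
    """Lock held across steps 3) to 9): app (b) waits for the whole big read."""
    return big_read_kb / DISK_RATE_KB_PER_S - head_start_s

def worst_wait_with_fine_lock():
    """Lock held across steps 5) to 7): app (b) waits for one drive request at most."""
    return MAX_REQUEST_KB / DISK_RATE_KB_PER_S

coarse = wait_with_coarse_lock(128 * 1024, head_start_s=1)
fine = worst_wait_with_fine_lock()
print(f"coarse lock: {coarse} s, fine lock: at most {fine} s")
```

With the numbers from the example this gives 127 seconds for the coarse lock and at most 0.125 seconds (1/8th of a second) for the fine one.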

Re:Crash-course in preemptivity by nevets · 2002-01-18 11:17 · Score: 5, Informative

This is fine, except that it isn't the way the Linux kernel (or most others) works. When app (a) calls the IDE driver to get the data, it goes to sleep (calls schedule) and won't wake up until the data has been received. When schedule is called, the CPU can go back to user land. App (b) will now make its call, which will be queued up and blocked until app (a) gets its data. Then app (b) will get a turn, and so on, so app (b) will not be affected by the big read of app (a). Your situation would happen only if the kernel didn't call schedule while waiting for data, and that would really hurt the performance of the kernel.

The real problem is if you have a periodic program that needs to make a deadline. What a non-preemptive kernel does is introduce too many variables. You can't guarantee that the program will make its deadlines when you have other lower-priority tasks running.

Say you have a period of 10 ms and your task needs 3 ms to do its job, and it must be done within the first 5 ms. So your app starts and gets priority, does what needs to be done in the first 3 ms, and easily makes its 5 ms deadline. Now you may run other tasks for the next 7 ms. Come the next period, you start your 3 ms task and repeat.

But let's say you have a system call that takes 4 ms, and one of your non-priority tasks calls it at the 9 ms mark. Without being able to preempt it, your priority task will not start until 3 ms into the next period and won't finish until 6 ms in, and thus you have missed your deadline.
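The timeline above is easy to verify with a toy calculation. This sketch assumes a single CPU, a system call that begins before the task's release, and no other delays; the function and constant names are made up for illustration.

```python
PERIOD_MS = 10      # the priority task is released every 10 ms
WORK_MS = 3         # it needs 3 ms of CPU time
DEADLINE_MS = 5     # and must finish within 5 ms of its release

def finish_offset(syscall_start_ms, syscall_len_ms, release_ms, preemptible):
    """Return how many ms after its release the priority task finishes."""
    syscall_end = syscall_start_ms + syscall_len_ms
    if preemptible or syscall_end <= release_ms:
        start = release_ms       # the task runs as soon as it is released
    else:
        start = syscall_end      # it must wait for the system call to return
    return start + WORK_MS - release_ms

for preemptible in (False, True):
    offset = finish_offset(9, 4, PERIOD_MS, preemptible)
    verdict = "ok" if offset <= DEADLINE_MS else "missed"
    print(f"preemptible={preemptible}: finishes {offset} ms in, deadline {verdict}")
```

Without preemption the task finishes 6 ms into its period and misses the 5 ms deadline; with preemption it finishes 3 ms in.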

The reason you have preemptive kernels is so that priority tasks are not affected by lower-priority tasks, not the reason you gave above.

--
Steven Rostedt
-- Nevermind

Re:Pre-Emptable Kernel & MicroKernels by be-fan · 2002-01-18 10:50 · Score: 4, Informative

(Aside from the fact that all of the Linux kernel, drivers, etc. is in 'kernel' mode, and a MicroKernel has only the message-passing and task-scheduling in 'kernel' mode, and everything else (drivers, etc.) run in 'user' mode.)
>>>>>>>>>>.
That's basically everything that distinguishes a microkernel from a monolithic kernel :) The preemptive bit is orthogonal to micro/macro kernel. There are non-preemptible microkernels (like MINIX, I think) and preemptible monolithic kernels (Solaris).

--
A deep unwavering belief is a sure sign you're missing something...

Re:low-latency patch by pthisis · 2002-01-18 21:37 · Score: 3, Informative

Then don't compile it in while you're debugging. Most *users* don't debug their kernels.

If a user gets an oops and submits a bug report to linux-kernel while running preempt, the bug report is a lot harder to decipher.

True, but the SMP support introduced these to begin with. The preemptive patches just bring that danger to UP machines.

Not true. preempt introduces new hangs. Read the threads on linux-kernel, especially "Re: [2.4.17/18pre] VM and swap - it's really unusable".

Some things that are broken by preempt:
* Network drivers which disable IRQs to avoid spinlocking on uniprocs (major performance win)
* Drivers which use per-CPU data to avoid spinlocking at all
* Drivers which disable individual interrupts for long periods of time
* Drivers which depend on consecutive lines of code executing near each other in time, especially serial drivers

That thread has details on all of these.

And there's the priority inversion scenario:
SCHED_OTHER process 1 acquires a semaphore in kernel mode
SCHED_FIFO realtime process needs the semaphore, blocks on it

Now the rt process is stuck pending progress from the SCHED_OTHER process. Without preemption, the SCHED_OTHER process would have done whatever required the semaphore and released it. Now, SCHED_OTHER process 2, 3, ... n may be scheduled, leaving the realtime process twisting in the wind until the system eventually gets around to rescheduling process 1. If process 1 is e.g. SETIatHome or another nice 19 CPU hog, and process 2 is mozilla, you can get huge latencies on the realtime process, way bigger than without preempt. And even without SCHED_OTHER you run into the same problem, it's just not quite as easy to illustrate.

[Note that I said semaphore, not spinlock, so the lock-break code won't help.]

Do you not remember all the problems with priority inversion and the SCHED_IDLE patches? This is exactly the same problem, it's not like it's something new and mysterious that people are making up as FUD to stop preempt from getting in to the kernel. Any introductory OS textbook discusses it, and priority inheritance is the only robust way to eliminate it--with all the problems that come along with priority inheritance.
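For anyone who wants to see the inversion in numbers, here is a toy model. It is a sketch only: it assumes a strict-priority scheduler with three fixed levels, which is simpler than how Linux actually schedules SCHED_OTHER tasks, and the function name and tick counts are made up.

```python
def rt_wait_ticks(hold_ticks, hog_ticks, inherit):
    """Ticks the realtime task stays blocked on the semaphore.

    hold_ticks: CPU ticks the low-priority holder still needs before releasing.
    hog_ticks:  CPU ticks wanted by medium-priority tasks that outrank the holder.
    inherit:    if True, the holder temporarily runs at the waiter's priority
                (priority inheritance).
    """
    LOW, MEDIUM, REALTIME = 0, 1, 2
    holder_priority = REALTIME if inherit else LOW
    waited = 0
    while hold_ticks > 0:
        if hog_ticks > 0 and MEDIUM > holder_priority:
            hog_ticks -= 1      # a medium-priority hog gets the CPU instead
        else:
            hold_ticks -= 1     # the holder makes progress toward releasing
        waited += 1
    return waited

print(rt_wait_ticks(hold_ticks=2, hog_ticks=50, inherit=False))
print(rt_wait_ticks(hold_ticks=2, hog_ticks=50, inherit=True))
```

In this model the realtime task waits 52 ticks without inheritance and 2 ticks with it, which is the whole argument in miniature: the wait is bounded by unrelated work unless the holder is boosted.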

4. It doesn't improve the worst-case latency.
>>>>>>>>>>>
It's not designed to. Love has already started work on a lock-breaking patch to get rid of long-held locks
It won't help. The fundamental problem is that preempt in interrupts is impossible with this scheme (and AFAIK nobody has proposed a scheme that could even theoretically work), and the worst-case latencies are in interrupts. The secondary problem is that a lot of these issues are highly hardware sensitive; different hardware has different timing requirements, and the only way to find them all is to audit every driver and deal with it appropriately (as, e.g., the LL patches do).

As Alan Cox wrote,

The pre-emption patch doesn't change the average latencies. Go run some real benchmarks. Its lost in the noise after the low latency patch. A single inw from some I/O cards can cost as much as the latency target we hit.

Its not a case of the 90% of the result with 10% of the work, the pre-empt patch is firmly in the all pain no gain camp

Sumner

--
rage, rage against the dying of the light

Slashdot Mirror
