Removing the Big Kernel Lock
Corrado writes "There is a big discussion going on over removing a bit of non-preemptable code from the Linux kernel. 'As some of the latency junkies on lkml already know, commit 8e3e076 in v2.6.26-rc2 removed the preemptable BKL feature and made the Big Kernel Lock a spinlock and thus turned it into non-preemptable code again. "This commit returned the BKL code to the 2.6.7 state of affairs in essence," began Ingo Molnar. He noted that this had a very negative effect on the real time kernel efforts, adding that Linux creator Linus Torvalds indicated the only acceptable way forward was to completely remove the BKL.'"
That's fine, but once you reach maturity you should be trying to do the "right thing" (the exact opposite.) And the Linux kernel has reached maturity for quite a while now.
I think Linus is right on this.
If that is true then it sounds like a bad decision.
If the BKL code is rarely used then the general usage performance impact is minimal and the efficiency of a spinlock vs mutex is irrelevant. If this is not true then saying it is rarely used is misleading.
However for real-time use you either do or don't meet a given worst case latency spec - the fact that a glitch only rarely happens is of little comfort.
It seems like it should have been a no-brainer to leave the pre-emptable code in for the time being. If there's a clean way to redesign the lock out altogether then great, but that should be a seperate issue.
Keep these on Front Page...
This is the type of stuff that needs to be kept in the news, as the people who post here often have no understanding of, and the ones that do, have the opportunity to explain this stuff, bringing everyone up to a better level of understanding.
Maybe if we did this, real discussions about the designs and benefits of all technologies could be debated and referenced accruately.. Or even dare say, NT won't have people go ape when someone refers to a good aspect of its kernel design.
Slashdot's not supposed to be interesting to every reader all the time. If you want someone to cater to a least common denominator, you'd be better off somewhere else.
ResidntGeek
That's a terrible excuse. There are many applications where a real-time Linux kernel is highly desired. Besides, it is important to note that real time systems do not focus on speed. This is a subtle difference from "performance" which usually caries speed as a connotation; it doesn't for a real time system. The real time system's focus is on completing tasks by the time the system promised to get them done (meeting scheduling contracts). It's all about deadlines, not speed. So from this point of view, the preemptible BKL, even with the degraded speed, could still be viewed as successful for a real time kernel.
Whatever your large project is, I'm willing to bet it's nowhere near as complex as the kernel. Whenever you get the feeling that they must have missed something that seems obvious, you're probably the one that's wrong. No offense, but they have a lot more experience dealing with unique kernel issues than you do.
You talk about unit testing, but how exactly are you going to unit test multi-threading issues? This is not some simple problem that you can run a test/fail test against. These kinds of things can really only be tested by analysis to prove it can't fail, or extensive fuzz testing to get it "good enough"..
This task is not easy at all. 12 years after Linux has been converted to an SMP OS we still have 1300+ legacy BKL using sites. There are 400+ lock_kernel() critical sections and 800+ ioctls. They are spread out across rather difficult areas of often legacy code that few people understand and few people dare to touch.
This is where microkernels win. When almost everything is in a user process, you don't have this problem.
Within QNX, which really is a microkernel, almost everything is preemptable. All the kernel does is pass messages, manage memory, and dispatch the CPUs. All these operations either have a hard upper bound in how long they can take (a few microseconds), or are preemptable. Real time engineers run tests where interrupts are triggered at some huge rate from an external oscillator, and when the high priority process handling the interrupt gets control, it sends a signal to an output port. The time delay between the events is recorded with a logic analyzer. You can do this with QNX while running a background load, and you won't see unexpected delays. Preemption really works. I've seen complaints because one in a billion interrupts was delayed 12 microseconds, and that problem was quickly fixed.
As the number of CPUs increases, microkernels may win out. Locking contention becomes more of a problem for spinlock-based systems as the number of CPUs increases. You have to work really hard to fix this in monolithic kernels, and any badly coded driver can make overall system latency worse.