Removing the Big Kernel Lock

Posted by CmdrTaco on Saturday May 17, 2008 @04:13AM from the wait-i-thought-locks-made-it-secure dept.

Corrado writes "There is a big discussion going on over removing a bit of non-preemptable code from the Linux kernel. 'As some of the latency junkies on lkml already know, commit 8e3e076 in v2.6.26-rc2 removed the preemptable BKL feature and made the Big Kernel Lock a spinlock and thus turned it into non-preemptable code again. "This commit returned the BKL code to the 2.6.7 state of affairs in essence," began Ingo Molnar. He noted that this had a very negative effect on the real time kernel efforts, adding that Linux creator Linus Torvalds indicated the only acceptable way forward was to completely remove the BKL.'"

Looks like "Worse is Better" all over by paratiritis · 2008-05-17 04:33 · Score: 5, Insightful

Worse is Better basically says that fast (and crappy) approaches dominate in fast-moving software: they may produce crappy results, but they let you ship products first.
That's fine, but once you reach maturity you should be trying to do the "right thing" (the exact opposite). And the Linux kernel has been mature for quite a while now.
I think Linus is right on this.
Re:Fascinating. by pla · 2008-05-17 04:56 · Score: 3, Insightful

If this bores you, every lkml thread would cause your head to explode.

Hey, I consider myself a code junkie (and yes, even consider the issue of the BKL somewhat interesting), but I realize that this topic has about as much appeal to the average Slashdotter as mowing the lawn.
Re:I don't understand by SpinyNorman · 2008-05-17 05:20 · Score: 4, Insightful

If that is true then it sounds like a bad decision.

If the BKL code is rarely used then the general usage performance impact is minimal and the efficiency of a spinlock vs mutex is irrelevant. If this is not true then saying it is rarely used is misleading.

However for real-time use you either do or don't meet a given worst case latency spec - the fact that a glitch only rarely happens is of little comfort.

It seems like it should have been a no-brainer to leave the preemptable code in for the time being. If there's a clean way to redesign the lock out altogether then great, but that should be a separate issue.
Keep these on Front Page... by TheNetAvenger · 2008-05-17 05:28 · Score: 5, Insightful

Keep these on Front Page...

This is the type of stuff that needs to be kept in the news. Many of the people who post here have no understanding of it, and the ones that do get the opportunity to explain it, bringing everyone up to a better level of understanding.

Maybe if we did this, the designs and benefits of all technologies could be debated and referenced accurately. Or, dare I say, people won't go ape when someone refers to a good aspect of NT's kernel design.
Re:Linux? by RiotingPacifist · 2008-05-17 05:47 · Score: 3, Insightful

It's gone from 0% to 100% on my PCs, everything is relative.

--
IranAir Flight 655 never forget!
Re:Fascinating. by ResidntGeek · 2008-05-17 06:07 · Score: 5, Insightful

Slashdot's not supposed to be interesting to every reader all the time. If you want someone to cater to a least common denominator, you'd be better off somewhere else.

--
ResidntGeek
(Performance != Speed) // in an RT system by Arakageeta · 2008-05-17 06:14 · Score: 5, Insightful

That's a terrible excuse. There are many applications where a real-time Linux kernel is highly desired. Besides, it is important to note that real-time systems do not focus on speed. This is a subtle difference from "performance", which usually carries speed as a connotation; it doesn't for a real-time system. A real-time system's focus is on completing tasks by the time the system promised to get them done (meeting scheduling contracts). It's all about deadlines, not speed. So from this point of view, the preemptible BKL, even with the degraded speed, could still be viewed as successful for a real-time kernel.
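
To make the deadline-versus-speed distinction concrete, here is a minimal sketch. The latency samples, the deadline, and the `meets_deadline` helper are all made up for illustration; none of them are measurements from any kernel.

```python
from statistics import mean


def meets_deadline(latencies_us, deadline_us):
    """A real-time contract holds only if the worst case is within the deadline."""
    return max(latencies_us) <= deadline_us


# Hypothetical latency samples in microseconds (invented, not measured).
fast_but_spiky = [5, 5, 6, 5, 150, 5, 6, 5]            # low average, one long stall
slow_but_bounded = [40, 42, 41, 45, 43, 44, 40, 41]    # higher average, tight bound

DEADLINE_US = 100  # assumed scheduling contract for this example

for name, samples in (("fast_but_spiky", fast_but_spiky),
                      ("slow_but_bounded", slow_but_bounded)):
    print(name,
          "mean:", mean(samples),
          "worst:", max(samples),
          "meets deadline:", meets_deadline(samples, DEADLINE_US))
```

The spiky series wins on average latency and still fails the contract, which is the sense in which a slower but preemptible lock can count as a success for real-time work.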
Re:Translation? by tomhudson · 2008-05-17 06:26 · Score: 3, Insightful

Scheduler huh? Then it seems to me that this particular problem will go away when schedulers die.
In around 10 years we will have more processors than processes and threads, so each process will have its own private processor and no scheduling will be necessary (actually it will seldom be used, like HDD swap today with 4GB+ RAM). Think 100 to 1000 processors per machine.

Keep dreaming ...
With all those processors, you'll want to be saving energy, so you'll be aiming to turn off individual processors until needed, and run the remaining processors at full load, so you'll still need a scheduler, locks, etc.
And yes, it's possible even today to use up more than 4 gig of ram and have to hit swap.
Re:Sounds like the Linux kernel needs some tests.. by pherthyl · 2008-05-17 06:28 · Score: 5, Insightful

Whatever your large project is, I'm willing to bet it's nowhere near as complex as the kernel. Whenever you get the feeling that they must have missed something that seems obvious, you're probably the one that's wrong. No offense, but they have a lot more experience dealing with unique kernel issues than you do.

You talk about unit testing, but how exactly are you going to unit test multi-threading issues? This is not some simple problem that you can run a pass/fail test against. These kinds of things can really only be tested by analysis to prove it can't fail, or extensive fuzz testing to get it "good enough".
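
For a sense of what the "good enough" route looks like in practice, here is a rough user-space sketch of a stress test, with Python threads standing in for kernel contexts. The `Counter` class and `stress` function are hypothetical names for this example. A passing run only raises confidence; it does not prove the locking is correct.

```python
import threading


class Counter:
    """Shared counter whose increment is protected by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1


def stress(counter, n_threads=8, n_iters=10_000):
    """Hammer the counter from several threads and return the final value."""
    def worker():
        for _ in range(n_iters):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = stress(Counter())
print("final count:", total)
```

If the lock were removed, a run like this might or might not expose the lost updates, which is exactly why such tests cannot replace analysis.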
This is why monolithic kernels do real-time badly by Animats · 2008-05-17 06:30 · Score: 5, Insightful

This task is not easy at all. 12 years after Linux has been converted to an SMP OS we still have 1300+ legacy BKL using sites. There are 400+ lock_kernel() critical sections and 800+ ioctls. They are spread out across rather difficult areas of often legacy code that few people understand and few people dare to touch.
This is where microkernels win. When almost everything is in a user process, you don't have this problem.
Within QNX, which really is a microkernel, almost everything is preemptable. All the kernel does is pass messages, manage memory, and dispatch the CPUs. All these operations either have a hard upper bound on how long they can take (a few microseconds), or are preemptable. Real-time engineers run tests where interrupts are triggered at some huge rate from an external oscillator, and when the high priority process handling the interrupt gets control, it sends a signal to an output port. The time delay between the events is recorded with a logic analyzer. You can do this with QNX while running a background load, and you won't see unexpected delays. Preemption really works. I've seen complaints because one in a billion interrupts was delayed 12 microseconds, and that problem was quickly fixed.
As the number of CPUs increases, microkernels may win out. Locking contention becomes more of a problem for spinlock-based systems as the number of CPUs increases. You have to work really hard to fix this in monolithic kernels, and any badly coded driver can make overall system latency worse.
Re:Translation? by neonsignal · 2008-05-17 15:46 · Score: 3, Insightful

I'll respond because it is fantastic to see new people thinking about these issues. But I must agree with twizmer on this - grabbing multiple resources might solve the problem, but it is very clumsy. Some resources (e.g. storage) may take milliseconds to complete, whereas others (e.g. graphics) might take only microseconds. Holding up the fast ones while the slow ones complete is very undesirable (for all the reasons twizmer gives).
There are techniques that are used for problems like deadlocks and starvation: changing priorities on the fly; or enforcing mutex ordering; or even 'prodding' deadlocked tasks, but they are somewhat ugly. You'll find chapters in any book on OS design.
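
As a rough illustration of the mutex-ordering technique mentioned above, the sketch below gives every lock a fixed rank and always acquires locks in rank order, no matter which order the caller names them in. `OrderedLock`, `acquire_all` and `release_all` are hypothetical helpers written for this example, not an existing API.

```python
import threading


class OrderedLock:
    """A lock with a global rank; lower ranks must be acquired first."""
    _next_rank = 0
    _rank_guard = threading.Lock()

    def __init__(self, name):
        with OrderedLock._rank_guard:
            self.rank = OrderedLock._next_rank
            OrderedLock._next_rank += 1
        self.name = name
        self.lock = threading.Lock()


def acquire_all(*locks):
    """Acquire the given locks in rank order and return them in that order."""
    ordered = sorted(locks, key=lambda l: l.rank)
    for l in ordered:
        l.lock.acquire()
    return ordered


def release_all(ordered):
    """Release in reverse acquisition order."""
    for l in reversed(ordered):
        l.lock.release()


storage = OrderedLock("storage")
graphics = OrderedLock("graphics")
completed = []


def task(label, first, second):
    # The caller's argument order is deliberately inconsistent between tasks.
    for _ in range(1000):
        held = acquire_all(first, second)
        try:
            pass  # critical section touching both resources would go here
        finally:
            release_all(held)
    completed.append(label)


t1 = threading.Thread(target=task, args=("a", storage, graphics))
t2 = threading.Thread(target=task, args=("b", graphics, storage))
t1.start()
t2.start()
t1.join()
t2.join()
print("completed:", sorted(completed))
```

With plain nested acquisition in the order each task names its locks, the two threads could each hold one lock and wait forever on the other. The ordering rule removes that cycle, at the cost of every caller having to go through the helper.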
The essential problem is that the use of semaphores (and mutexes etc.) is a low-level way to control multiple processes; it is analogous to using the goto for flow control. There are languages that have attempted to address this (e.g. Occam or Modula I) with slightly higher level constructs, but they have not become popular, and are not totally radical.
I believe that we will need new programming languages to achieve safer parallelism. My bet would be on a language with message passing primitives (since they fit well with our object oriented models), and perhaps the use of Petri net formalism to prevent deadlocks. I gather that Nokia's phone OS uses this message passing model.
It should be noted that current processor design does not suit efficient message passing (the emphasis is more on an efficient stack, since that corresponds to the procedural flow of control - an exception may be the old Transputer architecture). However, I think the languages need to be developed first, even if they are not efficient to compile; processor development will support the most popular languages (as it has grown to support the use of C and other procedural languages).
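
For readers who have not seen the message-passing style, here is a minimal sketch using Python's standard `queue` and `threading` modules as a stand-in for language-level primitives. The `account_server` function and the message tuples are assumptions invented for this example. The state (`balance`) is owned by a single thread, and other threads can only affect it by sending messages, so no explicit lock appears in user code.

```python
import queue
import threading


def account_server(inbox):
    """Sole owner of `balance`; processes one message at a time."""
    balance = 0
    while True:
        msg = inbox.get()
        kind = msg[0]
        if kind == "deposit":
            balance += msg[1]
        elif kind == "balance":
            msg[1].put(balance)   # reply on the queue supplied by the sender
        elif kind == "stop":
            return


inbox = queue.Queue()
server = threading.Thread(target=account_server, args=(inbox,))
server.start()


def client(n):
    for _ in range(n):
        inbox.put(("deposit", 1))


clients = [threading.Thread(target=client, args=(1000,)) for _ in range(4)]
for c in clients:
    c.start()
for c in clients:
    c.join()

# All deposits are already queued, so this request is handled after them.
reply = queue.Queue()
inbox.put(("balance", reply))
final_balance = reply.get()

inbox.put(("stop",))
server.join()
print("final balance:", final_balance)
```

The queue itself still uses locks internally, so this moves the synchronization into one well-tested primitive and does not eliminate it.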