Torvalds on the Microkernel Debate

← Back to Stories (view on slashdot.org)

Torvalds on the Microkernel Debate

Posted by ryuzaki0 on Tuesday May 9, 2006 @06:34PM from the never-met-a-pointer-i-could-trust dept.

diegocgteleline.es writes "Linus Torvalds has chimed in on the recently flamed-up (again) micro vs monolithic kernel, but this time with an interesting and unexpected point of view. From the article: 'The real issue, and it's really fundamental, is the issue of sharing address spaces. Nothing else really matters. Everything else ends up flowing from that fundamental question: do you share the address space with the caller or put in slightly different terms: can the callee look at and change the callers state as if it were its own (and the other way around)?'"

10 of 607 comments (clear)

Min score:

Reason:

Sort:

Code talks by microbee · 2006-05-09 18:52 · Score: 5, Insightful

The whole discussion of micro-kernel vs monolithic kernel is totally pointless. All popular OS kernels are monolithic. We can get back to the debate when we have a working fast microkernel in the market that is actually competitive.

Linus is a pragmatist. He didn't write Linux for academic purpose. He wanted it to work.

But you can always prove him wrong by showing him the code, and I bet he'd be glad to accept he was wrong.
1. Re:Code talks by Anonymous Coward · 2006-05-09 19:16 · Score: 5, Insightful
  
  Three letters: Q N X.
  
  Small, fast, real-time. http://en.wikipedia.org/wiki/QNX
2. Re:Code talks by Bacon+Bits · 2006-05-09 20:40 · Score: 5, Insightful
  
  "Hybrid" kernel? Sorry, I just don't buy this terminology (as Linus put it, it's purely marketing).
  It is pointless to argue semantics. You can say a hybrid kernel is a monolothic kernel trying to be a microkernel, or you can say it is a microkernel trying to be monolithic. As long as you understand what is meant by the term, your agreement about the precise semantics of it is largely irrelevant. Particularly with it's relevance to this debate.
  One of the biggest problems I continually have with technical people (whether that's computer techs or engineers) is that they tend to overemphasize the syntax and semantics of what people say. They tend to latch on to a specific phrase and then rip it apart rather than taking the meaning of the whole (which is the important part) and finding problems in the whole. Most particularly, they tend to find it incomprehensible that a single phrase might have multiple meanings.
  Part if this is doubtlessly due to exposure to highly precise technical jargon, but it is inappropriate to apply strictness of meaning inherent to, say, Python, to everyday language. Even in a technical debate.
  A hybrid kernel in simplest terms is a kernel is a combination of two discrete other types of kernels. Plain English tells you that. It makes no sense to try to wrestle with whether WinNT is a monolithic or microkernel. It's a semantic debate that serves only to label the object, and it doesn't describe it or aid in understanding it. If you say WinNT is a microkernel, you then have to ignore the non-essential code objviously running in kernel mode and that doesn't help understanding. If you say WinNT is a monolithic kernel, you have to ignore the userland processes that are really system services. Again, that's no aid to understanding.
  Stop complaining about the language and forcing labels on things. Labeling is not understanding.
  
  --
  The road to tyranny has always been paved with claims of necessity.
Microkernels and the future of hardware by SigNick · 2006-05-09 19:22 · Score: 4, Insightful

I think Linus hit the spot by pointing out that the future of home computing is going to to focus on parallel processing - it's 2006 and all my computers, including my LAPTOP, are dual-processor systems.

By 2010 I suspect at least desktops are 4-CPU systems and as the numbers of cores increase one of the large drawbacks of microkernels raises it's ugly head: microkernels turn simple locking algorithms into distributed computing-style algorithms.

Every game developer tells us how difficult it is to write multi-threaded code for even our monolithic operating systems (Windows, Linux, OSX). In microkernels you constantly have to worry how to share data with other threads as you can't trust them to give even correct pointers! If you would explicitly trust them, then a single failure at any driver or module would bring down the whole system - just like in monolithic kernels but with a performance penalty that scales nicely with the number of cores. What's even worse is that at a multi-core environment you'll have to be very, very careful when designing and implementing the distribution algorithms or a simple user-space program could easily crash the system or gain superuser privileges.

--
Capitalization is the difference between "Helping your uncle jack off a horse" and "Helping your uncle Jack off a horse"
Re:Obvious by ichin4 · 2006-05-09 19:30 · Score: 5, Insightful

You are forgiven for being wrong, but not for spouting off nonsense despite knowing that you don't know what you're talking about, apparently applying the principal "if my argument involves M$ doing the wrong thing, it must be right".

While neither NT nor Mac OS X are true microkernels, the architecture of both is strongly inspired by microkernel ideas. Like Linus, the developers of these kernels recognized the practical difficulties involved in making full-on microkernels work, but unlike Linus, instead of throwing in the towel completely and doing full-on monolithic kernels, they created cleanly seperated layers interacting via well-defined interfaces whenever they practically could.

If you talk to kernel programmers, most will express a high degree of respect for the NT kernel, which is based on the DEC VMS kernel. It mostly the poor design of systems that sit on top of the kernel that has earned Windows its reputation.
Entire comment by Futurepower(R) · 2006-05-09 19:53 · Score: 5, Insightful

Name: Linus Torvalds (torvalds AT osdl.org) 5/9/06

___________________

_Arthur (Arthur_ AT sympatico.ca) on 5/9/06 wrote:

I found that distinction between microkernels and "monolithic" kernels useful: With microkernels, when you call a system service, a "message" is generated to be handled by the kernel *task*, to be dispatched to the proper handler (task). There is likely to be at least 2 levels of task-switching (and ring-level switching) in a microkernel call.

___________________

I don't think you should focus on implementation details.

For example, the task-switching could be basically hidden by hardware, and a "ukernel task switch" is not necessarily the same as a traditional task switch, because you may have things - hardware or software conventions - that basically might turn it into something that acts more like a normal subroutine call.

To make a stupid analogy: a function call is certainly "more expensive" than a straight jump (because the function call implies the setup for returning, and the return itself). But you can optimize certain function calls into plain jumps - and it's such a common optimization that it has a name of its own ("tailcall conversion").

In a similar manner, those task switches for the system call have very specific semantics, so it's possible to do them as less than "real" task-switches.

So I wouldn't focus on them, since they aren't necessarily even the biggest performance problem of an ukernel.

The real issue, and it's really fundamental, is the issue of sharing address spaces. Nothing else really matters. Everything else ends up flowing from that fundamental question: do you share the address space with the caller, or put in slightly different terms: can the callee look at and change the callers state as if it were its own (and the other way around)?

Even for a monolithic kernel, the answer is a very emphatic no when you cross from user space into kernel space. Obviously the user space program cannot change kernel state, but it is equally true that the kernel cannot just consider user space to be equivalent to its own data structures (it might use the exact same physical instructions, but it cannot trust the user pointers, which means that in practice, they are totally different things from kernel pointers).

That's another example of where "implementation" doesn't much matter, this time in the reverse sense. When a kernel accesses user space, the actual implementation of that - depending on hw concepts and implementation - may be exactly the same as when it accesses its own data structures: a normal "load" or "store". But despite that identical low-level implementation, there are high-level issues that radically differ.

And that separation of "access space" is a really big deal. I say "access space", because it really is something conceptually different from "address space". The two parts may even "share" the address space (in a monolithic kernel they normally do), and that has huge advantages (no TLB issues etc), but there are issues that means that you end up having protection differences or simply semantic differences between the accesses.

(Where one common example of "semantic" difference might be that one "access space" might take a page fault, while another one is guaranteed to be pinned down - this has some really huge issues for locking around the access, and for dead-lock avoidance etc etc).

So in a traditional kernel, you usually would share the address space, but you'd have protection issues and some semantic differences that mean that the kernel and user space can't access each other freely. And that makes for some really big issues, but a traditional kernel very much tries to minimize them. And most importantly, a traditional kernel shares the access space across all the basic system calls, so that user/kernel difference is the only access space boundary.

Now, the real problem with split acce
Re:Linus Quote - "not arguing against it at all" by drgonzo59 · 2006-05-09 20:55 · Score: 4, Insightful

You seem to completely ignore the main reason for using a microkernel -- the ability to prove (even mathematically) that the kernel is correct. In other words the main advantage is not to make a it "easy" or "fun" for the programmers to program, or make Quake run with 25fps faster,but but to enforce a strict and precise security policy. That is why critical real-time OSes are often based on a microkernel which is only about 4000-8000 lines of code. Even at that size is might take years to prove it does what it is supposed to do.
The analogy of centralisation vs. local autonomy is not totally accurate either. Both the monolithic and the microkernel are centralized, except that in the first case there a large beaurocratic structure and in the second case it just a dictator and a couple of "advisors". If the dictator or the king is chosen well, the system will be more predictable and will work much better. If case of the large beaurocratic system, if some of its members get corrupted [and they will because there are so many of them] the whole system will fail. It is like saying that a small bug in the mouse driver will freeze and crash the system with a monolithic kernel. Good thing if the system was only running Doom at the time and not controlling a reactor, or administering a drug. If the same happens in the microkernel system, the kernel will reload the driver, raise an alarm, or in general -- be able to take the system to a predictable predetermined state. Going back to the analogy is it is like having the dictator execute a corrupted staff member and replace him immediately.
The real point: keep OS designers honest by jackjansen · 2006-05-09 20:55 · Score: 5, Insightful

I think the real point here, which both Andy and Linus hint on but don't state explicitly (as far as I'm aware) is about keeping the OS designers and implementers honest. If you need an interface between two parts of the system you should design that interface, define it rigidly, then implement it.
Andy likes microkernels because they force you to do that. Time spent on design leads to insight, which may well point to better and cleaner ways to do the task you originally set out to acomplish.
Linus hates microkernels because they force you to do that. Time spent on design is time lost getting working code out the door, and working code will give you experience that will point to better and cleaner ways to do the task you originally set out to acomplish.
Re:Distributed not that hard. by TobascoKid · 2006-05-09 22:18 · Score: 5, Insightful

But in practice Linux 2.6 is 6 million lines of code and a typical microkernel is less than 10k.

Umm, doesn't that mean while you've prooved that the 10k microkernel lines correct, you'd still have ~6 million lines of code sitting outside the microkernal waiting to be prooved? I can't see how a microkernel can magically do with 10k everything Linux is doing with 6 million lines (especially as by the definition of microkernel, than there's no way it could).

--
At some point, somewhere, the entire internet will be found to be illegal.
Re:Distributed not that hard. by drgonzo59 · 2006-05-09 22:33 · Score: 4, Insightful

You don't have to prove it, as long as the microkernel will be able to put the system into a predetermined state, it could for example unload the driver and try another one or just try to relaod it, it could contact you via a pager and so on. As opposed to the whole system freezing because some idiot wrote if(a=1) instead of if(a==1) in the mouse driver. You can only hope that the system that froze was running Doom and Firefox and wasn't flying planes, or administering drugs.