Torvalds on the Microkernel Debate
diegocgteleline.es writes "Linus Torvalds has chimed in on the recently flamed-up (again) micro vs monolithic kernel debate, but this time with an interesting and unexpected point of view. From the article: 'The real issue, and it's really fundamental, is the issue of sharing address spaces. Nothing else really matters. Everything else ends up flowing from that fundamental question: do you share the address space with the caller or put in slightly different terms: can the callee look at and change the caller's state as if it were its own (and the other way around)?'"
This is my favorite Linus quote from that whole thread:
"In the UNIX world, we're very used to the notion of having
many small programs that do one thing, and do it well. And
then connecting those programs with pipes, and solving
often quite complicated problems with simple and independent
building blocks. And this is considered good programming.
That's the microkernel approach. It's undeniably a really
good approach, and it makes it easy to do some complex
things using a few basic building blocks. I'm not arguing
against it at all."
He basically continues his previous argument that monolithic kernels are more efficient and easier to implement. Microkernels may seem simpler, but they have complexity in implementing all but the simple tasks. Microkernels have a more marketable name. "Microkernel" just sounds more advanced than "monolithic". He finishes off with the observation that the term "hybrid kernel" is a trick to grab marketing buzz from the microkernel side of things.
My other first post is car post.
pfff, Linus, what would he know?
Philosophy.
Linus FTFA:
"The fundamental result of access space separation is that you can't share data structures. That means that you can't share locking, it means that you must copy any shared data, and that in turn means that you have a much harder time handling coherency. All your algorithms basically end up being distributed algorithms.
And anybody who tells you that distributed algorithms are "simpler" is just so full of sh*t that it's not even funny.
Microkernels are much harder to write and maintain exactly because of this issue. You can do simple things easily - and in particular, you can do things where the information only passes in one direction quite easily, but anything else is much much harder, because there is no "shared state" (by design). And in the absence of shared state, you have a hell of a lot of problems trying to make any decision that spans more than one entity in the system.
And I'm not just saying that. This is a fact. It's a fact that has been shown in practice over and over again, not just in kernels. But it's been shown in operating systems too - and not just once. The whole "microkernels are simpler" argument is just bull, and it is clearly shown to be bull by the fact that whenever you compare the speed of development of a microkernel and a traditional kernel, the traditional kernel wins. By a huge amount, too.
The whole argument that microkernels are somehow "more secure" or "more stable" is also total crap. The fact that each individual piece is simple and secure does not make the aggregate either simple or secure."
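A minimal Python sketch of the "you must copy any shared data" point, not taken from the thread. The names SharedTable, TableServer and free_blocks are hypothetical, and a plain method call stands in for real message passing between address spaces:

```python
import copy
import threading


class SharedTable:
    """Shared access space: every caller sees the one real table."""

    def __init__(self, free_blocks):
        self.lock = threading.Lock()
        self.free_blocks = free_blocks

    def take_block(self):
        # One lock covers both the check and the update, so they are atomic.
        with self.lock:
            if self.free_blocks == 0:
                return False
            self.free_blocks -= 1
            return True


class TableServer:
    """Separate access space: callers only ever receive copies in replies."""

    def __init__(self, free_blocks):
        self._state = {"free_blocks": free_blocks}

    def handle(self, message):
        if message == "query":
            return copy.deepcopy(self._state)  # the reply is a snapshot
        if message == "take_block":
            if self._state["free_blocks"] == 0:
                return {"ok": False}
            self._state["free_blocks"] -= 1
            return {"ok": True}
        raise ValueError(f"unknown message: {message}")


shared = SharedTable(free_blocks=10)
shared.take_block()
print(shared.free_blocks)  # the single authoritative value

server = TableServer(free_blocks=10)
snapshot = server.handle("query")   # client A copies the state
server.handle("take_block")         # client B changes it afterwards
current = server.handle("query")
print(snapshot["free_blocks"], current["free_blocks"])  # snapshot is stale
```

Client A's snapshot no longer matches the server the moment anyone else sends a message, so any decision A bases on it needs a protocol (versioning, retries, leases) instead of a lock. That is the coherency burden the quote is pointing at.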
The whole discussion of micro-kernel vs monolithic kernel is totally pointless. All popular OS kernels are monolithic. We can get back to the debate when we have a working fast microkernel in the market that is actually competitive.
Linus is a pragmatist. He didn't write Linux for academic purpose. He wanted it to work.
But you can always prove him wrong by showing him the code, and I bet he'd be glad to accept he was wrong.
Quick Slashdot effect there, that forum is already down. Anyhow.. mirror: http://www.mirrordot.org/stories/3f6b22ec7a7cffcf2847b92cd5dec7e7/index.html
http://pastebin.ca/54695
I think Linus hit the spot by pointing out that the future of home computing is going to focus on parallel processing - it's 2006 and all my computers, including my LAPTOP, are dual-processor systems.
By 2010 I suspect at least desktops will be 4-CPU systems, and as the number of cores increases one of the large drawbacks of microkernels rears its ugly head: microkernels turn simple locking algorithms into distributed computing-style algorithms.
Every game developer tells us how difficult it is to write multi-threaded code for even our monolithic operating systems (Windows, Linux, OSX). In microkernels you constantly have to worry about how to share data with other threads, as you can't trust them to give even correct pointers! If you did explicitly trust them, then a single failure in any driver or module would bring down the whole system - just like in monolithic kernels but with a performance penalty that scales nicely with the number of cores. What's even worse is that in a multi-core environment you'll have to be very, very careful when designing and implementing the distribution algorithms, or a simple user-space program could easily crash the system or gain superuser privileges.
Capitalization is the difference between "Helping your uncle jack off a horse" and "Helping your uncle Jack off a horse"
Individual pieces aren't really any simpler either. In fact, if you want your kernel to scale, to work well with lots of processes, you are going to run into a simple problem: multitasking.
Consider a filesystem driver in a monolithic kernel. If a dozen or so processes are all doing filesystem calls, then, assuming proper locking and in-kernel pre-emption, there's no problem - each process that executes the call enters kernel mode and starts executing the relevant kernel code immediately. If you have a multiprocessor machine, they could even be executing the calls simultaneously. If the processes have different priorities, those priorities will affect the CPU time they get when processing the call too, just as they should.
Now consider a microkernel. The filesystem driver is a separate server process. Executing a system call means sending a message to that server and waiting for an answer. Now, what happens if the server is already executing another call? The calling process blocks, possibly for a long time if there are lots of other requests queued up. This is an especially fun situation if the calling process has a higher priority than some CPU-consuming process, which in turn has a higher priority than the filesystem server. But, even if there are no other queued requests, and the server is ready and waiting, there's no guarantee that it will be scheduled for execution next, so latencies will be higher on average than on a monolithic kernel even in the best case.
Sure, there are ways around this. The server could be multi-threaded, for example. But how many threads should it spawn? And how many system resources are they going to waste? A monolithic kernel has none of these problems.
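The queueing effect described above can be put into numbers with a toy model. This is only a sketch under stated assumptions: time units are arbitrary, the server is single-threaded and strictly first-come-first-served, and the "direct call" case is idealized (no lock contention, a free CPU for every caller). The function names and the request list are made up for illustration:

```python
def fifo_server_latencies(requests):
    """Latency per caller when one server thread handles requests in order.

    requests: list of (name, arrival_time, service_time), sorted by arrival.
    """
    clock = 0
    latencies = {}
    for name, arrival, service in requests:
        start = max(clock, arrival)   # wait until the server is free
        clock = start + service
        latencies[name] = clock - arrival
    return latencies


def direct_call_latencies(requests):
    """Idealized monolithic case: each caller runs the kernel code itself."""
    return {name: service for name, _arrival, service in requests}


# Two long batch requests arrive first, then a short urgent one.
requests = [("batch_a", 0, 5), ("batch_b", 0, 5), ("urgent", 1, 1)]

queued = fifo_server_latencies(requests)
direct = direct_call_latencies(requests)
print(queued)
print(direct)
```

In this model the urgent caller needs 1 unit of work but waits 10 units behind the queue, regardless of its priority, because the server does not know about the caller's priority unless the IPC mechanism passes it along.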
I don't know if a microkernel is better than monolithic kernel, but it sure isn't simpler - not if you want performance or scalability from it, but if you don't, then a monolithic kernel can be made pretty simple too...
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
You are forgiven for being wrong, but not for spouting off nonsense despite knowing that you don't know what you're talking about, apparently applying the principle "if my argument involves M$ doing the wrong thing, it must be right".
While neither NT nor Mac OS X are true microkernels, the architecture of both is strongly inspired by microkernel ideas. Like Linus, the developers of these kernels recognized the practical difficulties involved in making full-on microkernels work, but unlike Linus, instead of throwing in the towel completely and doing full-on monolithic kernels, they created cleanly separated layers interacting via well-defined interfaces whenever they practically could.
If you talk to kernel programmers, most will express a high degree of respect for the NT kernel, which is based on the DEC VMS kernel. It is mostly the poor design of systems that sit on top of the kernel that has earned Windows its reputation.
That's a pretty big assumption. Or rather, you have basically taken all the hard parts of doing shared code and said "Let's hope someone else already solved this for us".
Sooooo, it's easy to have someone else handle the multi-process bits in a monolithic design. But when it comes to writing services for microkernels suddenly everyone is an idiot?
Besides, as Linus pointed out, when data is going one way microkernels are easy. And in the case of file systems that is really the case. Sure multiple processes can access it at once, but the time scale on handling the incoming signals is extremely fast compared to waiting for data from disk. Only a really, *really* incompetent idiot would write such a server which blocked until the read was finished.
Provably correct systems will always be better.
Well, I could certainly argue THAT one.
Years ago, I was a lead analyst on an IV&V for the shutdown system for a nuclear reactor - specifically, Darlington II in Ontario, Canada.
This was the first time Ontario Hydro wanted to use a computer system for shutdown, instead of the old sensor-relay thingie. This made AECB (Atomic Energy Control Board) rather nervous, as you can understand, so they mandated the IV&V.
I forget his first name - but Parnas from Queen's University in Kingston had developed a calculus to prove the correctness of a programme. It was succinct, it was precise, it was elegant, and it worked wonderfully.
..... well, kind of. About 3/4 of the way through the process, I asked a question that nobody else had thought of.
ummmmm
OK, so we prove that the programme is correct, and it'll do what it's supposed to do
.... but how long will it take?
You see, everybody had kinda/sorta forgotten that this particular programme not only had to be correct, but it had to tell you that the reactor was gonna melt down BEFORE it did, not a week afterwards.
The point is that there is often much more involved in whether or not a programme (or operating system) is useful than its "correctness".
Name: Linus Torvalds (torvalds AT osdl.org) 5/9/06
___________________
_Arthur (Arthur_ AT sympatico.ca) on 5/9/06 wrote:
I found that distinction between microkernels and "monolithic" kernels useful: With microkernels, when you call a system service, a "message" is generated to be handled by the kernel *task*, to be dispatched to the proper handler (task). There is likely to be at least 2 levels of task-switching (and ring-level switching) in a microkernel call.
___________________
I don't think you should focus on implementation details.
For example, the task-switching could be basically hidden by hardware, and a "ukernel task switch" is not necessarily the same as a traditional task switch, because you may have things - hardware or software conventions - that basically might turn it into something that acts more like a normal subroutine call.
To make a stupid analogy: a function call is certainly "more expensive" than a straight jump (because the function call implies the setup for returning, and the return itself). But you can optimize certain function calls into plain jumps - and it's such a common optimization that it has a name of its own ("tailcall conversion").
In a similar manner, those task switches for the system call have very specific semantics, so it's possible to do them as less than "real" task-switches.
So I wouldn't focus on them, since they aren't necessarily even the biggest performance problem of an ukernel.
The real issue, and it's really fundamental, is the issue of sharing address spaces. Nothing else really matters. Everything else ends up flowing from that fundamental question: do you share the address space with the caller, or put in slightly different terms: can the callee look at and change the caller's state as if it were its own (and the other way around)?
Even for a monolithic kernel, the answer is a very emphatic no when you cross from user space into kernel space. Obviously the user space program cannot change kernel state, but it is equally true that the kernel cannot just consider user space to be equivalent to its own data structures (it might use the exact same physical instructions, but it cannot trust the user pointers, which means that in practice, they are totally different things from kernel pointers).
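As a rough illustration of why a user pointer is "a totally different thing" even when the load instruction is the same, here is a small Python sketch. It is a toy model, not kernel code: the 128-byte memory, the user range [0, 64) and the helper names checked_copy_from_user and is_rejected are all assumptions made up for this example:

```python
MEMORY_SIZE = 128
USER_START, USER_END = 0, 64      # hypothetical user range [0, 64)
memory = bytearray(MEMORY_SIZE)   # one flat address space for both sides


def checked_copy_from_user(addr, length):
    """Copy bytes out of the user range, rejecting untrusted pointers."""
    if length < 0 or addr < USER_START or addr + length > USER_END:
        raise PermissionError(f"bad user pointer: addr={addr} len={length}")
    return bytes(memory[addr:addr + length])


def is_rejected(addr, length):
    """Return True if the checked copy refuses this (addr, length) pair."""
    try:
        checked_copy_from_user(addr, length)
    except PermissionError:
        return True
    return False


memory[8:12] = b"data"     # lives in the user range
memory[70:74] = b"key!"    # lives in the kernel range

ok = checked_copy_from_user(8, 4)          # a valid user buffer
kernel_ptr_rejected = is_rejected(70, 4)   # "user" pointer aimed at kernel data
straddle_rejected = is_rejected(62, 4)     # buffer that runs past the user range
print(ok, kernel_ptr_rejected, straddle_rejected)
```

The underlying slice of memory is read the same way in every case; what differs is that the kernel has to validate anything handed to it from the other side of the boundary before touching it.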
That's another example of where "implementation" doesn't much matter, this time in the reverse sense. When a kernel accesses user space, the actual implementation of that - depending on hw concepts and implementation - may be exactly the same as when it accesses its own data structures: a normal "load" or "store". But despite that identical low-level implementation, there are high-level issues that radically differ.
And that separation of "access space" is a really big deal. I say "access space", because it really is something conceptually different from "address space". The two parts may even "share" the address space (in a monolithic kernel they normally do), and that has huge advantages (no TLB issues etc), but there are issues that mean that you end up having protection differences or simply semantic differences between the accesses.
(Where one common example of "semantic" difference might be that one "access space" might take a page fault, while another one is guaranteed to be pinned down - this has some really huge issues for locking around the access, and for dead-lock avoidance etc etc).
So in a traditional kernel, you usually would share the address space, but you'd have protection issues and some semantic differences that mean that the kernel and user space can't access each other freely. And that makes for some really big issues, but a traditional kernel very much tries to minimize them. And most importantly, a traditional kernel shares the access space across all the basic system calls, so that user/kernel difference is the only access space boundary.
Now, the real problem with split acce
Here is some good reading material, maybe people should read and _understand_ it before posting on the subject..
http://www.computer.org/portal/site/computer/menuitem.5d61c1d591162e4b0ef1bd108bcd45f3/index.jsp?&pName=computer_level1_article&TheCat=1005&path=computer/homepage/0506&file=cover1.xml&xsl=article.xsl&
http://vig.prenhall.com/catalog/academic/product/0,1144,0131429388,00.html
This does not mean you have to agree with the guy.
Because it's based on a huge, monstrous monolithic microkernel.
Anagram("United States of America") == "Dine out, taste a Mac, fries"
Depends on what you mean by Micro Kernel and Monolithic.
True, the kernel of Mac OS X (Darwin, aka XNU) runs both the Mach and BSD layers in kernel (supervisor) space for performance reasons, to minimize latency.
Maybe this is what you call a hybrid kernel: http://en.wikipedia.org/wiki/Hybrid_kernel
You may call XNU whatever you wish but the fact remains:
- it's not a monolithic kernel by design
- it has Mach in it and Mach is some sort of microkernel. Maybe it does not reach "today's" standards of being called a microkernel but it was a very popular microkernel before.
So maybe the things running on top of Mach ( http://developer.apple.com/documentation/Darwin/Conceptual/KernelProgramming/index.html ) are conceptually "different" from what the services of a microkernel should be, and they do indeed share the address space, but this is very very very different from the architecture of a traditional monolithic kernel such as Linux.
This guy ( http://sekhon.berkeley.edu/macosx/intel.html ) recently tested some stats software on his Mac running OS X and Linux, and found out that indeed MacOS X had performance issues, very likely due to the architecture of the kernel.
There's even a rumor that says that since Avie Tevanian left Apple ( http://www.neoseeker.com/news/story/5553/ ), some guys are now working on removing the Mach microkernel and migrating to a full BSD kernel in the next release of the operating system.
And now my personal touch. I agree with Linus when he says that having small components each doing a simple part, and putting them together with pipes and so on, is somehow the UNIX way and is attractive (too lazy to find the quote). However, as he demonstrates later, distributed computing is not easy, and there's also the boundary-crossing issue. I guess he has a point when he says this is a problem for performance and makes the system harder to design... So if performance is what you indeed expect from a kernel, then you must stop dreaming of a clean, centralized, good software architecture like those we have for our highly object-oriented software.
But the truth is that, although developing a monolithic kernel from scratch is an easier task than developing a microkernel, I guess the entry ticket (learning curve) for monolithic kernel developers is more expensive. The main reason being, "things ARE NOT separated". Anyone, anywhere in the kernel could be modifying the state of that thing, for non-obvious reasons, even if there's a comment that says "please don't do that" or it should not be the case etc.... Microkernels can obviously provide some kind of protection and introspection for these things, but have always hurt performance in doing so.
Now it all depends on what you expect. Linux has many, many developers and obviously can afford a monolithic design that changes every now and then, and you may prefer a kernel that runs fast to one whose code is clean, well organized and easy to read. But the corollary of that observation is that, for the same reasons, grep, cat, cut, find, sort, or whatever unix tools you use with pipes and redirection are similarly a cleaner but YET INEFFICIENT design. However, it's been proven (with time) to be a good idea..
I think things that are "low level" will be bound to have a poor spaghetti software architecture because performance matters and the code is smaller.. but the higher level you go, the less performance matters, and the more code maintenance and evolvability matter... Everything is a tradeoff: good design practice depends on the type of problems your software tackles.
That said, it does not mean no progress can be made in kernel developments. Linux already uses a somewhat different C lang
The analogy of centralisation vs. local autonomy is not totally accurate either. Both the monolithic and the microkernel are centralized, except that in the first case there is a large bureaucratic structure and in the second case it is just a dictator and a couple of "advisors". If the dictator or the king is chosen well, the system will be more predictable and will work much better. In the case of the large bureaucratic system, if some of its members get corrupted [and they will because there are so many of them] the whole system will fail. It is like saying that a small bug in the mouse driver will freeze and crash the system with a monolithic kernel. Good thing if the system was only running Doom at the time and not controlling a reactor, or administering a drug. If the same happens in the microkernel system, the kernel will reload the driver, raise an alarm, or in general -- be able to take the system to a predictable predetermined state. Going back to the analogy, it is like having the dictator execute a corrupted staff member and replace him immediately.
Andy likes microkernels because they force you to do that. Time spent on design leads to insight, which may well point to better and cleaner ways to do the task you originally set out to accomplish.
Linus hates microkernels because they force you to do that. Time spent on design is time lost getting working code out the door, and working code will give you experience that will point to better and cleaner ways to do the task you originally set out to accomplish.
The only way the monolithic vs microkernel debate will go away is if CPUs provide a better way of sharing resources between modules.
One solution to the problem is to use memory maps. Right now each process has its own address space, and that creates lots of problems. It would be much better if each module had its own memory map, a la virtual memory, so that the degree of sharing was defined by the O/S. Two modules could then see each other as if they belonged to the same address space, but other modules would be inaccessible. In other words, each module should have its own unique view of the memory.
Of course the above is hard to implement, so there is another solution: the ring protection scheme of 80x86 should move down to paging level. Each page shall have its own ring number for read, write, and execute access. Code in page A could access code/data in page B only if the ring number of A is less than or equal to the ring number of B. That's a very easy to implement solution that would greatly enhance modularity of operating systems.
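The per-page ring rule above can be written down as a few lines of Python. This is only a sketch of the proposal, not of any existing CPU: the PageRings type, the may_access function and the example pages are hypothetical, and it assumes the "ring number of A" means the privilege level of the code running from page A:

```python
from dataclasses import dataclass

ACCESS_KINDS = ("read", "write", "execute")


@dataclass(frozen=True)
class PageRings:
    """Least-privileged ring allowed for each kind of access to a page."""
    read: int
    write: int
    execute: int


def may_access(caller_ring, target, kind):
    """Apply the rule: allowed only if ring(A) <= ring(B) for this access."""
    if kind not in ACCESS_KINDS:
        raise ValueError(f"unknown access kind: {kind}")
    return caller_ring <= getattr(target, kind)


kernel_page = PageRings(read=0, write=0, execute=0)
driver_page = PageRings(read=1, write=1, execute=1)
# Readable from ring 3, but only ring 1 and below may write to it.
shared_buffer = PageRings(read=3, write=1, execute=1)

print(may_access(1, shared_buffer, "write"))   # driver writes the buffer
print(may_access(3, shared_buffer, "read"))    # user code reads it
print(may_access(3, shared_buffer, "write"))   # user code may not write it
print(may_access(1, kernel_page, "write"))     # driver may not touch kernel
```

With only a handful of ring numbers this gives a strict hierarchy rather than arbitrary pairwise sharing between modules, which is the limitation the memory-map idea in the previous paragraph would avoid.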
A third solution is to provide implicit segmentation. Right now 80x86 has an explicit segmentation model that forces inter-segment addresses to be 48 bits wide on 32-bit machines (32 bits for the target address and 16 bits for the segment id). The implicit segmentation model is to use a 32-bit flat addressing mode but load the segment from a table indexed by the destination address, as it is done with virtual memory. Each segment shall have a base address and a limit, as it is right now. If a 32-bit address falls within the current segment, then the instruction is executed, otherwise a new segment is loaded from the address and a security check is performed. This is also a very easy to implement solution that would provide better modularization of code without the problems associated with monolithic kernels.
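A software model of the implicit segmentation idea, to show where the fast path and the security check would sit. Everything here is an assumption for illustration: the Segment and Cpu classes, the owner labels, and the "allowed" set of (from, to) pairs that stands in for whatever security check the hardware would perform:

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int
    limit: int   # size in bytes, so the segment covers [base, base + limit)
    owner: str


class Cpu:
    def __init__(self, segments, allowed):
        self.segments = sorted(segments, key=lambda s: s.base)
        self.bases = [s.base for s in self.segments]
        self.allowed = allowed      # set of (from_owner, to_owner) pairs
        self.current = None
        self.checks = 0             # how often the slow path ran

    def lookup(self, addr):
        """Find the segment containing addr, as a table walk would."""
        i = bisect.bisect_right(self.bases, addr) - 1
        if i >= 0:
            seg = self.segments[i]
            if addr < seg.base + seg.limit:
                return seg
        raise MemoryError(f"no segment maps address {addr:#x}")

    def access(self, addr):
        cur = self.current
        if cur is not None and cur.base <= addr < cur.base + cur.limit:
            return cur              # fast path: still inside current segment
        target = self.lookup(addr)  # slow path: load segment from the address
        self.checks += 1
        if cur is not None and (cur.owner, target.owner) not in self.allowed:
            raise PermissionError(f"{cur.owner} may not enter {target.owner}")
        self.current = target
        return target


cpu = Cpu(
    segments=[Segment(0x1000, 0x1000, "fs"),
              Segment(0x2000, 0x1000, "net"),
              Segment(0x4000, 0x1000, "vm")],
    allowed={("fs", "vm")},
)
cpu.access(0x1000)   # first access loads the "fs" segment
cpu.access(0x1800)   # same segment: no check needed
cpu.access(0x4000)   # fs -> vm is allowed, segment switches
print(cpu.current.owner, cpu.checks)
```

Plain 32-bit addresses are all the code ever uses; the segment switch and the check only happen when an address leaves the current segment, which is what would keep the common case cheap.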
There are various technical solutions that can be supported at CPU level that are not very complex and do not impose a big performance hit. These solutions must be adopted by CPU manufacturers if software is to be improved.
Today most of the software that is used to fly planes (both fighter jets and passenger) is based on a microkernel architecture. So microkernels are not just lab toys, real and mission critical systems are run by microkernel architectures.
The speed problem can often be solved just by getting faster hardware. The main reason Linus rejected microkernels back in the day was that the cost of context switches was prohibitive. Today hardware is a lot faster (roughly Moore's law), so context switches will be alright on a 3GHz Pentium IV machine while they would not be doable on a 33MHz machine.
Also, there is nothing about a microkernel that makes it more inherently provably correct than a monolithic kernel.
Theoretically you are right. But in practice Linux 2.6 is 6 million lines of code and a typical microkernel is less than 10k. It can already take up to a year to check the correctness of an 8k-line microkernel, and there will be an exponential demand for resources as the code size increases. So in reality it will not be possible to check the Linux kernel for correctness.
1. From AST (I'd assume you know who he is since you are interested in Linus/microkernel debate): http://www.cs.vu.nl/~ast/brown/followup/ Read the section "Microkernels Revisited":
I can't resist saying a few words about microkernels. A microkernel is a very small kernel. If the file system runs inside the kernel, it is NOT a microkernel. The microkernel should handle low-level process management, scheduling, interprocess communication, interrupt handling, and the basics of memory management and little else. ... Microsoft claimed that Windows NT 3.51 was a microkernel. It wasn't. It wasn't even close. Even they dropped the claim with NT 4.0.
2. From Windows Internals, the 4th edition, published by Microsoft Press. Page 36: Windows is similar to most Unix systems in that it's a monolithic operating system in the sense that the bulk of the operating system and device driver code shares the same kernel-mode protected memory space. Can we stop claiming Windows has a microkernel now?
But in practice Linux 2.6 is 6 million lines of code and a typical microkernel is less than 10k.
Umm, doesn't that mean that while you've proved the 10k microkernel lines correct, you'd still have ~6 million lines of code sitting outside the microkernel waiting to be proved? I can't see how a microkernel can magically do with 10k everything Linux is doing with 6 million lines (especially as, by the definition of a microkernel, there's no way it could).
At some point, somewhere, the entire internet will be found to be illegal.
You don't have to prove it, as long as the microkernel is able to put the system into a predetermined state: it could for example unload the driver and try another one or just try to reload it, it could contact you via a pager and so on. As opposed to the whole system freezing because some idiot wrote if(a=1) instead of if(a==1) in the mouse driver. You can only hope that the system that froze was running Doom and Firefox and wasn't flying planes, or administering drugs.
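The "unload the driver and reload it" behaviour can be sketched in Python as a supervisor that catches a driver failure, records an alarm and starts a fresh instance. MouseDriver, Supervisor and the "bad" event are hypothetical, and a Python exception is only a stand-in for a fault caught by hardware isolation; in a real shared address space the buggy driver could corrupt other state before anything notices:

```python
class MouseDriver:
    """Toy driver that crashes on one particular input."""

    def __init__(self):
        self.events_handled = 0

    def handle(self, event):
        if event == "bad":
            raise RuntimeError("driver bug triggered")
        self.events_handled += 1
        return f"moved:{event}"


class Supervisor:
    """Plays the microkernel's role: isolate, restart, raise an alarm."""

    def __init__(self, driver_factory):
        self.driver_factory = driver_factory
        self.driver = driver_factory()
        self.restarts = 0
        self.alarms = []

    def dispatch(self, event):
        try:
            return self.driver.handle(event)
        except Exception as exc:
            self.alarms.append(str(exc))          # e.g. page the operator
            self.driver = self.driver_factory()   # reload a clean driver
            self.restarts += 1
            return None                           # the event itself is lost


supervisor = Supervisor(MouseDriver)
results = [supervisor.dispatch(e) for e in ("left", "bad", "right")]
print(results, supervisor.restarts, supervisor.alarms)
```

The system keeps running after the fault, but the failed request is dropped and the driver's internal state starts from scratch, so recovery to a "predetermined state" still has to be designed for each service.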
While neither NT nor Mac OS X are true microkernels, the architecture of both is strongly inspired by microkernel ideas.
What exactly does "inspired" mean in this case? I am "inspired" by John Holmes but that doesn't mean I have a 12" cock does it?
If you talk to kernel programmers, most will express a high degree of respect for the NT kernel, which is based on the DEC VMS kernel. It is mostly the poor design of systems that sit on top of the kernel that has earned Windows its reputation.
So, did VMS have a graphics subsystem in the kernel as well? Also can you provide some examples of kernel experts praising the NT kernel for its microkernel properties? Thanks in advance.
I see you are also a fan of monolithic posts.
This micro-post shows a division into separable units.
Using message passing, I can efficiently communicate this to you.
Note that other readers may be reading different sections of my post while you read this one.
This section of my post never has to access internal structures of the other sections. In fact, I could have written each section in any order. Feel free to reorder them yourself.
Intron: the portion of DNA which expresses nothing useful.