Tanenbaum-Torvalds Microkernel Debate Continues
twasserman writes "Andy Tanenbaum's recent article in the May 2006 issue of IEEE Computer restarted the longstanding Slashdot discussion about microkernels. He has posted a message on his website that responds to the various comments, describes numerous microkernel operating systems, including Minix3, and addresses his goal of building highly reliable, self-healing operating systems."
I know you're being facetious (c'mon, mod points for the SAT word), but for those who don't know, Andrew Tanenbaum is covered at Wikipedia. His textbook, Modern Operating Systems, is probably one of the most widely used and excellent resources on the subject. He also likes to get into flame wars with Linus Torvalds when he gets bored. This is ironic because supposedly Linus used Tanenbaum's Minix as a starting point and influence for Linux.
It seems to me the whole issue boils down to memory isolation. If you always have to pass messages to communicate, you get good isolation but costly synchronization of data and state, and hence potential performance hits. And vice versa: Linux is prone to instability and security breaches from every non-isolated portion of it.
As I understand it, as a novice, the only way to communicate or synchronize data is via copies of data passed through something analogous to a socket. A socket is a serial interface. If you think about this for a moment, you realize it could be thought of as one byte of shared memory. A copy operation is in effect the iteration of this one byte over the data to share. At any one moment you can only synchronize that one byte.
But this suggests its own solution. Why not share pages of memory in parallel between processes? This is short of full access to all of the state of another process, but it would allow locking and synchronization on entire system states and the rapid passing of data without copies.
Then it would seem that the isolation of microkernels would be fully gained without the complications that arise in multiprocessing or compartmentalization.
Or is there a bigger picture I'm missing?
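For anyone who wants to poke at the shared-pages idea, here is a minimal Python sketch using the standard library's multiprocessing.shared_memory module (Python 3.8+). The function name share_page and the 4096-byte size are my own assumptions, and both handles live in one process purely to keep the example short; in real use the block's name would be handed to a second process.

```python
from multiprocessing import shared_memory


def share_page(payload: bytes) -> bytes:
    """Write payload through one handle and read it back through another.

    Hypothetical helper: the two handles stand in for two processes
    that map the same block of memory instead of copying through a socket.
    """
    # "Producer" side: create a named shared block (size is an assumption).
    producer = shared_memory.SharedMemory(create=True, size=4096)
    try:
        # "Consumer" side: attach to the same block by name.
        consumer = shared_memory.SharedMemory(name=producer.name)
        try:
            # The write is visible to the consumer without a message being sent.
            producer.buf[:len(payload)] = payload
            seen = bytes(consumer.buf[:len(payload)])
        finally:
            consumer.close()
    finally:
        producer.close()
        producer.unlink()  # release the block once nobody needs it
    return seen
```

Note that nothing here stops the two sides from writing at the same time, so the locking mentioned above still has to be added by whoever shares the page.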
Some drink at the fountain of knowledge. Others just gargle.
Linux is very reliable for me, even on newer hardware with a bleeding edge kernel. Why should I care whether it has a microkernel or monolithic kernel? Everything I deal with is user space. If it runs GNOME, is POSIX-like, and supports some kind of automatic package management, I'll be happy as a clam.
Will hardware drivers be developed faster and more reliably with a microkernel? That seems to be the biggest hurdle in reliable OS development these days... Anyone have a good answer for that? I honestly don't know.
My bicyles
In the CPU cycle flush 00s the debate is just different. Less code running at ring0 means less code that can cause a kernel panic, blue screen or whatever they call it in OSX.
A significant part of the market is OK running Java. The comparatively small performance cost and high stability payoff of a microkernel make the tradeoff a no-brainer.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Linus has written the Linux kernel used in millions of computers ranging from PCs to mainframes.
Tanenbaum still has Minix and a doctorate.
Education means nothing if you do nothing with it. Linus has applied his education very well and progressed well beyond anything Tanenbaum has accomplished, with or without a doctorate...
That's a very good point, and one that people keep forgetting. If microkernels are so great, where are they? Let's take a look at notable microkernels:
* QNX Neutrino. This is the most successful microkernel ever. It deserves all the praise it gets. Yet it is still a niche product.
* Hurd. After twenty years we're still waiting for a halfway stable release. Hurd development is almost an argument *for* monolithic kernels!
* Minix. This is still an educational kernel. A teaching tool. It remains unsuitable for "real world" use.
* Mach. People claim OSX is a microkernel since it is built on top of Mach. But that ignores the real world fact that OSX is monolithic. People have been misled by the name.
* NT. This is NOT a microkernel! You don't believe anything else Microsoft says, so why do you believe this fairy tale?
In short, QNX is the only successful real world Microkernel. Linus happens to be right on this one: microkernels add too much complexity to the software. From ten thousand feet the high level architecture looks simple and elegant, but the low level implementation is fraught with difficulties and hidden pitfalls.
A Government Is a Body of People, Usually Notably Ungoverned
Try doing what I do with Minix3: run it in VMWare, allocate it 4GB of RAM, and let VMWare do your virtual memory management.
(Yes, I know it's an ugly hack. But it means I don't worry about giving Bash 120MB, and cc some enormous number...)
--- My dad's political betting
I have never experienced the "stalling" problem that affected a very small number of 2004 and 2005 Priuses last year. (OK, hubris correction, make that "not yet..." although my car's VIN is outside the range of VINs supposedly affected).
It was apparently due to a firmware bug.
In any case, when it happened, according to personal reports in Prius forums from owners to whom it happened, the result was loss of internal-combustion-engine power, meaning they had about a mile of electric-powered travel to get to a safe stopping location. At that point, if you reset the computer by cycling the "power" button three times, most of the warning lights would go off, and the car would be fine again. Of course many to whom this happened didn't know the three-push trick... and those to whom it did happen usually elected to drive to the nearest Toyota dealer for a "TSB" ("technical service bulletin" = firmware patch).
These days, conventional-technology cars have a lot of firmware in them, and I'll bet they have a "reset" function available, even if it's not on the dashboard and visible to the driver.
"How to Do Nothing," kids activities, back in print!
...so I can't spend a lot of time discussing this, but I always thought that the main benefit of micro-kernels is completely wasted unless you actually have utilities that can work in partially-functioning environments. What good is it to be able to continue to run a kernel even with your SCSI drive disabled, if all your software to fix the problem is on the SCSI drive?
Now in theory I could see a high-availability microkernel being a good, less expensive alternative to a classic mainframe environment, especially if you had a well written auto-healing system built in as a default. But that would require a lot of work outside the kernel that just isn't being done right now. And until it is, micro-kernels don't have anything more to offer than monolithic kernels.
To put it in API terms - it doesn't matter very much whether your library correctly returns an error code for every possible circumstance, when most user level code doesn't bother to check it (or just exits immediately on even addressable errors).
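To make the API point concrete, here is a small Python sketch. Everything in it is hypothetical: make_flaky_reader stands in for a library call that correctly reports a transient error (errno.EAGAIN), and the two callers show the difference between ignoring that code and acting on it.

```python
import errno


def make_flaky_reader(failures: int):
    """Hypothetical library call: fails `failures` times, then succeeds.

    Returns a function that yields (error_code, data), with 0 meaning success.
    """
    state = {"left": failures}

    def read():
        if state["left"] > 0:
            state["left"] -= 1
            return errno.EAGAIN, b""   # correctly reported, addressable error
        return 0, b"data"

    return read


def careless(read) -> bytes:
    # The error code is returned but never looked at.
    _, data = read()
    return data


def careful(read, retries: int = 3) -> bytes:
    # Retry on the transient error, give up loudly on anything else.
    for _ in range(retries):
        code, data = read()
        if code == 0:
            return data
        if code != errno.EAGAIN:
            raise OSError(code, "unrecoverable read error")
    raise OSError(errno.EAGAIN, "still busy after retries")
```

With one transient failure, careless() hands back empty data while careful() recovers. The library behaves identically in both cases, which is the whole point.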
Tanenbaum as always makes a good conceptual case for his perspective, but as time has gone by his examples increasingly prove Linus' point.
Except for QNX, the systems he cites are either vaporware (Coyotos, HURD), esoteric research toys (L4Linux, Singularity), or brutally violate the microkernel concept (MacOSX, Symbian).
Even his best example, QNX, is a very niche product and hard to compare to something like Linux.
Forgetting something?
* Minix. This is still an educational kernel. A teaching tool. It remains unsuitable for "real world" use.
Actually, it's a start of a full-up Microkernel operating system. This isn't your grand-pappy's Minix, it's a brand new code base under the BSD license, intended to be developed out into a complete system. It's still taking baby-steps at the moment, but it's coming along quite nicely.
* NT. This is NOT a microkernel!
NT is a hybrid. It has Microkernel facilities that are constantly being used for something different in each version. Early versions of NT were apparently full Microkernels, but this was changed for performance.
* QNX Neutrino. This is the most successful microkernel ever. It deserves all the praise it gets. Yet it is still a niche product.
I would hardly call QNX a "niche" product. Running on everything from your car engine to Kiosk PCs (yes, that stupid iOpener ran it too), it's an extremely powerful and versatile operating system. Its Microkernel architecture even gives it the ability to be heavily customized for the needs of the application. Don't need networking? So don't run the server! Need a GUI? Just add the Graphics server to the startup.
Microkernels haven't failed. However, you may notice that nearly all the popular Operating Systems we use today were developed back in the late 80's and early 90's. The real problem is that there hasn't been a need to develop new OSes until now. Now that Security and Stability are more pressing issues than performance, we can go back to the drawing board and start designing new OSes to meet our needs for the next decade and a half.
Javascript + Nintendo DSi = DSiCade
All of these ideas are old and, while high-performing, don't address the largest issue of all: cross-kernel compatibility.
Sure, you can recompile and all that jazz, but I'd love to see a day when an app could run on any number of kernels out there. This creates real competition.
What I'd like to see is a kernel more like a CPU. Instead of linking your kernel calls, you place them as if you were placing an assembly call. Then we can have many companies and open source organizations writing versions of it.
As we move towards multi-core CPUs this could really lead to performance gains: one or more of many cores could be dedicated to kernel operations, listening for requests and taking care of them. No context switches needed, no privilege mode switching.
Drivers and everything else run outside of kernel mode and use low level microcode to execute the code.
The best part, I think, is that you could make it backward compatible as we rewrite. A layer could handle old kernel calls and change them to the microcodes.
As we define everything more and more then we might even be able to design CPUs that can handle it better.
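A rough Python sketch of the shape of this idea, with all names (OPS, LEGACY, kernel_core, submit, legacy_call) made up for illustration: one thread stands in for a core dedicated to kernel operations, callers place opcode requests on a queue, and a thin layer maps old call names onto the new opcodes. Python threads say nothing about the context-switch savings claimed above; this only shows the request/reply structure.

```python
import queue
import threading

# Hypothetical "kernel opcodes"; a real design would define these carefully.
OPS = {
    "ADD": lambda a, b: a + b,
    "UPPER": lambda s: s.upper(),
}

# Hypothetical compatibility layer: old call names mapped to new opcodes.
LEGACY = {"sys_add": "ADD", "sys_upper": "UPPER"}


def kernel_core(requests: queue.Queue) -> None:
    """Loop standing in for a core that only services kernel requests."""
    while True:
        item = requests.get()
        if item is None:          # shutdown sentinel
            break
        opcode, args, reply = item
        try:
            reply.put(("ok", OPS[opcode](*args)))
        except Exception as exc:  # report the failure instead of dying
            reply.put(("error", repr(exc)))


def submit(requests: queue.Queue, opcode: str, *args):
    """Place one request and wait for its reply."""
    reply = queue.Queue(maxsize=1)
    requests.put((opcode, args, reply))
    return reply.get()


def legacy_call(requests: queue.Queue, name: str, *args):
    """Translate an old-style call into the new opcode and submit it."""
    return submit(requests, LEGACY[name], *args)


def demo():
    requests = queue.Queue()
    core = threading.Thread(target=kernel_core, args=(requests,))
    core.start()
    try:
        return [
            submit(requests, "ADD", 2, 3),
            legacy_call(requests, "sys_upper", "minix"),
            submit(requests, "NOPE")[0],   # unknown opcode comes back as an error
        ]
    finally:
        requests.put(None)
        core.join()
```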
I think I'd prefer the Linux network stack, which AFAIK simply doesn't crash in the first place.
Well... this is a more interesting comment than first reading might suggest. I've always been a bit dubious of "tolerant" software. It might sound counter-intuitive, but I'd rather have libraries/kernels terminate the running program and output a big message saying why rather than tolerate problems and try to continue. In the long term it pays off.
A lot of problems in Windows come from Microsoft trying to build Windows to be extremely tolerant of crap software and bizarre library calls, and to keep running as long as possible... and that has come back to bite them. Lots of strange failures that never get fixed because they don't terminate the program, they just end up generating shite later.
I prefer my libraries and kernels to just say... wrong. Fuck off. Or at the very least spazz out and crash spectacularly -- because those sorts of problems GET FIXED QUICKLY. It sounds like a hacky and roundabout idea... but it does work. It forces software to be fixed instead of being tolerant of its faults. Perhaps if you were in an academic environment and in total control of the entire software stack, the pure platonic ideal of development would work.
But you aren't, and it doesn't. Crashes and terminated programs get noticed and the problems fixed. It's real life coding in a nutshell.
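A tiny Python illustration of the two attitudes, using a made-up set_volume call (the names and the 0 to 100 range are assumptions): the tolerant version quietly turns garbage into something plausible, the strict version refuses on the spot and says why.

```python
def tolerant_set_volume(level):
    """Accepts anything and silently patches it up."""
    try:
        level = int(level)
    except (TypeError, ValueError):
        level = 0                      # the caller's bug is hidden here
    return max(0, min(100, level))     # out-of-range input is clamped


def strict_set_volume(level):
    """Rejects bad input immediately with a message saying why."""
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError(f"volume must be an int, got {type(level).__name__}")
    if not 0 <= level <= 100:
        raise ValueError(f"volume must be in 0..100, got {level}")
    return level
```

With the tolerant version, a caller passing "loud" gets a muted device and no clue where it went wrong. With the strict one, the traceback points straight at the offending call.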
Years later, Tanenbaum still makes valid observations, Linus and others continue to make a rather larger project jump through the hoops, and that's fine. The results of academic research may or may not get traction outside of a university, but without the research, there wouldn't be alternatives to contemplate. If I've gathered nothing else about Linus' personality from his writings over the years, it's that he seems to be practical, not particularly hung up on architectural (or licensing) theories... unlike me.
At some point, if his current architecture just isn't doing it for him any more, he might morph into Tanenbaum's 'A' student. It won't be because a microkernel was always right, but that it was right now.
Luke, help me take this mask off
I just finished watching the original Connections series with James Burke.
The start and end of the series hits it home:
The dependence on technology, and the use of technology to get us out of trouble created by technology.
About how one technology is so interdependent on other technology, that one failure can cause the whole technology section to fail.
(Initial reference to electricity grid failures, usually caused by one small part failing.)
Wouldn't building a system so that one part cannot take down all the other parts be important?
How would the Internet be if the entire thing would fail if one part stopped?
Yes, I want a computer without a reset button...
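As a toy illustration of "one part cannot take down the others", here is a hedged Python sketch of a supervisor loop. The component names and the supervise function are invented for the example, and a try/except inside a single process is of course not real memory isolation; it only shows the control flow of containing a failure and restarting the failed part.

```python
def supervise(components, max_restarts=2):
    """Run each component; restart it on failure without touching the rest.

    `components` maps a name to a callable taking the attempt number.
    """
    report = {}
    for name, run in components.items():
        for attempt in range(max_restarts + 1):
            try:
                run(attempt)
            except Exception as exc:
                # Failure is contained: record it and try again.
                report[name] = f"failed: {exc}"
            else:
                report[name] = "ok" if attempt == 0 else f"ok after {attempt} restart(s)"
                break
    return report


def disk(attempt):
    pass                                   # always works


def net(attempt):
    if attempt < 1:
        raise RuntimeError("driver crashed")   # works after one restart


def audio(attempt):
    raise RuntimeError("dead")             # never recovers


def demo():
    return supervise({"disk": disk, "net": net, "audio": audio})
```

Here the dead audio component ends up marked as failed, while disk and net keep working.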
The other big issue is a lack of threading support.
Threading is the spawn of the Solaris. Oops, I mean the devil. Forking was so slow on Solaris that they had to invent threading to have multiple contexts run at any speed. The whole point behind a microkernel is to HIDE information. Threading EXPOSES information between separately running processes, so you need to have mutexes, semaphores, and all of that synchronization crap that makes for buggy code.
Threading is bad. Don't use it. When you have to use code that uses it, refactor the code to use processes or a state machine. It can be done. Don't whine. But don't use threads either.
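For the state machine route, a minimal Python sketch (the HELLO/BYE protocol, Session, and run are all invented for illustration): each client gets its own explicit state, and one loop handles interleaved events from several clients with no threads, no mutexes, and no shared mutable state between sessions.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_HELLO = auto()
    WAIT_DATA = auto()
    DONE = auto()


class Session:
    """One client's conversation, advanced one message at a time."""

    def __init__(self):
        self.state = State.WAIT_HELLO
        self.received = []

    def feed(self, msg: str) -> None:
        if self.state is State.WAIT_HELLO:
            if msg != "HELLO":
                raise ValueError("expected HELLO")
            self.state = State.WAIT_DATA
        elif self.state is State.WAIT_DATA:
            if msg == "BYE":
                self.state = State.DONE
            else:
                self.received.append(msg)
        else:
            raise ValueError("session already closed")


def run(events):
    """Single loop serving many clients; events are (client, message) pairs."""
    sessions = {}
    for client, msg in events:
        sessions.setdefault(client, Session()).feed(msg)
    return {c: (s.state.name, s.received) for c, s in sessions.items()}
```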
Don't piss off The Angry Economist
More important than micro/macro to me would be the ability to keep the system running and edit the system. I used to do that with Scheme back in my college days. It made me realize how something like the telephone system could keep running 24/7 and never go down. These days with MS Windows I gotta reboot every 30 days, and the same with these fscking Linux kernel updates. What if I don't ever want to reboot? I think a microkernel/interpreter would let you modify the running system a lot easier. You could even make incremental changes and then check to make sure they work - preserving the old code so a rollback would be simple.
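A small Python sketch of the "change it while it runs, keep the old code for rollback" idea. LiveTable and its methods are hypothetical names; the only trick is that callers go through a lookup on every call, so swapping an entry takes effect without restarting anything.

```python
class LiveTable:
    """Dispatch table whose entries can be replaced while callers keep running."""

    def __init__(self):
        self._current = {}
        self._previous = {}

    def install(self, name, fn):
        # Keep the old implementation so a rollback stays simple.
        if name in self._current:
            self._previous[name] = self._current[name]
        self._current[name] = fn

    def rollback(self, name):
        # Restore the implementation that was active before the last install.
        self._current[name] = self._previous.pop(name)

    def call(self, name, *args):
        # Looked up on every call, so a swap is visible immediately.
        return self._current[name](*args)


def demo():
    table = LiveTable()
    table.install("greet", lambda who: f"hello {who}")
    before = table.call("greet", "world")
    table.install("greet", lambda who: f"HELLO {who.upper()}")   # incremental change
    after = table.call("greet", "world")
    table.rollback("greet")                                      # change didn't work out
    restored = table.call("greet", "world")
    return before, after, restored
```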
The point that Andy makes which I agree on, is that computer software is still in its infancy. The part I disagree with is that it'll change by him stating the obvious.
I/O channels would help IBM mainframe channels, which have an MMU between the peripheral and main memory...
I've heard from a friend at Intel that their new chipsets which fully support TCPA have this feature. So maybe trusted computing isn't just about copy prevention.
You don't time things when code is available, you time things when people are available to code. For both micro- and macro- kernels, the race started with the 80386 and the affordability of 32-bit CPUs for the average developer. The reason you didn't have Free Software kernels 15 years after the quasi-availability of UNIX source was that hardly anyone could afford the hardware. The ability to collaborate over a network also made a huge difference.
And Minix 3 doesn't count for real world use. It may be a good starting point for one, and maybe Minix 3.5 might do it. But as of today it would be silly to put Minix on anything but a hobbyist system.
A Government Is a Body of People, Usually Notably Ungoverned
Has anyone thought about the fact that this very conversation may go down in the history of computer science? In 30 years, more or less, lecturers will be telling their students about this argument! We're witnessing a more interesting slice of history than our normal mundane daily lives :)
Giving IE users a taste of their own medicine since 2005 - http://pods.-is-a-geek.net/