According to Linus, Linux Is "Bloated"
mjasay writes "Linus Torvalds, founder of the Linux kernel, made a somewhat surprising comment at LinuxCon in Portland, Ore., on Monday: 'Linux is bloated.' While the open-source community has long pointed the finger at Microsoft's Windows as bloated, it appears that with success has come added heft, heft that makes Linux 'huge and scary now,' according to Torvalds." TuxRadar provides a small capsule of his remarks as well, as does The Register.
Uh, I'd love to say we have a plan. I mean, sometimes it's a bit sad that we are definitely not the streamlined, small, hyper-efficient kernel that I envisioned 15 years ago...The kernel is huge and bloated, and our icache footprint is scary. I mean, there is no question about that. And whenever we add a new feature, it only gets worse.
And also:
He maintains, however, that stability is not a problem. "I think we've been pretty stable," he said. "We are finding the bugs as fast as we're adding them -- even though we're adding more code." Bottomley took this to mean that Torvalds views that the current level of integration acceptable under those terms. But Mr. Linux corrected him. "No. I'm not saying that," Torvalds answered. "Acceptable and avoidable are two different things. It's unacceptable but it's also probably unavoidable."
I think that's very important to note. His quote by itself is very self-loathing but to add that tit's unavoidable really says a lot. You want to be popular? You have to satisfy more people and in doing so you become more bloated. He does maintain that Linux remains stable and that's usually the biggest problem I have with bloat. It decreases stability. I don't think there's any reason to get excited about level headed rational and reflection.
My work here is dung.
About two years ago I tested wether my Gentoo kernel was really faster. Disabling 3/4 of the options really just improved boot time and memory footprint, but not overall performance that much, at least far from 12%. Compared to a modularized kernel with just the stuff loaded, that was needed, the difference was negligible. I'm not sure if Torvalds is telling the truth about the reasons. To me it seems that the central, overall kernel architecture has degraded over time with regard to performance.
If only there was somebody at the top deciding what to let it/reject in such a way to keep the bloat out! While I am a linux/gpl fanboi, i think the bsd distros don't have this problem because they have much stricter people at the top of their kernels, and i think this is yet another sign that Linus should not be the only one running the show. If Linus isn't producing the kernel desktop users need (it's bloated, has the wrong scheduler, etc) then distros should step up and work around the problem GIT makes it very easy for them to start elsewhere, their previous release tree, mm tree, etc and add the patches they require!
Before you jump at me and say that this will ruin Linux by duplicating work, it will still be the (essentially) same code that goes into the pool, its just the administration that changes, and producing incompatible distro's isn't a problem as the userspace API is fairly stable and changes to the ABI for prop drivers can be agreed on by the major players (or they can just follow linus's changes to them, or go crazy and stabilise the ABI so that the prop drivers work)
IranAir Flight 655 never forget!
The BSD distros do not have this problem, but it's not just the strict top-down management.
It's the users.
Linux is trying to court three major user groups wih the exact same kernel, and trying to be all things to all people. The big corporations who make up most of the Linux coding/funding/purchasing want better server performance (more processors, more RAM, etc). The desktop guys want better desktop, laptop, and netbook experiences (3D graphics, sound cards, processor power scaling). The third are the end-users who contribute almost nothing but want the system to be easy and simple.
BSD however, really only has one user base - and they largely want the same thing. Stability, security, and performance. So all the cute little desktop friendly stuff that Linux keeps adding and all the server-specific stuff that Linux keeps adding aren't there. There's just the one major direction.
Or at least that's my experience, and I've been using it since 2.x.
An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
Constant changes, i.e. lack of stable KBI (kernel binary interface) does not help.
Eventually keeping your incompatible stack is easier than keeping up-to-date with latest and "greatest", especially if you happen to test your code.
It gets done because ultimately somebody says "Fuck this, I can't work on this bloated codebase any longer. We're refactoring, guys!"
Then, if the old lead dev / maintainer / admin doesn't like it, a fork happens...
Projects where this has happened before: The kernel itself, several times (as well as various subsystems, again several times), X (XFree to XOrg), KDE (2-3, 3-4), Amarok (1.x to 2.x), SodiPodi -> Inkscape, Firefox from 2 to 3... These are off the top of my mind, of course - there are lots more.
Of course, there are some cases where this process has failed. I don't think the failure rate is any higher (or lower) than proprietary projects, though...
The incentives are different, but they exist, nevertheless...
Bloat isn't bad.
Version 5.0 of Microsoft's flagship spreadsheet program Excel came out in 1993. It was positively huge: it required a whole 15 megabytes of hard drive space. In those days we could still remember our first 20MB PC hard drives (around 1985) and so 15MB sure seemed like a lot... In 1993, given the cost of hard drives in those days, Microsoft Excel 5.0 took up about $36 worth of hard drive space. In 2000, given the cost of hard drives in 2000, Microsoft Excel 2000 takes up about $1.03 in hard drive space...
In fact there are lots of great reasons for bloatware. For one, if programmers don't have to worry about how large their code is, they can ship it sooner. And that means you get more features, and features make your life better (when you use them) and don't usually hurt (when you don't). If your software vendor stops, before shipping, and spends two months squeezing the code down to make it 50% smaller, the net benefit to you is going to be imperceptible. Maybe, just maybe, if you tend to keep your hard drive full, that's one more Duran Duran MP3 you can download. But the loss to you of waiting an extra two months for the new version is perceptible, and the loss to the software company that has to give up two months of sales is even worse.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Java is actually damn fast if you keep the JVM running at all times. Even wimpy mobile devices like the Kindle can run Java fine. The Kindle is just Linux + JVM on a puny ARM processor.
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
Precisely. The grandparent is forgetting that, in the proprietary world, the scenario you described can't happen. I can't go to my boss and tell him, "Screw this, I'm going to spend the next month refactoring our messy code, rather than adding new functionality." However, I can do that in an open-source project.
We all know what to do, but we don't know how to get re-elected once we have done it
Let's take a look at the patch history of QNX. QNX is a message passing microkernel mostly used for embedded systems. But it can be run with a full GUI, runs on multiprocessors, and can be run as a server. Millions of "headless" embedded systems have QNX inside. I used it in a DARPA Grand Challenge vehicle. BigDog, the legged robot, runs QNX.
Drivers are outside the kernel. All drivers. File systems are outside the kernel. Networking is outside the kernel. And they're all application programs, not some special kind of loadable kernel module.
There have been 14 patches to QNX in the last two years. Only one is an actual kernel patch: "This patch contains updates to the PPCBE version of the SMP kernel. You need this patch only for Freescale MPC8641D boards." Only one is security-related: "This patch updates npm-tcpip-v6.so to fix a Denial of Service vulnerability where receipt of a specially crafted network packet forces the io-net network manager to fault (terminate)." Neither Linux nor Windows comes close to that record.
There's little "churn" in a good microkernel. Since little code is going in, new bugs aren't going in. Good microkernels tend to slowly converge toward a zero-bugs state.
QNX generally has a "there's only one way to do it" approach, like Python. Linux supports three completely different driver placement - compiled into the kernel, loadable as a kernel module at boot time, and run as a user process. QNX only supports one - run as a user process "resource manager". That simplifies things. A "one way to do it" approach means that the one best way is thoroughly exercised and tested. There are few seldom-used dark corners in critical code.
When QNX boots, it brings in an image with the kernel, a built-in process called "proc", any programs built into the boot image, and any shared objects ".so" wanted at boot. These last two run entirely in user space; they're just put in the boot image so they're there at startup. That's how drivers needed at startup get loaded. They don't have to be in the kernel. (In fact, you can put the whole boot image in ROM, and many embedded systems do this.)
A QNX "resource manager" is a program which has registered to receive messages for a certain portion of pathname space. The QNX kernel has no file systems; part of the initial "proc" process is a little program which keeps an in-memory table of "resource managers" and what part of pathname space they manage. This is similar to "mounting" a driver under Linux, but it doesn't require a file system up during boot. File systems are user programs which start up and ask for some pathname space, after which "open" messages are directed to that file system.
Another QNX simplification is that the kernel doesn't load programs. "exec" is implemented by a shared library. That library is loaded with the boot image, to allow things to start up. "exec" runs entirely in user space, with no special privileges, so if there's a bug in "exec" vulnerable to a mis-constructed executable, that program load fails and everything else goes on normally.
The price paid for this is some extra copying, since all I/O is done by message passing. This isn't much of a cost any more, because you're almost always copying from cache to cache. That's an important point. Message passing kernels used to be seen as expensive due to copying cost. But today, copying recently used material is cheap. On the other hand, some early microkernels (Mach comes to mind) worked very hard to mess with the MMU to avoid big copies, moving blocks from one address space to another by changing the MMU. This seems to be a lose on modern CPUs; the cache flushing required when you mess with the address space on recently used data hurts performance.
I used to pump uncompressed video through QNX message passing using 2% of a Pentium III class CPU. Message passing, done right, is not a major performance problem.