2.4, The Kernel of Pain
Joshua Drake has written an article for LinuxWorld.com called
The Kernel of Pain.
He seems to think 2.4 is fine for desktop systems but is only now, after a year of release, approaching stability for high-end use. Slashdot has had its own issues with 2.4, so I know where he's coming from. What have your experiences been? Is it still too soon for 2.4?
I really like using USB, and I like not having to use ALSA for my sound card (not that I have anything against ALSA).
After playing around with debian the other day and seeing all of my hardware that WON'T WORK with the 2.2 series it has basically come to my attention that I am all for the 2.4 series.
Linux is a continously developing system, whether it be the kernel, distribution, or software. Linux will always be "In Developement". Which is perfect for linux.
So yeah ... if you don't like 2.4 ... go back to 2.2 ... yeah ... thought so :-P
Ignore the "p2p is theft" trolls, they're just uninformed
I hadn't noticed that Linux was any slower feeling than Windows. On my Celery 300A Windows is PAINFUL to use, but Linux is amazingly quick - running 2.4.17. I run Windowmaker, and that's it. No Gnome, no KDE, no funny transparent terms.
If tits were wings it'd be flying around.
This guy is complaining that he had troubles on a production server with Mandrake8.1 and its kernel 2.4.
But Mandrake 8.1 ships with both kernel 2.4 and 2.2.
The idea behind it is: if you need all the fancy stuff use 2.4 but if you want stability use 2.2.
So using 2.4 on a server and then complaining that it isn't stable enough is silly IMHO.
That said I agree that 2.4 has been slow to stabilize (VM mess apparently caused by communications problems between Linus and Rick Van Riel).
First, he replaces a known working server with something new. Then he keeps on adding bleeding edge newest kernel upon newest kernel to this box (following his narrative, it sounds as if he installed new kernels upon release).
Second, nowhere does he mention why he needed a 2.4 kernel in the first place. In fact, he mentions how he finally decided to downgrade to 2.2.
So, in conclusion: He upgrades to the bleeding edge without proper need, and when trouble ensues, instead of rolling back, he continues upgrading. Tell me why this guy is not a hopelessly incompetent sysadmin who's trying to blame Linux for his shortcomings?
Hell, even I as a home user waited until 2.4.17 before upgrading my main box from 2.2.19. If I can perceive the weaknesses of the 2.4 kernel, why can't a professional do so?
Mart"I know I will be modded down for this": where's the option '-1, Asking for it'?
Oke, so we're talking:
/not proven/.
1. Mandrake 8.0, *the* desktop distribution _and_ a dot zero release.
2. A kernel lt/eq 2.4.6 with known problems and definetaly
3. A large-scale *production* server.
Somebody hit this guy with a cluestick! Please?
Thimo
Avoid the Gates of Hell. Use Linux!
Interesting what the author was saying about 2.2 versus 2.4 in terms of stability. We have 3 Linux machines which are used quite heavily here at the moment:
1) A dual PIII-800/Intel 440GX/512MB ECC RAM based server, with a Mylex AcceleRAID 170 adapter, an Adaptec AIC-7896 SCSI adapter, Intel EtherExpress Pro 10/100, and an external 450GB SCSI RAID-5. This box is used for NFS/Samba file serving and an e-mail server for around 100 users.
It runs kernel 2.2.17
2) A dual PIII-800/VIA 133 server/1GB PC-133 RAM server, with an Initio A100U2W SCSI adapter, Intel EtherExpress 10/100 and 70GB of external SCSI RAID 1/0. It runs MySQL, Apache, and a collection of internally developed Perl, C and Java server apps, on kernel 2.4.3
3) A dual PIII-450/Intel 440BX/512MB PC-100 RAM server, with an Adaptec 2940UW adapter, Intel EtherExpress 10/100 and 170GB of external SCSI RAID-5. It is used as a development system, and runs MySQL, Apache, and assorted Perl, C and Java apps, on kernel 2.4.1.
Systems 2 and 3 have both been up for 197 days as I type this, and would have been up for over 250 days had we not needed to power them down to move them to a new server room.
System 1 (with the 2.2.17 kernel) has never stayed up for more than 55 days. It hard crashes without anything informative being written to the logs, and obviously required the reset button to be pressed.
Has anyone got any ideas, given the hardware configs and software running on these machines why 2.2 is so horrendous, yet 2.4 so stable?
Is Linux 2.4 unstable? It depends on your perspective and luck. I'm running 2.4.8 and 2.2.19 (Debian potato) on my systems successfully; 2.8.9 thru .12 have been glitchy for me, especially when it comes to running big jobs that stress the VM. Haven't tried anything above .12 yet; I'm waiting for .18. My old cluster runs 2.2 simply because I have no reason to change.
Your mileage, of course, may vary.
I do think that 2.4 has been managed poorly. People complain that Microsoft beta-tests software on thier customers -- yet that is precisely what the kernel team does to Linux users when they release a "stable" kernel with an entirely new VM. A couple months' (weeks'?) testing on developer workstations is not sufficient for an "enterprise" class operating system. Anyone who understands the least bit about complex systems knows that you don't replace critical architecture (the VM) without jeopardizing stability.
It's all water under the bridge now; I hope Linus and company have learned from the 2.4 battles. If 2.6 has the same kinds of problems and controversies... well, I prefer not to think about it. For my part, I plan to beat 2.5 beta kernels to death, to help the testing along. Testing is as important as kernel hacking -- even if it isn't as sexy.
All about me
as an electronics student, i wouldn't dare criticizing the kernel programmers: if you ever tried to program a kernel from scratch, you'd know what a damn job that is...
for all of you interested, there's a great book over at o'reillys, understanding the linux kernel. it covers the changes from the 2.2 to 2.4 version and explains into every very detail the structures behind all the features you enjoy in you everyday linux life ;-)
cu, mephinet
Use the source, Luke!
He seems to think 2.4 is fine for desktop systems but is only now, after a year of release, approaching stability for high-end use.
I don't get it. I use Linux on the desktop. I have to admit that I don't run linux on my main machine. This is only because I've taken my second hard drive out, and put it back into an older machine. [sorry, wine doesn't like Red Alert 2]
Before I did this though, I ran 2.4 kernels on my desktop. None of the problems I may have had were with the kernel. Problems I had were mainly with certain applications and when I pushed them to their limits. Pan, for instance, crashed a lot on me, but that was because I was downloading gigs per day. A simple Pan upgrade fixed that.
In my humble opinion, 2.4 is prime for the desktop. Linux is more than ready for the desktop. I know he says it's ready for the desktop, but not ready for high end systems. To me 'high-end' is what you ask of a computer. I've got a 333MHZ running Red Hat 7.2. The computer is running webmin, proftpd, apache, and many mail daemons. I must also mention that SETI runs 24/7, it only has 64 MB of RAM. It never goes down, it never 'crashes', and is up as long as there is power running to it.
So... it's ready for the desktop? Sure, 2.4.x is prime. All the drivers I've needed supported are there. Even my >$50 webcam.
The question of 'desktop' use isn't with the kernel though. Desktop users don't patch or compile the kernel... how many times do they do it in *indows or MacOS X? They install complete distributions. IMHO, again, the only thing that keeps Linux off the desktop is easy program install. RPM has killed itself with dependencies, and apt-get is console based. Apt-get is waaay better, and it has worked wonders on my Red Hat machine [apt-rpm]. The problem is not being able to download an app and install it like *indows.
Solve this and I will sit outside my local computer store and hand out CDs. I don't know about high end systems, but dammit!, desktop users are ready... format that *indows crap and get a real OS!
Gimme a good apt-get gui... or have the system run apt-get in the background solving dependencies when needed... my g'ma will have it.
BTW, I just saw a guy on TV and his name is... get this: Joe Householder
Get your Unix fortune now!
- running a distribution with bad compatiblity between libs,tools,compilers and the kernel (i.e. Redhat 7.0)
- upgradings the kernel without regard to upgrading libs,tools,compilers, etc.
- upgrading for the purpose of upgrading's sake - no real reason.
- Not that much knowledge of what's really goining on in their machine.
I have set-up enterprise level production servers on Linux for many years and haven't ever <knock on wood - my head > had stability problems. I know of 4 Servers runnig at my most recent employer that have been running 2.4.x Kernels since they were put into service and have never crashed or even hiccuped - over 2 years ago. They are running Oracle, MySQL, Apache with multiple modules, Perl Apps, Samba and NFS filesystems, and other stuff I can't think of right now. Why haven't they had problems -- Good STABLE distribution
- conservative upgrades (2.4.1 to 2.4.5 to 2.4.12)
- running in test before placed in production
Eveyone seems to think that with a kernel from the "stable" tree you shouldn't have any problems whatsoever. Keeps in mind that the kernel alone doesn't make the OS the user space tools are also part of it. Therfore running a kernel in the 2.4.x tree and bleeding edge alpha versions of user space software and server daemons != a stable system necessarily. How many people are running a version of Apache in the 1.3.x tree? Well if you are that's a development tree and not necessariliy stable. Yes there are stable versions, but you must test! Also remember to separate device drivers from the kernel. Just because many are distributed with the kernel doesn't make them part of it.The other problem I've noticed that started with the 2.4.x tree was the 'ac' or Alan Cox branch. Don't get me wrong I think Alan has contributed meny good thing to the kernel; however, I do think Alan has gotten to feeling a little to self-important to Linux's success. Also keep in mind that he works for Red Hat now and everything I've seen with Red Hat's politics is they want to "control" Linux. So, I say stay away from any '-ac' kernel. But that's just my opinion and I could be wrong.
As far as the 'new' VM being put into the 2.4.x kernel - I don't completely agree or disagree with it being done when it was but there were various reason to do it then. I was holding up some important things and the kernels wasn't ready for a 2.5.x tree yet. It was a hard decision on Linus's part but not one I'm going to second guess.
Don't get me wrong. you can get a stable machine with just about any distribution; howver, I have found, from experience, that Slackware has a track record for being the most stable 'out-of-the-box'.
Also keep in mind that with Winblows you'd be rebooting every 14 to 30 days. Even with 2000 and XP.
On the other hand I still have one box at home still running 1.3.72 with an uptime of over 3 years. It's running as my router and is my experiment on just how long with a Linux box run without crashing.
I know this is gonna sound stupid, but why are you running Linux on Alpha boxen? Never heard of Tru64?
/Mikael Jacobson
Greylisting is to SMTP as NAT is to IPv4
Must have been a problem with your system. I have been running Windows 2000 on a K6-266 with 128 MB of RAM for about a year. It flew. It's important to have good disk access, so I put a 10 MB 7200 rpm disk in it and installed Windows on that, which made it even snappier.
The only reasons I bought a new machine is that I needed the K6 to act as a FreeBSD box and because I wanted to play DiVX and games, both of which demand more than a K6-266 regardless of the OS used.
Windowmaker is a window manager, not a desktop environment and that's why it's faster. You don't gets half of the features and integrations a real desktop provides when you only run a WM.
What you're saying is equivalent to one of the many post on using VI when the discussion's topic is IDEs. fine if it does the job for you but 90% of people out there want the fully integrated DEs like GNOME and KDE.
Now, what problems am I talking about? The latest 2.4 kernels still have compilation problems in some drivers (2.4.17 has problems in USB, 2.4.18pre4 has problems in one of the sound drivers). Important and mature packages like MOSIX require patching the kernel and aren't integrated into the kernel. Many hardware setups require recompiling the kernel and experimenting endlessly. Every time you recompile the kernel, you need to recompile some kernel modules. Dependencies and recompilation aren't working correctly--some things don't recompile when they should, and lots of things recompile over and over and over again. The kernel itself is a 30Mbyte download. And the list of problems goes on and on.
People seem to have gotten used to it and think there is nothing wrong. The kernel hackers keep telling us that C and make are just great tools for building kernels. But as a user and sometime driver hacker, I think the kernel is falling apart under its own weight. This is not a system I can recommend to non-technical users--commercial distributions can't cover all the possible kernel configurations (even with fully modularized kernels), and recompilation is out of the question for many users. What is needed?
I think, ultimately, if the kernel wants to survive and be able to keep up with the world, it needs some kind of more flexible dynamic binding of functions at runtime. It also must allow people to start writing kernel components in languages other than C, foremost C++. No, C++ isn't the epitome of good language design, and, yes, people can write even more horrible code in C++ than in C, but C++ can really help with safety, security, resource management, and modularity.
If those things don't happen, I think the Linux kernel will simply fall so far behind that it will get replaced by something else. And that would be a shame because the Linux kernel actually does have a lot of useful functionality, and once compiled and configured, works very well.
I get them in another way, for instance, I use xemacs.
What you're saying is equivalent to one of the many post on using VI when the discussion's topic is IDEs. fine if it does the job for you but 90% of people out there want the fully integrated DEs like GNOME and KDE.
But THIS problem, as absolutly nothing to do with the Linux kernel (which is the original focus of the article). I addition, I never understood what is so useful in KDE or GNOME (apart for applications themselves).
I definitely see this too... In fact I'm about to start a full crusade against shitty windowing performance. (long visible lags between exposure and repaint, very jerky moving/resizing, etc). These are very real issues on Linux/XFree86. I plan to go as far as shooting my screen with a 100Hz camera to really see what's going on, retrace by retrace.
;]).
There are many links in the GUI chain, any of which can cause a problem. Roughly from top to bottom-
1. Widget toolkit (GTK, QT, etc)
2. Client painting library (GDK, QT, etc)
3. Window manager
4. X protocol
5. context switches
6. X server
7. 2D video card driver
The folklore seems to be that 4, 5, and 7 are the major problems - "the X protocol is badly designed, switching between client, server, and window manager processes is too expensive, and XFree86's video drivers are no good."
In reality though, the problems aren't where most people expect. The X protocol is not generally a bottleneck, especially if the client programmer knows what he's doing (wait until the input queue empties before repainting anything, avoid synchronous behavior, double-buffer windows using server-side pixmaps, etc). The copy-and-context-switch overhead isn't too bad either (keep in mind that context switches are much more expensive on Windows, and Windows is the platform to beat for 2D smoothness!). And finally, many of the 2D drivers really do take advantage of all the hardware offers.
The real culprits are turning out to be 1 and 3 - the tookits and window managers. Many of the Linux toolkits (especially GTK) have very advanced widget alignment/constraint systems that bog down when windows are resized. Some toolkits are doing naughty things with the event loop (painting while events are still in the input queue, or trying to "optimize" by pausing for new events), and most of them aren't fully double-buffered yet (though GTK 2.0 and recent KDE/QT are most of the way there). Window managers are some of the most horrific perpetrators of 2D crappiness. Some of them try too hard to snap or quantize window sizes and positions, resulting in jerky motion. Kwin seems to prolong expose/repaint cycles much more than necessary. And finally, I will make one criticism of X's overall architecture - I don't think separating the window manager from the X server was a good choice. The asynchronous relationship between X and the wm can cause nasty delays in window moving and resizing. (plus, all widely-used wm's have basically the same features these days; what's the use of having a choice?
I used to think that the only way to get perfectly smooth 2D would be to embed the widget toolkit in the X server, so that it could handle repainting all on its own. Now I don't think one needs to go that far; it may just take a well-written window manager, and a similarly carefully-designed widget toolkit. (though it may be helpful for the server to mandatorily double-buffer every window - hey, video RAM is plentiful these days =)
There are lots of issues I haven't investigated yet - for instance, I think Windows may be doing something interesting with vblank; dragging windows around seems to show a lot less tearing compared to X... Also, 3D OpenGL windows seem to cause much worse artifacting on both X and MS Windows. It's almost possible to bring an animating OpenGL program to a complete halt just by resizing the window, or dragging another window in front of it.
It's an interesting problem, and I'm glad to see I'm not the only one who cares about it. I find it apalling that (to my knowledge) not one major 2D GUI system has been able to produce 100% correct results - i.e. every window correctly drawn on every single monitor retrace, even while dragging or resizing. Why should we settle for less than 100%?
I've been running 2.4.x kernels since 2.4.0 on my desktop machine. I tried every one except 2.4.11. Never had any problems, even considering I have an all-reisefs box.
On my company's server, however, I'm still running 2.2.19. The servers have been running along nicely, so there is no reason to upgrade. If it ain't broken, don't fix it.
Running mission-critical servers on cutting edge kernels is just silly, and if the servers crash and lose all the data it's the sysadmin's fault, not Linux's.
Just my 2 eurocents.
IMHO, the real problem is the stock kernel.
It is commom to see that the stock kernel has lots
of missing patchs to increase stability and as pointed out by
Rik van Riel which was posted
here in slashdot, Linus rejects
random patchs which cause some areas of the kernel to not be "as good as it should".
The VM is one part which Linus just got random
patchs from Riel and rejected some of them randomically which made the VM suck so hard in
earlier stock 2.4 kernels.
OTOH, kernels shiped from distributions includes
(at least it should) the missing parts and should
be better than the stock kernel from kernel.org .
I don't use Mandrake to tell how good their
kernel is or is not. But I use
Conectiva Linux and I know how good their kernel package is.
Their kernel includes missing fixes that do not get over the stock kernel.
Better of all, their kernel maintainer is
Marcelo Tosati
who maintains the stable kernel tree now.
I think that we will see an improvement into new
2.4 releases.
The latest 2.4.17 kernels from Conectiva can be found in here .
I have to say that the guy who wrote this article and the other consultants seem to be a bunch of idiots. First of all, who would used Mandrake for a server. We are talking about an installation that is meant for a laptop environment, with all the bells and stuff. Also, I thought that it was a normal understanding that you should not used a brand new kernel for anything major, since it by default will lack some stability. But here is this guy installing it and then complaning about stability. What a surprise!
I personally haven't had any problems with 2.4.x. This puppy works 24/7 without any problem whatsoever. Although I definitely feel that the new VM is better than the old one by far. Definitely, Linus made the right decision by yanking the old one off.
With regards to new kernels and distros, my recommendation for people that like to bitch about things is not to install bleeding edge on production systems. This should be more than obvious, in particular for consultants. I would say that the norm is or should be to wait until the kernel is on the teen numbers (i.e. X.14.X) before its used for any major installation. Again, this should be more than obvious, but it seems that people are brain-dead with regards to something so lame. And then they go around writing "Kernel of Pain" bitching articles.
So it took the Linux community a whole YEAR to stablize their Kernel? If MS doesn't have a patch out in 2 days everyone screams. I guess this proves that MS does not have the most unstahle software. Just the most talked about.
I've always found it incredible how hypocritical SOME peole can be.
The big arguement FOR linux up until recently was stability of the OS. With Windows 2000 (and XP I assume) it seems that Linux users are now the ones willing to put up with the more problematic OS, and for some of the same reasons Windows users clung to not-long-ago.
Now my question... Why use Linux? It's that needlessly complex and clunky operating system in-between Windows/OS X.1 & the *BSDs. Windows & the *BSDs being far easier to configure than Linux, with the *BSDs being faster, more secure, more stable, and simply smoother (less clunky) all around.
The *BSDs are PnP (no need for Kudzu) no kernel modules to be manually configured, and the configuration files are far simpler than ANY Linux distro (although you CAN use Sys V scripts instead if you are so inclined-anyone who uses the BSD-style scripts for awhile will not want to use anything else though).
So I ask politely, hoping to avoid flames and rants... Why choose Linux? It's not the most stable, the most secure, the fastest, the most free, the easiest.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
2.2.19 for servers @ Work, which all have uptimes aproaching 1 year.
At home I use RH 7.2, with 2.4.17 vanilla. Can't seem to figure out what uses so much memory, because I have 768MB of RAM, and after a few days, it starts to swap. The 2.2.19 machines @ work, have never dipped into swap, and most of them are 256MB or lower.
I think 2.4.17 needs to iron out VM issues, before it will start being loaded on production machines.
If linux makes any ground at all in the desktop market, do you expect each of those people to know what to do when something crashes on them? Do you think they will have a clue how to fix it or report what happened in detail?
Probably not. They'll probably hit the reset button and try again, until they get so upset they give up and dismiss Linux as a hobbyists dream.
What?
The fact is, if a kernel is marked as stable it should be stable. If the 2.4 kernel came from Microsoft you would all be bashing the hell out of them. Maybe Linus needs to slow down, and look over his code before he releases it.
Slipping Away...
First, I don't know what compelled Joshua to choose Mandrake, whose "bleeding-edgeness" usually keeps them a bit unstable and unpolished.
Second, I've no idea why he backed his customer down to red hat 6.2, maybe he didn't know that red hat 7.0 still has a 2.2 kernel and is way more modern than 6.2. Was he really *that* frightened?
Finally, I don't know what qualifies as "production use", but I have at least 3 servers with Red Hat 7.2, kernel 2.4.7, in "production" (meaning serving a lot of database-driven web pages, serving up files and working as X servers) and I've yet to have a kernel-related crash. Actually, my only downtime has been due to power outages.
"...when I upgraded from 2.2 to 2.4 (Mandrake 7.2 to 8.1), I had (still have) many stability problems..."
"...I don't know what compelled Joshua to choose Mandrake, whose "bleeding-edgeness" usually keeps them a bit unstable and unpolished..."
"...He's using MANDRAKE on a SERVER. For crying out loud, you don't use Mandrake on a server. Get something realistic like Slackware or Debian, and if you want to be a idiot use redhat, not Mandrake..."
"...First of all, who would used Mandrake for a server. We are talking about an installation that is meant for a laptop environment..."
"...i was running a 2.4 on mandrake 8.0 and had nothing but problems...."
"...I've noticed that most of the comments both in the article and others complaining about the 2.4.x kernels and various stability problems are running RedHat [redhat.com], Mandrake [mandrake.com], and even Debian [debian.org] Distros..."
huh..
Perhaps we have a Mandrake issue, here, and not really a 2.4.x kernel issue, and certainly not, as the few M$ trolls have tried to suggest, a Linux issue...
dunno..
Food for thought.
t_t_b
I'm on PJ's "enemies" list! Are you?
You should try window maker, it never crashes, there's not much to it is why. It's been around a while and doesn't get or need updates frequently so it's never bugged out on me yet! also there the new version of evolution from ximian is great. With wmaker you can still intall qt libs and gtk and gnome libs and run your old favs without the hassle of gnome nor kde! Try wmaker it's simple, lightweight, and insanely stable. i reboot here at work, well i don't actuall i shut down on friday nights. i just logout(which doesn't actually restart x, every other evening. it's rare for me to reboot! just some suggestions for ya...
...There's nothing wrong with Southern California that a rise in the ocean level wouldn't cure...
And once again, the reason Linux will never succeed on the desktop.
"My internet access doesn't work."
"You twit. Fire up the debugger like a real man. You've got source!"
"I just want to send an e-card to my granddaughter..."
As long as it demands that the user cater to the OS, the Linux community will never be more than an insulated gang of zealots.
Other causes of flicker: multiple visuals (not a problem on most Linux XFree86 systems), and toolkits (fixable with double buffering and can be reduced though not eliminated by the programmer of the toolkit).
I think the window hould be put into the toolkit. The window borders are no different than any other widget. In fact I believe far more code is expended trying to talk to a window manager than would be needed to do this in a toolkit (which already contains code to draw the buttons and borders). This would allow new ideas in window management to be experimented with, such as getting rid of the borders entirely.
The system might provide a "Task Manager" (using the term taken from Windows) that any program creating a window would talk to. The program would indicate the task that window belonged to and the name of the window itself. The task manager would send commands like "raise this window" or "map this window" or "hide this window" to the program, and by watching the visiblity and positions of windows could provide pagers, icons, and taskbar type interfaces.
I strongly believe that putting widgets into the server is BAD. If X had done this we would be using Athena widgets right now and X would look laughably bad. The fact that X can emulate Windows and Mac interface designs invented 10 years after X was is definate proof that keeping UI elements out of it was the best possible design.