Operational Testing of Linux Kernel 2.5.x
G3ckoG33k writes "The Open Source Development Lab has begun operational testing of the 2.5.x Kernel: "The staff at OSDL has been involved with development and testing of 2.5 since the beginning and we've noticed that it seems to be very stable for a development tree. So good, in fact, that we think it is ready to be tested in a production environment. We have planned and begun execution of a project to test the 2.5 kernel in our data center using our production environment. The project includes lots of testing and lots of escape hatches so we don't run recklessly off the edge. We began with some of the simpler, less critical servers and, as we build confidence, are moving to the more complex servers. Today we have several servers running 2.5 and within a month we'll have most of the data center migrated to 2.5." Can anyone say daredevils?"
I've been trying out 2.5 for quite a while now with varying degrees of success.
It would be great to hear from more people like OSDL that it's working well.
Unfortunately, unless RH9 comes with module-init-tools, it will still be a pain to try out the 2.5 kernel.
The reason it's not for production use isn't because it is necessarily crash prone... it's because it can break drastically between minor versions as features are added/changed.
There's a reason people don't use 2.5. It's the DEVELOPMENT kernel. You SHOULD NOT be using it for production use. Often things will break. Sometimes it will cause hard disk corruption. It wouldn't be the first time.
Please, fellow slashdotters, don't be tempted to use 2.5 for your important systems. It's good that it's tested more, but if you do use it, please don't bitch and whine about how it destroyed all your data.
This is however still a DEVELOPMENT kernel. I put that in big letters because it's very, very true. Lots of kernel modules won't compile still. Documentation for what has changed is somewhat spotty, and it took me some time to get everything working decently. And getting a system that can boot into 2.4 or 2.5 seems quite difficult with the new modutils package (or at least I haven't gotten it working yet - have to reinstall modutils RPM if I want to boot into 2.4).
Also there's a major bug with ext3 right now in 2.5.66 - if your computer doesn't shut down cleanly, the journal recovery in 2.5 seems completely broken - I have to reboot into 2.4, let the 2.4 kernel do the journal recovery, do a clean shutdown, and THEN boot back into 2.5. Pain in the ass, especially since I've had two hard crashes since I upgraded to 2.5. Also 2.5.66 doesn't compile out of the box with the default config. Had to patch one file with a patch from LKML.
So in short, 2.5 may be more stable than usual devel branches, but don't delude yourself about what you are getting into. If you want the latest and greatest in performance for your desktop machine, give it a try. But I wouldn't run even a low uptime-requirement server with it yet.
They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
Ben
I've been running 2.5.x on both my station at work and at home. For the most part it's been pretty stable.
I've run 1.3, 2.1, 2.3 and now 2.5 kernels as they came out, and 2.5.5x and on have been a pleasure. I had a 2.1.x kernel eat my file system; I've had nothing like that so far.
Now the caveat: don't run a 2.5.x kernel unless you're willing to lose everything, and back up regularly! And most important (because I don't think anything bad will happen), be prepared to write bug reports correctly! READ THIS AFTER DOWNLOAD! linux-2.5.x/Documentation/BUG-HUNTING
"think of it as evolution in action"
Having gone through high cpu/disk load crashes over multiple kernels, I would suggest a good test plan before embarking on any new kernel.
Our most recent experience with 'stable' kernels (specifically drivers in our case) was the default kernel in RH 8. It had some very subtle issues with Intel's GigPHY/MAC chipset that caused crashes only under specific high load, every three to four days. The crashes could not be reproduced on a fixed schedule but would eventually happen. I suggest finding a characteristic set of applications that load disk, memory and CPU, and then testing your favorite kernel under all those circumstances. Programs that run huge FFTs or other number crunching are often too specific to cause failures. In our case, a program calculating huge FFTs while doing looping network file transfers tested without issues... nothing beats the real thing!
Also don't think that even 2.4.x series kernels are above this... as I stated earlier even a heavily patched 2.4.18 kernel could be your downfall... so maybe a 2.5.x kernel is okay but beat the crap out of it before putting both feet in.
-Ho
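A rough sketch of the kind of mixed-load test described above, assuming Python 3 on the test box. The function names, block size and durations are made up for illustration, and the network-transfer part is left out. It runs CPU burners in separate processes (so an SMP kernel gets exercised) while the parent loops disk writes and read-backs.

```python
import os
import subprocess
import sys
import tempfile
import time

# Child-process snippet: spin on floating-point work for argv[1] seconds.
CPU_BURN = (
    "import sys, time\n"
    "end = time.monotonic() + float(sys.argv[1])\n"
    "x = 0.0001\n"
    "while time.monotonic() < end:\n"
    "    x = (x * x + 0.5) % 1.0\n"
)


def disk_loop(seconds, directory, block_size=1 << 20):
    """Repeatedly write, fsync, read back and compare one block.

    Returns (passes, mismatches). The read-back will usually be served
    from the page cache, so this mainly stresses the write path.
    """
    deadline = time.monotonic() + seconds
    path = os.path.join(directory, "loadtest.bin")
    passes = mismatches = 0
    while True:
        data = os.urandom(block_size)
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        with open(path, "rb") as f:
            if f.read() != data:
                mismatches += 1
        passes += 1
        if time.monotonic() >= deadline:
            break
    return passes, mismatches


def run_mixed_load(seconds=60, cpu_workers=2):
    """Run CPU burners in child processes while looping disk I/O."""
    burners = [
        subprocess.Popen([sys.executable, "-c", CPU_BURN, str(seconds)])
        for _ in range(cpu_workers)
    ]
    with tempfile.TemporaryDirectory() as tmp:
        passes, mismatches = disk_loop(seconds, tmp)
    exit_codes = [p.wait() for p in burners]
    return {"passes": passes, "mismatches": mismatches,
            "cpu_exit_codes": exit_codes}
```

A synthetic loop like this is only a starting point. As the post says, the failures showed up under the real workload after days, so it would need to run for a long time and alongside the actual applications.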
This reminds me- one problem I've always had is that new stuff that gets thrown into the kernel isn't clearly explained- in the most basic ways. Ie, what the heck is it? I remember lots of versions of 2.4 had features and options with no help to explain what they did. Google searches don't always turn up anything handy- often they turn up lots of hits on patches or posts talking about the feature, but not describing what it actually is.
Anyway, For those wondering what the heck cpufreq is...From a kerneltrap interview:
JA: You also mentioned working on the x86 side of Russell King's cpufreq code. We spoke with Russell King in an earlier interview, but we didn't talk about cpufreq. What is it?
Dave Jones: Quite a few CPUs these days allow changing of the voltage/multiplier/bus speed through software. Russell and Erik Mouw did a bunch of work on the ARM CPUs that support this feature, and started writing a generic framework for this type of technology so that they wouldn't have to duplicate code that, for example, recalculates loops_per_sec in every speed scaling.
etc.
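To make the "recalculates loops_per_sec" part concrete, here is a minimal sketch of the arithmetic, assuming the delay-loop calibration value scales linearly with the clock frequency. This is not the kernel's code; the function name and the kHz units are chosen for illustration.

```python
def scale_loops(old_loops, old_khz, new_khz):
    """Rescale a delay-loop calibration value for a new CPU frequency.

    Assumes the number of loops per time unit is proportional to the
    clock speed, which is the premise behind rescaling at all.
    """
    if old_khz <= 0 or new_khz <= 0:
        raise ValueError("frequencies must be positive")
    # Integer arithmetic; multiply before dividing to limit rounding loss.
    return old_loops * new_khz // old_khz
```

Halving the clock halves the value: `scale_loops(1_000_000, 800_000, 400_000)` gives `500000`. A generic framework lets every speed-scaling driver share one such routine instead of carrying its own copy.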
Please help metamoderate.
I'm running the latest NVidia drivers with 2.5.65 and they work fine (after patching of course). The patches can be found on the internet if you look around.
You should only use a development kernel in a production environment if you've already tested it extensively and found it to have no problems with your particular load on your particular hardware with the options you're using. Of course, if you're OSDL, you can actually do this sort of testing, but practically everyone else doesn't have the spare hardware and test suites necessary.
I've been running 2.5.xx on my home server since xx was >30ish. not really by choice either. 2.4.xx (for all xx) is grossly unstable for my combination of hardware. I got ide irq timeouts which brings the machine crashing down often before it had finished booting. It would run in uniprocessor mode but what's the point of that!!!
I have one problem with hostap (wireless access point drivers) and my sound card sharing an interrupt which causes a crash occasionally, but if I don't load the sound drivers it never crashes.
My hardware is:
ABIT BP6 mb using onboard ata66 ide
2 x Celeron 400 (SMP kernel)
TV card
sound card
NVidia gfx card
wireless card
network card
No, the driver for my card is just getting that feature. In fact, the Windows driver doesn't even have it (but Windows has a kernel mode mixer so programs can still play sounds at the same time, the CPU does the mixing). Other Linux drivers have had it for some time, I understand. And I'm talking about 2 PCM streams here, as in two programs playing sounds at the same time. It has always been able to mix the microphone, line in, MIDI, CD, and other channels of course.
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
I'm running 2.5.65-mm4 on my home box because I wanted to find out what all the excitement and nice numbers about the new scheduler are about. After I got all the modules right, I did some tests ... and was a bit disappointed. You see, it's not all that much faster ... it just feels different. Yes, programs do load somewhat faster, but at the same time doing an ls -l in my home dir was kinda slower than with the excellent WOLK patchset for 2.4.18. On the other hand, I was completely able to browse my large inbox (~20k mails in maildir) while checking the md5 of the latest knoppix iso on the same disk.
... i just can't wait to test the 'fixed up' promise driver and ide tcq code! Right now ide tcq on promise is somewhat borken. If ide tcq shows some numbers, that would be the last argument down for scsi vs. ide in our servers...
I have a lot of expectations of the Alan/Andre team with their ide work
I don't understand. I run 2.4.20, standard Debian package. I am listening to the Ozric Tentacles with XMMS as I write. As a test I installed mpg321 and played an Eddie Izzard track from the command line at the same time. No problem, mixed seamlessly.
Now, let's try more channels...
Now, I've mixed Wagner, Ride of the Valkyries into that too. I'm kind of dizzy, but it all works.
Maybe this is a feature of the EMU10k1 driver, or something, but it just works for me.
Woooah my head is spinning! Stop!
Yours Sincerely, Michael.
Nah - the whole purpose of ntfs on linux is to share data with windows, especially on a dual boot setup - no production linux server is going to be using ntfs for anything serious.
If you follow the mailing list (mandatory if you want to run a development kernel, of course) just stay a few versions behind. Download the latest 2.5, but don't install it. Read the list for a couple of weeks. If nobody mentions any show-stopping bugs in your version, you're probably safe to go ahead with the install, and you'll know what to expect from others' posts. DON'T post incessant questions to the list asking whether each new version is OK to run - just watching for bug reports gives you enough information, and doesn't annoy the ML.
This strategy comes from (but probably does not originate with) the FreeBSD-STABLE community.
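The "stay a few versions behind" rule can be written down as a small selection function. This is only a sketch: the two-week soak period, the function names and the data layout are assumptions, and deciding what counts as a show-stopper still means reading the list yourself.

```python
from datetime import date, timedelta


def version_key(version):
    """Turn '2.5.65' into (2, 5, 65) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_release(releases, showstoppers, today, soak_days=14):
    """Return the newest release that has soaked long enough with no
    show-stopping bug reported against it, or None if nothing qualifies.

    releases:     dict mapping version string -> release date
    showstoppers: set of version strings with known serious bugs
    """
    cutoff = today - timedelta(days=soak_days)
    candidates = [
        version for version, released in releases.items()
        if released <= cutoff and version not in showstoppers
    ]
    return max(candidates, key=version_key, default=None)
```

A release that came out yesterday never qualifies, however quiet the list is, which is the whole point of waiting.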
Most desktop apps at least support going through arts, esd or some other software mixer, so while it's kind of a crappy solution it's not that much of an issue.
I'm not sure about most of the cards available these days, but I do know that at least the SBLive (and the linux drivers, both alsa and the old oss ones) allows for hardware mixing of multiple channels (not sure how many but it's way more than just 2).
The ALSA drivers ARE of a much higher quality though. Has anyone else noticed that if you put the PCM volume all the way up on pretty much any sound card with the old OSS drivers you start getting nasty distortion (well not really nasty, but for anyone that's picky about audio it's pretty nasty)? Main volume is fine all the way up, and it's definitely not a speaker issue. At about the same overall volume level, PCM down with Main all the way up vs PCM all the way up with Main down, there is most certainly a difference in the quality of sound.
I've seen this with SBPCI 128, SBLive, and on my 800MHz iBook as well, so it's not even limited to one platform.
Regardless of whether you use lilo or grub, you can have the option of booting multiple kernels as long as you have room for them in /boot. When you install a new development kernel, edit the appropriate boot loader configuration file to make sure you can still boot to a stable kernel (e.g., 2.4.X). I have only had a couple of instances where a new development kernel either wouldn't boot or was unstable once it did. I documented the bug, in a couple of instances helped test the patch, and could always drop back to my stable kernel while things didn't work. Also, once you get a development kernel that seems stable with your rig, that joins the stable production kernel in your boot configuration. If nothing else, putting "mileage" on even a less than current development kernel helps since there *could be* a lurking time dependent error (haven't hit one but could happen).
So you end up with usually three and sometimes a few more kernels to choose from when you boot:
1) stable production (2.4.X)
2) seems to be stable development
3) current development
When a "current development" kernel seems to be stable, it becomes your new "seems to be stable" dev kernel and you can drop the old "seems to be stable" version. Just be sure to weed out old kernels from /boot so you don't fill the partition.
Unless your rig is a completely stock retail box, chances are your specific combination of peripherals and software are unique. So there is no guarantee that your specific configuration will be stable with a development kernel. The beauty of it is, that's a question only you can answer.
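The three-slot rotation described in this post can be sketched roughly as below. The slot names are invented for illustration. The GRUB legacy stanza uses placeholder values for the GRUB root and the root device, and assumes /boot is its own partition with kernels named vmlinuz-VERSION; adjust all of that for a real machine.

```python
def promote_current(slots):
    """The current dev kernel has proven itself: promote it and retire
    the previously trusted dev kernel.

    slots: dict with keys 'stable', 'trusted_dev', 'current_dev'.
    Returns (new_slots, retired), where retired is the version whose
    files can be removed from /boot (or None).
    """
    retired = slots["trusted_dev"]
    new_slots = {
        "stable": slots["stable"],
        "trusted_dev": slots["current_dev"],
        "current_dev": None,  # filled in when the next dev kernel is built
    }
    return new_slots, retired


def menu_order(slots):
    """Kernels in boot-menu order, skipping empty slots."""
    order = (slots["stable"], slots["trusted_dev"], slots["current_dev"])
    return [version for version in order if version]


def grub_stanza(version, grub_root="(hd0,0)", root_dev="/dev/hda1"):
    """Format one GRUB legacy menu.lst entry.

    grub_root and root_dev are placeholders; with /boot on its own
    partition, kernel paths start at that partition's root.
    """
    return (
        f"title Linux {version}\n"
        f"root {grub_root}\n"
        f"kernel /vmlinuz-{version} ro root={root_dev}\n"
    )
```

Keeping the stable kernel as the first entry means a bad development build never leaves you without a known-good way to boot.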
They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
Ben
at > US$299.00 vmware is not for everyone...
Putting media services in the X protocol would probably be a good idea, but it is orthogonal to a kernel mode audio mixer. Some programs will always want to access the hardware more directly, and the server itself still needs to access the hardware. The kernel API will still be used, it needs to work well.
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
Apparently you've discovered that the EMU10k1 driver from the 2.4 series also has this ability. Some cards have the ability to mix up to 8 streams, I've heard. If you keep opening up more mpg123s eventually you will hit a wall. The mixing was new to me since my sound card driver (ESS something-or-other) didn't have this capability in 2.4. The new ALSA driver for my card in 2.5 does. Also it doesn't have the annoying bug where every time XMMS switched songs, it swapped the left and right channels :-) The old ESS driver really wasn't very good.
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
Interactive performance - Pretty sharp. I/O background load really doesn't put much of a burden on foreground stuff, but then, 2.4 + preempt patches didn't either. Resizing is weird. Resize slowly, and the effect is like kernel 2.4 (canvas lags behind window frame). Resize fast, and the effect is like OS X, the window frame lags while the canvas catches up. Both kinda suck. CPU background load (MP3 compression) causes the machine to feel like an XP machine -- big 10-15 pauses.
CD drivers - They suck. Certain CDs (Evanescence's Fallen) will cause the CD drive to go into spasms. This doesn't happen under 2.4.
I/O scheduler - Gimpy. Under heavy CPU load (the aforementioned MP3 compression) starting an app that isn't in cache will take tens of seconds.
Compile performance - awesome. I use Gentoo, and I've noticed big improvements.
Power management - Mediocre. APM is alright. ACPI sucks. Causes weird beeping noises when I try to load the "processor" module. It's probably a fault of my Inspiron 8200's fsck'ed DSDT, so I won't bitch, but WinXP has no problem with it.
Stability - Surprisingly good, for development code. A far cry from 2.4, crashes maybe once a week, but much better than the 2.5.20-something releases, which once hosed my entire partition when I burned a bad CD...
A deep unwavering belief is a sure sign you're missing something...
The emu10k1 chip that the SB Live is built around can mix 32 channels together like that in hardware. The Linux driver provides support for that ability by letting you open /dev/dsp for output up to 32 times simultaneously.
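A quick way to probe how many simultaneous opens a driver allows is to keep opening the device until it refuses. The sketch below assumes an OSS-style device node at /dev/dsp and that a busy device fails the open instead of blocking when O_NONBLOCK is set; the function name and the limit of 40 are arbitrary.

```python
import os


def count_concurrent_opens(path="/dev/dsp", limit=40):
    """Open `path` for writing repeatedly without closing anything in
    between, and return how many opens succeeded before the first failure.
    """
    fds = []
    try:
        for _ in range(limit):
            try:
                # O_NONBLOCK so a busy device returns an error rather
                # than making the open wait (driver-dependent behaviour).
                fds.append(os.open(path, os.O_WRONLY | os.O_NONBLOCK))
            except OSError:
                break
        return len(fds)
    finally:
        for fd in fds:
            os.close(fd)
```

On a driver without hardware mixing the count should stop at 1; what it reports on any particular card is something to check on that card.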
:)
Simple
I actually *do* try to do that in the -mjb tree already ... I collect bug fixes, performance improvements, and diagnostics tools (so if it does break, you can find what went wrong).
Staying one kernel release back will help you too
... ie we're on 2.5.66 now, so run 2.5.65-mjb2 (the latest 65 version). And don't turn on preempt (it's broken in my tree by something I did).
The interactivity tweaks are against the O(1) scheduler, so won't do you much good in 2.4
.. unless you run 2.4-aa or something.
If you're going to upgrade from 2.4 to 2.5, make sure you compile input support in (not off, or as a module), and turn VT console support on explicitly. Those are the usual tarpits for new 2.5 people.