Linux 4.0 Has a File-System Corruption Problem, RAID Users Warned
An anonymous reader writes: For the past few days kernel developers and Linux users have been investigating an EXT4 file-system corruption issue affecting the latest stable kernel series (Linux 4.0) and the current development code (Linux 4.1). It turns out that Linux users running the EXT4 file-system on a RAID0 configuration can easily destroy their file-system with this newest "stable" kernel. The cause and a fix have been identified, but the fix hasn't yet worked its way into the mainline kernel, so users should be wary of quickly upgrading to the new kernel on systems with EXT4 and RAID0.
I'll stick with Windows Vista, thanks.
If you ever owned horses, you would understand what "stable" means in this context
This is the new 4.0 kernel, a major version update less than a month old, that most Linux systems will not have yet... and the issue has already been patched.
Bleeding edge builds get what they expect, stable builds don't even notice
Puteulanus fenestra mortis
md raid. The bug was in the commit "md/raid0: fix bug with chunksize not a power of 2", causing, you guessed it, a bug with a chunksize not a power of two. I guess "fix" was a bit diversionary.
The real problem was a macro modifying arguments that were later expected to be the unmodified version.
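For anyone wondering what that class of bug looks like, here is a minimal sketch in C. It is not the actual md/raid0 code: DIV_IN_PLACE, split_buggy and split_fixed are hypothetical names made up for illustration. The macro is assumed to behave like a kernel-style division helper that stores the quotient back into its first argument and evaluates to the remainder.

```c
#include <stdint.h>

/* Hypothetical helper (not the kernel's): divides *n in place and
 * returns the remainder. */
static uint64_t div_in_place(uint64_t *n, uint32_t base)
{
    uint64_t rem = *n % base;
    *n /= base;
    return rem;
}

/* The macro hides the fact that its first argument gets overwritten. */
#define DIV_IN_PLACE(n, base) div_in_place(&(n), (base))

struct split {
    uint64_t sector;  /* what the caller believes is the original sector */
    uint64_t offset;  /* offset of that sector inside its chunk */
};

/* Buggy: after the macro runs, 'sector' holds the chunk index,
 * yet it is passed on as if it were still the original sector. */
static struct split split_buggy(uint64_t sector, uint32_t chunk_sects)
{
    struct split s;
    s.offset = DIV_IN_PLACE(sector, chunk_sects);
    s.sector = sector;  /* wrong: now equals sector / chunk_sects */
    return s;
}

/* Fixed: divide a scratch copy so the original value survives. */
static struct split split_fixed(uint64_t sector, uint32_t chunk_sects)
{
    struct split s;
    uint64_t tmp = sector;
    s.offset = DIV_IN_PLACE(tmp, chunk_sects);
    s.sector = sector;  /* untouched original */
    return s;
}
```

With sector 1000 and a 384-sector chunk (not a power of two), both versions report offset 232, but the buggy one passes on sector 2 where the fixed one passes on 1000. Any later I/O aimed at the stale value lands in the wrong place.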
It's stable in terms of features and changes, i.e. no longer under development and will only receive fixes.
However! Kernels from kernel.org are not for end users; if someone is using these kernels directly, they do so at their own risk.
They are intended for integrators (distributions), whose integration will include their own patches/changes, testing, QA and end user support.
There is a reason that RHEL 7 is running Kernel 3.10 and Debian 8 is running 3.16. Those are the 'stable' kernels you were expecting.
When kernel development moved from 2.5 to 2.6 (which later became 3.0), they stopped their odd/even numbered development/stable-release cycle. Now there is only development, and the integrators are expected to take the output of that to create stable releases.
Well, there goes that slogan.
Escher was the first MC and Giger invented the HR department.
Losing data goes with the territory if you're going to use RAID 0.
In particular, RAID 0 combines disks with no redundancy. It's JUST about capacity and speed, striping the data across several drives on several controllers, so it comes at you faster when you read it and gets shoved out faster when you write it. RAID 0 doesn't even have a parity disk to allow you to recover from failure of one drive or loss of one sector.
That means the failure rate is WORSE than that of an individual disk. If any of the combined disks fails, the total array fails.
(Of course it's still worse if a software bug injects additional failures. But don't assume, because "there's a RAID 0 corruption bug", that there is ANY problem with the similarly-named, but utterly distinct, higher-level RAID configurations which are directed toward reliability, rather than ONLY raw speed and capacity.)
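To put a rough number on "worse than an individual disk", here is a small sketch in C. It assumes each disk fails independently with the same probability p over whatever period you care about, which is a simplification; the function name is made up for illustration. The array survives only if every disk survives, so its failure probability is 1 - (1 - p)^n.

```c
/* Probability that a RAID 0 array of 'disks' drives loses data, assuming
 * each drive fails independently with probability p (an illustrative
 * simplification, not a model of real drive behavior). */
static double raid0_failure_probability(double p, unsigned disks)
{
    double all_survive = 1.0;

    /* The array is intact only if every single drive is intact. */
    for (unsigned i = 0; i < disks; i++)
        all_survive *= 1.0 - p;

    return 1.0 - all_survive;
}
```

Under those assumptions, a single disk with p = 0.05 fails 5% of the time, while a four-disk RAID 0 built from the same disks fails about 18.5% of the time.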
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way