Slashdot Mirror


Linux 4.0 Has a File-System Corruption Problem, RAID Users Warned

An anonymous reader writes: For the past few days kernel developers and Linux users have been investigating an EXT4 file-system corruption issue affecting the latest stable kernel series (Linux 4.0) and the current development code (Linux 4.1). It turns out that Linux users running the EXT4 file-system on a RAID0 configuration can easily destroy their file-system with this newest "stable" kernel. The cause has been identified and a fix written, but the fix hasn't yet worked its way into the mainline kernel, so users should be wary of quickly upgrading to the new kernel on systems with EXT4 and RAID0.

10 of 226 comments

  1. It's RAID 0 by Anonymous Coward · · Score: 1, Insightful

    Losing data goes with the territory if you're going to use RAID 0.

    1. Re:It's RAID 0 by kthreadd · · Score: 2, Insightful

      Or it could work just fine. RAID 0 is not dangerous; you may just as well lose your data even if you only use a single drive. Hard drives and SSDs don't go bad so often that it's a problem.

  2. New version ... by JasterBobaMereel · · Score: 5, Insightful

    This is the new 4.0 kernel, a major version update, less than a month old, that most Linux systems will not have yet... and the issue has already been patched.

    Bleeding-edge builds get what they expect; stable builds don't even notice.

    --
    Puteulanus fenestra mortis
    1. Re:New version ... by Anonymous Coward · · Score: 2, Insightful

      The last major Linux version update that actually meant something was 1->2. The "major version" bumps in the kernel are now basically just Linus arbitrarily renumbering a release. The workflow no longer has a notion of the next major version.

  3. Re:Warning: RAID 0 by Enry · · Score: 2, Insightful

    RAID 0 is only as unstable as its least stable component. In most cases that's a drive failure, and most drives have fairly long MTBFs. The chances of a disk failure increase as a function of time and number of drives deployed. A two-drive RAID 0 will be more stable than a five-drive RAID 0, which will be more stable than a 10-drive RAID 0 that's three years old. With higher RAID levels, you can remove a single (or multiple) drive failure as the point of failure. In this case, the point of failure is the kernel, so it's perfectly legitimate to consider this a really bad problem. Would you say the same thing if the bug affected RAID 1 or RAID 5?
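The drive-count argument can be put into rough numbers. This is only a sketch under stated assumptions: drives fail independently, every drive has the same failure probability over the period, and the 1% figure is an illustrative placeholder rather than a measured rate. The function name is made up for this example.

```python
def raid0_failure_probability(per_drive_failure_prob: float, drives: int) -> float:
    """Probability that a RAID 0 array loses data over some period.

    RAID 0 has no redundancy, so the array is lost as soon as any one
    drive fails. Assumes independent, identically reliable drives.
    """
    if not 0.0 <= per_drive_failure_prob <= 1.0:
        raise ValueError("per_drive_failure_prob must be between 0 and 1")
    if drives < 1:
        raise ValueError("drives must be at least 1")
    # The array survives only if every drive survives.
    survival = (1.0 - per_drive_failure_prob) ** drives
    return 1.0 - survival


if __name__ == "__main__":
    # Hypothetical 1% per-drive failure probability for the period.
    for n in (1, 2, 5, 10):
        print(f"{n:2d} drives: {raid0_failure_probability(0.01, n):.4f}")
```

The risk grows with every drive added, which is the point about two-, five- and 10-drive arrays. None of this models a kernel bug, which hits the array regardless of how healthy the drives are.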

  4. Re:Warning: RAID 0 by nine-times · · Score: 4, Insightful

    Would you say the same thing if the bug affected RAID 1 or RAID 5?

    I suspect not, since his point seemed to be that you shouldn't be using RAID 0 for data that you care about anyway.

    It doesn't really make it ok for a bug to exist that destroys RAID 0 volumes, but it does mitigate the seriousness of the damage caused. And it's true: Don't use RAID 0 to store data that you care about. I don't care if the MTBF is long, because I'm not worried about the mean time, but about the shortest possible time between failures. If we take 1,000,000 drives and the average failure rate is 1% for the first year, it's not that comforting to the 1% of people whose drives fail in that first year.

  5. Re:stable by dave420 · · Score: 3, Insightful

    I understand if you are emotionally attached to Linux to the point where someone accidentally criticising it makes you feel uncomfortable, but you really should be able to figure out that "but... but... they're worse!" is no rational response :)

  6. Or just use a power of 2 chunk size? by tlambert · · Score: 3, Insightful

    Or just use a power of 2 chunk size?

    What idiot configuration did someone have to have to trigger this bug?
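For anyone who wants to check their own setup against that suggestion, here is a minimal sketch of a power-of-2 test. The function name and the sample chunk sizes are made up for illustration, and whether the bug is really limited to non-power-of-2 chunk sizes is what the comment implies, not something the summary confirms.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive integer power of two."""
    # A power of two has exactly one bit set, so clearing the lowest
    # set bit with n & (n - 1) leaves zero.
    return n > 0 and (n & (n - 1)) == 0


if __name__ == "__main__":
    # Hypothetical chunk sizes in KiB.
    for chunk_kib in (64, 384, 512):
        label = "power of 2" if is_power_of_two(chunk_kib) else "not a power of 2"
        print(f"{chunk_kib} KiB: {label}")
```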

  7. Re: stable by andydouble07 · · Score: 1, Insightful

    Meanwhile, my Win keeps BSOD.

    Really? Sounds like you're screwing something up pretty badly; I haven't seen one of those in about 6 or 7 years.

  8. Re: stable by oobayly · · Score: 3, Insightful

    It's not. However, it isn't beyond a reasonable expectation that a dodgy touchpad driver shouldn't be able to kill an OS.