Linux 4.0 Has a File-System Corruption Problem, RAID Users Warned

Posted by timothy on Thursday May 21, 2015 @01:23AM from the don't-store-the-ark-there dept.

An anonymous reader writes: For the past few days kernel developers and Linux users have been investigating an EXT4 file-system corruption issue affecting the latest stable kernel series (Linux 4.0) and the current development code (Linux 4.1). It turns out that Linux users running the EXT4 file-system on a RAID0 configuration can easily destroy their file-system with this newest "stable" kernel. The cause has been identified and a fix exists, but the fix hasn't yet made its way into the mainline kernel, so users with EXT4 on RAID0 should be cautious about upgrading to the new kernel right away.

It's RAID 0 by Anonymous Coward · 2015-05-21 01:28 · Score: 1, Insightful

Losing data goes with the territory if you're going to use RAID 0.
1. Re:It's RAID 0 by kthreadd · 2015-05-21 05:12 · Score: 2, Insightful
  
  Or it could work just fine. RAID 0 is not dangerous; you may just as well lose your data even if you only use a single drive. Hard drives and SSDs don't go bad so often that it's a problem.
New version ... by JasterBobaMereel · 2015-05-21 01:40 · Score: 5, Insightful

This is the new 4.0 kernel, a major version update less than a month old, which most Linux systems will not have yet... and the issue has already been patched.
Bleeding edge builds get what they expect; stable builds don't even notice.

--
Puteulanus fenestra mortis
1. Re:New version ... by Anonymous Coward · 2015-05-21 02:14 · Score: 2, Insightful
  
  The last major Linux version update that actually meant something was 1->2. The "major version" bumps in the kernel are now basically just Linus arbitrarily renumbering a release. The workflow no longer has a notion of the next major version.
Re:Warning: RAID 0 by Enry · 2015-05-21 02:08 · Score: 2, Insightful

RAID 0 is only as unstable as its least stable component. Normally that component is a drive, and most drives have fairly long MTBFs. The chance of a disk failure increases as a function of time and the number of drives deployed. A two-drive RAID 0 will be more stable than a five-drive RAID 0, which will be more stable than a 10-drive RAID 0 that's three years old. With higher RAID levels, you can remove a single (or multiple) drive failure as the point of failure. In this case, the point of failure is the kernel, so it's perfectly legitimate to consider this a really bad problem. Would you say the same thing if the bug affected RAID 1 or RAID 5?
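
The point about drive count can be illustrated with a rough sketch. It assumes each drive fails independently with the same probability over some period, which is a simplification; the function name and the 1% figure are illustrative assumptions, not measured data. Since RAID 0 has no redundancy, the array is lost as soon as any one drive fails.

```python
def raid0_failure_probability(drive_failure_rate: float, drive_count: int) -> float:
    """Probability that at least one of `drive_count` drives fails.

    Assumes independent, identical failure rates (a simplification).
    With RAID 0, any single drive failure loses the whole array.
    """
    if not 0.0 <= drive_failure_rate <= 1.0:
        raise ValueError("drive_failure_rate must be between 0 and 1")
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    # The array survives only if every drive survives.
    return 1.0 - (1.0 - drive_failure_rate) ** drive_count


if __name__ == "__main__":
    # Hypothetical 1% per-drive failure rate over the period of interest.
    for count in (1, 2, 5, 10):
        risk = raid0_failure_probability(0.01, count)
        print(f"{count:2d} drives: {risk:.2%} chance of losing the array")
```

Under these assumptions the risk grows with every drive added, which is the ordering described above (two drives safer than five, five safer than ten).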
Re:Warning: RAID 0 by nine-times · 2015-05-21 02:39 · Score: 4, Insightful

Would you say the same thing if the bug affected RAID 1 or RAID 5?
I suspect not, since his point seemed to be that you shouldn't be using RAID 0 for data that you care about anyway.
It doesn't really make it ok for a bug to exist that destroys RAID 0 volumes, but it does mitigate the seriousness of the damage caused. And it's true: Don't use RAID 0 to store data that you care about. I don't care if the MTBF is long, because I'm not worried about the mean time but about the shortest possible time between failures. If we take 1,000,000 drives and the average failure rate is 1% for the first year, that's not that comforting to the 1% of people whose drives fail in that first year.
Re:stable by dave420 · 2015-05-21 03:01 · Score: 3, Insightful

I understand if you are emotionally attached to Linux to the point where someone accidentally criticising it makes you feel uncomfortable, but you really should be able to figure out that "but... but... they're worse!" is no rational response :)
Or just use a power of 2 chunk size? by tlambert · 2015-05-21 04:16 · Score: 3, Insightful

Or just use a power of 2 chunk size?
What idiot configuration did someone have to have to trigger this bug?
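
For anyone unsure what "a power of 2 chunk size" means in practice, here is a minimal sketch of the check. The function name and the sample sizes are hypothetical, and this is only the arithmetic test, not how the kernel or any RAID tool validates its configuration.

```python
def is_power_of_two(chunk_size_kib: int) -> bool:
    """Return True if the chunk size (in KiB) is a power of two.

    A positive power of two has exactly one bit set, so clearing the
    lowest set bit with n & (n - 1) leaves zero.
    """
    return chunk_size_kib > 0 and (chunk_size_kib & (chunk_size_kib - 1)) == 0


if __name__ == "__main__":
    # Hypothetical chunk sizes in KiB, chosen only for illustration.
    for size in (64, 384, 512, 1000):
        label = "power of 2" if is_power_of_two(size) else "not a power of 2"
        print(f"{size} KiB: {label}")
```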
Re: stable by andydouble07 · 2015-05-21 05:18 · Score: 1, Insightful

Meanwhile, my Win keeps BSOD.
Really? Sounds like you're screwing something up pretty badly. I haven't seen one of those in about 6 or 7 years.
Re: stable by oobayly · 2015-05-21 07:30 · Score: 3, Insightful

It's not. However, it isn't beyond a reasonable expectation that a dodgy touchpad driver shouldn't be able to kill an OS.