What's the Damage? Measuring fsck Under XFS and Ext4 On Big Storage
An anonymous reader writes "Enterprise Storage Forum's long-awaited Linux file system Fsck testing is finally complete. Find out just how bad the Linux file system scaling problem really is."
The only thing Linux does well is to suck. Use AIX, Solaris or HP-UX if you want real UNIX not kiddy shit like Linux.
NOT!
BOOYA!
How fast a full fsck scan is is my last concern. What about how successful they are at recovering the filesystem?
Because of politically correct speech, I read the headline as "What's the Damage? Measuring fuck Under XFS and Ext4 On Big Storage"
Jessie Mother fscking crisko!
When I had some EBS problems a couple years ago, I figured I would run xfs_check. It seemed to do absolutely nothing, even when there were disks known to be bad in the md array. xfs is nice and fast, but I haven't seen xfs_check or xfs_repair do either of the things I'd assume they'd do -- check and repair. I found it easier to delete the volumes and start from scratch, because any compromised xfs filesystem seems to be totally unfixable. Is fsck for xfs new?
I do stuff Zhrodague
They're testing 70 TB of storage, so with current hard drive quality, the odds of an unrecoverable read error are probably close to 100%. It would be simpler to write a two-line fsck utility to report it:
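As a rough check on the odds quoted above (this is not the two-line utility itself, which is not reproduced here), here is a minimal Python sketch. It assumes an unrecoverable read error rate of 1 per 10^14 bits read, a figure commonly quoted on consumer drive spec sheets, and it assumes errors are independent. Both are assumptions for illustration, not numbers from the article, and the function name is hypothetical.

```python
import math

def p_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
    """Probability of hitting at least one unrecoverable read error.

    ure_per_bit is an ASSUMED spec-sheet rate (1 error per 1e14 bits);
    errors are ASSUMED independent, which real drives do not guarantee.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 so tiny p stays accurate.
    return -math.expm1(bits * math.log1p(-ure_per_bit))

if __name__ == "__main__":
    seventy_tb = 70 * 10**12  # 70 TB in decimal bytes
    print(f"1e-14 per bit: {p_at_least_one_ure(seventy_tb):.4f}")
    print(f"1e-15 per bit: {p_at_least_one_ure(seventy_tb, 1e-15):.4f}")
```

Under the 1e-14 assumption a full read of 70 TB lands above 99%, which is consistent with "close to 100%". With the 1e-15 rate often quoted for enterprise drives it drops to well under half, so the conclusion depends heavily on which rate applies.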
This just in:
Full filesystem scans take longer as the size of the filesystem increases.
News at 11.
The ext4 fsck on the 50%-loaded 72TB volume (32TB, 105 million files) took only an hour. I wish my drives at home would fsck that fast, and I only have 2TB formatted as XFS.
Honey badger don't give a fsck.
A single file system that big without checking features that file systems like ZFS or clustering file stores provide seems insane to me.
A much better test of linux "big data"
1) write garbage to X blocks
2) run fsck if no errors found, repeat step 1
How long would it take before either of these filesystems noticed a problem and how many corrupt files do you have? With a real filesystem you should be able to identify and/or correct the data before it takes out any real data.
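The "write garbage to X blocks" step could be sketched roughly as follows in Python. This is only an illustration against a scratch image file, never a real device; the function name, the 4 KiB block size and the seed parameter are all assumptions, not anything the article used.

```python
import os
import random

def corrupt_blocks(path, count, block_size=4096, seed=None):
    """Overwrite `count` distinct, randomly chosen blocks of `path` with garbage.

    Intended for a scratch filesystem IMAGE only. Returns the sorted list
    of block numbers that were overwritten, so later damage can be traced.
    """
    rng = random.Random(seed)
    total_blocks = os.path.getsize(path) // block_size
    chosen = sorted(rng.sample(range(total_blocks), count))
    with open(path, "r+b") as image:
        for block in chosen:
            image.seek(block * block_size)
            image.write(os.urandom(block_size))  # random bytes as "garbage"
    return chosen
```

Each round would be followed by a read-only check of the image; if it reports nothing, call `corrupt_blocks` again and keep a running total of how many blocks were damaged before the checker noticed.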
"If you need to fsck you should already be restoring from backups"
You do realize how long it would take to restore 72tb on the class of hardware they were testing?
The lengthy delay in obtaining the results is due to the lack of hardware for testing time waiting for fsck to finish.
Okay, so ext4 takes longer to fsck than XFS does.
Let's look at how they set up the scenario. They made a bunch of RAID6 arrays with two spares each, and *then* made a striped RAID of those to get 72TB. This tells me that they're storing data where uptime is paramount. So, you're not in an organization where you can answer the red phone in your server room and go "Well, we're checking the drive for errors. Our 72TB of business data will be back on line in about a half-hour". So, you've certainly got hot-spares for fail-over, right?... which means that it kinda doesn't matter *how* long your primary is down (within reason, of course). I say "within reason" because the biggest discrepancy I see in their results between ext4 and XFS is about a factor of 8 (about a half-hour for ext4 as opposed to XFS's 4.5 minutes).
Their message seems to be that, if you've got 72TB of data on an array with ext4 and your only way of getting it back is with fsck, you're in a bit of trouble.
Personally, I'd shorten the message by taking the "with ext4" part out.
What system did you end up going with?
How do you back it up?
http://www.enterprisestorageforum.com/print/storage-hardware/linux-file-system-fsck-testing----the-results-are-in.html
going through 3 pages is so annoying...
> I set up an xfs volume a couple years back. After copying a few files over nfs, it became corrupted. the xfs fsck did something -- it told me that it was so corrupted, it couldn't be fixed.
Well, why don't you quote something from even older -- say linux 0.1 ? If that makes you feel better
XFS as a fs on linux (on SGI it was long time back, i am referring to the port) has matured way better over the years.
Also, xfs has no fsck -- sure it is not a case of mistaken identity ?
You need to use xfs_repair if *required* after dirty playback.
Each pool is a LUN that is 3.6TB in size before formatting or actually 3,347,054,592 bytes as reported by "cat /proc/partitions".
a file system with about 72TB using "df -h" or 76,982,232,064 bytes from "cat /proc/partitions"
Yeah, I think there's definitely a scaling problem there.
Or perhaps a reading comprehension problem, since /proc/partitions reports in blocks, not bytes, but either way it doesn't inspire any kind of confidence in the rest of their testing methodology.
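To make the units point concrete: the block counts in /proc/partitions are in 1024-byte units, so the quoted figures can be converted like this. A small Python sketch; the helper name is made up for illustration.

```python
KIB = 1024
TIB = 1024 ** 4

def proc_partitions_blocks_to_tib(blocks):
    """Convert a /proc/partitions block count (1 KiB units) to TiB."""
    return blocks * KIB / TIB

if __name__ == "__main__":
    per_lun = 3_347_054_592      # figure quoted for one LUN
    whole_fs = 76_982_232_064    # figure quoted for the whole array
    print(f"LUN:   {proc_partitions_blocks_to_tib(per_lun):.2f} TiB")
    print(f"Array: {proc_partitions_blocks_to_tib(whole_fs):.2f} TiB")
```

Read as 1 KiB blocks, the array total comes out at roughly 71.7 TiB, which lines up with the "about 72TB" that df -h shows. Read as bytes, it would be only about 77 GB.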
Don't use xfs_check -- it is slow, instead run xfs_repair in -n mode
Also, there is no fsck for xfs -- for people interested in details -- it runs a playback on dirty log during a mount, a xfs_repair may be required after that but that is optional.
In other words, people who compared xfs and ext4 are not aware of this in my opinion.
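For anyone wanting to script the dry run, a minimal Python sketch follows. It relies on xfs_repair's documented behaviour that, in -n (no modify) mode, it exits with status 1 if corruption was detected and 0 if not. The wrapper name and the injectable `run` parameter are assumptions added here so it can be exercised without a real device.

```python
import subprocess

def xfs_dry_run_clean(device, run=subprocess.run):
    """Run `xfs_repair -n DEVICE` and report whether it found no corruption.

    -n is no-modify mode: the filesystem is only inspected, never changed.
    The filesystem should be unmounted first. `run` is a hypothetical hook
    so a fake runner can be substituted in tests.
    """
    result = run(["xfs_repair", "-n", device], capture_output=True, text=True)
    return result.returncode == 0
```

Usage would be something like `xfs_dry_run_clean("/dev/sdX1")` as root, with the device name a placeholder; a False result means a real xfs_repair pass is worth considering.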
Quick advice / pro-tip: Don't quote EBS and performance in same line. They don't match no matter what fs you use since the underlying medium sucks. They provide storage on 'elastic' basis -- so go figure out how fast they do it when you are writing at x MB/s
When an article about fsck has a tag line of "What's the damage", I expect to see some discussion of how fsck deals with a damaged file system.
The time required to fsck a file system that doesn't need checking is less interesting and inconsistent with the title. Although, if fsck had complained about the known clean file system that would be interesting.
Wasn't this linux kernel released in, like... 2008? Surely the author could have chosen a kernel at least released in 2011? Also, the tools may be just as old. Surely an article should be written to be relevant to what's presently included in an operating system.
I mean *DEBIAN* is using 2.6.32 in their current stable, due to be released soon. Usually they're years behind. Their upcoming release uses 3.2!
And speaking of that, XFS got a really major upgrade about 3.0 which essentially builds FreeBSD-style softupdates and journalling I/O intelligence to the file system.
I expect a /. article like this to include a summary. Like, a word about what the results actually were, without having to click through twice to get to them.
1. Why did they put a label on the RAID devices? They should have just used /dev/sd[b-x] directly, and not confused the situation with a partition table.
2. Did they align the partitions they used to the RAID block size? They don't indicate this. If they used the default DOS disk label strategy of starting /dev/sdb1 at block 63, then their filesystem blocks were misaligned with their 128 kiB RAID block size, and one in every 32 filesystem blocks will span two disks (assuming 4 kiB filesystem blocks).
3. Why did they use md and not LVM? md can sometimes introduce bandwidth limits, and LVM lets you alternate between striped and linear volumes for your testing.
4. Why don't they report the raw bandwidth of the disk, and maybe some IOPS numbers?
5. Why don't they report total operations and bandwidth consumed as measured by iostat or sar?
6. Why didn't they give geometry hints to mkfs? The ext4 mkfs invocation, for example, should have included "-E stride=$[128 / 4],stripe-width=$[(10 - 2) * (128 / 4)]".
7. What about using an external journal?
8. They report that "during the file system check the server did not swap, and no additional use of virtual memory was observed." Wouldn't it have been better to just do "swapoff -a" and report that no swap was available?
9. Why didn't they (as someone else also suggested above) test an actually damaged filesystem?
10. Is there any indication other than their credentials that these people know what they're doing?
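The arithmetic behind points 2 and 6 can be checked with a short Python sketch. It assumes 512-byte sectors, 4 KiB filesystem blocks, 128 KiB RAID chunks, and 10-disk RAID6 sets with 2 parity disks (the "10 - 2" in the formula above); the function names are made up for illustration.

```python
SECTOR = 512            # assumed sector size in bytes
FS_BLOCK = 4 * 1024     # 4 KiB filesystem block
CHUNK = 128 * 1024      # 128 KiB RAID chunk

def straddling_fraction(start_sector, n_blocks=32 * 1000):
    """Fraction of filesystem blocks that cross a RAID chunk boundary."""
    start = start_sector * SECTOR
    straddlers = 0
    for i in range(n_blocks):
        first = start + i * FS_BLOCK
        last = first + FS_BLOCK - 1
        # A block straddles if its first and last byte sit in different chunks.
        if first // CHUNK != last // CHUNK:
            straddlers += 1
    return straddlers / n_blocks

def ext4_geometry(chunk_kib=128, block_kib=4, disks=10, parity=2):
    """Return (stride, stripe_width) in filesystem blocks."""
    stride = chunk_kib // block_kib
    stripe_width = (disks - parity) * stride
    return stride, stripe_width

if __name__ == "__main__":
    print("start at sector 63:  ", straddling_fraction(63))
    print("start at sector 2048:", straddling_fraction(2048))
    stride, width = ext4_geometry()
    print(f"-E stride={stride},stripe-width={width}")
```

A partition starting at sector 63 gives a fraction of exactly 1/32, matching the claim in point 2, while a start at sector 2048 (1 MiB) gives zero. The geometry helper yields stride=32 and stripe-width=256 for the values in point 6.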
I am not sure it has much impact, but why would you use a 5 year old linux kernel to perform the test? Maturity is all very nice, but if you are pushing technology, it is not always the best approach.
...other file systems, such as ZFS (doesn't it work w/ Linux?), Veritas, UFS and so on?