What's the Damage? Measuring fsck Under XFS and Ext4 On Big Storage
An anonymous reader writes "Enterprise Storage Forum's long-awaited Linux file system Fsck testing is finally complete. Find out just how bad the Linux file system scaling problem really is."
How fast a full fsck scan runs is the least of my concerns. What about how successful they are at recovering the filesystem?
When I had some EBS problems a couple of years ago, I figured I would run xfs_check. It seemed to do absolutely nothing, even when disks in the md array were known to be bad. xfs is nice and fast, but I haven't seen xfs_check or xfs_repair do either of the things I'd assume they'd do -- check and repair. I found it easier to delete the volumes and start from scratch, because any compromised xfs filesystem seems to be totally unfixable. Is fsck for xfs new?
I do stuff Zhrodague
This just in:
Full filesystem scans take longer as the size of the filesystem increases.
News at 11.
Honey badger don't give a fsck.
A single file system that big without checking features that file systems like ZFS or clustering file stores provide seems insane to me.
I'll go tell _average joe/jane_ to go and get AIX, and dump ubuntu+unity which they like so much because it's shiny and pretty.
After evaluating our options in the 50-200TB range with room for further growth, we ended up moving away from Linux to an object-based storage platform with a pooled, snapshotted, and checksummed design. One of the major reasons was the URE (unrecoverable read error) problem: at that size, a filesystem without internal checksums would virtually guarantee silent data corruption. The closest thing in the OS world would be ZFS, whose openness is in serious doubt. It is scary how much trust the community places in spinning rust.
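To put a rough number on the URE concern above, here is a minimal Python sketch. It assumes errors are independent and uses a rate of one unrecoverable read error per 10^14 bits read, a figure commonly quoted on consumer drive spec sheets. Neither the rate nor the function name `p_at_least_one_ure` comes from the comment, so treat both as illustrative.

```python
import math

def p_at_least_one_ure(terabytes_read, ure_per_bit=1e-14):
    """Chance of hitting at least one URE while reading the given volume.

    Assumes independent errors at a fixed per-bit rate (an assumption,
    not a measured value): P = 1 - (1 - rate) ** bits.
    """
    bits = terabytes_read * 1e12 * 8  # decimal terabytes -> bits
    # log1p/expm1 keep precision when the per-bit rate is tiny.
    return -math.expm1(bits * math.log1p(-ure_per_bit))

if __name__ == "__main__":
    for tb in (50, 200):
        for rate in (1e-14, 1e-15):
            p = p_at_least_one_ure(tb, rate)
            print(f"{tb} TB at {rate:g} per bit: {p:.1%}")
```

Under these assumptions, one full pass over 50 TB has roughly a 98% chance of hitting at least one URE; at an assumed rate of 1e-15 per bit the figure drops to about a third.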
The tests are also useless since the "speed" will be linearly controlled by the IOPS of the array. Sure, it would be nice to be able to throw 10x15k spindles at 3.5TB (230 disks for the 72TB test); that's one way to improve random IO performance, but how many can afford such luxury on a big data store that could reach into the 100's of TB?
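The IOPS argument can be sketched as back-of-the-envelope arithmetic. This assumes fsck is bound purely by random reads spread evenly over the spindles; the per-spindle IOPS figures and the read count below are hypothetical placeholders, not numbers from the article.

```python
def fsck_seconds(random_reads, spindles, iops_per_spindle):
    """Estimated wall time if fsck is limited only by random-read IOPS."""
    return random_reads / (spindles * iops_per_spindle)

if __name__ == "__main__":
    reads = 500_000_000  # hypothetical metadata reads for a full scan
    # Rule-of-thumb IOPS per drive; assumed, not measured.
    for label, iops in (("15k SAS", 180), ("7200 RPM", 80)):
        minutes = fsck_seconds(reads, spindles=230, iops_per_spindle=iops) / 60
        print(f"{label}: about {minutes:.0f} minutes on 230 spindles")
```

In this model the estimate scales inversely with both spindle count and per-drive IOPS, so halving either one doubles the time.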
A cranky coward from the shadows is not a reliable source of information.
I have used AIX and Solaris, and I can say that a lot of stuff is easier on Linux.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I'll go tell _average joe/jane_ to go and get AIX, and dump ubuntu+unity which they like so much because it's shiny and pretty.
Few average Joes have 72TB of disk space, and even those that do are probably OK with 30 - 60 minutes of fsck time. More likely, instead of 100's of millions of files, they have a few million, so their fsck time will be in the 3 - 15 minute range.
I've seen servers that take over 3 minutes for their POST check.
Our BTRFS evaluation resulted in rejecting it for some very serious problems ( what they claim are snapshots are actually clones, panic in low memory situations, no fsck, horrible support tools, developers who are hostile to criticism, pre-release software, ... ). ZFS was nice, but limited to non-distributed systems and still had a non-trivial amount of volume and backend management headaches. I use ZFS for my personal servers at home ( incremental snapshots are the bomb ) but our production systems needed more.
They were using 15K RPM SAS drives. Your 7200 RPM drives aren't going to touch the speed of 15K RPM drives on a SAS backplane. Not by a long shot.
If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.
I like how you completely ignored Solaris yet still presented the comment as if it was a valid counterargument.
I also like how GP completely ignored Solaris. I just like the fact it is being ignored.
Fear is the mind killer.
ZFS now runs pretty well on Linux too, as a kernel module, thanks to zfsonlinux. If you're running a Debian-based distro, installing it is trivial (one command to add the PPA, one command to install the package).
killall Anonymous\ Coward
...until you have a drive die during a scrub, destroy a zfs filesystem in a deduplicating zpool, or do any number of other things that make ZFS **ANGRY**, that is. And despite all that, I still trust it more than most Linux filesystems.
You see my nick?
AIX sucks more than Linux.
Usual process for "weird"* AIX Problems:
1) weird problem occurs after install. You report problem to IBM.
2) IBM asks for your software versions, sees they are the newest ones available, and says they will look into it.
3) You ask several months later if they found anything. They ask for your software versions, then ask you to upgrade and see if the problem goes away.
4) You upgrade to newest version.
5) go to 2)
*There are of course non-weird problems where you get the answer from IBM support in 2-3 days, and from Linux forums in 2-3 minutes.
And ZFS is available for Mac OS X as an add-on: there is an open-source version and, as of this week, a commercial one.
There is very little reason to be running a system without ZFS, unless you are running AIX, HP-UX or IRIX.
Why would you replace a zero-ed string with another? At least use /dev/random, bro.
Nerdy news for your nerdy needs? http://www.soylentnews.org Soylent News is people!
When an article about fsck has a tag line of "What's the damage", I expect to see some discussion of how fsck deals with a damaged file system.
The time required to fsck a file system that doesn't need checking is less interesting and inconsistent with the title. Although, if fsck had complained about the known clean file system, that would be interesting.
No, you're thinking of ReiserFS.
Works best if you use the "Doom as Sys Admin" hack.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
A lot of stuff is also faster on Linux, particularly on the x86. Solaris x86 is dog slow. AIX ("aches") is an appropriate name for a mainframe OS that never really got the hang of this new-fangled "interactive user" stuff. It's a good mainframe OS, that is what it is designed for, tuned for and intended for, but traditional mainframe batch transactional work isn't the sort of payload that is typically run these days. The high-end users want hard real-time (i.e.: they know to the microsecond - or nanosecond, in some cases - exactly when each process will start and stop) for data collection, data analysis and simulation. The data centers want massive multithreading for gigantic servers with minimal overhead and service guarantees per thread. The typical user wants extremely low latency interactive. None of these are pre-scripted batch jobs.
Now, if you wanted to develop a data warehouse for, say, technical writings, journalism, etc, where you're compiling a collection of things that can be typeset overnight, that may be doable as a batch job. However, anyone planning on publishing a journal that needs 72 terabytes of storage had best consider the marketplace a little more closely first. A publishing company, say Nature, might conceivably have use for AIX for batch work. I could see the number of submissions, referee responses and article selections per journal being such that a mainframe would be a perfectly valid way to do things. Even then, it might still be sufficiently small that a live transactional database would be more cost-effective.
Traditionally, batch processing has been a niche market for electrical and gas companies, etc, where the number of customers is staggering. Even then, it has largely been replaced with live transactional systems because customers want things adjusted NOW and not overnight or at the end of the week.
Mass mailers still use batch processing, but printing is the bottleneck and there is no point in having an expensive OS process everything in a fraction of a second on an expensive mainframe when it takes N actual real-world seconds before a printer becomes available to take the next block of data. You need run no faster than the slowest component because the end product won't be delivered any faster. You would have to have a gigantic number of printers before the OS became a significant factor and most shops just don't have that kind of printing power.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
There are of course non-weird problems where you get the answer from IBM support in 2-3 days, and from Linux forums in 2-3 minutes.
I really wouldn't paint Linux support in such rosy terms. Many forums are heading in the direction of the blind leading the blind; application-specific mailing lists and IRC channels, while improving, still have a slight tendency to say "RTFM n00b!". (Or, as happened to me, "Can't be done. It's a stupid demand anyway. Fuck off" - twenty minutes later I figured out how to do it on my own, so it evidently could be done...)