Btrfs Is Getting There, But Not Quite Ready For Production
An anonymous reader writes "Btrfs is the next-gen filesystem for Linux, likely to replace ext3 and ext4 in coming years. Btrfs offers many compelling new features and development proceeds apace, but many users still aren't sure whether it's 'ready enough' to entrust their data to. Anchor, a webhosting company, reports on trying it out, with mixed feelings. Their opinion: worth a look-in for most systems, but too risky for frontline production servers. The writeup includes a few nasty caveats that will bite you on serious deployments."
It says "experimental." They appreciate you helping them test their file system out. I appreciate it too, so please do. But remember that you are testing an experimental filesystem. When it eats your data, make sure you report it and have backups.
I've been happily using the XFS file system since the early-to-mid-2000s and have never had a problem. It is rock solid and much faster than ext3/ext4 in my experience, tested a lot longer than Btrfs, and handles the millions and millions of small files on redditmirror.cc very effectively.
Slashdot Valentines Beta Massacre: iT WORKED! The boycotts killed Beta!!
I think we need to talk about the oracle in the woodpile - ie, Oracle. BTRFS is an Oracle project. What happens when it goes the way of MySQL? Will Monty Wideanus appear on a white steed to save us?
Do you even lift?
These aren't the 'roids you're looking for.
Lots of production servers use Ext filesystems. If btrfs is all it should be, it will certainly replace these filesystems one day soon as the safe choice.
Sure, people use other filesystems on production Linux servers, but those are not the norm. The safe "Enterprise" (not necessarily a good thing) choice is still Ext-based filesystems.
Meanwhile ZFS announced that it was ready for production last month.
http://zfsonlinux.org/
Ugh, I'm really sorry about this post, Slashdot. I really didn't think it was going to be a "First post." What I really meant to post was
ZFS is outside the kernel tree. That is not an ideological issue, but a practical one. It means updates will not come from the normal channels, it means kernel updates from normal channels could break it, and it is not getting the attention from the kernel devs that an fs should get.
ZFS on Linux has probably had less testing than Btrfs at this point. It has had nearly no real-world testing. Just because the Solaris ZFS is great and the BSD one is coming along means nothing for the stability and correctness of the Linux port.
If you want to use a different OS, then this entire discussion is worthless. You might as well suggest switching everything to OSX and using HFS+.
Yeah, Btrfs has one major bug:
if you fill the hard drive up, you lose access to the system. You can't log in or even get access to the filesystem, and the system locks up.
With ext, things may act a bit erratic, but you can log in and delete or move things off to make room and be OK. With Btrfs you can't; if it fills up, you lose,
unless you take the hard drive out, move it to another box, mount it, and delete crap that way, but that's a pain in the arse.
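One way to avoid ending up there is to check free space before a big write. A minimal sketch in Python, assuming a hypothetical helper name `has_headroom` and an arbitrary 5% reserve (neither comes from the thread). Free-space figures reported for Btrfs can be only estimates, so treat the result as approximate.

```python
import shutil

def has_headroom(path, needed_bytes, reserve_fraction=0.05):
    """Return True if writing needed_bytes to the filesystem holding
    path would still leave reserve_fraction of its capacity free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    reserve = int(usage.total * reserve_fraction)
    return usage.free - needed_bytes >= reserve

if __name__ == "__main__":
    # Hypothetical use: about to download a 1 GiB file into the current directory.
    print(has_headroom(".", 1 << 30))
```

A cron job or download script could call this and refuse to start when it returns False, which is cheaper than pulling the drive and mounting it elsewhere.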
Ext3 is still chugging along and doing what you want. A filesystem that sacrifices everything for stability.
Not everyone has the same wants and needs. Lots of competing filesystems is a good thing; it leads to a market of ideas. Your "let's pick one and force everyone to suffer with our choice" approach just leads to stagnation and even worse results.
The problem with "XFS" eating data wasn't with XFS - it was with the Linux devmapper ignoring filesystem barrier requests.
Gotta love this code:
Martin Steigerwald wrote:
> Hello!
>
> Are write barriers over device mapper supported or not?
Nope.
see dm_request():

    /*
     * There is no use in forwarding any barrier request since we can't
     * guarantee it is (or can be) handled by the targets correctly.
     */
    if (unlikely(bio_barrier(bio))) {
        bio_endio(bio, -EOPNOTSUPP);
        return 0;
    }

Who's the clown who thought THAT was acceptable? WHAT. THE. FUCK?!?!?!
And it wasn't just devmapper that had such a childish attitude towards file system barriers:
Andrew Morton's response tells a lot about why this default is set the way it is:
Last time this came up lots of workloads slowed down by 30% so I dropped the patches in horror. I just don't think we can quietly go and slow everyone's machines down by this much...
There are no happy solutions here, and I'm inclined to let this dog remain asleep and continue to leave it up to distributors to decide what their default should be.
So barriers are disabled by default because they have a serious impact on performance. And, beyond that, the fact is that people get away with running their filesystems without using barriers. Reports of ext3 filesystem corruption are few and far between.
It turns out that the "getting away with it" factor is not just luck. Ted Ts'o explains what's going on: the journal on ext3/ext4 filesystems is normally contiguous on the physical media. The filesystem code tries to create it that way, and, since the journal is normally created at the same time as the filesystem itself, contiguous space is easy to come by. Keeping the journal together will be good for performance, but it also helps to prevent reordering. In normal usage, the commit record will land on the block just after the rest of the journal data, so there is no reason for the drive to reorder things. The commit record will naturally be written just after all of the other journal log data has made it to the media.
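The ordering argument in that quote can be sketched with a toy model in Python. This is not the real ext3 journal format, and the name `replay_is_safe` and the block labels are made up for illustration. The assumption is that recovery replays a transaction only if its commit record is on disk, so replay is safe only when every journal data block reached the media before the commit record did.

```python
def replay_is_safe(write_order, crash_after):
    """Toy model of one journal transaction.

    write_order: labels in the order the drive actually persists them,
                 e.g. ["d1", "d2", "commit"].
    crash_after: number of blocks persisted before power is lost.
    Returns True if recovery does the right thing: either the commit
    record is missing (transaction ignored) or the commit record and
    all data blocks are present (transaction replayed in full).
    """
    on_disk = set(write_order[:crash_after])
    data_blocks = {b for b in write_order if b != "commit"}
    if "commit" not in on_disk:
        return True  # recovery skips the uncommitted transaction
    return data_blocks <= on_disk  # committed: all data must be there

in_order = ["d1", "d2", "d3", "commit"]    # the order a barrier enforces
reordered = ["d1", "commit", "d2", "d3"]   # drive wrote the commit early

# Crash at every possible point and see whether recovery is safe.
print(all(replay_is_safe(in_order, n) for n in range(5)))
print([replay_is_safe(reordered, n) for n in range(5)])
```

With the in-order sequence every crash point is safe. With the reordered one there is a window where the commit record is on disk but some data is not, which is the case barriers exist to rule out and which a contiguous journal makes unlikely rather than impossible.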
I love that italicized part. "OMG! Data integrity causes a performance hit! Screw data integrity! We won't be able to brag that we're faster than Solaris!"
See also http://www.redhat.com/archives/rhl-devel-list/2008-June/msg00560.html
There's a lot more out there if you care to look.
Toss in other things like the way Linux handles NFSv2 group membership (More than 16? Let's just silently drop some!) and lots of fanbois wonder why I view Linux as little better than Windows. Hell, Microsoft may fuck things up six ways from Sunday, but they're not CHILDISH when it comes to things like data integrity.
Want more than 16TB on your server? Unless ext4 has very recently grown that support then using an ext based file system is not viable. Remember a RAID5 in 4D+P using 4TB disks will be super close to that 16TB limit. Better hope that you don't want to scale the file system up in the future.
FYI, ext4 can be larger than 16 TB but you need a newer version of the e2fsprogs than is included in a typical enterprise distribution. It's not the kernel filesystem drivers with the limitation, but the user-level utility for formatting a new filesystem.
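The arithmetic behind the 16 TB figure and the RAID5 example, as a rough Python sketch. It assumes the ceiling comes from 32-bit block numbers with 4 KiB blocks (an assumption, not something stated in this thread) and that drive vendors count a terabyte as 10^12 bytes.

```python
TIB = 2**40   # binary tebibyte
TB = 10**12   # decimal terabyte, as printed on drive labels

def max_fs_bytes(block_size=4096, block_number_bits=32):
    """Largest filesystem addressable with the given block size and
    block-number width (assumed: 4 KiB blocks, 32-bit block numbers)."""
    return block_size * 2**block_number_bits

def raid5_usable_bytes(data_disks, disk_bytes):
    """Usable capacity of RAID5: one disk's worth of parity is not counted."""
    return data_disks * disk_bytes

limit = max_fs_bytes()
array = raid5_usable_bytes(4, 4 * TB)   # 4D+P with 4 TB disks

print(limit / TIB)     # limit, in TiB
print(array / TIB)     # array size, in TiB
print(array < limit)   # does the array still fit under the limit?
```

Under these assumptions the 4D+P array still fits, but one more 4 TB data disk would push it over, which is the scaling worry raised above.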
Ext3 is still chugging along and doing what you want. A filesystem that sacrifices everything for stability.
EXT3 is actually fairly good, and the performance isn't bad _EXCEPT_ for one issue: fsync(), which causes a massive IO barrier against all the other operations in the filesystem. fsync() should only be assuring that the named file is consistent, and yet it basically stalls the entire FS to assure that one file. It's a problem with lack of proper IO tagging and is actually a fundamental problem with the block layer in Linux. A recent LSML posting about SYNCHRONIZE CACHE hints at the problem too (complete device flush when only a small portion of the IO needs to be flushed).
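For reference, a minimal Python sketch of the call under discussion: an application asks for one file to be made durable with `os.fsync()`. The helper name `write_durably` and the file name are hypothetical. How much unrelated I/O gets flushed along with that one file is up to the filesystem and block layer, which is exactly the complaint here.

```python
import os
import tempfile

def write_durably(path, data):
    """Write data to path and ask the kernel to push it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer
        os.fsync(f.fileno())  # request that this file reach the media

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "example.dat")  # hypothetical file name
        write_durably(target, b"hello")
        print(os.path.getsize(target))
```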
Installed Xubuntu 12.10 last October(ish) on USB2 stick (jetflash 32G) with Btrfs (only /boot had EXT2 partition, no swap)
Reason: 24/7 machine. It's a notebook - an always-spinning hard drive is a drag: it spins up the cooling fan; so I went solid state for the primary OS drive. Needed a filesystem that spreads wear and does checksums - hence Btrfs.
Usage - downloading stuff (to the stick itself, not the hard drive) plus some NASing. Data volume: wrapped around those 32 gigs a few times already.
Observations so far: no problems at all.
Other details: Had to play with I/O scheduler (I think settled on CFQ. Interestingly, NOOP sucked). Had to install hdidle (I think) otherwise couldn't force sda to go to sleep (bug (?)).
Please mod parent informative.
One of the retarded things about btrfs is that you can not see how much disk space is being used by each subvolume. How the hell can you have a filesystem and not know how much space is in use or free ??
The design of ZFS is much more holistic. That is, when we take a step back and look at both the micro and macro we see that we are really trying to solve 3 problems:
* Volume Management
* File System
* Data Integrity
ZFS solves all of these by leveraging knowledge from ALL the layers as one cohesive whole.
https://blogs.oracle.com/bonwick/en_US/entry/rampant_layering_violation
Why RAID is fundamentally broken
https://blogs.oracle.com/bonwick/entry/raid_z
Another interesting doc
http://www.scribd.com/doc/43973847/5/ZFS-Design-Principles
I tried btrfs as my main laptop filesystem:
Nice features, speed OK, but I happened to unplug the power supply by mistake, without a battery. Bad crash... I tried using btrfsck and other debug tools, even in the "dangerdon'teveruse" git branch; they just segfaulted. In the end my filesystem was unrecoverable. I used btrfs-restore, only to find out that 90% of my files had been truncated to 0... even files I hadn't used for months....
Now, maybe it was the compress=lzo option, or maybe I played a little too much with the repair tools (possible), but until btrfs can sustain power drops without problems, and the repair tools at least do not segfault, I won't use it for my main filesystem...
btrfs is supposed to save a consistent state every 30 seconds, so I don't understand how I messed up that badly.... maybe the superblock was gone and btrfsck --repair borked everything, I don't know.... luckily for me: backups :)
"I was gratified to be able to answer promptly, and I did. I said I didn't know." -- Mark Twain
zfsonlinux has less testing than Btrfs? Really?
I think you mean *THE LINUX SHIM* has less testing. However, there's this *HUGE* portion of the code, as a wild ass guess I'd say 80%, which is the internal algorithms, data structures, and other internal parts of the file-system that are shared by the Linux and Solaris versions and those have been quite seriously tested for ZFS.
My experience with ZFS under Linux via FUSE was that there were some bugs in the integration layer, but they tended to be fairly shallow and never led to data loss. This is over around 3 years of serious ZFS+FUSE use on Linux (~30TB of backup storage, home storage server). I tested the heck out of ZFS+FUSE before we deployed it, found some issues, worked with the developers (who were amazing!), and eventually got to a point where the stress test I was running on it was more stable than it was under our OpenSolaris systems a few years prior (and the reason I built the stress test).
Based on my experience with ZFS, ZFS+FUSE, and btrfs, I'd personally trust ZFSonLinux over btrfs. My experimentation with btrfs the last few years has been that it still needs a lot of work.