Btrfs Is Getting There, But Not Quite Ready For Production

Posted by timothy on Friday April 26, 2013 @02:07AM from the delicious-on-popcrnfs dept.

An anonymous reader writes "Btrfs is the next-gen filesystem for Linux, likely to replace ext3 and ext4 in coming years. Btrfs offers many compelling new features and development proceeds apace, but many users still aren't sure whether it's 'ready enough' to entrust their data to. Anchor, a webhosting company, reports on trying it out, with mixed feelings. Their opinion: worth a look-in for most systems, but too risky for frontline production servers. The writeup includes a few nasty caveats that will bite you on serious deployments."

Read their website by Anonymous Coward · 2013-04-26 02:11 · Score: 5, Informative

It says "experimental." They appreciate you helping them test their file system out. I appreciate it too, so please do. But remember that you are testing an experimental filesystem. When it eats your data, make sure you report it and have backups.
1. Re:Read their website by pipatron · 2013-04-26 02:41 · Score: 5, Informative
  
  Every file system is/should be labeled "experimental" in a way. The long answer from the btrfs FAQ is pretty good, and makes some sense:
  
  Long answer: Nobody is going to magically stick a label on the btrfs code and say "yes, this is now stable and bug-free". Different people have different concepts of stability: a home user who wants to keep their ripped CDs on it will have a different requirement for stability than a large financial institution running their trading system on it. If you are concerned about stability in commercial production use, you should test btrfs on a testbed system under production workloads to see if it will do what you want of it. In any case, you should join the mailing list (and hang out in IRC) and read through problem reports and follow them to their conclusion to give yourself a good idea of the types of issues that come up, and the degree to which they can be dealt with. Whatever you do, we recommend keeping good, tested, off-system (and off-site) backups.
  
  --
  c++; /* this makes c bigger but returns the old value */
2. Re:Read their website by Bengie · 2013-04-26 03:38 · Score: 4, Informative
  
  My cousin said when he had to go "FS shopping" for his research data center, they had some requirements, most notably, being used by several enterprises that all store at least 1PB of data on the FS and have not had any critical issues in 5 years.
  
  He said the only FS that fit the bill was ZFS. His team could not find an enterprise company that stored at least 1PB of data on ZFS and had a non-user-caused critical problem within the past 5 years. That was many years ago, and he has not had a single issue with his multi-PB storage, which is being used by hundreds of departments.
  
  ZFS is not perfect, but it sets a very high bar.
3. Re:Read their website by Zero__Kelvin · 2013-04-26 04:08 · Score: 5, Informative
  
  Did your cousin also find out what exact hardware and exact code was used? If my friend has had no problems with filesystem $FS and then I use it with different hardware and code implementing it, then there is still a significant chance that I will have trouble that he did not. Filesystems all work perfectly, because they are conceptual. It is the implementation that may or may not be stable.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
4. Re:Read their website by Harik · 2013-04-26 04:46 · Score: 4, Insightful
  
  It's an issue with any CoW filesystem being full - in order to delete a file, you need to make a new copy of the metadata that has the file removed, then a copy of the entire tree leading up to that node then finally copy the root - and once the root is committed, you can free up the no-longer in-use blocks. At least, as long as they're not still referenced by another snapshot.
  The alternative is to rewrite the metadata in place and just cross your fingers and hope you don't suffer a power loss at the wrong time, in which case you end up with massive data corruption.
  I've filled up large (for home use) BTRFS filesystems before - 6-10TB. The code does a fairly good job of refusing to create new files that would fill the last remaining bit, so it leaves room for metadata CoW to delete. The problem may come from having a particularly large tree that requires more nodes to be allocated on a change than were reserved - in which case the reservation can be tuned.
  BTRFS isn't considered 'done' by any means. It was only in the 3.9 kernel that the new raid5/6 code landed, and other major features (such as dedup) are still pending. It's actually very encouraging that a work-in-progress filesystem is as solid as it is already.
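The space cost of a copy-on-write delete described above can be illustrated with a toy model. This is a minimal Python sketch, not Btrfs code: the nested-dict "filesystem", the function name `cow_delete`, and the accounting of one copied node as one new metadata block are all assumptions made for illustration.

```python
def cow_delete(node, path):
    """Remove the file at `path` without modifying any existing node.

    `node` is a nested dict treated as immutable metadata; `path` is a list
    of names ending at the file. Returns (new_node, copied), where `copied`
    counts the freshly written nodes (hypothetically, one block each).
    """
    name, rest = path[0], path[1:]
    new_node = dict(node)  # shallow copy: one new metadata node
    if rest:
        child, copied = cow_delete(node[name], rest)
        new_node[name] = child  # point the copy at the copied child
        return new_node, copied + 1
    del new_node[name]  # drop the entry only in the copy
    return new_node, 1


# Hypothetical tree: root -> home -> alice -> files, plus an untouched branch.
old_root = {
    "home": {"alice": {"a.txt": "data-a", "b.txt": "data-b"}},
    "etc": {"fstab": "data-f"},
}

new_root, blocks_needed = cow_delete(old_root, ["home", "alice", "a.txt"])

# The delete can only proceed if this many blocks are still free.
free_blocks = 2  # assumed value for the example
can_delete = free_blocks >= blocks_needed
```

In this toy tree, deleting one file needs three new nodes (the directory, its parent, and the root) before anything can be freed, so a filesystem with only two free blocks would have to refuse the delete. The old root still sees the file, which mirrors why blocks referenced by a snapshot cannot be released.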
ZFS by 0100010001010011 · 2013-04-26 02:21 · Score: 5, Informative

Meanwhile ZFS announced that it was ready for production last month.
http://zfsonlinux.org/
1. Re:ZFS by h4rr4r · 2013-04-26 02:26 · Score: 4, Insightful
  
  It will be ready for production when it can be distributed with the kernel.
  Do you really want to depend on an out of tree FS?
2. Re:ZFS by h4rr4r · 2013-04-26 02:33 · Score: 4, Insightful
  
  Correct sir.
  My point still stands, though, even though the limitation keeping it from being seriously considered for production is caused by a legal issue, not a technical one.
Sorry Slashdot. by Anonymous Coward · 2013-04-26 02:31 · Score: 5, Funny

Ugh, I'm really sorry about this post, Slashdot. I really didn't think it was going to be a "First post." What I really meant to post was

OMFG fr1st psot!!!! APK!! crazy host file conspiracy! /etc/mod_me_down
Re:Happy with XFS by MBGMorden · 2013-04-26 02:36 · Score: 5, Informative

You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.
I don't know about being more reliable. I use XFS on my RAID array (mdadm) at home. I'm running the latest version of Linux Mint (Nadia), and if I ever lose power and don't unmount that file system cleanly it loses all recent changes to the drive (and "recent" sometimes stretches to hours ago). The drive mounts fine and nothing appears corrupted (so I guess it's not complete data loss), but any file changes (edits, additions, or deletions) to the file system are simply gone.
It's gotten to the point where if I've just put a lot of stuff on the drive I unmount it and then remount it just to make sure everything gets flushed to disk. If I ever get a chance to rebuild that array it most certainly will be using something different.

--
"People who think they know everything are very annoying to those of us who do."-Mark Twain
Re:Why? by h4rr4r · 2013-04-26 02:37 · Score: 5, Insightful

ZFS is outside the kernel tree. That is not an ideological issue, but a practical one. It means updates will not come from the normal channels, it means kernel updates from normal channels could break it, and it is not getting the attention from the kernel devs an fs should get.
ZFS on Linux has probably had less testing than Btrfs at this point. It has next to no real-world testing. Just because the Solaris ZFS is great and the BSD one is coming along means nothing for the stability and correctness of the Linux port.
If you want to use a different OS, then this entire discussion is worthless. You might as well suggest switching everything to OSX and using HFS+.
Re:The oracle in the woodpile by larry+bagina · 2013-04-26 02:38 · Score: 5, Interesting

Oracle now owns ZFS. They could relicense it if they wanted to. BTRFS was started before the Sun acquisition but it seems strange* to develop BTRFS as a GPL file system with ZFS-like features while ZFS is mature and reliable today.
* Yes, they're a large corporation and right hand doesn't know what left hand does... but isn't this more like the index finger not knowing what the middle finger is doing?

--
Do you even lift?
These aren't the 'roids you're looking for.
Re:Happy with XFS by Booker · 2013-04-26 02:40 · Score: 4, Informative

No, that's FUD and/or misunderstanding on your part.
"data=ordered" is ext3/4's name for "don't expose stale data on a crash," something which XFS has never done, with or without a mount option. ext3/4 also have "data=writeback" which means "DO expose stale data on a crash." XFS does not need feature parity for ill-advised options.
Any filesystem will lose buffered and unsynced file data on a crash (http://lwn.net/Articles/457667/). XFS has made filesystem integrity and data persistence job one since before ext3 existed. Like any filesystem, it has had bugs, but implying that it was unsafe for use until recently is incorrect.
I say this as someone who's been working on ext3, ext4 and xfs code for over a decade, combined.
Re:Yawn, yet another filesystem... by h4rr4r · 2013-04-26 02:50 · Score: 5, Insightful

Ext3 is still chugging along and doing what you want. A filesystem that sacrifices everything for stability.
Not everyone has the same wants and needs. Lots of competing filesystems is a good thing; it leads to a market of ideas. Your "let's pick one and force everyone to suffer with our choice" approach just leads to stagnation and even worse results.
Re:Happy with XFS by bored · 2013-04-26 03:10 · Score: 5, Insightful

No, that's FUD and/or misunderstanding on your part.
"data=ordered" is ext3/4's name for "don't expose stale data on a crash," something which XFS has never done,
Actually, I think you're the one that doesn't understand how a journaling file system works. The problem with XFS has been that it only journals metadata, and the data portions associated with the metadata are not synchronized with the metadata updates (delayed allocation and all that). This means the metadata portions (filename, sizes, etc.) will be correct based on the last journal update flushed to media, but the data referenced by that metadata may not be.
A filesystem that is either ordering its metadata/data updates against a disk with proper barriers, or journaling the data alongside the metadata, doesn't have this problem. The filesystem _AND_ its data remain in a consistent state.
So, until you understand this basic idea, don't go claiming you know _ANYTHING_ about filesystems.
Re:Happy with XFS by Kz · 2013-04-26 03:37 · Score: 4, Interesting

You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.
It _is_ quite reliable, even in the face of hardware failure.
Several years ago, I hit the 8TB limit of ext3 and had to migrate to a bigger filesystem. ext4 wasn't ready back then (and still today it's not easy to use on big volumes). I'd already had bad experiences with reiserfs (which was standard on SuSE), and the "you'll lose data" warnings in the XFS docs made me nervous. It was obviously designed to work on very high-end hardware, which I couldn't afford.
So, I did extensive torture testing. Hundreds of pull-the-plug situations, on the host, storage box and SAN switch, with tens of processes writing thousands of files on million-file directories. It was a bloodbath.
When the dust settled, ext3 was the best by far, managing to never lose more than 10 small files in the worst case, with over 70% of the cases recovering cleanly. XFS was slightly worse, never more than 16 lost files and roughly 50% clean recoveries. ReiserFS was really bad, always losing more than 50-70 files and sometimes killing the volume. JFS didn't lose the volume, but the lost-file count never went below 130, sometimes several hundred.
Needless to say, I switched to XFS, and haven't lost a single byte yet. And yes, there have been a few hardware failures that triggered scary rebuilding tasks, but they completed cleanly.
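
Counting lost files after a pull-the-plug test, as described above, needs a record of what was written. The following is a minimal Python sketch of one way to do that, not the poster's actual test harness: the function names `write_batch` and `count_lost`, the file naming, and the choice to fsync each file are assumptions. In a real test the manifest would be stored on a different machine so it survives the crash; here the "crash" is simulated by deleting and truncating files.

```python
import hashlib
import os
import tempfile


def write_batch(directory, count):
    """Create `count` small files and return {name: sha256 hex digest}."""
    manifest = {}
    for i in range(count):
        name = f"file-{i:05d}.bin"
        payload = os.urandom(64)
        with open(os.path.join(directory, name), "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # assumed: only synced files are "expected"
        manifest[name] = hashlib.sha256(payload).hexdigest()
    return manifest


def count_lost(directory, manifest):
    """Return names that are missing or whose content no longer matches."""
    lost = []
    for name, digest in manifest.items():
        try:
            with open(os.path.join(directory, name), "rb") as f:
                ok = hashlib.sha256(f.read()).hexdigest() == digest
        except FileNotFoundError:
            ok = False
        if not ok:
            lost.append(name)
    return lost


with tempfile.TemporaryDirectory() as tmp:
    manifest = write_batch(tmp, 5)
    # Simulate damage: one file vanishes, one comes back zero-length.
    os.remove(os.path.join(tmp, "file-00000.bin"))
    open(os.path.join(tmp, "file-00001.bin"), "wb").close()
    lost = count_lost(tmp, manifest)
```

Comparing against checksums rather than just file names catches the zero-length and stale-content cases, which are the typical symptoms after an unclean shutdown.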

--
-Kz-
Re:It's completely ideological. by UnknownSoldier · 2013-04-26 04:47 · Score: 4, Interesting

Please mod parent informative.
One of the retarded things about btrfs is that you can not see how much disk space is being used by each subvolume. How the hell can you have a filesystem and not know how much space is in use or free ??
The design of ZFS is much more holistic. That is, when we take a step back and look at both the micro and macro we see that we are really trying to solve 3 problems:
* Volume Management
* File System
* Data Integrity
ZFS solves all of these by leveraging knowledge from ALL the layers as one cohesive whole.
https://blogs.oracle.com/bonwick/en_US/entry/rampant_layering_violation
Why RAID is fundamentally broken
https://blogs.oracle.com/bonwick/entry/raid_z
Another interesting doc
http://www.scribd.com/doc/43973847/5/ZFS-Design-Principles
Re:Happy with XFS by Booker · 2013-04-26 05:42 · Score: 4, Informative

So, until you understand this basic idea, don't go claiming you know _ANYTHING_ about filesystems.
Without sounding like too much of a jerk, I have hundreds of commits in the linux-2.6 fs/* tree. This is what I do for a living.
I actually do have a pretty decent grasp of how Linux journaling filesystems behave. :)
Test your assumptions on ext4 with default mount options. Create a new file and write some buffered data to it, wait 5-10 seconds, punch the power button, and see what you get. (You'll get a 0 length file) Or write a pattern to a file, sync it, overwrite with a new pattern, and punch power. (You'll get the old pattern). Or write data to a file, sync it, extend it, and punch power. (You'll get the pre-extension size). Wait until the kernel pushes data out of the page cache to disk, *then* punch power, and you'll get everything you wrote, obviously.
XFS and ext4 behave identically in all these scenarios. Maybe you can show me a testcase where XFS misbehaves in your opinion? (Bonus points for demonstrating where XFS actually fails any POSIX guarantee.)
Yes, ext3/4 have data=journal - but it's not the default, and with ext4, that option disables delalloc and O_DIRECT capabilities. 99% of the world doesn't run that way; it's slower for almost all workloads and TBH, is only lightly tested.
Yes, ext3's data=ordered pushes out tons of file data on every journal commit. That has serious performance implications, but it does shorten the window for buffered data loss to the journal commit time.
You want data persistence with a POSIX filesystem? Use the proper data integrity syscalls, that's all there is to it.
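
As an illustration of "the proper data integrity syscalls", here is a minimal Python sketch of a write that is explicitly synced before it is considered done. The helper name `durable_write` and the file name are hypothetical, and the directory fsync assumes a Linux or other POSIX system where a directory can be opened read-only and synced.

```python
import os
import tempfile


def durable_write(path, data):
    """Write `data` to `path` and ask the kernel to push it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # userspace buffer -> kernel page cache
        os.fsync(f.fileno())  # page cache -> storage device
    # Sync the containing directory so the new directory entry persists too.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.dat")
    durable_write(target, b"hello")
    with open(target, "rb") as f:
        readback = f.read()
```

Without the `os.fsync` call the data may sit in the page cache until the kernel decides to write it back, which is the window in which a power cut produces the zero-length or stale files described above.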