Btrfs Is Getting There, But Not Quite Ready For Production

Posted by timothy on Friday April 26, 2013 @02:07AM from the delicious-on-popcrnfs dept.

An anonymous reader writes "Btrfs is the next-gen filesystem for Linux, likely to replace ext3 and ext4 in coming years. Btrfs offers many compelling new features and development proceeds apace, but many users still aren't sure whether it's 'ready enough' to entrust their data to. Anchor, a webhosting company, reports on trying it out, with mixed feelings. Their opinion: worth a look-in for most systems, but too risky for frontline production servers. The writeup includes a few nasty caveats that will bite you on serious deployments."

30 of 268 comments

Read their website by Anonymous Coward · 2013-04-26 02:11 · Score: 5, Informative

It says "experimental." They appreciate you helping them test their file system out. I appreciate it too, so please do. But remember that you are testing an experimental filesystem. When it eats your data, make sure you report it and have backups.
1. Re:Read their website by pipatron · 2013-04-26 02:41 · Score: 5, Informative
  
  Every file system is/should be labeled "experimental" in a way. The long answer from the btrfs FAQ is pretty good, and makes some sense:
  
  Long answer: Nobody is going to magically stick a label on the btrfs code and say "yes, this is now stable and bug-free". Different people have different concepts of stability: a home user who wants to keep their ripped CDs on it will have a different requirement for stability than a large financial institution running their trading system on it. If you are concerned about stability in commercial production use, you should test btrfs on a testbed system under production workloads to see if it will do what you want of it. In any case, you should join the mailing list (and hang out in IRC) and read through problem reports and follow them to their conclusion to give yourself a good idea of the types of issues that come up, and the degree to which they can be dealt with. Whatever you do, we recommend keeping good, tested, off-system (and off-site) backups.
  
  --
  c++; /* this makes c bigger but returns the old value */
2. Re:Read their website by isopropanol · 2013-04-26 03:20 · Score: 3, Insightful
  
  Also, read the article. The authors were experimenting and came across some bugs in some pretty hairy edge cases (hundreds of simultaneous snapshots, large disk array suddenly becoming full, etc) that did not cause data loss. They eventually decided not to use BTRFS on one type of system but are using it on others.
  To me, the article was a good thing... But I would have preferred it to be worded as "here are some edge-case bugs that need fixing before BTRFS can be used in our scenario" rather than presenting them as show stoppers... Because these are not likely to be show stoppers for anyone who's not implementing the exact same scenario.
  Also, it sounds like they should jitter the start time of the backups...
3. Re:Read their website by Bengie · 2013-04-26 03:38 · Score: 4, Informative
  
  My cousin said that when he had to go "FS shopping" for his research data center, they had some requirements. Most notably, the FS had to be in use at several enterprises that each store at least 1PB of data on it and have not had any critical issues in 5 years.
  
  He said the only FS that fit the bill was ZFS. His team could not find an enterprise company that stored at least 1PB of data on ZFS and had a critical problem not caused by a user within the past 5 years. That was many years ago, and he has not had a single issue with his multi-PB storage, which is used by hundreds of departments.
  
  ZFS is not perfect, but it sets a very high bar.
4. Re:Read their website by Zero__Kelvin · 2013-04-26 04:08 · Score: 5, Informative
  
  Did your cousin also find out what exact hardware and exact code was used? If my friend has had no problems with filesystem $FS and then I use it with different hardware and code implementing it, then there is still a significant chance that I will have trouble that he did not. Filesystems all work perfectly, because they are conceptual. It is the implementation that may or may not be stable.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
5. Re:Read their website by Harik · 2013-04-26 04:46 · Score: 4, Insightful
  
  It's an issue with any CoW filesystem being full - in order to delete a file, you need to make a new copy of the metadata that has the file removed, then a copy of the entire tree leading up to that node then finally copy the root - and once the root is committed, you can free up the no-longer in-use blocks. At least, as long as they're not still referenced by another snapshot.
  The alternative is to rewrite the metadata in place and just cross your fingers and hope you don't suffer a power loss at the wrong time, in which case you end up with massive data corruption.
  I've filled up large (for home use) BTRFS filesystems before - 6-10 TB. The code does a fairly good job of refusing to create new files that would fill the last remaining bit, so it leaves room for the metadata CoW needed to delete. The problem may come from having a particularly large tree that requires more nodes to be allocated on a change than were reserved - in which case the reservation can be tuned.
  BTRFS isn't considered 'done' by any means. It was only in the 3.9 kernel that the new raid5/6 code landed, and other major features (such as dedup) are still pending. It's actually very encouraging that a work-in-progress filesystem is as solid as it is already.
6. Re:Read their website by wagnerrp · 2013-04-26 06:02 · Score: 3
  
  Mirrors are not backups. You are correct about that. They are merely redundancy. Snapshots ARE backups. You can do whatever you want to the original copy, and the snapshot will remain undisturbed. Snapshots are simply not physical backups; however, they can be if you export them to a backup server.
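Harik's point above about deleting on a full copy-on-write filesystem can be illustrated with a toy block-accounting model. This is only a sketch under stated assumptions: `ToyCowFs`, `OutOfSpace`, the fixed `tree_depth` and the `reserve` parameter are hypothetical names invented for illustration, and nothing here reflects how btrfs actually lays out or reserves metadata.

```python
class OutOfSpace(Exception):
    """Raised when the toy allocator cannot satisfy a request."""


class ToyCowFs:
    """Block accounting only; no real data or tree is stored (hypothetical model)."""

    def __init__(self, total_blocks, tree_depth, reserve):
        self.total = total_blocks
        self.depth = tree_depth      # metadata blocks on the path root -> leaf
        self.reserve = reserve       # free blocks that writes must leave alone
        self.used = tree_depth       # the initial metadata path
        self.files = {}              # file name -> number of data blocks

    def free_blocks(self):
        return self.total - self.used

    def write(self, name, data_blocks):
        # Refuse writes that would eat into the reserve. Metadata updates for
        # the write itself are ignored here to keep the sketch short.
        if self.free_blocks() - data_blocks < self.reserve:
            raise OutOfSpace(f"write of {data_blocks} blocks would breach the reserve")
        self.used += data_blocks
        self.files[name] = data_blocks

    def delete(self, name):
        # Step 1: the path from leaf to root must be copied into NEW blocks.
        if self.free_blocks() < self.depth:
            raise OutOfSpace(f"delete needs {self.depth} free blocks for the new metadata path")
        self.used += self.depth
        # Step 2: the new root is committed; only now can the old path and the
        # file's data blocks be released (assuming no snapshot still pins them).
        self.used -= self.depth
        self.used -= self.files.pop(name)


if __name__ == "__main__":
    full = ToyCowFs(total_blocks=10, tree_depth=3, reserve=0)
    full.write("big", 7)                 # fills the last free block
    try:
        full.delete("big")
    except OutOfSpace as err:
        print("stuck:", err)

    safe = ToyCowFs(total_blocks=10, tree_depth=3, reserve=3)
    safe.write("small", 4)               # the reserve keeps 3 blocks free
    safe.delete("small")
    print("free after delete:", safe.free_blocks())
```

With no reserve, the filesystem can be filled to the point where even a delete fails, because the delete must allocate before it can free. With a reserve at least as large as the metadata path, the delete always has room, which is roughly the behaviour the comment describes when it says the reservation can be tuned.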
Happy with XFS by zidium · 2013-04-26 02:13 · Score: 3, Informative

I've been happily using the XFS file system since the early-to-mid-2000s and have never had a problem. It is rock solid and much faster than ext3/ext4 in my experience, tested a lot longer than Btrfs, and handles the millions and millions of small files on redditmirror.cc very effectively.

--
Slashdot Valentines Beta Massacre: iT WORKED! The boycotts killed Beta!!
1. Re:Happy with XFS by h4rr4r · 2013-04-26 02:18 · Score: 3, Insightful
  
  It also has none of the features that make Btrfs exciting and modern.
  XFS is fine, so is Ext3/Ext4, but Linux needs a modern file system.
2. Re:Happy with XFS by bored · 2013-04-26 02:22 · Score: 3, Informative
  
  You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS, it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.
  So, if you told me you had been running it for the last year and it was reliable, I would have given you more credit than for claiming you have been running it for a decade and it's been reliable. Because it's had some pretty serious issues, and if you didn't hit them, that means you're not a good test case.
  I'm still skeptical, because AFAIK, XFS still doesn't have an ordered data mode.
3. Re:Happy with XFS by MBGMorden · 2013-04-26 02:36 · Score: 5, Informative
  
  You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS, it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.
  I don't know about being more reliable. I use XFS on my RAID array (mdadm) at home. I'm running the latest version of Linux Mint (Nadia), and if I ever lose power and don't unmount that file system cleanly, it loses all recent changes to the drive (and "recent" sometimes stretches to hours ago). The drive mounts fine and nothing appears corrupted (so I guess it's not complete data loss), but any file changes (edits, additions, or deletions) are simply gone.
  It's gotten to the point where, if I've just put a lot of stuff on the drive, I unmount it and then remount it just to make sure everything gets flushed to disk. If I ever get a chance to rebuild that array, it most certainly will be using something different.
  
  --
  "People who think they know everything are very annoying to those of us who do."-Mark Twain
4. Re:Happy with XFS by Booker · 2013-04-26 02:40 · Score: 4, Informative
  
  No, that's FUD and/or misunderstanding on your part.
  "data=ordered" is ext3/4's name for "don't expose stale data on a crash," something which XFS has never done, with or without a mount option. ext3/4 also have "data=writeback" which means "DO expose stale data on a crash." XFS does not need feature parity for ill-advised options.
  Any filesystem will lose buffered and unsynced file data on a crash (http://lwn.net/Articles/457667/). XFS has made filesystem integrity and data persistence job one since before ext3 existed. Like any filesystem, it has had bugs, but implying that it was unsafe for use until recently is incorrect.
  I say this as someone who's been working on ext3, ext4 and xfs code for over a decade, combined.
5. Re:Happy with XFS by bored · 2013-04-26 03:10 · Score: 5, Insightful
  
  No, that's FUD and/or misunderstanding on your part.
  "data=ordered" is ext3/4's name for "don't expose stale data on a crash," something which XFS has never done,
  Actually, I think you're the one that doesn't understand how a journaling file system works. The problem with XFS has been that it only journals metadata, and the data portions associated with the metadata are not synchronized with the metadata updates (delayed allocation and all that). This means the metadata portions (filename, sizes, etc.) will be correct based on the last journal update flushed to media, but the data referenced by that metadata may not be.
  A filesystem that is either ordering its metadata/data updates against a disk with proper barriers, or journaling the data alongside the metadata, doesn't have this problem. The filesystem _AND_ its data remain in a consistent state.
  So, until you understand this basic idea, don't go claiming you know _ANYTHING_ about filesystems.
6. Re:Happy with XFS by Kz · 2013-04-26 03:37 · Score: 4, Interesting
  
  You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS, it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.
  It _is_ quite reliable, even in the face of hardware failure.
  Several years ago, I hit the 8TB limit of ext3 and had to migrate to a bigger filesystem. ext4 wasn't ready back then (and even today it's not easy to use on big volumes). I'd already had bad experiences with reiserfs (which was standard on SuSE), and the "you'll lose data" warnings in the XFS docs made me nervous. It was obviously designed to work on very high-end hardware, which I couldn't afford.
  So I did extensive torture testing: hundreds of pull-the-plug situations on the host, storage box and SAN switch, with tens of processes writing thousands of files in million-file directories. It was a bloodbath.
  When the dust settled, ext3 was the best by far, never losing more than 10 small files in the worst case, with over 70% of the cases recovering cleanly. XFS was slightly worse: never more than 16 lost files and roughly 50% clean recoveries. ReiserFS was really bad, always losing more than 50-70 files and sometimes killing the volume. JFS didn't lose the volume, but the lost-file count never went below 130 and was sometimes several hundred.
  Needless to say, I switched to XFS, and haven't lost a single byte yet. And yes, there have been a few hardware failures that triggered scary rebuilding tasks, but they completed cleanly.
  
  --
  -Kz-
7. Re:Happy with XFS by loufoque · 2013-04-26 04:45 · Score: 3, Informative
  
  Ever heard of the sync command?
8. Re:Happy with XFS by Booker · 2013-04-26 05:42 · Score: 4, Informative
  
  So, until you understand this basic idea, don't go claiming you know _ANYTHING_ about filesystems.
  Without sounding like too much of a jerk, I have hundreds of commits in the linux-2.6 fs/* tree. This is what I do for a living.
  I actually do have a pretty decent grasp of how Linux journaling filesystems behave. :)
  Test your assumptions on ext4 with default mount options. Create a new file and write some buffered data to it, wait 5-10 seconds, punch the power button, and see what you get. (You'll get a 0 length file) Or write a pattern to a file, sync it, overwrite with a new pattern, and punch power. (You'll get the old pattern). Or write data to a file, sync it, extend it, and punch power. (You'll get the pre-extension size). Wait until the kernel pushes data out of the page cache to disk, *then* punch power, and you'll get everything you wrote, obviously.
  XFS and ext4 behave identically in all these scenarios. Maybe you can show me a testcase where XFS misbehaves in your opinion? (bonus points for demonstrating where XFS actually fails any posix guarantee).
  Yes, ext3/4 have data=journal - but it's not the default, and with ext4, that option disables delalloc and O_DIRECT capabilities. 99% of the world doesn't run that way; it's slower for almost all workloads and, TBH, is only lightly tested.
  Yes, ext3's data=ordered pushes out tons of file data on every journal commit. That has serious performance implications, but it does shorten the window for buffered data loss to the journal commit time.
  You want data persistence with a posix filesystem? Use the proper data integrity syscalls, that's all there is to it.
9. Re:Happy with XFS by bored · 2013-04-26 07:20 · Score: 3, Interesting
  
  Without sounding like too much of a jerk, I have hundreds of commits in the linux-2.6 fs/* tree. This is what I do for a living.
  Well, then you're part of the problem. Your idea that you have to be correct or fast is sadly sort of wrong. It's possible to be correct without completely destroying performance. I have a few commits in the kernel as well, mostly to fix completely broken behavior (my day job in the past was working on an enterprise unix). So, I do understand filesystems too. Lately, my job has been to replace all that garbage, from the scsi midlayer up, so that a small industry-specific "application" can make guarantees about the data being written to disk while still maintaining many GB/sec of IO. The result actually makes the whole stack look really bad.
  So, I'm sure you're aware that on Linux, if you use proper posix semantics (fsync() and friends), the performance is abysmal compared to the alternatives. This is mostly because of the "broken" fencing behavior (which has recently gotten better but is still far from perfect) in the block layer. Our changes depend on 8-10 year old features available in SCSI to make the guarantees that aren't available everywhere. But it penalizes devices which don't support modern tagging, ordering and fencing semantics rather than ones that do.
  Generally in Linux, application developers are stuck either dealing with orders of magnitude performance loss, or they have to play games in an attempt to second guess the filesystem. Neither is a good compromise, and it's sort of shameful.
  Maybe it's time to admit Linux needs a filesystem that doesn't force people to choose either abysmal performance or no guarantees about integrity.
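Booker's advice in this thread ("Use the proper data integrity syscalls") can be sketched as the common write-to-temp, fsync, rename pattern. This is a minimal sketch, assuming a POSIX system such as Linux where a directory can be opened and passed to fsync; `durable_write` is a hypothetical helper name, not something from the thread, and it says nothing about the performance cost debated above.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace `path` with `data` (bytes) so that a crash leaves either the
    old contents or the new contents, never a truncated file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one
    # filesystem. Note that mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()               # Python's buffer -> kernel page cache
            os.fsync(tmp.fileno())    # page cache -> stable storage
        os.replace(tmp_path, path)    # atomic rename on POSIX
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the rename itself survives a power cut.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "state.txt")
    durable_write(target, b"first version\n")
    durable_write(target, b"second version\n")
    with open(target, "rb") as f:
        print(f.read())
```

Without the fsync calls, the data may sit in the page cache until the kernel decides to write it out, which is the window the power-button tests described above are probing.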
Re:replace ext3 and ext4? really? by h4rr4r · 2013-04-26 02:21 · Score: 3, Informative

Lots of production servers use Ext filesystems. If btrfs is all it should be, it will certainly replace these file systems one day soon as the safe choice.
Sure people use other filesystems on production Linux servers, but those are not the norm. The safe "Enterprise" (Not necessarily a good thing) choice is still Ext based filesystems.
ZFS by 0100010001010011 · 2013-04-26 02:21 · Score: 5, Informative

Meanwhile, ZFS on Linux announced last month that it was ready for production.
http://zfsonlinux.org/
1. Re:ZFS by h4rr4r · 2013-04-26 02:26 · Score: 4, Insightful
  
  It will be ready for production when it can be distributed with the kernel.
  Do you really want to depend on an out of tree FS?
2. Re:ZFS by Bill_the_Engineer · 2013-04-26 02:30 · Score: 3, Interesting
  
  Incompatible license prevents ZFS inclusion with the kernel. This is why Btrfs exists and explains Oracle's involvement with both.
  
  --
  These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
3. Re:ZFS by h4rr4r · 2013-04-26 02:33 · Score: 4, Insightful
  
  Correct sir.
  My point still stands, though, even if the limitation keeping it from being seriously considered for production is a legal issue, not a technical one.
4. Re:ZFS by Chris+Mattern · 2013-04-26 02:44 · Score: 3, Interesting
  
  Mixing licenses does not somehow make things "not production ready".
  No, using a file system that doesn't ship with the kernel makes things "not production ready." Licensing is the reason why it doesn't ship with the kernel, but it's not shipping with the kernel that keeps it out of critical production use.
5. Re:ZFS by wagnerrp · 2013-04-26 06:25 · Score: 3, Insightful
  
  Anyone using nVidia GPUs for compute cards in a data center is using the closed nVidia drivers. Anyone not using them for that purpose likely doesn't even have any nVidia hardware in the first place.
Sorry Slashdot. by Anonymous Coward · 2013-04-26 02:31 · Score: 5, Funny

Ugh, I'm really sorry about this post, Slashdot. I really didn't think it was going to be a "First post." What I really meant to post was

OMFG fr1st psot!!!! APK!! crazy host file conspiracy! /etc/mod_me_down
Re:Why? by h4rr4r · 2013-04-26 02:37 · Score: 5, Insightful

ZFS is outside the kernel tree. That is not an ideological issue, but a practical one. It means updates will not come from the normal channels, it means kernel updates from normal channels could break it, and it is not getting the attention from the kernel devs that an fs should get.
ZFS on Linux has probably had less testing than Btrfs at this point. It has had nearly no real-world testing. Just because the Solaris ZFS is great and the BSD one is coming along means nothing for the stability and correctness of the Linux port.
If you want to use a different OS, then this entire discussion is worthless. You might as well suggest switching everything to OSX and using HFS+.
Re:The oracle in the woodpile by larry+bagina · 2013-04-26 02:38 · Score: 5, Interesting

Oracle now owns ZFS. They could relicense it if they wanted to. BTRFS was started before the Sun acquisition but it seems strange* to develop BTRFS as a GPL file system with ZFS-like features while ZFS is mature and reliable today.
* Yes, they're a large corporation and right hand doesn't know what left hand does... but isn't this more like the index finger not knowing what the middle finger is doing?

--
Do you even lift?
These aren't the 'roids you're looking for.
Re:Yawn, yet another filesystem... by h4rr4r · 2013-04-26 02:50 · Score: 5, Insightful

Ext3 is still chugging along and doing what you want. A filesystem that sacrifices everything for stability.
Not everyone has the same wants and needs. Lots of competing filesystems is a good thing; it leads to a market of ideas. Your "let's pick one and force everyone to suffer with our choice" approach just leads to stagnation and even worse results.
Re:It's completely ideological. by UnknownSoldier · 2013-04-26 04:47 · Score: 4, Interesting

Please mod parent informative.
One of the most frustrating things about btrfs is that you cannot see how much disk space is being used by each subvolume. How the hell can you have a filesystem and not know how much space is in use or free??
The design of ZFS is much more holistic. That is, when we take a step back and look at both the micro and macro, we see that we are really trying to solve 3 problems:
* Volume Management
* File System
* Data Integrity
ZFS solves all of these by leveraging knowledge from ALL the layers as one cohesive whole.
https://blogs.oracle.com/bonwick/en_US/entry/rampant_layering_violation
Why RAID is fundamentally broken
https://blogs.oracle.com/bonwick/entry/raid_z
Another interesting doc
http://www.scribd.com/doc/43973847/5/ZFS-Design-Principles
tried it as main laptop filesystem by Luke_22 · 2013-04-26 04:47 · Score: 3, Interesting

I tried btrfs as my main laptop filesystem:
Nice features, speed OK, but I happened to unplug the power supply by mistake, without a battery. Bad crash... I tried using btrfsck and other debug tools, even from the "dangerdon'teveruse" git branch; they just segfaulted. In the end my filesystem was unrecoverable. I used btrfs-restore, only to find out that 90% of my files had been truncated to 0... even files I hadn't used for months....
Now, maybe it was the compress=lzo option, or maybe I played a little too much with the repair tools (possible), but until btrfs can sustain power drops without problems, and the repair tools at least do not segfault, I won't use it for my main filesystem...
btrfs is supposed to save a consistent state every 30 seconds, so I don't understand how I messed up that badly.... Maybe the superblock was gone and btrfsck --repair borked everything, I don't know.... Luckily for me: backups :)

--
"I was gratified to be able to answer promptly, and I did. I said I didn't know." -- Mark Twain