Slashdot Mirror


Btrfs Is Getting There, But Not Quite Ready For Production

An anonymous reader writes "Btrfs is the next-gen filesystem for Linux, likely to replace ext3 and ext4 in coming years. Btrfs offers many compelling new features and development proceeds apace, but many users still aren't sure whether it's 'ready enough' to entrust their data to. Anchor, a webhosting company, reports on trying it out, with mixed feelings. Their opinion: worth a look-in for most systems, but too risky for frontline production servers. The writeup includes a few nasty caveats that will bite you on serious deployments."

64 of 268 comments

  1. Read their website by Anonymous Coward · · Score: 5, Informative

    It says "experimental." They appreciate you helping them test their file system out. I appreciate it too, so please do. But remember that you are testing an experimental filesystem. When it eats your data, make sure you report it and have backups.

    1. Re:Read their website by pipatron · · Score: 5, Informative

      Every file system is/should be labeled "experimental" in a way. The long answer from the btrfs FAQ is pretty good, and makes some sense:

      Long answer: Nobody is going to magically stick a label on the btrfs code and say "yes, this is now stable and bug-free". Different people have different concepts of stability: a home user who wants to keep their ripped CDs on it will have a different requirement for stability than a large financial institution running their trading system on it. If you are concerned about stability in commercial production use, you should test btrfs on a testbed system under production workloads to see if it will do what you want of it. In any case, you should join the mailing list (and hang out in IRC) and read through problem reports and follow them to their conclusion to give yourself a good idea of the types of issues that come up, and the degree to which they can be dealt with. Whatever you do, we recommend keeping good, tested, off-system (and off-site) backups.

      --
      c++; /* this makes c bigger but returns the old value */
    2. Re:Read their website by Tarlus · · Score: 2

      And make sure those backups aren't also on a btrfs volume.

      --
      /* No Comment */
    3. Re:Read their website by isopropanol · · Score: 3, Insightful

      Also, read the article. The authors were experimenting and came across some bugs in some pretty hairy edge cases (hundreds of simultaneous snapshots, large disk array suddenly becoming full, etc) that did not cause data loss. They eventually decided not to use BTRFS on one type of system but are using it on others.

      To me, the article was a good thing... But I would have preferred it to be worded as "here are some edge-case bugs that need fixing before BTRFS is used in our scenario" rather than presenting them as show stoppers, because these are not likely to be show stoppers for anyone who's not implementing the exact same scenario.

      Also, it sounds like they should jitter the start time of the backups...
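
      A minimal sketch of that jitter idea, in Python. The 30-minute window, the host names and the function name are illustrative assumptions, not details from the article. Each host derives a stable offset from its own name, so backups scheduled for the same minute spread out across the window instead of all snapshotting at once.

```python
import hashlib

def backup_start_offset(hostname, window_seconds=1800):
    """Return a stable per-host delay (in seconds) inside the jitter window."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Treat the first 8 bytes of the hash as an integer, then fold it into the window.
    return int.from_bytes(digest[:8], "big") % window_seconds

if __name__ == "__main__":
    for host in ("web01", "web02", "db01"):  # hypothetical host names
        print(host, backup_start_offset(host))
```

      A hash-based offset is used rather than a random one so that each host starts at the same time every night, which keeps backup durations comparable from run to run.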

    4. Re:Read their website by Bengie · · Score: 4, Informative

      My cousin said when he had to go "FS shopping" for his research data center, they had some requirements, most notably, being used by several enterprises that all store at least 1PB of data on the FS and have not had any critical issues in 5 years.

      He said the only FS that fit the bill was ZFS. His team could not find an enterprise company that stored at least 1PB of data on ZFS and had a non-user-caused critical problem within the past 5 years. That was many years ago and he has not had a single issue with his multi-PB storage that is being used by hundreds of departments.

      ZFS is not perfect, but it sets a very high bar.

    5. Re:Read their website by AvitarX · · Score: 2

      Being unfixable when full is a pretty big show stopper IMO.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    6. Re:Read their website by Zero__Kelvin · · Score: 5, Informative

      Did your cousin also find out what exact hardware and exact code was used? If my friend has had no problems with filesystem $FS and then I use it with different hardware and code implementing it, then there is still a significant chance that I will have trouble that he did not. Filesystems all work perfectly, because they are conceptual. It is the implementation that may or may not be stable.

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    7. Re:Read their website by UnknownSoldier · · Score: 2

      > There is no room for any bugs at all in a filesystem to which you will trust your essential data.

      Your ideology is admired except it is not practical :-(

      * So you are able to guarantee you are able to write 100% bug free code?

      * AND it can deal with hardware failures such as bad memory?

      I have a bridge to sell you :-)

    8. Re:Read their website by Harik · · Score: 4, Insightful

      It's an issue with any CoW filesystem being full - in order to delete a file, you need to make a new copy of the metadata that has the file removed, then a copy of the entire tree leading up to that node then finally copy the root - and once the root is committed, you can free up the no-longer in-use blocks. At least, as long as they're not still referenced by another snapshot.
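
      A toy illustration of why a delete needs free space first. This is only a sketch using nested Python dicts as "metadata nodes"; it is not how btrfs lays out its B-trees, and the function name and directory layout are made up for the example. The point is that removing one entry allocates a fresh copy of every node on the path up to the root before anything old can be released.

```python
def cow_delete(node, path):
    """Return (new_root, nodes_allocated) with `path` removed; `node` is never modified."""
    name, rest = path[0], path[1:]
    new_node = dict(node)  # copy this node: one newly allocated metadata node
    if rest:
        new_child, allocated = cow_delete(node[name], rest)
        new_node[name] = new_child  # point the copy at the copied child
        return new_node, allocated + 1
    del new_node[name]  # the entry disappears only in the new copy
    return new_node, 1

if __name__ == "__main__":
    old_root = {"home": {"alice": {"big.iso": {}, "notes.txt": {}}}, "etc": {}}
    new_root, allocated = cow_delete(old_root, ["home", "alice", "big.iso"])
    print(allocated)                              # nodes that had to be written first
    print("big.iso" in old_root["home"]["alice"])  # the old tree (a snapshot) still sees the file
```

      In this model the old root keeps working as a snapshot, which is also why the blocks cannot be freed while another snapshot still references them.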

      The alternative is to rewrite the metadata in place and just cross your fingers and hope you don't suffer a power loss at the wrong time, in which case you end up with massive data corruption.

      I've filled up large (for home use) BTRFS filesystems before - 6-10 TB. The code does a fairly good job of refusing to create new files that would fill the last remaining bit, so it leaves room for the metadata CoW needed to delete. The problem may come from having a particularly large tree that requires more nodes to be allocated on a change than were reserved - in which case the reservation can be tuned.
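
      The reservation idea can be sketched as a toy free-space accountant. The class name, block counts and the `for_metadata_cow` flag are assumptions made for illustration and do not mirror btrfs internals. Ordinary allocations stop at the reserve, metadata CoW may dip into it, and a change that needs more nodes than the reserve holds still fails.

```python
import errno
import os

class BlockPool:
    """Toy free-space accountant with a reserve held back for metadata CoW."""

    def __init__(self, total_blocks, reserved_blocks):
        self.free = total_blocks
        self.reserved = reserved_blocks

    def allocate(self, blocks, for_metadata_cow=False):
        """Take blocks from the pool; only metadata CoW may use the reserve."""
        floor = 0 if for_metadata_cow else self.reserved
        if self.free - blocks < floor:
            raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
        self.free -= blocks

    def release(self, blocks):
        """Return blocks to the pool once the new root has been committed."""
        self.free += blocks
```

      With `BlockPool(10, 3)`, file data can take 7 blocks, after which new file writes are refused while a delete needing up to 3 metadata nodes still succeeds. A delete needing 4 nodes would fail, which matches the "more nodes than were reserved" case.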

      BTRFS isn't considered 'done' by any means. It was only in the 3.9 kernel that the new raid5/6 code landed, and other major features (such as dedup) are still pending. It's actually very encouraging that a work-in-progress filesystem is as solid as it is already.

    9. Re:Read their website by Bigby · · Score: 2

      Does btrfs support the removal of a striped volume yet? I want to issue a "remove" command, let it re-balance, and then remove that drive. I know the other disk can take on the space. Then I want to add another larger volume, which I know it supports.

    10. Re:Read their website by Anonymous Coward · · Score: 2, Informative

      Mirrors and snapshots are not backups. They can be used to create backups, but are not backups in themselves.

    11. Re:Read their website by g1zmo · · Score: 2

      Netgear's consumer-level NAS products are now using btrfs. This being the Internet and all, folks are complaining in forums and Facebook about...well if not about this then I guess it would be something else.

      --
      I have found there are just two ways to go.
      It all comes down to livin' fast or dyin' slow.
      -REK, Jr.
    12. Re:Read their website by wagnerrp · · Score: 3

      Mirrors are not backups. You are correct about that. They are merely redundancy. Snapshots ARE backups. You can do whatever you want to the original copy, and the snapshot will remain undisturbed. Snapshots are simply not physical backups; however, they can be if you export them to a backup server.

    13. Re:Read their website by KiloByte · · Score: 2

      When it comes to data safety, btrfs has been production ready for a few years already. There are issues with latency -- largely fixed -- and dealing with asinine abuse of fsync(). That's also mostly dealt with, although there's no real full fix other than fixing problematic software in the first place. There's no real way to have efficient cow/etc and fast fsync together, but you don't need the latter if the filesystem can do transactions for you.

      So we have a filesystem with a number of safety features but relatively new code vs one with code/design that's 40 years old but has hardly any safety features at all. I'd say, it's ext4 that's not production ready: a no-op backup can take half an hour (a big spinning disk that holds a bunch of vservers).

      --
      The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    14. Re:Read their website by Tough+Love · · Score: 2

      I won't buy your bridge or move my systems away from Ext4 for the time being. BTW, E2fsck does a great job of repairing filesystems that have been corrupted (sometimes massively) by hardware failure of various kinds. This is an essential trick that ZFS and Btrfs have yet to learn.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    15. Re:Read their website by Anonymous Coward · · Score: 2, Informative

      I'm sorry, but I call BS on this. I love ZFS and it is a great, solid file system. But your friend couldn't have gone looking for case studies of five-year-plus ZFS usage "many years ago". ZFS has only been around for about eight years, and for the first few of those years it saw very limited usage (i.e. OpenSolaris). Yes, ZFS is a great file system, but let's stick to factual reasons why it is good, no need to make up stories.

  2. Happy with XFS by zidium · · Score: 3, Informative

    I've been happily using the XFS file system since the early-to-mid-2000s and have never had a problem. It is rock solid and much faster than ext3/ext4 in my experience, tested a lot longer than Btrfs, and handles the millions and millions of small files on redditmirror.cc very effectively.

    --
    Slashdot Valentines Beta Massacre: iT WORKED! The boycotts killed Beta!!
    1. Re:Happy with XFS by h4rr4r · · Score: 3, Insightful

      It also has none of the features that make Btrfs exciting and modern.

      XFS is fine, so is Ext3/Ext4, but Linux needs a modern file system.

    2. Re:Happy with XFS by bored · · Score: 3, Informative

      You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.

      So, if you told me you have been running it for the last year and it was reliable I would have given you more credit than claiming you have been running it for a decade and it's been reliable. Because it's had some pretty serious issues, and if you didn't hit them that means you're not a good test case.

      I'm still skeptical, because AFAIK, XFS still doesn't have an ordered data mode.

    3. Re:Happy with XFS by h4rr4r · · Score: 2

      No, I am suggesting datacenter linux needs something like ZFS. Proper snapshotting, block level dedupe, and all that jazz.

      Btrfs is not yet ready, but in the next decade it will take on this role.

    4. Re:Happy with XFS by MBGMorden · · Score: 5, Informative

      You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.

      I don't know about being more reliable. I use XFS on my RAID array (mdadm) at home. I'm running the latest version of Linux Mint (Nadia), and if I ever lose power and don't unmount that file system cleanly it loses all recent changes to the drive (and "recent" sometimes stretches to hours ago). The drive mounts fine and nothing appears corrupted (so I guess it's not complete data loss), but any file changes (edits, additions, or deletions) to the file system are simply gone.

      It's gotten to the point where if I've just put a lot of stuff on the drive I unmount it and then remount it just to make sure everything gets flushed to disk. If I ever get a chance to rebuild that array it most certainly will be using something different.

      --
      "People who think they know everything are very annoying to those of us who do."-Mark Twain
    5. Re:Happy with XFS by Booker · · Score: 4, Informative

      No, that's FUD and/or misunderstanding on your part.

      "data=ordered" is ext3/4's name for "don't expose stale data on a crash," something which XFS has never done, with or without a mount option. ext3/4 also have "data=writeback" which means "DO expose stale data on a crash." XFS does not need feature parity for ill-advised options.

      Any filesystem will lose buffered and unsynced file data on a crash (http://lwn.net/Articles/457667/). XFS has made filesystem integrity and data persistence job one since before ext3 existed. Like any filesystem, it has had bugs, but implying that it was unsafe for use until recently is incorrect.

      I say this as someone who's been working on ext3, ext4 and xfs code for over a decade, combined.

    6. Re:Happy with XFS by jabuzz · · Score: 2

      On the other hand the code was first released as production nearly 20 years ago. Of all the current Linux file systems XFS has the best performance, the best scalability and the best stability.

      Want to put 100TB of data on btrfs? Be my guest.

    7. Re:Happy with XFS by bored · · Score: 5, Insightful

      No, that's FUD and/or misunderstanding on your part.

      "data=ordered" is ext3/4's name for "don't expose stale data on a crash," something which XFS has never done,

      Actually, I think you're the one that doesn't understand how a journaling file system works. The problem with XFS has been that it only journals metadata, and the data portions associated with the metadata are not synchronized with the metadata updates (delayed allocation and all that). This means the metadata portions (filename, sizes, etc) will be correct based on the last journal update flushed to media, but the data referenced by that metadata may not be.

      A filesystem that is either ordering its metadata/data updates against a disk with proper barriers, or journaling the data alongside the metadata doesn't have this problem. The filesystem _AND_ its data remain in a consistent state.

      So, until you understand this basic idea, don't go claiming you know _ANYTHING_ about filesystems.

    8. Re:Happy with XFS by Anonymous Coward · · Score: 2, Informative

      there's CXFS which _is_ a clustered filesystem. Not as popular as GFS or OCFS2, but it's there, and uses the same block format as 'regular' XFS.

      Not sure what you mean by "mature OS", but note that ZFS is _not_ a cluster filesystem by any stretch of the definition.

    9. Re:Happy with XFS by jedidiah · · Score: 2

      That would be the same "mature" operating systems that have generally needed to employ products from 3rd party vendors in order to have interesting filesystems.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    10. Re:Happy with XFS by Kz · · Score: 4, Interesting

      You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.

      It _is_ quite reliable, even on the face of hardware failure.

      Several years ago, I hit the 8TB limit of ext3 and had to migrate to a bigger filesystem. ext4 wasn't ready back then (and still today it's not easy to use on big volumes). I'd already had bad experiences with reiserfs (which was standard on SuSE), and the "you'll lose data" warnings in the XFS docs made me nervous. It was obviously designed to work on very high-end hardware, which I couldn't afford.

      so, I did extensive torture testing. hundreds of pull-the-plug situations, on the host, storage box and SAN switch, with tens of processes writing thousands of files on million-files directories. it was a bloodbath.

      when the dust settled, ext3 was the best by far, managing to never lose more than 10 small files in the worst case, over 70% of the cases recovered cleanly. XFS was slightly worse, never more than 16 lost files and roughly 50% clean recoveries. ReiserFS was really bad, always losing more than 50-70 files and sometimes killing the volume. JFS didn't lose the volume, but lost files count never went below 130, sometimes several hundred.

      needless to say, i switched to XFS, and haven't lost a single byte yet. and yes, there has been a few hardware failures that triggered scary rebuilding tasks, but completed cleanly.

      --
      -Kz-
    11. Re:Happy with XFS by Bengie · · Score: 2

      It is impossible to compete with a FS+VolumeManager+RAID hybrid. There is just some stuff that is impossible to do without coupling those layers, and those impossible things are becoming requirements.

    12. Re:Happy with XFS by Tough+Love · · Score: 2

      there is no reason the "volume/RAID" abstraction and "file system" abstraction should not be merged. Separating the two was a solution to a problem that no longer exists

      Oh, so true. Indeed, problems like modularity, maintainability and shared functionality stopped existing long ago as we all know.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    13. Re:Happy with XFS by gmack · · Score: 2

      XFS is mostly reliable but, as I found out with several PCs, if it gets shut off at the wrong time it will need a disk repair and then you are in for some fun because their repair utility doesn't work at all on a mounted FS (even if it is read only) meaning to repair a damaged XFS volume you will now need to use a boot disk.

    14. Re:Happy with XFS by loufoque · · Score: 3, Informative

      Ever heard of the sync command?
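
      From a script, the same thing is available without unmounting anything. A minimal sketch, assuming a Unix system and Python 3.3 or later, where `os.sync()` wraps the sync(2) system call; the function name is made up for the example.

```python
import os

def write_then_sync(path, data):
    """Write a file, then ask the kernel to flush all dirty buffers to disk."""
    with open(path, "wb") as f:
        f.write(data)
    # Like running `sync` from a shell: schedules every dirty page, on every
    # mounted filesystem, to be written out. No unmount/remount is needed.
    os.sync()
```

      This flushes everything, which is heavier than syncing one file, but it is a direct replacement for the unmount-and-remount habit.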

    15. Re:Happy with XFS by Harik · · Score: 2

      Oh, so true. Indeed, problems like modularity, maintainability and shared functionality stopped existing long ago as we all know.

      It's almost like people have discovered that you can have modularity and shared functionality in a different way than artificially separating storage layers and throwing away important data at each layer boundary.

    16. Re:Happy with XFS by Booker · · Score: 4, Informative

      So, until you understand this basic idea, don't go claiming you know _ANYTHING_ about filesystems.

      Without sounding like too much of a jerk, I have hundreds of commits in the linux-2.6 fs/* tree. This is what I do for a living.
      I actually do have a pretty decent grasp of how Linux journaling filesystems behave. :)

      Test your assumptions on ext4 with default mount options. Create a new file and write some buffered data to it, wait 5-10 seconds, punch the power button, and see what you get. (You'll get a 0 length file) Or write a pattern to a file, sync it, overwrite with a new pattern, and punch power. (You'll get the old pattern). Or write data to a file, sync it, extend it, and punch power. (You'll get the pre-extension size). Wait until the kernel pushes data out of the page cache to disk, *then* punch power, and you'll get everything you wrote, obviously.
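
      Those scenarios can be summarised with a deliberately crude model in Python. Everything here is an assumption made for illustration: one file, a `cache` that is lost on power-off and a `disk` that survives, with no background writeback at all. Real kernels flush dirty pages on their own timers, so this only captures the "punch power before writeback" cases.

```python
class ToyFile:
    """One file: `cache` is the page cache, `disk` is what survives a crash."""

    def __init__(self):
        self.cache = b""
        self.disk = b""

    def write(self, data, offset=0):
        """Buffered write: only the cached copy changes."""
        buf = bytearray(self.cache)
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        self.cache = bytes(buf)

    def sync(self):
        """fsync-like: the cached contents become the on-disk contents."""
        self.disk = self.cache

    def crash(self):
        """Power loss: the cache is gone, return what the disk holds."""
        self.cache = self.disk
        return self.disk

if __name__ == "__main__":
    f = ToyFile()
    f.write(b"new data")
    print(len(f.crash()))   # buffered only: zero-length file after the crash

    g = ToyFile()
    g.write(b"AAAA"); g.sync(); g.write(b"BBBB")
    print(g.crash())        # synced pattern survives, the overwrite does not
```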

      XFS and ext4 behave identically in all these scenarios. Maybe you can show me a testcase where XFS misbehaves in your opinion? (bonus points for demonstrating where XFS actually fails any posix guarantee).

      Yes, ext3/4 have data=journal - but it's not the default, and with ext4, that option disables delalloc and O_DIRECT capabilities. 99% of the world doesn't run that way; it's slower for almost all workloads and, TBH, is only lightly tested.

      Yes, ext3's data=ordered pushes out tons of file data on every journal commit. That has serious performance implications, but it does shorten the window for buffered data loss to the journal commit time.

      You want data persistence with a posix filesystem? Use the proper data integrity syscalls, that's all there is to it.
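
      For application authors, a common way to do that is the write-to-temp, fsync, rename, fsync-the-directory pattern. A minimal sketch in Python, assuming a POSIX system; the function name and the ".tmp" suffix are made up for the example.

```python
import os

def durable_replace(path, data):
    """Replace `path` so that a crash leaves either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical temp-file naming scheme
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())   # file data reaches the disk before the rename
    os.replace(tmp_path, path)  # atomic rename within one filesystem
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)       # persist the directory entry change as well
    finally:
        os.close(dir_fd)
```

      The fsync before the rename is what avoids the zero-length-file outcome described above; the directory fsync makes the rename itself durable.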

    17. Re:Happy with XFS by lgw · · Score: 2

      Dammit, I agree with h4rr4r - what is this world coming to?

      One problem with snapshotting is that it's pretty useless without a standard way to quiesce apps. It was a huge deal on the Microsoft side when shadow copy happened. You really want to be able to take a snapshot of a DB store or mail store and be sure you're getting something coherent, which will require the cooperation of the software involved.

      Multiple volume snaps also require a similar framework if you want coherent sets of snaps. (Almost all complex software makes assumptions about write order for performance reasons and to avoid locking - but those assumptions are easily violated by snapping multiple volumes.)

      --
      Socialism: a lie told by totalitarians and believed by fools.
    18. Re:Happy with XFS by bored · · Score: 3, Interesting

      Without sounding like too much of a jerk, I have hundreds of commits in the linux-2.6 fs/* tree. This is what I do for a living.

      Well, then you're part of the problem. Your idea that you have to be correct or fast is sadly sort of wrong. It's possible to be correct without completely destroying performance. I have a few commits in the kernel as well, mostly to fix completely broken behavior (my day job in the past was working on an enterprise unix). So, I do understand filesystems too. Lately, my job has been to replace all that garbage, from the scsi midlayer up, so that a small industry specific "application" can both make guarantees about the data being written to disk while still maintaining many GB/sec of IO. The result actually makes the whole stack look really bad.

      So, I'm sure you're aware that on linux, if you use proper posix semantics (fsync() and friends) the performance is abysmal compared to the alternatives. This is mostly because of the "broken" fencing behavior (which has recently gotten better but still is far from perfect) in the block layer. Our changes depend on 8-10 year old features available in SCSI to make the guarantees that aren't available everywhere. But it penalizes devices which don't support modern tagging, ordering and fencing semantics rather than ones that do.

      Generally in linux, application developers are stuck either dealing with orders of magnitude performance loss, or they have to play games in an attempt to second guess the filesystem. Neither is a good compromise and it's sort of shameful.

      Maybe it's time to admit linux needs a filesystem that doesn't force people to choose either abysmal performance, or no guarantees about integrity.

    19. Re:Happy with XFS by operagost · · Score: 2

      I've been using ReiserFS since 2006. It's killer.

      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
  3. The oracle in the woodpile by larry+bagina · · Score: 2

    I think we need to talk about the oracle in the woodpile - ie, Oracle. BTRFS is an Oracle project. What happens when it goes the way of MySQL? Will Monty Wideanus appear on a white steed to save us?

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

    1. Re:The oracle in the woodpile by larry+bagina · · Score: 5, Interesting
      Oracle now owns ZFS. They could relicense it if they wanted to. BTRFS was started before the Sun acquisition but it seems strange* to develop BTRFS as a GPL file system with ZFS-like features while ZFS is mature and reliable today.

      * Yes, they're a large corporation and right hand doesn't know what left hand does... but isn't this more like the index finger not knowing what the middle finger is doing?

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    2. Re:The oracle in the woodpile by bill_mcgonigle · · Score: 2

      It's Oracle. They'll re-license ZFS just as soon as it's no longer profitable for them not to.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  4. Re:replace ext3 and ext4? really? by h4rr4r · · Score: 3, Informative

    Lots of production servers use Ext filesystems. If btrfs is all it should be it will certainly replace these file systems one day soon as the safe choice.

    Sure people use other filesystems on production Linux servers, but those are not the norm. The safe "Enterprise" (Not necessarily a good thing) choice is still Ext based filesystems.

  5. ZFS by 0100010001010011 · · Score: 5, Informative

    Meanwhile ZFS announced that it was ready for production last month.

    http://zfsonlinux.org/

    1. Re:ZFS by h4rr4r · · Score: 4, Insightful

      It will be ready for production when it can be distributed with the kernel.

      Do you really want to depend on an out of tree FS?

    2. Re:ZFS by Bill_the_Engineer · · Score: 3, Interesting

      Incompatible license prevents ZFS inclusion with the kernel. This is why Btrfs exists and explains Oracle's involvement with both.

      --
      These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
    3. Re:ZFS by h4rr4r · · Score: 4, Insightful

      Correct sir.
      My point still stands though. Even though the limitation keeping it from being seriously considered for production is caused by a legal issue not a technical one.

    4. Re:ZFS by Oceanplexian · · Score: 2

      It will be ready for production when it can be distributed with the kernel.

      ZFS is not included in the Linux kernel because it is not GPL compatible.
      Licensing has nothing to do with how production-ready a product is. ZFS is significantly more mature than btrfs.

    5. Re:ZFS by Anonymous Coward · · Score: 2, Insightful

      It will be ready for production when it can be distributed with the kernel.

      Do you really want to depend on an out of tree FS?

      That's why the fileserver runs FreeBSD. Has other benefits, too.

    6. Re:ZFS by h4rr4r · · Score: 2

      Yes, but the statement is still true.

      It means you will not get updates via normal channels, or normal channel updates might break it. That simply is not something most datacenters want to deal with. ZFS is more mature on Solaris and BSD, on Linux today it might be ahead of btrfs, but neither is production ready in the sense that datacenters mean it.

    7. Re:ZFS by Chris+Mattern · · Score: 3, Interesting

      Mixing licenses does not somehow make things "not production ready".

      No, using a file system that doesn't ship with the kernel makes things "not production ready." Licensing is the reason why it doesn't ship with the kernel, but it's not shipping with the kernel that keeps it out of critical production use.

    8. Re:ZFS by h4rr4r · · Score: 2

      This.

      The reason they can't ship together and be updated via normal RHEL/SUSE/Debian updates is licensing, but the technical problem keeping it from being seriously considered for production is that they can't be updated and shipped together.

    9. Re:ZFS by wagnerrp · · Score: 3, Insightful

      Anyone using nVidia GPUs for compute cards in a data center is using the closed nVidia drivers. Anyone not using them for that purpose likely doesn't even have any nVidia hardware in the first place.

    10. Re:ZFS by wagnerrp · · Score: 2

      All that means is ZFS gets forked, and FreeBSD, OpenIndiana, Nexenta, or one of the other Solaris clones takes primary ownership.

    11. Re:ZFS by anyanka · · Score: 2

      No one installs the closed nVidia drivers on production machines.

      Depends on what you're producing. But yeah, I'd avoid it if not strictly necessary.

  6. Sorry Slashdot. by Anonymous Coward · · Score: 5, Funny

    Ugh, I'm really sorry about this post, Slashdot. I really didn't think it was going to be a "First post." What I really meant to post was

    OMFG fr1st psot!!!! APK!! crazy host file conspiracy! /etc/mod_me_down

  7. Re:Why? by h4rr4r · · Score: 5, Insightful

    ZFS is outside the kernel tree. That is not an ideological issue, but a practical one. It means updates will not come from the normal channels, it means kernel updates from normal channels could break it and it is not getting the attention from the kernel devs an fs should get.

    ZFS on linux has probably less testing than Btrfs at this point. It has nearly no real world testing. Just because the Solaris ZFS is great, and the BSD one is coming along means nothing for the stability and correctness of the Linux port.

    If you want to use a different OS then this entire discussion is worthless. You might as well suggest switching everything to OSX and using HFS+.

  8. Re:A Few Nasty Caveats? by Cito · · Score: 2

    yea Btrfs has one major bug

    if you fill the hard drive up you lose access to the system, you can't log in or even get access to the filesystem and the system locks up

    with ext things may act a bit erratic but you could log in and delete/move things off to make room and be ok. but Btrfs you can't if it fills up you lose

    unless you take the hard drive out move it to another box and mount it then delete crap that way, but that's a pain in arse.

  9. Re:Yawn, yet another filesystem... by h4rr4r · · Score: 5, Insightful

    Ext3 is still chugging along and doing what you want. A filesystem that sacrifices everything for stability.

    Not everyone has the same wants and needs. Lots of competing filesystems is a good thing; it leads to a market of ideas. Your "let's pick one and force everyone to suffer with our choice" approach just leads to stagnation and even worse results.

  10. You don't understand the problem by Anonymous Coward · · Score: 2, Interesting

    The problem with "XFS" eating data wasn't with XFS - it was with the Linux devmapper ignoring filesystem barrier requests.

    Gotta love this code:

    Martin Steigerwald wrote:
    > Hello!
    >
    > Are write barriers over device mapper supported or not?

    Nope.

    see dm_request():

            /*
             * There is no use in forwarding any barrier request since we can't
             * guarantee it is (or can be) handled by the targets correctly.
             */
            if (unlikely(bio_barrier(bio))) {
                    bio_endio(bio, -EOPNOTSUPP);
                    return 0;
            }

    Who's the clown who thought THAT was acceptable? WHAT. THE. FUCK?!?!?!

    And it wasn't just devmapper that had such a childish attitude towards file system barriers:

    Andrew Morton's response tells a lot about why this default is set the way it is:

    Last time this came up lots of workloads slowed down by 30% so I dropped the patches in horror. I just don't think we can quietly go and slow everyone's machines down by this much...

    There are no happy solutions here, and I'm inclined to let this dog remain asleep and continue to leave it up to distributors to decide what their default should be.

    So barriers are disabled by default because they have a serious impact on performance. And, beyond that, the fact is that people get away with running their filesystems without using barriers. Reports of ext3 filesystem corruption are few and far between.

    It turns out that the "getting away with it" factor is not just luck. Ted Ts'o explains what's going on: the journal on ext3/ext4 filesystems is normally contiguous on the physical media. The filesystem code tries to create it that way, and, since the journal is normally created at the same time as the filesystem itself, contiguous space is easy to come by. Keeping the journal together will be good for performance, but it also helps to prevent reordering. In normal usage, the commit record will land on the block just after the rest of the journal data, so there is no reason for the drive to reorder things. The commit record will naturally be written just after all of the other journal log data has made it to the media.
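
    The ordering that a barrier is meant to guarantee can be sketched in userspace. This is an assumption-laden toy, not ext3's journal format: the record layout, the "COMMIT" marker and the function names are made up, and `os.fsync()` stands in for the barrier. Records must not contain newlines or equal the marker.

```python
import os

COMMIT = b"COMMIT\n"  # hypothetical commit marker

def journal_transaction(journal_path, records):
    """Append records, force them to disk, and only then append the commit record."""
    with open(journal_path, "ab") as j:
        for record in records:
            j.write(record + b"\n")
        j.flush()
        os.fsync(j.fileno())  # "barrier": journal data is durable before the commit
        j.write(COMMIT)
        j.flush()
        os.fsync(j.fileno())  # the commit record itself is durable

def replay(journal_path):
    """Return records of committed transactions; a trailing uncommitted one is dropped."""
    committed, pending = [], []
    with open(journal_path, "rb") as j:
        for line in j:
            if line == COMMIT:
                committed.extend(pending)
                pending = []
            else:
                pending.append(line.rstrip(b"\n"))
    return committed
```

    Without the first fsync, a drive would be free to write the commit record before the data it vouches for, which is the reordering the contiguous journal layout makes unlikely but does not rule out.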

    I love that italicized part. "OMG! Data integrity causes a performance hit! Screw data integrity! We won't be able to brag that we're faster than Solaris!"

    See also http://www.redhat.com/archives/rhl-devel-list/2008-June/msg00560.html

    There's a lot more out there if you care to look.

    Toss in other things like the way Linux handles NFSv2 group membership (More than 16? Let's just silently drop some!) and lots of fanbois wonder why I view Linux as little better than Windows. Hell, Microsoft may fuck things up six ways from Sunday, but they're not CHILDISH when it comes to things like data integrity.

  11. Re:replace ext3 and ext4? really? by jabuzz · · Score: 2

    Want more than 16TB on your server? Unless ext4 has very recently grown that support, an ext-based file system is not viable. Remember that a RAID5 in 4D+P using 4TB disks will be super close to that 16TB limit. Better hope that you don't want to scale the file system up in the future.
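
    A rough sketch of the arithmetic behind that figure, assuming the limit comes from 32-bit block numbers with 4 KiB blocks (an ext4 filesystem without the 64bit feature). The function names are made up for illustration, and disk sizes are taken as decimal terabytes, as vendors quote them:

```python
# Back-of-the-envelope check of the 16TB figure.
# Assumptions: 4 KiB blocks, 32-bit block numbers, vendor (decimal) terabytes.

TIB = 2**40  # binary terabyte, the unit filesystem limits are usually quoted in
TB = 10**12  # decimal terabyte, the unit disk vendors use


def max_fs_bytes(block_size: int = 4096, block_number_bits: int = 32) -> int:
    """Largest filesystem addressable with this block size and block-number width."""
    return block_size * 2**block_number_bits


def raid5_usable_bytes(data_disks: int, disk_bytes: int) -> int:
    """Usable capacity of a RAID5 set; the parity disk adds no usable space."""
    return data_disks * disk_bytes


limit = max_fs_bytes()
array = raid5_usable_bytes(data_disks=4, disk_bytes=4 * TB)

print(f"filesystem limit:   {limit / TIB:.2f} TiB")
print(f"4D+P of 4TB disks:  {array / TIB:.2f} TiB")
print(f"headroom remaining: {(limit - array) / TIB:.2f} TiB")
```

    Under those assumptions the array comes to roughly 14.55 TiB against a 16 TiB ceiling, so adding even one more 4TB data disk would push past it.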

  12. Re:replace ext3 and ext4? really? by Anonymous Coward · · Score: 2, Interesting

    FYI, ext4 can be larger than 16 TB, but you need a newer version of e2fsprogs than is included in a typical enterprise distribution. The limitation is not in the kernel filesystem driver but in the user-level utility for formatting a new filesystem.

  13. Re:Yawn, yet another filesystem... by bored · · Score: 2

    Ext3 is still chugging along and doing what you want. A filesystem that sacrifices everything for stability.

    EXT3 is actually fairly good, and the performance isn't bad _EXCEPT_ for one issue: fsync(), which causes a massive IO barrier against all the other operations in the filesystem. fsync() should only be assuring that the named file is consistent, and yet it basically stalls the entire FS to assure that one file. It's a problem with the lack of proper IO tagging, and is actually a fundamental problem with the block layer in Linux. A recent LSML posting about SYNCHRONIZE CACHE hints at the problem too (complete device flush when only a small portion of the IO needs to be flushed).
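
    For reference, a minimal sketch of what an application-side fsync() looks like, using Python's os.fsync wrapper. The helper names and the file name are hypothetical. The point is only that the call names a single file descriptor; how much unrelated I/O gets flushed along with it is decided by the filesystem and block layer, not by the caller:

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and ask the kernel to flush that one file to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # request durability for this file descriptor only


def demo() -> bytes:
    """Write a small file durably in a scratch directory and read it back."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "example.dat")  # hypothetical file name
        write_durably(path, b"hello")
        with open(path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

    Timing write_durably() while another process streams writes to the same filesystem is one way to see how far the cost of that single call spreads.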

  14. My experience, for what it's worth... by sshir · · Score: 2

    Installed Xubuntu 12.10 last October(ish) on USB2 stick (jetflash 32G) with Btrfs (only /boot had EXT2 partition, no swap)

    Reason: 24/7 machine. It's a notebook - an always-spinning hard drive is a drag: it spins up the cooling fan; so I went solid state for the primary OS drive. Needed a filesystem that spreads wear and does checksums - hence Btrfs.

    Usage - downloading stuff (to the stick itself, not the hard drive) plus some NASing. Data volume: wrapped around those 32 gigs a few times already.

    Observations so far: no problems at all.

    Other details: Had to play with I/O scheduler (I think settled on CFQ. Interestingly, NOOP sucked). Had to install hdidle (I think) otherwise couldn't force sda to go to sleep (bug (?)).

  15. Re:It's completely ideological. by UnknownSoldier · · Score: 4, Interesting

    Please mod parent informative.

    One of the ridiculous things about btrfs is that you cannot see how much disk space is being used by each subvolume. How the hell can you have a filesystem and not know how much space is in use or free?

    The design of ZFS is much more holistic. That is, when we take a step back and look at both the micro and macro, we see that we are really trying to solve 3 problems:

    * Volume Management
    * File System
    * Data Integrity

    ZFS solves all of these by leveraging knowledge from ALL the layers as one cohesive whole.
    https://blogs.oracle.com/bonwick/en_US/entry/rampant_layering_violation

    Why RAID is fundamentally broken
    https://blogs.oracle.com/bonwick/entry/raid_z

    Another interesting doc
    http://www.scribd.com/doc/43973847/5/ZFS-Design-Principles

  16. tried it as main laptop filesystem by Luke_22 · · Score: 3, Interesting

    I tried btrfs as my main laptop filesystem:

    Nice features, speed OK, but I happened to unplug the power supply by mistake, with no battery installed. Bad crash... I tried using btrfsck and other debug tools, even from the "dangerdon'teveruse" git branch; they just segfaulted. In the end my filesystem was unrecoverable. I used btrfs-restore, only to find out that 90% of my files had been truncated to 0... even files I hadn't used for months....

    Now, maybe it was the compress=lzo option, or maybe I played a little too much with the repair tools (possible), but until btrfs can sustain power drops without problems, and the repair tools at least do not segfault, I won't use it for my main filesystem...

    btrfs is supposed to save a consistent state every 30 seconds, so I don't understand how I messed up that bad.... maybe the superblock was gone and the btrfsck --repair borked everything, I don't know.... luckily for me: backups :)

    --
    "I was gratified to be able to answer promptly, and I did. I said I didn't know." -- Mark Twain

  17. Re:Why? by jafo · · Score: 2

    zfsonlinux has less testing than Btrfs? Really?

    I think you mean *THE LINUX SHIM* has less testing. However, there's this *HUGE* portion of the code, as a wild ass guess I'd say 80%, which is the internal algorithms, data structures, and other internal parts of the file-system that are shared by the Linux and Solaris versions and those have been quite seriously tested for ZFS.

    My experience with ZFS under Linux via FUSE was that there were some bugs in the integration layer, but they tended to be fairly shallow and never led to data loss. This is over roughly 3 years of serious ZFS+FUSE use on Linux (~30TB of backup storage, home storage server). I tested the heck out of ZFS+FUSE before we deployed it, found some issues, worked with the developers (who were amazing!), and eventually got to a point where the stress test I was running on it was more stable than it had been on our OpenSolaris systems a few years prior (which were the reason I built the stress test).

    Based on my experience with ZFS, ZFS+FUSE, and btrfs, I'd personally trust ZFSonLinux over btrfs. My experience experimenting with btrfs over the last few years is that it still needs a lot of work.