Slashdot Mirror


Ubuntu Plans To Make ZFS File-System Support Standard On Linux

An anonymous reader writes: Canonical's Mark Shuttleworth revealed today that they're planning to make ZFS standard on Ubuntu. They are planning to include ZFS file-system as "standard in due course," but no details were revealed beyond that. However, ZFS On Linux contributor Richard Yao has said they do plan on including it in their kernel for 16.04 LTS and the GPL vs. CDDL license worries aren't actually a problem. Many Linux users have been wanting ZFS on Linux, but aside from the out of tree module there hasn't been any luck in including it in the mainline kernel or with tier-one Linux distributions due to license differences.

279 comments

  1. Re:ZFS is nice... by Anonymous Coward · · Score: 4, Insightful

    Why would you need nvidia drivers on a file server? Use Ubuntu Server, it's made for, well, being a server.

  2. Re:ZFS is nice... by gilgongo · · Score: 2

    What was the Nvidia video driver doing on a server?

    --
    "And the meaning of words; when they cease to function; when will it start worrying you?"
  3. Re: ZFS is nice... by WarJolt · · Score: 1

    ZFS is not really the supported setup for Ubuntu. I've only had issues with the proprietary nvidia driver. I've always been able to fix those issues.

    When ZFS and nouveau are supported by default then that configuration will be tested and ideally more robust. I wouldn't worry.

  4. Re:ZFS is nice... by Anonymous Coward · · Score: 0, Troll

    Because GP is a point-and-drool GUI noob.

  5. Happy ZoL user here by Anonymous Coward · · Score: 0

    I've been using it for several months now. Love the feature set. Wish I had switched earlier. No desire to use btrfs and the rest.

    Only issue is a lack of installer support and the need to maintain kernel modules.
    This announcement by Ubuntu will hopefully light a fire under the other Linux distros.

    1. Re:Happy ZoL user here by armanox · · Score: 1

      Maybe. Red Hat has a very different legal outlook on things than Canonical does.

      --
      I'm starting to think GNU is the problem with "GNU/Linux" these days.
  6. Re:ZFS is nice... by Frnknstn · · Score: 3, Interesting

    I run ZFS on any / every machine I can, server or not. That is one filesystem where the features outweigh all possible concerns.

    --
    If it's in you sig, it's in your post.
  7. What he didn't say by n1ywb · · Score: 3, Informative

    is anything like "ZFS will be the default". He just said that it would be in the distro.

    --
    -73, de n1ywb
    www.n1ywb.com
    1. Re:What he didn't say by tomhath · · Score: 2

      He also didn't say it would be the default on "Linux" (whatever that is). Just Ubuntu.

  8. BTRFS is getting there by DarkOx · · Score: 4, Insightful

    I don't know why so many in the Linux community are so hooked on ZFS. BTRFS has a feature set that is rapidly getting there, and it's becoming more and more mature in terms of code that is already in the upstream.

    Why not just put your energy there?

    --
    Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    1. Re:BTRFS is getting there by Phil+Urich · · Score: 5, Interesting

      Hell, it's already in many cases a superior experience on Linux, starting with that you can shrink a BTRFS volume but you still can't shrink a ZFS volume. I suppose in the enterprise-centric world that ZFS is aimed at that's pretty much never an issue, but I've even run into it personally multiple times myself working for a small business and have been glad that I was running BTRFS instead. Frankly, for many use-cases it seems like running ZFS on Linux is more hassle for the sake of then more hassle later on.

      --
      I remember sigs. Oh, a simpler time!
    2. Re:BTRFS is getting there by Guspaz · · Score: 4, Interesting

      Because BTRFS is and has always been redundant? ZFS is far more mature, and stories abound of BTRFS failing on people. BTRFS is still unstable, particularly their RAID5/6 support. Developers should be putting their efforts into ZFS instead of BTRFS.

    3. Re:BTRFS is getting there by TrekkieGod · · Score: 4, Interesting

      I don't know why so many in the Linux community are so hooked on ZFS. BTRFS has a feature set that is rapidly getting there, and it's becoming more and more mature in terms of code that is already in the upstream.

      Why not just put your energy there?

      As someone who uses both zfs (for file server storage) and btrfs (for the OS), my reason for using zfs is raidz. If btrfs implemented something similar, I'd drop zfs.

      --

      Warning: Opinions known to be heavily biased.

    4. Re:BTRFS is getting there by rl117 · · Score: 5, Informative

      It's really quite simple. ZFS is a great filesystem. It's reliable, performant, featureful, and very well documented. Btrfs has a subset of the ZFS featureset, but fails on all the other counts. It has terrible documentation and it's one of the least reliable and least performant filesystems I've ever used. Having used both extensively over several years, and hammered both over long periods, I've suffered from repeated Btrfs dataloss and performance problems. ZFS on the other hand has worked well from day one, and I've yet to experience any problems. Neither are as fast as ext4 on single discs, but you're getting resilience and reliability, not raw speed, and it scales well as you add more discs; exactly what I want for storing my data. And having a filesystem which works on several operating systems has a lot of value. I took the discs comprising a ZFS zpool mirror from my Linux system and slotted them into a FreeBSD NAS. One command to import the pool (zpool import) and it was all going. Later on I added l2arc and zil (cache and log) SSDs to make it faster, both one command to add and also entirely trouble-free.

      Over the years there has been lots of publicity about the Btrfs featureset and development. But as you said in your comment, it's "rapidly getting there". That's been the story since day one. And it's not got there. Not even close. Until its major bugs and unfortunate design flaws (getting unbalanced to unusability, silly link limits) are fixed, it will never get there. I had high hopes for Btrfs, and I was rewarded with severe dataloss or complete unusability each and every time I tried it over the years since it was started. Eventually I switched to ZFS out of a need for something that actually worked and could be relied upon. Maybe it will eventually become suitable for serious production use, but I lost hope of that a good while back.

    5. Re: BTRFS is getting there by Anonymous Coward · · Score: 0

      Btrfs does implement something similar to raidz, it's just not particularly stable yet.

    6. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Wasn't there a history of the lead dev constantly promising features then delaying them? The community was then basically begging him to delegate out the work. Seem to remember it revolving around the util for checking fs integrity. Obviously a pretty important feature and I remember thinking something's not right here.

      Or maybe everything's fine now and deserves another look.

    7. Re:BTRFS is getting there by danbob999 · · Score: 1

      Just because ZFS is mature on Solaris doesn't make it mature on Linux. Out of tree modules always suck in the long run. I agree that there is more future in BTRFS because of that. Unless they finally release ZFS under the GPL.

    8. Re:BTRFS is getting there by rahvin112 · · Score: 1

      There is no problem with Raid 5 on Btrfs. I've been running a raid 5 on btrfs for more than a year. I'm not sure why ZFS fanboi's always resort to lying about the btrfs features and support, if it's just laziness on their part or they are astroturfing.

      ZFS may be stable, but its development is relatively frozen and btrfs is in many ways better but has lagged in development. This is probably because Oracle was the primary backer of btrfs before they bought Sun and there just aren't enough developers working on btrfs but that doesn't mean it's still not advancing and improving every year. I personally like btrfs a lot more than zfs because features such as shrinking and growing pools are MAJOR features that ZFS lacks, while btrfs supports all the zfs features that I care about.

      But either way, if you are going to talk about features or aspects of btrfs at least make sure your information is current for gods sakes!

    9. Re:BTRFS is getting there by danbob999 · · Score: 2

      I don't know why so many in the Linux community are so hooked on ZFS. BTRFS has a feature set that is rapidly getting there, and it's becoming more and more mature in terms of code that is already in the upstream.

      Why not just put your energy there?

      The fact is that 99% of the users couldn't care less about ZFS or BTRFS. Ext4 is just fine, and ext3 was also fine before, for 99% of the use cases. Hence, most people will just stick to their default FS.
      Likewise, most Windows users are fine with NTFS, and wouldn't switch to ZFS even if it became available on Windows.

    10. Re:BTRFS is getting there by OrangeTide · · Score: 1

      XFS is mature on Linux. Just need to add snapshots and things back in. I worked for a start-up that did just that, and then we never got to the point where we could release our XFS snapshot and replication support. Pity.

      --
      “Common sense is not so common.” — Voltaire
    11. Re:BTRFS is getting there by rahvin112 · · Score: 1

      Raid support has been built into btrfs for ages now. It's been rock stable for me for over a year with an 8 disk array in raid 5 configuration. But the simplicity with which you can add and subtract drives, parity drives and others makes btrfs a total winner IMO.

      https://btrfs.wiki.kernel.org/...

    12. Re:BTRFS is getting there by ArmoredDragon · · Score: 4, Interesting

      I've been using ZFS on Linux for about 3.5 years now, it's been pretty stable. I can't say I've heard of a case of it failing for somebody other than user error.

    13. Re:BTRFS is getting there by Guspaz · · Score: 4, Informative

      zfsonlinux hit both unstable and stable releases on Linux earlier than btrfs: if your only definition of stable is how long it's been around on Linux, then btrfs is still less mature.

      Being in-tree says nothing about the stability of a module, but ZFS doesn't need to be under the GPL to be in Linus' tree: the GPL does not forbid code aggregation. That said, neither Linus nor the ZoL team want ZoL in Linus' tree.

    14. Re:BTRFS is getting there by ArmoredDragon · · Score: 0

      What do you mean it scales as you add more disks? You can't add disks to a ZFS array. You can replace them with bigger disks, but not just add them.

    15. Re:BTRFS is getting there by Anonymous Coward · · Score: 1

      I am going to hope that BTRFS gets there, and does some things better than ZFS did. ZFS is *moderately* mature in the solaris environment, but that does not imply that it will ever be sufficiently mature in the linux environment. Even though ZFS is moderately mature in a solaris environment, it has some major fundamental issues. You cannot tell a ZFS how to do it's data layout; it decides on what RAID stripe size to allocate based on how the writes it receives are sized. If you decide to run a database on ZFS, use a RAID-Z config (most common config), and pre-load the database (most common way of getting data into the database initially) then hit it with an OLTP workload, ZFS will perform *terribly* - because it got large streaming writes up front, allocated huge stripe sizes, which makes rewrite performance go to hell in a handbasket, because for every tiny write, it needs to read the whole stripe that encompasses the write, re-calculate the checksums and write a whole new stripe out to disk. That being said, for say a typical corporate fileserver, ZFS is fantastic.

    16. Re:BTRFS is getting there by Guspaz · · Score: 1

      In what way is development on ZFS frozen? Development is extremely active, probably far more so than btrfs.

    17. Re:BTRFS is getting there by cas2000 · · Score: 4, Interesting

      Because there's really no comparison between btrfs and ZFS. ZFS is years ahead in both stability and features. Only someone who's never used both would say that they are in any way close.

      The only really useful thing that btrfs does that ZFS does not is rebalancing - that's a great feature and i'd love to see it in ZFS (but it will probably never get there).

      ZFS has lots of features that btrfs doesn't have and likely never will.

    18. Re:BTRFS is getting there by rl117 · · Score: 3, Informative

      I mean the performance gains as you add more discs.

      And regarding adding discs to an array, you certainly can. Just add additional raid sets to the pool. That is, rather than adding discs to the existing array, you scale it up by adding additional arrays to the same pool. See the documentation.

    19. Re:BTRFS is getting there by rl117 · · Score: 4, Insightful

      It can certainly work when everything is working correctly. Have you tested its behaviour when things don't work correctly, for example by pulling the cable on one of the discs as it's running? Does it carry on running, does it transparently recover when you plug it back in? When I had a cable become unseated and the connection glitched, Btrfs happily toasted the data on the drive, and its mirror, and panicked the kernel whenever the discs were plugged in; I had to zero them out on another system before I could even try to reformat them. One of the major historical weak points has been that the failure codepaths were poorly tested, and this can come to bite you quite badly.

    20. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      you actually can add more disks to a zpool. zpool add tank raidz disk1 disk2 disk3 disk4

    21. Re:BTRFS is getting there by fnj · · Score: 1

      Raid support has been built into btrfs for ages now.

      The commenter you are replying to plainly said RAIDZ. If you don't know how vastly superior RAIDZ is to RAID, you'd be better off not making your unwitting lack of knowledge so obvious.

    22. Re:BTRFS is getting there by Anonymous Coward · · Score: 1, Interesting

      You mean the recordsize. You can adjust that easily on ZFS:

      http://open-zfs.org/wiki/Performance_tuning#Dataset_recordsize

      There is even explicit documentation for running databases on ZFS:

      http://open-zfs.org/wiki/Performance_tuning#Database_workloads
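
      A rough sketch of why the recordsize setting matters for small database updates. It assumes a simplified model in which any partial update to a record forces the whole record to be read, modified, and rewritten; the 128 KiB figure is the usual ZFS default recordsize, and the 8 KiB page size is a hypothetical database page, not something stated in this thread.

```python
def write_amplification(recordsize: int, write_size: int) -> float:
    """Bytes rewritten per byte of payload when whole records are rewritten.

    Simplified model (an assumption): an aligned write of `write_size` bytes
    touches ceil(write_size / recordsize) records, and each touched record
    is rewritten in full.
    """
    if recordsize <= 0 or write_size <= 0:
        raise ValueError("sizes must be positive")
    records_touched = -(-write_size // recordsize)  # ceiling division
    return records_touched * recordsize / write_size


KIB = 1024
PAGE = 8 * KIB  # hypothetical database page size

for recordsize in (128 * KIB, 16 * KIB, 8 * KIB):
    amp = write_amplification(recordsize, PAGE)
    print(f"recordsize={recordsize // KIB:>3} KiB: {amp:.0f}x bytes rewritten per page update")
```

      Under this model, matching the recordsize to the database page size removes the amplification entirely, which is the gist of the tuning advice linked above.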

    23. Re:BTRFS is getting there by Anonymous Coward · · Score: 2, Interesting

      13.1% of the code changed between 0.6.4 and 0.6.5:

      http://fossies.org/diffs/zfs/0.6.4_vs_0.6.5/index.html

      That is far from being frozen. Even Linux does not have that percentage of its code change between releases:

      http://fossies.org/diffs/linux/4.2_vs_4.3-rc1/

      It would be interesting if someone checked how much fs/btrfs changes between releases.

    24. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      I don't know why so many in the Linux community are so hooked on ZFS. BTRFS has a feature set that is rapidly getting there, and it's becoming more and more mature in terms of code that is already in the upstream.

      Why not just put your energy there?

      http://www.ilsistemista.net/index.php/virtualization/47-zfs-btrfs-xfs-ext4-and-lvm-with-kvm-a-storage-performance-comparison.html

      ZFS works well in enterprise distributions today while btrfs might work well in 5 years.

    25. Re:BTRFS is getting there by fnj · · Score: 1

      What do you mean it scales as you add more disks? You can't add disks to a ZFS array. You can replace them with bigger disks, but not just add them.

      Wrong. You can't scale a pool by adding VDEVs to the existing pool, but you can expand without practical limit by generating VDEVs out of VDEVs and adding them. E.g., if you have 6 drives in a RAIDZ2, you can build a RAIDZ2 out of 6 (or some other number of) RAIDZ2s, including your original and 5 new ones. The downtime when you switch over to your expanded config is minimal. Think in terms of a minute or two if you plan right. The pool array setup is a background and incremental process, and the pool is available to start using essentially immediately. And your existing data is seamlessly preserved in place. You don't have to back up and restore. That's just an example.

      In fact using a gigantic number of drives in a scalar array goes against good practice in ZFS and in other file systems. There is no real limit to how hierarchical your structure can be in ZFS, and still reduce to a single root (if you so wish).

      The basic building block in ZFS is the VDEV. You make your striped, or mirrored, or striped-and-mirrored, or RAIDZ1 (single parity), RAIDZ2 (double parity), or RAIDZ3 (triple parity) array out of VDEVs, which can be whole drives, partitions, files, or VDEVs themselves. You can use any combination of any of these kinds of arrays as your VDEVs in your higher-level array.

      If you play a game of trying to knock ZFS's design and capabilities, you will lose.

    26. Re:BTRFS is getting there by mi · · Score: 2

      I don't know why so many in the Linux community are so hooked on ZFS.

      Because it is good. In particular, it offers the only sensible way to make good use of the ephemeral storage offered by Amazon's Web Services (AWS) in a general case — the fast (SSD) storage can be used as read-cache for a ZFS stable of mount-points.

      Why not just put your energy there?

      Why put any energy into reinventing the wheel? And struggle with triangular "wheels" in the process?

      --
      In Soviet Washington the swamp drains you.
    27. Re:BTRFS is getting there by rahvin112 · · Score: 1

      I know fully well what raidz is and it's not significantly different than the raid in btrfs. Maybe you should understand what they are before you make a comment simply because one has a z on the end.

    28. Re:BTRFS is getting there by ZorinLynx · · Score: 4, Interesting

      Likewise; we use it all over the place in our department. We have a bunch of 96TB/80TB usable ZFS file servers based on 24 4TB SATA drives. The performance is amazing for the price and they are rock solid under all kinds of heavy load, except for one tiny bug we hit recently that has been fixed already.
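
      The 96TB raw / 80TB usable figures can be reproduced with simple arithmetic. The comment does not say how the 24 drives are arranged, so the two layouts below (two 12-disk RAIDZ2 vdevs, or four 6-disk RAIDZ1 vdevs) are assumptions that happen to match the numbers; the sketch also ignores metadata and padding overhead.

```python
def usable_tb(vdevs: int, disks_per_vdev: int, parity_per_vdev: int, disk_tb: float) -> float:
    """Approximate usable capacity of a pool of identical RAIDZ vdevs.

    Counts only the non-parity disks; filesystem overhead is ignored.
    """
    if not 0 <= parity_per_vdev < disks_per_vdev:
        raise ValueError("parity disks must be fewer than disks per vdev")
    return vdevs * (disks_per_vdev - parity_per_vdev) * disk_tb


DISKS, DISK_TB = 24, 4
raw_tb = DISKS * DISK_TB

# Hypothetical layouts; the original comment does not specify one.
two_raidz2 = usable_tb(vdevs=2, disks_per_vdev=12, parity_per_vdev=2, disk_tb=DISK_TB)
four_raidz1 = usable_tb(vdevs=4, disks_per_vdev=6, parity_per_vdev=1, disk_tb=DISK_TB)

print(f"raw: {raw_tb} TB")
print(f"2 x 12-disk RAIDZ2: {two_raidz2:.0f} TB usable")
print(f"4 x 6-disk RAIDZ1:  {four_raidz1:.0f} TB usable")
```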

    29. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      The main difference between RAID 5/6 in btrfs and regular RAID 5/6 is that btrfs has an in-kernel stripe cache to try to mitigate read-modify-write. To my knowledge, it does not take advantage of CoW to avoid the write hole.

      https://btrfs.wiki.kernel.org/index.php/RAID56

      The btrfs developers seem to use a different definition of write hole on their wiki than the one used everywhere else. If you consider the writes to be done in-place, their description of the "write hole" is just another implication of the definition used everywhere else.

    30. Re:BTRFS is getting there by ZorinLynx · · Score: 2

      >neither Linus nor the ZoL team want ZoL in Linus' tree.

      I bang my head on my desk frequently over Linus' stubborn nature, then I realize it's that same stubborn nature that makes Linux as great as it is, so I forgive him.

      If ZFS were part of the kernel, bugfixes and updates would have to follow the Linux kernel release schedule, which would make it a huge hassle to update the code on running systems without building custom kernels.

      Building custom kernels is something you shouldn't be doing in a production environment unless it's either 1995 or you're a masochist. :)

    31. Re:BTRFS is getting there by Bengie · · Score: 2, Interesting

      BTRFS is the SystemD of filesystems. Lots of features, poor design. Features can be great, but they come at a cost. To summarize the issues with BTRFS: it violates the principle of least surprise, which can result in some completely unexpected gotchas. The other thing is it is not truly transactional/atomic. By design, it requires fsck, which means the filesystem can be left in an inconsistent state. This opens the doors for a host of issues that ZFS is guaranteed to never have.

      Not to mention, there are still plenty of people complaining about it eating their data.

    32. Re:BTRFS is getting there by Bengie · · Score: 2

      As much as I love ZFS over BTRFS, monoculture is bad. If anything, BTRFS is a learning experience for the entire community, but I do think ZFS needs first class support.

    33. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Several years

      Except that btrfs has improved a lot over the years, the link limits have been fixed many versions ago and one faulty hdd has been running it for more than a year without issues.

    34. Re:BTRFS is getting there by ZorinLynx · · Score: 4, Interesting

      I recently "fixed" one of our ZFS fileservers at work which was performing very poorly by *removing* a failing drive. The drive was taking a few seconds to read blocks, obviously dying, so it was slowing down the entire system. As soon as I pulled it ZFS finally declared it dead and the filesystem was running at full performance again.

      I felt so confident being able to just walk up and yank the troublesome drive; that's how much trust I've built in ZFS. It's incredibly stable and fault tolerant.

    35. Re:BTRFS is getting there by Bengie · · Score: 5, Insightful

      Nearly all of the original Sun devs that created ZFS in the first place still work on OpenZFS full time and are paid to do so. OpenZFS is very actively developed. They have 2 or more presentations per year about all of the changes they're constantly making and some of the upcoming big changes. Currently they are focusing on standardizing ZFS between FreeBSD, Illumos, and Linux. It's a large refactoring effort to have all of ZFS's code bases live in the same tree. One OpenZFS code tree for all OSes. Everyone will be in sync.

      While you can't shrink ZFS pools because they cannot do that atomically, and they refuse to do anything that allows the end user to shoot themselves in the foot, like leaving the FS in an inconsistent state, you can create a new pool that is smaller and import your larger pool into the smaller one, as long as it fits. Can't do it in-place, but you can do it. It just sucks to do that with a 1PiB+ pool. But who shrinks those?

    36. Re:BTRFS is getting there by Bengie · · Score: 2

      Don't use RAID5. When one drive dies, there is a very good chance another drive will die, even if that drive is a different model or brand.

    37. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      sure, then add the nice software raid tools and call it a day. XFS would be great for most ppl.

    38. Re:BTRFS is getting there by Bengie · · Score: 1

      Yes, you can add drives to the pool. Except in mirrored vdevs, you can't change the number of drives, but you can add more vdevs to the pool.

    39. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      No parity-based raid, e.g. raidz3 or raidz4 in BTRFS, so higher chance of data corruption or failure with it.

    40. Re:BTRFS is getting there by Bengie · · Score: 2

      They recently said that pointer-rewrite, which is required for re-balancing, will not happen. They have looked at the issue for a few years now and cannot figure out a safe way to make it work that wouldn't open up a window for dataloss. The only way to rebalance or shrink is to make a new pool on another set of drives, and import the existing pool.

    41. Re:BTRFS is getting there by fnj · · Score: 2, Informative

      I don't know what your definition of "significant" is, but the BTRFS wiki says "The one missing piece, from a reliability point of view, is that it is still vulnerable to the parity RAID 'write hole', where a partial write as a result of a power failure may result in inconsistent parity data." ZFS RAIDZ is expressly free from the write hole. That is very significant to me.

      RAIDZ's write hole advantage is a product of three specifics: (1) RAID5 has n data disks plus one dedicated parity-only disk; ZFS distributes all data and all parity across all disks - (2) ZFS updates metadata before data; RAID5 has no concept of metadata - and (3) COW (both have this).

      And before you object "but UPS" - UPSs and power supplies can fail, too - and a kernel panic is essentially a "power failure" too; one which a UPS is powerless to prevent.

      If that Wiki should be out of date, you can show me something that isn't, but all I find out there is a lot of outdated stuff.
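
      A toy illustration of the parity "write hole" described above, using XOR parity over three data blocks. It models only the generic in-place parity RAID case (data block updated, crash before the parity block is rewritten); the block contents are made up, and it does not model how ZFS or btrfs actually lay out or commit writes.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: three data blocks plus their XOR parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# With consistent parity, a lost block can be rebuilt from the survivors.
rebuilt_ok = xor_blocks(d0, d1, parity)

# Write hole: d0 is overwritten in place, then power fails before the
# parity block is updated, so the on-disk parity is stale.
d0_new = b"ZZZZ"
stale_parity = parity

# If the disk holding d2 now dies, the rebuild silently returns garbage.
rebuilt_bad = xor_blocks(d0_new, d1, stale_parity)

print("consistent stripe rebuilds d2 correctly:", rebuilt_ok == d2)
print("stale parity rebuilds d2 correctly:     ", rebuilt_bad == d2)
```

      The second rebuild fails even though d2 itself was never written to, which is why the hole matters: the damage lands on data that was not part of the interrupted write.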

    42. Re:BTRFS is getting there by rahvin112 · · Score: 1

      My understanding is that it does take advantage of COW. The problem is your parity has two copies, and you have two copies of the data that may or may not match the parity because it lost power during the write. This is why they call it a write hole because the algorithm can't be sure which copy of the written data is the right one because there are two copies of parity data as well. It's a tricky problem that's going to need either some pretty smart algorithms to sort out which copy is the right one or a prompt to the user to identify the correct file.

      On the other hand IIRC zfs suffers from a similar problem with power loss and it's one of the reasons zfs is recommended to have a battery backup to allow a clean shutdown. It is my understanding this is a general design problem with COW filesystems that really hasn't been completely solved yet. In that if you have a corrupted write with parity data it becomes difficult for the computer to identify the correct data and parity because it has multiple copies of each and some may or may not be corrupt. It's just a really tough problem.

    43. Re:BTRFS is getting there by rahvin112 · · Score: 3, Informative

      The features you list as "specific" to zfs exist in btrfs. btrfs can have dedicated parity drives or you can spread the data and parity across multiple drives in any order or pattern you would like.

      The write hole in btrfs is AFAIK also present in zfs and listed as a risk of a power failure during write on a raid pool with COW filesystems. This risk is that loss of power during write can result in multiple different parity blocks for the same data and that in such an instance the filesystem cannot identify the correct data or parity (depending on the order you write them) and there are only a few solutions to this that involve resorting to a known good (older) copy and result in lost data (from the write).

      IIRC this is a listed risk in the FAQ for ZFS. Just as the same write hole risk exists in btrfs. Also IIRC ZFS takes the path of writing parity before data such that it will lose new data rather than risk a corruption of existing parity blocks. Whereas, again IIRC btrfs COW's the new data then COW's the parity block which risks inconsistent parity but at less risk of data loss (as parity can be recomputed).

      Two different solutions to the same problem that is intrinsic to COW filesystems with parity data. Neither is particularly better IMO as both run the risk of data loss in an extreme event. Though such events are rare.

    44. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      You don't have to remove atomically to shrink. And it's not that they refuse. In fact, it's actually in progress. Go read this:

      http://blog.delphix.com/alex/2015/01/15/openzfs-device-removal/

    45. Re:BTRFS is getting there by rahvin112 · · Score: 2

      There are people that argue you shouldn't use raid at all unless it's 10. Raid isn't a backup solution. It's a performance and reliability solution. If you need data backup you should be using real backups, not raid.

    46. Re:BTRFS is getting there by Wolfrider · · Score: 1

      --You can definitely add more disks if you are using mirrored drives in your pool, instead of RAIDZ. I created a Linux ZFS RAID0 (no redundancy) pool with 2 brand-new drives initially, then bought 2 more drives of the same brand and capacity a month later, and upgraded the pool in-place with no downtime to a zRAID10.

      --If I want to expand the size of the pool, I can just add 2 more disks in a mirrored configuration.

      # zpool add mirpool mirror ata-ST9500420AS_5VJDN5KL ata-ST9500420AS_5VJDN5KJ

      --Note that this syntax is using Linux /dev/disk/by-id devices.

      --There are some caveats and best-practices that one should read up on, for instance using ashift=12 with 4K sector drives; and using GPT partition tables on ZFS disks; but ZFS has by far been the most reliable and useful filesystem I've ever used.

      REF:
      https://blogs.oracle.com/partn...
      http://zfsonlinux.org/faq.html
      http://jrs-s.net/2015/02/06/zf...
      https://jsosic.wordpress.com/2...
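
      On the ashift=12 advice above: ashift is the base-2 logarithm of the sector size the pool aligns to, so 12 corresponds to 4096-byte (4K) sectors and 9 to 512-byte sectors. A minimal sketch of that relationship; the helper name is made up for illustration and is not part of any ZFS tool.

```python
def ashift_for(sector_bytes: int) -> int:
    """Return the ashift value (log2 of the sector size) for a power-of-two sector size."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


for sector in (512, 4096, 8192):
    print(f"{sector:>5}-byte sectors -> ashift={ashift_for(sector)}")
```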

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    47. Re:BTRFS is getting there by AikonMGB · · Score: 1

      Device removal on ZFS may be a thing, and may not require block pointer rewrite; it's the latter that is probably not going to happen.

    48. Re:BTRFS is getting there by evilviper · · Score: 1

      I don't know why so many in the Linux community are so hooked on ZFS. BTRFS has a feature set that is rapidly getting there,

      I think you already explained it in that first sentence... ZFS has been stable, reliable, and successfully managing huge amounts of data for the past decade (2005). BTRFS is still unstable, not remotely a suitable alternative for ZFS, with only the vague promise of maybe eventually "getting there".

      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    49. Re:BTRFS is getting there by mcrbids · · Score: 2

      5 years ago, it seemed that BTRFS was rapidly getting there, and its inclusion into the kernel made it feel like a rather sure bet!

      (crickets)

      5 years later, BTRFS is still "rapidly" getting there. I've tried it numerous times and had horrible data loss events literally every single time, and this as recently as a month ago.

      Meanwhile, we're using ZFS on Linux in a complex production environment in a worst-case mixed read/write use case and it's been absolutely rock solid bullet proof, demonstrably more stable than EXT4. Yes. More stable than EXT4. And this while bringing so many incredible features to the administration table! Until you've lived with snapshots, replication, clones, pools, zvols, extendable pools, and dynamic resource allocation, it's like trying to explain Monet to a blind person.

      I sincerely hope that ZFS finally becomes a first class citizen in the Linux community.

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    50. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Well, you can't resize a vdev in ZFS. That's a serious design flaw in my opinion. It means you can't add a new drive to an existing array to expand the storage. If you want to add more storage you have to either totally rebuild the old array (which means you need to at least double your storage space or replace all the drives) or add on a whole new vdev array... Totally insane.

      In BTRFS or mdadm if you want to expand your storage you just add one or more drives whenever you want.

    51. Re:BTRFS is getting there by JBMcB · · Score: 1

      and pre-load the database (most common way of getting data into the database initially) then hit it with an OLTP workload, ZFS will perform *terribly* - because it got large streaming writes up front, allocated huge stripe sizes, which makes rewrite performance go to hell in a handbasket,

      L2ARC is supposed to fix this. I've heard good things about it.

      --
      My Other Computer Is A Data General Nova III.
    52. Re:BTRFS is getting there by turbidostato · · Score: 1

      "Don't use RAID5. When one drive dies, there is a very good chance another drive will die, even if the that drive is a different model or brand."

      True, but "very good chance" is still less than 100%.

      I have run a lot of systems on RAID5, so I have lost filesystems when a second drive (and even a third) died before the array had recovered from the previous failure. But the majority never did that, so, all in all, RAID5 showed its value.

      Maybe you are one of those that think RAID means "backup" instead of "higher MTBF".

    53. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      rapidly getting there is NOT there.
      btrfs is a bugridden hopeless joke.
      zfs has been strong since day one because they add features to the base.
      not write a bunch of code and pray like linux does.

    54. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      thats because both btrfs and linux are immature.
      bsd, specifically FREEBSD, have been around and honed and refined in the old school solid ways since day one.
      zfs works great on freebsd and i'm never going back to linux.

    55. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Sorry, ZFS can do pretty much anything regarding shrink, grow, rebalance, migration, defrag, etc.
      The modules simply haven't been written yet.
      Give it time, participate.

    56. Re: BTRFS is getting there by Guy+Smiley · · Score: 1

      Running any database on RAID-5/6 or RAID-Z storage will suck. Better to use mirrored storage for random IO workloads.

    57. Re: BTRFS is getting there by Anonymous Coward · · Score: 1

      Hell, piss, shit... Why do you talk like that?

    58. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      BTRFS single data, RAID1 metadata, RAID1 system on Linux 4.1.6 behaves thusly:

      When a drive drops out of an array, the current BTRFS transaction is aborted. The affected FS is dropped into read-only mode. You then umount the FS, bounce the drives, remount the FS, and you're good to go.

      A full scrub of the drives revealed neither data nor metadata corruption. *Examination* of the data on the drives revealed that data right up to the failure successfully made it on to the disk. Those drives have been happily chugging along, months after the incident.

      So, I'm not sure what happened to you, but it *definitely* didn't happen to me. Were you using an old (or perhaps known-buggy) kernel? Was your volume *maybe* mounted with nobarrier or nodatasum? Enabling *either* one of those options is a *sure-fire* path to data loss unless your underlying storage device is 100% robust in the face of unexpected failures.

      Were you maybe using BTRFS's RAID5 or RAID6 on an older kernel? Both the wiki and mailing lists mention that prior to 3.19 RAID5/6 support was not quite there, and that as of 3.14, error handling for RAID5/6 was not really present.

    59. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      I love ZFS, but the parent post is a joke, right? Just making sure that the poster isn't delusional...

    60. Re:BTRFS is getting there by Anonymous Coward · · Score: 1

      Raid 5 spreads the parity across the array. (Raid 5 is striping with parity).
      You're thinking of raid 4.

      I know this because I use raid 5 constantly just for that feature. It's saved my ass several times even though the rebuilds take forever. Like 2 days for my 20 TB array to rebuild from parity data if I replace a drive.

      source: https://en.wikipedia.org/wiki/Standard_RAID_levels

    61. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Because ZFS is mature and stable, while BTRFS is slowly getting there. In about 5 years BTRFS might be where ZFS is now today.

    62. Re:BTRFS is getting there by Jezral · · Score: 2

      I used to use ZFS on my hacky home backup solution (Linux in VirtualBox with USB storage - yes, I know), but it would corrupt the disks once per month or so. Switched to btrfs, and it just works.

      Features that btrfs has over ZFS, and I use:
      - Mutable snapshots. It is infuriating that ZFS's snapshots are immutable. Mind you, I very rarely modify snapshots, but I damn well want to be able to without having to dump+reload all data. This alone is reason enough that I'll never again use ZFS where btrfs is available.
      - Offline on-demand deduplication. Being able to dedup files when I want is very nice. cp --reflink is also super.
      - Sane hardware requirements. ZFS is designed for extremely high quality hardware (and lots of RAM) that doesn't lie to the OS, which is just not what most of us are running. btrfs is designed for everyday use.

      Features that I miss from ZFS:
      - Online live deduplication. But it's sooo sloooow and requires so much memory, that I don't miss it much.

      Aside from that they're pretty equal in my experience. They both offer transparent compression, which is what I really want.

    63. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      BTRFS has a feature set that is rapidly getting there

      Hahahaha, oh man! Thanks for the laugh!

      BTRFS has been "getting there" for several years, whereas ZFS has already been there for as long as BTRFS has existed. Sure, it currently has a small footprint in Linux distributions, but in addition to Linux several other operating systems successfully use it and it's a rock solid product.

      God, I'm still chuckling. Haha.

    64. Re:BTRFS is getting there by delt0r · · Score: 1

      We have had this happen more than once. Basically, thrashing the drives while the array is recovering forces, finds, or trips more failure states or something. However, the worst I ever had to deal with was a RAID setup for an Apple Time Machine backup. It was unrecoverable at every level because OS X did something stupid. Why anyone would use a Mac as a server I will never know.

      --
      If information wants to be free, why does my internet connection cost so much?
    65. Re:BTRFS is getting there by rl117 · · Score: 1

      This particular incident was on a much older kernel; this bug is now fixed.

      But I've been testing Btrfs regularly since the start. Every single time I've retried testing with it, I have hit a different bug or design flaw. It's been a repeated pattern since the start. It's gotten better for sure, but it's never reached the point of being truly solid, or performant. I can't trust it with my data, which is of course its primary purpose.

      The last time I used it in anger, I was doing repeated whole-archive rebuilds of Debian. Guess how long it ran before failing? 36 hours. That's it. 1.5 days from pristine new filesystem to unusable wreck. I'm hitting the "totally unbalanced" bug which makes the filesystem go readonly, and is "fixed" by rebalancing. But this bug means that the filesystem could randomly go readonly *at any point in time* with no warning. You couldn't possibly rely on that in production if it could fail at any moment, but that's the awful reality of the design of Btrfs. Now this is thrashing the disk continuously with several snapshots being created and deleted in parallel every minute, but imagine how long it would take on a typical fileserver or desktop. I don't know, I guess it might be usage- or load-dependent. And that's the point, it's an unknown; it could happen at any time to anyone using Btrfs. This is not really a bug; it's a fundamental design flaw.

      And apart from the bugs and design flaws, the performance is awful. For Debian package builds, while snapshotting was really fast, we had to deliberately disable all use of fsync due to this killing performance entirely while running dpkg. Arguably also a design flaw due to the way fsync forces a flush of the whole filesystem. No other filesystem has performance characteristics this bad.

    66. Re:BTRFS is getting there by geggo98 · · Score: 2

      ZFS is more battle tested. BTRFS is a very fine file system, but it is still stabilizing. They just recently added support for RAID 5 and 6, quite a big feature with lots of changes. A file system just takes a few years in the wild before it can be considered stable. There are just weird corner cases, misbehaving hardware, subtle bugs and so on that you will only find in the wild. When BTRFS with RAID 5 is about 5 to 10 years old, it can be considered stable. Until then, ZFS is the first choice for everything that holds real data.

    67. Re:BTRFS is getting there by jabuzz · · Score: 1

      When did snapshot support on XFS get removed? The problem with ZFS is that it gives you very little extra over mdadm/LVM/XFS. Basically it is just the combination of the three into an integrated stack.

      The biggest difference is the checksumming, but frankly if you care about this stuff then the checksumming that ZFS does is inadequate anyway, and you would be better off using DIF/DIX, which covers everything all the way up and down the storage stack and is a better solution than ZFS checksums.

      Finally, if you have the really large amounts of storage that ZFS is supposedly designed for, then frankly it sucks. There are no Information Life Cycle Management features for starters, and if you have a few hundred TB of storage you are way, way better off with a clustered storage system like GPFS than with ZFS. Smaller than that, the combination of LVM/XFS is perfectly adequate.

    68. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      I used to use ZFS on my hacky home backup solution (Linux in VirtualBox with USB storage - yes, I know), but it would corrupt the disks once per month or so. Switched to btrfs, and it just works.

      Features that btrfs has over ZFS, and I use:
      - Mutable snapshots. It is infuriating that ZFS's snapshots are immutable. Mind you, I very rarely modify snapshots, but I damn well want to be able to without having to dump+reload all data. This alone is reason enough that I'll never again use ZFS where btrfs is available.
      - Offline on-demand deduplication. Being able to dedup files when I want is very nice. cp --reflink is also super.
      - Sane hardware requirements. ZFS is designed for extremely high quality hardware (and lots of RAM) that doesn't lie to the OS, which is just not what most of us are running. btrfs is designed for everyday use.

      Features that I miss from ZFS:
      - Online live deduplication. But it's sooo sloooow and requires so much memory, that I don't miss it much.

      Aside from that they're pretty equal in my experience. They both offer transparent compression, which is what I really want.

      You can clone snapshots to produce writeable copies. snapshot+clone for making a writeable copy is analogous to fork+exec for starting a new program. The two steps are separate rather than one function.

      As for corruption on VirtualBox USB storage, it sounds like that did not honor flushes. You likely will have trouble with btrfs too, although btrfs does not use a merkle tree, which could hide issues such as things being out of sync.

    69. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      You will never encounter a situation where a partial write confuses ZFS because ZFS checksums will tell it whether the block is valid or not. That is a major difference between traditional RAID 5/6 and RAID-Z, because ZFS uses checksums to tell if the RAID-Z blocks are valid and it uses CoW to avoid doing read-modify-write operations on RAID-Z itself.

      As for the write hole, this is typically described in the context of performance because it requires read-modify-write to do in-place operations. btrfs describes it as a reliability issue, which implies that it lacks checksums for the RAID 5/6 blocks. My understanding is that Chris Mason copied the MD RAID code into btrfs, so it is quite likely that it lacks checksums for the RAID blocks. This is a design flaw that is very different from what ZFS does.

    70. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      I have been using ZFS on Linux for about the same amount of time now. No real issues (other than kernel module issues on CentOS 7). I am a Solaris admin at heart and use ZFS on enterprise systems, and in 5 or so years now we have not had any issues.

    71. Re:BTRFS is getting there by brambus · · Score: 1

      The write hole in btrfs is AFIAK also present in zfs and listed as a risk of a power failure during write on a raid pool with COW filesystems.

      The problem you describe makes no sense in ZFS. ZFS never overwrites in-place and a synchronous write is not acknowledged until all component devices (including parity) have sync'ed to stable storage. ZFS will never ever try to read a partially written stripe block (simply because it has no pointers to it yet). After a synchronous write (O_SYNC) returns, it is guaranteed to have all of its data available, regardless of whether it was overwriting a portion of a file or appending new data to a file.

      I think you're misunderstanding how raid-z actually works. raid-z is kinda like RAID-5, but not completely, and it's this difference that allows ZFS to not have a write hole at all. All writes to a raid-z, regardless of size, are "full-stripe". The key in ZFS is that there is no fixed stripe size. I'd recommend Jeff Bonwick's original article on raid-z for a writeup of the principles and Matt Ahrens' article ZFS RAIDZ stripe width, or: How I Learned to Stop Worrying and Love RAIDZ for a nice diagram illustrating the layout.

    72. Re:BTRFS is getting there by Bengie · · Score: 1

      RAID5 increases the chance of needing to go to backup. What if your backup device is also using RAID5? Now you've increased your chance of going to a secondary backup. The difference between RAID5 and RAID6(3 disk) can be the difference between rarely going to backup and never going to backup in your lifetime.

    73. Re:BTRFS is getting there by Bengie · · Score: 1

      RAID isn't backup, but your backup devices may be using RAID, and increasing the MTBF of your backup device is a worthy cause. The main benefit of RAID10 is IOPS. I would argue that using any FS that doesn't have checksumming is useless. If you don't know your data is good, what point is there?

      The main point is a small difference in investment gives a huge difference in benefit.

    74. Re:BTRFS is getting there by Anonymous Coward · · Score: 1

      "The write hole in btrfs is AFIAK also present in zfs and listed as a risk of a power failure during write on a raid pool with COW filesystems."

      You *don't* know. There's no write hole on zfs.

    75. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      https://blogs.oracle.com/bonwick/entry/raid_z

      No write hole on raidz

    76. Re:BTRFS is getting there by Eunuchswear · · Score: 1

      Amazing.

      I've done the same thing with ext3. (ext3 on lvm on mdadm).

      Why would a filesystem need to do this stuff, it's the job of the raid or volume manager layer.

      --
      Watch this Heartland Institute video
    77. Re:BTRFS is getting there by Eunuchswear · · Score: 1

      Me! Me!

      If your data is important you'd be mad to use anything other than raid10.

      --
      Watch this Heartland Institute video
    78. Re:BTRFS is getting there by Fweeky · · Score: 1

      That's only a very partial solution - vdev removal, not vdev shrinking. And it's got a pretty meh way of going about it (removing a vdev leaves a permanent layer of redirection in its place).

      What we want is something called "block pointer rewriting", which would allow far more flexibility in the modification of an existing pool - possibly even dynamically changing RAID levels on the fly. Unfortunately it's a massive job that nobody's sufficiently interested in solving.

    79. Re:BTRFS is getting there by Fweeky · · Score: 1

      - Mutable snapshots. It is infuriating that ZFS's snapshots are immutable.

      Er, snapshots should be immutable. They're used as sources for backups and replication, allowing them to be mutable would defeat the main purpose.

      zfs clone if you want a writable copy. What's wrong with that?

      ZFS is designed for extremely high quality hardware (and lots of RAM) that doesn't lie to the OS

      ZFS is designed to be robust in the face of crappy lying disks. That's what all the checksumming and self healing is about - ZFS will cope *far* better with your dire consumer drives than most traditional filesystems. But yes, it likes its RAM, and it likes its redundancy.

    80. Re:BTRFS is getting there by Fweeky · · Score: 2

      This is one of the things the Solaris-derived versions have tended to be better at handling - ZFS expects failing drives to be detected/managed by an external fault management service (fmd) which doesn't exist on other OSes. ZFS itself doesn't mark a drive as bad unless it outright disappears from the system.

    81. Re:BTRFS is getting there by Fweeky · · Score: 1

      Hence why the ZFS filesystem layer is built on top of its volume management layer, and why you manage things like that using the zpool command, not the zfs command.

    82. Re:BTRFS is getting there by brambus · · Score: 1

      While you are correct that ZFS w/ raid-z doesn't have a write hole problem, you got the reasons for it a little wrong, so consider these just helpful tips from a ZFS developer. The real trick with raid-z is ZFS' COW nature combined with the fact that all writes are full-stripe writes (variable stripe size). Alternatively, you could say that ZFS doesn't really have anything like a stripe, but instead has a variable block component distribution map which depends on the block's location and size. Here's the actual code that does the raid-z map computation.

      RAID5 has n data disks plus one dedicated parity-only disk; ZFS distributes all data and all parity across all disks

      RAID-5 also spreads parity among all component disks. Each stripe, the parity disk is switched. This is done to achieve higher throughput on reads, as without it, one disk would always sit idle for read workloads.
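
      To make the rotation concrete, here is a minimal sketch. It assumes one simple layout where parity moves back one disk per stripe; real implementations offer several layouts, and the names parity_disk and layout are made up for illustration.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Return the index of the disk that holds parity for a given stripe.

    Assumption: parity starts on the last disk and moves back one disk
    per stripe. This is only one of several possible rotations.
    """
    return (num_disks - 1 - stripe) % num_disks


def layout(num_stripes: int, num_disks: int) -> list:
    """Build one row per stripe, marking each disk as 'P' (parity) or 'D' (data)."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        rows.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return rows


if __name__ == "__main__":
    # Print the layout for 4 stripes on a 4-disk array.
    for stripe, row in enumerate(layout(4, 4)):
        print(stripe, " ".join(row))
```

      Over any run of num_disks consecutive stripes every disk holds parity exactly once, so no disk is reserved for parity alone.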

      ZFS updates metadata before data

      Actually, ZFS updates metadata together with user data, but the trick is that the update is never performed in place. So what happens is that we write user data along with nearly all the metadata needed to access it. Then, once everything has finished writing (and has been sync'ed to stable storage), we update the root block pointers to point to the new metadata tree and again, sync those. In this respect ZFS is much more like an ACID-compliant database than just a conventional filesystem.
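
      The write-then-switch-the-root idea can be sketched as a toy in-memory model. This is not ZFS code; the class CowStore and its methods are hypothetical names, and a real implementation also has to sync to stable storage between the two steps.

```python
class CowStore:
    """Toy copy-on-write store: existing blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}    # block id -> contents
        self.next_id = 0
        self.root = None    # id of the current metadata block (name -> data block id)

    def _write_block(self, contents):
        # Always allocate a fresh block; blocks already written stay untouched.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = contents
        return block_id

    def prepare(self, name, data):
        """Write new data plus a new metadata block, without publishing them."""
        meta = dict(self.blocks[self.root]) if self.root is not None else {}
        meta[name] = self._write_block(data)
        return self._write_block(meta)

    def commit(self, new_root):
        """Publish the update by switching the root pointer in a single step."""
        self.root = new_root

    def read(self, name):
        meta = self.blocks[self.root]
        return self.blocks[meta[name]]


store = CowStore()
store.commit(store.prepare("file", b"before"))
pending = store.prepare("file", b"after")
# A crash at this point leaves the root pointing at the old, intact tree.
assert store.read("file") == b"before"
store.commit(pending)
assert store.read("file") == b"after"
```

      Readers only ever follow the root, so they see either the complete old tree or the complete new one, never a mix.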

    83. Re:BTRFS is getting there by Marqis · · Score: 1

      (1) RAID5 has n data disks plus one dedicated parity-only disk; ZFS distributes all data and all parity across all disks

      RAID5 distributes parity and data across all disks, having a single dedicated parity-only disk is RAID3.

    84. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      I'm curious - what sorts of data at home do you store that contain lots of duplication? The bulk of my home data consists of pictures, videos, and music, none of which would contain duplicate data.

    85. Re:BTRFS is getting there by ilsaloving · · Score: 1

      Because up until fairly recently, it wasn't even considered stable. Its RAID capabilities are also still immature. Those are two very big blockers for anyone who considers not only their data, but their usable time, to be sacrosanct.

      Surprises should be left for birthday parties, not servers.

    86. Re:BTRFS is getting there by Aaden42 · · Score: 1

      For me, lack of RAID-5 is enough reason to consider BTRFS deficient. The glowing accolades as to the reliability and accuracy of fsck tools for BTRFS leave a bit to be desired: https://btrfs.wiki.kernel.org/...

      I looked long & hard at BTRFS about four months ago and was considering migrating from ZFS. It blew my mind how much was lacking in BTRFS in comparison and that anyone would consider it superior in any way other than that it's in-tree.

    87. Re:BTRFS is getting there by Aaden42 · · Score: 1

      Basically it is just the combination of the three into an integrated stack.

      Basically just, that's exactly why it excels. The benefits from the integration are very significant in real world use. Zero overhead snapshots, resilver only used blocks instead of blindly copying an entire device block by block, scrubbing only used blocks in pool to ensure all copies are consistent across devices & no bit rot has occurred (and FIXING it if it has).

      You dismiss the checksumming, but that's how ZFS detects bit rot, even on a RAID-1 mirror. Mirror a device without checksums, and say one device has a bad write that is still readable (i.e. no device-level read error). Without checksumming, you have no way of knowing which copy is correct. ZFS can detect and recover from this during a scrub or automatically and transparently on read.
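
      A rough sketch of why the checksum matters on a mirror. This is a toy model, not ZFS internals: the function names are made up, SHA-256 is just a convenient stand-in for whatever checksum the pool uses, and the expected checksum is assumed to be stored separately from the data.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum of a block (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).hexdigest()


def read_mirrored(copies: list, expected: str) -> bytes:
    """Return a copy whose checksum matches, repairing the copies that do not.

    `copies` is a mutable list with one bytes entry per mirror side;
    `expected` is the checksum recorded when the block was written.
    """
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("every copy fails its checksum; restore from backup")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good  # rewrite the bad side from the good one
    return good
```

      Without the stored checksum, the two sides of the mirror simply disagree and nothing says which one to trust.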

      The integration really does make ZFS a superior system.

    88. Re:BTRFS is getting there by Aaden42 · · Score: 1

      You misunderstand the differing purposes of RAID-X and backup.

      `sudo rm -rf /`

      Tell me how having 27 redundant drives in my RAID array saved me from going to backup in my lifetime.

    89. Re:BTRFS is getting there by Aaden42 · · Score: 1

      The value of "very good chance" is up for significant debate. I've replaced failing drives in RAID-X (for X >1) arrays a few dozen times. Haven't had that second loss happen yet. I'm not saying it's not possible, but it's not something that keeps me awake at night given the other compensating controls I have in place.

      Among those compensations:
      1) ZFS resilver != MD resync. If I lose a 4TB drive that had 2TB of stuff actually on it, I'm only copying 2TB worth of blocks. 50% less wear & tear on the other array members, less likelihood of getting shot a second time while the Doctor's regenerating...

      2) ZFS self-healing on read. Every time blocks are read, any individual device read problems get detected (by checksum), repaired (by block relocation), and signaled to userland (by the `zpool status` command, and also as picked up by my system monitoring solution (Zabbix) and alerted to me via email & phone push).

      3) `zpool scrub` which is an on-demand, read the entire drive & apply (2). Scheduled to run weekly, I know if there are issues starting to occur long before a device fails.

      4) Freaking backups. On another machine. Maybe on tape if you've got the $$$. In my case, "last year's" (or 3-4 years'...) disks go in another server which gets powered up weekly for zfs send/receive, then powered down again. Also RAIDz, also subject to maybe failing at the same time as the others, but we're well past lottery odds at that point I think...

      Any errors signaled by 2 or 3 get a couple of shots at the reseat the device, retry game. After that, the drive gets subbed out. Sometimes the reseat is enough for months or years of additional error free operation, sometimes the device was really going. Either way, ZFS has warned me far enough in advance to remediate before a second failure.

    90. Re:BTRFS is getting there by OrangeTide · · Score: 1

      I've used cheap snapshots on WAFL, modified XFS, and plenty of other systems over the years. It's incredibly useful to have, from being able to backup or replicate over a slow link, to making an automated build server that always has the latest code checked out. It's incredibly frustrating to me that these features that have been available on Linux 10 years ago aren't ubiquitous yet when they can open so many new workflows.

      Checksum is really nice when you can't do RAID-5. You summed up the real weakness in RAID-1, in that once the data diverges it's hard to determine which is good and which is bad. Of course a volume manager can provide you a block level checksum today, and from that you can run any file system you want on top of it. Some of the more expensive RAID controllers already support chunk checksum in hardware in case there is some performance issue with doing it in software. But those tend to force you to rewrite the entire chunk so the checksums can be recalculated, can't do a small single block modification (512/1K/4K/whatever) on those platforms. ZFS gives you some extra flexibility in that regard.

      --
      “Common sense is not so common.” — Voltaire
    91. Re:BTRFS is getting there by NitroWolf · · Score: 1

      My problem with BTRFS and ZFS, and I admit I may be in the minority, is the handling of RAID. Creating a raid setup is fantastically easy in ZFS and BTRFS and is miles ahead of mdadm. However, the problem comes when you want to expand your raid. If you want to increase your capacity, you have to create a whole new raid the same size as your old raid.

      I'm sorry, but I really don't want to put together another 16 TB of disks and add another 16TB to my raid. I just want to add another 3 or 6 TB hard drive and expand it that much. I don't consume TB of data in the span of a few weeks or month. Adding an additional 3 TB to my RAID will last me for another year or so. It would be pointless to add another 16 TB and it would waste 2 additional disks for no reason.

      If I add another 3 TB disk to my RAID6 under mdadm, I get another ~3 TB. If I add another eight 3 TB disks to my current raid, I get another 24 TB. If I add another 8 disks to a ZFS or BTRFS raid setup, I get another... 16TB. Fuck that.

      Other than that, I haven't found anything that I dislike about ZFS or BTRFS... but the RAID situation is a real deal killer.

    92. Re:BTRFS is getting there by fnj · · Score: 1

      rahvin112 - You are unquestionably mistaken about the write hole. The design and implementation of ZFS specifically banishes it. Other respondents have made this clear with some authority, and I will only add this reference:

      "RAID-Z is a data/parity scheme like RAID-5, but it uses dynamic stripe width. Every block is its own RAID-Z stripe, regardless of blocksize. This means that every RAID-Z write is a full-stripe write. This, when combined with the copy-on-write transactional semantics of ZFS, completely eliminates the RAID write hole. RAID-Z is also faster than traditional RAID because it never has to do read-modify-write."

      Now, it is quite conceivable to me that BTRFS could possibly achieve a similar result, if it melds redundancy at the file system level as ZFS does, and uses similar careful design. But NOT just by the fact of implementing RAID5, which I read as your claim.

      brambus has an excellent response which I am sure you have seen. He corrects some mis-speaks of my own, which are a bit at the detail level, including precisely how ZFS closes the write hole. So it doesn't finish the metadata update and then the data update. It instead atomically sets a pointer to the updated COW writes at the end. Either you get the uncorrupted "before", or the uncorrupted "after", depending on exactly when your power fails.

    93. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      > This particular incident was on a much older kernel; this bug is now fixed.

      Frankly, you should have prefaced your initial comment with this disclaimer. Even EXT4 has had data-loss bugs, and it's a *far* less ambitious project that's *widely* considered to be stable and mature.

      Anyway. I -too- have been running with btrfs nearly from the start. [0] It sees use on most of my systems as rootfs and /home, so I don't make use of anything fancier than RAID1 and don't often make a squillion snapshots. However, I *was* using docker for a while a while back and using its btrfs backend, which makes *liberal* use of snapshots.

      Aside from some ENOSPC issues *way* back in the day, and a pretty reliably-triggered PAX overflow detection when decompressing transparently compressed data on my 32-bit laptop [1][2], I haven't run into anything catastrophic.

      ISTR that the btrfs folks talked about introducing code that makes fsync *much* smarter say... six months to maybe a year-and-a-half ago. When were you doing the Debian package build stuff, and did you report those snapshot-induced failures to the mailing list? If you did, did they ever follow up with a fix that made it into a later kernel version?

      [0] I started using it back when the Vertex LE was *just* commercially released. I've actually been running btrfs on the same Vertex LE ever since I purchased the drive so very, very many years ago.

      [1] I'm *eagerly* awaiting the 4.2 Grsecurity patch set. For now, I'm stuck on the 4.1 kernel series. :(

      [2] The BTRFS volume I mentioned in my first comment is attached to a 64-bit system running the same kernel as my laptop, is mounted with compress-force=zlib, and doesn't encounter those overflow issues I see on my laptop. So, it's a good bet that the issue is 32-bit specific. I *really* want to see if the issue is fixed in 4.2, or if this is yet another false positive in PaX's overflow detection, but, alas. :P

    94. Re:BTRFS is getting there by fnj · · Score: 1

      Raid 5 spreads the parity across the array. (Raid 5 is striping with parity). You're thinking of raid 4.

      Thank you. Error accepted. I got bitten by some bad material on the web. What ZFS does do differently is this: every (variable size) logical block written is its own stripe. It does not have a fixed stripe size like RAID5, which more than one block may share. This eliminates some read-modify-writes which RAID5 has to do (with the attendant reduction in throughput), as well as being instrumental in closing the write hole. It can do this because the redundancy is implemented at the same level as the filesystem. I.e., it is aware of the data structure, which traditional RAID5 is not.
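
      A simplified sketch of "every block is its own stripe", using single XOR parity. It is an illustration only, not the real RAID-Z on-disk layout: the tiny SECTOR size and the function names write_block and rebuild are assumptions made up for the example.

```python
from functools import reduce

SECTOR = 4  # bytes per sector; tiny on purpose so the example is easy to follow


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def write_block(block: bytes) -> list:
    """Split one logical block into data sectors and append one parity sector.

    The stripe is exactly as wide as this block needs, so writing it never
    requires reading and updating a stripe shared with other blocks.
    """
    if not block:
        raise ValueError("block must not be empty")
    padded = block + b"\0" * (-len(block) % SECTOR)
    sectors = [padded[i:i + SECTOR] for i in range(0, len(padded), SECTOR)]
    return sectors + [reduce(xor, sectors)]


def rebuild(stripe: list, lost: int) -> bytes:
    """Recover the sector at index `lost` by XOR-ing all surviving sectors."""
    survivors = [s for i, s in enumerate(stripe) if i != lost]
    return reduce(xor, survivors)
```

      A 4-byte block becomes a 2-sector stripe and a 10-byte block becomes a 4-sector stripe here; in both cases the parity covers that one block and nothing else.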

    95. Re:BTRFS is getting there by Anonymous+Psychopath · · Score: 1

      I don't know why so many in the Linux community are so hooked on ZFS. BTRFS has a feature set that is rapidly getting there, it's becoming more and more mature in terms of code that is already in the upstream.

      Why not just put your energy there?

      If you eliminate the word "Linux" it changes the context slightly. If data resiliency is your goal, FreeBSD+ZFS is a better solution than Linux+BTRFS, unless your specific use case makes the OS more important than the filesystem for some reason.

      --

      Eagles may soar, but weasels don't get sucked into jet engines.

    96. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      BTRFS is rapidly changing, and generally considered unstable. I don't care what you say, read the bug logs.

      BTRFS also doesn't support many features that non-cloud users want, such as 3-disk parity. I believe I recall hearing that someone wrote a patch to allow it, and it was rejected or ignored.

      btrfs doesn't have the equivalent of an SLOG (external journal) so that writes are ack'd onto faster disks, and periodically written out to main disk.

      btrfs doesn't have the equivalent of an L2ARC (SSD read only cache for highly accessed data) as zfs has.

      What features does btrfs beat ZFS in?

    97. Re:BTRFS is getting there by fnj · · Score: 1

      brambus - thank you for improving my knowledge of the details of this wonderful product - and thank you as a developer! There is a lot of bad info out there. At least I knew that the integrity was very cleverly guaranteed, even if I unwittingly took liberties with the details.

    98. Re:BTRFS is getting there by fnj · · Score: 1

      marquis - thank you - correction noted. It's been too long since I have used any kind of traditional RAID.

    99. Re:BTRFS is getting there by Jezral · · Score: 1

      Er, snapshots should be immutable. They're used as sources for backups and replication, allowing them to be mutable would defeat the main purpose.

      zfs clone if you want a writable copy. What's wrong with that?

      The problem with zfs clone is that "clones can only be created from a snapshot" which means that deleting a file from a clone does not delete the file from the underlying snapshot, so the space is never actually freed. So when I accidentally have a very large temporary file in my backup set, it's stuck taking up space until it cycles out of history.

    100. Re:BTRFS is getting there by Jezral · · Score: 1

      I'm curious - what sorts of data at home do you store that contain lots of duplication?

      I should've qualified that. The home backup system is the part of it that I have here at home, but the data is from several servers around the world, plus my personal files. And of course there's other backup sites so it doesn't 100% rely on my house or connection. And I have since improved my part with a dedicated machine rather than VirtualBox, though still USB attached storage because I had the disks anyway.

    101. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      The ZFS implementation is advertised as specifically immune to the write hole.

    102. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Incorrect; ZFS does not suffer from the write hole, and does NOT recommend a battery backup for clean shutdown. You do need an SSD that is safe in the event of a power loss if you want FAST write performance and a guarantee of your data. If the device is not power safe, it'll still recover up to whatever was written properly to the SLOG.

      https://blogs.oracle.com/bonwick/entry/raid_z

    103. Re:BTRFS is getting there by greenfruitsalad · · Score: 1

      i also agree with the parent. oracle develops their ZFS with much needed features like encryption, while we, free software users, are still stuck on minor adjustments to v28 ZFS because almost all developer effort is concentrated on implementation of openzfs among all the OSs. that's not development of ZFS. it's development of implementation.

      OpenZFS should've dumped the name long time ago and gone its own way. the name just confuses people. when openzfs gets encryption completely incompatible with oracle zfs', it'll just lead to a lot of problems when people try to import pools from solaris on illumos/linux (or vice versa).

      i don't think oracle is EVER going to publish sources to their ZFS. i do hope btrfs stabilises soon. i'll happily switch (a few years after early adopters).

    104. Re:BTRFS is getting there by greenfruitsalad · · Score: 1

      if he's anything like me, it's virtual machines. i have about 50 virtualbox disk images, about 100 linux containers and since i've enabled deduplication i've become extremely lazy and use duplicity instead of hardlinks for my files. e.g. i have same photos in ~/Pictures/by_occassion/whatshisname_wedding as i have in ~/Pictures/by_date/2013-06. i know it's bad practice but if the FS can deal with it, why not?

    105. Re:BTRFS is getting there by rl117 · · Score: 1

      This isn't a problem with the tools though. Having immutable snapshots is a part of the design, and a good feature at that. Writable snapshots with Btrfs cause more problems than they solve; most of the time I want the snapshot to be an immutable snapshot of the state at a certain point in time and being writable removes that certainty.

      If you have this requirement, then make a clone, delete the file and snapshot the clone; you now have a snapshot with the offending file removed. Slightly annoying, yes, but certainly possible.

    106. Re:BTRFS is getting there by rl117 · · Score: 1

      It was simply one example (of several similar examples) which happened over a long period of time. I've suffered every time I've used it, from its early days right up to last year. Yes, that particular bug is gone. But there are plenty more which took its place and were similarly awful.

      The snapshot issues are the ENOSPC metadata/data block allocation bug; I didn't report it because it's well known. So much so that it's on the Btrfs wiki. I've not seen any evidence that it's been fixed in the meantime.

      Regarding fsync, maybe it's been improved in the last year. I won't hold my breath though. It's a fundamental problem with using b-trees as the on-disk structure; you're forced to write out all prior pending transactions in addition to your own in order to maintain correct operation ordering and consistency within the tree. Maybe they found a way around that, if so I'd be interested in any pointers.

    107. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      My understanding is that it does take advantage of COW. The problem is your parity has two copies, and you have two copies of the data that may or may not match the parity because it lost power during the write. This is why they call it a write hole: the algorithm can't be sure which copy of the written data is the right one because there are two copies of parity data as well. It's a tricky problem that's going to need either some pretty smart algorithms to sort out which copy is the right one or a prompt to the user to identify the correct file.

      On the other hand IIRC zfs suffers from a similar problem with power loss and it's one of the reasons zfs is recommended to have a battery backup to allow a clean shutdown. It is my understanding this is a general design problem with COW filesystems that really hasn't been completely solved yet. In that if you have a corrupted write with parity data it becomes difficult for the computer to identify the correct data and parity because it has multiple copies of each and some may or may not be corrupt. It's just a really tough problem.

      Your understanding is wrong. The Open ZFS documentation says that battery backups are not necessary:

      http://open-zfs.org/wiki/Hardware#Controllers

      I have never heard of anyone recommending the use of a battery backup with ZFS. The battery backup units are potential points of failure that ZFS developers explicitly advise against using.

    108. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      > Yes, that particular bug is gone. But there are plenty more which took its place and were similarly awful.

      It's *still* good to mention in your complaints roughly *when* the failures happened and *roughly* what kernel rev you were using. Btrfs development is pretty fast; stuff that went wrong in a six-month-old kernel is far more likely than not to be fixed now. (As demonstrated by the difference in failure behavior between your drive loss incident and mine. :) )

      > The snapshot issues are the ENOSPC metadata/data block allocation bug; I didn't report it because it's well known. So much so that it's on the Btrfs wiki.

      Are you talking about this? https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_get_.22No_space_left_on_device.22_errors.2C_but_df_says_I.27ve_got_lots_of_space

      If you *are*, then know that btrfs grew a global reserve pool a little while back that -I suppose- is intended to help handle these things. You can't see it with btrfs fi df, but you *can* see it with btrfs fi usage:

      ~ # btrfs fi usage /
      Overall:
              Device size: 37.17GiB
              [snip]
              Global reserve: 512.00MiB (used: 0.00B)
              [snip]

      As an aside, if you run into ENOSPC errors when running a btrfs balance operation on a volume that has less than 1GB of free space, from what I've been able to deduce from my testing, 100% full btrfs chunks take up 1GB of space. In order to rebalance such a chunk, btrfs currently needs at least 1GB of space free to do the rewrite.

      > ...I'd be interested in any pointers [regarding fsync perf].

      https://btrfs.wiki.kernel.org/index.php/Changelog#By_version_.28linux_kernel.29 lists several kernel versions where fsync safety and/or perf improvements happened. There are several threads in linux-btrfs in the past ten+ months that mention scattered fsync performance improvements. Some of them are *likely* unrelated to your workload (for example, improvements to nocow fsync perf), but some of them *might* be. I don't know enough about either btrfs internals or your workload to be able to point out which of the many would be relevant to your interests.

      This set of somewhat recent (2015-04-28) benchmarks from the Tux3 people shows *quite* favorable fsync performance from BTRFS on spinning rust: http://www.spinics.net/lists/kernel/msg1977366.html .

      However, Dave Chinner re-runs the tests on an SSD http://www.spinics.net/lists/kernel/msg1978233.html and gets *really* bad numbers for BTRFS.

      BUT, make sure to read the rebuttal and re-test results on tmpfs-backed devices by Daniel Phillips (the fellow who ran the initial tests): http://www.spinics.net/lists/kernel/msg1978483.html . It's a damn shame that neither Phillips nor Chinner mention the kernel version that they're using. One presumes that -because they're testing an experimental FS- it's the latest available, but presumption is not a good way to do science. ;)

    109. Re:BTRFS is getting there by turbidostato · · Score: 1

      "RAID5 increases the chance of needing to go to backup"

      "Increases" implies a comparison. RAID5 increases chance of needing to go to backup compared to what? JBOD?

      Of course RAID6 increases MTBF versus RAID5, that's its purpose. Now, you say from "rarely going" to "never in your life time". Surprise: cost also increases in more or less the same amount. Oh! and, by the way, I already told you that I managed quite a lot of systems. I *also* lost (in my lifetime, no less) RAID6 systems: once because of a malfunctioning controller and another one, believe it or not, because of three disks dying in quick succession (from an array of seven).
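      To put rough numbers on "rarely" versus "never", here is a back-of-the-envelope sketch. It assumes every disk fails independently with the same probability during some window (say, one rebuild), which real arrays do not guarantee: batches of disks and bad controllers fail together. The 5% figure and the function name are purely illustrative.

```python
from math import comb

def array_loss_probability(disks, tolerated, p_fail):
    """Chance that more than `tolerated` of `disks` fail in one window.

    Assumes independent, identical per-disk failure probability `p_fail`
    (an assumption made for illustration only).
    """
    return sum(
        comb(disks, k) * p_fail**k * (1 - p_fail) ** (disks - k)
        for k in range(tolerated + 1, disks + 1)
    )

# Seven disks: single parity (RAID5-like) vs double parity (RAID6-like).
for label, tolerated in (("single parity", 1), ("double parity", 2)):
    print(label, array_loss_probability(7, tolerated, p_fail=0.05))
```

      Double parity lowers the number but does not make it zero, and correlated failures like the ones described above sit outside this model entirely.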

    110. Re:BTRFS is getting there by Bengie · · Score: 1

      Please explain your non-RAID disk-based backup solution? Don't even mention tape, that's irrelevant to this discussion, it's just another medium. You can store your data on punch-cards for all I care.

    111. Re:BTRFS is getting there by Bengie · · Score: 1

      That's why I said "RAID6(3 disk)". I have a cousin that has been a guest speaker for several national talks about how to properly setup your storage, and he managed a 10PiB logical datastore for nearly a decade using RAID6(RAIDZ3) and never had to go to backup. Your servers should be setup in a way that a single controller does not take down the array. Even better is you can setup a master-master/slave shared SAS plane, so if one file-server dies, the slave picks up. Yes, you can allow multiple computers to directly share the same physical harddrives, FreeBSD supports that. It's file system agnostic and works transparently.

    112. Re:BTRFS is getting there by turbidostato · · Score: 1

      "I have a cousin that has been a guest speaker for several national talks about how to properly setup your storage"

      And the hope is that he knows better what he talks about than you.

      "Your servers should be setup in a way that a single controller does not take down the array"

      No. My servers need to be setup in a way that supports my business case. Sometimes that means going real time fully geographically redundant out of multi-tiered multi-rack-sized storage units (I also managed e.g. 3PAR arrays... you know you are starting to talk big when you count your storage real estate by rack cabinets, not disks), some others, I can and should go with a single humble COTS SATA disk.

      "Even better is you can setup a master-master/slave shared SAS plane, so if one file-server dies, the slave picks up. Yes, you can allow multiple computers to directly share the same physical harddrives, FreeBSD supports that."

      And here you show for what you are, my cute PFY..

    113. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      There is no such thing as a "ZFS array". ZFS abandons the block device centric notion of storage that RAID does, so the term array is inappropriate for it. Instead, you need to think of it as a vdev tree. The top level (below the root) vdevs is where the raidz and mirrored vdevs go. You can always add more. Where you cannot add more is a raidz vdev.

    114. Re:BTRFS is getting there by brambus · · Score: 1

      Glad I could help.

    115. Re: BTRFS is getting there by BitZtream · · Score: 1

      Running a database on raid5/RAIDZ is pretty stupid for a whole bunch of reasons many of which are documented in man ZFS

      The performance will be shit no matter what you do.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    116. Re:BTRFS is getting there by jwhitener · · Score: 1

      I don't know why so many in the Linux community are so hooked on ZFS. BTRFS has a feature set that is rapidly getting there...

      I don't know about others, but for me it is because ZFS has been enterprise ready for a long time. I've been using it for years at work. So a mixture of trust, familiarity, maturity, etc.

      I have not used BTRFS yet, mainly because I haven't encountered a situation that needed anything more than ZFS is currently providing.

    117. Re:BTRFS is getting there by Bengie · · Score: 1

      What? FreeBSD supports master-master (only one drive writes at a time per block device, which is negotiated) shared physical HDs over SAS. You just need a high speed link between the two masters, and FreeBSD figures out the current master at the CAM layer, allowing it to work with all filesystems. ZFS is nice in that you can simply do asynchronous constant ZFS replication to a remote pool. That's not a real backup, since data loss will be replicated to the remote machine.

    118. Re: BTRFS is getting there by BitZtream · · Score: 1

      You can attach and detach block devices from vdevs at will. You can't remove top level vdevs. You can add vdevs. I've done it many times

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    119. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      vdev removal is mainly a way to fix accidentally doing "zpool add -f pool /dev/by-id/somecache" when you really meant "zpool add -f pool cache /dev/by-id/somecache". That's happened to real people by accident when the "-f" is actually required (for example when trying to share a single physical cache device among multiple pools and/or storage subsystems). Or missing out the "mirror" or "raidz3" keywords. Or things like that, which can happen even if you carefully use the "-n" flag. Even though it's a mistake you probably will only ever make once, if someone does it to your huge pool, you're going to have a bad several days.

      vdev removal where the to-be-removed vdevs are very new is worth the overhead of a tiny amount of permanent extra data in the pool.

      You're right that it's not a good solution for removing a very full vdev.

      In fact, the only type of top level vdev that can be removed is a single block device or file. You cannot presently remove even broken mirror vdevs (although you can force detach in that case), let alone raidzNs. You also have to be wary about the remaining top level vdevs, although there's been iterative improvements there.

      Finally, in the code Delphix released, the indirection data goes away as the referenced blocks go away, and if you are simultaneously freeing during the removal (undoing some ill-timed writes to a pool that just got badly configured), you reduce the amount of indirection data that gets stored in the first place.

      The random IOPS during all of this will probably be the biggest bite, though. As usual.
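      For what it's worth, the slip described above is easy to catch mechanically before the command ever runs. A minimal sketch of such a check follows; `lint_zpool_add` is a hypothetical helper, not part of any ZFS tooling, and it only looks at the argument list without invoking zpool:

```python
# Words that zpool reads as vdev types rather than device paths.
VDEV_KEYWORDS = {"mirror", "raidz", "raidz1", "raidz2", "raidz3",
                 "spare", "log", "cache"}

def lint_zpool_add(argv):
    """Return warnings for a 'zpool add' argument list (hypothetical helper)."""
    if argv[:2] != ["zpool", "add"]:
        raise ValueError("expected a 'zpool add ...' command line")
    positional = []
    skip_next = False
    for arg in argv[2:]:
        if skip_next:            # value belonging to a preceding -o
            skip_next = False
        elif arg == "-o":
            skip_next = True
        elif not arg.startswith("-"):
            positional.append(arg)
    if len(positional) < 2:
        raise ValueError("expected a pool name and at least one vdev")
    pool, vdev_spec = positional[0], positional[1:]
    warnings = []
    if not VDEV_KEYWORDS.intersection(vdev_spec):
        warnings.append(
            f"no vdev keyword given: {', '.join(vdev_spec)} would become "
            f"new top-level data vdev(s) in pool '{pool}'"
        )
    return warnings

print(lint_zpool_add(["zpool", "add", "-f", "pool", "/dev/by-id/somecache"]))
print(lint_zpool_add(["zpool", "add", "-f", "pool", "cache", "/dev/by-id/somecache"]))
```

      It would not replace reading the "-n" output carefully, but it flags exactly the missing-keyword case.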

    120. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Zfs will quite correctly stop using a drive that posts more than a small handful of checksum errors (or I/O errors) outside of a scrubbing or resilvering context. Zpios is used to test that functionality in zfsonlinux.

    121. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Raidz, or rather zfs's zio & spa layers generally, does not update "metadata before data". Zfs quiesces and syncs a transaction group which results in an as-linear-as-possible and highly aggregated set of LBAs-to-be-written, which are then elevator-algorithmed out to the leaf devices. By the start of the sync most of the data has been processed and metadata generated from that (the data has to be processed first since metadata blocks always store the checksums of their children). Finally, at the end of a transaction group sync, the uberblocks are updated, and since they are the root of the tree of metadata-to-data, really it's metadata-last.

      That the Merkle-tree roots are written last is why zfs is crash-resistant; a crash should *never* lead to an inconsistent pool -- the worst that should happen is that a transaction group is never finalized and thus its partially-written, asynchronous data are unreachable after the pool is imported again. (Unacknowledged synchronous data will also be lost, but acknowledged synchronous data will be stored firmly in the pool log or in a separate log device).
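      The ordering described above (data, then metadata carrying the children's checksums, then the root pointer last) can be shown with a toy model. This is a sketch of the general copy-on-write idea only; the class and its layout are invented for illustration and have nothing to do with the real on-disk format:

```python
import hashlib

class TinyCowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # address -> bytes
        self.next_addr = 0
        self.root = None     # (address, checksum) of current metadata block

    def _write(self, payload):
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = payload
        return addr, hashlib.sha256(payload).hexdigest()

    def commit(self, files, crash_before_root=False):
        # 1. Data blocks go to fresh addresses first.
        entries = []
        for name, data in sorted(files.items()):
            addr, checksum = self._write(data)
            entries.append(f"{name}:{addr}:{checksum}")
        # 2. The metadata block records each child's address and checksum.
        meta = self._write("\n".join(entries).encode())
        if crash_before_root:
            return  # simulated power loss: new blocks exist but are unreachable
        # 3. Only now does the root pointer move to the new tree.
        self.root = meta

    def read_all(self):
        if self.root is None:
            return {}
        meta_addr, meta_checksum = self.root
        meta = self.blocks[meta_addr]
        if hashlib.sha256(meta).hexdigest() != meta_checksum:
            raise IOError("metadata checksum mismatch")
        result = {}
        for line in meta.decode().splitlines():
            name, addr, checksum = line.split(":")
            data = self.blocks[int(addr)]
            if hashlib.sha256(data).hexdigest() != checksum:
                raise IOError(f"checksum mismatch in {name}")
            result[name] = data
        return result

store = TinyCowStore()
store.commit({"a": b"old"})
store.commit({"a": b"new"}, crash_before_root=True)
print(store.read_all())  # still the last fully committed tree
```

      A crash at any point before step 3 leaves the old root pointing at an old, fully checksummed tree, which is the property being described.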

    122. Re:BTRFS is getting there by Anonymous Coward · · Score: 0

      Device removal is a thing; Delphix has released code. However it is not the thing you really want.

      The use case is:

      "zpool add -f foo /dev/bar" instead of "zpool add -f foo cache /dev/bar" # oops ! destroy pool time !

      Removal creates a lot of indirection data on the pool for each live object on the device being removed.

      The released removal code also cannot yet remove anything but a single-device leaf vdev. If you accidentally do "zpool add -f foo mirror /dev/c1 /dev/c2" instead of "zpool add -f foo log mirror /dev/c1 /dev/c2", it still can't help you (at present).

      There are restrictions on the surviving vdevs; at present they also must be single-device leaf vdevs (either individual disks or files), although code exists which works for the case where (2-way but not arbitrary) mirrors are the survivors.

      In short, it's for dealing quickly with a frustratingly common operator error (almost everyone does it exactly once eventually) that is otherwise hard to back out of.

      You are correct that it does not require (nor is anything like) block pointer rewrite.

      This is a bit old. A refresh will happen around zfs day (which is next week).

      http://blog.delphix.com/alex/2...

    123. Re:BTRFS is getting there by Fweeky · · Score: 1

      The problem with zfs clone is that "clones can only be created from a snapshot" which means that deleting a file from a clone does not delete the file from the underlying snapshot, so the space is never actually freed

      zfs promote clone-filesystem && zfs destroy clone-filesystem@snapshot-it-was-based-on
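      If you wanted to script that pair of commands, a dry-run sketch might look like the following. The dataset and snapshot names are placeholders, the helper is hypothetical, and nothing is executed; whether the destroy succeeds on a real pool depends on what else still references that snapshot.

```python
import shlex

def promote_then_destroy_origin(clone, origin_snapshot):
    """Build the two commands quoted above for a given clone (dry run only).

    `clone` is the clone's dataset name and `origin_snapshot` is the name
    of the snapshot it was created from; both are caller-supplied.
    """
    return [
        ["zfs", "promote", clone],
        ["zfs", "destroy", f"{clone}@{origin_snapshot}"],
    ]

# Placeholder names for illustration only.
for command in promote_then_destroy_origin("tank/backup-clean", "2015-10-01"):
    print(shlex.join(command))
```

      Printing the commands first and running them by hand keeps a typo in a dataset name from turning into a destroyed snapshot.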

  9. EXT4 support in Marshmallow by tuppe666 · · Score: 1

    Android finally gets EXT4 support in Marshmallow to provide real and wonderful support for SDCards, and suddenly Ubuntu goes ZFS. There may be many advantages with ZFS. Matching that of the world's largest OS doesn't hurt

    1. Re:EXT4 support in Marshmallow by fahrbot-bot · · Score: 1

      Android finally gets EXT4 support in Marshmallow to provide real and wonderful support for SDCards, and suddenly Ubuntu goes ZFS. There may be many advantages with ZFS. Matching that of the world's largest OS doesn't hurt

      And when Android gets ZFS, we'll be ready for when those 256 zebibyte SD cards come out.

      --
      It must have been something you assimilated. . . .
    2. Re:EXT4 support in Marshmallow by Anonymous Coward · · Score: 0

      ZFS needs a considerable amount of memory to maintain its performance. Not really a feature of most systems using Android.

  10. Re:ZFS is nice... by Anonymous Coward · · Score: 2, Interesting

    So how are they doing this without license conflict? Are they doing a clean-room implementation of ZFS?

  11. Re:ZFS is nice... by Guspaz · · Score: 2, Funny

    My file server has a very low-end nVidia graphics card in it. There was some sort of issue with the stock drivers that shipped with the distro, such that I got no video output at all, and I don't have any GUI installed, just text-mode console. I had to install the nVidia drivers to get it working.

  12. Re:ZFS is nice... by Chris+Mattern · · Score: 1

    What was the Nvidia video driver doing on a server?

    What was any kind of an X server doing on a server?

  13. perhaps i am lost.. by Anonymous Coward · · Score: 0

    But ladies and gents, help me or us to understand the issues with nVidia drivers and how they relate to a file server?
    If the reference is to a GUI or some other graphic artifact, then pls state it..
    If there is a reference to how the driver renders the screen in text mode (console) pls elaborate
    If the nVidia is only related to GI's as they relate to install packages, also pls elaborate.
    Last, Don't post a link.. Take a moment to explain the content of which you are promoting..
    I fail to see/understand how a Graphic Driver has any relation to a File system.
    At the end of the day, they are all presented to the OS in the same way/manner thus no difference in operation..

    I am saying all of this because ZFS is Friggin DOPE!!! dag diggety dope..
    With the snapshotting features, the ability to change the behavior of NFS(samba), support of native containers, the ability to "scrub" a currently mounted FS, Raid-Z, native de-dupe, oh man the list goes on..
    I am sure someone or some entity, is building some sort of appliance to do just that,, what ever it is, as long as its reasonable..

    1. Re:perhaps i am lost.. by ArmoredDragon · · Score: 1

      How's this for an appliance:

      - Have a server with an ssd and 4 disks
      - Install VMware ESXi bare metal
      - Create a filer VM that you install ubuntu with zfs on to, and use VT-d to pass the disks directly to that VM
      - Have that VM share the ZFS volume as an NFS share that is only open to one IP
      - Create another VM that mounts that NFS share and subsequently offers these services to the rest of the network: Samba, Plex, Couchpotato, Sickrage, Rutorrent/rtorrent

      At least, this is what my server looks like anyways. Total of 9TB (after parity) disk space with a symmetric gig connection.

      Who needs netflix or cable anymore? :D

    2. Re: perhaps i am lost.. by Anonymous Coward · · Score: 0

      I have been there, but passing the controller to the VM leaves you in a chicken-and-egg situation: how do you bootstrap the ZFS VM? The only way is to get another share to use as a datastore, and host it there. It creates a boot dependency that is a PITA.

    3. Re: perhaps i am lost.. by ArmoredDragon · · Score: 1

      Oh no there are no datastores on it. I did try that route, using a software iSCSI scheme, and it doesn't work terribly well because it can't recover on its own if you reboot it.

      Instead the VM itself just bootstraps off of the datastore that hosts the ESXi install. It would be no loss if it were to fail, as unlike most people who build these things, I have a complete build doc ready to go so that I can have a fresh instance of it back up and running in 30 minutes. Since the bulk of the data is stored on the ZFS volume, it's not a problem.

  14. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    SSH?

  15. Re:ZFS is nice... by jonnythan · · Score: 1

    Server with automatic upgrades and video drivers?

    No backups, either? You had to reinstall the OS?

  16. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    Ubuntu server likes to use nouveau and hi-rez(tm) console text when you install it. It's fairly annoying.

  17. openSuSE by Anonymous Coward · · Score: 0

    openSuSE presently defaults to BTRFS and XFS, both at the same time.

    Windows only has NTFS. LOL

    1. Re: openSuSE by Anonymous Coward · · Score: 1

      Windows has more than just NTFS. You seem to be ignoring ReFS at the very least. And if you're being generous, you should count exFat.

  18. Re:ZFS is nice... by Anonymous Coward · · Score: 1

    Blah, blah, blah. Been running ZFS on Ubuntu for years on my NAS too. There was one hiccup along the way but otherwise flawless. It's not a userspace thing anymore, it's the real deal and the maintainer is excellent.

    Nothing against BSD, I use it too. But your anecdote is simply some type of vent, and certainly does not equal data.

    (Btw, Nvidia on your NAS...wtf?)

  19. Re:ZFS is nice... by __aaclcg7560 · · Score: 1

    Ubuntu saw the built-in Nvidia video card on the desktop motherboard I was using at the time and installed the Nvidia drivers. Initial setup was fine. The automatic upgrades usually screwed things up.

    FreeNAS has the VGA-only driver for video output and works fine with the built-in AMD video card on the desktop motherboard that I'm currently using..

  20. Re:ZFS is nice... by Guspaz · · Score: 1

    SSH is my primary interface to the server, but sometimes you've got to get on a box locally, like if you mess up something network related, or you mess up a change to grub, or who knows what. It's not common, but I don't have a serial terminal, so having video output when needed is very important.

  21. Re:ZFS is nice... by __aaclcg7560 · · Score: 2

    Being unable to SSH into my Ubuntu file server was usually the first indication that the automatic update went FUBAR. The black screen from the video card didn't help either.

  22. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    The real WTF is that the OS doesn't automatically create a checkpoint/snapshot before installing an OS update. Oh, wait, ZFS isn't a first-class citizen on Linux yet. Never mind.

  23. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    Ubuntu doesn't install binary (closed source) drivers without any user action. You have to manually tell it to install those.

  24. Re:ZFS is nice... by ArylAkamov · · Score: 1

    What advantages does it have over other file systems?

  25. Re:ZFS is nice... by ArmoredDragon · · Score: 1

    What kind of server doesn't have IPMI?

  26. Re:ZFS is nice... by OrangeTide · · Score: 1

    Some of us do use CUDA or OpenCL in our servers. Not that Nouveau is much use for that, but it is the default and you gotta boot Ubuntu up with the defaults at least once before you can configure it properly.

    --
    “Common sense is not so common.” — Voltaire
  27. Re:ZFS is nice... by Guspaz · · Score: 1

    The home kind?

  28. Re:Just wait a while by ArmoredDragon · · Score: 1

    Actually there was a problem with zfs and systemd on ubuntu; namely you couldn't have the ZFS stack automatically start and mount the filesystem at bootup; it just wasn't possible until only very recently.

  29. Re:ZFS is nice... by Anonymous Coward · · Score: 2, Informative

    To name a few: A variety of flavors of built-in RAID / replication. Built in error detection and correction. Snapshots. The ability to send and receive deltas between snapshots from one server to another.

  30. Re:ZFS is nice... by sexconker · · Score: 3, Insightful

    A typical home Linux server - AKA an old PC - won't have IPMI. Actual servers typically will have IPMI, but they cost $BIG_BUCKS$. And even then, IPMI is extremely limited.

    On the Dell servers I bought a few months ago I can't do anything useful with it beyond power on/off or text-only console redirection over serial (over LAN) before the OS loads (I can get into BIOS and the RAID controller ROM, not much else).
    Unless of course I pony up more cash for their iDRAC Standard/Pro/Enterprise/etc. shit. THEN I can get graphical console redirection, some storage space to flash firmware from, and even USB/optical drive redirection.

  31. BTRFS by ajzimm3rman · · Score: 0

    Butter FS isn't good enough for you Mr? I think they're admitting that.

  32. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    Curious how well it performs on mobile systems (laptops, tablets, phones, ...) vs. current filesystems (e.g., EXT)? In particular, does battery usage suffer significantly? Are CPU/RAM requirements higher?

  33. Re:perhaps i am lost-butt now i am found?? by Anonymous Coward · · Score: 0

    OK OK.. Now through the thread it becomes clear..

    the nVidia driver issues are related to the installation not the actual operation of ZFS itself..
    Prior to my posting, it did not seem clear..
    Moving past all of that stuff..
    Pretty impressive system you got there?? Care to share the Cost to feed that setup on daily basis??
    If it's in a Datacenter, whats the cost per month to keep it fed n cool??
    Also, if possible pls include the cost of the connection as well.
      Ballpark would be fine.
    While your setup is impressive no doubt, is it feasible to the average "joe"??

    For example, I have an ancient Hitachi v9500 series ThunderHead, I have revamped the storage backplane with current equipment. Sas 12g LSI controller, fully loaded with 12 drives (which are Toshiba PX02SMF Series 1.6TB SAS 12Gbps.) for a grand total of 12.8TB with 2 Hot global spares..
    This setup while not as intricate as yours, is still plenty fast and may cost the same.
    But the factor for the readers @ large that we dance around, the cost..
    the drives alone are..
    http://www.memory4less.com/m4l_itemdetail.aspx?itemid=1470047648&partno=PX02SMB160&rid=90&origin=pla&gclid=CMSriPL_rsgCFY9cfgodSUUD0g
    $5,739.54 each x10,, too much math for me :(
    then the Thundah Head
    http://www.ebay.com/itm/like/140744278072?ul_noapp=true&chn=ps&lpid=82
    $854.05

    then factor in the cost of care and feeding month after month..

    So,, paying for cable/satellite at such an attractive price point may not be such a bad thing at this critical juncture.

    ok off my soap box, back 2 werk..
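    For anyone who wants the math done, a quick sketch using only the prices quoted above. It assumes 10 drives (the "x10" figure), and it assumes the 12.8 total is the 8 non-spare drives at 1.6TB each; the monthly care-and-feeding cost is not given, so it is left out.

```python
from decimal import Decimal

DRIVE_PRICE = Decimal("5739.54")   # per drive, from the listing quoted above
DRIVE_COUNT = 10                   # the "x10" figure
HOT_SPARES = 2
DRIVE_TB = Decimal("1.6")
HEAD_PRICE = Decimal("854.05")     # the array itself, from the second listing

drives_total = DRIVE_PRICE * DRIVE_COUNT
hardware_total = drives_total + HEAD_PRICE
usable_tb = DRIVE_TB * (DRIVE_COUNT - HOT_SPARES)  # assumes spares hold no data

print(f"drives:   ${drives_total:,.2f}")
print(f"hardware: ${hardware_total:,.2f}")
print(f"usable:   {usable_tb} TB")
```

    Decimal is used so the cents add up exactly instead of picking up floating point noise.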

  34. Re:perhaps i am lost-butt now i am found?? by Anonymous Coward · · Score: 0

    OK OK.. Now through the thread it becomes clear..

    the nVidia driver issues are related to the installation not the actual operation of ZFS itself..
    Prior to my posting, it did not seem clear..
    Moving past all of that stuff..
    Pretty impressive system you got there?? Care to share the Cost to feed that setup on daily basis??
    If it's in a Datacenter, whats the cost per month to keep it fed n cool??
    Also, if possible pls include the cost of the connection as well.

      Ballpark would be fine.
    While your setup is impressive no doubt, is it feasible to the average "joe"??

    For example, I have an acient Hitachi v9500 series ThunderHead, I have revamped the storage backplane with current equipment. Sas 12g LSI controller, fuly loaded with 12 drives (which are Toshiba PX02SMF Series 1.6TB SAS 12Gbps.) for a grand total of 12.8GB with 2 Hot global spares..
    This setup while not as intricate as yours, is still plenty fast and may cost the same.
    But the factor for the readers @ large that we dance around, the cost..
    the drives alone are..
    http://www.memory4less.com/m4l_itemdetail.aspx?itemid=1470047648&partno=PX02SMB160&rid=90&origin=pla&gclid=CMSriPL_rsgCFY9cfgodSUUD0g
    $5,739.54 each x10,, to much math for me :(
    then the Thundah Head
    http://www.ebay.com/itm/like/140744278072?ul_noapp=true&chn=ps&lpid=82
    $854.05

    then factor in the cost of care and feeding month after month..

    So,, paying for cable/satellite at such an attractive price point may naught be such a bad thing at this critical juncture.

    ok off my soap box, back 2 werk..

    ack, sorry I meant to say 10 drives not 12.. My bad.. apologies..

  35. One last thing to put it all into perspective by Anonymous Coward · · Score: 0

    One last thing I forgot to mention..
    I won that hardware config in a poker game, and it also includes care and feeding for 1 year at the same DC that shares Sonet connectivity with HOTMAIL.COM(outlook.com)..

    So, my situation is not very common.. and thus potentially prohibitive to the public..

    thanks

  36. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    Assume he also hosts his largish PG database and wants faster sorting, so he installs the PG-Strom module. GPUs are useful for quite a bit more than just graphics.

  37. Re:ZFS is nice... by Bengie · · Score: 4, Funny

    It makes other FSs look like FAT32.

  38. Re:ZFS is nice... by ZorinLynx · · Score: 2

    This is what I wonder as well.

    What's frustrating is that it's not the ZFS license that's the problem. It's the GPL. Oracle couldn't give a flying fuck if someone put ZFS into the Linux kernel, but the GPL zealots would probably raise a huge stink about it and keep it from happening.

    For the record, I support open source; I just don't like the "viral" nature of the GPL. The ZFS situation is a case where it's doing more harm than good.

  39. Re: ZFS is nice... by bmk67 · · Score: 1

    Supermicro X9 server boards come with IPMI and are not particularly expensive.

  40. oh yeah by Anonymous Coward · · Score: 0

    I would so run Ubuntu if ZFS was decently supported.

  41. Re:ZFS is nice... by rubycodez · · Score: 1

    Ubuntu does install the shitty open source driver for nvidia, nouveau, if it detects an nvidia card

  42. Re:ZFS is nice... by __aaclcg7560 · · Score: 0

    As a rule, I prefer not to have GUI on a Linux box. If I must have GUI, I prefer to use a window manager like xfce.

  43. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    He's like "Bro I was dragging some files into a 7zip dialog as root and I accidently overwrote /dev/nvidia, Ubuntu teh sucks!"

  44. Why? by Anonymous Coward · · Score: 0

    For most people with a single user homebrew server, what's the benefit?

    For real business users who run VMs and take SAN snapshots, what's the benefit?

  45. Re:ZFS is nice... by caseih · · Score: 2

    Sorry but that's simply not true. It was Sun and now Oracle that purposely chose an incompatible license for ZFS. Nothing to do with the GPL here. Your complaints are like the people that buy up land around an airport, build houses, and then complain about the noise.

    Anyway, if you read the fine articles you'd discover that what Ubuntu is going to do is include ZoL modules in their kernel packages. This takes advantage of GPLv2's aggregation clause which lets you ship non GPL binaries with GPL'd binaries because they aren't linked together (think an OS distribution). Once the modules get loaded, that taints the kernel but since it's the end user that initiates this by choosing to use ZFS, there's no copyright violation. ZoL has always operated this way, actually.

    In other words ZoL will not be compiled into the kernel, as to do so by Ubuntu would be a license violation. But Ubuntu plans to ship and support the binary kernel modules. Sounds eminently reasonable to me. Hopefully we'll see this approach adopted by other distributions, although ZoL is not that hard to get running at all.
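For anyone who wants to see what "taints the kernel" looks like in practice, here is a minimal sketch that decodes a few bits of the taint mask exposed in /proc/sys/kernel/tainted. Only a handful of flags are listed, and the helper names (decode_taint, read_taint) are made up for illustration. As far as I know, a CDDL module such as ZFS sets both the P and O bits, but check this on your own system rather than taking it as given.

```python
from pathlib import Path
from typing import List

# Subset of the kernel's taint flags, keyed by bit position.
# This table is deliberately incomplete; it is a sketch, not a full decoder.
TAINT_FLAGS = {
    0: "P: proprietary (non-GPL-compatible) module loaded",
    1: "F: module was force-loaded",
    12: "O: out-of-tree module loaded",
    13: "E: unsigned module loaded",
}


def decode_taint(mask: int) -> List[str]:
    """Return descriptions for the known taint bits set in mask."""
    return [desc for bit, desc in sorted(TAINT_FLAGS.items()) if mask & (1 << bit)]


def read_taint(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the running kernel's taint mask (Linux only)."""
    return int(Path(path).read_text().strip())
```

On a Linux box, `decode_taint(read_taint())` returns an empty list on an untainted kernel; after loading an out-of-tree, non-GPL module you would expect at least the P and O entries to show up.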

  46. Re:ZFS is nice... by myowntrueself · · Score: 1

    A typical home Linux server - AKA an old PC - won't have IPMI. Actual servers typically will have IPMI, but they cost $BIG_BUCKS$. And even then, IPMI is extremely limited.

    On the Dell servers I bought a few months ago I can't do anything useful with it beyond power on/off or text-only console redirection over serial (over LAN) before the OS loads (I can get into BIOS and the RAID controller ROM, not much else).
    Unless of course I pony up more cash for their iDRAC Standard/Pro/Enterprise/etc. shit. THEN I can get graphical console redirection, some storage space to flash firmware from, and even USB/optical drive redirection.

    And IPMI console typically requires java. Within a year or so NO browser will support that!

    --
    In the free world the media isn't government run; the government is media run.
  47. Re:ZFS is nice... by __aaclcg7560 · · Score: 2

    ZFS wasn't designed for mobile systems. FreeNAS requires a minimum of 8GB RAM and 1GB per every 1TB of raw storage for optimal ZFS performance.

  48. Re:ZFS is nice... by __aaclcg7560 · · Score: 1

    When I was using Ubuntu for file server, I was using software RAID and ReiserFS.

    https://en.wikipedia.org/wiki/ReiserFS

  49. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    2-300 dollars is expensive

  50. Re:ZFS is nice... by __aaclcg7560 · · Score: 1

    Regardless of the file system, I don't believe Ubuntu makes for a good file server OS based on my experience. I switched to FreeNAS five years ago because Ubuntu kept hosing the OS disk whenever the Nvidia drivers got updated. Maybe things have changed since then. Maybe not.

  51. Re:ZFS is nice... by fisted · · Score: 1

    But a serial console certainly did help, right?

  52. Re:ZFS is nice... by Anonymous Coward · · Score: 1

    IOW it does way too much, stuff that should be handled at other layers.
    It's the systemd of filesystems.

  53. Re:ZFS is nice... by ZorinLynx · · Score: 1

    >athough ZoL is not that hard to get running at all.

    It's easy to get running but hard to KEEP running, because DKMS has a bad habit of breaking sometimes when updating the kernel or ZFS itself.

    I'd say about a 50/50 chance of having the system come up correctly after a "yum update" for the kernel or ZFS on RHEL 6.

    Being able to just install binary modules would probably help considerably, provided they are built correctly by the distro maintainers.

  54. Re: ZFS is nice... by Anonymous Coward · · Score: 1

    Only ZFS deduplication needs excessive amounts of memory to perform well. No other configuration requires much RAM.

  55. Re:ZFS is nice... by evilviper · · Score: 1

    And IPMI console typically requires java. Within a year or so NO browser will support that!

    No, you can use serial-over-LAN via native utilities like ipmitool. You're talking about the idiot-friendly web interface a few OEMs happen to include. Most ipmi implementations don't even have any web/browser interface to begin with.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  56. Re:ZFS is nice... by evilviper · · Score: 1

    And even then, IPMI is extremely limited.

    I don't see how anyone can claim "IPMI is extremely limited" with a straight face. It does nearly everything you could want in an OoBM interface, except (usually) a GUI. You can do lights-out management, powering systems off and on, setting BIOS/UEFI options like boot device statelessly (not just at boot-up), it can be configured to have a dedicated NIC port, or shared with the OS whether you're bonding NICs or not, gives you a serial console (including BIOS access) over the network. etc., etc.
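To make the list above concrete, here is a rough sketch of how those operations map onto ipmitool invocations. It only builds the argument lists and does not run anything. The host name and user are placeholders, and ipmitool_cmd is a hypothetical helper, not part of any library. The password is expected in the IPMI_PASSWORD environment variable, which is what ipmitool's -E flag reads.

```python
import shlex
from typing import List


def ipmitool_cmd(host: str, user: str, *action: str) -> List[str]:
    """Build an ipmitool command line for a remote BMC.

    -I lanplus selects the IPMI v2.0 LAN interface (needed for serial-over-LAN);
    -E tells ipmitool to take the password from the IPMI_PASSWORD env variable.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *action]


# Placeholder BMC address and account; substitute your own.
HOST, USER = "bmc.example.net", "admin"

sol_console = ipmitool_cmd(HOST, USER, "sol", "activate")             # serial console over LAN
power_status = ipmitool_cmd(HOST, USER, "chassis", "power", "status")  # lights-out power query
boot_pxe = ipmitool_cmd(HOST, USER, "chassis", "bootdev", "pxe")       # set next boot device

for cmd in (sol_console, power_status, boot_pxe):
    print(shlex.join(cmd))
```

Passing one of these lists to subprocess.run would execute it, assuming ipmitool is installed and the BMC is reachable on the network.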

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  57. Re:ZFS is nice... by evilviper · · Score: 2, Insightful

    I don't have a serial terminal, so having video output when needed is very important

    So that's three-strikes... You're 1) using a regular PC as a server (no IPMI), 2) that PC doesn't even have a serial port to be used as an OoBM console, and finally 3) you've got some issue with the video card not even displaying text-mode. With all three strikes against your server, I just can't muster any sympathy for the predicament you put yourself in, relying on an unsuitable cheap piece of crap equipment.

    In fact it's probably FOUR strikes... Presumably your video problem was an issue with KMS or similar, and 4) you didn't bother to figure out how to fix/disable/bypass it, and use plain old text-mode. Instead you went with the quickest (but obviously flawed and easily breakable) option of depending on a proprietary video driver. That's just not thinking things through. Reminds me of folks who have just a switchable PDU as their sole method of OoBM... works right up until they accidentally do a clean shutdown of a remote server.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  58. Re:ZFS is nice... by myowntrueself · · Score: 1

    And IPMI console typically requires java. Within a year or so NO browser will support that!

    No, you can use serial-over-LAN via native utilities like ipmitool. You're talking about the idiot-friendly web interface a few OEMs happen to include. Most ipmi implementations don't even have any web/browser interface to begin with.

    When you are accessing a server on the other side of the world with a dedicated server hosting provider, serial takes a bit more set-up for them and they just don't do it. I've certainly never encountered a single one, and I deal with hundreds of such hosting providers.

    Many sysadmins in the real world are very sadly stuck with the web interface and have no other option.

    --
    In the free world the media isn't government run; the government is media run.
  59. They add ZFS, but with fscking systemd, i pass by Anonymous Coward · · Score: 0

    They can add zfs if they want to, but systemd means I pass. Its a clusterfuck. Get rid of systemd and I will try it. Till then, its crapware.

  60. Re:ZFS is nice...zZzZzZzZz by pigsycyberbully · · Score: 0

    A server performs multiple functions; it's a server. Windows has an interface and is very easy to use on a server. A server can have multiple operating systems running on it (serving) different computers. Fast graphics on a server is commonplace. I could be feeding multiple computers data from the server while using the server for other purposes, even running a full-blown Debian system or a Windows system and so on. A server is not like the late 80s or 90s, which fed simple HTML pages or a text-based file.. a home file sharer could do that from an old 386 computer! Even a cube server has a 2 GB graphics card running 24 hours a day and outperforms any laptop system, and is 100% more reliable, with ROM-upgradable firewall rules made on-the-fly. You can bet your life on it that some of the postings on here are posted from a server directly running in some kind of interface, Apple or Windows or Linux and so on. I just posted this message directly from the server running multiple operating systems. The 90s is over.. if I open the side door of my server the card says "super alloy power cooling 2GB ASUS graphic card." that's enough I'm bored with this writing now.

  61. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    I get it. You don't like things because they are too good. Okay. That would explain the systemd hate.

  62. Re:ZFS is nice... by SirMasterboy · · Score: 1

    FreeNAS only needs 8GB of RAM because of how the OS is tuned, not because of ZFS. FreeNAS runs entirely in RAM.

    The FreeBSD manual and the Solaris manual both state 1GB of system RAM to use ZFS. I'm running a 40TB ZFS pool with 16GB of RAM and performance is excellent.

  63. Re:ZFS is nice... by SirMasterboy · · Score: 1

    Because there isn't a conflict if done right.

    People that claim there is a conflict generally don't understand how the licenses actually work and what they allow and don't allow.

    There is no legal issue preventing the sources from being combined because neither the CDDL nor the GPL place restrictions on aggregations of source code, which is what putting ZFS into the same tree as Linux would be. Binary modules built from such a tree could be distributed with the kernel's GPL modules under what the GPL considers to be an aggregate. These concepts have passed legal review by many parties.

  64. Re:ZFS is nice... by SirMasterboy · · Score: 1

    DKMS is not the only way to install ZoL though. It can be built and installed perfectly fine without DKMS and I do this for some of my machines.

    That being said I have been using DKMS on a Debian box for 3 years now and have gone through many, many ZoL upgrades and many kernel upgrades and have never had any issue with the upgrading not going smoothly. Sounds more like a problem with the ZoL maintainer for RHEL.

  65. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    What? No!

  66. Re:ZFS is nice... by Bruce+Perens · · Score: 1

    Aggregate means two programs that are not combined and just live on the same filesystem. In the case of a filesystem driver, it's read into the kernel space and touches unexported APIs of the kernel and various kernel internals.

    It is thus a derivative work.

  67. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    No, you don't make for a good file server owner. Seriously. An Nvidia card, in a server? What on Earth are you smoking?

  68. CDDL and GPL don't mix by Bruce+Perens · · Score: 3, Informative

    Regardless of what Ubuntu has convinced themselves of, in this context the ZFS filesystem driver would be an unlicensed derivative work. If they don't want it to be so, it needs to be in user-mode instead of loaded into the kernel address space and using unexported APIs of the kernel.

    A lot of people try to deceive themselves (and you) that they can do silly things, like putting an API between software under two licenses, and that such an API becomes a "computer condom" that protects you from the GPL. This rationale was never true and was overturned by the court in the appeal of Oracle v. Google.

    1. Re:CDDL and GPL don't mix by Anonymous Coward · · Score: 0

      Bruce: some of us think that appeal was bogus and ought to be overruled by the Supremes. Given the nature of the de minimis range check violation, there really were no merits to the case.

      The ZFS guys clearly have their own codebase, which they have designed to be portable across multiple operating systems, which is developed under its own license. The limited amount of 'glue' needed to support porting should certainly fall under fair-use.

      Now as for aggregating it, that is where the GPL departs from license and tries to break in to more EULA land. I don't know if copyright has ever really been upheld to that level before. I suspect not or we would see a situation like sci-fi authors start to demand that second-hand book shops not bundle copies of their work with Enders Game given Orson Scott Card's political leanings.

      Copyright law does not seem to confer that sort of power to set terms once the author has been compensated for a copy, it only prevents duplication under certain terms and despite how Stallman may feel about it, I don't think we really want that to happen. I think even the SFLC has taken the permissive position with regards to aggregation but I could be wrong.

      - sedwards

    2. Re:CDDL and GPL don't mix by Anonymous Coward · · Score: 1

      Or they could use dkms to build the driver on first boot (after each kernel upgrade). GPL(v2) will let you do pretty much anything if you're not redistributing the binaries.

      That could require either a non-zfs /boot partition, or a userspace driver that can be used to read the code on first boot, or adding the zfs driver code and build deps to a special initrd for first boot. None of these are ideal, but they're technically possible.

    3. Re:CDDL and GPL don't mix by Anonymous Coward · · Score: 0

      Regardless of what Ubuntu has convinced themselves of, in this context the ZFS filesystem driver would be an unlicensed derivative work.

      From the Debian consultations it appears (and a real shame Debian hasn't been open about it, BTW) that the SFLC OK'd it, but the FSF didn't.

      Now ... assume that the FSF is somehow right ... who exactly is going to sue Ubuntu, Debian, and any other distribution shipping ZFS? Linus and Oracle, that hold the copyrights, clearly have no interest .. and if either of them does go legal, people can always move to BSD.

    4. Re:CDDL and GPL don't mix by brambus · · Score: 1

      If anybody could sue Canonical for shipping ZFS in Ubuntu, it couldn't be Oracle, because the CDDL doesn't prohibit combining CDDL'd code with other licenses, provided the CDDL'd bits remain CDDL and that you distribute them to your users. It's the GPL that prohibits these combos - hence the "infectious license" moniker. So it'd have to be Linux copyright holders suing Canonical (oh the irony) for presumably combining the GPL'd Linux kernel with the CDDL'd ZFS code. Seeing as nothing like this has yet happened even with Canonical distributing the much more egregious Nvidia binary blob, I think the entire notion of this being a real legal hurdle is nothing but GPL-purist FUD.

  69. Re:perhaps i am lost-butt now i am found?? by ArmoredDragon · · Score: 1

    Pretty impressive system you got there?? Care to share the Cost to feed that setup on daily basis??

    Hmm...25 cents I think? It's a Lenovo TS440 that I bought on Amazon.

  70. Re:ZFS is nice... by ArmoredDragon · · Score: 1

    I bought a Lenovo TS440 on Amazon for $400. Included a SAS controller, E3-1245 CPU, hot-swapable PSU, motherboard, and hot swap HDD bays in the case...They show up now and then, check slickdeals.

  71. Re:ZFS is nice... by ArmoredDragon · · Score: 1

    And IPMI console typically requires java. Within a year or so NO browser will support that!

    No, actually the typical IPMI console is AMT these days, and you can connect to it with an ordinary VNC client, which isn't going away any time soon.

  72. Re:ZFS is nice... by ArmoredDragon · · Score: 1

    With modern IPMI you can do more than that too, such as booting to an ISO image from all the way across the internet. You can even do full GUI and everything with a simple VNC client. Just so long as the machine powers on and has an internet connection configured, it'll work.

    Intel's AMT boards do all of this anyways, and they're quite common these days.

  73. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    All that and a quota system that works, and is not a pain to setup.

  74. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    It's for this reason I put up with the open source nouveau driver - at least you never get booted into a text console and have to scrabble to make it work.

  75. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    purposely chose an incompatible license for ZFS

    Verily did Ellison, who is Morgoth, plunder the ZFSilmarils and carry them off to AngBSD, denying the light of a filesystem that doesn't suck liquid shit to the poor denizens of Linuxnor.

    Or, in reality, fuck that noise. The viral bullshit of the GPL is entirely on Linux. It isn't Sun, Oracle, or any bloody one else's responsibility to choose a license compatible with Stallman's bearded ramblings.

  76. Re:ZFS is nice... by thegarbz · · Score: 1

    The majority of people build themselves out of cheap consumer parts.

  77. Re:ZFS is nice... by thegarbz · · Score: 1

    Congrats. That's still more money than my server was, even if you take into account the VGA cable I had to buy because of lack of IPMI and every other monitor in the house having a HDMI cable.

    In 3 years I've had to access the system from a local console precisely once, when a typo in a script brought down the primary network interface. IPMI has almost zero use case for me and attaching a monitor and keyboard to the server is something that takes the best part of 30 seconds, so it doesn't even warrant a consideration when I buy equipment (actually I'm in the process of upgrading the server now because I need more SATA slots).

  78. Re:ZFS is nice... by dk20 · · Score: 2

    As someone who has 50TB on a system with 16GB ram i agree with you.
    I wish people would stop spreading this "1gb ram/tb" FUD.

    It is the recommendation for DEDUP, not for standard ZFS.
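To show where the dedup memory figures come from, here is a back-of-the-envelope sketch. It assumes roughly 320 bytes of RAM per dedup-table entry, which is a commonly quoted rule of thumb and not an exact figure. The function name and constants are mine, and the real cost depends on your ZFS version and how much of the table you need cached.

```python
# Assumption: ~320 bytes of RAM per dedup table (DDT) entry, one entry per unique block.
BYTES_PER_DDT_ENTRY = 320

KiB = 1024
GiB = 1024 ** 3
TiB = 1024 ** 4


def ddt_ram_bytes(unique_data_bytes: int, avg_block_bytes: int) -> int:
    """Estimate RAM needed to hold the whole dedup table in memory."""
    blocks = -(-unique_data_bytes // avg_block_bytes)  # ceiling division
    return blocks * BYTES_PER_DDT_ENTRY


for block_size in (128 * KiB, 64 * KiB, 8 * KiB):
    need = ddt_ram_bytes(1 * TiB, block_size)
    print(f"1 TiB of unique data at {block_size // KiB} KiB blocks: "
          f"{need / GiB:.1f} GiB for the dedup table")
```

Under that assumption the estimate scales inversely with block size, which is why small-block workloads (VM images, databases) are where dedup gets painful. None of this applies to a pool with dedup turned off.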

  79. Re:ZFS is nice... by geggo98 · · Score: 1

    [...] it's read into the kernel space and touches unexported APIs of the kernel and various kernel internals.

    Oh, that's easy to fix: Ubuntu already has its own kernel tree (https://wiki.ubuntu.com/Kernel/Dev/KernelGitGuide). So they can just expose the APIs in question and build APIs for the needed kernel internals. It's very easy for them to modify their fork of the kernel so it can be combined with ZFS. Of course they won't be able to push the changes upstream to the vanilla kernel; I think Linus has already stated that he would not accept ZFS related patches. But this should not be a showstopper.

  80. Re:ZFS is nice... by Blaskowicz · · Score: 1

    There are likely ways around that but before I elaborate on it, let's say I've had weird regressions with my nvidia card support recently. It's a woefully outdated GeForce 7, but I still like to use it: it's as powerful as when I got it (it is the same tech as the Playstation 3 GPU, with half the computing units but same fillrate and bandwidth, and a hard to beat 32 watt TDP)

    With the "nouveau" driver you've had to boot with the "nomodeset" kernel option (or alternatively one that disables nouveau 3D acceleration), this has been a big enough issue that it has been on the short Linux Mint release notes every six months.
    1+ year ago I could still run nouveau with 3D acceleration (was on Mint 16 past its due date) and now it's fucked. But you might be able to run nouveau in such a "degraded" state - still fine for e.g. playing an unaccelerated 720p H264 video in full screen. From that state - or if you could not get there - try to run Xorg with the VESA driver (is it called vesa or vesafb) : every card can run in fucking VESA, or should be able to.
    Or find an old PCI card (ATI Rage Pro PCI works fine for instance, any shit PCI 1MB card can give you text mode or 800x600 16bit)

    Fast forward to the state of the art Ubuntu 14.04 compatible : both 304 driver and nouveau driver sucked (304 driver lacks resolutions/refresh rates I would want to use), but it took a third option I'd never thought of trying : the 173 driver (i.e. even more "legacy" than the 304 one). Had to run the older 3.13 kernel, then beat the PC into submission so it would boot the 3.13 kernel by default. Then a "sudo nvidia-xconfig" later, I'm set. But Steam refuses to run because it has a hard check for 304 or later driver, which I had long forgotten about. So I can't use it to run a 15 years old game (Counter Strike 1.6)

    So, I ought to be modded down because I run the wrong hardware too! BTW nevermind the "superior" linux driver support : I'm sure the graphics card can work properly with Windows 98, ME, 2000, XP, Vista and 7.

  81. Re:ZFS is nice... by TheRaven64 · · Score: 1

    2GB/TB is recommended if you're doing deduplication. That said, performance degrades quite smoothly. I've got a machine with 3x4TB drives in RAID-Z with only 8GB of RAM. Disk performance isn't great, but I mostly access it via WiFi and it's absolutely fine for that. Eventually I'll get a new motherboard for it that can handle more than 8GB of RAM...

    --
    I am TheRaven on Soylent News
  82. Re:ZFS is nice... by Blaskowicz · · Score: 1

    a "serial terminal" is not that hard to come by, you only need a null modem cable or USB-to-serial and some terminal software (came with Windows 3.1 and 95)

  83. Re:ZFS is nice...zZzZzZzZz by Blaskowicz · · Score: 1

    Welcome to Windows 95 : fast graphics, and it is a file server where you can just right-click a directory and share it on the network. At the same time you can use the desktop and even play 3D games or DOS games. If you have enough RAM just run the version with bug fixes, called Windows 98SE.

  84. Re:ZFS is nice... by Blaskowicz · · Score: 1

    On the other hand if you install the nvidia driver then you get 80x25 text.
    Nouveau likes to set a 2048x1536 graphical console (!) or on a lower end CRT monitor, 1600x1200.
    Used to have one of the drivers display a blank console : log in works etc. but the monitor was entirely black (which on a proper non-LCD looks as if it is turned off)
    By fucking with boot options or config files you can eventually get a 80x25 console. If anyone ever got a 80x50 console let me know.

  85. Re:ZFS is nice... by TheRaven64 · · Score: 3, Insightful

    No. RAID isn't better handled at other layers. If you don't know about the filesystem semantics then you need NVRAM or journalling at the block level to avoid the RAID-5 write hole. RAID-Z doesn't have this problem. If you're recovering a failed block-level RAID, then you need to copy all of the data, including unused space. With ZFS RAID (all levels), you only copy the used data. There are numerous other advantages to rearranging the layers, including being a lot more flexible in the provisioning.
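For readers who haven't met the RAID-5 write hole, here is a toy sketch of it using XOR parity over a two-data-disk stripe. It is a simplified model and not how any real RAID implementation lays out data. The block contents and variable names are invented for illustration.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR, the parity function used by RAID-5."""
    return bytes(x ^ y for x, y in zip(a, b))


# A consistent stripe: two data blocks plus their parity.
d0_old = b"AAAA"
d1 = b"BBBB"
parity = xor(d0_old, d1)

# Sanity check: with consistent parity, a lost d1 can be rebuilt exactly.
assert xor(d0_old, parity) == d1

# In-place update of d0, then power loss BEFORE the matching parity write.
d0_new = b"CCCC"
stripe_after_crash = {"d0": d0_new, "d1": d1, "p": parity}  # parity is now stale

# Later the disk holding d1 dies and we rebuild it from d0 and parity.
rebuilt_d1 = xor(stripe_after_crash["d0"], stripe_after_crash["p"])

print("original d1:", d1)
print("rebuilt  d1:", rebuilt_d1)
print("silently wrong:", rebuilt_d1 != d1)
```

The point is that a block which was never being written (d1) ends up corrupted. A copy-on-write design avoids this by writing the new stripe to fresh locations and only switching pointers once the whole stripe is on disk, so the old consistent stripe remains intact after a crash.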

    It's also a mistake to think of ZFS as a layer. ZFS has three layers: the lowest handles physical disks and presents a linear address space, the middle presents a transactional object store, and the top presents something that looks like a filesystem (or a block device, which is useful for things like VM disk images).

    --
    I am TheRaven on Soylent News
  86. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    >athough ZoL is not that hard to get running at all.

    It's easy to get running but hard to KEEP running, because DKMS has a bad habit of breaking sometimes when updating the kernel or ZFS itself.

    I'd say about a 50/50 chance of having the system come up correctly after a "yum update" for the kernel or ZFS on RHEL 6.

    Being able to just install binary modules would probably help considerably, provided they are built correctly by the distro maintainers.

    Please file github issues for such problems so that such upgrade pains can be fixed:

    https://github.com/zfsonlinux/zfs/issues/new

    That said, there was recently a DKMS fix for a regression caused by an upstream DKMS change in ZoL:

    https://github.com/zfsonlinux/zfs/commit/3ef005c674e3207e8c6fba5d65a76468f97084ae

    It should be included with 0.6.5.3.

  87. Re:ZFS is nice... by GuB-42 · · Score: 1

    On servers, and even sometimes on desktops, I like to use the minimal version of Ubuntu (a ~40MB iso). It installs the bare minimum which is basically a text (not hi-res) console and apt so that you can do the rest yourself.
    You can do it with Debian too, and used like this, Ubuntu and Debian are very similar.

  88. NIH by Anonymous Coward · · Score: 0

    Licensing has never been the issue for including ZFS, it's the NIH-syndrome. Now just btrfs has been so bad for so long, even Linux guys are getting anxious.

  89. Re:ZFS is nice... by GuB-42 · · Score: 1

    No, it is a good thing these things are done at the filesystem level.
    For example, RAID-Z (ZFS built-in RAID) eliminates the "write hole" problem. And its error detection combined with replication allows it to recover from corrupted rather than just unreadable data. These are features you can't have if replication, error detection and transactions are in separate layers.
    As for snapshots, they exploit a specificity of ZFS which is copy-on-write to make them extremely efficient. You can't do this without access to the filesystem internals.
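The "recover from corrupted rather than just unreadable data" point can be sketched in a few lines. This is a toy model of a checksummed mirror read, not ZFS code. SHA-256 is a stand-in for whatever checksum the pool uses, and self_healing_read is a made-up name. The key assumption is that the expected checksum is stored separately from the data it covers.

```python
import hashlib
from typing import List


def checksum(data: bytes) -> bytes:
    """Stand-in checksum; real pools may use a different algorithm."""
    return hashlib.sha256(data).digest()


def self_healing_read(copies: List[bytearray], expected: bytes) -> bytes:
    """Return the first copy matching the expected checksum and repair the rest."""
    good = None
    for copy in copies:
        if checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        raise IOError("all copies fail checksum verification")
    for copy in copies:
        if bytes(copy) != good:
            copy[:] = good  # overwrite the bad replica with known-good data
    return good


# Two mirrored copies of a block; the first suffers silent bit rot.
block = b"hello zfs"
expected = checksum(block)
mirror = [bytearray(block), bytearray(block)]
mirror[0][4] ^= 0x20  # flip one bit, no I/O error is reported

data = self_healing_read(mirror, expected)
print(data, [bytes(c) for c in mirror])
```

A plain block-level mirror has no way to tell which of the two copies is the right one here, because both reads succeed. The checksum held above the replication layer is what breaks the tie.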

  90. Re:ZFS is nice... by Eunuchswear · · Score: 1

    Curious how well it performs on mobile systems (laptops, tablets, phones, ...) vs. current filesystems (e.g., EXT)? In particular, does battery usage suffer significantly? Are CPU/RAM requirements higher?

    Hah! My phone uses btrfs, how insane is that!

    $ mount
    /dev/mmcblk0p28 on / type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    devtmpfs on /dev type devtmpfs (rw,relatime,size=412964k,nr_inodes=103241,mode=755)
    none on /proc type proc (rw,relatime)
    none on /sys type sysfs (rw,relatime)
    tmpfs on /dev/shm type tmpfs (rw,relatime)
    devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620)
    tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
    tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,mode=755)
    cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
    cgroup on /sys/fs/cgroup/debug type cgroup (rw,nosuid,nodev,noexec,relatime,debug)
    cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
    cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
    cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
    cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
    cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
    cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
    cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
    debugfs on /sys/kernel/debug type debugfs (rw,relatime)
    tmpfs on /tmp type tmpfs (rw)
    fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
    mtp on /dev/mtp type functionfs (rw,relatime)
    /dev/mmcblk0p18 on /firmware type vfat (ro,relatime,uid=1000,gid=1000,fmask=0337,dmask=0227,codepage=cp437,iocharset=iso8859-1,shortname=lower,errors=remount-ro)
    /dev/mmcblk0p19 on /drm type ext4 (rw,nosuid,nodev,relatime,data=ordered)
    /dev/mmcblk0p25 on /persist type ext4 (ro,nosuid,nodev,relatime,data=ordered)
    /dev/mmcblk0p28 on /home type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    /dev/mmcblk0p9 on /var/systemlog type ext4 (rw,nosuid,nodev,relatime,data=ordered)
    statefs on /run/state type fuse.statefs (rw,nosuid,nodev,relatime,user_id=0,group_id=998,default_permissions,allow_other)
    tmpfs on /mnt/asec type tmpfs (rw,relatime,mode=755,gid=1000)
    tmpfs on /mnt/obb type tmpfs (rw,relatime,mode=755,gid=1000)
    /dev/mmcblk0p28 on /opt/alien/data type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    /dev/mmcblk0p28 on /opt/alien/bin type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    /dev/mmcblk0p28 on /opt/alien/sbin type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    /dev/mmcblk0p28 on /opt/alien/lib type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    /dev/mmcblk0p28 on /opt/alien/usr type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    /dev/mmcblk0p28 on /opt/alien/var type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    /dev/mmcblk0p28 on /opt/alien/etc type btrfs (rw,noatime,thread_pool=4,ssd,noacl,space_cache,autodefrag)
    tmpfs

    --
    Watch this Heartland Institute video
  91. Re:ZFS is nice... by Eunuchswear · · Score: 1

    No, but Sun are on record as having chosen CDDL primarily because it was GPL-incompatible.

    --
    Watch this Heartland Institute video
  92. Re: ZFS is nice... by MikeFM · · Score: 1

    If it must have a GUI then use an iPad. No reason to run the UI on the same device as the work getting done. Usually use Linux on cloud servers and embedded devices anymore.. can't see any reason I'd want an actual server any more and only keep a desktop for running third-party software that requires Windows or MacOS.

    --
    At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
  93. no by Anonymous Coward · · Score: 0

    If Ubuntu makes ZFS the default, it will be the default for Ubuntu.

    If we want to make it the default for Linux, we'll have to ask Red Hat. Sure Poettering won't object to making systemd depend on ZFS.

  94. Re:ZFS is nice... by Coren22 · · Score: 1

    I'll bet that killed the file server...

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  95. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    Yawn...

  96. Re: ZFS is nice... by __aaclcg7560 · · Score: 2

    When I tested wireless clients at Cisco, I installed the GUI with Fedora or Mint because I needed to run YouTube video in a loop. The division chief wanted to fire me for using 75% of the wireless bandwidth for YouTube. He didn't realize that I had 30 laptops running YouTube video and supporting 300 users without a hiccup in network performance. All the YouTube videos were from the Cisco channel, which included several interviews with him. Nothing like seeing your face on 30 screens.

  97. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    In the end we turned off the IPMI because we didn't want the forced reboots to install security updates for the IPMI. There are already enough for the OS and applications.

  98. Re:ZFS is nice... by __aaclcg7560 · · Score: 1

    I wasn't set up for a serial console back then. When I recently rebuilt my file server, I added a serial port for console access.

  99. Re:ZFS is nice... by __aaclcg7560 · · Score: 1

    It was the automatic update of the Nvidia driver that hosed the OS disk.

  100. Re:ZFS is nice... by rl117 · · Score: 1

    You might think the layered approach would make sense, but actually the separation of concerns prevents the system from doing things intelligently. For example, consider what happens when a disk fails and is replaced. In the layered case, the md layer or the hardware RAID controller will resync the data. This will be a simple linear reconstruction of the data. In the ZFS case, it only copies the missing data; unused blocks are not uselessly synced. This is called "resilvering". This isn't just faster, it also increases the chances of successful resyncing since the probability of failure during reconstruction is reduced.

    As others have mentioned in reply, there are a number of other useful practical advantages as well.
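
    A minimal sketch of the difference described above, using a toy model in which a disk is just a list of block slots and `None` marks an unused slot. The function names and the sample data are made up for illustration; this is not how md or ZFS is actually implemented.

```python
# Toy model: each disk is a list of block slots; None means the slot is unused.

def linear_rebuild(surviving):
    """Layered RAID view: copy every slot, because the RAID layer
    cannot tell which slots the filesystem is actually using."""
    new_disk = list(surviving)
    return new_disk, len(surviving)


def resilver(surviving):
    """Filesystem-aware view: copy only the slots that hold live data."""
    new_disk = [None] * len(surviving)
    copied = 0
    for i, block in enumerate(surviving):
        if block is not None:
            new_disk[i] = block
            copied += 1
    return new_disk, copied


surviving = ["a", None, "b", None, None, "c", None, None]
linear_disk, linear_copied = linear_rebuild(surviving)
rebuilt, resilver_copied = resilver(surviving)

print(linear_copied, resilver_copied)  # 8 3
```

    Both approaches end with the same contents on the new disk; the filesystem-aware one simply touches fewer blocks when the pool is mostly empty.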

  101. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    So you can do more with it.

  102. Re:ZFS is nice... by sexconker · · Score: 1

    Yet you can't do what the OP said he needed the video out for - fix OS configuration issues.

    UNLESS your mobo manufacturer developed/bought a chip and software to do that for your specific OS, AND included it for your mobo/license, AND they actively maintain it to make sure it actually works. OR that chip is embedded into the CPU (such as Intel's backdoor suite with an ever-changing name), AND your mobo/BIOS/UEFI exposes it, AND it works for your OS, AND you're properly licensed for it (typically built into the cost of the mobo).

    The closest you'll get in the real world is a chipset that pipes keyboard, mouse, and video (graphical console) over LAN. Dell charges a buttload to license this, it only works with the built-in Intel GPU as far as I know, and you're stuck with a shitty Java web portal. You can't really call this IPMI.

  103. Re:ZFS is nice... by Coren22 · · Score: 1

    https://en.wikipedia.org/wiki/...

    No, literally killed.

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  104. Re:ZFS is nice... by ilsaloving · · Score: 2

    Wow, with all the hostile responses your post has been getting, I almost started thinking that I had joined the LKML by mistake.

  105. Re:perhaps i am lost-butt now i am found?? by Anonymous Coward · · Score: 0

    Hmm I see, looking at the specs, it's a workstation, or a server of the "past" come to the desktop..
    I guess that really isn't a server per se.. but from a home perspective, seems somewhat overkill.
    also, 25c seems a little short with regard to care and feeding..
    good luck with that,

    Thanks,

  106. Re:ZFS is nice... by Aaden42 · · Score: 1

    You've over-simplified what is & isn't a derivative work. Linus himself has written about the distinction specifically as it applied to the Andrew Filesystem:

    http://yarchive.net/comp/linux...

    According to Linus, a driver that was originally written independently of Linux for another system and simply ported *to* Linux is not a derivative work. That's exactly the case for what ZoL is.

    You've also misrepresented what the ZFS modules do in terms of their contact with kernel internals:

    touches unexported APIs of the kernel

    Neither ZoL nor the SPL layer that it depends on touches any non-public or GPL-only symbols of the kernel. If they did, you'd be correct in there being an issue. They don't, and there isn't.

  107. Re:ZFS is nice... by __aaclcg7560 · · Score: 0

    That murder is completely irrelevant to using ReiserFS on an Ubuntu file server.

  108. Re:ZFS is nice... by Coren22 · · Score: 1

    Do you often miss jokes? I even tried to explain the joke for you, but you are persisting in taking the joke as literal.

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  109. Re:ZFS is nice... by david_thornley · · Score: 1

    Linus isn't the arbiter of copyright law, so I'd rather consult a real lawyer who specializes in the field.

    --
    "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  110. Re:ZFS is nice... by __aaclcg7560 · · Score: 0

    Probably because your "joke" is off the mark?

  111. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    Such a record does not exist, as people at Sun thought the CDDL was a good idea for different reasons. Some thought that being GPL incompatible was a good reason, but as far as I can tell, those were in the minority. Others were interested in compatibility with proprietary software and a clause to provide an explicit patent grant. The GPLv2 does not do these things.

  112. Re:ZFS is nice... by Bruce+Perens · · Score: 1

    I think you need to look at this in the context of the appeal of Oracle v. Google. We had a concept of an API being a boundary of copyright based on 17 U.S.C. 102(b) and elucidated by Judge Walker's finding in CAI v. Altai. That stood for a long time. But Oracle v. Google essentially overturned it and we're still waiting to see what the lower court does in response.

  113. Re:ZFS is nice... by Bruce+Perens · · Score: 1

    Linus knows absolutely nothing about law and every time he opens his mouth about it he only makes the confusion around the issue worse.

  114. Re:ZFS is nice... by Bruce+Perens · · Score: 1

    Uh, that doesn't work. The problem is that doing exactly what you've written down is contriving to avoid your copyright responsibility by deliberately creating a structure in someone else's work which you believe would be a copyright insulator. If you went ahead and did this (I'm not saying that you personally would be the one at Ubuntu to do so), I'd love to be there when you are deposed. Part of my business is to feed attorneys questions when they cross-examine you. I have in a similar situation made a programmer look really bad, and the parties settled as soon as they saw the deposition and my expert report. See also my comment regarding how Oracle v. Google has changed this issue. You can't count on an API to be a copyright insulator in any context any longer.

  115. Re:ZFS is nice... by greenfruitsalad · · Score: 1

    1. most people don't like wearing earplugs inside their house (your PSU sounds like a horny elephant)
    2. is electricity free where you live?

    i have several hp microservers - the oldest, N36L, consumes 0.06A when idling. my newest is N54L and that raises the consumption to a whopping 0.09A when idling. i can't hear the fan from more than a metre away unless it's hot in the room and it spins up.

  116. Re:ZFS is nice... by greenfruitsalad · · Score: 1

    i once accidentally enabled automatic security updates on production servers (cloned installation). the next day, we had fun investigating why mysql servers (daemons) restarted in the middle of a busy day. i ended up with a first&last warning in written form from my management.

  117. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    Linus isn't the arbiter of copyright law, so I'd rather consult a real lawyer who specializes in the field.

    Are you suggesting that Canonical failed to do that?

  118. Re: ZFS is nice... by Gr8Apes · · Score: 1

    can't see any reason I'd want an actual server any more

    Costs?

    --
    The cesspool just got a check and balance.
  119. Re:ZFS is nice... by Gr8Apes · · Score: 1

    I used to have a server farm, it was loud in that room. I'm even considering downsizing the current all purpose desktop to lower the heat and noise footprint to low and none.

    --
    The cesspool just got a check and balance.
  120. Re:ZFS is nice... by Frnknstn · · Score: 1

    This is completely correct. By having knowledge of all layers, ZFS is able to easily offer features that other systems don't.

    One of my favourites is what happens when you set a filesystem to keep two copies of a file. Instead of placing the second copy on a random device determined by the RAID layer, it will attempt to ensure that all blocks from one device are placed on the adjacent device.

    The advantage of that is non-obvious at first glance, but what it means is this: When two devices in the JBOD fail, instead of corrupting all the files when *any* two devices fail, it means you will only have corruption when two *adjacent* drives fail.

    In a 5-device JBOD, that means the chance of corruption when the second device fails drops from ~100% to 25%.
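
    A rough way to check the reasoning is to enumerate every two-device failure under different placement models. This is a toy sketch: the three placement rules below are assumptions made for illustration, not how ZFS actually chooses devices, and the exact percentage depends on how "adjacent" is defined. In this model five devices give 40-50% rather than 25%, but the direction of the effect is the same.

```python
from itertools import combinations


def loss_fraction(n, loses_data):
    """Fraction of all two-device failures that destroy both copies of some file."""
    pairs = list(combinations(range(n), 2))
    return sum(1 for a, b in pairs if loses_data(a, b, n)) / len(pairs)


def scattered(a, b, n):
    # Assumption: copies land on arbitrary device pairs, so with many files
    # every pair of devices holds both copies of at least one file.
    return True


def ring_adjacent(a, b, n):
    # Assumption: the second copy goes to the next device, wrapping around.
    return (a - b) % n in (1, n - 1)


def row_adjacent(a, b, n):
    # Assumption: the second copy goes to a neighbour, with no wrap-around.
    return abs(a - b) == 1


for rule in (scattered, ring_adjacent, row_adjacent):
    print(rule.__name__, loss_fraction(5, rule))
# scattered 1.0
# ring_adjacent 0.5
# row_adjacent 0.4
```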

    --
    If it's in you sig, it's in your post.
  121. Re:ZFS is nice... by Frnknstn · · Score: 2

    One GREAT advantage it has over your bog-standard filesystems like NTFS and ext4 is its copy-on-write architecture, and the essentially free and near-instant snapshot system it provides.

    When you take a snapshot of a filesystem, it simply makes a copy of the superblock. All of the space on the devices remains marked as in-use, and both snapshots share exactly the same physical storage.

    When you make a change to one of the snapshots, it simply writes the changed blocks to a different location on the underlying devices and leaves the still-in-use original block alone.
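
    The idea can be sketched in a few lines. This is a toy model only: the `CowVolume` class and its methods are invented for illustration, and copying a small dict stands in for copying the root pointer of a real on-disk tree.

```python
class CowVolume:
    """Toy copy-on-write volume: block tables point at immutable storage."""

    def __init__(self):
        self.storage = {}     # physical address -> data, never overwritten
        self.live = {}        # logical block number -> physical address
        self.snapshots = {}   # snapshot name -> frozen block table
        self._next_addr = 0

    def write(self, block_no, data):
        # Always write to a fresh location; the old block stays untouched.
        addr = self._next_addr
        self._next_addr += 1
        self.storage[addr] = data
        self.live[block_no] = addr

    def snapshot(self, name):
        # Taking a snapshot copies only the table of pointers, not the data.
        self.snapshots[name] = dict(self.live)

    def read(self, block_no, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.storage[table[block_no]]


vol = CowVolume()
vol.write(0, "hello")
vol.write(1, "world")
vol.snapshot("monday")
vol.write(1, "there")   # lands in a new location; the snapshot still sees "world"

print(vol.read(1), vol.read(1, snapshot="monday"))  # there world
```

    Block 0 was never changed, so the live table and the snapshot still point at the same physical address for it; only the modified block costs extra space.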

    --
    If it's in you sig, it's in your post.
  122. Re:ZFS is nice... by evilviper · · Score: 1

    Yet you can't do what the OP said he needed the video out for - fix OS configuration issues.

    Yes you can. Of course you can. I've used IPMI extensively, and have absolutely no idea what you are ranting on about.

    UNLESS your mobo manufacturer developed/bought a chip and software to do that for your specific OS,

    There's nothing OS-specific about IPMI. There's no "chip" for each OS.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  123. Re:ZFS is nice... by paulatz · · Score: 1

    You mean the Vanity Server

    --
    this post contain no useful information, no need to mod it down
  124. Re:ZFS is nice... by sexconker · · Score: 1

    IPMI interfaces with hardware and knows nothing of the OS. If you're using IPMI to mess with your OS, then your vendor has implemented hooks into your specific OS in their specific BMC/iLO/iDRAC/whatever controller, which you can access via their tools, a web portal, etc. Their iLO/iDRAC/whatever also implements IPMI, which you can access via free and open IPMI tools as well as their proprietary tools.

    IPMI 2.0 includes serial over LAN, but that's text only console redirection. If you want graphical console redirection, you need to use a proprietary tool from your vendor, your server's BMC has to support it, and you have to pay for the license for it. Dell calls their IPMI implementation "iDRAC", and every motherboard always has the latest BMC capable of doing whatever, but you have to license iDRAC, iDRAC Pro, iDRAC Enterprise, etc. The cheapest option when buying a server is iDRAC Express, which gives you IPMI and none of their proprietary shit. You get power on, off, cycle, read the (hardware) system event log, configure the network settings of the BMC, and console redirection to a serial port. For Dell, you also have to enable redirection via COM2 in the BIOS if you want serial over LAN.

    IPMI doesn't touch the fucking OS. IPMI lets you build tools to do that, but that basically means spotty support for Windows servers.
    If you're running graphical Linux, you need graphical console redirection, as Guspaz does:

    SSH is my primary interface to the server, but sometimes you've got to get on a box locally, like if you mess up something network related, or you mess up a change to grub, or who knows what. It's not common, but I don't have a serial terminal, so having video output when needed is very important.

    The only ways to get graphical console redirection are to use a hardware solution connected to video ports or to use proprietary vendor shit. IPMI does not do this. IPMI console redirection is text only. Read the spec.

  125. Re: ZFS is nice... by BitZtream · · Score: 1

    ... You know X supports network connections and has for over 30 years ... Right? If you can SSH in, you can have a gui. ssh -X is your friend

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  126. Re: ZFS is nice... by BitZtream · · Score: 1

    RAIDZ is crap. It doesn't have the write hole because it ALWAYS PERFORMS badly when writing RAIDZ. Never use RAIDZ; bite the bullet, pay for the mirrored setup and experience ZFS that doesn't suck ass.

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  127. Re: ZFS is nice... by BitZtream · · Score: 1

    ZFS itself needs 4gb of ram to be useful even on small disks or you end up with no effective caching at all. 8gb is the practical minimum for a small ZFS file server that only does NFS. You want 4-5gb of ram for every tb of deduped data unless you know your dataset really well, or you can quickly end up with a file system that can't be mounted because it's continually reading the dedup table that won't fit in RAM and must be consulted before every single read or write ... and if you're using the machine for anything, 16gb with no dedup.
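
    For a sense of where a figure like that can come from, here is a back-of-the-envelope sketch. It assumes roughly 320 bytes of RAM per dedup table entry, which is a commonly quoted rule of thumb rather than a number from this thread, and the average block sizes are likewise assumptions.

```python
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4

DDT_ENTRY_BYTES = 320  # assumed rule-of-thumb size of one dedup table entry


def ddt_ram_bytes(data_bytes, avg_block_bytes, entry_bytes=DDT_ENTRY_BYTES):
    """Estimate RAM needed to hold one dedup table entry per unique block."""
    blocks = data_bytes // avg_block_bytes
    return blocks * entry_bytes


print(ddt_ram_bytes(TIB, 64 * KIB) / GIB)   # 5.0
print(ddt_ram_bytes(TIB, 128 * KIB) / GIB)  # 2.5
```

    Smaller average blocks mean more entries per terabyte, which is why the estimate swings so much with the dataset.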

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  128. Re: ZFS is nice... by BitZtream · · Score: 1

    Bullshit.

    They REQUIRE 1gb, they state that less than 4gb is not recommended and disable features by default if you only have 4. They also both state that ZFS loves RAM pretty clearly. If you use less than eight, you're probably doing it wrong. 16 is a minimum for a pure NAS server.

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  129. Re: ZFS is nice... by BitZtream · · Score: 1

    But it's combined by the user at runtime, not by Canonical. The GPL allows an end user to do this.

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  130. Re: ZFS is nice... by BitZtream · · Score: 1

    Or you're just dull and missed it?

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  131. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    Cool story, bro.

    *eyeroll*

  132. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    People build themselves? Bit of a chicken and egg problem, there...

  133. Re: ZFS is nice... by Bruce+Perens · · Score: 1

    But it's combined by the user at runtime, not by Canonical. The GPL allows an end user to do this.

    This is a way that people kid themselves about the GPL. If the user were really porting ZFS on their own, combining the work and never distributing it, that would work. But the user isn't combining it. The Ubuntu developer is creating instructions which explicitly load the driver into the kernel. These instructions are either a link script that references the kernel, or a pre-linked dynamic module. Creating those instructions and distributing them to the user is tantamount to performing the act on the user's system, under your control rather than the user's.

    To show this with an analogy, suppose you placed a bomb in the user's system which would go off when they loaded the ZFS module. But Judge, you might say, I am innocent because the victim is actually the person who set off the bomb. All I did was distribute a harmless unexploded bomb.

    So, it's clear that you can perform actions that have effects later in time and at a different place that are your action rather than the user's. That is what building a dynamic module or linking scripts does.

    There is also the problem that the pieces, Linux and ZFS, are probably distributed together. There is specific language in the GPL to catch that.

    A lot of people don't realize what they get charged with when they violate the GPL (or any license). They don't get charged with violating the license terms. They are charged with copyright infringement, and their defense is that they have a license. So, the defense has to prove that they were in conformance with every license term.

    This is another situation where I would have a pretty easy time making the programmer look bad when they are deposed.

  134. Re: ZFS is nice... by Bruce+Perens · · Score: 1

    I should add one thing. Distributing the instructions which create the derivative work is tantamount to distributing the infringing derivative work. The logic above applies to all of that.

  135. Re:ZFS is nice... by evilviper · · Score: 1

    IPMI 2.0 includes serial over LAN, but that's text only console redirection. If you want graphical console redirection, you need to use a proprietary tool from your vendor,

    I... don't. Why would I? Both Linux and Windows (since 2003 with EMS) let you do any system fixing & reconfiguration you could want, via serial console.

    With OoBM, you really only need to get your system booting again, and network reconfigured and working. After that, you connect via in-band management, whether that's SSH, RDP, NX, VNC, etc. It's stupid, wrong, and terribly inefficient to use your OoBM for all your system management.

    If you're running graphical Linux, you need graphical console redirection

    Bullshit. You mean if you don't have a clue how to manage the basics on a Linux system without the GUI, THEN you're in trouble if you don't have graphical console redirection.

    You get power on, off, cycle, read the (hardware) system event log, configure the network settings of the BMC, and console redirection to a serial port. For Dell, you also have to enable redirection via COM2 in the BIOS if you want serial over LAN.

    As I said before, IPMI also enables you to change BIOS settings. You can "enable redirection via COM2 in the BIOS" directly & remotely via IPMI, without ever entering the BIOS. It's a simple one-liner. Then you

    The "redirection" isn't even really needed, except for seeing BIOS messages on boot-up. Otherwise you just need to tell your OS to enable a console on serial, and you're fine.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  136. Re:ZFS is nice... by Anonymous Coward · · Score: 0

    Annoyingly, in the ZFS case, resilvering is an extremely IOPS-intensive process compared to block-level resilvering. In the limit where your leaf vdevs are full, resilvering performance (no matter what the structure of the top level vdev is -- mirrored, raidz, raidz3, ...) is almost certainly slower in the ZFS case than in a hardware raid case. In fact, in extreme cases -- large leaf vdevs (3+ TiB devices), low seek times (milliseconds), and correlated I/O (raidz3), resilvering a single device in a quiet but full vdev can take *days*; it's much worse when the vdev is not quiet (people have reported *weeks* to resilver ~4TB drives in the real world).

    Oracle has broken up resilvering into two phases, one of which is highly linearized; that helps enormously, but is still much slower than a simple linear copy of a multi-terabyte disk when, for example, there is approximately 50% free (or alternatively unreachable) space.

  137. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    Oracle ZFS resilvers at full platter speed, i.e. 100 MB/sec. OpenZFS resilvers slowly, i.e. days in the worst case.
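
    The gap between "full platter speed" and "days" is easy to see with some rough arithmetic. In this sketch the 4 TB drive size and the 100 MB/sec figure come from the thread, while the 128 KiB block size and the 100 random IOPS for a spinning disk are assumptions picked for illustration.

```python
def linear_hours(disk_bytes, throughput_bytes_per_s):
    """Time to stream the whole disk sequentially."""
    return disk_bytes / throughput_bytes_per_s / 3600


def iops_bound_days(used_bytes, block_bytes, iops):
    """Time to copy the used data when every block costs one random I/O."""
    return used_bytes / block_bytes / iops / 86400


DISK_BYTES = 4 * 10**12          # 4 TB drive
STREAM_RATE = 100 * 10**6        # 100 MB/sec sequential
BLOCK_BYTES = 128 * 1024         # assumed block size
RANDOM_IOPS = 100                # assumed random IOPS for one spinning disk

print(round(linear_hours(DISK_BYTES, STREAM_RATE), 1))                    # 11.1
print(round(iops_bound_days(DISK_BYTES, BLOCK_BYTES, RANDOM_IOPS), 1))    # 3.5
```

    Under these assumptions a full drive takes about half a day to copy sequentially and several days when each block needs its own seek, and that is before any competing load on the pool.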

  138. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    I used 1gb ram pc with solaris and zfs for over a year without problems. 4gb ram is ok, if you skip dedupe.

  139. Re: ZFS is nice... by Anonymous Coward · · Score: 0

    Bullshit. I used 1gb ram pc with solaris and zfs for over a year without problems. 4gb ram is ok if you skip dedupe. Read the Wikipedia article on ZFS and learn.