Slashdot Mirror


Ubuntu 16.04 LTS To Have Official Support For ZFS File System (dustinkirkland.com)

LichtSpektren writes: Ubuntu developer Dustin Kirkland has posted on his blog that Canonical plans to officially support the ZFS file system for the next Ubuntu LTS release, 16.04 "Xenial Xerus." The file system, which originates in Solaris UNIX, is renowned for its feature set (Kirkland touts "snapshots, copy-on-write cloning, continuous integrity checking against data corruption, automatic repair, efficient data compression") and its stability. "You'll find zfs.ko automatically built and installed on your Ubuntu systems. No more DKMS-built modules!" N.B. ext4 will still be the default file system due to the unresolved licensing conflict between Linux's GPLv2 and ZFS's CDDL.

37 of 191 comments

  1. For home users, basically meaningless. by Anonymous Coward · · Score: 4, Insightful

    All file systems are approximately the same for most day to day users. I would be interested in knowing which is fastest at read/writes.

    1. Re: For home users, basically meaningless. by Zeromous · · Score: 2

      Large files: btrfs or xfs.
      For millions of small files...ext4

      --
      ---Up Up Down Down Left Right Left Right B A START
    2. Re:For home users, basically meaningless. by Anonymous Coward · · Score: 3, Insightful

      All file systems are approximately the same for most day to day users. I would be interested in knowing which is fastest at read/writes.

      And that's meaningless without specifying the hardware you're doing the comparison on, your access pattern(s), file system layout, data distribution within the file system, and other factors.

    3. Re:For home users, basically meaningless. by Anonymous Coward · · Score: 3, Informative

      I think you don't know what ZFS really is. It's a very different deal than ext4, ufs etc... It is the file system that made HW raid controllers obsolete. Even with a single disk setup you get a lot of features that you don't have on most of the other FS. It is a big deal just because of cheap snapshots, and data integrity checks.
      And no, BTRFS is not even close... yet.

    4. Re:For home users, basically meaningless. by UnknownSoldier · · Score: 3, Informative

      > I would be interested in knowing which is fastest at read/writes.

      Ignoring the fact that this is a HIGHLY ambiguous question, i.e. you don't specify _which_ RAID setting, here are some benchmarks:

      = 2010 =
      http://www.zfsbuild.com/2010/0...

      = 2013 =
      ZFS On Linux 3.8 Kernel, ZOL 0.6.1
      https://openbenchmarking.org/r...

      = 2015 =
      A PERFORMANCE COMPARISON OF ZFS AND BTRFS ON LINUX
      * https://www.diva-portal.org/sm...

    5. Re:For home users, basically meaningless. by Curunir_wolf · · Score: 4, Interesting

      I think you don't know what ZFS really is. It's a very different deal than ext4, ufs etc... It is the file system that made HW raid controllers obsolete.

      It also made just about any computer with less than 8 GB of RAM obsolete. It's also not very friendly with applications that need large chunks of RAM, like a database or large Java VM application - the ARC cache causes a lot of fragmentation and is often slow to release it when other applications need more.

      --
      "Somebody has to do something. It's just incredibly pathetic it has to be us."
      --- Jerry Garcia
    6. Re:For home users, basically meaningless. by Aaden42 · · Score: 4, Interesting

      On 64-bit hosts, the ARC cache is a non-issue. Java needs contiguous *virtual* memory space. Physical memory fragmentation isn't a problem w/ the MMU translating contiguous 64-bit address space to possibly non-contiguous physical pages. On 32-bit hosts, that gets dicey. On 64-bit, you've got plenty of room even w/ ARC.

      That said, I'd love to see ARC & the native Linux disk cache functionality either merge or at least have ARC behave more like the normal caching mechanism (IE free up RAM more eagerly), but it's not actually caused me significant problems on 64-bit.

    7. Re:For home users, basically meaningless. by The-Ixian · · Score: 4, Interesting

      I used MythTV for years as a DVR and I tried a lot of different file systems.

      The 2 that always worked the best were JFS and XFS for the sole reason that large file deletes took almost no time at all. Compared to several seconds or even minutes with other file systems.

      --
      My eyes reflect the stars and a smile lights up my face.
    8. Re:For home users, basically meaningless. by MightyYar · · Score: 2

      I had an incident where my photos folder suffered silent filesystem corruption. Fortunately, my backup tool (Unison) does enough file comparisons and did not brainlessly overwrite the undamaged images still in backup but instead flagged it as a conflict. It taught me a lesson about what is "good enough" for day-to-day users. Just like a lightning strike taught me about off-site backup for day-to-day users.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    9. Re:For home users, basically meaningless. by epine · · Score: 2

      These benchmarks are sensitive to extremely subtle differences in how each file system interprets safety semantics, which unfortunately none of these "benchmark" utilities actually check.

      By "subtle" I mean just a scattered handful of sunflower seeds, which may (or may not -- don't look at the light!) attract the attention of the Black Swan of Extreme Face Melt.

      One thing I read a while back explained how rigorous NFS semantics were pretty much guaranteed to cut your benchmark results in half, compared to how these semantics are traditionally implemented on just about any Linux system.

      Is that bridge safe? The pragmatic answer is this: a million people have driven over it so far, and no-one has died yet.

      To the kind of person who gets a secret thrill from "First post!" the logic of this probably seems impeccable.

      Subject: First post!

      I don't want a pickle,
      I just want to r-i-i-ide my motor sickle.

      cya
      snookie—don't cry ... I love you so much!

    10. Re:For home users, basically meaningless. by ThomasSpaziani · · Score: 2

      Not at all. ZFS with a nice GUI can be extremely useful for home users. The ability to custom tailor compression per folder, or snapshots for easy backup and quick retrieval of overwritten files. All that doesn't have to be enterprise only, it can have profoundly positive impact on general users day to day.

    11. Re: For home users, basically meaningless. by Zeromous · · Score: 3, Insightful

      If you are using ZFS, you need to have the offline backup.

      So many things can go wrong with ZFS due to failures beyond your control. You use ZFS so you don't have to restore, and keep an offline backup for when ZFS is fucked.

      If you can't afford to offline your ZFS data, ZFS is not for you.

      --
      ---Up Up Down Down Left Right Left Right B A START
    12. Re:For home users, basically meaningless. by lewiscr · · Score: 2

      It also made just about any computer with less than 8 GB of RAM obsolete.

      a) Pick the right tool for the job.
      b) ZFS works fine without lots of RAM. Either cap the ARC, or disable it.

      I plan to use ZFS for my personal NAS. I'll have 4TiB of storage (spinners) and 2GiB of RAM. It's mostly media storage, so ARC isn't terribly useful. And ZFS will auto-disable the ARC if the machine has less than 4GiB of RAM. Sure, it's not going to set any benchmark records, but I don't need it to. Streaming media at the home scale isn't taxing for modern PCs.
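For anyone wanting to try the "cap the ARC" route on ZFS on Linux, here is a minimal sketch. It assumes the cap is set through the `zfs_arc_max` module parameter (a value in bytes); the helper name `arc_max_option` and the 512 MiB figure are made up for illustration, not a recommendation.

```python
def arc_max_option(cap_mib: int) -> str:
    """Build a modprobe option line capping the ZFS ARC at cap_mib MiB."""
    if cap_mib <= 0:
        raise ValueError("cap must be a positive number of MiB")
    cap_bytes = cap_mib * 1024 * 1024  # zfs_arc_max is expressed in bytes
    return f"options zfs zfs_arc_max={cap_bytes}"


# Hypothetical cap for a small NAS: leave most of the RAM to everything else.
line = arc_max_option(512)
print(line)
```

The printed line would typically go into a file such as /etc/modprobe.d/zfs.conf and take effect the next time the module loads. Treat the right cap for a given box as something to measure, not to copy.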

      It's also not very friendly with applications that need large chunks of RAM, like a database or large Java VM application

      I love ZFS for my database servers. It plays very well with PostgreSQL, because in PG you can tell it how much RAM to use as a buffer AND estimate how much RAM the OS will use for cache. Just tell PG that the OS will do all the caching, and things are good. ZFS beat the crap out of my HW RAID card in the PG benchmarks, with the same amount of RAM, without adjusting the configs I mentioned.

      And let's not forget the other great features it offers:

      • It's beautiful for RAID1. mdadm is weak with partial failures. If a drive has bitrot, mdadm will tell you, but it can't tell you which drive is right. ZFS knows which one is right, and fixes it automatically.
      • Auto expansion is available (disabled by default). I've been upgrading my personal NAS for 15 years, one part at a time. I have expanded the LVM+mdadm ext FS from 100GB -> 250GB -> 500GB -> 2TB. It's easier now that ext3 has online resize, but it's still a lot more work than ZFS.
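The RAID1 point can be sketched in a few lines. This is a toy model, not ZFS code: it assumes a checksum stored apart from the data (ZFS keeps it with the parent block pointer), and the names `checksum` and `read_mirrored` are invented for the example. With a checksum to compare against, the reader can tell which side of the mirror is right and rewrite the other.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum kept apart from the data it describes."""
    return hashlib.sha256(block).hexdigest()


def read_mirrored(copies: list, expected: str) -> bytes:
    """Return a verified block, rewriting any mirror copy that fails the checksum."""
    good = None
    for copy in copies:
        if checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        raise IOError("every copy fails the checksum; nothing to repair from")
    for i, copy in enumerate(copies):
        if checksum(bytes(copy)) != expected:
            copies[i] = bytearray(good)  # self-heal the damaged side
    return good


original = b"holiday photo data"
expected = checksum(original)
copies = [bytearray(original), bytearray(original)]
copies[0][0] ^= 0x01  # simulate silent bit rot on the first drive

data = read_mirrored(copies, expected)
print(data == original, bytes(copies[0]) == original)
```

A plain mirror without the stored checksum only sees that the two copies differ, which is the mdadm situation described above.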

      I do wish ZFS could handle changing the layout on the fly, and shrinking volumes. It's a corner case, but it would come in handy in some failure scenarios. Veritas Volume Manager was the only thing I've worked with that did this well, and that was hella expensive. (I consider any software that costs more than the machine that it runs on to be hella expensive.)

    13. Re: For home users, basically meaningless. by Tough+Love · · Score: 2

      The main drawback with zfs is, it does not have a repairing fsck and never will have one. The koolaid you are supposed to drink is that raid will fix any corruption, so if anything ever does go wrong, and that would include bugs, random memory bit flips, multiple disk errors (lightning storm anyone?) and any number of other hazards that defeat raid recovery, zfs is just screwed and won't even attempt to get back the data that is most probably still sitting there, mostly intact.

      If you need snapshots and remote replication more than you need the comfort zone of being able to repair just about anything that goes wrong, then ZFS is for you, otherwise Ext4 is still the most robust general purpose filesystem around.

      The other thing that really hasn't come out yet is, how does ZFS perform compared to Ext4? So far nobody has done proper head to head benchmarks on identical hardware and OS. Soon it should be easy. I'm guessing that ZFS comes in pretty solidly in the rear of the pack.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    14. Re: For home users, basically meaningless. by Tough+Love · · Score: 2

      ZFS does not need a fsck utility because it cannot break like other filesystems.

      Excellent sense of humour :)

      --
      When all you have is a hammer, every problem starts to look like a thumb.
  2. BTRFS by ssam · · Score: 4, Interesting

    I'll stick with BTRFS thanks. It gives me all those features, is GPL and has been trouble free for me on many TB of disks for several years.

    1. Re:BTRFS by fbobraga · · Score: 2, Informative

      It's a promising fs, but it's not very stable yet: I tried BTRFS on a netbook (with Arch), and it corrupted a micro-SD disk so many times that I gave up and used ext4. (Since then I have never considered using BTRFS in production systems the way I do with ZFS.)

    2. Re:BTRFS by Anonymous Coward · · Score: 2, Informative

      I'll stick with BTRFS thanks. It gives me all those features, is GPL and has been trouble free for me on many TB of disks for several years.

      Encryption? Oh yeah:

      Btrfs does not support native file encryption (yet), and there's nobody actively working on it. It could conceivably be added in the future.

      "Nobody actively working on it" is a big problem with BTRFS.

      BTRFS comes from Oracle - pre-Sun purchase. It was Oracle's answer to ZFS. And now Oracle owns ZFS and doesn't need a copy of the original. It's not quite abandonware, but the central impetus for its creation and advancement is gone.

      And most of all:

      Is btrfs stable?

      Short answer: Maybe.

      Ouch. That's the official BTRFS wiki page.

    3. Re:BTRFS by lkcl · · Score: 2, Interesting

      If you pick your file system because its GPL, you're pretty retarded. And yes, retarded is the appropriate word here.

      he's picking his file system so that he complies with Copyright Law. why would you have an issue with that? i also don't understand why a Corporation (Canonical) would encourage people to ignore Copyright Law.

    4. Re:BTRFS by Anonymous Coward · · Score: 5, Informative

      As a die hard BTRFS user that chases kernel releases like an addict chases crack, I can't help but say that there are still some annoying issues out there.

      While none have given me data loss, you'll get the occasional deadlock from a set of kthreads that do compression or a severe slowdown with next to no disk I/O and big WAITIO (usually get 16.xx Load in such cases on a quad core machine). For the slowdown case you'll get a speed drop from 150MB/s to 900~KB/s on spinning rust for a couple of minutes. Happens only after heavy use in the range of 2+TB written with forced compression.

      ENOSPC? Not on my end. Trying to copy a file and running out of space results in WAITIO through the roof while BTRFS tries to find free space. I've had a job that stalled and thrashed the hard drive for 9 hours while it tried to recover space. At no point did it simply kill the transfer due to out of space, btrfs usage showed around 1GB of space left with plenty for metadata. It's at 1GB free for data extents and that's what kills the whole deal. You can't use that last 1GB, you'll just deadlock until some space is recovered by deleting files manually. Happens every time, just make sure to transfer something that is larger than the available free space and watch it suffer.

      All this with Linux kernel 4.4.2. Looking at the various mailing lists with regular posts from people with obscure problems I've never encountered before, can't really say it's on par with ZFS stability. And ZFS On Linux is still missing a few things last I checked from the true ZFS implementation, but it's usable. Can't comment on ZOL long term stability, but I would feel comfortable enough using it instead of BTRFS for say a production server.

    5. Re:BTRFS by Bert64 · · Score: 2

      So given that Oracle created btrfs as a competitor to zfs because the latter used a license incompatible with the linux kernel, and now they own zfs, why wouldn't they just gpl (or dual license) zfs and forget about btrfs?

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    6. Re:BTRFS by DRJlaw · · Score: 2, Interesting

      he's picking his file system so that he complies with Copyright Law. why would you have an issue with that? i also don't understand why a Corporation (Canonical) would encourage people to ignore Copyright Law.

      Identify the copyright violation:

      "This problem is being worked around by providing the kernel facilities through a separate kernel module, a technical solution for a legal problem that is also being employed by vendors and distributors of proprietary hardware drivers."

      The GPL does not automatically apply to anything that touches the kernel. It only applies to derivative works of a GPLed work. If they write a GPLed wrapper that is a derivative of both the kernel and the ZFS sources and choose to dual license it, then there's no need for the ZFS sources to be GPL licensed -- merely the wrapper. No GPL-code-inspired modifications, no GPL-defined derivative work and no GPL licensing requirement. (So sad.)

      For a group which worships the copyright hack that the GPL represents, it's odd that so many become so blind and incensed by anyone who dares to come up with a counter-hack to overcome some of the license's more idiotic features (i.e., it's open source, but it's not pure, GPL-certified open source, so you can't use it with our stuff). The only case that comes close to supporting GPL proponents' borg-like interpretation of the term "derivative work" is the Oracle v. Google fiasco. If that's the company that you want to keep, don't expect sympathy from me.

    7. Re:BTRFS by Anonymous Coward · · Score: 2, Funny

      he's picking his file system so that he complies with Copyright Law. why would you have an issue with that? i also don't understand why a Corporation (Canonical) would encourage people to ignore Copyright Law.

      Identify the copyright violation:

      "This problem is being worked around by providing the kernel facilities through a separate kernel module, a technical solution for a legal problem that is also being employed by vendors and distributors of proprietary hardware drivers."

      The GPL does not automatically apply to anything that touches the kernel. It only applies to derivative works of a GPLed work. If they write a GPLed wrapper that is a derivative of both the kernel and the ZFS sources and choose to dual license it, then there's no need for the ZFS sources to be GPL licensed -- merely the wrapper. No GPL-code-inspired modifications, no GPL-defined derivative work and no GPL licensing requirement. (So sad.)

      For a group which worships the copyright hack that the GPL represents, it's odd that so many become so blind and incensed by anyone who dares to come up with a counter-hack to overcome some of the license's more idiotic features (i.e., it's open source, but it's not pure, GPL-certified open source, so you can't use it with our stuff). The only case that comes close to supporting GPL proponents' borg-like interpretation of the term "derivative work" is the Oracle v. Google fiasco. If that's the company that you want to keep, don't expect sympathy from me.

      I had to write a module like that one.

      The acronym for the module in our product was "bmrms". I forget what the "official" meaning was, but it really meant Bite Me RMS

    8. Re:BTRFS by UnknownSoldier · · Score: 2

      The parent is probably referring to the fact that CDDL is NOT compatible with the GPL.

      https://lists.debian.org/debia...

      Unfortunately Sun then developed the CDDL[1] and Jörg Schilling
      released parts of recent versions of cdrtools under this license.
      The CDDL is incompatible with the GPL. The FSF itself says that this
      is the case as do people who helped draft the CDDL. One current and
      one former Sun employee visited the annual Debian conference in Mexico
      in 2006. Danese Cooper clearly stated there that the CDDL was
      intentionally modelled on the MPL in order to make it GPL-
      incompatible. For everyone who wants to hear this first-hand, we have
      video from that talk available at [2].

      You can read the FSF position about the CDDL at [3]. The thread behind
      [4] contains statements on the issue made by Debian people; for more
      context also see the other mails in that thread.
      In short - the CDDL has extra restrictions, which the GPL does not
      allow. Jörg has a different opinion about this and has repeatedly
      stated that the CDDL is not incompatible, interpreting a facial
      expression in the above-mentioned video, calling us liars and generally
      appearing unwilling to consider our concerns (he never replied to the
      parts where we explained why it is incompatible). As he has basically
      ignored what we have said, we have no choice but to fork. While the CDDL
      *may* be a free license, we never questioned if it is free or not, as it
      is not our place to decide this as the Debian cdrtools
      maintainers. However, having been approved by OSI doesn't mean it's ok
      for any usage, as Jörg unfortunately seems to assume. There are several
      OSI-approved licenses that are GPL-incompatible and CDDL is one of
      them. That is and always was our point.

      [1] http://www.opensource.org/lice...
      [2] http://meetings-archive.debian...
      [3] http://www.gnu.org/licenses/li...
      [4] http://lists.debian.org/debian...

    9. Re:BTRFS by ssam · · Score: 3, Interesting

      I am using BTRFS on luks on my laptop. Even during a motherboard failure that caused repeated hard poweroffs I did not lose any data (and thanks to data checksumming I know that there is no corruption lurking in the files). BTRFS has developers at Facebook, Fujitsu, SUSE, IBM and still gets patches from people at Oracle. Seems a fairly healthy project to me.

    10. Re:BTRFS by Aaden42 · · Score: 2

      Oracle considers ZFS a competitive advantage. It's their answer to NetApp's WAFL. Not sure the reasoning behind creating btrfs (other than possibly just merger schedules resulting in them owning both), but it's very likely they consider the GPL/CDDL incompatibility and resulting copyright FUD/trolling to be a feature. Having an in-tree ZFS module on Linux isn't something Oracle wants to see.

    11. Re:BTRFS by wolrahnaes · · Score: 2

      The GPL does not automatically apply to anything that touches the kernel. It only applies to derivative works of a GPLed work. If they write a GPLed wrapper that is a derivative of both the kernel and the ZFS sources and choose to dual license it, then there's no need for the ZFS sources to be GPL licensed -- merely the wrapper. No GPL-code-inspired modifications, no GPL-defined derivative work and no GPL licensing requirement. (So sad.)

      There's actually a lawsuit going on right now about this very tactic. https://sfconservancy.org/copy... (article source is funding the suit, so apply grains of salt as appropriate)

      IANAL, but the position I've always heard (and seemingly the one this lawsuit is taking) is that the "GPL shim" is only legitimate in cases where the proprietary side it's interfacing with is the same on other platforms. In that case instead of just being an obvious attempt to run around copyleft it becomes a mere adapter to a vendor's standard interface. Supposedly this is how the nVidia driver has worked for a long time now, the binary blob is the same on all supported operating systems and only the shim that adapts it to each kernel differs. VMware's attempt on the other hand appears to be pretty much the opposite side of the spectrum, with heavy integration with one specific kernel under the guise of a "shim".

      Personally I think that's a legitimate workaround, since in the end most drivers exist to connect the kernel with some proprietary black-box hardware over an interface that's standard to the hardware but may or may not be documented. That the public standard interface is implemented in software rather than hardware doesn't seem like it should be meaningful to me.

      I guess if that suit runs to completion we'll at least see some line drawn in the sand legally which obviously wouldn't directly apply to anywhere not under German law but would probably influence future cases.

      --
      I used to get high on life, but I developed a tolerance. Now I need something stronger.
    12. Re:BTRFS by Kjella · · Score: 2

      The GPL does not automatically apply to anything that touches the kernel. It only applies to derivative works of a GPLed work. If they write a GPLed wrapper that is a derivative of both the kernel and the ZFS sources and choose to dual license it, then there's no need for the ZFS sources to be GPL licensed -- merely the wrapper. No GPL-code-inspired modifications, no GPL-defined derivative work and no GPL licensing requirement. (So sad.)

      That's not how copyright works. If you have, say, a piece of code licensed for non-commercial use you can't just write a wrapper and say my commercial use application talks to a wrapper, the wrapper talks to your code so the terms don't apply. Instead it relies on a sleight of hand where the user is creating the illegal derivative by assembling bits that were acquired legally using a prepared script. Just like you can acquire a bunch of legal chemicals, start a meth lab and end up with an illegal product.

      --
      Live today, because you never know what tomorrow brings
  3. For containers by DrYak · · Score: 4, Informative

    More precisely, the blog post is about using ZFS' copy-on-write (CoW) capabilities in the context of linux containers.
    (thin virtualized machines. The guests share the same kernel as the host, but the userspace is separated and compartmentalized using the kernel's cgroup feature.
    Similar to BSD Jails and Solaris containers.
    Think like a chroot, except extended to all the other concepts besides the file system).

    The fast and easy snapshotting that comes with CoW filesystems like ZFS (or BTRFS for that matter) makes it very easy to spin up new virtualized containers simply by snapshotting the subtree holding the empty template, while wasting only minimal resources (only the differences are stored as the two copies diverge over time).
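To make the "only the differences are stored" part concrete, here is a toy sketch of a copy-on-write clone. It is an illustration under simplifying assumptions, not how ZFS or BTRFS lay out data: the `Volume` class and its methods are invented for the example, and a "block" is just a Python bytes object.

```python
class Volume:
    """Toy copy-on-write volume: a map from block number to a shared block object."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})

    def clone(self) -> "Volume":
        # Copies only the block map (references), never the block contents.
        return Volume(self.blocks)

    def write(self, index: int, data: bytes) -> None:
        # A write points this volume at a new block; other volumes keep the old one.
        self.blocks[index] = data

    def shared_with(self, other: "Volume") -> int:
        # Count blocks that are literally the same object in both volumes.
        return sum(1 for i, b in self.blocks.items() if other.blocks.get(i) is b)


template = Volume({0: b"base rootfs", 1: b"/etc defaults", 2: b"/var empty"})
container = template.clone()
container.write(1, b"/etc with container config")

print(container.shared_with(template), "of", len(template.blocks), "blocks still shared")
```

Creating the clone costs one small map regardless of how much data the template holds, and the template itself is untouched by the container's writes.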

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
  4. 16.04 will be really exciting LTS by Parker+Lewis · · Score: 2

    12.04 and 14.04 were kind of the previous versions with updated programs, nicely polished, and updated drivers. But 16.04 will have exciting new stuff: privacy enabled by default, ZFS, new software centre, first LTS with systemd (yeap, mind that, I like it!) and kernel 4.

  5. Like a train wreck in reverse by BaronM · · Score: 4, Insightful

    Every time I see news about ZFS and Linux, it's a little bit less of a mess. Eventually, I expect that all of the major distributions will go this route and sidestep the licensing issue by providing distro-supported modules that are installed by user request, sort of like the way that Nvidia drivers are provided.

  6. Re:Too little, too late... by LichtSpektren · · Score: 3, Insightful

    I used the wrong tool for the job, therefore it sucks.

  7. Lies by dlenmn · · Score: 3, Interesting

    It's not quite abandonware, but the central impetus for its creation and advancement is gone.

    I wasn't planning to comment on this thread, but this is too big a lie to let stand -- unless by "not quite abandonware" you mean "has absolutely nothing in common with abandonware besides being a type of software". Oracle was never the sole developer, and now that Oracle has lost interest, the developers just moved to other companies and kept doing the same thing. Its raison d'etre remains to provide an advanced filesystem that's easily integrated with linux, which for better or worse means being licensed under the GPL or something compatible.

    As for encryption, yeah that would be nice to have, but it's not like zfs has all the features btrfs has. I'll take btrfs's online balancing (ability to add and remove drives at will) over built in encryption, but I realize that's a personal choice.

    Finally, let's actually quote the FAQ correctly on stability:

    Short answer: Maybe.

    Long answer: Nobody is going to magically stick a label on the btrfs code and say "yes, this is now stable and bug-free". Different people have different concepts of stability: a home user who wants to keep their ripped CDs on it will have a different requirement for stability than a large financial institution running their trading system on it. If you are concerned about stability in commercial production use, you should test btrfs on a testbed system under production workloads to see if it will do what you want of it. In any case, you should join the mailing list (and hang out in IRC) and read through problem reports and follow them to their conclusion to give yourself a good idea of the types of issues that come up, and the degree to which they can be dealt with. Whatever you do, we recommend keeping good, tested, off-system (and off-site) backups.

    Pragmatic answer: (2012-12-19) Many of the developers and testers run btrfs as their primary filesystem for day-to-day usage, or with various forms of real data. With reliable hardware and up-to-date kernels, we see very few unrecoverable problems showing up. As always, keep backups, test them, and be prepared to use them.

    For all practical purposes, btrfs is stable. Everything they say in the long answer basically applies to linux in general (unless you have a support contract with Red Hat or the likes).

  8. Re:Where's The Lie? by Eunuchswear · · Score: 2

    Where's the lie?

    Does systemd not replace the system log with a binary file that is unusable by every application that reads or writes to the system log?

    No.

    Does systemd not break the system administration tool chain?

    No.

    Does systemd not consume and discard STDERR making troubleshooting and debugging a masochist's delight?

    No

    Up until 14.04LTS everything dumps into /var/log/syslog, a standard text file. Beginning with 16.04LTS and thanks to systemd that is replaced by a binary file that is virtually inaccessible by everything else.

    Run rsyslogd.

    Won't bother with your boring crap about eth0. All my network interfaces have real names, which works with udev just like it used to.

    --
    Watch this Heartland Institute video
  9. Re:RAM? by Aaden42 · · Score: 2

    Do not, repeat DO NOT ENABLE DE-DUPE unless you have gargantuan amounts of RAM.

    Rule of thumb is 5GB of RAM per 1TB of ZFS data: http://constantin.glez.de/blog...

    If you ever enable dedupe on a pool, it's on forever. You can't actually turn off the extra RAM requirements since there *could* be de-duped blocks, and ZFS must check for those on every pool import. On a system with insufficient RAM, it's possible to end up with a pool that can take hours or days to import with no indication that it's actually still importing and not just dead.

    Unless you have truly epic levels of duplication, it'll be cheaper to buy more disk to hold the extra copies than to buy enough RAM. (Also keeping in mind that with snapshots & copy-on-write clones, you essentially get dedupe of those blocks for "free" without enabling pool-wide dedupe.)
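The rule of thumb above is easy to turn into a quick estimate. This is back-of-the-envelope arithmetic only: the function name is made up, the pool sizes are arbitrary examples, and the 5 GB per TB ratio is the rule of thumb quoted above, not a measured figure for any particular pool.

```python
def dedup_ram_gb(pool_tb: float, gb_per_tb: float = 5.0) -> float:
    """Rough RAM estimate for dedup, using the 5 GB per 1 TB rule of thumb."""
    return pool_tb * gb_per_tb


for pool_tb in (1, 4, 16):  # example pool sizes, chosen arbitrarily
    print(f"{pool_tb} TB of data -> roughly {dedup_ram_gb(pool_tb):.0f} GB of RAM")
```

Comparing that RAM figure against the price of simply buying the extra disk is usually the whole decision.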

  10. Re:Too little, too late... by ssam · · Score: 3, Informative

    Why do you need the closed nvidia driver on a server? Nouveau should be fine or even just the vesa driver. (I could say why do you even need a video card on a server, but I guess some folk prefer that to using ssh or a serial connection from a laptop)

  11. Re:RAM? by Anonymous Coward · · Score: 2, Interesting

    Also don't enable dedup if you have media with a nontrivial seek time. It's tolerable on flash (but you do lots of extra I/O on write) but the deduplication table (DDT) tends to develop a random layout with respect to device LBAs, and the DDT needs to be consulted on each write to a dataset/zvol with dedup enabled, and it also needs to be scanned *first* during scrubs and resilvers. DDTs with millions of entries can require hundreds of thousands of random I/Os, which means hundreds of thousands of seeks. At 10-30 ms/seek, you will have a bad time if you ever need to replace a disk in your pool, or scrub regularly.
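A rough way to see why seek time dominates: multiply the number of random I/Os by the per-seek latency. The 500,000 figure below is an assumed stand-in for "hundreds of thousands", and the model ignores caching and queueing entirely, so read it as an order-of-magnitude sketch.

```python
def seek_time_hours(random_ios: int, ms_per_seek: float) -> float:
    """Time spent purely on seeks, in hours, with no caching or overlap."""
    return random_ios * ms_per_seek / 1000 / 3600


ios = 500_000  # assumed count, standing in for "hundreds of thousands"
for ms in (10, 30):
    print(f"{ios} random I/Os at {ms} ms/seek: about {seek_time_hours(ios, ms):.1f} hours")
```

That is seek time for the DDT alone, before any of the actual scrub or resilver data is read.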

    Additionally, you have to update the DDT when you remove duplicates. When you remove a file from a deduplicated dataset: DDT consult. If DDT is not in memory, that means random I/O. When you remove a snapshot filled with data entered into the DDT: DDT consults for each record (128k, typically; sometimes smaller). Worse, asynchronous destruction of snapshots and datasets across a reboot boundary must be finished before the pool becomes available during the import process. And at the reboot, your ARC is empty and your L2ARC is unavailable before import, so each removed record is much more likely to result in a disk seek.

    On SSDs, not so bad. On rotating media, deduplication can result in operational disasters -- I have personally seen pools take a WEEK to import after an unexpected restart during a "zfs destroy -R foo/bar" where bar has dedup=on.