Slashdot Mirror


Native ZFS Is Coming To Linux Next Month

An anonymous reader writes "Phoronix is reporting that an Indian technology company has been porting the ZFS filesystem to Linux and will be releasing it next month as a native kernel module without a dependence on FUSE. 'In terms of how native ZFS for Linux is being handled by this Indian company, they are releasing their ported ZFS code under the Common Development & Distribution License and will not be attempting to go for mainline integration. Instead, this company will just be releasing their CDDL source-code as a build-able kernel module for users and ensuring it does not use any GPL-only symbols where there would be license conflicts. KQ Infotech also seems confident that Oracle will not attempt to take any legal action against them for this work.'"

11 of 273 comments

  1. Re:Freedom ain't free by h4rr4r · · Score: 4, Insightful

    Sun used the CDDL just to make sure Linux never got ZFS. Even that move is not going to save Solaris; only open-sourcing it earlier would have done that. I say this as a Linux user who likes Solaris and thinks it will be a shame to see it die. Well, I like it once the GNU tools are installed; the Solaris versions sucked.

    They are both quite open; how free they are is something some might argue about.

  2. Good Article by bill_mcgonigle · · Score: 5, Insightful

    No, really. I had a bunch of questions going in, and they were all answered. This is rare enough to warrant a shout out to Michael Larabel.

    I disagree with some of his subjective claims like x86_64 being a substantive limitation or ZFS on Linux remaining niche (I guess that depends on how you define the niche...) but he got the national lab project, the zpool version, the Oracle (nee Sun) patent problem. Kudos.

    FreeBSD 9 is probably where ZFS will wind up finding a proper home, I'm guessing.

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    1. Re:Good Article by h4rr4r · · Score: 4, Insightful

      How do you think it is not a substantive limitation?

      My phone runs linux and is not x86 of any shape or register size, nor is my workstation, nor are many other machines I have running linux. This is just like people who think flash working only on x86 32bit linux is good enough.

      If FreeBSD ever gets a good ZFS implementation expect lawsuits to commence.

    2. Re:Good Article by TheRaven64 · · Score: 4, Informative

      My phone runs linux and is not x86 of any shape or register size, nor is my workstation, nor are many other machines I have running linux

      I can't speak for the Linux version, but ZFS on FreeBSD needs x86-64 for three reasons:

      First, and most simply, this is the platform that all of the ZFS developers use, so it is the one that is most tested. This doesn't mean that it won't work elsewhere, it just means that it is not well tested anywhere else.

      The second is a performance consideration. ZFS uses a lot of 64-bit arithmetic for computing checksums and so on. On most 32-bit platforms, doing 64-bit arithmetic means that you need to split the operands between two registers, effectively halving the number of GPRs that you have to work with. On x86-32, this basically limits you to 2 registers, which cripples performance - every operation involves some stack spills. This is an x86-specific limitation. On ARM, for example, you have 16 32-bit registers, which can be viewed as 8 64-bit registers for certain instructions. Doing a lot of 64-bit arithmetic on an ARM chip still doesn't generate as much register pressure as even doing 32-bit operations on x86.
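      The register-pressure point above can be illustrated with a minimal sketch. It emulates a 64-bit addition on a machine that only has 32-bit words, so each operand and the result occupy two words each. The function names are hypothetical and chosen for illustration; this is not code from any ZFS implementation.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32-bit word


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit words."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Recombine (low, high) 32-bit words into one 64-bit value."""
    return (hi << 32) | lo


def add64_on_32bit(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values held as pairs of 32-bit words.

    Four input words and two output words are live at once, which is
    why a register-starved 32-bit CPU ends up spilling to the stack.
    """
    lo = a_lo + b_lo
    carry = lo >> 32            # 1 if the low words overflowed, else 0
    lo &= MASK32
    hi = (a_hi + b_hi + carry) & MASK32  # wraps modulo 2**64 overall
    return lo, hi
```

      For example, `join64(*add64_on_32bit(*split64(a), *split64(b)))` gives the same result as `(a + b) % 2**64` for any two 64-bit values `a` and `b`.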

      The final limitation is memory. ZFS likes to have 600MB or so of kernel memory. On x86, the divide between kernel and userspace memory is typically done using segmentation. The kernel has one segment, marked in the GDT as requiring ring-0 permission to access. When you switch to kernel space, the segment register points to this entry. In userspace, you use other segments (sometimes just one per process, sometimes one for stack, one for heap, and so on, sometimes one for all processes with some churn between them). With other implementations, this is done at the page level, although that's more expensive. The kernel's memory, however, is always mapped into the userspace process's address space - it just isn't always accessible.

      The reason for this is that x86 lacks sensible TLB controls. If the kernel's address space were not mapped in this way, then every system call would require a TLB flush, which would impact performance. The more address space that you allocate to the kernel, the less you give to userspace apps. If the kernel has 2GB of address space, userland apps can only have 2GB each. On ARM, each TLB entry is tagged with an ASID. The kernel and userspace programs' address spaces are entirely separate, but transitions between the two don't require a TLB flush because the userspace process can't see entries tagged with the kernel's ASID.
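      The address-space trade-off above is simple arithmetic, sketched below. The 600MB figure is the one quoted in this comment; the 1 GiB kernel share is an assumed example split and is not taken from the comment, and the function names are hypothetical.

```python
GIB = 2**30
MIB = 2**20
ADDRESS_SPACE_32BIT = 2**32  # 4 GiB of virtual addresses in total


def userspace_limit(kernel_reserved):
    """Bytes of address space left to each process after the kernel's share."""
    if not 0 <= kernel_reserved <= ADDRESS_SPACE_32BIT:
        raise ValueError("kernel share must fit in a 32-bit address space")
    return ADDRESS_SPACE_32BIT - kernel_reserved


def fraction_of_kernel_space(demand, kernel_reserved):
    """Share of the kernel's address space that one consumer would take."""
    return demand / kernel_reserved


# A 2 GiB kernel share leaves 2 GiB per process, as the comment says.
two_gib_split = userspace_limit(2 * GIB)

# Assumed 1 GiB kernel share: a 600MB consumer takes well over half of it.
zfs_share = fraction_of_kernel_space(600 * MIB, 1 * GIB)
```

      Under these assumptions, every extra byte of address space given to the kernel comes directly out of what each process can map.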

      Rather than saying that ZFS requires 64-bit, or requires x86-64, it's more accurate to say that it won't work (well) on x86-32 due to inherent limitations of the platform. That doesn't mean that it won't work well on other 32-bit or 64-bit architectures which are less braindead.

      --
      I am TheRaven on Soylent News
  3. If it comes out and works well by Sycraft-fu · · Score: 5, Informative

    Seems a little early to be putting faith in that. Its feature list looks good, on par with other modern desktop file systems like HFS+ and NTFS. However, it is currently unstable. When will that be fixed? Who knows? Maybe it moves full steam ahead and we have a stable, capable file system next month. Maybe the project loses steam and languishes, and 4 years from now it is still "unstable" and "coming soon."

    You can't really say how well it'll work until there is stable code to test. Remember designing a file system isn't the real hard part. I'm not saying it is trivial work or that it is unimportant but it is by far the easier part of all this. You can write out a specification that sounds great on paper, but then you have to implement it. That is the much harder part. You have to make it fast, stable, not corrupt data, able to do everything it should and so on.

    This is part of the reason why NTFS on Linux has been so tricky. It is actually pretty well documented in the Windows Internals book, and other places, but it is a complex file system. FAT, on the other hand, is real simple and thus not hard to implement.

    As an example, you can look at driver sizes. The NTFS driver in Windows is 1.6MB. The FAT driver, on the other hand, which supports multiple versions of FAT, is only 200k. The NTFS kernel driver is one of the very largest in the system; only the ATi video driver (much larger) and TCP/IP stack (a bit larger) are bigger than it on my system.

    So we'll see what happens with btrfs. As of late, there's not been much activity. The last version update was June 2009. Maybe they are rolling up final testing for production release, or maybe things have slowed down and release is not near. We'll just have to wait and see, but it is foolish to believe this will be the Next Big Thing(tm) at this point.

    1. Re:If it comes out and works well by Christophotron · · Score: 4, Informative

      BTRFS is not that unstable, really. I have been running it for a few months now, since the on-disk format was finalized. It's in a RAID 1 configuration across two 300GB drives on one of my home servers and it hasn't had a hiccup yet, even with lots of file I/O. I think it would like more than the CPU and RAM I gave it, but it's still less resource intensive than ZFS. AFAIK ZFS would not even run on that machine due to the 32-bit processor and only 512MB of RAM. Some of the features are not implemented yet, but it is certainly stable enough to test.

    2. Re:If it comes out and works well by benjymouse · · Score: 4, Informative

      So you are suggesting I can freeze IO to the machine, then run a snapshot command on NTFS?

      I would be glad to hear it.

      The Volume Shadow Copy Service (VSS) is always running (by default). Backup utilities - including the ones which come with Windows - use VSS to create a snapshot and perform backup from that point in time. It doesn't freeze IO; rather it goes to copy-on-write.

      On server versions you can also create snapshots interactively by using the vssadmin tool.

      Shares can be set up to create shadow copies multiple times per day. This is not copy on every write - but it *is* copy on write once a block is part of a snapshot. Any client (plugin needed for XP, IIRC) can display previous versions, which are the available snapshots.
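      The "copy on write once a block is part of a snapshot" behaviour can be sketched with a toy model. This is an illustration of the general copy-on-first-write idea only, not how VSS is actually implemented; the `Volume` class and its methods are hypothetical.

```python
class Volume:
    """Toy block device with copy-on-first-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        # One dict per snapshot: block index -> content preserved for it.
        self.snapshots = []

    def snapshot(self):
        """Take a snapshot; nothing is copied yet. Returns the snapshot id."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        """Overwrite a block, preserving the old content the first time only."""
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        """Read a block as it was when the snapshot was taken."""
        saved = self.snapshots[snap_id]
        return saved.get(index, self.blocks[index])


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B1")  # first write after the snapshot: old "b" is preserved
vol.write(1, "B2")  # second write to the same block: nothing more is copied
```

      In this model the snapshot holds exactly one preserved block after both writes, and untouched blocks are read straight from the live volume.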

      VSS actually goes beyond NTFS integration (which is probably why it is a service and not just a NTFS feature). Certain applications - e.g. Exchange, SQL Server and Hyper-V - also integrate with VSS. Instead of VSS operating directly on e.g. SQL Server files, it integrates with the server to create a snapshot for the database files. During restore the system knows how some applications took part in the shadow copy. This ensures that I can correctly restore *all* the files needed to bring a SQL server database back to a certain point-in-time. It also allows the SQL server to prune the log automatically.

      I have a Server2008R2 which has several Hyper-V images (development and testing). When I perform a backup of the server, VSS interacts with Hyper-V to perform backup of the virtual machines as well. A Server2003 which hasn't been set up to support VSS is actually "hibernated" by Hyper-V/VSS - then backed up - then brought back into running state. That could be considered "freezing IO", I suppose.

      --
      Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
    3. Re:If it comes out and works well by Cyberax · · Score: 4, Interesting

      "* *Actual* performance problems due to fragmentation - outside of a few corner cases - are basically nonexistent. "

      Yep. That's why I have to run defragmenter on our build server every week...

      Also, Windows is notoriously slow with file operations. It's not directly related to NTFS, but more to extremely inefficient VFS stack.

      "* Can you explain what you mean by "it's done above the VFS layer" ? Surely you're not trying to argue symlinks and shortcuts are the same thing ? "

      http://neosmart.net/blog/2006/vista-symlinks-revisited/

      "* RAID is handled at the block device level, not the filesystem level (and many, many people believe putting RAID into the "filesystem" is an architecturally bad thing, so that's hardly something it can be plainly criticised for)."

      However, filesystem-level RAIDs have a lot more functionality than block-level RAIDs. Look at ZFS or BTRFS.

      "* Do you have a source for up-to-date benchmarks ?"

      I have my own set of benchmarks. Well, NTFS on Windows is almost always slower (and quite often like 100 _times_ slower) than Linux filesystems.

      http://rsdn.ru/File/37054/benchmark.zip - this is the source.

      http://rsdn.ru/forum/philosophy/1710544.1.aspx - this is a post with benchmark results (in Russian, sorry - I can translate if you have any questions)

      http://rsdn.ru/forum/philosophy/1712431.aspx - this post contains this benchmark, slightly adapted.

      I regularly re-run these tests. So far, Windows is only getting slower compared to Linux.

      I've recently created a multithreaded version of this test. Well, let's say that NTFS sucks so badly, that it's hard to understand how MS has managed to achieve this.

  4. Re:Freedom ain't free by coerciblegerm · · Score: 5, Informative

    No, Sun used the CDDL because they hate the restrictions on GPL. The sharing issues go both ways, Sun wanted to keep some ownership. It's not like the BSD license exists just to spite GPL.

    This is the third time I've seen someone post something to this effect in the past week. I smell a smear campaign. Nonetheless, I'm calling BS here. Danese Cooper, one of the individuals who helped draft the CDDL, stated that they based the CDDL on the MPL "partially because it is GPL incompatible. That was part of the design when they released OpenSolaris." It was made deliberately GPL-incompatible, but this has nothing to do with 'restrictions' in the GPL.

  5. Can I remove a disk from it yet? by Daffy Duck · · Score: 5, Interesting

    http://www.opensolaris.org/jive/thread.jspa?threadID=131604
    http://www.opensolaris.org/jive/thread.jspa?messageID=270957

    Long story short: disk pools in ZFS can only grow, so don't make any mistakes unless you can afford to do a full dump and restore. Sun had been "working on" this for years. Anyone heard any news lately?

  6. Re:Freedom ain't free by Anonymous Coward · · Score: 4, Informative

    when it comes to license compatibility issues in general, it is the GPL which is decidedly incompatible with every other license.

    That's FUD if I've ever seen FUD. Check out the FSF's list of free software licenses; there are many licenses that ARE GPL-compatible. Excluding the GNU licenses themselves, there's at least Apache 2.0, Artistic 2.0, Berkeley DB, Boost, Modified BSD, CeCILL, Clear BSD, Cryptix, eCos 2.0, Educational Community 2.0, Eiffel Forum 2, EU Datagrid, Expat, FreeBSD (!), FreeType, iMatix, Independent JPEG Group, imlib2, Intel Open Source, ISC, NCSA, Netscape Javascript, OpenLDAP, Perl 5, PD, Python 2, Python up to 1.6, Ruby, SGI B 2.0, SML/NJ, Unicode, VIM 6.1+, w3c, webm, WTFPL 2, X11, XFree86 1.1, zlib and Zope 2.

    And keep in mind that these are *licenses*; in reality, most projects won't even bother making up their own licenses. "Decidedly incompatible with every other license". Sheesh!

    some GPL advocates tend to view those who choose a non-GPL license as trying to thwart GNU and/or Linux so they don't have to admit that maybe other licenses have terms and conditions that have their own merit.

    Who are those mysterious "GPL advocates" you mention, then? Also, what does this have to do with a situation where Sun really WAS trying to "thwart GNU and/or Linux", by its own admission?

    Look, the CDDL isn't a bad license per se, and the FSF page linked above lists it as a free software license, too, if a GPL-incompatible one (it does urge you not to use it for that reason, but hey, this *is* the FSF). But the original point was that Sun wanted to make sure that ZFS etc. would not be available on Linux, and they chose/engineered a GPL-incompatible license specifically to ensure that. You're not even contesting that anymore, so why are you still arguing about the whole thing?

    It's a fact. Sun didn't want Linux to get ZFS. Get over it.