Slashdot Mirror


Native ZFS Is Coming To Linux Next Month

An anonymous reader writes "Phoronix is reporting that an Indian technology company has been porting the ZFS filesystem to Linux and will be releasing it next month as a native kernel module without a dependence on FUSE. 'In terms of how native ZFS for Linux is being handled by this Indian company, they are releasing their ported ZFS code under the Common Development & Distribution License and will not be attempting to go for mainline integration. Instead, this company will just be releasing their CDDL source-code as a build-able kernel module for users and ensuring it does not use any GPL-only symbols where there would be license conflicts. KQ Infotech also seems confident that Oracle will not attempt to take any legal action against them for this work.'"

28 of 273 comments

  1. Re:Freedom ain't free by h4rr4r · · Score: 4, Insightful

    Sun used the CDDL just to make sure Linux never got ZFS. Even that move is not going to save Solaris; only open-sourcing it earlier would have done that. I say this as a Linux user who likes Solaris and thinks it will be a shame to see it die. Well, I like it once the GNU tools are installed; the Solaris versions sucked.

    They are both quite open; how free they are is something some might argue about.

  2. Good Article by bill_mcgonigle · · Score: 5, Insightful

    No, really. I had a bunch of questions going in, and they were all answered. This is rare enough to warrant a shout out to Michael Larabel.

    I disagree with some of his subjective claims, like x86_64 being a substantive limitation or ZFS on Linux remaining niche (I guess that depends on how you define the niche...), but he got the national lab project, the zpool version, and the Oracle (née Sun) patent problem. Kudos.

    FreeBSD 9 is probably where ZFS will wind up finding a proper home, I'm guessing.

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    1. Re:Good Article by h4rr4r · · Score: 4, Insightful

      How do you think it is not a substantive limitation?

      My phone runs Linux and is not x86 of any shape or register size, nor is my workstation, nor are many other machines I have running Linux. This is just like people who think Flash working only on 32-bit x86 Linux is good enough.

      If FreeBSD ever gets a good ZFS implementation expect lawsuits to commence.

    2. Re:Good Article by mysidia · · Score: 3, Informative

      Because ZFS is not production quality on a 32-bit CPU, or with less than about 2GB of additional RAM available for the ARC, even on Solaris, where ZFS is most mature. Bare minimum for ZFS: 1GB RAM, 64-bit processor.

      If you have a 32-bit CPU or less than 2GB of system RAM, use UFS or ext3 and forget about ZFS on such hardware, unless you want to experience pain (system hangs, memory starvation, and crashes or panics due to the 32-bit address space squeeze causing fragmentation and, ultimately, an inability to allocate the ARC efficiently).

    3. Re:Good Article by mysidia · · Score: 3, Interesting

      It's worth mentioning that the latest version of Windows Server (2008 R2) is 64-bit only as well.

      And ZFS has always had 64-bit as minimum system requirements for production systems, even on Solaris.

      That is, 32-bit is considered okay for limited testing, unsuitable for production use, particularly for use with zpools larger than a few hundred GB in size or so.

      If you have a 1TB or larger storage pool with ZFS, you need 2GB of RAM and a 64-bit CPU to have something acceptable and stable. This is true whether you use Solaris or BSD.
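
      The rule of thumb above can be written down as a small check. This is only a sketch: the thresholds are the ones quoted in this comment, not official limits, and the function names are hypothetical. The pointer-width helper reports the width of the running Python build, which is a proxy for the platform rather than a real CPU probe.

```python
import struct

GIB = 1 << 30

# Thresholds taken from the comment above; rules of thumb, not official limits.
MIN_POINTER_BITS = 64
MIN_RAM_BYTES = 2 * GIB


def interpreter_pointer_bits():
    """Pointer width of the running Python build (a proxy, not a CPU probe)."""
    return struct.calcsize("P") * 8


def zfs_guidance_problems(pointer_bits, ram_bytes):
    """Return a list of reasons the machine falls short of the quoted guidance."""
    problems = []
    if pointer_bits < MIN_POINTER_BITS:
        problems.append("%d-bit platform; 64-bit is recommended" % pointer_bits)
    if ram_bytes < MIN_RAM_BYTES:
        problems.append("%.1f GiB RAM; at least 2 GiB is recommended" % (ram_bytes / GIB))
    return problems
```

      An empty list means the machine meets the quoted guidance; for example, a 32-bit box with 512MB of RAM would fail both checks.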

      I consider it a good thing that the person porting to Linux is actually enforcing the basic 64-bit requirement. Maybe that way fewer people who don't read docs and 'system requirements' sheets will get burned by failing to notice that "32-bit is not suitable for enterprise use", and then say ZFS on Linux 'sucks' because they screwed up basic configuration and deployment requirements?

    4. Re:Good Article by TheRaven64 · · Score: 4, Informative

      My phone runs Linux and is not x86 of any shape or register size, nor is my workstation, nor are many other machines I have running Linux.

      I can't speak for the Linux version, but ZFS on FreeBSD needs x86-64 for three reasons:

      First, and most simply, this is the platform that all of the ZFS developers use, so it is the one that is most tested. This doesn't mean that it won't work elsewhere, it just means that it is not well tested anywhere else.

      The second is a performance consideration. ZFS uses a lot of 64-bit arithmetic for computing checksums and so on. On most 32-bit platforms, doing 64-bit arithmetic means that you need to split the operands between two registers, effectively halving the number of GPRs that you have to work with. On x86-32, this basically limits you to 2 registers, which cripples performance - every operation involves some stack spills. This is an x86-specific limitation. On ARM, for example, you have 16 32-bit registers, which can be viewed as 8 64-bit registers for certain instructions. Doing a lot of 64-bit arithmetic on an ARM chip still doesn't generate as much register pressure as even doing 32-bit operations on x86.
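
      To illustrate what "splitting the operands between two registers" means, here is a minimal sketch of a 64-bit addition carried out on 32-bit halves. It is an illustration only, not ZFS code; the function names are made up for this example, and each Python variable stands in for one 32-bit register.

```python
MASK32 = 0xFFFFFFFF


def split64(value):
    """Split a 64-bit unsigned value into (high, low) 32-bit halves."""
    return (value >> 32) & MASK32, value & MASK32


def add64_with_32bit_halves(a, b):
    """Add two 64-bit unsigned values using only 32-bit-wide pieces."""
    a_hi, a_lo = split64(a)  # two "registers" for the first operand
    b_hi, b_lo = split64(b)  # two more for the second operand
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32     # 1 if the low halves overflowed 32 bits
    lo = lo_sum & MASK32
    hi = (a_hi + b_hi + carry) & MASK32  # wraps like real 64-bit hardware
    return (hi << 32) | lo
```

      A single 64-bit add ties up four register-sized values plus a carry, which is the register pressure described above.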

      The final limitation is memory. ZFS likes to have 600MB or so of kernel memory. On x86, the divide between kernel and userspace memory is typically done using segmentation. The kernel has one segment, marked in the GDT as requiring ring-0 permission to access. When you switch to kernel space, the segment register points to this entry. In userspace, you use other segments (sometimes just one per process, sometimes one for stack, one for heap, and so on, sometimes one for all processes with some churn between them). With other implementations, this is done at the page level, although that's more expensive. The kernel's memory, however, is always mapped into the userspace process's address space - it just isn't always accessible.

      The reason for this is that x86 lacks sensible TLB controls. If the kernel's address space were not mapped in this way, then every system call would require a TLB flush, which would impact performance. The more address space that you allocate to the kernel, the less you give to userspace apps. If the kernel has 2GB of address space, userland apps can only have 2GB each. On ARM, each TLB entry is tagged with an ASID. The kernel and userspace programs' address spaces are entirely separate, but transitions between the two don't require a TLB flush because the userspace process can't see entries tagged with the kernel's ASID.
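
      The address-space trade-off is plain arithmetic, sketched below. The helper name is hypothetical, and the model assumes a single shared virtual address space in which whatever the kernel reserves is unavailable to each process.

```python
GIB = 1 << 30


def user_address_space(address_bits, kernel_bytes):
    """Bytes of virtual address space left for a process after the kernel's share."""
    total = 1 << address_bits
    if kernel_bytes > total:
        raise ValueError("kernel reservation exceeds the address space")
    return total - kernel_bytes
```

      With 32-bit addresses, reserving 2GB for the kernel leaves 2GB per process, and reserving 1GB leaves 3GB; the same reservation on a 64-bit address space is negligible.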

      Rather than saying that ZFS requires 64-bit, or requires x86-64, it's more accurate to say that it won't work (well) on x86-32 due to inherent limitations of the platform. That doesn't mean that it won't work well on other 32-bit or 64-bit architectures which are less braindead.

      --
      I am TheRaven on Soylent News
  3. Re:Open Source != Free Software by guruevi · · Score: 3, Insightful

    I don't know if that's true. I know you probably can't redistribute the kernel with the CDDL bits but you can redistribute them separately (CDDL = Common Development and Distribution License). Then all you have to do is make sure that your software (or customer) installs the right bits and then you can get a pretty decent NAS box.

    Besides the legal issues, I would love to see them tackle the technical issues. ZFS itself is very clean in code, very well documented and pretty simple once you get down to the wire. The issue (and selling point) is going to be performance and upkeep and, for commercial implementations, support. If the upkeep is going to be similar to BSD's implementation (several versions behind) or the performance as bad as FUSE, people are just going to stick to OpenSolaris (or one of its commercially supported descendants like Nexenta).

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  4. Hey if Phoronix says it, it has to be true! by Beelzebud · · Score: 3, Funny

    I hear that every install of ZFS for Linux comes with a pre-installed Steam client, and a free copy of Team Fortress 2 For Linux!

  5. Re:who cares?! by EvanED · · Score: 3, Insightful

    ZFS has becoming vapor ware since apple announced snow kitty wasnt gunna support it.

    I do not think that word means what you think it means.

  6. Re:Freedom ain't free by h4rr4r · · Score: 3, Insightful

    No, they are a company that exists to make money. Saving Solaris would make them more money. Very simple. Corporations do not hate like that, they only do what they must to maximize profit.

    BSD is a fine license, it was created for a real purpose, not to just protect a doomed product.

  7. If it comes out and works well by Sycraft-fu · · Score: 5, Informative

    Seems a little early to be putting faith in that. Its feature list looks good, on par with other modern desktop file systems like HFS+ and NTFS. However, it is currently unstable. When will that be fixed? Who knows? Maybe it moves full steam ahead and we have a stable, capable file system next month. Maybe the project loses steam and languishes, and 4 years from now it is still "unstable" and "coming soon."

    You can't really say how well it'll work until there is stable code to test. Remember, designing a file system isn't the really hard part. I'm not saying it is trivial work or that it is unimportant, but it is by far the easier part of all this. You can write out a specification that sounds great on paper, but then you have to implement it. That is the much harder part. You have to make it fast, stable, not corrupt data, able to do everything it should and so on.

    This is part of the reason why NTFS on Linux has been so tricky. It is actually pretty well documented in the Windows Internals book, and other places, but it is a complex file system. FAT, on the other hand, is real simple and thus not hard to implement.

    As an example you can look at driver sizes. The NTFS driver in Windows is 1.6MB. The FAT driver, on the other hand, which supports multiple versions of FAT, is only 200k. The NTFS kernel driver is one of the very largest in the system; only the ATi video driver (much larger) and TCP/IP stack (a bit larger) are bigger than it on my system.

    So we'll see what happens with btrfs. As of late, there's not been much activity. The last version update was June 2009. Maybe they are rolling up final testing for production release, or maybe things have slowed down and release is not near. We'll just have to wait and see, but it is foolish to believe this will be the Next Big Thing(tm) at this point.

    1. Re:If it comes out and works well by Christophotron · · Score: 4, Informative

      BTRFS is not that unstable, really. I have been running it for a few months now, since the on-disk format was finalized. It's in a RAID 1 configuration across two 300GB drives on one of my home servers and it hasn't had a hiccup yet, even with lots of file I/O. I think it would like more than the CPU and RAM I gave it, but it's still less resource intensive than ZFS. AFAIK ZFS would not even run on that machine due to the 32-bit processor and only 512MB of RAM. Some of the features are not implemented yet, but it is certainly stable enough to test.

    2. Re:If it comes out and works well by EvanED · · Score: 3, Informative

      NTFS doesn't do COW, but it's had snapshotting for a while under the name "volume shadow copy". This was added in XP or 2003, and even given somewhat of a UI in the form of "previous versions" in Vista.

    3. Re:If it comes out and works well by benjymouse · · Score: 4, Informative

      So you are suggesting I can freeze IO to the machine, then run a snapshot command on NTFS?

      I would be glad to hear it.

      The Volume Shadow Copy Service (VSS) is always running (by default). Backup utilities - including the ones which come with Windows - use VSS to create a snapshot and perform backup from that point in time. It doesn't freeze IO; rather it goes to copy-on-write.

      On server versions you can also create snapshots interactively by using the vssadmin tool.

      Shares can be set up to create shadow copies multiple times per day. This is not copy on every write - but it *is* copy on write once a block is part of a snapshot. Any client (plugin needed for XP, IIRC) can display previous versions which are available snapshots.
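
      The "copy on write once a block is part of a snapshot" idea can be shown with a toy model. This is a sketch of the general technique only, not of how VSS is implemented; the `Volume` class and its methods are invented for the example. A block's old contents are saved the first time it is overwritten after a snapshot, and later writes to the same block copy nothing more.

```python
class Volume:
    """Toy block device with copy-on-first-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        # One dict per snapshot: block index -> contents preserved at first overwrite.
        self.snapshots = []

    def snapshot(self):
        """Create a snapshot; nothing is copied at this point."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        """Overwrite a block, preserving the old contents for snapshots that still need them."""
        for saved in self.snapshots:
            if index not in saved:  # copy only on the first write after the snapshot
                saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        """Read a block as it was when the snapshot was taken."""
        saved = self.snapshots[snap_id]
        return saved.get(index, self.blocks[index])
```

      Taking a snapshot is cheap because unmodified blocks are shared with the live volume; only blocks that change afterwards cost extra space.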

      VSS actually goes beyond NTFS integration (which is probably why it is a service and not just a NTFS feature). Certain applications - e.g. Exchange, SQL Server and Hyper-V - also integrate with VSS. Instead of VSS operating directly on e.g. SQL Server files, it integrates with the server to create a snapshot for the database files. During restore the system knows how some applications took part in the shadow copy. This ensures that I can correctly restore *all* the files needed to bring a SQL server database back to a certain point-in-time. It also allows the SQL server to prune the log automatically.

      I have a Server2008R2 which has several Hyper-V images (development and testing). When I perform a backup of the server, VSS interacts with Hyper-V to perform backup of the virtual machines as well. A Server2003 which hasn't been set up to support VSS is actually "hibernated" by Hyper-V/VSS - then backed up - then brought back into running state. That could be considered "freezing IO", I suppose.

      --
      Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
    4. Re:If it comes out and works well by Cyberax · · Score: 4, Interesting

      "* *Actual* performance problems due to fragmentation - outside of a few corner cases - are basically nonexistant. "

      Yep. That's why I have to run defragmenter on our build server every week...

      Also, Windows is notoriously slow with file operations. It's not directly related to NTFS, but more to an extremely inefficient VFS stack.

      "* Can you explain what you mean by "it's done above the VFS layer" ? Surely you're not trying to argue symlinks and shortcuts are the same thing ? "

      http://neosmart.net/blog/2006/vista-symlinks-revisited/

      "* RAID is handled at the block device level, not the filesystem level (and many, many people believe putting RAID into the "filesystem" is an architecturally bad thing, so that's hardly something it can be plainly criticised for)."

      However, filesystem-level RAIDs have a lot more functionality than block-level RAIDs. Look at ZFS or BTRFS.

      "* Do you have a source for up-to-date benchmarks ?"

      I have my own set of benchmarks. Well, NTFS on Windows is almost always slower (and quite often like 100 _times_ slower) than Linux filesystems.

      http://rsdn.ru/File/37054/benchmark.zip - this is the source.

      http://rsdn.ru/forum/philosophy/1710544.1.aspx - this is a post with benchmark results (in Russian, sorry - I can translate if you have any questions)

      http://rsdn.ru/forum/philosophy/1712431.aspx - this post contains this benchmark, slightly adapted.

      I regularly re-run these tests. So far, Windows is only getting slower compared to Linux.

      I've recently created a multithreaded version of this test. Well, let's say that NTFS sucks so badly, that it's hard to understand how MS has managed to achieve this.

  8. Re:Freedom ain't free by larry+bagina · · Score: 3, Interesting

    There's a despair poster, I believe, with a caption along the lines "it could be, your main purpose in life, is to provide a warning to others". (Damn it, the internet made me check ... "It could be that the purpose of your life is only to serve as a warning to others.")

    ZFS's purpose was not to be a next generation file system, but to encourage next generation file systems to be built. Free Software has a tendency to get stuck at "good enough" sometimes. And someone has to come along and show that there is a better way. Competition is good. Sometimes it's internal (gcc vs egcs), sometimes it's commercial (CVS vs perforce and bitkeeper).

    What if ZFS was GPL? What if it went into Linux? It might get incremental tweaks, but it would stagnate at "good enough". Instead, btrfs, hammer, etc were developed -- much better, much cleaner file systems.

    ZFS has some cute tricks. What could be better than taking a sledgehammer to a disk drive without causing problems? But ultimately, ZFS would hold Linux back.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  9. Re:Open Source != Free Software by quercus.aeternam · · Score: 3, Insightful

    This is both Open and Free, just not quite as free as Stallman would like.

    CDDL licensed code can be freely distributed and modified, so long as it is compiled with a compatible license.

    This is why BSD has no issues with including ZFS. The BSD license is less restrictive than the GPL.

  10. Re:Freedom ain't free by coerciblegerm · · Score: 5, Informative

    No, Sun used the CDDL because they hate the restrictions on GPL. The sharing issues go both ways, Sun wanted to keep some ownership. It's not like the BSD license exists just to spite GPL.

    This is the third time I've seen someone post something to this effect in the past week. I smell a smear campaign. Nonetheless, I'm calling BS here. Danese Cooper, one of the individuals who helped draft the CDDL, stated that they based the CDDL on the MPL "partially because it is GPL incompatible. That was part of the design when they released OpenSolaris." It was made deliberately GPL-incompatible, but this has nothing to do with 'restrictions' in the GPL.

  11. Can I remove a disk from it yet? by Daffy+Duck · · Score: 5, Interesting

    http://www.opensolaris.org/jive/thread.jspa?threadID=131604
    http://www.opensolaris.org/jive/thread.jspa?messageID=270957

    Long story short: disk pools in ZFS can only grow, so don't make any mistakes unless you can afford to do a full dump and restore. Sun had been "working on" this for years. Anyone heard any news lately?

    1. Re:Can I remove a disk from it yet? by diegocg · · Score: 3, Informative

      The ZFS design makes this very difficult. Btrfs, on the other hand, has supported this feature for a long time, thanks to a nice design feature called backrefs.

    2. Re:Can I remove a disk from it yet? by catmistake · · Score: 3, Funny

      Sun has had this tutorial out for a couple years now.

  12. Re:ZFS recap by Anonymous Coward · · Score: 3, Informative

    We've heard much about ZFS, but being a slashdotter, I can't recklessly go on and RTFA. So, maybe someone here can recap its main benefits. Maybe a power point slide?

    Here's a good PDF on it:

            http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/zfslast.pdf

    Here's the PDF being presented by the co-creators, Jeff Bonwick and Bill Moore:

            http://blogs.sun.com/video/entry/zfs_the_last_word_in

    Three parts, one hour each. Streamable on blip.tv as well as a downloadable M4V file.

    Two ten-minute videos:

            http://www.youtube.com/watch?v=gthel59G56c
            http://www.youtube.com/watch?v=OdHUub462pM

    Though I recommend you set aside the three hours (even if it's over several days) to really get a good understanding of how things work.

  13. Re:Freedom ain't free by Anonymous Coward · · Score: 3, Informative

    This is the third time I've seen someone post something to this effect in the past week. I smell a smear campaign.

    Nonetheless, I'm calling BS here. Danese Cooper, one of the individuals who helped draft the CDDL, stated that they based the CDDL on the MPL "partially because it is GPL incompatible. That was part of the design when they released OpenSolaris." It was made deliberately GPL-incompatible, but this has nothing to do with 'restrictions' in the GPL.

    And Cooper's assertion was rejected by Simon Phipps, Sun's Chief Open Source Officer for quite a while (before leaving Oracle in the last few weeks):

    http://www.opensolaris.org/jive/message.jspa?messageID=55013#55008
    http://en.wikipedia.org/wiki/Common_Development_and_Distribution_License#GPL_incompatibility_controversy

  14. Re:Freedom ain't free by Thundersnatch · · Score: 3, Insightful

    Instead, btrfs, hammer, etc were developed -- much better, much cleaner file systems.

    How can filesystems that don't exist in stable release form yet be "better" than ZFS?

    ZFS is far ahead of btrfs in terms of stability, features, and usability. Btrfs doesn't have parity RAID, dedupe, or replication yet. These are critical features for large-scale systems. In short, it isn't even close to ZFS. ZFS is also "cleaner" in my opinion, in both design and UI. Oracle funding most btrfs development also raises a question of btrfs momentum now that they own ZFS and Solaris.

  15. Re:Freedom ain't free by Jaxoreth · · Score: 3, Funny

    What could be better than taking a sledgehammer to a disk drive without causing problems?

    Shooting it with a .45?

    --
    In general, it is safe and legal to kill your children. -- POSIX Programmer's Guide
  16. That's not the GPL's fault by symbolset · · Score: 3, Insightful

    That's not the GPL's fault. It's the fault of the IP lawyers who are dicing permissions exceedingly fine. The GPL is designed to guarantee certain freedoms at the cost of others. It does its job very well, and is well architected with a lot of forethought considering we're only on version three after 21 years. At least one of those two revisions can be blamed not on the faults of the license but on the changing legal and IP environment.

    Believe it or not once upon a time if you wrote some code somebody found interesting you just sent it to them. No patents. No copyrights. No approvals from management or legal. You just sent it, happy that someone else might benefit from not redoing the work you'd done once already. The idea of profiting from the derivatives they might make, or the derivatives of the derivatives, was simply not an idea that would occur to a normal person. If you had suggested such a thing at that time we'd have thought it hilarious.

    And now I have to point to the onion on my belt, which was the fashion in my day.

    --
    Help stamp out iliturcy.
    1. Re:That's not the GPL's fault by Znork · · Score: 3, Insightful

      Yes, it is the GPL's fault. The CDDL is a per-file license. It places absolutely no restrictions on what other code can be combined with it in other files.

      As the CDDL is deliberately GPL incompatible, had there not been any other issues, one can assume that Sun would have added 'may not be distributed together with GPL licensed code'. The CDDL/GPL incompatibility was on purpose, it was a feature asked for by Solaris engineers. Had the Linux kernel been BSD licensed, the CDDL would have been made incompatible with the BSD license.

      Generally, fault implies some form of control over the issue. Under the circumstances, the only party with any control in this case would have been Sun, and as they would have redesigned the license until it was not compatible, it's quite obvious where any 'fault' should be assigned.

      And unless the Oracle buyout has changed some attitudes within Sun for the better (heh), it's also quite naive of KQ Infotech to believe that Sun/Oracle would not go after them for violating the point of the license, as opposed to the actual text of the license (assuming any wider distribution). Standing is hardly a necessary prerequisite for a company of Oracle's size to grind a small company into dust in the courts (and both Oracle and Sun would have standing as kernel contributors to sue any distributor of a ZFS+Linux kernel combo).

      Personally I can't say I consider it either a big loss or much to complain about. ZFS was a huge (HUGE) deal for Solaris, considering the painfully anemic storage stack it had in disksuite+ufs, but for any OS with a more modern volume management and file system stack it merely boiled down to a few nice features and some drawbacks, depending on your underlying storage architecture (SAN capabilities, etc).

  17. Re:Freedom ain't free by Anonymous Coward · · Score: 4, Informative

    when it comes to license compatibility issues in general, it is the GPL which is decidedly incompatible with every other license.

    That's FUD if I've ever seen FUD. Check out the FSF's list of free software licenses; there are many licenses that ARE GPL-compatible. Excluding the GNU licenses themselves, there's at least Apache 2.0, Artistic 2.0, Berkeley DB, Boost, Modified BSD, CeCILL, Clear BSD, Cryptix, eCos 2.0, Educational Community 2.0, Eiffel Forum 2, EU Datagrid, Expat, FreeBSD (!), FreeType, iMatix, Independent JPEG Group, imlib2, Intel Open Source, ISC, NCSA, Netscape Javascript, OpenLDAP, Perl 5, PD, Python 2, Python up to 1.6, Ruby, SGI B 2.0, SML/NJ, Unicode, VIM 6.1+, w3c, webm, WTFPL 2, X11, XFree86 1.1, zlib and Zope 2.

    And keep in mind that these are *licenses*; in reality, most projects won't even bother making up their own licenses. "Decidedly incompatible with every other license". Sheesh!

    some GPL advocates tend to view those who choose a non-GPL license as trying to thwart GNU and/or Linux so they don't have to admit that maybe other licenses have terms and conditions that have their own merit.

    Who are those mysterious "GPL advocates" you mention, then? Also, what does this have to do with a situation where Sun really WAS trying to "thwart GNU and/or Linux", by its own admission?

    Look, the CDDL isn't a bad license per se, and the FSF page linked above lists it as a free software license, too, if a GPL-incompatible one (it does urge you not to use it for that reason, but hey, this *is* the FSF). But the original point was that Sun wanted to make sure that ZFS etc. would not be available on Linux, and they chose/engineered a GPL-incompatible license specifically to ensure that. You're not even contesting that anymore, so why are you still arguing about the whole thing?

    It's a fact. Sun didn't want Linux to get ZFS. Get over it.