Slashdot Mirror


Lustre File System Getting New Community Distro

darthcamaro writes "Oracle acquired a lot of open source tech from Sun that has since been forked — or is in the process of being forked. The open source Lustre high performance computing file system isn't on the list of forked projects, but it is getting a new, community-driven distro that is trying really hard to say that they're not officially a fork. 'Since April of 2010 there has been confusion in the community, and we've seen an impact in the business confidence in Lustre,' Brent Gorda, CEO and president of Whamcloud told InternetNews.com. 'The community has been asking for leadership, the commitment of a for-profit entity that they can rely on for support and a path forward for the technology.'"

14 of 68 comments

  1. What is Lustre File System by Icyfire0573 · · Score: 5, Informative

    From their website:
    http://wiki.lustre.org/index.php/Main_Page

    High Performance and Scalability

    For the world's largest and most complex computing environments, the Lustre file system redefines high performance, scaling to tens of thousands of nodes and petabytes of storage with groundbreaking I/O and metadata throughput.

    1. Re:What is Lustre File System by CAIMLAS · · Score: 4, Interesting

      At a functional level, Lustre (GPL) is to ZFS (CDDL) as CXFS (commercial) is to XFS (GPL) for SGI. They are the upper 'cluster' layer to take advantage of the underlying filesystems' capability. I believe this approach is divergent from that of GFS, due to the upper/lower approach, but I'm not that familiar with clustered filesystems.

      However: Arguably, Lustre on ZFS is a much better option due to ZFS's inherent capability superiority over XFS. I've liked XFS historically, but ZFS is so drastically superior to anything else out there (in terms of storage management, available capacity, and throughput), all 'out of the box', that it's a no-brainer to use zvols for things other than direct ZFS POSIX access. (For instance, they make great VM iSCSI targets, or local raw disks for VMs, or..)

      Side note: the zfsonlinux.org Linux port is being successfully used as the base volume manager for Lustre right now, so it is apparently quite capable/stable at that level. (zfsonlinux does not yet have ZFS POSIX support.) Lustre on ZFS apparently scales much better than the traditional LVM/RAID/etc. backend methods.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    2. Re:What is Lustre File System by fgodfrey · · Score: 2

      Obviously, we have internal benchmarks that tend to show that Lustre is good but I can't talk about specifics on those. What I can do, though, is link to this: http://www.cs.rpi.edu/~chrisc/COURSES/PARALLEL/SPRING-2009/papers/MADbench2-2009.pdf

      The stuff that I found most interesting is on page 12. The machines named Jaguar and Franklin are Crays running Lustre. Bassi and Jacquard are both running GPFS. On page 15 they claim that they can make up for the deficiency in Lustre's default settings for shared access to a single file by tuning it.

      Unsurprisingly, the type of operation you're doing ends up determining which filesystem is best for your application.

      In terms of scalability, from the Wikipedia page for the Jaguar system at Oak Ridge National Labs (a large Cray XT5), their Lustre filesystem is 10 petabytes with read/write performance of approximately 240GB/sec (not sure what benchmark was used to get that number).

      --
      Go Badgers! -- #include "std/disclaimer.h"
    3. Re:What is Lustre File System by Monkeedude1212 · · Score: 2

      At a functional level, Lustre (GPL) is to ZFS (CDDL) as CXFS (commercial) is to XFS (GPL) for SGI.

      And who says the IT world has too many confusing Acronyms?

    4. Re:What is Lustre File System by fgodfrey · · Score: 2

      It certainly *can* be used with commodity hardware, but the majority (or maybe all?) of Lustre installations are in high performance computing with thousands, or tens of thousands, of clients (usually the nodes of a supercomputer) accessing the shared file system.

      Where more commodity hardware can come in is the installation of the filesystem servers themselves. A system's Object Storage Targets and Metadata Servers (pieces of Lustre) can be external to the Cray and connected via some interconnect such as Infiniband. It should be noted that even the "commodity" hardware for the filesystem isn't exactly cheap if you want a huge capacity and high reliability...

      --
      Go Badgers! -- #include "std/disclaimer.h"
    5. Re:What is Lustre File System by lachlan76 · · Score: 2

      Lustre's the standard spelling outside of the US, so it wouldn't make much difference. More likely to be just the preference of the original author.

    6. Re:What is Lustre File System by drsmithy · · Score: 3, Funny

      Any reason Luster cannot be spelled correctly?

      Assuming the name is supposed to indicate something that shines, and not a sex addict, it is spelled correctly.

    7. Re:What is Lustre File System by CAIMLAS · · Score: 2

      I'll tell you where I got that idea: experience.

      Managing filesystems in lvm2, on raid cards - all with their own specific commands - is a real pain in the ass when you've got tens of hosts or more per admin, with many different roles and functionality.

      So then you've got to have snmp set up for each of those hosts (often with different controller cards) to monitor those RAID cards status (with the shitty RAID console tool which lacks anything resembling documentation). Then you've got to manage LVM, with its "easily" understood UUIDs. And then you've got to manage your filesystem - whatever that may be - to verify it's intact.

      With Lustre, you'll be adding another layer of 'shit I have to monitor'. Even if your hardware is 100% identical across all nodes, is that not a significant pain in the ass to manage for?

      ZFS just needs an HBA and 'bare' drives. Barring ZFS itself failing on you (not outside the bounds of reason), you've really only got two things to look out for with ZFS: data errors and disk failures, both of which are reported by zpool status and can be easily monitored. This is regardless of the version of the system, or even which OS it's running on (Solaris and its derivatives, FreeBSD, and yes, even Linux).
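
      As a rough illustration of the kind of monitoring described above, here is a minimal sketch that scans `zpool status` text for non-ONLINE devices and non-zero error counters. The sample output, the regular expression, and the `find_problems` helper are assumptions made for this example; real output differs between ZFS versions (for instance, large counters may be abbreviated), so treat it as a starting point rather than a complete parser.

```python
import re

# Illustrative sample only; real `zpool status` output varies by version and platform.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            sda     ONLINE       0     0     3
            sdb     FAULTED      0     0     0

errors: No known data errors
"""

# Matches rows such as "sda  ONLINE  0  0  3"; the header row is skipped
# because its READ/WRITE/CKSUM columns are not numeric.
DEVICE_ROW = re.compile(r"^\s+(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)")


def find_problems(status_text):
    """Return a list of human-readable problems found in `zpool status` text."""
    problems = []
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("state:"):
            state = stripped.split(":", 1)[1].strip()
            if state != "ONLINE":
                problems.append(f"pool state is {state}")
        elif stripped.startswith("errors:"):
            message = stripped.split(":", 1)[1].strip()
            if message != "No known data errors":
                problems.append(f"data errors: {message}")
        else:
            match = DEVICE_ROW.match(line)
            if match:
                name, state, rd, wr, ck = match.groups()
                if state != "ONLINE":
                    problems.append(f"{name} is {state}")
                if int(rd) or int(wr) or int(ck):
                    problems.append(
                        f"{name} has errors (read={rd} write={wr} cksum={ck})"
                    )
    return problems


if __name__ == "__main__":
    for problem in find_problems(SAMPLE):
        print(problem)
```

      On a real host the text could come from running `zpool status` through `subprocess.run(..., capture_output=True, text=True)` and passing its stdout to `find_problems`; an empty list would then mean nothing needs attention.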

      Unlike with RAID, I don't have to contend with the possibility of RAID-5 write hole.

      If I need to move to new hardware in a pinch, I can - trivially. It can be a desktop, assuming I've got a suitable controller for all the disks (nothing 'special').

      Rebuilds take a fraction of the time they do on hardware RAID (or mdraid, until recently), because only the actual data is replicated.

      Hell, you're going to get more performance out of ZFS than you will from the 'common' vendor (Dell, HP, IBM) RAID cards.

      I'm assuming Lustre will scale better on ZFS than on traditional methods because everything else scales better on ZFS. Growth is easier; management is easier; maintenance is easier. "Storage management" is actually somewhat enjoyable with ZFS, instead of the tedious skulduggery it usually tends to be otherwise.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
  2. Ended project by diegocg · · Score: 4, Informative

    According to insidehpc, Oracle has stopped developing Lustre and developers "have reportedly been encouraged to apply for other positions within the company".

    A group of Lustre users already created OpenSFS in October 2010 to continue developing Lustre.

  3. Very first thing to do is... by Daniel+Phillips · · Score: 3, Interesting

    Lose every tie to ZFS. Every. Single. One.

    Right now.

    Like every piece of software Oracle is involved in, ZFS is a big fat patent trap. Not only that, but ZFS is a lot slower than Ext3 and Ext4, and probably Btrfs[1] as well. There is absolutely no benefit to using ZFS as an object storage target, there is only the certainty of legal problems.

    [1] Oracle is involved with Btrfs too, so exercise due caution.

    --
    Have you got your LWN subscription yet?
    1. Re:Very first thing to do is... by Daniel+Phillips · · Score: 2

      Unfortunately, Lustre-on-ZFS [zfsonlinux.org] is substantially faster than Lustre on ext3, mainly because ZFS combines the features of an LVM and a filesystem

      That's bafflegab and incorrect. Or if you disagree, please explain why.

      --
      Have you got your LWN subscription yet?
    2. Re:Very first thing to do is... by Daniel+Phillips · · Score: 2

      And by the way, is your opinion based on benchmarks, or on hype from Sun? I strongly suspect the latter.

      --
      Have you got your LWN subscription yet?
    3. Re:Very first thing to do is... by TheRaven64 · · Score: 4, Informative

      ZFS is a big fat patent trap

      Oracle has released the ZFS code under the CDDL. While lots of Linux people hate the license, it has very strong patent retaliation clauses. Oracle explicitly grants you patent licenses for everything required to use ZFS via clause 2.1. All other contributors do via clause 2.2. Anyone exerting patents against ZFS immediately (well, within 60 days) loses this grant and has their (copyright) license terminated as well via clause 6.2.

      Since Sun accepted third-party contributions to ZFS under the OpenSolaris program, if Oracle tried exerting patents against any ZFS distributor then they would immediately have to stop distributing Solaris and then remove all of these contributions before they could start again.

      The ZFS patents are only an issue for a reimplementation of ZFS for Linux, and that's a problem caused by the GPL. Using the FreeBSD or NetBSD ports of ZFS (or even the FUSE port) gives you an explicit grant to the patents.

      --
      I am TheRaven on Soylent News
    4. Re:Very first thing to do is... by Anonymous Coward · · Score: 2

      Comparing ZFS to any of the EXT FSes is pointless, and utterly misses the point of ZFS.

      Do ext3/4 provide snapshotting?
      Do they provide deduplication?
      Do they perform hash checks to avoid duplicating files in the first place?
      Do they provide ANY of the dozens of features that set ZFS apart from other filesystems?

      Don't bother, the answer is no.
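
      To make the deduplication point concrete, here is a toy sketch of content-addressed block storage: each block is keyed by its SHA-256 digest, so writing identical data a second time only bumps a reference count. The `DedupStore` class and its methods are hypothetical names invented for this example; ZFS's real deduplication (which, as far as I know, also works per block on checksums rather than per file) is far more involved.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps one copy of each distinct block."""

    def __init__(self):
        self.blocks = {}    # digest -> block bytes
        self.refcount = {}  # digest -> number of writes referencing the block

    def write(self, block: bytes) -> str:
        """Store a block and return its digest; duplicates cost no extra space."""
        digest = hashlib.sha256(block).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = block
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def read(self, digest: str) -> bytes:
        """Return the block previously stored under this digest."""
        return self.blocks[digest]

    def stored_bytes(self) -> int:
        """Bytes physically held, counting each distinct block once."""
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    store = DedupStore()
    first = store.write(b"x" * 4096)
    second = store.write(b"x" * 4096)  # same content, no new storage used
    store.write(b"y" * 4096)
    print(first == second, store.stored_bytes())
```

      The trade-off this hints at is that every write pays for a hash and a table lookup, which is part of why deduplication is normally an opt-in feature.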

      And if you're going to disable those features on ZFS, then you have no reason to be using it in the first place, so you're effectively making an apples to zebras comparison to argue that because a certain technology is owned by a certain company you don't like, you should avoid it at all costs on the basis that other products that don't even come remotely close to offering the same functionality are somehow better at what it provides, by not providing it at all.

      Beyond that, Oracle is stuck abiding by the terms of the CDDL for as long as they continue to distribute ZFS under the CDDL, while keeping in mind that releases distributed under the CDDL remain under the CDDL.

      You're not one of those butthurt zealots fearmongering over ZFS on the grounds that the GPL forbids making a useful Linux implementation of it, are you? You certainly seem to be.