Slashdot Mirror


Red Hat announces GFS

PSUdaemon writes "Over at Kernel Trap they have an announcement that Red Hat has released GFS under the GPL and offers it through RHN. This could potentially be a very substantial offering from Red Hat."

25 of 240 comments

  1. executive summary? by Speare · · Score: 4, Insightful

    Would it be too much to ask that the writeup blurb include a ten-word summary of what makes GFS any different from any other Linux-ready filesystem? Many sites get slashdotted, making most links unusable for 12 hours or more.

    --
    [ .sig file not found ]
    1. Re:executive summary? by Night+Goat · · Score: 4, Insightful

      I'd be happy with just the mention that it IS a file system. I had no idea what GFS was until I read your post.

    2. Re:executive summary? by Anonymous Coward · · Score: 5, Informative

      From http://sources.redhat.com/cluster/gfs/

      GFS (Global File System) is a cluster file system. It allows a cluster of computers to simultaneously use a block device that is shared between them (with FC, iSCSI, NBD, etc...). GFS reads and writes to the block device like a local filesystem, but also uses a lock module to allow the computers to coordinate their I/O so filesystem consistency is maintained. One of the nifty features of GFS is perfect consistency -- changes made to the filesystem on one machine show up immediately on all other machines in the cluster.

      and

      GFS has no single point of failure, is incrementally scalable from one to hundreds of Red Hat Enterprise Linux servers, and works with all standard Linux applications.

      Dunno if any other linux "file systems" have all that. :p

    3. Re:executive summary? by Pros_n_Cons · · Score: 5, Informative

      Yes, here is a news.com article on it.

      The GFS software lets files be stored in a single file system shared by numerous servers. The information can reside on servers themselves or on a storage area network.

      The software is used to speed data access and replicate information so it's still available even if individual machines fail. It's useful for the two conventional types of clusters: groups of machines linked so one can take over for another in case of a problem, and groups linked as part of a sprawling supercomputer.

      Red Hat GFS is tuned to work with Oracle's 9i RAC, database software that can spread across multiple clustered machines, and work with Red Hat's cluster software for ensuring services remain available despite computer problems.

      --

      -- "of course thats just my opinion, I could be wrong." --Dennis Miller
    4. Re:executive summary? by Anonymous Coward · · Score: 4, Informative
      I'm confused. If it's a shared disk setup, how can there not be a single point of failure? If your FC/iSCSI disk box goes down, where's your storage gone? Obviously I've missed something, so if anyone would care to explain it to me I'm all ears...

      Yes your architecture can be designed with a single point of failure. However, in practice you will want to connect this to a SAN. The SAN will be full of dually connected disks, have 2 main controllers, at least 2 power supplies, be connected to two switch banks via 2 HBA's, and each server will be connected to each switch. For added safety, direct connect another SAN to the first, and mirror all data between the SAN's.

      But mainly, a good SAN is designed to be dually redundant from the ground up. Kind of like those (Fujitsu? Panasonic?) servers that have 2 standard mobo's in them and sync all data between cpu's, so if one dies the whole system is still alive.
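
      As a rough illustration of why doubling every component helps, here is a back-of-the-envelope sketch in Python. The 99% per-component uptime is a made-up figure, failures are assumed to be independent (real hardware does not guarantee that), and the function names are hypothetical.

```python
def parallel_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures."""
    return 1.0 - (1.0 - single) ** copies


def series_availability(parts: list[float]) -> float:
    """Availability of a chain in which every part must be up at once."""
    total = 1.0
    for availability in parts:
        total *= availability
    return total


# Hypothetical figure: each controller, switch, and HBA is up 99% of the time.
single = 0.99

# One controller -> one switch -> one HBA: any failure takes the path down.
one_path = series_availability([single, single, single])

# The same chain with every component doubled.
two_paths = series_availability([parallel_availability(single, 2)] * 3)

print(f"single path: {one_path:.6f}")
print(f"dual path:   {two_paths:.6f}")
```

      The point is only the shape of the arithmetic: a chain of single parts multiplies the risks, while a redundant pair fails only when both halves fail at once.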



      What I need is a simple mirroring system for two failover servers, without single point of failure.

      What kind of servers? The best method will depend on the type of server.



      It's very frustrating. DFS and FRS seem to work just fine under Windows, so why hasn't Linux got it?

      Because you haven't paid for it yet, be it in cash or time.

  2. Re:Compatibility? by Pros_n_Cons · · Score: 4, Informative

    "Will it run on distros other than Redhat?"

    Of course it will. It's GPL and looking for inclusion into the kernel, just like everything else from Red Hat. If you expect them to optimize it for SuSE, Mandrake, or Gentoo you're mistaken, but sometimes they supply Debian packages for things they write. If it doesn't get accepted upstream for whatever reason, it's up to vendors to supply the packages, not the writer of the software.

    --

    -- "of course thats just my opinion, I could be wrong." --Dennis Miller
  3. Re:Free for $2,200? by Rik+van+Riel · · Score: 5, Informative
    Just because it's open source doesn't mean you can download it for free.


    Though in this case, you can download GFS and all the related software for free. Just go to the cluster project page.
  4. Really? by cubicledrone · · Score: 5, Funny

    GFS on the GPL? From RHN? WTF?

    Normally I'd ask what's the BFD? but most people would just LOL. Then other people would probably want to know if it comes on DVD or FTP, but the FAQ will explain it JIT. Now what would be really cool would be a PDA that would run it with an RGB display, but it might need extra RAM.

    HTH.

    --
    Business isn't willing to pay for products, innovation and careers, so we get brands, mortgage commercials and layoffs.
  5. GFS is cool! by Anonymous Coward · · Score: 5, Informative

    GFS allows multiple redundant storage computers to serve a whole lot of other servers for data availability purposes. It isn't just another FS like EXT* or JFS or .... It's a transparent networkable filesystem with failover and all of the other goodies needed to implement a hardcore enterprise level solution for serving needs like a million hits a minute sites, or filesharing with 50,000 users...

  6. GFS defined... by jarich · · Score: 5, Informative
    From the website....

    Red Hat Global File System (GFS) is an open source, POSIX-compliant cluster file system and volume manager that executes on Red Hat Enterprise Linux servers attached to a storage area network (SAN). It works on all major server and storage platforms supported by Red Hat. The leading (and first) cluster file system for Linux, Red Hat GFS has the most complete feature set, widest industry adoption, broadest application support, and best price/performance of any Linux cluster file system today.

    Red Hat GFS allows Red Hat Enterprise Linux servers to simultaneously read and write to a single shared file system on the SAN, achieving high performance and reducing the complexity and overhead of managing redundant data copies. Red Hat GFS has no single point of failure, is incrementally scalable from one to hundreds of Red Hat Enterprise Linux servers, and works with all standard Linux applications.

    Red Hat GFS is tightly integrated with Red Hat Enterprise Linux and distributed through Red Hat Network. This simplifies software installation, updates, and management. Applications such as Oracle 9i RAC, and workloads in cluster computing, file, web, and email serving can become easier to manage and achieve higher throughput and availability with Red Hat GFS.

    Highlights

    Performance

    Red Hat GFS helps Red Hat Enterprise Linux servers achieve high IO throughput for demanding applications in database, file, and compute serving. Performance can be incrementally scaled for hundreds of Red Hat Enterprise Linux servers using Red Hat GFS and storage area networks constructed with iSCSI or Fibre Channel.

    Availability

    Red Hat GFS has no single-point-of-failure: any server, network, or storage component can be made redundant to allow continued operations despite failures. In addition, Red Hat GFS has features that allow reconfigurations such as file system and volume resizing to be made while the system remains on-line to increase system availability. Red Hat Cluster Suite can be used with GFS to move applications in the event of server failure or for routine server maintenance.

    Ease of Management

    Red Hat GFS allows fast, scalable, high throughput access to a single shared file system, reducing management complexity by removing the need for data copying and maintaining multiple versions of data to ensure fast access. Integrated with Red Hat Enterprise Linux (AS, ES, and WS) and Cluster Suite, delivered via Red Hat Network, and supported by Red Hat's award winning support team, Red Hat GFS is the world's leading cluster file system for Linux.

    Advanced features

    Scalable to hundreds of Red Hat Enterprise Linux servers.
    Integrated with Red Hat Enterprise Linux 3 and delivered via Red Hat Network, comprehensive service offerings, up to 24x7 with one-hour response.
    Supports Intel X86, Intel Itanium2, AMD AMD64, and Intel EM64T architectures.
    Works with Red Hat Cluster Suite to provide high availability for mission-critical applications.
    Quota system for cluster-wide storage capacity management.
    Direct IO support allows databases to achieve high performance without traditional file system overheads.
    Dynamic multi-pathing to route around switch or HBA failures in the storage area network.
    Dynamic capacity growth while the file system remains on-line and available.
    Can serve as a scalable alternative to NFS.

    Product Information

    Supported on Red Hat Enterprise Linux AS, ES, and WS.
    Red Hat Cluster Suite support available on Red Hat Enterprise Linux 3.
    Support for a wide variety of Fibre Channel and iSCSI storage area network products from leading switch, HBA, and storage array vendors.
    Mature, industry-leading, field-proven, open source cluster file system.

  7. `GFS' by Anonymous Coward · · Score: 4, Interesting

    I was reading only the other day about the Google File System. So there are now two acronyms which are both GFS and which both refer to a distributed file system. That's not going to get confusing. Nope, not at all.

  8. What about security? by ee96090 · · Score: 4, Interesting

    I don't see security in the list of features. Calling this a Global file system is a bit presumptuous, considering the lack of security prevents it from being used outside of a closed LAN segment.

    --
    Gustavo J.A.M. Carneiro
  9. Newcomer? by cduffy · · Score: 5, Informative

    They bought this technology when they bought Sistina. Sistina has been working on GFS for a long time.

  10. Re:Good Distributed Filesystems? by finkployd · · Score: 4, Insightful

    Coda is NOT the successor to AFS; DFS (of Transarc fame) was, and it was really really good. Probably the best distributed filesystem out there. Unfortunately, setting up DCE (the environment that DFS ran in) was complicated and only really large institutions used it. Since it was not profitable, IBM (the last major vendor supporting it) has discontinued it. And hampered the Open Group's attempts to open source it, I might add. :(

    Finkployd

  11. Re:Free for $2,200? by Anonymous Coward · · Score: 4, Insightful

    Oh, but they said it was free, they didn't say it was free.

    Don't you know the difference between "free" and "free"?

    If not, let me explain:

    1) Internet Explorer is free, for instance, as you don't pay for it;

    2) Internet Explorer is not free because you cannot have its source to modify and make it more secure;

    3) Professional distros like Red Hat and Suse are not free because you have to pay to have it;

    4) These same professional distros are free because you can compile the source yourself whenever you can.

    Got it? If you don't understand this, you might believe it next time someone says "Linux is not free". Don't be fooled! It is free!

    Now, the relevant quote is:

    "We're looking for people to help us work on this project so we can eventually get it included into the Linux kernel. Comments, suggestions, patches, and testers are more than welcome."

    See the part that mentions "get it included into the Linux kernel"? It means it will be free.

    Now, these superb guys at RH really should charge for a professional product with support. Soon, very soon, they might discover they must do what Sun does: have a personal low cost (maybe gratis) version, so that people can tweak it, use at home, report bugs etc.

    I, for one, thank them for all the fish and get the message that everyone must contribute, no matter how little, and not just wait for them to make things for us.

    And don't use English to discuss such things. Or, better yet, change English so that it becomes fit for use. I suggest stop using free to mean gratis. Just use gratis, like in "There's no gratis lunch".

  12. GFS has a troubled license history by freelunch · · Score: 4, Informative

    GFS was well-liked at supercomputing centers I have worked with until Sistina dropped the GPL license in favor of proprietary. They did this very suddenly and without warning. It pissed off a lot of potential users and the open source community. It has since fallen out of favor.

    This move by Red Hat gives new life (and resources) to GFS beyond the OpenGFS Project that has also been continuing to work on the code.

    Another recent development in this area is HP's decision to productize Lustre. Lustre is perhaps the most prominent and promising HPC filesystem.

    SGI also announced a major deal last week involving Lustre:

    The new file system is expected to sustain write rates in excess of 8GB/sec and demonstrate single client write rates of more than 600MB/sec. To achieve this performance, the new file system will leverage Lustre, an open source, object-oriented file system with development led by Cluster File System Inc., with funding from DOE. Lustre currently is used on four of the top five supercomputers, including the PNNL cluster based on 1,900 Intel® Itanium® 2 processors.

  13. Re:Good Distributed Filesystems? by Salamander · · Score: 4, Interesting
    None of these filesystems allows regular users to access remote filesystems (superuser privileges are required for mounting) like with FTP

    No, and they don't cook your dinner for you either, but if that's what you're expecting then you're completely missing the point of what a cluster filesystem is for. Granted, the name "Global File System" is a misnomer, but it has been a misnomer for several years now and if you have anything more than a dilettante's interest in this you should know what GFS really does.

    What's so hard about getting this stuff right?

    Yeah, everything's easy when you're not the one doing it. Tell me what you do, and I'll tell you how wimpy that is. If you think that maintaining consistency across multiple machines in a cluster without compromising performance is easy, you're a fool. If you think that high availability of any form is easy, then you're an idiot. If you think putting those two together doesn't lead to an exponential increase in complexity and hence difficulty, you're a moron.

    If you want a filesystem stub (not really a complete filesystem) that lets you access files stored half-way around the world over a standard protocol, look into one of the many efforts based on WebDAV. If you want a true global filesystem, look into OceanStore so you can appreciate some of the problems that are involved. If you want to be able to change the filesystem namespace without being root, look into Plan 9. Do your own googling. None of those are what GFS is about.

    --
    Slashdot - News for Herds. Stuff that Splatters.
  14. Re:yes, that's actually the basic idea by cjsnell · · Score: 4, Informative

    While you do have the basic idea down, your suggestion of a clustering FS isn't the best for your application. You are describing "vertical scaling", which GFS and clusters will be very good for. Web serving is not a good place for a cluster--"horizontal scaling" is how you scale most web sites and web applications. Typically, for web serving, you will have a block of content that can fit on the hard disk of the average web server.

    The best way to deliver this to the user (in this case, the slashdotter) would be to replicate this content onto a group of web servers using rsync(1). Each machine serves the content off of its local drive and can use its memory to cache/buffer the disk reads. In front of the web servers, you would put a wire-speed load balancer, such as a Nortel Alteon content switch or a Foundry Networks ServerIron switch. The load balancer, when configured properly, will take care of monitoring your web servers. It would take me too long to explain it here, but these switches are sophisticated enough that they can take failed webservers out of the load-balancing group for everything from a ping failure to a content failure.
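
    To make the "ping failure to content failure" idea concrete, here is a minimal Python sketch of the kind of check such a switch performs. It is not how an Alteon or ServerIron is actually configured; the server names, the `fetch` callback, and the marker string are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional


def healthy_servers(
    servers: List[str],
    fetch: Callable[[str], Optional[str]],
    expected_marker: str,
) -> List[str]:
    """Return the servers that should stay in the load-balancing group.

    `fetch` returns the page body, or None when the server does not answer
    at all (the "ping failure" case). A body that lacks `expected_marker`
    is treated as a "content failure".
    """
    pool = []
    for server in servers:
        body = fetch(server)
        if body is None:
            continue  # no answer: drop from the pool
        if expected_marker not in body:
            continue  # answered, but with the wrong content: drop as well
        pool.append(server)
    return pool


# Hypothetical responses standing in for real HTTP requests.
responses: Dict[str, Optional[str]] = {
    "web1": "<html>OK front page</html>",
    "web2": None,                          # machine is down
    "web3": "<html>500 error</html>",      # up, but serving garbage
}

pool = healthy_servers(list(responses), responses.get, "OK")
print(pool)
```

    In a real deployment `fetch` would be an HTTP request with a short timeout, and the check would run every few seconds so a recovered server can rejoin the group.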

    The key to designing web architectures is simplicity. Web serving does not need fancy clustering software or distributed filesystems. Very few web sites will not fit on the hard disk of your average 1U server. Keep it simple and put the intelligence up front in the switch.

    What is GFS good for? Many things! It would be great for a large computational cluster that had a very large (multi-terabyte) dataset and high disk I/O requirements. Anything that has a requirement to provide one or more very large files to a number of cluster nodes would be perfect for GFS.

    Chris

  15. Re:I don't think so by Sunspire · · Score: 5, Informative

    Red Hat's HA clustering software is also GPL but it doesn't run on other distros (and is not supported by Red Hat on other distros).

    Of course Red Hat doesn't support other distros, but what makes you think the clustering software doesn't work on them? All the bits and pieces are available for download. If you find any "if (distro != RH) exit()" code in the fully GPL'd cluster toolchain, please feel free to remove them. There's no secret sauce to RHEL, it's all open source and everyone is free to copy and modify the code.

    There's already one distro that includes the new GPL'ed GFS filesystem out as of today, Lineox. And Red Hat will be working to get GFS up to spec for inclusion in the official Linux kernel according to posts made to the kernel mailing list.

    The code itself is open source, that is true, but "Red Hat Enterprise Linux subscription [is] required"

    This only refers to that point that Red Hat is not interested in selling to you unless you have a RHEL subscription. That $2,200 gets you GFS up and running on your RHEL cluster in a turnkey fashion, and it gives you the option to purchase further 24/7 one-hour response support contracts. You're free to assemble it all into a working system by yourself if you want.

    --
    It's like deja vu all over again.
  16. Re:Not quite, but OpenAFS would be a good option by BitchKapoor · · Score: 4, Informative

    AFS is for distributed computing, GFS is for fault-tolerant cluster computing, similar to SGI's CXFS. Calling it a "global file system" is a misnomer.

  17. Re:Redhat vs. Novell by Sunspire · · Score: 5, Insightful

    Don't go down that road... Red Hat's contributions to Linux absolutely dwarf SuSE's to date in no uncertain terms.

    But let's just focus on the most recent efforts of both companies. Realistically no distro is going to include Yast, but it's still a very good move since it will allow SuSE ISO images to be distributed without the existing restrictions in the future and I'm thankful to Novell for it. On the other hand, Red Hat buying Sistina for $31 million and setting their arguably only asset GFS free and then working on including it in the Linux kernel proper directly also benefits Novell and other Linux distributors.

    "lately has been locking down their Linux offerings"? How about giving some concrete examples. Last time I checked RHEL was 100% open source and available for download, and so is Fedora Core for the home user. SuSE has been cleaning up their act since they got purchased by Novell, but to play them against Red Hat, who has been completely 100% behind open source since day one, as somehow a more free alternative is laughable.

    --
    It's like deja vu all over again.
  18. Re:I don't think so by SuperQ · · Score: 4, Informative

    Actually, they do support other distros. Sistina Software, which was acquired by RedHat, is down the street from my office. They still show SuSE as a supported distro.

    I am personally going to try installing GFS on some Debian systems for a U of M student group who recently got a donation of some used Fibre-Channel disk.

    What I'm hoping for now is support for ia64, and other platforms. It would also be nice if GFS could now be ported to other OS's like AIX and Solaris.

  19. See also OpenGFS and OCFS by sneakerfish · · Score: 4, Informative

    There is also OpenGFS http://opengfs.sourceforge.net/ and Oracle Cluster File System http://oss.oracle.com/projects/ocfs/

    These may go away since their major reason for existing was that Sistina had closed up source for GFS.

    Thanks RedHat. With LVM2, GFS, my EMC SAN and my cluster of Gentoo boxes (ya, sorry 'bout that part) I'm going to have lots of fun.

  20. Re:What is a SAN by Anonymous Coward · · Score: 4, Informative
    It sounds like a SAN pretends to be a single large block device. (i.e. a disk)

    A SAN can be a single large block device. The specifics will depend on the SAN, but you should be able to arrange the disks in any RAID configuration (or none), and present 1 or more block devices to 1 or more servers.

    When I manipulate a sector on the disk, the SAN is actually manipulating the same sector on multiple identical drives.

    Not necessarily the same sectors, depending on whether we're talking physical or logical sectors, but basically that's correct.

    So from this standpoint, it sounds similar to RAID, except for the redundant power supplies.

    Well most servers come with 2+ power supplies for fail-over, so even the redundant power supplies isn't different.

    From the description, it sounds like SAN has another important difference from RAID. The SAN, redundant power supplies, redundant drives, and all, is a separate system from the computer.

    There are disk arrays you can buy that direct attach to computers. These too would be separate units from the computer (benefit: if the computer dies, reattach the pack to a separate computer. A lot simpler than having to remove/insert each disk).

    Unlike RAID, which pretends to be a single block device, the SAN can be accessed by multiple CPU's.

    Depending on the RAID device, you can configure multiple logical devices across multiple physical devices. Dell's PERC's generally allow this (ok, not across separate disks, but if there are 10 disks, you could have 2 sets of 5 disks in RAID5).
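
    For the 10-disk example, the trade-off in usable space can be worked out in a few lines of Python. The 73 GB disk size is a hypothetical figure and the function name is just for illustration; the only rule used is that RAID5 gives up one disk's worth of space per set for parity.

```python
def raid5_usable(disks: int, disk_gb: float) -> float:
    """Usable capacity of one RAID5 set: one disk's worth of space goes to parity."""
    if disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (disks - 1) * disk_gb


DISK_GB = 73  # hypothetical disk size, not taken from the post

two_sets = 2 * raid5_usable(5, DISK_GB)  # two logical devices of 5 disks each
one_set = raid5_usable(10, DISK_GB)      # one logical device across all 10 disks

print(f"2 x 5-disk RAID5:  {two_sets} GB usable")
print(f"1 x 10-disk RAID5: {one_set} GB usable")
```

    Splitting into two sets costs one extra disk of parity, but each set can survive its own single-disk failure.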

    A RAID device can be accessed by multiple CPU's in the case of a 2+ way server. So, you mean multiple servers, not multiple CPU's.

    A SAN can be connected to many servers - 64, 128, 1024, etc, depending on the SAN and your budget.

    (Therefore, you don't want to put an ordinary filesystem onto it, such as Ext3.) Therefore the design of GFS, which allows multiple cpu's to concurrently manipulate the filesystem.

    Yes, the FS will depend on the use. If you can hookup multiple servers to the same disk, then you need an FS that can handle that. If you are planning on dynamically growing the device, then you need an FS that can handle that.

    Do I fundamentally misunderstand?

    Parts you understand. A SAN also has many other uses, like disk consolidation, functionality, and management, but these issues and uses will really depend on your environment.

    For example, if you generally buy a server with a bunch of disk in case you ever need it, then you probably have a big range of % use on your servers. A SAN lets you consolidate that disk space in one place. Perhaps you have one server running at 30% total disk use, another at 99%, and another at 50%. Would be nice to dynamically allocate the disk from the unused servers to the disk on the 99%, but barring inefficient methods, this is very, very difficult. With a SAN, I can grow those disk devices on the fly and make sure each server always has X amount free (probably around 20% free space). When you're talking about many servers, or lots of unused space, this can add up to a big ROI.
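
    Putting numbers on that example, here is a small Python sketch. It assumes, purely for illustration, that each of the three servers was bought with 100 GB of local disk; the server names and the helper function are hypothetical. The 30%/99%/50% usage and the 20% free-space target come from the paragraph above.

```python
def volume_size_for(used_gb: float, free_fraction: float = 0.20) -> float:
    """Smallest volume size that leaves `free_fraction` of the volume unused."""
    if not 0 <= free_fraction < 1:
        raise ValueError("free_fraction must be in [0, 1)")
    return used_gb / (1.0 - free_fraction)


LOCAL_DISK_GB = 100  # hypothetical: every server bought with the same local disk
used = {"server_a": 30, "server_b": 99, "server_c": 50}  # GB in use per server

local_total = LOCAL_DISK_GB * len(used)
san_total = sum(volume_size_for(gb) for gb in used.values())

for name, gb in used.items():
    print(f"{name}: {gb} GB used -> {volume_size_for(gb):.2f} GB SAN volume")
print(f"local disk purchased: {local_total} GB")
print(f"SAN space allocated:  {san_total:.2f} GB")
```

    Under these assumptions the nearly full server gets a volume larger than its old local disk, and the total allocated still comes in below what was bought as local disk.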

    Or, let's say you use a proprietary FS like Veritas for your Enterprise servers. Buying automatic mirroring for those servers may add up to a lot of money, so instead invest once in your SAN's disk mirroring product and use this for those servers (yeah this may be just as or more costly, depending on your SAN).

    And there are other functionalities, like server independent snapshots and mirrors - your FS may handle snaps or mirroring, but can a separate server mount that? With a SAN, that can be possible. Imagine your webserver mounts a RO mirror of the data that is only changeable via a more protected server. You could do that via NFS, but it would come at a speed cost. With the SAN, you're not limited to the

  21. Re:... compared to InterMezzo, CODA or oMFS? by thule · · Score: 4, Informative

    The difference is how it tries to solve the problem. NFS works over IP and accesses files at the inode level. This requires the server system or device to be running RPC and the NFS protocol. Most network filesystems work in a similar way. You have servers and clients accessing the servers via some protocol.

    Now imagine a filesystem designed for servers that allows them to access the filesystem at a block level directly via the shared bus. Let's say a parallel SCSI bus (or any bus that allows more than one host, e.g. iSCSI, Fibre Channel, Firewire). Imagine how fast it would be to access a shared disk over Fibre Channel! The problem is that if two servers mount the filesystem at the same time it would normally corrupt the filesystem. People with SAN's (Storage Area Networks) solve this problem by making mini virtual hard drives and setting ACL's on them so only one host can access that virtual hard drive at a time. This could lead to a waste of space.

    GFS solves the SAN problem by using a Distributed Lock Manager (DLM). No one host is the server of the filesystem, but writes/locks are coordinated via the DLM. Now multiple hosts *can* share a virtual hard drive or real block device and not corrupt the filesystem. If a host dies, no problem, there is no server for the filesystem!
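
    A toy Python analogy may help show what "coordinated via the DLM" buys you. This is not GFS code and not a real distributed lock manager: it runs in one process, the two "nodes" are just threads, and the class and variable names are made up. A real DLM also has to cope with network messages and with nodes dying while they hold locks.

```python
import threading
from collections import defaultdict


class ToyLockManager:
    """Hands out one exclusive lock per named resource (here, a block number)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, resource: int) -> threading.Lock:
        # Guard the table itself so two nodes never create two locks
        # for the same resource.
        with self._guard:
            return self._locks[resource]


shared_disk = {0: 0}  # block number -> contents (a counter, for illustration)
manager = ToyLockManager()


def node(writes: int) -> None:
    """One 'host' doing read-modify-write cycles on block 0."""
    for _ in range(writes):
        with manager.lock_for(0):
            value = shared_disk[0]      # read the block
            shared_disk[0] = value + 1  # write it back while holding the lock


nodes = [threading.Thread(target=node, args=(10_000,)) for _ in range(2)]
for t in nodes:
    t.start()
for t in nodes:
    t.join()

print(shared_disk[0])
```

    Because each read-modify-write happens under the block's lock, no update is lost no matter how the two nodes interleave. Take the lock away and nothing stops two nodes from reading the same old value and overwriting each other, which is the corruption described above.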

    Let's give an example. Say you have a firewire enclosure. Now plug that firewire hard drive into two computers. This, by the way, may still require a patch to sbp so that Linux will tell the enclosure to allow both hosts to talk to it at the same time. Now that the hard drive is talking to both computers you could run GFS on it and access the data at the block level by both systems. Now start serving email via IMAP (load balanced), *both hot*, no standby. Now kill a box. IMAP still works. No remounting, no resynchronization.

    Pretty amazing if you ask me! This technology is pretty rare. IBM has GPFS. SGI has Clustered XFS. Both are pretty expensive. GFS? RedHat just re-GPL'd it! Microsoft? Ummm. I think they are just now getting logical volume management.

    GFS also has nice features like journaling (kinda required for this sorta thing), ACL's, quotas, and online resizing.

    Now tell me Linux isn't enterprise!