Slashdot Mirror


Distributed Filesystems for Linux?

zoneball asks: "What would you use for a distributed file system for Linux? I have several GNU/Linux machines running at home, and wanted to be able to see more or less the same file tree (especially all the ~user directories) regardless of which machine I'm connected to, and where the traversal into the distributed file system space is largely transparent for the end-user. Are there any URLs or documents that compare the features, bugs, road map, stability of these and other distributed filesystems? Which offers the best stability and protection from future obsolescence?"

Zoneball looked at three distributed filesystems; here are his thoughts:

"OpenAFS was the solution I chose because I have experience with it from college. For performance, AFS was built with an intelligent client-side cache, but it does not handle network disconnects gracefully. There are other alternatives out there, though.

Coda appears to be a research fork from an earlier version of AFS. Coda supports disconnected operations. But, the consensus on the Usenet (when I looked into filesystems a while ago) was that Coda was still too 'experimental.'

Intermezzo looks like it was started with the lessons learned from Coda, but (again from Usenet) people have said that it is still too unstable and that it crashes their servers. The last 'news' on their site is dated almost a year ago, so I don't even know whether it's still being developed."

So if you were to recommend a distributed filesystem for Linux machines, would you choose one of the three filesystems listed here, or something else entirely?

54 of 375 comments

  1. NFS by mao+che+minh · · Score: 4, Informative
    I know that this is going to be the most common answer, but just go with NFS. It's not the most secure option around, but obviously the simplest to implement and the best documented.

    NFS Linux FAQ
    Howto #1
    Howto #2

    If you find yourself needing help, try asking people at Just Linux forums, or trying the NFS mailing list.

    1. Re:NFS by vandan · · Score: 3, Informative

      I have to agree.
      It takes about 5 minutes to get an understanding of what you need. After setting it up it just works.
      NFS is a great ... Network File System. No need to re-invent the wheel here.
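
      As a rough sketch of what those five minutes cover, a home-LAN setup comes down to one line on the server and one on each client. The host name "fileserver" and the 192.168.1.0/24 network are assumptions for illustration, not details from this thread.

      ```text
      # /etc/exports on the server: share /home read-write with the LAN
      /home  192.168.1.0/24(rw,sync)

      # /etc/fstab on each client: mount the share over the local /home
      fileserver:/home  /home  nfs  rw,hard  0  0
      ```

      After editing /etc/exports, "exportfs -ra" on the server reloads the export table. File ownership only lines up if UIDs and GIDs match on every machine, which is the problem NIS or LDAP is usually brought in to solve.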

    2. Re:NFS by gallir · · Score: 3, Informative
      Handling connection/disconnection is your automounter daemon "autofs".

      Disconnection in a DFS means a certain degree of replication: you are still able to work on your files even if you have no access to your repository, or you are off-line. Autofs doesn't do that. Although some rsync scripts can partially solve the problem, that is not a scalable or viable workaround for several users.

      NIS, on the other hand, is not a good solution for WAN connections or different networks. Should you use this kind of solution, I'd take a look at OpenLDAP instead.

      --
      sgis ddo ekil t'nod i
    3. Re:NFS by Anonymous Coward · · Score: 1, Informative
      ...just go with NFS. It's not the most secure option around...

      Well, you could do what people have been doing for well over a decade and use Secure NFS. It should come by default with any decent Unix system, and it supports a variety of cryptosystems (Kerberos, Diffie-Hellman, etc.).

      I'm not sure if it's fully-supported in Linux, though. A check of recent Usenet articles seems to indicate that Linux doesn't support it all that well (perhaps things have changed?). But that doesn't mean NFS itself isn't a secure option.

    4. Re:NFS by cduffy · · Score: 2, Informative

      [NFS] gives you a single, unified view of a file system tree that can span many machines.

      Only if your mount tables are the same everywhere, and they need to be kept in sync on the client side. By contrast, in AFS, change where a volume is mounted anywhere -- and it's changed everywhere. Add a new volume on a different server on one client, and it's there in that same place on all of them. No mucking with the automounter, no distributing config files to all your machines, none of that mess.

      AFS makes administration tremendously easier after one's scaled the initial learning curve. It performs far, far better than NFS on large networks (and merely somewhat better on smaller ones). It makes security policy easier to implement and maintain -- even if someone roots one of your boxen.

      AFS is just plain Good Stuff.

    5. Re:NFS by g4dget · · Score: 3, Informative
      Only if your mount tables are the same everywhere,

      That's what NIS is for. Furthermore, the flexibility of being able to set up machines with different views of the network is crucial in many applications. None of my workstations or servers actually have the same mount tables: they all get some stuff via NIS, and some stuff is modified locally. The restrictions AFS imposes are just unacceptable.

      AFS makes administration tremendously easier after one's scaled the initial learning curve.

      AFS is an administrative nightmare. Apart from the mess that ACLs cause and the problems of trying to fit real-world file sharing semantics into AFS's straightjacket, just the number of wedged machines due to overfull caches and its complete disregard for UNIX file system semantics cause no end of support hassles. Then, there is the poor support for Windows clients. We started out using AFS because it sounded good on paper, but it was a disaster in terms of support, and we got rid of it again after several years of suffering.

      It performs far, far better than NFS on large networks (and merely somewhat better on smaller ones).

      AFS's caching scheme works better than what NFS is doing for small files, but that case is fast and easy anyway. AFS's approach falls apart for just the kind of usage where it would matter most: huge files accessed from many machines.

      Both NFS and AFS have very serious problems. But between the two, NFS is far simpler than AFS, is easier to administer in complex real-world environments, respects UNIX file system semantics better, and works better with large files. I can guardedly recommend NFS or SMB ("there is nothing better around, so you might as well use it"), but I can't imagine any environment for which AFS is a reasonable choice anymore. The only thing AFS ever had going for it, as far as I'm concerned, is that it was fairly secure at a time when NFS had no security whatsoever, but that is not an issue anymore.

    6. Re:NFS by cduffy · · Score: 2, Informative
      The restrictions AFS imposes are just unacceptable.

      Really, now? Tell me what you're trying to do that AFS won't allow you (not *how* you're trying to do it, but *what* you're trying to do), and I'll tell you how to do it with AFS.

      Apart from the mess that ACLs cause and the problems of trying to fit real-world file sharing semantics into AFS's straightjacket

      WHAT?! I could say the same thing about UNIX's user/group/world semantics, and far more defensibly. ACLs allow all sorts of useful things; I can have a log directory that's append-only except to sysadmins, for instance; or have a mail spool directory writable to only its owner and processes on the mailserver; or lots of useful things that standard UNIX semantics don't support. Who's wearing the straitjacket, again?

      AFS's approach falls apart for just the kind of usage where it would matter most: huge files accessed from many machines.

      That's a fairly rare case (for me, maybe not for you) -- just about all my huge (multi-gig) files are things like databases accessed by only one machine.

      ...I can't imagine any environment for which AFS is a reasonable choice anymore...

      It's a reasonable choice for mine (a fairly small [approx 40-user] software company with lots of servers and few Windows clients -- and lots of potential for need to scale). On the high end, it's also a reasonable choice for IBM, who uses it internally (incidentally, the friend who introduced me to AFS used to be a sysadmin there). NFS may be capable of scaling up to my network -- but it sure as hell couldn't scale up to IBM's.

  2. Yup NFS by laugau · · Score: 2, Informative

    NFS + Automounter plus NIS and you get everything you ever wanted. NFS is fast, well known and documented and transparent.

    1. Re:Yup NFS by roro_parnucious · · Score: 2, Informative

      You can rsync your passwd/shadow files from the master periodically, and use /etc/nsswitch.conf to say that you want to try NIS first, local login second.

      Hell, if you're going to go to the trouble, you can just use the rsync method and not NIS! But I digress...
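
      For illustration only, the "NIS first, local second" lookup order could look like the fragment below. Sources are tried left to right; this is a sketch, and many sites prefer "files nis" so that local accounts such as root still resolve quickly when the NIS server is unreachable.

      ```text
      # /etc/nsswitch.conf (fragment): consult NIS first, then the local files
      passwd: nis files
      shadow: nis files
      group:  nis files
      ```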

  3. Self Certifying File System by nescafe · · Score: 5, Informative

    I would use SFS, the Self Certifying File System. Assuming all the systems you are using are supported, it offers global, secure access to anything you care to export.

  4. Well it depends... by Tsugumi · · Score: 5, Informative
    For my money, NFS in a LAN, AFS over a WAN; it really depends on the size of the network you're trying to play with.

    Since OpenAFS forked from the old Transarc/IBM codebase, it looks as if it has a real future. It's used by a load of educational and research institutions (notably CERN), as well as Wall Street firms.

    1. Re:Well it depends... by Anonymous Coward · · Score: 1, Informative

      " One nasty glitch is that (at least in some installations) AFS file permissions are maintained separately from standard *nix file permissions. In other words, "chmod o-r *" does nothing to stop other people from reading your files"

      That is not a glitch, that is a feature. You use the "fs setacl" AFS command to set an ACL (Access Control List). AFS supports seven different permission bits, r(ead), l(ist), i(nsert), d(elete), w(rite), (loc)k and a(dminister).
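
      A minimal sketch of how those ACLs are set in practice, using the standard "fs" commands. The directory path and the user "bob" are hypothetical, and the small "run" helper is an assumption added here so the commands are only printed unless RUN=1 is set on a machine that actually has an AFS client.

      ```shell
      # Print each command by default; set RUN=1 to execute it against a real cell.
      run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi; }

      # Hypothetical directory; AFS ACLs apply to directories, not single files.
      DIR=/afs/example.com/user/alice/private

      run fs setacl -dir "$DIR" -acl system:anyuser none   # drop access for everyone else
      run fs setacl -dir "$DIR" -acl bob rl                # let bob list and read
      run fs listacl -path "$DIR"                          # show the resulting ACL
      ```

      This is the AFS counterpart of the "chmod o-r" in the quoted complaint: the restriction goes on the directory's ACL rather than on each file's mode bits.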

  5. NFS/BOOTP by rf0 · · Score: 2, Informative

    I'm sure others here will suggest NFS, but why not just go whole hog and set up your clients to boot off a server, then mount the same NFS filesystem? That way you get total transparency without having to make sure that NFS is always mounted.

    Just my $0.02

    Rus

  6. Background on DFS by El+Pollo+Loco · · Score: 5, Informative

    Check here for a good background on DFS. It also has a quick table comparison of the popular programs, and a walkthrough to set up Intermezzo.

  7. PVFS by Kraken137 · · Score: 5, Informative

    We use PVFS at work to give us a high-performance network filesystem for use with our clusters.

    http://parlweb.parl.clemson.edu/pvfs/

  8. openmosix by joeldg · · Score: 5, Informative

    I run an openmosix cluster with the openmosix filesystem here at work. Three computers.. no problems...
    If you want to take a look..
    http://lucifer.intercosmos.net/index.php
    linkage and I am going to be placing some tutorials up. -joeldg

  9. Re:Format, Install Windows Server 2000 or 2003 by sneakybilly · · Score: 2, Informative

    DFS is just replication. DFS works in a number of ways; in its simplest form you could use rsync to achieve the same thing. A combination of NFS and rsync could be used to achieve its more complex form.

  10. Ye olde Samba by Anonymous Coward · · Score: 4, Informative

    Samba works fine. I personally have approximately 5 samba mounts in my filesystem totally transparent for anybody who was to walk up and use my computer.

    No need to unnecessarily complicate things here, samba is simple to set up and functions great.

  11. Re:permissions? by phorm · · Score: 4, Informative

    That's what NIS is for. You can schedule regular downloads of group/passwd files, which are updated in a NIS database stored on a master server, and passed down to "slave" servers.

  12. Re:permissions? by Anonymous Coward · · Score: 1, Informative

    use NIS (or LDAP or SMB authentication or any other centralized authentication method)

  13. Intermezzo does appear to be a current project by Dr.Zap · · Score: 5, Informative

    While there is no new news posted on the site, there are current tarballs on the ftp server, as recent as 5.9.03. (but that file appears to be a redux, last update to code seems to be 3.13.03)

    The sourceforge page for the project (http://sourceforge.net/projects/intermezzo) shows status as production/stable but the info there looks stale too.

  14. NIS + NFS by Anonymous Coward · · Score: 1, Informative

    I'll agree with the majority here, and say NIS+NFS is the way to go. But I'll deviate a bit and say use FreeBSD as your NIS/NFS server. It's dead simple to set up, and FreeBSD is rock solid. I have an NIS+NFS server here running FreeBSD, with Slackware boxes mounting from it. Works like a charm.

  15. Re:permissions? by Dysan2k · · Score: 4, Informative

    To be honest, big time, but a lot of people forget the other side of life with NFS, and that's NIS/NIS+. The yp-tools include pretty good NIS support, but I'm not sure about NIS+. I would use neither in a production environment personally, but a common auth system which is easy to manage would solve that issue.

    Could also look into LDAP (VERY complex, no good starting point that I've been able to find) and Kerberos auth methods as well.

    Should give you a central point for uids/usernames. But NFS does not have transparent mounting that I'm aware of so that you could mount, say the /home directory of 5 computers onto / on a central system and it display all the mounts simultaneously. For example:

    <ECODE>
    CPU1 contains: /home/foo
                   /home/baz

    CPU2 contains: /home/tic
                   /home/tac

    CPU3 contains: /home/toe

    On CPU4, you'd do the following:
    mount CPU1:/home /home
    mount CPU2:/home /home
    mount CPU3:/home /home

    And you'd end up with, on CPU4:
    /home/tic
    /home/tac
    /home/toe
    /home/foo
    /home/baz
    </ECODE>

    If there is a way to do this, please lemme know. I've heard people talk about it in the past, but haven't seen anything come of it yet.

    --
    -What have you contributed lately?
  16. NFS & autofs by Greg@RageNet · · Score: 3, Informative

    What you are looking for is 'autofs', which has been used extensively in solaris and linux for years (forever). You can set up an NFS share and then have autofs mount/unmount it on demand. The advantage is that if the share is not in use it's unmounted and the machine will be less vulnerable to hanging if the NFS server goes down. See the AutoFS Howto for more information on setting it up.

    -- Greg

    --
    Slashdot, would a spell-checker for posting be too much to ask? It's not rocket science!
  17. NFS is not a DFS by purplebear · · Score: 5, Informative

    Just so you all know. NFS is a network accessible FS. A DFS can also be network accessible from clients, but it physically resides on multiple systems.

  18. Re:Mirroring file system by dlakelan · · Score: 4, Informative

    Whoa, you definitely need Unison.

    Unison will synchronize any two file trees in The Right Way (TM).

    Get the gtk version for interactive conflict resolution.

    --
    ((lambda (x) (x x)) (lambda (x) (x x))) http://www.endpointcomputing.com a scientific approach to custom computing.
  19. Re:permissions? Automounter by Greg@RageNet · · Score: 3, Informative
    autofs will do this.. for example, you would have an auto.master like:
    /home auto.home
    and auto.home like:
    foo cpu1:/home/foo
    baz cpu1:/home/baz
    tic cpu2:/home/tic
    tac cpu2:/home/tac
    ....
    The result would be all the right /home/* directories.
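
    With more than a handful of users, the map above could be generated rather than typed. The following is only a sketch: the function name make_auto_home and the "user host" input format are assumptions, and it presumes every home directory lives at /home/<user> on its server.

    ```shell
    # Read "user host" pairs on stdin and print one automount map entry per line.
    make_auto_home() {
        awk 'NF == 2 { printf "%s\t%s:/home/%s\n", $1, $2, $1 }'
    }

    # Hypothetical user/host list matching the example above.
    printf '%s\n' 'foo cpu1' 'baz cpu1' 'tic cpu2' 'tac cpu2' | make_auto_home
    ```

    Redirecting the output into the auto.home map keeps a single user list as the source of truth for where each home directory lives.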
    --
    Slashdot, would a spell-checker for posting be too much to ask? It's not rocket science!
  20. AFS is great but a pain in the butt by Anonymous Coward · · Score: 1, Informative

    If you have a lot of time to invest in setting everything up, training all your users in AFS, and administering a complex system, then AFS is a great choice. It provides more flexible ACLs and is generally more secure than NFS (if a user hacks root on your system, they can't compromise the AFS volumes without obtaining a token).

    OTOH, having administered AFS installations in the past, I would steer clear of AFS unless you really understand it and are willing to make the investment in time and personnel to make it work for you.

    NFS (optionally +NIS) is a tried-and-true solution; it's dirt-simple to set up if security is not paramount and there's a plethora of documentation for it on the Web (i.e., for free). Every UNIX I've ever used had some sort of NFS client, if not a server, built in. Most Linux distros come with NFS clients and servers prepackaged and ready to go -- if you go with AFS, you'll have to install stuff on every box and handle patches, upgrades, etc., through a separate process. And there are nice Windows clients that talk to NFS (or you can run Samba to go the other way) (or both, if you're a masochist).

    Plus, most of your fellow slashdotters agree that NFS is the way to go. :)

  21. Tutorial by TheFlu · · Score: 5, Informative

    I just went through this process a few weeks ago and I must say I'm really glad I went through the trouble of setting it up...it's very cool. I actually wrote a tutorial about how to accomplish this by using NIS and NFS. I hope you find it helpful.

    The only trouble you might run into with the setup I used is some file-locking issues with programs wanting to share the same preference files.

  22. Unison file synchronizer by Anonymous Coward · · Score: 1, Informative

    I have exactly the same problem here at home, except I've thrown a couple of laptops into the mix. The solution that I've come up with is to use Unison to synchronize directories between machines. The big advantage is that Unison is as simple as it gets. It just plain works. It doesn't matter what the filesystem, network reliability, or even operating system is (it works on Win32 too).

    Set up a cron job to unison the home directories over an SSH link at a regular interval. Not only do you have a distributed filesystem, but every client has a complete copy also.
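
    A crontab entry along these lines would do it. This is a sketch under assumptions: the user "alice", the host "server" and the 15-minute interval are placeholders, Unison must be installed on both ends, and SSH needs key-based login so no password prompt appears.

    ```text
    # crontab entry: sync the home directory with the server every 15 minutes
    */15 * * * * unison -batch -silent /home/alice ssh://server//home/alice
    ```

    The -batch flag makes Unison run without asking questions, so conflicting changes are skipped and left for a later interactive run. The double slash after the host name marks an absolute path on the remote side.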

  23. Re:AFS vs NFS by pHDNgell · · Score: 4, Informative

    I'm disturbed at the number of people who are recommending NFS as a distributed filesystem solution. While it might be easy to get going initially, I've had more long-term problems with my NFS server and client interactions than my AFS. To get my NFS clients to behave anything like AFS clients, I had to build and install an automounter that could use NIS config.

    You only have to wait for the first day you want to reboot a fileserver without breaking every system on your network or waiting for startup dependencies, etc... One day, I moved all of the volumes off of an active fileserver (i.e. volumes being written) and shut the thing down and moved it to another machine room, brought it back up, and moved the volumes back. The reads and writes continued uninterrupted, no clients had to be restarted, no hung filesystems anywhere, etc...
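
    The evacuation described above can be sketched with "vos move", one call per volume. The server names, the /vicepa partition and the volume names here are hypothetical, and the helper plan_moves is an assumption of this sketch; it only prints the commands so they can be reviewed before anything is run.

    ```shell
    # Print (not run) one "vos move" command per volume name read on stdin.
    # Arguments: source server, destination server, partition used on both.
    plan_moves() {
        from=$1; to=$2; part=$3
        while read -r vol; do
            echo "vos move -id $vol -fromserver $from -frompartition $part -toserver $to -topartition $part"
        done
    }

    # Hypothetical volumes being moved off fs1 before it is shut down.
    printf '%s\n' user.alice user.bob | plan_moves fs1.example.com fs2.example.com /vicepa
    ```

    Clients look up volume locations through the cell's database servers, so nothing has to change on their side once a move completes.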

    --
    -- The world is watching America, and America is watching TV.
  24. unison, anyone? by gooofy · · Score: 2, Informative

    The problem with these distributed file systems seems to be that they're either pretty old and lacking features like disconnected operation (AFS) or seem to be unstable or, even worse, unmaintained (Intermezzo, Coda).
    For many simple purposes backups can be done quite nicely using rsync or something like Bacula. For laptop/notebook support Unison is definitely worth a look. It syncs directories like rsync does, but in both directions. Works nicely for me.

    --
    time is a funny concept
  25. Unison by brer_rabbit · · Score: 2, Informative

    Anyone with a desktop and a laptop they want to maintain in sync definitely needs Unison. This is one of the coolest tools I found after I picked up a laptop.

  26. Remote Synchronised filesystems by danpat · · Score: 3, Informative

    I've spent quite some time researching this issue here at work. We have two primary offices, separated by a 256k network link. That is too slow for most users to find acceptable (large files take several tens of seconds to copy). A bit of a culture problem, but oh well.

    I looked into a whole pile of options for having a "live" filesystem, a-la NFS, but the bandwidth killed interactivity (this is for users who've never used 100mbit network filesystems before).

    I found the following:

    1. Windows 2000 Server includes a thing called "File Replication Service". Basically, it's a synchronisation service. You replicate the content to many servers, and the service watches transactions on the filesystem, and replicates them to the rest of the mirrors as soon as it can. You can write to all mirrors, but I never quite worked out how it handled conflict resolution.
    A chapter from the Windows 2000 Resource Kit that describes how it works: http://www.microsoft.com/windows2000/techinfo/reskit/samplechapters/dsdh/dsdh_frs_tkae.asp

    2. Some people have done similar work for Unix systems, but they mostly involve kernel tweaks to capture filesystem events. Can't remember any URLS, but some Googling should find it.

    3. Some people are using Unison to support multi-write file replication. So long as you sync regularly, you shouldn't have too many problems.

    4. The multi-write problem is a hard one, so most people tend to say "don't do it, just make the bandwidth enough". This is the way to go if bandwidth isn't an issue.

    A guy by the name of Yasushi Saito has done quite a bit of research into data replication (search Google for the paper titles below, in quotes). He also put together a project called "Pangaea", which tries to do what is described above. It wasn't great the last time I looked. Some paper titles:

    - Optimistic Replication for Internet Data Services
    - Consistency Management in Optimistic Replication Algorithms
    - Pangaea: a symbiotic wide-area file system
    - Taming aggressive replication in the Pangaea wide-area file system

    There is also a bunch of other research work:

    - Studying Dynamic Grid Optimisation Algorithms for File Replication
    - Challenges Involved in Multimaster Replication (note: this talks about Oracle database replication)
    - Chapter 18 of the Windows 2000 Server manual describes the File Replication Service in detail
    - How to avoid directory service headaches (talks about not having multi-master-write replication and why)

  27. OpenAFS all the way by fsmunoz · · Score: 5, Informative

    I had more or less the same basic requirements and I opted for AFS.

    My needs were a little more demanding (it had to be implemented on GNU/Linux, Solaris, AIX, HP-UX and, as an extra, Windows 2000), and grokking AFS can be difficult at first, but it was the best choice by far. It is stable across all the Unices, very secure (this was another requirement) and integrates perfectly with our Kerberos Domain and LDAP accounting info. It provides a unique namespace that can span multiple servers transparently, does replication, automatic backups and read-only copies, has a client-side cache with callbacks, has a backup (to tape) system that can be used stand-alone or integrated with existing backup structures (Amanda, Legato, TSM) AND was the basis for the DCE filesystem, DFS. (As a side note, I find it interesting, and sad, that most things people try to emulate these days are present in DCE, and that Windows 2000 got many of its "new features" from a technology initially made for Unix: DFS, DCOM, Directory Services, SSO, DCE-RPC, etc.)

    AFS is amazing and much more robust than any distributed filesystem I know of; it has shortcomings when servers time out, but apart from that it's really an excellent solution; an example I generally use to give an idea of some of the good features of AFS is a relocation of a home directory to another server. The user doesn't even notice that his home directory was moved to another server *even if he was using it and was writing stuff to disk*; at most all writing calls to his home dir have a small delay (a couple of seconds) even if his/her home dir was 5 Gb worth.

    Kerberos integration is an added bonus; if you can, use this as an excuse to kerberize your systems and form a Kerberos Domain. If you don't want to, just stick with the standard AFS KA server.

    In my setup I have Windows users accessing their home dirs in AFS using the Kerberos tickets they get from the Windows login, thanks to a cross-realm trust between the Unix Domain and the AD; they can edit all the files they are entitled to with that ticket. The system is so secure that Transarc used to put the source code in its public AFS share and added the customers that bought the source to the ACL of the directory that contained it.

    With all these features it would be hard not to vividly recommend OpenAFS as the best solution for a unified, distributed filesystem. Bandwidth utilization is, in my experience, at least half of what NFS uses, which is an added bonus.

    cheers,

    fsmunoz

    1. Re:OpenAFS all the way by MilliAtAcme · · Score: 4, Informative

      I second this "all the way" thought. I've been running OpenAFS for almost 2 years now on Debian GNU/Linux (many Thanks to Sam Hartman, the maintainer) and have never been disappointed. It's been pretty darn solid and, most importantly, has never lost any of my data through various upgrade cycles. It's a bit of a change in thinking, however, for those coming from an NFS background.

      There were three big wins for me...

      (1) Global file namespace managed server-side and accessible from anywhere... LAN, WAN, whatever. All clients see files in the same location.

      Unlike NFS, where you have to "mount" volumes within the file system on each client, the AFS file system is globally the same, living under "/afs", so every client accesses the same information via the same file system path. A notion of "cells" makes this possible: information under a single administrative authority lives in a "cell", e.g., "/afs/athena.mit.edu" is the top-most "mount point" for a well-known cell at MIT. Volumes, in AFS parlance, also aren't tied to any particular server or even location in the name space as far as the clients know. A client doesn't have to know explicitly in its configuration which server a given bit of information lives on, and that data can be moved around behind the scenes as necessary (to increase the volume space, increase the redundancy, take it offline, etc.). All volume mounts are handled server-side. The clients only have to know about the cell database server, and that can be determined via AFSDB records in DNS. (I.e., your AFS "cell" name matches up with your domain name, e.g., /afs/athena.mit.edu matches up with "athena.mit.edu" in DNS.) So almost all management aspects are handled server-side.
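
      As a sketch of the DNS side, a cell's database servers can be published as AFSDB records in the zone that matches the cell name. The cell "example.com" and both host names are placeholders, not taken from the setup described here; subtype 1 is the value that marks an AFS volume location server.

      ```text
      ; zone file fragment for the hypothetical cell "example.com"
      example.com.    IN  AFSDB  1  afsdb1.example.com.
      example.com.    IN  AFSDB  1  afsdb2.example.com.
      ```

      A client with AFSDB lookups enabled can then find the cell without a local list of its database servers.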

      (2) Client side implementations.

      All my Linux and Windows machines can access the same AFS file space. An OS X client is available too, but I've not needed that to date, but might someday. I thus have all home directory information, as well as a lot of binaries, living in the AFS file space, in one place. And behind the scenes, that info is on multiple AFS servers that have RAID-5 disk arrays and weekly tape backups going on.

      (3) The file system "snapshot" feature, for backups.

      You can take a snapshot of volume(s) at a particular point in time and roll them onto tape without needing to take them offline. You don't have to worry about inconsistencies in the files. Folks can continue to update files but the backup snapshot doesn't change. Very much the same as the snapshot feature on Netapps. These snapshots, called backup volumes, can even be mounted in the file space so folks can get access to the old view of the volume, e.g., accidentally deleted a critical file and need it back.
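
      A minimal sketch of that snapshot workflow with the standard tools follows. The volume name user.alice, the cell path and the mount point name OldFiles are hypothetical, and the "run" helper is an assumption added here so the commands are only printed unless RUN=1 is set on a machine with AFS installed.

      ```shell
      # Print each command by default; set RUN=1 to execute it against a real cell.
      run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi; }

      # Create or refresh the snapshot clone, which is named user.alice.backup.
      run vos backup -id user.alice

      # Mount the snapshot inside the home directory so old files can be copied back.
      run fs mkmount -dir /afs/example.com/user/alice/OldFiles -vol user.alice.backup
      ```

      The mount point only needs to be created once; each later "vos backup" run refreshes what appears under it.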

      And security via Kerberos is nice, especially if you already have an infrastructure. But it's not too hard to set up a single KDC to get started. In the Debian distribution docs for OpenAFS, there's a setup and configuration transcript that makes things relatively easy and clears up a lot of questions.

      In summary, OpenAFS is a very good solution here.

  28. Re:Mirroring file system - example w/ssh by draziw · · Score: 2, Informative

    rsync -e ssh -azu --bwlimit=500 --stats --exclude='/proc' --exclude='/dev' / targetsystem:/targetdir/

    -e selects the remote shell, so -e ssh means use ssh
    -a archive mode (see docs)
    -z compression - if you have more CPU than pipe, use it, but if you are on a LAN you probably want to leave it off unless you don't mind the CPU hog (fat pipes will use more CPU time for compression)
    -u update only (don't overwrite newer files)
    --stats show you what it did when it is done
    --exclude leave off paths/files you want to skip
    --bwlimit in KBps - from my experience, put half of what you want your max to be.

    Ryan

  29. Freenet by Anonymous Coward · · Score: 1, Informative

    So if you were to recommend a distributed filesystem for Linux machines, would you choose one of the three filesystems listed here, or something else entirely?

    Most people will probably talk about your ideas and also NFS, SMB, etc., but you may also take a look at the Freenet Project. You can make your own private network, with everything transparently distributed and redundant, with crypto, digital signatures, etc., on a bunch of connected PCs.

  30. Nope, not NFS...yes AFS... by rmdyer · · Score: 3, Informative

    We use NFS every day, but just for very special circumstances. If you really understand how NFS works, then you will understand why NFS is just not a viable solution for anything large scale, or small scale for that matter.

    NFS is not secure. At most sites, NFS is exported read-only and limited to the domain, or to a given set of machine(s). If you export NFS as read/write then the client had better be secured, or you better use kerberos, and for damn sure better be behind a firewall. NFS has no client side cache, no volume location service, no ACL's, no authentication (unless kerberized), no replication, yata, yata, yata. We've used NFS sparingly for over 15 years because we -know how it works, and know its limitations.

    On the other hand, we set up an AFS cell for enterprise scale application and data sharing. It currently uses Kerberos V authentication, has volume replication, global namespace, client cache, fault tolerance. Users can set up their own groups, set their own ACL permissions. Did I say quota? AFS has per-user/per-volume quota. Hey, guess what, symbolic links work from any volume to any volume on AFS. And, AFS is just a simple daemon. You crank it up, mount the top of your cell and poof, you are done.

    Another positive is the fact that once you set up an AFS cell you automatically become part of a larger community. Any AFS cell can mount the entire file system of another AFS cell within the same tree. I can for example mount many large university and government cells and share files. AFS allows Internet-wide file sharing with full security. On most versions of the client you can even enable encryption on the connection so your files won't be snooped easily.

    All of our Solaris, Windows, Linux, and Mac boxes can use the same AFS tree without blinking an eye. We use AFS for many things. Before LDAP was really worth anything, we used AFS for simply exchanging read-only data. It -is- a replicated and global file system! Just put your config files in the tree and you are done.

    If you are one of those people who are blinded by "always doing things one way", then I'd suggest you wake up and smell another technology, I did, and I liked what I got in return. Look into OpenAFS, you'll be glad you did.

    +10,000 karma points! :)

  31. Re:NIS == "Hack me please" by Benley · · Score: 2, Informative

    I've got other options, but I use NIS. The catch (there is always a catch) is that my NIS does not contain ANY password hashes, because I use Kerberos to contain those. It works well, and it's nice and simple. The future plan is to migrate to LDAP of course, and get rid of all my NFS mounts all over everywhere and implement AFS, but for now, NIS + Krb5 works great.

  32. A potted review of several distributed filesystems by elronxenu · · Score: 5, Informative

    Why not stick with NFS for the time being?

    I went through the "is coda right for me?" phase, and also "is intermezzo right for me?" and also spent tens of hours researching distributed filesystems and cluster filesystems online ... my conclusion is that the area is still immature, I will let the pot simmer for a few more years (hopefully not many), and use NFS in the meantime.

    My situation: desire for scalable and fault-tolerant distributed filesystem for home use with minimal maintenance or balancing effort. Emphasis on scalable, I want to be able to grow the filesystem essentially without limit. I also don't want to spend much time moving data between partitions. And last but not least, the bigger the filesystem grows, the less able I will be to back it up properly. I want redundancy so that if a disk dies the data is mirrored onto another disk, or if a server dies then the clients can continue to access the filesystem through another server.

    All that seems to be quite a tall order. I checked out coda, afs, PVCS, sgi's xfs, frangipani, petal, nfs, intermezzo, berkeley's xfs, jfs, Sistina's gfs and some project Microsoft is doing to build a serverless filesystem based on a no-trust paradigm (that's quite unusual for Microsoft!).

    Berkeley's xFS (now.cs.berkeley.edu) sounded the most promising but it appears to be a defunct project, as their website has been dead ever since I learned of it, and I expect the team never took it beyond the "research" stage into "let's GPL this and transform it into a robust production environment". Frangipani sounds interesting also, and maybe a little more alive than xFS.

    On the other hand coda, afs and intermezzo are all in active development. afs IMHO suffered from kerberitis, i.e. once you start using kerberos it invades everything and it has lots of problems (which I read about on the openAFS list every day). AFS doesn't support live replication (replication is done in a batch sense) either.

    CODA doesn't scale and doesn't have expected filesystem functionality: for 80 gigs of server space I would require 3.2 gigs of virtual memory, and there's a limit to the size of a CODA directory (256k) which isn't seen in ordinary filesystems. There's also the full-file-download "feature". CODA is good for serving small filesystems to frequently disconnected clients but it is not good for serving the gigabyte AVIs which I want to share with my family.
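
    The memory figure above works out to a fixed ratio: 3.2 gigs for 80 gigs is 4%. A small sketch of that arithmetic follows; the 4% is inferred from the numbers in this post, not taken from Coda documentation, and the function name is made up.

```shell
#!/bin/sh
# Ratio implied by the figures above: 3.2 GB of virtual memory for 80 GB
# of server space, i.e. 4% (an inference from the post, not a documented
# Coda constant).
coda_vm_mb() {
    server_gb=$1
    # 4% of N GB, expressed in MB: N * 1000 * 4 / 100 = N * 40
    echo $(( server_gb * 40 ))
}

coda_vm_mb 80    # prints 3200
```

    At the same ratio, a 250 gig server would need about 10 gigs of virtual memory, which is why this does not scale for large media collections.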

    Intermezzo is a lot more lightweight than CODA and will scale a lot better, but it's still a mirroring system rather than a network filesystem. I might use that to mirror my remote server where I just want to keep the data replicated and have write access on both the server and the client, but it's again not a solution for my situation.

    The best thing about intermezzo is that it sits on top of a regular filesystem, so if you lose intermezzo the data is still safe in the underlying filesystem. CODA creates its own filesystem within files on a regular filesystem, and if you lose CODA then the data is trapped.

    Frangipani is based on sharing data blocks, so like NFS it should be suitable for distributing files of arbitrary size. I need to look at it in a lot more detail; this is probably the right way to build a cluster filesystem for the long haul. For the short term, Intermezzo is probably the right way for a lot of people: it copies files from place to place on top of existing filesystems.

    What I did in the end:

    • new server (Celeron 1.3 GHz, 512 meg RAM)
    • 2 x 80 gig IDE disks
    • Each IDE drive has 2 partitions (one small, one huge)
    • Each partition is RAID-1 mirrored with its partner on the other disk
    • The huge RAID partition is defined to Linux LVM (logical volume manager)
    • Logical volumes are created within that for root, /home, etc...
    • All logical volumes are of type ext3 for recoverability.
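
    The layout in the list above can be sketched as a sequence of `mdadm`, LVM, and `mkfs.ext3` commands. This is a print-only sketch that never touches a disk: the device names, the volume group name `vg0`, and the sizes are placeholders, and the poster may well have used different tools to build the arrays.

```shell
#!/bin/sh
# Print-only sketch of the RAID-1 + LVM + ext3 layout; nothing is executed.
# Device names, volume group name, and sizes are placeholders.
DISK_A=/dev/hda
DISK_B=/dev/hdc

# plan_storage NAME:SIZE [NAME:SIZE ...]
plan_storage() {
    # Mirror each partition with its partner on the other disk (RAID-1)
    echo "mdadm --create /dev/md0 --level=1 --raid-devices=2 ${DISK_A}1 ${DISK_B}1"
    echo "mdadm --create /dev/md1 --level=1 --raid-devices=2 ${DISK_A}2 ${DISK_B}2"
    # Hand the large mirror to LVM
    echo "pvcreate /dev/md1"
    echo "vgcreate vg0 /dev/md1"
    # Carve out logical volumes and format each one as ext3
    for lv in "$@"; do
        name=${lv%%:*}
        size=${lv##*:}
        echo "lvcreate -L $size -n $name vg0"
        echo "mkfs.ext3 /dev/vg0/$name"
    done
}

plan_storage root:8G home:40G
```

    Leaving free space in the volume group is what makes later growth easy: a volume can be extended without moving data between partitions.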

    The way it works is tha

  33. Samba and permissions by Craig+Ringer · · Score: 2, Informative

    Samba DOES support UNIX permissions. Use the "cifs" client module, not the "smbfs" one, and enable UNIX extensions on smbd. It's not hard, and works well.
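
    A print-only sketch of the two pieces mentioned above: the `unix extensions` setting for smbd and a `mount -t cifs` command for the client. The server, share, mount point, and user names are placeholders.

```shell
#!/bin/sh
# Print-only sketch; server, share, user and mount point are placeholders.
smbd_setting() {
    # Goes in the [global] section of smb.conf on the server
    printf '[global]\n    unix extensions = yes\n'
}

cifs_mount_cmd() {
    server=$1; share=$2; mountpoint=$3; user=$4
    # -t cifs selects the CIFS client module rather than smbfs
    echo "mount -t cifs //$server/$share $mountpoint -o username=$user"
}

smbd_setting
cifs_mount_cmd fileserver home /mnt/home alice
```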

    That said, I haven't tried it in real production yet. I do find it scary that a reverse-engineered MS protocol is now an option for UNIX<->UNIX network file access because NFS is so obsolete and crap that anything looks good in comparison.

  34. Re:NIS == "Hack me please" by HuguesT · · Score: 2, Informative

    It's not all rosy like `use LDAP'

    NIS is simple and easy to maintain. LDAP is harder. From memory (10 years ago) Kerberos was geared towards a single user on a single machine, is that still the case?

    Lots of big organizations still use NIS because its flaws, while real, are well understood.

  35. Re:NFS is not even close to secure by tzanger · · Score: 5, Informative

    I use a very simple script to help keep NFS secure:

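    IPTABLES=/usr/sbin/iptables
    RPCINFO=/usr/sbin/rpcinfo
    GREP=/usr/bin/grep
    AWK=/usr/bin/awk

    # Create the nfs chain if it is missing, then empty it
    $IPTABLES -N nfs > /dev/null 2>&1
    $IPTABLES -F nfs
    # Add a DROP rule for the protocol and port of every NFS-related RPC service
    $RPCINFO -p localhost | $AWK '/portmap|mount|nfs|lock|stat/ \
    { print "iptables -A nfs -p " $3 " --dport " $4 " -j DROP" }' | \
    /bin/bash

    # Hook the chain into INPUT unless a matching jump is already listed
    $IPTABLES -L INPUT -vn | $GREP -q 'nfs all -- !ipsec0+'
    if [ $? -ne 0 ]; then
        $IPTABLES -I INPUT 1 -i eth0 -j nfs
    fi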

    Basically it only allows incoming NFS-related connections over ipsec, dropping anything that is not. NFS port allocation is dynamic by default and I know you can force ports, but this seemed far easier to scale.
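
    To see what the rule-generation step produces without touching a firewall, here is a small sketch that feeds made-up `rpcinfo -p` style output through the same awk pattern. The function name and the sample port numbers are hypothetical; nothing is applied.

```shell
#!/bin/sh
# Reads `rpcinfo -p` style output on stdin and prints one DROP rule per
# NFS-related service. Nothing is applied; the sample input is made up.
nfs_drop_rules() {
    awk '/portmap|mount|nfs|lock|stat/ {
        print "iptables -A nfs -p " $3 " --dport " $4 " -j DROP"
    }'
}

sample='   program vers proto   port
    100000    2   tcp    111  portmapper
    100003    3   udp   2049  nfs
    100021    1   udp  32768  nlockmgr
    100005    3   tcp  32771  mountd'

printf '%s\n' "$sample" | nfs_drop_rules
```

    Because the ports are read from the portmapper each time, the rules follow whatever ports the services happened to get, which is the scaling advantage over pinning ports by hand.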

    One thing I have noticed (and perhaps it's common knowledge to NFS experts) is that in order to get locking to work at all, my NFS clients had to be running statd and lockd. Without 'em everything worked but locking would fail every time.
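
    A quick way to check for the locking problem described above is to look for the RPC services behind statd and lockd, which the portmapper lists as `status` and `nlockmgr`. A minimal sketch, with a hypothetical function name and made-up sample input:

```shell
#!/bin/sh
# Reads `rpcinfo -p` style output on stdin and succeeds only if the RPC
# services behind statd ("status") and lockd ("nlockmgr") are registered.
lock_services_ok() {
    awk '$5 == "status" { s = 1 } $5 == "nlockmgr" { l = 1 }
         END { exit !(s && l) }'
}

# On a live client: rpcinfo -p localhost | lock_services_ok
if printf '100024 1 udp 32768 status\n100021 1 udp 32769 nlockmgr\n' | lock_services_ok; then
    echo "locking services registered"
fi
```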

  36. Re:NFS is not even close to secure by HuguesT · · Score: 3, Informative

    > If anyone has root on ANY system or there are ANY
    > non-unix systems, forget it.

    By that you mean that it's easy to read stuff off people's directory if you can spoof their UID. Sure. I think you'll find the same is true on a SMB network.

    > The administrative functionality in NFS can't
    > compare to the features that have been available
    > to MacOS and Windows administrators for over a
    > decade,

    Given that 10 years ago Windows for Workgroups had hardly been released and didn't even have TCP/IP by default, I think you are exaggerating a little bit. At the same time MacOS version 7 was the norm, and we all know how secure that one was, right?

    Maybe NFS4 is your answer?

  37. Watch for NFSv4 in the future! by Sri+Ramkrishna · · Score: 4, Informative

    Watch for NFSv4, the new version of NFS. There is already a sample implementation in the Linux 2.5 tree. NFSv4 will address most of the problems that NFSv3 and others have, including plugin security models, namespace, and revamped ACL handling.

    It's also WAN friendly, letting several operations be done with a single directive (the COMPOUND directive). It also allows you to migrate one filesystem to another with no stale filehandles. Basically, it's trying to be an AFS killer.

    For more information, take a look at
    http://www.nfsv4.org/

    Lots of good info including the IETF spec. It's an interesting read.

    The spec is not quite complete. Currently, I believe there are discussions about how NFSv4 will work with IPsec.

    Cheers,
    sri

  38. More distributed filesystems by Anonymous Coward · · Score: 2, Informative

    I've been partial to GPFS (General Parallel File System) on AIX; IBM's also got a version for Linux... probably RH specific. There's also JFS, which works alright... a bit more development there would be a plus. GPFS works great on Linux clusters!

  39. Re:NIS == "Hack me please" by cduffy · · Score: 2, Informative

    Kerberos is not at all geared towards one-user-one-box -- it was created for large multiuser computing environments, initially MIT. Certainly not one-box -- heck, you need at least one dedicated, secured, guarded system to run the thing.

    It *does* have flaws (I'd prefer it did something similar to AFS's PAG-based authentication, such that tokens are per process group rather than for all instances of a given UID on a box -- and a malicious root can trivially steal tickets for all users who have valid ones on that box), but Kerberos is used effectively in a great many large multiuser environments, and is a vast improvement on most of the other schemes out there.

  40. Re:Future obsolescence ? by cduffy · · Score: 2, Informative

    > If backwards compatibility exists, why can't I take a RedHat 9 package and run it on a RedHat 5.2 box?

    Err, that's forwards compatibility. Backwards compatibility would be running a Red Hat 5.2 package on a Red Hat 9 box (if that runs, then Red Hat 9 is backwards compatible with Red Hat 5.2).

    That said, though, you're discussing a different thing. The Linux kernel has a very good track record on backwards compatibility, as he stated. The Linux userland (as provided by Red Hat and such) has a really fsckin' crappy track record. (Not bad if binaries are compiled static, but who does that?)

    Distributed filesystems are quite certainly kernel territory, not userspace; hence, your argument regarding the userspace environment's track record is out of place.

  41. Re:NFS is not even close to secure by Anonymous Coward · · Score: 3, Informative

    There is plenty more that you can do to secure NFS than you suggest. Kerberos, secure RPC, DES / DH authentication, and IPsec are all available tools. Unfortunately Linux NFS has tended to lag in security.

    http://docs.sun.com/db/doc/816-7125/6md5dsnvv?a=view
    http://nscp.upenn.edu/aix4.3html/aixbman/commadmn/nfs_secure.htm
    http://docs.sun.com/db/doc/805-7229/6j6q8sve1?q=des+nfs&a=view

  42. Reasons why by photon317 · · Score: 2, Informative


    There's some reasoning behind the lack of big interest in distributed filesystems.

    1) Obviously, NFS continues to be a passable solution where you don't really need "distributed" so much as "universally network accessible in a simple way".

    2) For things where you truly want "distributed" access from multiple machines that are local to each other, there's a somewhat less complicated solution, which is to use shared storage. The idea is to attach all the machines to a SAN-style network (Fibre Channel, or hey, even FireWire these days) and use a sharing-aware filesystem that allows simultaneous access to the storage/filesystem from multiple hosts with sane locking and whatnot. One of the better places to look for this is otn.oracle.com - they've released a cluster filesystem *and* drivers for FireWire shared storage (which is cheaper than Fibre Channel) for Linux.

    Of course, that leaves out the case of a distributed filesystem for machines that can't be on a SAN together for distance or economic reasons. In that case you could of course hack something up using a cluster-type filesystem and SCSI-over-IP or something like that, I guess, or use one of the experimental distributed filesystems you mention... but the market just isn't big enough yet to drive much development.

    --
    11*43+456^2
  43. 7000+ users on CMU's OpenAFS installation by Anonymous Coward · · Score: 1, Informative

    I'm attending Carnegie Mellon University right now. The campus network stores all the user /home's, course webpages, homework submission folders, etc., on OpenAFS servers running some ancient, completely reworked version of Red Hat. The servers are rock solid, 99.999% uptime as far as I've seen. I can only recall two one-hour incidents in the last two years when they went down for a bit. Tells you a bit about how stable OpenAFS is. That, and I've come to admire the usefulness of ACLs. The documentation could be better though.
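
    For a sense of scale, the two one-hour incidents over two years can be turned into an availability percentage. This is only a back-of-the-envelope sketch with a made-up function name; it assumes 365-day years and counts nothing but the two recalled outages.

```shell
#!/bin/sh
# availability_pct DOWN_HOURS YEARS -> percentage of time up, 4 decimals.
# Uses 365-day years; leap days are ignored.
availability_pct() {
    awk -v down="$1" -v years="$2" 'BEGIN {
        total = years * 365 * 24          # hours in the period
        printf "%.4f\n", (total - down) / total * 100
    }'
}

availability_pct 2 2    # prints 99.9886
```

    Two hours out of 17,520 comes to roughly 99.99%, which is still a strong record for a campus-wide file service.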

    Maybe the knowledge at this page is transferrable, somewhat, to other people trying to set up OpenAFS. At the least, it gives you an idea of what you'll be needing.

  44. OpenAFS by Anonymous Coward · · Score: 1, Informative

    OpenAFS has a learning curve, but for a distributed FS, it rules.

    At my office I have deployed OpenAFS + krb5 + LDAP. This allows me to have a network of kernel developers (read: they need root access on their workstations) who have no access to each other's files.

    I have several sites, so they each sit under their own namespace. /afs/..com. They all share a top-level AFS namespace of course, so every site can see every other site's files.

    I have LDAP distributing UIDs and so on around the network, and providing multiple levels of access per person per machine (or groups of machines / people).

    I have Windows 2000 and XP users authenticating to my KRB5 KDC. They have seamless access to their AFS space also, using "wake" from rose-hulman.

    Users' home directories follow them around.
    Users must authenticate every 12 hours, but that's it. We have full single sign-on over the network.
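
    The periodic re-authentication amounts to getting a fresh Kerberos ticket and converting it into an AFS token. A dry-run sketch with the standard `kinit`, `aklog`, and `tokens` commands follows; the principal and cell names are placeholders, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands unless DRY_RUN=0.
# The principal and cell names are placeholders.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

afs_login() {
    principal=$1; cell=$2
    run kinit "$principal"     # obtain a Kerberos 5 ticket-granting ticket
    run aklog -c "$cell"       # turn the ticket into an AFS token for the cell
    run tokens                 # list AFS tokens and their expiry times
}

afs_login alice@EXAMPLE.ORG example.org
```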

    NFS is no use. All users need root.
    NFSv4 is no use. Authentication is done at mount time iirc, so if 2 users log onto one server...
    Oh, and it is in alpha stage, if that.
    Coda is experimental.
    I know little about intermezzo.
    OpenAFS is scalable.
    OpenAFS is secure.
    OpenAFS is easy to manage for large (huge) sites.
    OpenAFS looks after your backups for you.
    OpenAFS has user-manageable groups....!
    OpenAFS is just the most amazing FS I have ever used.

    OpenAFS needs a way to store its PTS in LDAP along with my other per-user data.
    This is the only fault I can find in OpenAFS.

  45. SAMBA is the way to go for a home network by internet-redstar · · Score: 2, Informative

    Being confronted with this in my home with my main Linux server, a MacOS X workstation, a single-purpose Windows XP download machine (Kazaa Lite), a single-purpose Linux mplayer and mp3 box (connected to stereo and TV) and a Linux development machine... SAMBA is the only valid option.


    NFS really can't stand machines being switched on and off. NFS is great in a production environment if set up correctly, but not for home usage.


    MacOS X and M$ can handle SAMBA just fine (although MacOS X can still improve its handling of filenames which contain characters such as [ and ], and has some other small caveats).


    I know it's the M$ protocol, but Linux has the best implementation of it. It works just great!


    The 'shares' of shut down machines disappear as soon as you try to access them on other machines. If you don't access them while they were off they stay in place. For ease, you can put a 'mount -a' in the crontab to automagically remount these unmounted filesystems.
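
    The crontab trick mentioned above could look something like the line this sketch prints. The five-minute interval and the `smbfs` type filter are assumptions, not from the post; `mount -a -t TYPE` only attempts fstab entries of that type that are not already mounted.

```shell
#!/bin/sh
# Print-only sketch; the interval and filesystem type are assumptions.
remount_cron_line() {
    minutes=$1
    fstype=$2
    # mount -a mounts every fstab entry (of the given type) not yet mounted
    echo "*/$minutes * * * * /bin/mount -a -t $fstype"
}

remount_cron_line 5 smbfs
```

    The printed line belongs in root's crontab, with the shares themselves listed in /etc/fstab.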


    It just works. 3 times 'HURRAY' for SAMBA!