Slashdot Mirror


Distributed Filesystem for Disconnected Operation?

juraj asks: "I'm trying to achieve the following setup: I have two offices connected via a relatively slow ADSL line, and I want a shared fileserver between the offices. I have a VPN using IPSec ready, so security is less of a concern, but simply mounting a filesystem (via Samba or NFS) from one office to another is not a solution because of the speed. Also, the ADSL line is sometimes not only slow, but also disconnected. I've tried the CODA distributed filesystem to achieve replication, so that both offices have local copies of their files. The problem is that the CODA filesystem is just a research project: it is unstable, with the venus daemon constantly crashing, and sometimes when recovering from the disconnected state, one side does not recognize the changes and they are simply not propagated. Have you had any good experiences with CODA? Which versions do you use? What kind of setup did you have? How is it configured? I've also heard about OpenAFS, but similar to CODA, I've learned it is unusable in a real environment. Is there any real solution to my problem? Are there any decent solid free distributed file systems for Linux or the BSDs?"

22 of 58 comments

  1. intermezzo! by Anonymous Coward · · Score: 4, Informative

    http://www.inter-mezzo.org/

    you are looking for intermezzo

    http://www.inter-mezzo.org/

    the same guy from coda is the leader. remember

    afs -> coda -> intermezzo

    1. Re:intermezzo! by Elwood+P+Dowd · · Score: 3, Informative

      And from their web page, there are still caveats. One of their components is advertised as "needing more work before it can be used in production."

      My company uses redundant leased lines to home (different breeds and providers) to ensure that every building can access network resources at all times. Manual fail over. We're not a huge company, but we manage most of this in-house. We'd *love* to know if there's a better answer, even if it cost a lot of money.

      Well. There's always a better answer on the other side of a long and expensive implementation process.

      --

      There are no trails. There are no trees out here.
    2. Re:intermezzo! by Bronster · · Score: 2, Informative

      http://www.inter-mezzo.org/

      you are looking for intermezzo


      Hmm.. let's just look at the mailing list again... maybe just a snip from a recent(ish) post (Mar 22 - there are 8 posts since this one, half of them spam):

      | Don't post to a list without reading it also.

      I read this list, what little of it I get.

      | And don't complain about the state of open source
      | software, if you are not ready to test it's betas.

      I am certainly ready to test the betas, just that the last time I tested
      Intermezzo or Lustre -- Lustre I couldn't even get patched, and
      Intermezzo I couldn't get a single share working without an immediate
      kernel oops -- and a fatal one, at least in that the filesystem never
      worked for me.

      This is not just a user-friendliness issue. I have played with
      experimental software -- even had an entire machine on reiser4 for a
      month or two. And that was _much_ easier.

      However, I promise to test Intermezzo again when I get more time.

      -----------------------

      Basically Intermezzo is not finished, and the developers don't actually have the resources to bring it to production level. I'd certainly not trust it with my data, and I've been watching it for years. I've never had the time to contribute unfortunately, and have always run out of time to test it in environments where I have enough boxen to actually do a realistic test (i.e. at work).

  2. Unison by JabberWokky · · Score: 4, Informative
    What you're looking for is something like unison. Since I don't know what you're serving off of those servers or how often you update files, I can't tell you if it will work for you. But it is robust, and with the -batch flag, it can be automated. It is quite CPU- and disk-intensive, which is why I say "something like": it's made more for daily or hourly syncs.

    --
    Evan

    --
    "$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
    1. Re:Unison by Anonymous Coward · · Score: 2, Informative

      Remember folks, unison has a 2GB file size limit.
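If that limit applies to your setup, it may be worth scanning the tree for oversized files before relying on a sync. The following is a minimal sketch, not part of Unison: the function name `find_large_files` is hypothetical, and the threshold assumes "2GB" means 2 GiB (2^31 bytes, where a signed 32-bit offset overflows).

```python
import os
from typing import List, Tuple

# Assumption: the "2GB" limit is read as 2 GiB, the point at which
# a signed 32-bit file offset can no longer address the file.
LIMIT_BYTES = 2 * 1024 ** 3


def find_large_files(root: str, limit: int = LIMIT_BYTES) -> List[Tuple[str, int]]:
    """Return (relative path, size) for every file of at least `limit` bytes."""
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                # Broken symlink or file removed mid-scan: skip it.
                continue
            if size >= limit:
                large.append((os.path.relpath(full, root), size))
    return sorted(large)
```

Calling `find_large_files("/srv/share")` (a made-up path) before each sync would list the files to handle some other way.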

  3. Perforce (or any other Version Control system) by shadowxtc · · Score: 5, Informative

    I find this to be the ideal solution for keeping filesystems synchronized across slow links.

    From my experience, Perforce has the best use of bandwidth and also the most intelligence when it comes to rearranging directory structures and resolving conflicts.

    Unfortunately it's only free for up to two users - so it may be useless for your needs.

    1. Re:Perforce (or any other Version Control system) by hashinclude · · Score: 4, Informative

      Mod Parent Up!

      I have used P4 (Perforce) to keep a lot of files in sync between two locations. Fortunately, I had only two locations, so the 2-user, 2-client limit was never exceeded.

      In case you want more clients/users, you can try any of the following:

      1. CVS (http://www.cvshome.org/)
      2. GNU Arch (http://www.gnu.org/software/gnu-arch/)
      3. Subversion (http://subversion.tigris.org/)

      All these are excellent source control tools, and operate over ordinary TCP/IP (don't need a special setup).

      Avoid tools like Visual SourceSafe because they require a network-mapped drive to work.

      http://better-scm.berlios.de/comparison/comparison.html gives a comparative list of version control systems out there.

      --
      US is now divided as the "Red" and "blue" states. Red States = communist countries. Coincidence? I think not
    2. Re:Perforce (or any other Version Control system) by sudog · · Score: 2, Informative

      Actually if you're doing Open Source development you can have unlimited users and clients. FreeBSD uses it internally for example.

  4. OpenAFS unusable in a "real" environment? by LoneRanger · · Score: 4, Informative

    Bullshit. You haven't looked at it hard enough then. I used to work at a university that had 26,000+ users using an AFS filestore for their homedirs and for distributed apps across several miles of campus.

    I'm sure this thing has more than surpassed terabyte size by now. It was always fast and always reliable, except when one of the server's SCSI cards would melt and start spewing errors.

    AFS is better than most people give it credit for. I'll admit, it isn't easy to set up, but all the features that you get for that initial work are well worth it.

    1. Re:OpenAFS unusable in a "real" environment? by Umrick · · Score: 2, Informative

      OpenAFS is a great solution to a problem, just not this one. It doesn't work in a detached state. On the other hand, the caching is quite aggressive, and if it's an option, you could set up two cells that trust each other and access files that way.

      I'll be happier once the stable versions have two things, though: >2 GB file support and support for 2.6-series kernels. Disconnected operation would be nice as well.

      All of those are proposed projects, but they are not currently in the development version (at least not in the changelogs).

    2. Re:OpenAFS unusable in a "real" environment? by wik · · Score: 2, Informative

      AFS dies horribly if your clients lose sight of the volume location or file servers. As long as the machines are well-connected, it works great.

      As far as Kerberos goes, I'd suggest the new ORA nutshell book "Kerberos: The Definitive Guide". While it doesn't go into AFS much, it explains how the thing really works and how to configure MIT and Heimdal Krb5.

      - Happy AFS/krb5 site administrator

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
  5. AFS doesn't work at my uni by Anonymous Coward · · Score: 1, Informative

    Monash University is using AFS on its Linux desktops. Whenever the connection to the file server goes down, everyone's sessions hang, which is clearly unacceptable.

    It's quite possible that it has been incorrectly set up, but in this situation AFS hasn't delivered what it promised.

    1. Re:AFS doesn't work at my uni by auzy · · Score: 2, Informative

      Umm, I go to Monash uni too, and before this they were using NFS. I haven't really tried out the AFS drives yet except over ra-clay and stuff, but from the short time I used them on the network, they are far better. I remember all the NFS problems they had (I must have lost at least 4 or 5 assignments on them, and at least 20% in marks). AFS has disconnected operation, so it should be much better. Have you tried it out much this year? It might also be hanging because they are constantly making fetches over the network that can't be cached. To me it has seemed a lot better than all of those NFS stale file entries I used to get all the time last year. So far I've had no problems with AFS at Monash. But then again, I'm 3rd year comp sci now, so I haven't been using it as much as I used to. Then again, it wouldn't surprise me if it's not set up right. Monash administration is awful: they won't even support setting up a Jabber server :( The network also seemed faster at Monash (probably a side effect of the disconnected operation by AFS).

  6. Re:This question by DDumitru · · Score: 4, Informative

    Please excuse the ad here (mod down if you like).

    I developed a replicated filesystem that we use with our commercial email service. The filesystem is layered under UML (User Mode Linux) and cross-replicates files between two servers, one in California and one in Pennsylvania.

    I too looked at Coda and Inter-mezzo, but was not very satisfied with their stability and/or their ability to recover from outages.

    The replication that we use relies on the update nature of Maildir with Courier IMAP.

    Our solution uses UML to post a transaction journal to the underlying host OS layer. Application-level code then cross-posts filesystem updates using HTTP transactions with curl and Apache/CGI. Transactions are delayed about 2 seconds to coalesce multiple updates into a single network event. In general, we get about 5 Mbit of update throughput coast to coast, and it is very rare that either system is more than a couple of seconds out of sync.
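The 2-second coalescing step can be illustrated roughly as follows. This is a sketch under stated assumptions, not the in-house code described above: the class `UpdateCoalescer` and its methods are hypothetical, and keeping only the latest operation per path is an assumed policy that the description does not spell out.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class UpdateCoalescer:
    """Collect filesystem updates and release them as one batch per window."""

    def __init__(self, window_seconds: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.pending: Dict[str, str] = {}  # path -> most recent operation
        self.first_seen: Optional[float] = None

    def record(self, path: str, operation: str) -> None:
        """Note an update; a later operation on the same path replaces the earlier one."""
        if self.first_seen is None:
            self.first_seen = self.clock()
        self.pending[path] = operation

    def flush_if_due(self) -> List[Tuple[str, str]]:
        """Return the pending batch once the window has elapsed, else an empty list."""
        if self.first_seen is None:
            return []
        if self.clock() - self.first_seen < self.window_seconds:
            return []
        batch = sorted(self.pending.items())
        self.pending = {}
        self.first_seen = None
        return batch
```

A sender loop would call `flush_if_due()` periodically and post each non-empty batch as a single HTTP request, so a burst of small writes costs one round trip instead of many.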

    I am sorry that I cannot give you the code. While the code is Linux-based, we don't actually sell (distribute) it, so we keep it in-house for our own use. Perhaps my description will give you some ideas.

    The email offering is described at:

    http://easyco.com/mail/index.htm

  7. Re:A similar question: edit/compile/run over ADSL by noselasd · · Score: 2, Informative

    Use version control?
    Edit at your local site, and have a (Subversion/CVS) server at the office.

  8. Unison by hak1du · · Score: 4, Informative

    Don't bother with any of the kernel-mode disconnected file systems. For those kinds of situations, the Unison file synchronizer is a good choice: it performs bidirectional synchronization and uses an efficient protocol that only needs to send differences and some checksums across the wire. It also detects conflicts and (optionally) lets you resolve them automatically. It works on UNIX/Linux, Windows, and MacOS.
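The conflict detection mentioned above can be illustrated with a three-way comparison against the state recorded at the last successful sync. This is a simplified sketch of the general idea, not Unison's actual algorithm or wire protocol; the names `Snapshot`, `digest`, and `reconcile` and the action labels are hypothetical.

```python
import hashlib
from typing import Dict

# Relative path -> content digest. A missing key means the file does not exist.
Snapshot = Dict[str, str]


def digest(data: bytes) -> str:
    """Content fingerprint used to compare files without sending them."""
    return hashlib.sha256(data).hexdigest()


def reconcile(archive: Snapshot, left: Snapshot, right: Snapshot) -> Dict[str, str]:
    """Decide, per path, which way to propagate a change or whether it conflicts.

    `archive` is the state both replicas agreed on after the previous sync.
    """
    actions: Dict[str, str] = {}
    for path in sorted(set(archive) | set(left) | set(right)):
        base = archive.get(path)
        l = left.get(path)
        r = right.get(path)
        if l == r:
            continue  # already identical (or deleted) on both sides
        if l == base:
            actions[path] = "right-to-left"  # only the right replica changed
        elif r == base:
            actions[path] = "left-to-right"  # only the left replica changed
        else:
            actions[path] = "conflict"       # both changed, in different ways
    return actions
```

Only paths that end up in `actions` need any file data sent; everything else is settled by exchanging digests.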

  9. Re:A similar question: edit/compile/run over ADSL by Artichoke · · Score: 2, Informative

    I mount via nfs over a VPN over 1Mb ADSL (rsize=8192,wsize=8192,intr,rw,async,noatime,noauto,user) and after the Vim session is restored, don't have a problem.
    An rsync based script (FWIW in Python) to xfer disparate directories and files works around the cumbersomeness problem.
    As for the 'use version control' responses: I don't want to store intermediate versions of persistent files and don't want to store intermediate/temporary files at all (but don't want to recreate them from scratch every couple of days when I swap from home to/from office).
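An rsync wrapper of the kind mentioned above might look something like this. It is a guess at the shape of such a script, not the poster's actual code: the function names and example paths are hypothetical, and it handles directory pairs only.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def rsync_command(source: str, destination: str, dry_run: bool = False) -> List[str]:
    """Build an rsync invocation that mirrors one directory's contents."""
    cmd = ["rsync", "--archive", "--compress", "--partial"]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied without copying
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def transfer(pairs: Iterable[Tuple[str, str]], dry_run: bool = False,
             runner: Callable[..., object] = subprocess.run) -> None:
    """Run one rsync per (source, destination) pair, stopping on the first failure."""
    for source, destination in pairs:
        runner(rsync_command(source, destination, dry_run), check=True)
```

For example, `transfer([("/home/me/project", "office:/home/me/project")], dry_run=True)` (made-up paths) previews a transfer; `--partial` keeps partly transferred files so an interrupted run over a flaky link can resume.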

    BTW: Nick, how's the AFS investigation going? {8{)}

    --
    __
    Arse
  10. Re:cvs by doctormetal · · Score: 2, Informative

    Why not just use CVS or, even better, subversion?

    You should use CVSup for this.
    It has already proven its usability for syncing and updating FreeBSD systems.

  11. Re:This question by Tomun · · Score: 2, Informative

    Try offlineimap, it'll sync imap<->imap or imap<->maildir

  12. Re:Novell ifolder by Earlybird · · Score: 2, Informative

    While it looks interesting, the project is labeled as pre-alpha -- not ready for production use.

  13. Re:use AFS by Earlybird · · Score: 3, Informative
    • How can I buy their product? Who sells it?
    IBM AFS. Note that OpenAFS is a true fork of IBM's own code, and currently maintained by IBM and the community. Afaik, IBM AFS is no longer in active development. You don't need to buy anything except support.
    • What does it cost?
    IBM AFS client licenses have historically been "very expensive" -- that's about all I know. If you need to ask, you probably can't afford it. :)
  14. Foldershare by niai · · Score: 2, Informative

    Foldershare is a Win32 "Document Management & Real-time File Mirroring Solution".

    I read that "the development team hopes to start work on Mac OS X and Linux clients within the next six months" (Jan 27th 2004).