Distributed Filesystem for Disconnected Operation?
juraj asks: "I'm trying to achieve the following setup: I have two offices connected via a relatively slow ADSL line, and I want a shared fileserver between the offices. I have a VPN using IPSec ready, so security is less of a concern, but simply mounting a filesystem (via Samba or NFS) from one office to another is not a solution because of the speed. Also, the ADSL line is sometimes not only slow, but also disconnected.
I've tried the CODA distributed filesystem to achieve replication, so that both offices have local copies of their files. The problem is that the CODA filesystem is just a research project: it is unstable, with the venus daemon constantly crashing, and sometimes, when recovering from the disconnected state, one side does not recognize the changes and they are simply not propagated.
Have you had any good experiences with CODA? Which versions do you use? What kind of setup did you have? How is it configured? I've also heard about OpenAFS, but similar to CODA, I've learned it is unusable in a real environment. Is there any real solution to my problem? Are there any decent solid free distributed file systems for Linux or the BSDs?"
http://www.inter-mezzo.org/
You are looking for InterMezzo.
The same guy who led Coda is the leader. Remember the lineage:
AFS -> Coda -> InterMezzo
--
Evan
"$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
I find Perforce to be the ideal solution for keeping filesystems synchronized across slow links.
From my experience, Perforce has the best use of bandwidth and also the most intelligence when it comes to rearranging directory structures and resolving conflicts.
Unfortunately it's only free for up to two users - so it may be useless for your needs.
Bullshit. You haven't looked at it hard enough then. I used to work at a university that had 26,000+ users using an AFS filestore for their homedirs and for distributed apps across several miles of campus.
I'm sure this thing has grown well past a terabyte by now. It was always fast and always reliable, except when one of the servers' SCSI cards would melt and start spewing errors.
AFS is better than most people give it credit for. I'll admit, it isn't easy to set up, but all the features that you get for that initial work are well worth it.
Please excuse the ad here (mod down if you like).
I developed a replicated filesystem that we use with our commercial email service. The filesystem is layered under UML (User Mode Linux) and cross-replicates files between two servers, one in California and one in Pennsylvania.
I too looked at Coda and InterMezzo, but was not very satisfied with their stability and/or their ability to recover from outages.
The replication that we use relies on the update nature of Maildir with Courier IMAP.
Our solution uses UML to post a transaction journal to the underlying host OS layer. Application-level code then cross-posts filesystem updates using HTTP transactions with curl and Apache/CGI. Transactions are delayed about 2 seconds to coalesce multiple updates into a single network event. In general, we get about 5 Mbit/s of update throughput coast to coast, and it is very rare that either system is more than a couple of seconds out of sync.
I am sorry that I cannot give you the code. While the code is Linux-based, we don't actually sell (distribute) it, so we keep it in-house for our own use. Perhaps my description will give you some ideas.
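To make the coalescing idea above a bit more concrete, here is a minimal Python sketch of the "delay about 2 seconds, then send everything as one network event" step. It is not the in-house code, which is not available. The class name `UpdateCoalescer`, its methods, and the example paths and operation names are all hypothetical, and the actual HTTP posting with curl is left out. The clock is passed in explicitly so the behaviour is easy to follow without real waiting.

```python
class UpdateCoalescer:
    """Hypothetical sketch: hold journal entries, release them as one batch."""

    def __init__(self, delay=2.0):
        self.delay = delay        # seconds to wait before sending (about 2 s in the post)
        self.pending = []         # journal entries in arrival order
        self.first_seen = None    # arrival time of the oldest pending entry

    def record(self, path, op, now):
        """Queue one filesystem update; `now` is a timestamp in seconds."""
        if self.first_seen is None:
            self.first_seen = now
        self.pending.append((path, op))

    def flush_if_due(self, now):
        """Return the whole batch once the delay has elapsed, else an empty list."""
        if self.first_seen is None or now - self.first_seen < self.delay:
            return []
        batch = self.pending
        self.pending = []
        self.first_seen = None
        return batch  # the caller would send this as a single HTTP transaction


coalescer = UpdateCoalescer(delay=2.0)
coalescer.record("Maildir/new/msg1", "create", now=0.0)
coalescer.record("Maildir/cur/msg1", "rename", now=0.5)
early = coalescer.flush_if_due(now=1.5)   # too soon, nothing is sent
batch = coalescer.flush_if_due(now=2.0)   # both updates leave together
print(early, batch)
```

The trade-off in a design like this is a small, bounded lag (the delay) in exchange for fewer round trips over a slow link.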
The email offering is described at:
http://easyco.com/mail/index.htm
Don't bother with any of the kernel-mode disconnected file systems. For those kinds of situations, the Unison file synchronizer is a good choice: it performs bidirectional synchronization and uses an efficient protocol that only needs to send differences and some checksums across the wire. It also detects conflicts and (optionally) lets you resolve them automatically. It works on UNIX/Linux, Windows, and MacOS.
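For anyone wondering how a bidirectional synchronizer can tell a plain update from a conflict using only checksums, here is a simplified Python sketch of the general idea. It is not Unison's actual algorithm or code; the function names `digest` and `plan_sync`, the action strings, and the use of SHA-256 are my own assumptions. It compares each replica's checksums against the state recorded at the last successful sync.

```python
import hashlib


def digest(data):
    """Checksum of a file's contents (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(data).hexdigest()


def plan_sync(archive, left, right):
    """Decide what to do with each path.

    Each argument maps path -> checksum; a missing path means the file
    does not exist there. `archive` is the state at the last sync.
    """
    plan = {}
    for path in sorted(set(archive) | set(left) | set(right)):
        base, a, b = archive.get(path), left.get(path), right.get(path)
        if a == b:
            continue                                # replicas already agree
        if a == base:
            plan[path] = "propagate right -> left"  # only the right side changed
        elif b == base:
            plan[path] = "propagate left -> right"  # only the left side changed
        else:
            plan[path] = "conflict"                 # both sides changed differently
    return plan


archive = {"a.txt": digest(b"v1"), "c.txt": digest(b"v1")}
left = {"a.txt": digest(b"v2"), "c.txt": digest(b"v2")}
right = {"a.txt": digest(b"v1"), "c.txt": digest(b"v3"), "d.txt": digest(b"new")}
print(plan_sync(archive, left, right))
```

Only checksums need to cross the wire to build the plan; file contents (or just their differences) are transferred afterwards for the paths that actually need propagating.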