What is the Best Remote Filesystem?
GaelenBurns asks: "I've got a project that I'd like the Slashdot community's opinion of. We have two distant office buildings and a passel of windows users that need to be able to access the files on either office's Debian server from either location through Samba shares. We tend to think that AFS would be the best choice for mounting a remote file system and keeping data synchronized, but we're having trouble finding documentation that coherently explains installing AFS. Furthermore, NFS doesn't seem like a good option, since I've read that it doesn't fail gracefully should the net connection ever drop. Others, such as Coda and Intermezzo, seem to be stuck in development, and therefore aren't sufficiently stable. I know tools for this must exist, please enlighten me."
It looks to me that both AFS and NFS are kind'a outdated. SAMBA 3 combines NTLMv2 or kerberos encrypted passwords. I like that.
What about drbd? Its a mirroring thing, like raid 1, over a network. This way, the data is syncronised, and all you have to do is mount/share the data from the nearest server, by whichever way you want. Try http://drbd.cubit.at/ this.
I think it can manage to re-sync everything when the network line comes back up, but I'm not sure.
Dirk stood in the Stanley
I'm sorry I can't address your question for good remote filesystems in the face of an unreliable network. My network has been relatively reliable and that's been a decreasing concern. Perhaps network reliability will be less of a concern for you, too, in future.
Lately, what I've been looking for is a remote filesystem that provides performance, security, flexibility, the latter in reference to being able to log into someone else's desktop machine and easily get my home directory mounted, whether from a big server up 24x7, or from my desktop.
Some have dabbled with DCE/DFS, but I've heard that's slowly dieing, ponderous to set up, performance suffers.
SFS looks intriguing, but I haven't heard pro or con about its performance. It appears to be secure and flexible.
NFS is an old friend and, yes, if the network or the server dies, a lot of local sessions will hang interminably 'NFS server not responding'. But, this doesn't happen as much as it did 5 years ago.
Right now we're running NFS v3, but the new NFSv4 looks like it has a better security model.
Finally (and you shouldn't even think about this if network reliability is an issue), simple block service like iSCSI looks promising as a way of interchangeably moving around from desktop to desktop and getting your same home directory no matter where you are. More, you could conceivably even get your own flavor of OS booting, be it Red Hat 9, Win2K, XP, Gentoo, etc. Don't know about its security; it's heavily dependent on a reliable, high-performance network, but looks like a good way to get the most storage for your dollar (NAS instead of SAN).
"Provided by the management for your protection."
The way we do it is that we have some underlying file store running on unix machines. At the moment we've got a couple Sun machines with large RAID arrays.
Then, to provide access to clients, we use Samba as a bridge to the Windows desktops and NFS for trusted linux clients; untrusted hosts can use SFTP or, if they just need read access, HTTP.
Having multiple storage nodes on multiple sites synchronized is a SAN, not client access, problem. NFS just doesn't provide multiple-node functionality. NFSv4 (link, link) may have some interesting features that could help; AFS was designed with multiple sites in mind and does intelligent caching and has other useful features over NFS but does have some limitations; and then there's things like IBM's Storage Tank which I haven't had a chance to look at properly yet.
Bottom line: If you have a flexible SAN infrastructure, you can use bridging nodes to provide access to the SAN tailored to whatever your clients require. The infrastructure is the hard part; with commodity packages like Samba client support is a much simpler seperate issue.
Then you don't have to syncronize.
If you haven't already installed SSH on a machine in both locations, do so.
Follow the "Setting up Samba over SSH Tunnel mini-HOWTO" by Mark Williamson . Then you can use the server on each side to share out the files on the other side and not even change anything about how your users do anything. It's very simple to set up. It's 3 steps on each side plus adding it into a log in script or mapping on the individual machines. So you should be ready in 5 minutes.
If you still want to syncronize, there are tons of tools to do that including Unison.
Frankly AFS is what you want and what you need. I used to work at a site with over 26,000 AFS users and it was a magical system. It is hard to setup, I'll grant you that, but only the first time. After you've got it down once it's old hat after that.
My biggest issue when I was setting it up was Kerberos integration, can be tricky but the guys on the OpenAFS mailing-lists are incredibly nice and knowledgable. Some other issues are daemons that like to write to user home dirs won't work real well unless you find a way to have them get an AFS token or Kerb ticket.
If I were you I would SERIOUSLY consider AFS, don't listen to those who would say it's old and outdated, because it's not. OpenAFS is being actively developed and new features are being added all the time.
Feel free to email me if you want and I'll discuss the advantages/disadvantages further or help you get resources to set up your AFS system.
Lustre is something we're looking at rolling out for user home directories. Although a few labs have 100TB+ file systems using it. You get redundant servers at all levels (which deals with the synchronization problems), and best of all, you can stripe all your existing disks to create one logical disk. Think LVM for network connected machines. It's pretty fast too.
Setting up Samba over SSH Tunnel
For a quick-and-dirty solution for one or two users, over a reliable connection, this might be sufficient, but for the poster's problem, it would be a nightmare.
TCP over TCP is a bad idea because it amplifies the effect of lost packets.. two or three dropped packets in a short period of time will result in a cascade failure as each TCP stream attempts to compensate for the loss.
You can find all the gory details here.
SSH port forwarding isn't "TCP over TCP" - the SSH client isn't simply sending the TCP packets over the wire, it is sending the contents over.
Suppose we have 2 computers, A and B, connected via SSH, and forwarding some service. A sends a block of data to B.
The sequence is NOT:
A packaged data into TCP packet.
SSH encrypts packet and packages it into another TCP packet
B receives SSH packet and acks it
B decryptes packet
B acks that packet.
The sequence IS:
A packages data into TCP packet
SSH receives and acks packet.
SSH encrypts PAYLOAD of TCP packet
SSH sends packet
B receives SSH packet and acks it
B extracts data.
B packages data into local TCP packet, sends it, acks it locally.
So you don't get into the cascade failure mode for TCP over TCP.
Now, if you use your SSH connection to forward PPP data over the wire - THEN you are getting into TCP over TCP because the SSH session is actually forwarding the PPP packets.
www.eFax.com are spammers
I think FAT16 is the best remote filesystem -- I like it best when FAT16 is as remote from myself as possible.
<sig>Guvf vf abg n frperg zrffntr
You're assuming that a remote filesystem is the only way to share files. But its only the most common and simplest. When you start talking about replication and version control (which you are, even though you don't use the terms) you need to consider a technology that directly supports these features. There's version control systems, databases, content management systems. Which is right for you? Without knowing more about the data you're dealing with, it's impossible to say.
As far as AFS documentation goes, I found the following documents useful when installing a new AFS cell/kerberos realm earlier this month.
a uqbg000.htm) provided step-by-step installation instructions for the AFS server and client. Having been an AFS user for the past 7 years did help a bit.
e rberosAFSInstall explains the changes to the quick start required to actually integrate kerberos 5.
First, the AFS quick start guide on openafs.org (http://www.openafs.org/pages/doc/QuickStartUnix/
Second, the quick start guide assumes you are using the kaserver included with OpenAFS. Everyone and their pet dog now recommends installing a real kerberos 5 daemon instead. We chose Heimdal 0.6. The new O'reilly book "Kerberos: A definitive guide" was invaluable for this. In order to put the two together, this impossible to find wiki page http://grand.central.org/twiki/bin/view/AFSLore/K
Finally, to get a pam login that gets both kerberos 4 (for AFS) and 5 tickets and tokens, we used pam-krb5afs (http://sourceforge.net/projects/pam-krb5/) for the login module.
Unfortunately, none of this is tied together in a single cohesive document and I'm still trying to organize my notes. Overall, I was able to get the kerberos realm and AFS up in about a day, while getting the pam module and openssh to play nicely took three to four days.
/ \
\ / ASCII ribbon campaign for peace
x
/ \