What is the Best Remote Filesystem?
GaelenBurns asks: "I've got a project that I'd like the Slashdot community's opinion of. We have two distant office buildings and a passel of Windows users that need to be able to access the files on either office's Debian server from either location through Samba shares. We tend to think that AFS would be the best choice for mounting a remote file system and keeping data synchronized, but we're having trouble finding documentation that coherently explains installing AFS. Furthermore, NFS doesn't seem like a good option, since I've read that it doesn't fail gracefully should the net connection ever drop. Others, such as Coda and Intermezzo, seem to be stuck in development, and therefore aren't sufficiently stable. I know tools for this must exist; please enlighten me."
It looks to me that both AFS and NFS are kind of outdated. Samba 3 supports NTLMv2 or Kerberos encrypted passwords. I like that.
I've got developers that need to have a consistent home directory over several Unix and Windows boxes - we're using Samba *and* NFS - an ugly system at best. I'm currently in the situation where I can start over, more or less, so I'm looking at better options. Any suggestions are appreciated.
You see, without that little doohicky, the universe stops.
http://propheteer.org
What about drbd? It's a mirroring thing, like RAID 1, over a network. This way, the data is synchronised, and all you have to do is mount/share the data from the nearest server, by whichever way you want. Try this: http://drbd.cubit.at/
I think it can manage to re-sync everything when the network line comes back up, but I'm not sure.
Dirk stood in the Stanley
I'm sorry I can't address your question for good remote filesystems in the face of an unreliable network. My network has been relatively reliable and that's been a decreasing concern. Perhaps network reliability will be less of a concern for you, too, in future.
Lately, what I've been looking for is a remote filesystem that provides performance, security and flexibility, the latter meaning being able to log into someone else's desktop machine and easily get my home directory mounted, whether from a big server up 24x7, or from my desktop.
Some have dabbled with DCE/DFS, but I've heard that it's slowly dying, is ponderous to set up, and suffers in performance.
SFS looks intriguing, but I haven't heard pro or con about its performance. It appears to be secure and flexible.
NFS is an old friend and, yes, if the network or the server dies, a lot of local sessions will hang interminably with 'NFS server not responding'. But this doesn't happen as much as it did 5 years ago.
Right now we're running NFS v3, but the new NFSv4 looks like it has a better security model.
Finally (and you shouldn't even think about this if network reliability is an issue), simple block service like iSCSI looks promising as a way of interchangeably moving around from desktop to desktop and getting your same home directory no matter where you are. More, you could conceivably even get your own flavor of OS booting, be it Red Hat 9, Win2K, XP, Gentoo, etc. Don't know about its security; it's heavily dependent on a reliable, high-performance network, but looks like a good way to get the most storage for your dollar (NAS instead of SAN).
"Provided by the management for your protection."
Since the office buildings are distant, chances are that there is an untrusted connection between them. Don't forget to send data through secure tunnels (e.g. an SSH tunnel).
my cat's breath smells like cat food
CVS.
-- There are two kind of sysadmins: Paranoids and Losers. (adapted from D. Bach)
9p
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Then you don't have to synchronize.
If you haven't already installed SSH on a machine in both locations, do so.
Follow the "Setting up Samba over SSH Tunnel mini-HOWTO" by Mark Williamson. Then you can use the server on each side to share out the files on the other side and not even change anything about how your users do anything. It's very simple to set up. It's 3 steps on each side plus adding it into a login script or mapping on the individual machines. So you should be ready in 5 minutes.
If you still want to synchronize, there are tons of tools to do that, including Unison.
Frankly, AFS is what you want and what you need. I used to work at a site with over 26,000 AFS users and it was a magical system. It is hard to set up, I'll grant you that, but only the first time. After you've got it down once, it's old hat.
My biggest issue when I was setting it up was Kerberos integration, which can be tricky, but the guys on the OpenAFS mailing lists are incredibly nice and knowledgeable. Another issue is that daemons that like to write to user home dirs won't work very well unless you find a way to have them get an AFS token or Kerberos ticket.
If I were you I would SERIOUSLY consider AFS, don't listen to those who would say it's old and outdated, because it's not. OpenAFS is being actively developed and new features are being added all the time.
Feel free to email me if you want and I'll discuss the advantages/disadvantages further or help you get resources to set up your AFS system.
Lustre is something we're looking at rolling out for user home directories, although a few labs already have 100TB+ file systems using it. You get redundant servers at all levels (which deals with the synchronization problems), and best of all, you can stripe all your existing disks to create one logical disk. Think LVM for network connected machines. It's pretty fast too.
I was surprised to hear that where I work they are dropping NFS in favor of SMB.
The reason I was given was that SMB has better permissions/access rights across all platforms.
-greg
I advise against Linux Kernel-Samba, at least if you want your clients (be they workstations or servers) to have some uptime. After some days, possibly weeks, it randomly stops working, and all programs having open file descriptors on the Samba share hang. If you kill (-9) them, or the smbmount process, they go zombie. Any other program which tries to access the former mount point immediately goes zombie as well (your shell checking what's wrong, updatedb, ...). After several more days I have seen those zombie processes disappear again, however not always.
If you reboot daily anyway there shouldn't be any problem.
All in all not a satisfactory situation.
Tested with:
- Samba 2.2.3a (Debian Woody) as Server
- Kernel-Samba 2.4.* as Client
But perhaps I missed something...
Edgar
(note: by master/slave terminology I only mean that the master server is used more. Only AFS has a hierarchy where master/slave really matters)
AFS would be awesome... you see, sometimes these two offices need to work on the same files from both locations... not simultaneously, but sometimes consecutively. In those cases, it'd be great to have a setup that locally caches the file on the slave server, but will automatically serve the most recent version of the file, even if it has since been edited on the master server. With AFS, all of that is taken care of by the server, I believe.
Now, of course we could set up Samba networked drives, but then there would be no caching... a file would either be stored on the master or slave server, and if someone from another location wanted to work with the file, they'd have to redownload it every time. That would be an *alright* solution, but pretty inelegant, as far as I'm concerned. Linux is supposed to be good at this advanced server stuff, damnit.
So, finally, a question in response to your post: What happens to remote Samba connections when the net connection goes down? Is it a graceful timeout, or does it start crashing things like NFS?
Setting up Samba over SSH Tunnel
For a quick-and-dirty solution for one or two users, over a reliable connection, this might be sufficient, but for the poster's problem, it would be a nightmare.
TCP over TCP is a bad idea because it amplifies the effect of lost packets: two or three dropped packets in a short period of time will result in a cascade failure as each TCP stream attempts to compensate for the loss.
You can find all the gory details here.
You should be using a VPN if you have two offices and two firewalls, unless your Debian machines ARE your firewalls, in which case NFS or Samba would be fine. However, machines will still lock or be slow if the Internet gets slow or you drop a connection from one place to another.
On your win32 clients, set up PuTTY (use the latest dev version) with a tunnel to port 139 on your fileserver, then map the network drive on Windows as \\127.0.0.1\sharename
That's it! A free solution.
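For anyone with an OpenSSH client instead of PuTTY, roughly the same tunnel can be written as a command line. This is only a sketch: the host names, the user name and the helper `build_tunnel_command` are made up for illustration, and it assumes the standard `-N` (no remote command) and `-L` (local port forward) options of `ssh`. Binding a local port below 1024 normally needs administrative rights.

```python
import shlex

def build_tunnel_command(gateway, fileserver, user, local_port=139, remote_port=139):
    """Return the argv for an OpenSSH local port forward to an SMB server.

    -N : do not run a remote command, only forward ports
    -L : listen on 127.0.0.1:local_port and forward to fileserver:remote_port
    All host and user names passed in are hypothetical examples.
    """
    forward = f"127.0.0.1:{local_port}:{fileserver}:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{gateway}"]

if __name__ == "__main__":
    # Hypothetical gateway in the other office; adjust to your own setup.
    argv = build_tunnel_command("gw.office-b.example", "fileserver", "alice")
    print(shlex.join(argv))
```

Once such a tunnel is up, the share would be reached through 127.0.0.1, as in the PuTTY recipe.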
SSH port forwarding isn't "TCP over TCP" - the SSH client isn't simply sending the TCP packets over the wire, it is sending the contents over.
Suppose we have 2 computers, A and B, connected via SSH, and forwarding some service. A sends a block of data to B.
The sequence is NOT:
A packages data into TCP packet.
SSH encrypts packet and packages it into another TCP packet
B receives SSH packet and acks it
B decrypts packet
B acks that packet.
The sequence IS:
A packages data into TCP packet
SSH receives and acks packet.
SSH encrypts PAYLOAD of TCP packet
SSH sends packet
B receives SSH packet and acks it
B extracts data.
B packages data into local TCP packet, sends it, acks it locally.
So you don't get into the cascade failure mode for TCP over TCP.
Now, if you use your SSH connection to forward PPP data over the wire - THEN you are getting into TCP over TCP because the SSH session is actually forwarding the PPP packets.
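The difference can be shown with a toy relay. This is a minimal sketch, not how SSH is actually implemented: encryption is left out, everything runs on 127.0.0.1, and the function names are made up for illustration. The point is that the relay terminates the client's TCP connection, takes only the payload bytes, and sends them over a second, independent TCP connection, so no TCP segment is ever wrapped inside another TCP stream.

```python
import socket
import threading

def start_echo_server():
    """Stand-in for the forwarded service: echoes one message back."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))          # port 0 = let the OS pick a free port
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(4096))
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]

def start_relay(target_port):
    """Accept one client and copy its payload to the target service."""
    lsn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsn.bind(("127.0.0.1", 0))
    lsn.listen(1)

    def relay():
        client, _ = lsn.accept()        # TCP connection 1 ends at the relay
        upstream = socket.create_connection(("127.0.0.1", target_port), timeout=5)
        with client, upstream:
            payload = client.recv(4096)           # payload bytes only, no TCP headers
            upstream.sendall(payload)             # TCP connection 2 carries the payload
            client.sendall(upstream.recv(4096))   # reply travels back the same way
        lsn.close()

    threading.Thread(target=relay, daemon=True).start()
    return lsn.getsockname()[1]

def send_through_relay(message):
    """Send one short message to the echo service via the relay."""
    relay_port = start_relay(start_echo_server())
    with socket.create_connection(("127.0.0.1", relay_port), timeout=5) as s:
        s.sendall(message)
        return s.recv(4096)

if __name__ == "__main__":
    print(send_through_relay(b"hello over two separate TCP connections"))
```

Each of the two connections does its own acknowledgement and retransmission, which is why a lost packet on one leg does not trigger competing retransmit timers the way PPP-over-SSH would.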
www.eFax.com are spammers
I think FAT16 is the best remote filesystem -- I like it best when FAT16 is as remote from myself as possible.
<sig>Guvf vf abg n frperg zrffntr
Samba3 is an amazing piece of software, don't get me wrong. Yet it exists to play patty-cake with Windows, and neither the Windows nor the Linux side gets what it really wants. The NFS on the table doesn't look terrible, but what we have available now is pretty unusable. AFS, Coda, etc. probably aren't going to be a good solution either.
I am starting to get interested in whatever Novell has that can save us from this mess. Of course, something free would be best, some middle ground that any OS can implement without losing their own brand of authentication, roles, acls, file attributes, etc. Why this is still a problem for us creeping up on 2004 escapes me.
You're assuming that a remote filesystem is the only way to share files. But it's only the most common and simplest. When you start talking about replication and version control (which you are, even though you don't use the terms) you need to consider a technology that directly supports these features. There are version control systems, databases, content management systems. Which is right for you? Without knowing more about the data you're dealing with, it's impossible to say.
Worth noting that ClusterFS is advertising Lustre as a pre-1.0 product. Probably not a current option for anybody who can't afford a big support contract.
Then they can't use it in L.A.!
It sounds to me like you're trying to connect two servers on different locations, which then serve out the files out to the clients through samba. And the connection between those offices might drop.
Maybe it's worth considering Unison - it's built to run over SSH, and is like a two-way rsync. It keeps state on both sides, and you can set it up so it automatically/regularly updates both sides with the changes of the other side. There's a window of conflicting updates, that's true, but you'd also have that with Intermezzo or Coda when they're in disconnected mode. Additionally, Unison is completely userspace; it doesn't care about what filesystem it might be running on. And there are Windows/Mac OS X ports too, iirc.
And hey, it's only an apt-get away :) - it's in Debian.
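The "keeps state on both sides" idea, and where the conflict window comes from, can be sketched as a three-way comparison against the last synchronized state. This is not Unison's actual algorithm or archive format; the function name `reconcile` and the path-to-hash dictionaries are assumptions made for the example.

```python
def reconcile(archive, side_a, side_b):
    """Classify each path by comparing both replicas with the last synced state.

    Each argument maps path -> content hash; a missing key means the file
    does not exist there. Returns path -> "unchanged", "copy a->b",
    "copy b->a" or "conflict". A "copy" of a missing file means the
    deletion is propagated.
    """
    actions = {}
    for path in sorted(set(archive) | set(side_a) | set(side_b)):
        old, a, b = archive.get(path), side_a.get(path), side_b.get(path)
        if a == b:
            actions[path] = "unchanged"   # replicas already agree
        elif a == old:
            actions[path] = "copy b->a"   # only B changed since the last sync
        elif b == old:
            actions[path] = "copy a->b"   # only A changed since the last sync
        else:
            actions[path] = "conflict"    # both changed, in different ways
    return actions

if __name__ == "__main__":
    # Hypothetical file names and hashes, for illustration only.
    archive = {"report.doc": "h1", "notes.txt": "n1", "plan.txt": "p1"}
    office_a = {"report.doc": "h2", "notes.txt": "n1", "plan.txt": "p2"}
    office_b = {"report.doc": "h1", "notes.txt": "n2", "plan.txt": "p3", "new.xls": "x1"}
    for path, action in reconcile(archive, office_a, office_b).items():
        print(path, "->", action)
```

Only the last case needs a human: it is the window of conflicting updates mentioned above, when both offices edited the same file between two sync runs.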
All other "encrypting remote filesystems" encrypt only the file transfer, not the file storage (AFS or - if I understood the FAQ correctly - SFS). So the fileserver admin (or an intruder or trojan) is able to read served files in cleartext.
What's required is a remote filesystem where the clients do not need to trust the service nodes for data integrity and privacy. If I did not miss something (please tell me!), the only option nowadays is stacking a local encrypting filesystem on top of a remote one, e.g. NCryptfs on top of NFS or AFS.
...then I would consider building a SAN with replication. High end storage solutions using HDS and/or EMC gear fix this problem by enabling remote block for block copy of data between identical arrays. Veritas also makes a product called Volume Replicator that does effectively the same thing. By the sounds of it, this would be out of your price range, but it would do the job (we have a 15TB data centre mirrored using EMC's SRDF and another one using Volume Replicator).
In terms of free ways to do it, it will really depend on how sync'd the two offices need to be. If it's instantaneous, then you will need to have one master server and both sites pointing to it. Others have mentioned AFS, but that is also non-trivial. If the sync doesn't have to be instantaneous, then perhaps a regular rsync tunneled through SSH would do the trick. CVS may also help, depending on the data you have.
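As a rough illustration of the "regular rsync tunneled through SSH" option, the helper below only builds the command line; scheduling it (cron, for instance) is left out. The paths, host name and the function `build_rsync_command` are hypothetical. The flags used are standard rsync options: `-a` (archive mode), `-z` (compress), `-e ssh` (use SSH as transport) and `--delete` (remove files on the destination that are gone from the source).

```python
def build_rsync_command(src_dir, remote, dest_dir, delete=False):
    """Return the argv for a one-way rsync of src_dir to remote:dest_dir over SSH.

    The trailing slash on the source makes rsync copy the directory's
    contents rather than the directory itself.
    """
    cmd = ["rsync", "-az", "-e", "ssh"]
    if delete:
        cmd.append("--delete")            # mirror deletions too; use with care
    cmd += [src_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return cmd

if __name__ == "__main__":
    # Hypothetical share path and host name.
    print(" ".join(build_rsync_command("/srv/share", "backup@office-b.example", "/srv/share")))
```

Note that this is a one-way mirror: with `--delete`, whatever was changed only on the destination side is lost on the next run, so it suits a master/replica arrangement rather than two offices that both edit files.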
Back in the day, we were forced to use sneaker.net (TM). It worked quite well, even on MS-DOS workstations with 512k RAM and an 80286 processor, and it still works to this day. Reliability is so-so and speed can be poor, although with technological progress transfer rates can now be on the order of gigabytes per second; latencies remain large (tens of seconds up to several days). One downside was the propagation of viruses, but distribution of code across platforms by source and proper protected mode operating systems with selectable user privileges make viruses less dangerous.
Stick Men
As far as AFS documentation goes, I found the following documents useful when installing a new AFS cell/kerberos realm earlier this month.
First, the AFS quick start guide on openafs.org (http://www.openafs.org/pages/doc/QuickStartUnix/auqbg000.htm) provided step-by-step installation instructions for the AFS server and client. Having been an AFS user for the past 7 years did help a bit.
Second, the quick start guide assumes you are using the kaserver included with OpenAFS. Everyone and their pet dog now recommends installing a real Kerberos 5 daemon instead. We chose Heimdal 0.6. The new O'Reilly book "Kerberos: The Definitive Guide" was invaluable for this. In order to put the two together, this impossible-to-find wiki page, http://grand.central.org/twiki/bin/view/AFSLore/KerberosAFSInstall, explains the changes to the quick start required to actually integrate Kerberos 5.
Finally, to get a pam login that gets both kerberos 4 (for AFS) and 5 tickets and tokens, we used pam-krb5afs (http://sourceforge.net/projects/pam-krb5/) for the login module.
Unfortunately, none of this is tied together in a single cohesive document and I'm still trying to organize my notes. Overall, I was able to get the kerberos realm and AFS up in about a day, while getting the pam module and openssh to play nicely took three to four days.
/ \
\ / ASCII ribbon campaign for peace
x
/ \