What is the Best Remote Filesystem?
GaelenBurns asks: "I've got a project that I'd like the Slashdot community's opinion of. We have two distant office buildings and a passel of Windows users that need to be able to access the files on either office's Debian server from either location through Samba shares. We tend to think that AFS would be the best choice for mounting a remote file system and keeping data synchronized, but we're having trouble finding documentation that coherently explains installing AFS. Furthermore, NFS doesn't seem like a good option, since I've read that it doesn't fail gracefully should the net connection ever drop. Others, such as Coda and InterMezzo, seem to be stuck in development, and therefore aren't sufficiently stable. I know tools for this must exist; please enlighten me."
It looks to me like both AFS and NFS are kinda outdated. Samba 3 supports NTLMv2 or Kerberos authentication with encrypted passwords. I like that.
I've got developers who need a consistent home directory across several Unix and Windows boxes. We're using Samba *and* NFS, an ugly system at best. I'm currently in a situation where I can start over, more or less, so I'm looking at better options. Any suggestions are appreciated.
What about DRBD? It's a mirroring thing, like RAID 1, over a network. This way the data is synchronised, and all you have to do is mount/share the data from the nearest server, whichever way you want. Try this: http://drbd.cubit.at/
I think it can manage to re-sync everything when the network line comes back up, but I'm not sure.
I'm sorry I can't address your question for good remote filesystems in the face of an unreliable network. My network has been relatively reliable and that's been a decreasing concern. Perhaps network reliability will be less of a concern for you, too, in future.
Lately, what I've been looking for is a remote filesystem that provides performance, security, and flexibility. By flexibility I mean being able to log into someone else's desktop machine and easily get my home directory mounted, whether from a big server that's up 24x7 or from my own desktop.
Some have dabbled with DCE/DFS, but I've heard that it's slowly dying, ponderous to set up, and that performance suffers.
SFS looks intriguing, but I haven't heard pro or con about its performance. It appears to be secure and flexible.
NFS is an old friend and, yes, if the network or the server dies, a lot of local sessions will hang interminably with 'NFS server not responding'. But this doesn't happen as much as it did 5 years ago.
Right now we're running NFS v3, but the new NFSv4 looks like it has a better security model.
Finally (and you shouldn't even think about this if network reliability is an issue), simple block service like iSCSI looks promising as a way of interchangeably moving around from desktop to desktop and getting your same home directory no matter where you are. More, you could conceivably even get your own flavor of OS booting, be it Red Hat 9, Win2K, XP, Gentoo, etc. Don't know about its security; it's heavily dependent on a reliable, high-performance network, but looks like a good way to get the most storage for your dollar (NAS instead of SAN).
"Provided by the management for your protection."
Since the office buildings are distant, chances are that there is an untrusted connection between them. Don't forget to send data through secure tunnels (e.g. an SSH tunnel).
9p
Then you don't have to synchronize.
If you haven't already installed SSH on a machine in both locations, do so.
Follow the "Setting up Samba over SSH Tunnel mini-HOWTO" by Mark Williamson. Then you can use the server on each side to share out the files on the other side without changing anything about how your users work. It's very simple to set up: 3 steps on each side, plus adding it to a login script or mapping on the individual machines. So you should be ready in 5 minutes.
If you still want to synchronize, there are tons of tools to do that, including Unison.
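For anyone who wants to see roughly what such a tunnel looks like, here is a minimal sketch in Python that only builds the ssh command line. It is not taken from the mini-HOWTO. The host name, user name and local port 1139 are made-up placeholders; the only real pieces are ssh's `-N` (no remote command) and `-L` (local port forward) options and the Samba session port 139.

```python
import shlex


def samba_tunnel_command(gateway, user, local_port=1139,
                         remote_host="localhost", remote_port=139):
    """Build an ssh argv that forwards local_port to the remote Samba port.

    All default values here are illustrative assumptions, not values
    from the HOWTO. Binding a local port below 1024 would need root.
    """
    forward = f"{local_port}:{remote_host}:{remote_port}"
    # -N: do not run a remote command, -L: forward a local port
    return ["ssh", "-N", "-L", forward, f"{user}@{gateway}"]


if __name__ == "__main__":
    # Hypothetical gateway and account names, for illustration only.
    argv = samba_tunnel_command("office-b.example.com", "tunnel")
    print(shlex.join(argv))
```

Building the argv as a list rather than a string means it can be handed to `subprocess.run` without any shell quoting worries.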
Frankly, AFS is what you want and what you need. I used to work at a site with over 26,000 AFS users and it was a magical system. It is hard to set up, I'll grant you that, but only the first time. Once you've got it down, it's old hat.
My biggest issue when I was setting it up was Kerberos integration. It can be tricky, but the guys on the OpenAFS mailing lists are incredibly nice and knowledgeable. Another issue is that daemons that like to write to user home dirs won't work well unless you find a way to have them get an AFS token or Kerberos ticket.
If I were you I would SERIOUSLY consider AFS, don't listen to those who would say it's old and outdated, because it's not. OpenAFS is being actively developed and new features are being added all the time.
Feel free to email me if you want and I'll discuss the advantages/disadvantages further or help you get resources to set up your AFS system.
Lustre is something we're looking at rolling out for user home directories, and a few labs already have 100TB+ file systems using it. You get redundant servers at all levels (which deals with the synchronization problems), and best of all, you can stripe all your existing disks to create one logical disk. Think LVM for network-connected machines. It's pretty fast too.
I advise against Linux Kernel-Samba, at least if you want your clients (be they workstations or servers) to have some uptime. After some days, possibly weeks, it randomly stops working, and all programs with open file descriptors on the Samba share hang. If you kill (-9) them, or the smbmount process, they go zombie. Any other program that tries to access the former mount point immediately goes zombie as well (your shell checking what's wrong, updatedb, ...). After several more days I have seen those zombie processes disappear again, but not always.
If you reboot daily anyway there shouldn't be any problem.
All in all not a satisfactory situation.
Tested with:
- Samba 2.2.3a (Debian Woody) as Server
- Kernel-Samba 2.4.* as Client
But perhaps I missed something...
Edgar
Setting up Samba over SSH Tunnel
For a quick-and-dirty solution for one or two users, over a reliable connection, this might be sufficient, but for the poster's problem, it would be a nightmare.
TCP over TCP is a bad idea because it amplifies the effect of lost packets: two or three dropped packets in a short period of time will result in a cascade failure as each TCP stream attempts to compensate for the loss.
You can find all the gory details here.
You should be using a VPN if you have two offices and two firewalls, unless your Debian machines ARE your firewalls, in which case NFS or Samba would be fine. However, machines will still lock up or be slow if the internet gets slow or you drop a connection from one place to another.
"CVS is not the answer, CVS is the question - the answer is no!"
Can't remember where I saw that quote first (LKML??) but I think it sums things up quite nicely... :-)
SSH port forwarding isn't "TCP over TCP" - the SSH client isn't simply sending the TCP packets over the wire, it is sending the contents over.
Suppose we have 2 computers, A and B, connected via SSH, and forwarding some service. A sends a block of data to B.
The sequence is NOT:
A packages data into TCP packet.
SSH encrypts packet and packages it into another TCP packet
B receives SSH packet and acks it
B decrypts packet
B acks that packet.
The sequence IS:
A packages data into TCP packet
SSH receives and acks packet.
SSH encrypts PAYLOAD of TCP packet
SSH sends packet
B receives SSH packet and acks it
B extracts data.
B packages data into local TCP packet, sends it, acks it locally.
So you don't get into the cascade failure mode for TCP over TCP.
Now, if you use your SSH connection to forward PPP data over the wire - THEN you are getting into TCP over TCP because the SSH session is actually forwarding the PPP packets.
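To make the difference concrete, here is a toy relay in Python. It is only a sketch of the general idea (terminate the incoming TCP connection locally, then push the payload bytes over a second, independent TCP connection), not how SSH itself is implemented, and it does no encryption. The function names `pump` and `relay_once` are made up for this example.

```python
import socket
import threading


def pump(src, dst):
    """Copy payload bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            # Tell the other side that no more payload is coming.
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def relay_once(listener, target_addr):
    """Accept one client and relay its payload to target_addr and back.

    The client's TCP connection ends here; a separate TCP connection
    carries the bytes onward, so each leg does its own acks and
    retransmissions and no TCP segment is wrapped inside another.
    """
    client, _ = listener.accept()
    upstream = socket.create_connection(target_addr)
    back = threading.Thread(target=pump, args=(upstream, client), daemon=True)
    back.start()
    pump(client, upstream)
    back.join()
    client.close()
    upstream.close()
```

Forwarding PPP through the same kind of pipe is different, because the forwarded bytes would then themselves contain TCP segments with their own retransmission timers.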
I think FAT16 is the best remote filesystem -- I like it best when FAT16 is as remote from myself as possible.
<sig>Guvf vf abg n frperg zrffntr
You're assuming that a remote filesystem is the only way to share files. But it's only the most common and simplest. When you start talking about replication and version control (which you are, even though you don't use the terms) you need to consider a technology that directly supports these features. There are version control systems, databases, content management systems. Which is right for you? Without knowing more about the data you're dealing with, it's impossible to say.
OBDs run on top of ext3 (well, sort of: it's a hacked ext3, but basically it doesn't add any really new features on the journalling side).
Lustre is a lot more stable than it used to be :)
The failover is an "in development" feature. I know people who claim to be using it, but I wouldn't count on it working when you need it. It's just using clumanager (or similar) and a service start on the "failover" machines. It really doesn't do all that much, and requires some heavy scripting and hand-holding to get it to work at all.
It's a pretty good "in cluster" solution, but I wouldn't recommend it (today at least) as a remote filesystem option.
It sounds to me like you're trying to connect two servers on different locations, which then serve out the files out to the clients through samba. And the connection between those offices might drop.
Maybe it's worth considering Unison. It's built to run over SSH, and is like a two-way rsync. It keeps state on both sides, and you can set it up so it automatically/regularly updates both sides with the changes of the other side. There's a window for conflicting updates, that's true, but you'd also have that with InterMezzo or Coda when they're in disconnected mode. Additionally, Unison runs completely in userspace; it doesn't care what filesystem it might be running on. And there are Windows and Mac OS X ports too, IIRC.
And hey, it's only an apt-get away :) - it's in Debian.
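To show why keeping state matters for a two-way sync, here is a toy reconciliation step in Python. This is a simplified illustration of the general idea, not Unison's actual algorithm. The function name, the action strings and the use of plain dicts mapping path to a content fingerprint are all assumptions made for the example.

```python
def reconcile(base, side_a, side_b):
    """Decide what to do with each path in a two-way sync.

    base   : path -> fingerprint as of the last successful sync
    side_a : path -> fingerprint now on side A
    side_b : path -> fingerprint now on side B
    A missing key means the file does not exist on that side.
    """
    actions = {}
    for path in sorted(set(base) | set(side_a) | set(side_b)):
        old = base.get(path)
        a = side_a.get(path)
        b = side_b.get(path)
        if a == b:
            continue  # both sides already agree, nothing to do
        if a == old:
            # Only B changed (or deleted) the file since the last sync.
            actions[path] = "propagate B -> A"
        elif b == old:
            # Only A changed (or deleted) the file since the last sync.
            actions[path] = "propagate A -> B"
        else:
            # Both sides changed it differently: a human has to choose.
            actions[path] = "conflict"
    return actions


if __name__ == "__main__":
    last_sync = {"report.doc": "v1", "notes.txt": "v1", "plan.xls": "v1"}
    office_a = {"report.doc": "v2", "notes.txt": "v1", "plan.xls": "v2a"}
    office_b = {"report.doc": "v1", "notes.txt": "v1", "plan.xls": "v2b"}
    for path, action in reconcile(last_sync, office_a, office_b).items():
        print(path, action)
```

Without the saved snapshot from the last sync, a file that differs between the two offices could not be told apart from a genuine conflict, which is the window of conflicting updates mentioned above.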
...then I would consider building a SAN with replication. High end storage solutions using HDS and/or EMC gear fix this problem by enabling remote block for block copy of data between identical arrays. Veritas also makes a product called Volume Replicator that does effectively the same thing. By the sounds of it, this would be out of your price range, but it would do the job (we have a 15TB data centre mirrored using EMC's SRDF and another one using Volume Replicator).
In terms of free ways to do it, it will really depend on how sync'd the two offices need to be. If it's instantaneous, then you will need to have one master server and both sites pointing to it. Others have mentioned AFS, but that is also non trivial. If the synch doesn't have to be instantaneous, then perhaps a regular rsync tunneled through SSH would do the trick. CVS may also help, depending on the data you have.
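For the non-instantaneous case, a periodic rsync over SSH could look something like the sketch below. It only builds the command line. The paths, host and user are hypothetical, and the one-way direction (site A pushes to site B) is an assumption; the rsync options used (`-a`, `-z`, `-e ssh`, `--delete`) are standard ones.

```python
def rsync_push_command(src_dir, user, host, dest_dir, delete=False):
    """Build an rsync argv that mirrors src_dir to host:dest_dir over SSH.

    This is a one-way push. With delete=True, files that no longer
    exist in src_dir are removed on the destination as well, so only
    use it when one side is clearly the master copy.
    """
    argv = ["rsync", "-az", "-e", "ssh"]
    if delete:
        argv.append("--delete")
    # A trailing slash on the source copies the directory's contents.
    argv.append(src_dir.rstrip("/") + "/")
    argv.append(f"{user}@{host}:{dest_dir}")
    return argv


if __name__ == "__main__":
    # Hypothetical share path and host, for illustration only.
    print(" ".join(rsync_push_command("/srv/share", "sync",
                                      "office-b.example.com", "/srv/share")))
```

Run from cron every few minutes, this gives eventual rather than instantaneous consistency, and it does nothing about two people editing the same file at both sites.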
AFS was a nice filesystem to work with, but it took more to maintain it than our regular NFS mounts. The local (client-side) caching of files was nice though. So was the concept of having a master read/write volume and being able to then replicate that volume to read-only volumes, and replicating them only when we wanted to. So we could put new programs on the read/write volumes, test them out, and, when it all was tested, "release" the volumes, et al.
Access permissions are definitely different from your Samba/CIFS/NFS filesystems though. It's akin to Kerberos, where you have to have a "token", and your "token" has to have rights to the file in order to read the file. And "tokens" used to not be an obtain-once-and-use-forever thing. They expired every 24 hours, so every 24 hours you'd have to re-authenticate.
One thing that we found we didn't like (this was with AFS 3.3/3.4) was that the cache of files on the client machine was not encrypted. So if someone knew how the cache was structured, they could retrieve the files in the cache without having any AFS tokens (the cache exists on local disk, not in AFS space). This may have changed.
One other thing we had a problem with is when the AFS volume(s) would disappear from the client, and/or if the client lost contact with the cell AFS servers. The machine would become useless until it came back. This was all on UNIX (Sun, HP, SGI, BSD, Linux). Part of the problem was that /usr/local was in AFS space and contained most userland programs used.
afsd now refuses to start unless the cache directory is owned by root and chmod 600. As far as I know, the cache is still not encrypted, but if you can't trust root on the system, then you have bigger problems.
AFS is still nasty if you lose contact with the servers. That definitely will be a problem if /usr/local is remote. I have yet to see a network file system that can gracefully handle this situation.
/ \
\ / ASCII ribbon campaign for peace
x
/ \
Back in the day, we were forced to use sneaker.net (TM). It worked quite well, even on MS-DOS workstations with 512k RAM and the 80286 processor, and still works to this day. Reliability is so-so and speed can be poor, although with technological progress transfer rates can now be on the order of gigabytes per second. Latencies are large, though (tens of seconds up to several days). One downside was the propagation of viruses, but distribution of code across platforms by source and proper protected-mode operating systems with selectable user privileges make viruses less dangerous.
As far as AFS documentation goes, I found the following documents useful when installing a new AFS cell/Kerberos realm earlier this month.
First, the AFS quick start guide on openafs.org (http://www.openafs.org/pages/doc/QuickStartUnix/auqbg000.htm) provided step-by-step installation instructions for the AFS server and client. Having been an AFS user for the past 7 years did help a bit.
Second, the quick start guide assumes you are using the kaserver included with OpenAFS. Everyone and their pet dog now recommends installing a real Kerberos 5 daemon instead. We chose Heimdal 0.6. The new O'Reilly book "Kerberos: The Definitive Guide" was invaluable for this. To put the two together, this impossible-to-find wiki page, http://grand.central.org/twiki/bin/view/AFSLore/KerberosAFSInstall, explains the changes to the quick start required to actually integrate Kerberos 5.
Finally, to get a PAM login that gets both Kerberos 4 (for AFS) and 5 tickets and tokens, we used pam-krb5afs (http://sourceforge.net/projects/pam-krb5/) for the login module.
Unfortunately, none of this is tied together in a single cohesive document and I'm still trying to organize my notes. Overall, I was able to get the Kerberos realm and AFS up in about a day, while getting the PAM module and OpenSSH to play nicely took three to four days.