Distributed Filesystems for Linux?
Zoneball looked at three distributed filesystems. Here are his thoughts:
"OpenAFS was the solution I chose because I have experience with it from college. For performance, AFS was built with an intelligent client-side cache, but it did not handle network disconnects nicely. But there are other alternatives out there.
Coda appears to be a research fork from an earlier version of AFS. Coda supports disconnected operation. But the consensus on Usenet (when I looked into filesystems a while ago) was that Coda was still too 'experimental.'
InterMezzo looks like it was started with the lessons learned from Coda, but (again from Usenet) people have said that it is still too unstable and that it crashes their servers. The last 'news' on their site is dated almost a year ago, so I don't even know if it's being developed or not."
So if you were to recommend a distributed filesystem for Linux machines, would you choose one of the three filesystems listed here, or something else entirely?
This guy must have installed too many versions of the same Microsoft products. ... You can still configure your networking using scripts written for 2.0- or 2.2-based distros. You can often use 20-year-old programs under Unix, albeit sometimes with some effort.
In the GNU/Linux world, the BSD world, and to some extent the entire Unix world, good designs do not become obsolete. Even not-so-good designs often stick around for the sake of backward compatibility. In the newest, greatest Linux kernel, you can still have a.out support, NFS, Minix, and FAT16 filesystem support.
Only in the M$ world is obsolescence such a big issue, because that obsolescence is planned. In short, don't worry that much about obsolescence: if Coda is as good as it looks, it'll be there for a long time. If SomeCrappyDistributedFS FileSystem is used by enough users, it'll stay around for compatibility's sake anyway, even if it sucks.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
Naaaaaaaaaa.....
NFS is not distributed, it's only "networked" or "remote". It doesn't support any: replication, disconnection, sharing, distribution. It is centralised, requires the same user name/number space and security.
In short, it's far from meeting the requirements, at least if you compare it with the filesystems listed in the question.
sgis ddo ekil t'nod i
Samba is free and it will work with your M$ garbage if you need it to.
It's become such a part of my day to day life that I can't really describe the things I was missing before. The best things about it are probably the strong, flexible security and ease of administration. It also gives you everything you need from a small shop all the way up to a globally available decentralized data store.
There seems to be a good comparison here. I would strongly recommend AFS for all of your distributed filesystem needs. (The OpenAFS developers are cool too!)
LRC, the best-read libertarian site on the web
I think you should clarify what you mean by "distributed"... because that word is going to cause a lot of confusion.
If you want a few Linux boxes to all basically share a lot of files, so you can log into any one, do whatever, and only install stuff once... NFS is fine, as long as it's on a private network just for you.
NFS is not considered a "distributed" filesystem... but I'm not sure that's what you want anyway.
"For every complex problem, there is an answer that is clear, simple, and wrong." -- HL Mencken
'jfb
To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
... and the gaping wide security hole that is NFS.
... thanks ... my accounting data now!".
... about 15 years ago.
"Hello, I'm user ID 500 and I'd like my home directory
NFS doesn't actually have security. It hasn't since IP-capable machines became physically portable and, more importantly, since the assumption that every box would have a trusted admin became invalid.
KILL NFS, we need something that doesn't suck.
I know that this is going to be the most common answer, but just go with NFS.
This is what immediately came to mind for me too. Except for one thing. NFS is not a distributed filesystem. It's merely a network filesystem. The data itself actually resides only in one central place, and is not distributed in any way. Storage is not shared across machines, and therefore NFS is limited, in performance and redundancy, to the levels that single storage point represents. If it's an infinitely scalable, fault-tolerant machine, then the difference approaches academic. Otherwise, the fact that NFS is not really a distributed filesystem is an important distinction.
Of course it is. It gives you a single, unified view of a file system tree that can span many machines.
It doesn't support any: replication, disconnection, sharing, distribution.
Sure it does. Some of that functionality requires more than a vanilla NFS server, but that can be transparent to clients.
It is centralised, requires the same user name/number space and security.
Older versions did, current versions don't.
Don't get me wrong, NFS has lots of problems in many environments. But for networking a handful of machines in a home environment, it is nearly perfect. In fact, NFS version 2 is probably the best choice (the old one without all the security stuff). Furthermore, the alternatives (AFS, SMB, CODA, etc.) are harder to administer, perform worse, and have lots of problems with UNIX file system semantics.
But for a lot of applications, you simply don't need that much, and you've got some way to contain the security risks, and NFS can be enough. It's easy enough to set up, and if all you're *really* trying to do is make sure that everybody sees their home directory as /home/~user, and sees the operating system in the usual places and the couple of important project directories as /projecta and /projectb, NFS with an automounter and a bunch of symlinks for your home directories is really just fine. They hide the fact that users ~aaron through ~azimuth are on boxa and ~beowulf through ~czucky are on boxbc etc. And yes, there are times you really want more than that, and letting your users go log onto the boxes where their disk drives really are to run their big Makes can be critical help. But for a lot of day-to-day applications, it really doesn't matter so much.
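As a rough sketch of the automounter arrangement described above, the snippet below builds map entries in the style of an autofs indirect map ("key location"), one per home directory. The host names boxa and boxbc and the user ranges come from the post; the /export/home path, the range table, and both function names are assumptions made purely for illustration.

```python
# Sketch: generate autofs-style indirect map lines for /home.
# Hosts and user ranges are from the post; /export/home and the
# helper names are hypothetical.

def host_for_user(user, ranges):
    """Return the host whose (first, last) name range contains user."""
    for first, last, host in ranges:
        if first <= user <= last:
            return host
    raise KeyError(f"no host configured for {user!r}")


def home_map_lines(users, ranges, export_root="/export/home"):
    """Build one 'key<TAB>host:/path' map line per user, sorted by name."""
    lines = []
    for user in sorted(users):
        host = host_for_user(user, ranges)
        lines.append(f"{user}\t{host}:{export_root}/{user}")
    return lines


RANGES = [
    ("aaron", "azimuth", "boxa"),
    ("beowulf", "czucky", "boxbc"),
]

if __name__ == "__main__":
    for line in home_map_lines(["beowulf", "aaron"], RANGES):
        print(line)
```

With a map like this, users see the same /home path on every machine and never need to know which box actually holds their files.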
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
If anyone has root on ANY system or there are ANY non-unix systems, forget it.
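Actually, there are two sides to this problem:
1. Someone else has root on a machine whose disk I mount.
2. Someone else has root on a machine that mounts my disk.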
Problems caused by case 1 can be handled using the root squash functionality: basically, making all root-UID files on a filesystem which I mount lose their root ownership. This stops Joe Sixpack (who is root on a machine which hosts a remote disk) from copying a SUID-root program onto the remote disk, and then logging into my machine as a normal user, and running the program as a means to get root priv.
Case 2 can be handled by a sensible security policy: only export disks via NFS to machines where you are root. Otherwise, Joe Sixpack can mount your disk, and with his root priv look at anything he wants. However, he still won't be able to compromise your machine (although he may get information which leads to a compromise).
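The two cases can be illustrated with a small model of how a server using classic UID-based trust might treat the UID a client claims. This is a simplified sketch, not real NFS server code: the function names are hypothetical, 65534 is only an assumed anonymous UID (it is configurable), and group permissions are ignored for brevity.

```python
# Sketch: model of UID handling on an NFS export with root squashing.
# Not server code; names and the anonymous UID are assumptions.

ANON_UID = 65534  # assumed anonymous UID, often "nobody"


def effective_uid(claimed_uid, root_squash=True, anon_uid=ANON_UID):
    """Return the UID the server acts as for a request claiming claimed_uid."""
    if root_squash and claimed_uid == 0:
        return anon_uid  # remote root is mapped to an unprivileged user
    return claimed_uid   # every other UID is taken at the client's word


def can_read(file_owner_uid, file_mode, claimed_uid, root_squash=True):
    """Owner/other read check only (group bits ignored for brevity)."""
    uid = effective_uid(claimed_uid, root_squash)
    if uid == 0:
        return True
    if uid == file_owner_uid:
        return bool(file_mode & 0o400)
    return bool(file_mode & 0o004)
```

In this model, squashing stops a request that claims UID 0, but a request that claims UID 500 is still believed. That is why root on an untrusted client, who can become any local user, can still read that user's files.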
Problems with NFS are at more of a nuts and bolts level - portmap is legendary for having all manner of holes in it, and it remains vulnerable to packet sniffing. Tunneling NFS through an SSH link helps with the sniffing problem, but it creates security problems of its own (see the NFS HOWTO for details).
Tubal-Cain smokes the white owl.
CVS.
It's got powerful replication services, although manually run.
Disconnected operation is no big deal with CVS.
As for distributed file systems, make one system the CVS server. Make sure all your systems "cvs update" by a cron job that runs often. If the main server explodes, your next task is to set up a new server. Set up your DNS so cvs.whatever.com points to your current CVS server, and keep a hot standby ready. Change the DNS to point to the new CVS, CVS commit from any of the slave servers that were doing "cvs update" every 5 minutes, and you're up and running again. Could automate it, if I had enough problems to make it worthwhile.
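A minimal sketch of the periodic "cvs update" step, assuming a script that cron invokes every few minutes. The function names and the working-copy path are hypothetical; the only CVS options used are the global -q and -d and the update options -d and -P.

```python
# Sketch: what a cron-driven sync of a CVS working copy might look like.
# Function names and paths are hypothetical.
import subprocess


def build_update_command(cvsroot=None):
    """Return the argv list for a quiet 'cvs update'."""
    cmd = ["cvs", "-q"]
    if cvsroot:
        cmd += ["-d", cvsroot]      # global option: repository to talk to
    cmd += ["update", "-d", "-P"]   # -d: create new dirs, -P: prune empty ones
    return cmd


def sync_working_copy(workdir, cvsroot=None):
    """Run the update inside workdir; return True on a zero exit status."""
    result = subprocess.run(build_update_command(cvsroot), cwd=workdir)
    return result.returncode == 0


if __name__ == "__main__":
    # Assumed location of the checked-out tree on this slave machine.
    sync_working_copy("/srv/shared")
```

If the repository is addressed by a DNS name such as cvs.whatever.com, as the post suggests, the failover itself needs no change on the slaves: once the name points at the new server, the same job keeps working.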
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
Right!! OpenMosix is the solution.
Using MFS, you can just have one pool of disks, memory, and CPUs, and the processes will migrate to the data instead of the data being copied around.
Great system, once you settle on one version of the kernel (it has to be the same on all machines).
-Kz-
Everyone wants more OpenAFS documentation, and no one seems to be actively working on it. The volunteers don't seem to be tripping over each other to do it.
NFS still needs some cleaning up.
http://ebgp.net/ccc/