Distributed Filesystems for Linux?
Zoneball looked at three distributed filesystems; here are his thoughts:
"OpenAFS was the solution I chose because I have experience with it from college. For performance, AFS was built with an intelligent client-side cache, but did not support network disconnects nicely. But there are other alternatives out there.
Coda appears to be a research fork from an earlier version of AFS. Coda supports disconnected operations. But the consensus on Usenet (when I looked into filesystems a while ago) was that Coda was still too 'experimental.'
Intermezzo looks like it was started with the lessons learned from Coda, but (again from Usenet) people have said that it is still too unstable and it crashes their servers. The last 'news' on their site is dated almost a year ago, so I don't even know if it's being developed or not."
So if you were to recommend a distributed filesystem for Linux machines, would you choose one of the three filesystems listed here, or something else entirely?
NFS Linux FAQ
Howto #1
Howto #2
If you find yourself needing help, try asking people at Just Linux forums, or trying the NFS mailing list.
NFS + Automounter plus NIS and you get everything you ever wanted. NFS is fast, well known and documented and transparent.
I would use SFS, the Self Certifying File System. Assuming all the systems you are using are supported, it offers global, secure access to anything you care to export.
Since OpenAFS forked from the old Transarc/IBM codebase, it looks as if it has a real future. It's used by a load of educational and research institutions (notably CERN), as well as Wall Street firms.
Just make sure to secure it well if it'll be connected to or accessible through the internet.
I have NFS and I'm very happy about it.
> I can't think of anything funny or intelligent to say...
Unsubscribe, so you'll see the "new article coming up" warning and have a little lead time to think about it.
Sheesh, evil *and* a jerk. -- Jade
I'm sure others here will suggest NFS, but why not just go whole hog and set up your clients to boot off a server, then mount the same NFS filesystem? That way you get total transparency without having to make sure that NFS is always mounted.
Just my $0.02
Rus
Cheap UK and US VPS
explain dfs...
If some latency is acceptable, you could just set up cron to run rsync, or some other synchronization tool, every 5 minutes. Just don't forget to run an NTP server on your network, and synchronize the time on every computer that runs rsync. Otherwise you might lose data due to clocks being out of sync.
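A minimal sketch of the cron-plus-rsync approach described above. The host name `fileserver` and the `/home/` paths are placeholders, not anything from the original post. The helper functions only print the crontab entry, so it can be reviewed before being installed with `crontab -e`. The `-u` flag makes rsync skip files whose copy on the receiving side has a newer modification time, which is why unsynchronized clocks can leave the wrong version in place.

```shell
#!/bin/sh
# Sketch only: "fileserver" and the paths below are hypothetical placeholders.

# Print the rsync command that pushes SRC ($1) to DEST ($2) over ssh.
#   -a  archive mode (recursive, keeps permissions and timestamps)
#   -u  skip files that are newer on the receiving side
sync_cmd() {
    printf 'rsync -au -e ssh %s %s' "$1" "$2"
}

# Print a crontab entry that runs the sync every 5 minutes.
cron_line() {
    printf '*/5 * * * * %s\n' "$(sync_cmd "$1" "$2")"
}

cron_line /home/ fileserver:/home/
```

This assumes passwordless ssh keys are already set up between the machines, since cron cannot type a password.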
Check here for a good background on DFS. It also has a quick table comparison of the popular programs, and a walkthrough to set up Intermezzo.
We use PVFS at work to give us a high-performance network filesystem for use with our clusters.
http://parlweb.parl.clemson.edu/pvfs/
NT
I run an openmosix cluster with the openmosix filesystem here at work. Three computers.. no problems...
If you want to take a look..
http://lucifer.intercosmos.net/index.php
linkage and I am going to be placing some tutorials up. -joeldg
anime+manga together at last.. in real time.
DFS is just replication. DFS works in a number of ways; in its simplest form you could use rsync to achieve the same thing. A combination of NFS and rsync could be used to achieve its more complex form.
Samba works fine. I personally have approximately 5 samba mounts in my filesystem totally transparent for anybody who was to walk up and use my computer.
No need to unnecessarily complicate things here, samba is simple to set up and functions great.
Does NFS have trouble with user permissions, such as a user having different UID numbers on different systems? Is it secure enough to share a common passwd file?
You can't judge a book by the way it wears its hair.
You might have luck googling for "clustered filesystems" as well. Things like HP's CFS (no idea how good it is though).
Rus
Sounds like my ex-girlfriend.
I really have no idea what I am talking about.
...just to keep things simple. If you need redundancy, try changedfiles, it's a lot less of a hassle than intermezzo (IMO).
Acquiescence leads to obliteration
http://www.bebits.com/app/2304
it can be interfaced to a linux machine
This actually brings up a question I've been wondering about for a while. Does anyone have any solutions for a mirroring file system? Basically RAID 1 over a network.
What is a good stable solution for this? Currently I'm just using a tar over ssh once a night to do an incremental backup.
While there is no new news posted on the site, there are current tarballs on the ftp server, as recent as 5.9.03 (but that file appears to be a redux; the last update to the code seems to be 3.13.03).
The sourceforge page for the project (http://sourceforge.net/projects/intermezzo) shows status as production/stable but the info there looks stale too.
... and about 15 Karma points once the moderators are done with you.
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
This guy must have installed too many versions of the same Microsoft products. ... You can still configure your networking using scripts for 2.0- or 2.2-based distros. You can often use 20 year old programs under Unix, albeit sometimes with some effort.
In the GNU/Linux world, BSD world, and to some extent in the entire Unix world, good designs do not become obsolete. Even not-so-good designs often stick around, for the sake of backward compatibility. In the newest greatest Linux kernel, you can still have a.out support, NFS, Minix, FAT16 filesystem support.
Only in the M$ world is obsolescence such a big issue, because that obsolescence is planned. In short, don't worry that much about obsolescence : if Coda is as good as it looks, it'll be there for a long time. If SomeCrappyDistributedFS FileSystem is used by enough users, it'll stay around for compatibility's sake anyway, even if it sucks.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
I'll agree with the majority here, and say NIS+NFS is the way to go. But I'll deviate a bit and say use FreeBSD as your NIS/NFS server. It's dead simple to set up, and FreeBSD is rock solid. I have an NIS+NFS server here running FreeBSD, with Slackware boxes mounting from it. Works like a charm.
That should have read...
Format, Install Windows Server 2000 or 2003, Repeat
What you are looking for is 'autofs', which has been used extensively in solaris and linux for years (forever). You can set up an NFS share and then have autofs mount/unmount it on demand. The advantage is that if the share is not in use it's unmounted and the machine will be less vulnerable to hanging if the NFS server goes down. See the AutoFS Howto for more information on setting it up.
-- Greg
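To make the autofs setup described above a bit more concrete, here is a rough sketch of the two configuration lines involved. The server name `nfsserver`, the export `/export/projects`, the map file `/etc/auto.nfs` and the 60 second timeout are all made-up values for illustration. The functions just print the lines; you would paste them into `/etc/auto.master` and the map file yourself.

```shell
#!/bin/sh
# Sketch only: server name, export path, and mount directory are hypothetical.

# Line for /etc/auto.master: mount point, map file, idle timeout in seconds.
auto_master_line() {
    printf '%s %s --timeout=%s\n' "$1" "$2" "$3"
}

# Line for the map file: key (subdirectory name), mount options, NFS location.
auto_map_line() {
    printf '%s -fstype=nfs,rw %s:%s\n' "$1" "$2" "$3"
}

auto_master_line /mnt/nfs /etc/auto.nfs 60
auto_map_line projects nfsserver /export/projects
```

With lines like these in place, the first access to `/mnt/nfs/projects` triggers the NFS mount, and autofs unmounts it again once it has been idle for the timeout.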
Slashdot, would a spell-checker for posting be too much to ask? It's not rocket science!
This seems to be a good choice
Not Today
Un-news
It seems like a distributed filesystem might be overkill for your needs. If what you really want is the appearance of a single common machine, why not just pick one as a server, and set up your other boxes as X clients. You can even pull out most of their memory and storage, and stick it in the server, thus turning them all into pretty powerful machines.
Money I owe, money-iy-ay
Just so you all know. NFS is a network accessible FS. A DFS can also be network accessible from clients, but it physically resides on multiple systems.
You limited the possible set of answers to "for Linux." I'll ask the larger question: are there any good distributed filesystems? Good meaning: mature, stable, works well on at least one platform, and is as transparent as possible to that platform's software, within reason.
:)
Truth is, the only thing that resembles a distributed filesystem I have ever used is Domino. It does what I need quietly, efficiently and consistently. You can't open(...) the content you have stored from a C program (other APIs exist if you must) but maybe that expectation is what makes the existing DFSs on Linux suck...
Distributed filesystems are attempts to distribute things that most software assumes are only a couple of microseconds away, exist atomically, are not accessed concurrently and are permanently available. Clearly a tough problem domain!
Maw! Fire up the karma burner!
Samba is free and it will work with your M$ garbage if you need it to.
Ummmm... no.
:-)
Both DFS and dfs (depending on whether you're unix or windows) are more of a distributed hierarchy.
You see the same file system wherever you are regardless of what systems happen to hold the data. You can choose to replicate data between said systems and if you're doing things right the dfs system will direct clients to the "copy" of the data that is closest.
Oh, and NFS can lick my swalls
Which offers the best stability and protection from future obsolescence?
The best protection from future obsolescence is to use something that is already obsolete.
If you have a lot of time to invest in setting everything up, training all your users in AFS, and administering a complex system, then AFS is a great choice. It provides more flexible ACLs and is generally more secure than NFS (if a user hacks root on your system, they can't compromise the AFS volumes without obtaining a token).
:)
OTOH, having administered AFS installations in the past, I would steer clear of AFS unless you really understand it and are willing to make the investment in time and personnel to make it work for you.
NFS (optionally +NIS) is a tried-and-true solution; it's dirt-simple to set up if security is not paramount and there's a plethora of documentation for it on the Web (i.e., for free). Every UNIX I've ever used had some sort of NFS client, if not a server, built in. Most Linux distros come with NFS clients and servers prepackaged and ready to go -- if you go with AFS, you'll have to install stuff on every box and handle patches, upgrades, etc., through a separate process. And there are nice Windows clients that talk to NFS (or you can run Samba to go the other way) (or both, if you're a masochist).
Plus, most of your fellow slashdotters agree that NFS is the way to go.
It's become such a part of my day to day life that I can't really describe the things I was missing before. The best things about it are probably the strong, flexible security and ease of administration. It also gives you everything you need from a small shop all the way up to a globally available decentralized data store.
There seems to be a good comparison here. I would strongly recommend AFS for all of your distributed filesystem needs. (The OpenAFS developers are cool too!)
LRC, the best-read libertarian site on the web
I just went through this process a few weeks ago and I must say I'm really glad I went through the trouble of setting it up...it's very cool. I actually wrote a tutorial about how to accomplish this by using NIS and NFS. I hope you find it helpful.
The only trouble you might run into with the setup I used is some file-locking issues with programs wanting to share the same preference files.
--It's Pimptastic!--
So, you know who I am. You forgot to close the bold tag.
Informative? What the hell?
I run a large 70+ machine mixed network. We run NIS and automount everything everywhere.
It has to be one of the simplest configurations, and most transparent to the users.
Things you want to read up on are "NIS" and "ypmake" and stuff like that.
Some may say it is insecure, I don't think so. I don't know your environment. Mine is a business - we employ professionals - we fire jackasses. All access to our network is firewalled from the rest of the world.
-Duane.
I'm amazed to see no mention of the widely used and rock solid GFS, brought to you by the same people who did LVM. GFS was written for Linux and is POSIX compliant. It was available under the GPL license once upon a time; unfortunately that's changed. Doesn't cost too much though.
I have exactly the same problem here at home, except I've thrown a couple of laptops into the mix. The solution that I've come up with is to use Unison to synchronize directories between machines. The big advantage is that Unison is as simple as it gets. It just plain works. It doesn't matter what the filesystem, network reliability, or even operating system is (it works on Win32 too).
Set up a cron job to unison the home directories over an SSH link at a regular interval. Not only do you have a distributed filesystem, but every client has a complete copy also.
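A rough sketch of what such a cron job could look like. The host `homeserver`, the directory `/home/alice` and the 15 minute interval are invented for the example. The script only prints the crontab entry rather than installing it, and it assumes Unison is installed on both ends with ssh keys already set up.

```shell
#!/bin/sh
# Sketch only: the host "homeserver" and the user directory are hypothetical.

# Print a Unison command that syncs a local directory ($1, absolute path)
# with the same path on a remote host ($2) over ssh. In a Unison root,
# ssh://HOST//path (double slash) means an absolute path on the remote side.
#   -batch  ask no questions; conflicting changes are skipped, not overwritten
unison_cmd() {
    printf 'unison %s ssh://%s/%s -batch\n' "$1" "$2" "$1"
}

# Crontab entry that runs the sync every 15 minutes (interval is arbitrary).
printf '*/15 * * * * %s\n' "$(unison_cmd /home/alice homeserver)"
```

Because `-batch` skips conflicts instead of resolving them, it is still worth running Unison interactively now and then to deal with files changed on both sides.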
DFS is just replication. DFS works in a number of ways; in its simplest form you could use rsync to achieve the same thing. A combination of NFS and rsync could be used to achieve its more complex form.
You should be flogged for suggesting N'SYNC!
The problem with these distributed file systems seems to be that they're either pretty old and lacking features like disconnected operation (AFS) or seem to be unstable or, even worse, unmaintained (Intermezzo, Coda).
For many simple purposes backups can be done quite nicely using rsync or something like Bacula. For laptop/notebook support Unison is definitely worth a look. It syncs directories like rsync does, but in both directions. Works nicely for me.
time is a funny concept
Anyone with a desktop and a laptop they want to keep in sync definitely needs Unison. This is one of the coolest tools I found after I picked up a laptop.
I've spent quite some time researching this issue for here at work. We have two primary offices, separated by a 256k of network topology. Too slow for most users to find acceptable (large files, several 10s of seconds to copy). A bit of a culture problem but oh well.
I looked into a whole pile of options for having a "live" filesystem, a la NFS, but the bandwidth killed interactivity (this is for users who've never used anything slower than 100mbit network filesystems before).
I found the following:
1. Windows 2000 Server includes a thing called "File Replication Service". Basically, it's a synchronisation service. You replicate the content to many servers, and the service watches transactions on the filesystem, and replicates them to the rest of the mirrors as soon as it can. You can write to all mirrors, but I never quite worked out how it handled conflict resolution.
A chapter from the Windows 2000 Resource Kit that describes how it works: http://www.microsoft.com/windows2000/techinfo/reskit/samplechapters/dsdh/dsdh_frs_tkae.asp
2. Some people have done similar work for Unix systems, but it mostly involves kernel tweaks to capture filesystem events. Can't remember any URLs, but some Googling should find it.
3. Some people are using Unison to support multi-write file replication. So long as you sync regularly, you shouldn't have too many problems.
4. The multi-write problem is a hard one, so most people tend to say "don't do it, just make the bandwidth enough". This is the way to go if bandwidth isn't an issue.
A guy by the name of Yasushi Saito has done quite a bit of research into data replication. Some papers (search for them on google in quotes). He also put together a project called "Pangaea" which tries to do as described above. It wasn't great last time I looked. Some paper titles:
- Optimistic Replication for Internet Data Services
- Consistency Management in Optimistic Replication Algorithms
- Pangaea: a symbiotic wide-area file system
- Taming aggressive replication in the Pangaea wide-area file system
There is also a bunch of other research work:
- Studying Dynamic Grid Optimisation Algorithms for File Replication
- Challenges Involved in Multimaster Replication (note: this talks about Oracle database replication)
- Chapter 18 of the Windows 2000 Server manual describes the File Replication Service in detail
- How to avoid directory service headaches (talks about not having multi-master-write replication and why)
My university uses AFS as well. From a user standpoint, once everything is set up it works great. They've got it seamlessly integrated into all the Windows, Linux, and Solaris boxen on campus using OpenAFS and Kerberos.
I had no complaints with it at all, until I tried to get a FreeBSD machine working with AFS. For starters, OpenAFS doesn't have a FreeBSD port. I've heard rumors of one in the works, but I haven't seen anything useful in the last year. I did stumble across a project called arla however, which allowed me to at least mount /afs. Unfortunately, AFS uses Kerberos IV authentication. This would be fine, except my school's Kerberos server only hands out v5 tickets. I have yet to find a Kerberos implementation that correctly finds my school's 524 server and actually gets user permissions on AFS space. That's where I'm stuck right now.
Like I said, works great in Linux, Solaris and Windows, but beware if you try with a BSD.
security wise this would be a nightmare to watch over..
"Consider how lucky you are that life has been good to you so far. Alternatively, if life hasn't been good to you so far
Plan 9 gives you a different perspective and it is interesting.
That's like saying "jumping off a cliff is not the most intelligent thing to do." NFS is easily the LEAST secure option of ANY filesharing system.
NFS is only appropriate on a 100% secured(physical and network-level) network. If anyone/someone can plug in, forget it. If anyone has root on ANY system or there are ANY non-unix systems, forget it. If ANY system is physically accessible and can be booted off, say, a CDROM, forget it. The only major security tool at your disposal is access by IP, which is pathetic. Oh, and you can block root access.
Even though you can block root access for some/all clients, it's still massively insecure, and this remains NFS's greatest problem. You have zero way of authenticating a system. NFS is like a store where you could walk in, pick up any item you wanted, and say "I'm Joe Shmoe, bill me for this!" and they'd say "Right-o!" without even looking at you. All systems with the right IPs are explicitly trusted, and their user/permissions setups are also explicitly trusted.
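Since the server takes each client's UID numbers at face value, one practical sanity check is to compare the passwd files of two machines and flag accounts whose name matches but whose UID does not. The sketch below is just an illustration of that idea; the function name and the sample file paths are made up, and it only looks at local passwd-format files (it knows nothing about NIS or LDAP).

```shell
#!/bin/sh
# Sketch only: uid_mismatches is a hypothetical helper, not a standard tool.

# Compare two passwd-format files (name:passwd:uid:gid:...) and print every
# user name that appears in both files with a different UID.
uid_mismatches() {
    awk -F: '
        NR == FNR { uid[$1] = $3; next }      # first file: remember UID per name
        ($1 in uid) && uid[$1] != $3 {        # second file: same name, other UID
            printf "%s: %s vs %s\n", $1, uid[$1], $3
        }
    ' "$1" "$2"
}

# Example: uid_mismatches /etc/passwd /tmp/passwd.from-other-host
```

Any account it reports would own a different user's files on the NFS export, which is exactly the kind of implicit trust the post above complains about.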
NFS is a pretty good performer, especially when tuned right and on a non-broken client (which linux is VERY far from). However, its entire security model is in dire need of a complete overhaul. There needs to be a way to authenticate hosts, for one, more similar to WinNT's domain setup, which is actually incredibly intelligent (aside from the weak LANMAN encryption). The administrative functionality in NFS can't compare to the features that have been available to MacOS and Windows administrators for over a decade, and it's purely embarrassing.
Either that, or AFS/Coda need to get a lot more documentation and (for Coda)implementation fixes. The unix world desperately needs a good filesharing system...
Please help metamoderate.
Other options like LDAPS and Kerberos offer at least some form of security.
ypcat, then a brute force attack on the resulting passwd file, is as old as dirt, and sadly still works. I was a bit disappointed when I saw NIS as a required service on the Redhat cert syllabus.
This may sound harsh, but I don't think there is much excuse for running NIS in this day and age. Anyone who does this in an environment where security is a concern deserves what they get.
Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
its FUNNY cause it BASHES MICRO$OFT .......... hahahahahahahahahahahahahahaha!!!!!
I had more or less the same basic requirements and I opted for AFS.
My needs were a little more demanding (had to be implemented in GNU/Linux, Solaris, AIX, HP-UX and as an extra Windows 2000) and grokking AFS can be difficult at first but it was the best choice by far. Stable across all the Unices, very secure (this was another requirement) and integrates perfectly with our Kerberos Domain and LDAP accounting info. It provides a unique namespace that can span multiple servers transparently, does replication, automatic backups and read-only copies, client-side cache with callbacks, has a backup (to tape) system that can be used stand-alone or integrated with existing backup structures (Amanda, Legato, TSM) AND was the basis for the DCE filesystem, DFS (as a side note I find it interesting - and sad - that most things people try to emulate these days are present in DCE, and Windows 2000 got many of the "new features" from a technology initially made for Unix: DFS, DCOM, Directory Services, SSO, DCE-RPC, etc.)
AFS is amazing and much more robust than any distributed filesystem I know of; it has shortcomings when servers time out, but apart from that it's really an excellent solution; an example I generally use to give an idea of some of the good features of AFS is a relocation of a home directory to another server. The user doesn't even notice that his home directory was moved to another server *even if he was using it and was writing stuff to disk*; at most all writing calls to his home dir have a small delay (a couple of seconds) even if his/her home dir was 5 Gb worth.
Kerberos integration is an added bonus; if you can, use this as an excuse to kerberize your systems and form a Kerberos Domain. If you don't want to, just stick with the standard AFS KA server.
In my setup I have Windows users accessing their home dirs in AFS using the Kerberos tickets they have from the Windows login and the fact that a cross-realm trust was made between the Unix Domain and the AD; they can edit all the files they are entitled to with that ticket, and the system is so secure that Transarc used to put the source code in its public AFS share and added the customers that bought the source to the ACL of the directory that contained it.
With all these features it would be hard not to vividly recommend OpenAFS as the best solution for a unified, distributed filesystem. Bandwidth utilization is, in my experience, at most half of what NFS uses, which is an added bonus.
cheers,
fsmunoz
The guy wants to be able to do things like disconnected operation and file sharing over a WAN. NFS is totally unsuitable for either of those as it provides neither distributed file service (if the server you are getting a file from goes down you lose) or disconnected operation.
NFS is also not a distributed/global file system. It is a pretty primitive way to handle global namespace management compared to stuff like AFS. At best what an automounter lets you do is avoid a few of NFS's problems. Ideally, I'd say this guy should try to see if he can get the U. Michigan disconnected AFS stuff out of Honeyman and company and see if he can port it to OpenAFS.
It's been featured on slashdot before.
Non-Linux Penguins ?
But nothing the guy asked about requires "distributed". His post really sounds like he means "networked"
Oracle published this for free use on Linux a few months ago, I believe. Without a SAN it will only work for two servers though. Still, it is a true dfs unlike NFS, CIFS, or AFS (btw, if all you are looking for is a network file system, AFS rocks).
Check www.oracle.com for info and a good tutorial on OCFS.
making it really secure?
Or just take your disk around with you.
rsync -e ssh -azu --bwlimit=500 --stats --exclude='/proc' --exclude='/dev' / targetsystem:/targetdir/
-e selects the remote shell, so -e ssh means use ssh
-a archive mode (see docs)
-z compression: if you have more CPU than pipe, use it. But if you are on a LAN, you probably want to leave it off unless you don't mind the CPU hog (fat pipes will use more CPU time for compression)
-u update only (don't overwrite newer files)
--stats shows you what it did when it is done
--exclude leaves off paths/files you want to skip
--bwlimit is in KBps; from my experience, put half of what you want your max to be.
Ryan
I think you should clarify what you mean by "distributed"... because that word is going to cause a lot of confusion.
If you want a few linux boxes to all basically share a lot of files, so you can log into any one, do whatever, only install stuff once... NFS is fine, if it's just on a private network just for you.
NFS is not considered a "distributed" filesystem... but I'm not sure that's what you want anyway.
So if you were to recommend a distributed filesystem for Linux machines, would you choose one of the three filesystems listed here, or something else entirely?
Most people will probably talk about your ideas and also NFS, SMB etc., but you may also take a look at the Freenet Project. You can make your own private network, with everything transparently distributed and redundant, with crypto, digital signatures, etc. on a bunch of connected PCs.
We use NFS every day, but just for very special circumstances. If you really understand how NFS works, then you will understand why NFS is just not a viable solution for anything large scale, or small scale for that matter.
:)
NFS is not secure. At most sites, NFS is exported read-only and limited to the domain, or to a given set of machine(s). If you export NFS as read/write then the client had better be secured, or you better use kerberos, and for damn sure better be behind a firewall. NFS has no client side cache, no volume location service, no ACLs, no authentication (unless kerberized), no replication, yada, yada, yada. We've used NFS sparingly for over 15 years because we know how it works, and know its limitations.
On the other hand, we set up an AFS cell for enterprise scale application and data sharing. It currently uses Kerberos V authentication, has volume replication, global namespace, client cache, fault tolerance. Users can set up their own groups, set their own ACL permissions. Did I say quota? AFS has per-user/per-volume quota. Hey, guess what, symbolic links work from any volume to any volume on AFS. And, AFS is just a simple daemon. You crank it up, mount the top of your cell and poof, you are done.
Another positive is the fact that once you setup an AFS cell you automatically become part of a larger community. Any AFS cell can mount the entire file system of another AFS cell within the same tree. I can for example mount many large university and government cells and share files. AFS allows Internet-wide file sharing with full security. On most versions of the client you can even enable encryption on the connection so your files won't be snooped easily.
All of our Solaris, Windows, Linux, and Mac boxes can use the same AFS tree without blinking an eye. We use AFS for many things. Before LDAP was really worth anything, we used AFS for simply exchanging read-only data. It -is- a replicated and global file system! Just put your config files in the tree and you are done.
If you are one of those people who are blinded by "always doing things one way", then I'd suggest you wake up and smell another technology, I did, and I liked what I got in return. Look into OpenAFS, you'll be glad you did.
+10,000 karma points!
I've never met you. But I want to hurt you.
Oh my god! People still use that antique technology? Just say no! Even Sun long ago gave up on NIS. Poor scaling, poor security. No way man, no how.
I'm vaguely sure this is a brand new affront to RMS, but I just can't put my finger on it.
Linux is the best. I like it a lot. I think that everyone should use Linux.
I was told by a very reliable source that Veritas will have a DFS solution for linux before first quarter of 2004. This is very good news.
smbmount //remote_computer/shared_shit /home/user/shared_shit
Wow, they now share the same mount point and it works with Windows too.
1) Your post doesn't seem relevant to mine at all
2) I AM using Gentoo.
3) WTF?
This is only really true of things that were widely used before being superseded. Something like an _experimental_ distributed FS could be dropped from the kernel because nobody could be stuffed updating it to track changes in 2.5 that had broken the code. Unless you had the skills to fix it and add it yourself, or could afford to pay somebody to do it, you'd be stuffed.
You can use it as a distributed file system. I think you'd need a bit of glue logic to make it look like one, though.
... and the gaping wide security hole that is NFS.
... thanks ... my accounting data now!".
... about 15 years ago.
"Hello, I'm user ID 500 and I'd like my home directory
NFS doesn't actually have security anymore, never has since IP-capable machines became physically portable but more importantly since the assumption that every box would have a trusted admin became invalid
KILL NFS, we need something that doesn't suck.
Why not stick with NFS for the time being?
I went through the "is coda right for me?" phase, and also "is intermezzo right for me?" and also spent tens of hours researching distributed filesystems and cluster filesystems online ... my conclusion is that the area is still immature. I will let the pot simmer for a few more years (hopefully not many), and use NFS in the meantime.
My situation: desire for scalable and fault-tolerant distributed filesystem for home use with minimal maintenance or balancing effort. Emphasis on scalable, I want to be able to grow the filesystem essentially without limit. I also don't want to spend much time moving data between partitions. And last but not least, the bigger the filesystem grows, the less able I will be to back it up properly. I want redundancy so that if a disk dies the data is mirrored onto another disk, or if a server dies then the clients can continue to access the filesystem through another server.
All that seems to be quite a tall order. I checked out coda, afs, PVCS, sgi's xfs, frangipani, petal, nfs, intermezzo, berkeley's xfs, jfs, Sistina's gfs and some project Microsoft is doing to build a serverless filesystem based on a no-trust paradigm (that's quite unusual for Microsoft!).
Berkeley's xFS (now.cs.berkeley.edu) sounded the most promising but it appears to be a defunct project, as their website has been dead ever since I learned of it, and I expect the team never took it beyond the "research" stage into "let's GPL this and transform it into a robust production environment". Frangipani sounds interesting also, and maybe a little more alive than xFS.
On the other hand coda, afs and intermezzo are all in active development. afs IMHO suffered from kerberitis, i.e. once you start using kerberos it invades everything and it has lots of problems (which I read about on the openAFS list every day). AFS doesn't support live replication (replication is done in a batch sense) either.
CODA doesn't scale and doesn't have expected filesystem functionality: for 80 gigs of server space I would require 3.2 gigs of virtual memory, and there's a limit to the size of a CODA directory (256k) which isn't seen in ordinary filesystems. There's also the full-file-download "feature". CODA is good for serving small filesystems to frequently disconnected clients but it is not good for serving the gigabyte AVIs which I want to share with my family.
Intermezzo is a lot more lightweight than CODA and will scale a lot better, but it's still a mirroring system rather than a network filesystem. I might use that to mirror my remote server where I just want to keep the data replicated and have write access on both the server and the client, but it's again not a solution for my situation.
The best thing about intermezzo is that it sits on top of a regular filesystem, so if you lose intermezzo the data is still safe in the underlying filesystem. CODA creates its own filesystem within files on a regular filesystem, and if you lose CODA then the data is trapped.
Frangipani is based on sharing data blocks, so like NFS it should be suitable for distributing files of arbitrary size. I need to look at it in a lot more detail; this is probably the right way to build a cluster filesystem for the long haul. For the short term, Intermezzo is probably the right way for a lot of people: it copies files from place to place on top of existing filesystems.
What I did in the end:
The way it works is tha
Samba DOES support UNIX permissions. Use the "cifs" client module, not the "smbfs" one, and enable UNIX extensions on smbd. It's not hard, and works well.
That said, I haven't tried it in real production yet. I do find it scary that a reverse-engineered MS protocol is now an option for UNIX<->UNIX network file access because NFS is so obsolete and crap that anything looks good in comparison.
Um.. Linix? Learn the name of your fucking operating system, to start off with. It's spelled L-U-N-I-X.
Linux Journal (I think) had a story by a guy who was using cvs to sync his home directory between work and home. I think he said he did commits and updates every few days, or when he got tired of things being out of sync. For what he wanted, consistent config files and so on with little hassle, and the ability to intelligently merge differences if necessary, it worked well enough for him.
In the medium term, however, I think WebDAV will become a better option, because it can be served and accessed with standard web servers and clients, in addition to being mappable onto the file system.
The Linux kernel already has WebDAV support (CODA hooks plus some user-mode process), although I'm not sure how well it works.
I wish there was one that would work with any or all Linux kernels. I.e., you would not have to compile for your particular kernel. E.g., version x.y of the so-and-so distributed file system will work with Linux kernel 2.4.23 only is what I see. I'd like to see a distributed file system that would still work after upgrading the kernel on my system.
E.g., create a file/block that you mount as a EXT2 file system and that file/block would really be updated by the distributed file system software.
"ex girlfriend": because you are a fat oreo eating fucking slob that has no money and lives at home. you stupid fat fuck.
YHBT. YHL. HAND.
But for a lot of applications, you simply don't need that much, and you've got some way to contain the security risks, and NFS can be enough. It's easy enough to set up, and if all you're *really* trying to do is make sure that everybody sees their home directory as /home/~user, and sees the operating system in the usual places and the couple of important project directories as /projecta and /projectb, NFS with an automounter and a bunch of symlinks for your home directories is really just fine. They hide the fact that users ~aaron through ~azimuth are on boxa and ~beowulf through ~czucky are on boxbc etc. And yes, there are times you really want more than that, and letting your users go log onto the boxes where their disk drives really are to run their big Makes can be critical help. But for a lot of day-to-day applications, it really doesn't matter so much.
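As a rough illustration of the setup described above, here is a small sketch that generates indirect automounter map entries from a table of which server holds which range of users. The host names boxa and boxbc and the user ranges come from the post; the export path /export/home, the "-rw" option, and the function names are assumptions made for illustration, and the exact map syntax should be checked against your automounter's documentation.

```python
# Sketch: build indirect automounter map lines for home directories.
# Assumed (not from the post): export root /export/home, options "-rw".

def server_for(user, ranges):
    """Return the host whose (first, last) name range contains user."""
    for first, last, host in ranges:
        if first <= user <= last:
            return host
    raise KeyError(f"no server configured for {user!r}")

def home_map(users, ranges, export_root="/export/home", options="-rw"):
    """One map line per user: key, mount options, host:/path."""
    return [
        f"{user}\t{options}\t{server_for(user, ranges)}:{export_root}/{user}"
        for user in sorted(users)
    ]

# Ranges taken from the post: ~aaron..~azimuth on boxa, ~beowulf..~czucky on boxbc.
RANGES = [("aaron", "azimuth", "boxa"), ("beowulf", "czucky", "boxbc")]

if __name__ == "__main__":
    for line in home_map(["beowulf", "aaron"], RANGES):
        print(line)
```

Users never see the host names; they only see their home directory appear at the same path on every machine.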
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
it will take a while ... sadly so will NFS.
... at least linux has all these "research" FS'es
There seems to be no decent replacement for NFS on BSD
Watch for the new version, NFSv4. There is already a sample implementation in the Linux 2.5 tree. NFSv4 will address most of the problems that NFSv3 and others have, including pluggable security models, namespace, and revamped ACL handling.
It's also WAN friendly, letting several operations be done at the same time with a single directive. (COMPOUND directive) It also allows you to migrate one filesystem to another with no stale filehandles. Basically, it's trying to be an AFS killer.
For more information, take a look at
http://www.nfsv4.org/
Lots of good info including the IETF spec. It's an interesting read.
The spec is not quite complete. Currently, I believe there are discussions about how NFSv4 will work with IPsec.
Cheers,
sri
I've been partial to GPFS (General Parallel File System) on AIX. IBM's also got a version for Linux... probably RH specific. There's also JFS, which works alright... a bit more development there would be a plus. GPFS works great on Linux clusters!
To do a true backup, you must copy permissions. To copy permissions, the target system needs to have the same UIDs and GIDs as the source system. This is hard to do on Windows and OS X. Typical tools such as rsync, Unison and rdiff-backup make no effort to solve this problem. Suggestions?
PVFS, you fucktard ... there's no PVCS
Doesn't samba-tng support a true DFS, exporting a virtual file system (combination of shares from multiple systems)?
Anyone using it?
http://shfs.sourceforge.net/
This has some potential.
CVS.
It's got powerful replication services, although manually run.
Disconnected operation is no big deal with CVS.
As for distributed file systems, make one system the CVS server. Make sure all your systems "cvs update" by a cron job that runs often. If the main server explodes, your next task is to set up a new server. Set up your DNS so cvs.whatever.com points to your current CVS server, and keep a hot standby ready. Change the DNS to point to the new CVS, CVS commit from any of the slave servers that were doing "cvs update" every 5 minutes, and you're up and running again. Could automate it, if I had enough problems to make it worthwhile.
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
the enhanced network block device...
/dev/ndb0 or whatever it calls itself.
divvy up the SCSI disk on a fast machine (preferably not your NFS server), and export the partition with this. The server portion of nbd is a user-space daemon.
Then make sure nbd (client portion) is enabled in your netboot kernel, and right before you swapon (but after you use BOOTP and bring up the net) attach your swap device from the server. Then you can swapon
Magic.
Black holes are where the Matrix raised SIGFPE
After banging my head on the wall and doing a lot of research into various options for creating a single cross platform distributed file system, I stumbled onto OpenAFS. I'm getting ready to set up a test cell on a spare Linux box (the requirement in the documentation to dedicate an entire partition just to AFS initially gave me serious pause, especially for my production server).
In the meantime I'm left with Samba and NFS, and I'm not happy with either. Samba seems to lack a lot of the features that I want. NFS provides most of what I want (I don't want much), but has strange issues. I really want something that can work on Windows, Linux and OS X so that I can have one method across all platforms and eliminate a few services.
Has anybody else experienced connection delays when connecting a Linux box to an NFS share on an OS X machine? I get a mysterious 3 minute delay every single time on every Linux box, and no amount of Googling seems to yield any answers.
Hmph, I guess because there is continual talk of re-implementing Coda, the codebase must not be too hot.
Every time I want to switch to a better network file system, I read about problems with corrupted files or mysterious crashes and get scared.
Then I come slinking back to NFS, which hasn't done something like that to me in at least 5 years or more. The only real problems I've had are when so-and-so's V3 implementation doesn't want to talk to this other V3 implementation, or read/write sizes. Been rock-solid when it is actually moving bits.
Of course, now that I think about it, NFS has been rock-solid because the fileservers have been rock-solid. If they crash, then everybody's sorry.
Wish I was using a DFS that supported disconnected operation...
Hey, stop that! You're only supposed to trollhound me! I'm jealous... my wife won't fsck me anymore.
Can I bum a sig? I left mine at the office.
I have a filesystem that I need to keep synced between two servers.
For security reasons, the servers cannot route to each other. Is there an off the shelf system that will do something like rsync in a disconnected mode.
e.g.:
1) On one side generate a filesystem listing with checksums
2) Transfer listing to the other machine (floppy, cd, etc)
3) Use listing and filesystem to generate an archive of files to upload, plus a sync script
4) Transfer these two files to the original machine
5) Run the sync script. This will extract the tar, and delete files and folders not existing on the other server.
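The steps above can be sketched roughly as follows. This is not an off-the-shelf tool, just a minimal illustration of steps 1 and 3: build a checksum listing on each side, then compare the two to decide what goes into the archive and what the sync script should delete. The function names and the choice of SHA-256 are assumptions; building the tar file and the script itself is left out.

```python
import hashlib
import os

def build_manifest(root):
    """Map each relative file path under root to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest

def plan_sync(source, target):
    """Compare two manifests.

    source: manifest of the machine that has the current data
    target: manifest carried over from the machine to be updated
    Returns (to_copy, to_delete): paths that are new or changed on the
    source side, and paths that exist only on the target side.
    """
    to_copy = sorted(p for p, h in source.items() if target.get(p) != h)
    to_delete = sorted(p for p in target if p not in source)
    return to_copy, to_delete
```

The listing from step 1 is the `target` manifest; `to_copy` is what would go into the archive in step 3, and `to_delete` is what the sync script would remove in step 5.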
Granted, this is not something for Your home network, but CXFS looks like a good product. AFAIR the server is SGI only, but there should be clients for almost every OS out there.
.haeger
Yes, I know, it's not a distributed FS, but since so many people suggested NFS, I thought I'd point at another solution.
You are not entitled to your opinion. You are entitled to your informed opinion. -- Harlan Ellison
There's some reasoning behind the lack of big interest in distributed filesystems.
1) Obviously, NFS continues to be a passable solution where you don't really need "distributed" so much as "universally network accessible in a simple way".
2) For things where you truly want "distributed" access from multiple machines that are local to each other, there's a somewhat less complicated solution, which is to use shared storage. The idea is to attach all the machines to a SAN-style network (fiber channel, or hey even firewire these days) and use a sharing-aware filesystem that allows simultaneous access to the storage/filesystem from multiple hosts with sane locking and whatnot. One of the better places to look for this is otn.oracle.com - they've released a cluster-filesystem *and* drivers for firewire shared storage (which is cheaper than fiberchannel) for linux.
Of course, that leaves out the case of a distributed filesystem for machines that can't be on a SAN together for distance or economical reasons. In that case you could of course hack something up using cluster-filesystem type of filesystem and SCSI-over-IP or something like that I guess, or use one of the experimental distributed filesystems you mention... but the market just isn't big enough for this yet to drive much.
11*43+456^2
The Freeshell.org Unix shell/email provider uses a distributed filesystem to provide transparency across their various machines. After a few minutes of searching, I was not able to determine what their method is, but it's worth asking them about it, as Freeshell has over 10,000 users and high traffic.
My other
I don't think they did.
Why don't you try Linux FailSafe, It's GPL and available on SGI's web site. It can cluster applications as well as filesystems.
Use rsync. The default is to map user and group names at both ends of the connection, unless you specify --numeric-ids. Of course you have to have at least the names right, otherwise there's nothing to work with. And you need root privileges on the receiving end, but that's also to be expected.
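To make the difference between the two modes concrete, here is a toy model of the ownership decision. It is not rsync's code, only a sketch of the behaviour described above; the function name, the dictionaries, and the fallback to the numeric id when no matching name exists are written from my understanding of the rsync documentation and should be treated as assumptions.

```python
def map_owner(src_uid, src_names, dst_ids, numeric_ids=False):
    """Pick the uid to apply on the receiving side.

    src_names: uid -> user name on the sending side
    dst_ids:   user name -> uid on the receiving side
    """
    if numeric_ids:
        return src_uid                # keep the number, ignore names
    name = src_names.get(src_uid)
    if name is not None and name in dst_ids:
        return dst_ids[name]          # same name, possibly a different number
    return src_uid                    # no matching name: keep the number
```

For example, a file owned by uid 1000 ("stefan") on the sender ends up owned by whatever uid "stefan" has on the receiver, unless numeric ids are requested.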
I've been using rsync for some time now to manage moving research data between home and school and I'm thoroughly impressed. Great piece of software.
Stefan Axelsson
I use ugidd which automatically translates between numbers and userid:s, thus solving the problem with users having different numbers on different machines. Again this is not the most secure option around, but in your own secured sandbox it's easy to understand, install and maintain.
Why does the kernel go through stable and then unstable forks? Can't it always be a stable build, like with Windows?
I'm attending Carnegie Mellon University right now. The campus network stores all the user /home's, course webpages, homework submission folders, etc., on OpenAFS servers running some ancient, completely reworked version of Redhat. The servers are rock solid, 99.999% uptime as far as I've seen. I can only recall two one hour incidents the last two years when they went down for a bit. Tells you a bit about how stable OpenAFS is. That, and I've come to admire the usefulness of ACLs. The documentation could be better though.
Maybe the knowledge at this page is transferrable, somewhat, to other people trying to set up OpenAFS. At the least, it gives you an idea of what you'll be needing.
I see why you remain anonymous now
It would be very good to remember that NFS comes from:
Not
For
Security
What about CFS , OceanStore or Ivy for a really distributed file system ? :)
but at least with windows, i could do this on my own without a) having to install anything new, with only about 5 minutes of configuration, and b) having to ask a few thousand people the best way to do it. point-and-click it may be, but sometimes i just want something done without spending hours (or days) on it.
OpenAFS is a learning curve, but for a Distributed FS, it rules.
/afs/..com. They all share a top level AFS namespace of course, so every site can see every other sites files.
At my office I have deployed OpenAFS + krb5 + LDAP. This allows me to have a network of kernel developers (read: they need root access on their workstations) who have no access to each other's files.
I have several sites, so they each sit under their own name-space.
I have LDAP distributing UIDs and so on around the network, and providing multiple levels of access per person per machine (or groups of machines / people).
I have Windows 2000 and XP users authenticating to my KRB5 KDC. They have seamless access to their AFS space also, using "wake" from rose-hulman.
User's home directories follow them around.
Users must authenticate every 12 hrs, but that's it. We have full single sign-on over the network.
NFS is no use. All users need root.
NFSv4 is no use. Authentication is done at mount time iirc, so if 2 users log onto one server...
Oh, and it's in alpha stage, if that.
Coda is experimental.
I know little about intermezzo.
OpenAFS is scalable.
OpenAFS is secure.
OpenAFS is easy to manage for large (huge) sites.
OpenAFS looks after your backups for you.
OpenAFS has user manageable groups....!
OpenAFS is just the most amazing FS I have ever used.
OpenAFS needs a way to store its PTS in LDAP along with my other per-user data.
This is the only fault I can find in OpenAFS.
The CHENDO project is an attempt to get distributed file systems to work on any platform.
It is a good thing to look at.
http://www.openafs.org/
security
scalability
replication of data
reliability
location independence
tools to manage your AFS data
The ability to move live data between
AFS fileservers without breaking service
to users on clients still blows me away.
Granted the AFS administrator has to learn more
than the NFS administrator but it is worth it.
For a small (~5) group of machines,
NFS may suffice but you want a distributed
filesystem that will grow to meet your
future demands across your enterprise
consider OpenAFS.
Then, I bought a bunch of 10/100 Ethernet cards that had EEPROM sockets and used EtherBoot to create a boot image for it. You can also make a boot image on the web here, here, or here .
You'll need a way to program the EEPROM, but there are lots of places to get info about that.
The only directories that are not identical across the virtual machines are /etc, /var, along with the obvious /dev, /boot, /proc, and so on. /usr and /home are the same mount on each "machine."
IBM deployed a new distributed filesystem that goes beyond AFS, it's called GPFS and it's part of the xCAT package. You can find it here.
Unfortunately, documentation is really poor at this moment... but I think it could be a really good solution.
May the source be with you!
- "SFS is a secure, global network file system with completely decentralized control. SFS lets you access your files from anywhere and share them with anyone, anywhere. Anyone can set up an SFS server, and any user can access any server from any client. SFS lets you share files across administrative realms without involving administrators or certification authorities."
Typical distributed file-systems are notoriously insecure, but SFS is designed with security foremost. This is research-level code, but I'd prefer a buggy program with firm theoretical foundations to a "tested" program built on shaky logic.
I have never tried this, and I doubt it's possible, but maybe I'm too pessimistic.. Have a small pool of unix boxes, and have each export a portion of its filesystem to one server. Then combine a handful via softraid(5) into one logical disk (create a huge file on each filesystem and convince softraid it's your disk) and finally link all raid disks together using LVM. This way you have distribution, redundancy and extendibility ;) . You can then re-export the new volume via NFS/AFS or whatever. The drawback is of course that the data flows twice through the ether if you access it from the same clients that provide the storage, and the other drawback is that it probably will not work because softraid is too picky about what it accepts as a disk. ;)
I think it would be great if more effort were put into "user-friendly" distributed file systems and processing so that a lot of wasted resources could be reclaimed at businesses. When I was an administrator at an unnamed company, I thought it was a shame that everyone had a fancy, 2Ghz. machine with a 60 gig drive that was only used for basically surfing the net and reading email. The result was an almost complete waste of machine - about 3 gigs of drive space used, and almost no CPU utilization. So.. why not make all of these machines into collective distributed servers? Obviously, a lot of redundancy would have to be built in - what happens if John accidentally turns off his machine at the end of the day? Anyway, just one possible use for distributed filesystems...
NFS really can't stand machines being switched on and off, NFS is great in a production environment if setup correctly, but not for home usage.
MacOS X and M$ can handle SAMBA just fine (although MacOS X still can improve with handling filenames which contain characters such as [ and ] and some other small caveats).
I know it's the M$ protocol, but Linux has the best implementation of it. It works just great!
The 'shares' of shut down machines disappear as soon as you try to access them on other machines. If you don't access them while they were off they stay in place. For ease, you can put a 'mount -a' in the crontab to automagically remount these unmounted filesystems.
It just works. 3 times 'HURRAY' for SAMBA!
That wasn't me. It was some other guy, trying to horn in on my schtick.
It's definitely the easy option and it's fine for smallish self contained sites, but when you try to run your entire business off it, it gets very expensive.
You end up with a big server or two for redundancy at each site with the associated support costs, your information is not globally addressable or available which means you have to have separate information repositories at the sites further increasing the support requirements and reducing the coherence of your business.
Basically, though not ideal, AFS is the best that's available at the moment.
If you're just a single site or a couple of sites, stick to NFS, AFS is a pain to set up on a small scale, it definitely likes to be a big world spanning architecture.
Government of the people, by corporate executives, for corporate profits.
Something that is not being mentioned by the proponents of the various networked file systems is the set of file system facilities provided to the client.
I currently use a GNU/Linux system with home directories provided by NCPFS (Novell Netware) because I have no choice. We've done a fair amount of work but the loss of some semantics really bites. Hard links are used for locking by quite a few apps. Open Office uses shared memory mmap() calls for internal communication via a file in the home directory. These aren't supported either. We have had to do a fair amount of work working around these (and other) short-comings.
I know that NFS provides me with the file system semantics I need for most of what I do. Could the proponents of AFS, SMB, Coda, etc. let us know how close to local file system semantics we get with home directories mounted via their various file systems?
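For readers who have not seen the hard-link locking trick mentioned above, this is roughly what such applications do. It is a minimal sketch, not taken from any particular application; the function names and the naming of the temporary file are assumptions. The point is that the lock depends on link() being atomic and on the link count being reported correctly, so a file system without hard links simply cannot grant the lock.

```python
import os
import socket

def acquire_lock(lock_path):
    """Try to take lock_path; return True on success, False otherwise."""
    # Create a file whose name is unique to this host and process.
    unique = f"{lock_path}.{socket.gethostname()}.{os.getpid()}"
    fd = os.open(unique, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    try:
        try:
            os.link(unique, lock_path)   # fails if lock_path already exists
        except OSError:
            pass                          # decide by the link count below
        # Two links means our file is now also known as lock_path.
        return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)

def release_lock(lock_path):
    """Drop the lock by removing the lock file."""
    os.unlink(lock_path)
```

On a file system that rejects link(), the count stays at 1 and acquire_lock always returns False, which is the kind of breakage described above.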
Those who do not learn from Dilbert are doomed to repeat it.
All my machines are connected together in my house with SAMBA. Gives all my machines, UNIX and Windows a common user space file system....
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
More specifically you will find a number of links to projects not discussed in the threads I have seen so far, at the Multi Disk HOWTO, where you can see how for instance Yoke and RAID can give you a fast reliable networked sharable file system. Some of these are research projects while others have been used for a few years. For even more fun you can stack file systems, like RAID0 with RAID1 where some of the drives are Yoke-connected over the net. Put inherited fs on top and you have enormous flexibility, speed and reliability. OK so it is more complex but I am sure you can handle that.
And here comes the part I would like to stress: when you come to a conclusion, please contact the relevant HOWTO authors and give them your input. Only by your inputs can the Linux Documentation Project improve.
You may want to use rdiff, from librsync at sourceforge, for that.
Not sure what you mean by ``distributed''. you should be a little more specific about what you want. However, SFS (self-certifying file system) is a good replacement for NFS, featuring privacy and authentication. Works on Linux, *BSD, MacOS X....
see http://www.fs.net
MFS is pretty cool but you get a nasty error when you reboot one of the nodes, nasty meaning kernel panic:). At last time I checked it was pretty flakey.
Okay, I've been so wanting to use AFS since the good old days when I first started using it back at Stanford in the early 90's. But, I've got Solaris (shoot me now), Linux, and MacOSX boxes and limited brain power available. Does anyone have a tutorial for morons that goes through everything I'd need to do to set up AFS (+ kerb/ldap, right?) served off either Linux or MacOSX? I tried to set up kerberos in 1999, but had a time of it... I know it rocks, but it can get confusing. Pointers anyone? -k (One of these days, I'm going to go to the end of the pier and throw our sun blades in the ocean)
Instead, I would use mine. But is not working yet. It's in a development stage. And as is my thesis, I can't release it yet. But I can tell what it'll do.
Essentially, it'll distribute the files across some machines. But any distributed filesystem does that. What's so good about mine? Well, suppose you have a lab with an NFS server and 28 clients. These 28 clients have, say, 8 GB disks, 2 GB of which are used to store the local installation and the remaining 6 GB are unused. Doing some math, you realize you have 28 x 6 = 168 GB of unused space, more than the space the server has. So, all you need to do is put a distrib fs and make use of it.
Now suppose you're one of the students using the lab. You sit in front of a 'client' and start using your files. But the files are on another machine, so the access is done thru the net. And it feels slow. Ok, all you need to do is to guess on what machine the files are located and use that one... or, have a distrib fs that will migrate your files to the local machine as you use them. That's what I'm after.
You can also read a draft (in spanish) here. Please be patient with the site, 'cause the uni has very saturated lines. If you want to contact me about it, use this e-mail address, as the other ones are down.
I mean, if I use AFS, does that mean from now on, every time I run an install script for some random package that chmods something, I have to realize that the script doesn't really work, and then I have to analyze its intent and then do some ACL thing that accomplishes the same intent? Ugh, I am not interested in things that create more work for humans.
Another annoying-looking thing is that it's really a filesystem, even from the servers' point of view. Unlike sharing/exporting services such as NFS and Samba, which you can run on top of your choice of filesystem (ext3, Reiserfs, xfs, etc), it appears that AFS/OpenAFS combines both the disk and the network topics. That means you don't get the advantages of all the great work the filesystem geeks have been doing in the last few years.
It almost strikes me as inelegant design or something, that a single project concerns itself with both the details of how things are laid out on a disk, and also how to do network-related things such as replication. Somebody made their black box too big for my tastes.
Am I wrong about all this?
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
Lustre, which grew out of the Intermezzo/Coda projects. It was designed from the ground up as a scalable distributed filesystem. It would give a single mount point for a set of aggregated storage. In our clustered system it has been shown to be able to write more than 2 GB/s to disk. Lustre may be a little overkill for a home system, but for clusters, it works great.
Y'know; I'm really disappointed that nobody here has yet mentioned what is probably the granddaddy of distributed unix systems: TNC (The Newcastle Connection). It was publicly announced in SP&E back in '82.
...
It does something the writer requested that NFS doesn't: It provides a consistent directory tree on all systems. They were the folks who invented the rather elegant "/../" notation for getting out on the Net and accessing another computer's files, after all.
Also, unlike NFS, TNC hasn't ever required centralized security. Their scheme was to have a plugin module that mapped the id info (address, hostname, userid, etc.) into local ids. It was easy enough to just map all outsiders to "nobody/nogroup", thus blocking all access from outside while letting you see out. One of the useful features of this distributed system is that each machine's owner can control outside access independently, without being beholden to the whims of some network administrator
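A mapping module of the kind described above might look something like this. It is only a sketch of the idea, not TNC's actual interface; the type, the table format, and the function names are all hypothetical.

```python
from typing import NamedTuple

class RemoteId(NamedTuple):
    """Identity information that arrives with a remote request."""
    address: str
    hostname: str
    userid: str

# Local identity given to anyone who is not explicitly trusted.
NOBODY = ("nobody", "nogroup")

def make_mapper(trusted):
    """Build a mapping function from a table of trusted remote users.

    trusted: {(hostname, userid): (local_user, local_group)}
    """
    def map_id(remote):
        return trusted.get((remote.hostname, remote.userid), NOBODY)
    return map_id

# An empty table maps every outsider to nobody/nogroup.
deny_all = make_mapper({})
```

Each machine's owner would keep their own table, so letting one colleague in is a local decision that needs no change anywhere else on the network.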
I worked with instances of TNC at several big corporations back in the 80's, and it was fairly slick. You literally had one network-wide tree that looked the same from everywhere. It was a real disappointment to see NFS get adopted as "standard", since one of its major problems all along has been the inability to pass file names around within distributed applications. You just get ENOENT on most of the machines, because the people setting up NFS never consider consistent directory trees to be anything important.
I also worked on one project where we decided to reverse engineer TNC's approach. "How difficult can it be?" It wasn't difficult. It took me about two weeks to get to the point that I could type "make" in a directory whose Makefile used source scattered across a collection of machines, and it worked. This included machines whose clocks weren't synchronized; solving this problem took me a morning, and didn't require any time servers or root permission anywhere.
As I said; I've been disappointed
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
Isn't AFS the group that sells pizzas in high schools?
AFS
Use the stock nbd... afaik it works fine (I think the issue is that it's not as fast as it could be; that's what enbd is trying to accomplish...)
It won't hurt. Compile it as a module.
Black holes are where the Matrix raised SIGFPE
I guess this doesn't really apply to "home usage" but I have to manage a lot of machines over a SAN and if you don't want people screwing up your SAN, you better use something like CXFS.
:)
CXFS uses a sort of token technique and allows multiple file accesses. That way, we get the same files on all the machines but w/o the NFS overhead and network congestion. File reads/writes are done over the fiber channel switch and the "metadata" is done over a private network. This is WAY faster than NFS over Gigabit Ethernet. One good thing about CXFS is the redundancy it offers. You can have failover servers and other neat things.
The only drawback is that you need an SGI server, but then you can use Windows and Solaris clients. Very stable, but probably not for home use.
-- Leeeter than leet
There is no such thing as a Kerberos Domain. They're called realms.
Don't forget the work offline functionality. That's basically what he really wants.
FLAME ON Linix punks.
The last time I did a Redhat install, it took me 45 minutes.
I don't know what decade you live in, but damn...
-- This space for lease, low setup fee, inquire within!
The permission thing. Having to map the user and group names is inconvenient. For one, you have to maintain those users and groups.
Change sets. There is no sense of history: I can't go back to a certain date and get a snapshot of the data as it was then. This is absolutely required in certain scenarios. Of course, you could rotate among several different target directories, just like you would rotate tapes, but that wastes disk space that you might not possess.
Interestingly, rdiff-backup solves this problem; rdiff-backup uses librsync for the file transfer and has a simple backup increment system, with rudimentary support for file obsolescence by age (in other words, if you delete a file in the source location, it won't be deleted in the backup until a certain date).
The root issue. It would be much nicer if there was a server component running as root, which permitted specific users to connect and backup files.
Incidentally, rsync seems to not care about file name character encoding, which is a big minus. The port of rsync to Mac OS X will not transfer files containing 8-bit characters (eg., accented characters, Scandinavian letters), barfing with an "operation not permitted"-style I/O error. It seems that the Mac OS X APIs want UTF-8 file names. A simple internal translation should work.
How are you today? How does it feel to be sucking off of other people off of society? You are a fucking charity case, loser. You said in a post "I'm serious." HAHAHAHAHA. You better seriously DO SOMETHING, code some shit up, or become an implementation specialist because from what I see from you, PFY, is that you are anecdotal, full of shit and hot air and you are a poster child for the IT loser who has never done anything but put out fires and do other people's bidding. You are a fat gay poor sexless live at home mediocritomaton loser bitch. You: a scrawny, inept, impotent little man-bitch. A good name for you, MITCH. - Man bitch.
Tah tah sweetheart. Nice life you have living vicariously through a screen and keyboard. Make sure you put saran wrap on the keyboard BEFORE you jack off to k1d pr0n.
OpenAFS does seriously need documentation on their site. There is only one book on AFS admin that I've been able to find, and it's a little dated: Managing AFS by Richard Cambell (1998). Amazon has it.
I also recommend getting this short read on Kerberos.
Well, as I said you can use numeric id's if that's more convenient. Doing that they may (or may not) make sense on the target system, but there's really no other way to do it besides using a specific backup solution (that maintains its own internal mapping/whatever). A bit much to ask from a file synchronisation program IMHO.
That's supported by rsync, and quite nicely I might add. You can do differential backups, with your last backup being the full backup, and earlier ones being saved. Granted it won't do diffs of the contents of files, but rdiff-backup doesn't do that either, does it?
Oh, but there is. You can run an rsync daemon as root on your backup machine. It even supports authentication. But you lose the 'ssh' capabilities so you'd better run it on an internal trusted network.
Never came across that one as I only sync between linux and linux (and I'm Swedish so the odd LATIN-1 char has slipped in from time to time). IMHO, converting filenames as text to and from different binary formats having to consider differing locales and whatnot is fraught with peril, and a gargantuan task. I'm happy with the current "just copy the binary" strategy. And UTF-8 should die BTW... ;-)
Stefan Axelsson
...ceterum censeo Carthaginem esse delendam.
I am surprised that no one has suggested openGFS with iSCSI. This setup looks right and doesn't require any special hardware.
opengfs with iSCSI
iSCSI is a new IETF standard (RFC 3347) that looks very promising.