How Do You Sync & Manage Your Home Directories?
digitalderbs writes "A problem plaguing most people with multiple computers is the arduous task of synchronizing files between them: documents, pictures, code, or data. Everyone seems to have their own strategies, whether they involve USB drives, emailed attachments, rsync, or a distributed management system, all of which have varying degrees of success in implementing fast synchronization, interoperability, redundancy and versioning, and encryption. Myself, I've used unison for file synchronization and rsnapshot for backups between two Linux servers and a Mac OS X laptop. I've recently considered adding some sophistication by implementing a version control system like subversion, git, or bazaar, but have found some shortcomings in automating commits and pushing updates to all systems. What system do you use to manage your home directories, and how have they worked for you for managing small files (e.g. dot configs) and large (gigabyte binaries of data) together?"
I recently started playing around with Dropbox for some folders smaller than my entire home directory and haven't yet run into any major problems. The versioning it provides is nice as well, and as a plus, deleted files that they still retain versions of don't count against the quota.
I use multiple OS X, Linux, and FreeBSD machines daily. One cannot sync all home directory files, as all the config stuff differs between Gentoo, Debian, FreeBSD, Tiger, and Leopard. So it's mostly down to documents, graphics, and a few audio and video files. For the larger ones I use a USB stick; the smaller ones I email to myself so they're always available via IMAP servers. But most of all I have a bootable, customized version of systemrescuecd installed on a 16 GB USB stick, which at any given moment has all the currently important stuff I need. It works well enough for me.
Caveat Utilitor
For small backups, every ten minutes, I use backintime (based on rsync). For larger, nightly or more rare backups, I use rdiff-backup. Both work over the LAN, or to locally-mounted hard drives.
I carry a 16 Gig USB flash drive with my working files on it. I've been using this method since the days of 100 Meg Zip drives and just keep upgrading the media. My flash drive is automatically backed up to my backup server at home in the middle of the night so, if I forget it at the office, I'm only a few hours behind. Besides, I can use free Logmein to log into the office computer and transfer a file if it's got new and important information on it. It works the same way in reverse if I forget it at home. Since my working files are on the USB drive, which is also compatible with my Linux machines, it really doesn't make much difference which machine I plug it into. Did I mention encryption? That's a good idea in case you lose the drive and you've got any sensitive information on it.
"Do the Right Thing. It will gratify some people and astound the rest." - Mark Twain
FAST 2009 has a paper on semantic data management using a file system built on top of an object store powered by MySQL. Performance isn't great, but it uses a distributed file system solution to solve the synchronization issue in a very nice way (e.g., synchronize all albums with my iPod, all photos with my laptop and computer, etc...). You can specify rules and I liked it when I heard about it. However performance is actually important, despite their claim :).
Perspective: Semantic Data Management for the Home
Brandon Salmon, Carnegie Mellon University; Steven W. Schlosser, Intel Research Pittsburgh; Lorrie Faith Cranor and Gregory R. Ganger, Carnegie Mellon University
HTML Paper
http://www.usenix.org/events/fast09/tech/full_papers/salmon/salmon_html/index.html
PDF Paper
http://www.usenix.org/events/fast09/tech/full_papers/salmon/salmon.pdf
Slides
http://www.usenix.org/events/fast09/tech/slides/salmon.pdf
--"You are your own God"--
I use git, with flashbake and cron to automate commits, and a simple cron job to automatically update a backup copy on an external hard drive.
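For anyone wondering what "automated commits" boils down to, here is a minimal sketch in Python. It is not flashbake and does not reproduce what flashbake does; it only assumes git is on the PATH and that the directory is already a git repository. The function names and the commit message format are made up for illustration.

```python
import subprocess
from datetime import datetime, timezone


def commit_commands(message):
    """Return the git commands for one automated snapshot commit."""
    return [
        ["git", "add", "--all"],  # stage new, changed and deleted files
        ["git", "commit", "--quiet", "-m", message],
    ]


def has_changes(repo):
    """True if the working tree differs from the last commit."""
    out = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return bool(out.strip())


def auto_commit(repo):
    """Commit everything in `repo` if anything changed; meant to be run from cron."""
    if not has_changes(repo):
        return False
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")
    for cmd in commit_commands(f"auto-commit {stamp}"):
        subprocess.run(cmd, cwd=repo, check=True)
    return True
```

A cron entry would then call a small script that runs `auto_commit("/path/to/repo")`, with the path being whatever directory you keep under version control.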
I spent a long time tackling this, as I am situated at different locations on different days.
I have 2 desktops and a laptop which must remain sync'd and encrypted. I use TrueCrypt for the encryption.
On my Windows boxes, SyncBack handles it. It can be triggered on write or on insertion, or just periodically. It has version control support, will sync over FTP (poorly), and can create zip files, burn CDs, etc. It's a Swiss Army knife of sync tools.
The key for getting the most out of a sync program is granularity. Inevitably, you'll have exceptions, and you don't want a PASS/FAIL result for your entire backup set. It works much better to sort files into categories and sync the individual groups than to try to make one profile that does your entire disk array. My 2 cents.
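To illustrate the granularity point, here is a rough Python sketch that syncs each category on its own and reports a result per group instead of one PASS/FAIL for everything. The group names and paths are hypothetical, and plain `rsync -a` is only a stand-in for whatever sync tool you actually use.

```python
import subprocess

# Hypothetical groups: each name maps to a (source, destination) pair.
GROUPS = {
    "documents": ("/home/me/Documents/", "/mnt/backup/Documents/"),
    "photos": ("/home/me/Pictures/", "/mnt/backup/Pictures/"),
}


def rsync_runner(src, dst):
    """Default runner: copy src to dst with rsync and return its exit code."""
    return subprocess.run(["rsync", "-a", src, dst]).returncode


def sync_groups(groups, runner=rsync_runner):
    """Sync every group independently and return {name: succeeded}."""
    results = {}
    for name, (src, dst) in groups.items():
        try:
            results[name] = runner(src, dst) == 0
        except OSError:
            # e.g. the sync tool is missing; record the failure and move on
            results[name] = False
    return results
```

With this shape, a failure in one group (say, a locked file in the photos folder) shows up as a single `False` entry while the other groups still get synced.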
Unison
I use SyncToy at work to sync my laptop up with the network for a quick and dirty solution that just requires a simple replication of data, but I've found it to be less than satisfactory for more complex tasks and interminably slow when there is a large quantity of files in a sizeable directory structure.
For home use (a mix of Linux and Windows boxes), where things are more involved, I started using Unison for a cross-platform solution but in the end settled on a simple rsync for the Linux data and SyncBack SE for the more complicated Windows stuff. SyncBack SE might not be free (it's $30), but it is lightning fast, extremely flexible and can handle very sophisticated synchronisation and backup tasks, including versioning, support for more than one target, remote targets (via FTP and email), bandwidth controls... Worth a look!
UNIX? They're not even circumcised! Savages!
I don't share EVERYTHING, but I share some things:
Stating on Slashdot that I like cheese since 1997.
Currently? Just unison -quiet, running from cron. (I have it wrapped in a script that does locking, since Unison doesn't seem to lock against itself reliably, for reasons I don't understand.) I've had two problems worth watching out for:
1) Try to avoid running it against NFS. It walks the entire synced area every time you sync. Local disk will be two orders of magnitude faster.
2) Be careful syncing between case-sensitive and case-insensitive filesystems. Unison will start failing out if you ever create two files differing only in case.
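Both the locking wrapper and the case problem can be sketched in a few lines of Python. This is not the poster's script, just one way to do it under some assumptions: it uses `fcntl.flock`, so it is POSIX-only, and the function names and lock file path are invented for the example. The collision check is meant to be run on the case-sensitive side before syncing.

```python
import fcntl
import os
import subprocess
from collections import defaultdict


def collisions_in(names):
    """Group names that differ only in case; return the groups with more than one member."""
    seen = defaultdict(list)
    for name in names:
        seen[name.lower()].append(name)
    return [sorted(group) for group in seen.values() if len(group) > 1]


def find_case_collisions(root):
    """Walk `root` and return full paths of entries that collide within one directory."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for group in collisions_in(dirnames + filenames):
            found.append([os.path.join(dirpath, name) for name in group])
    return found


def run_locked(cmd, lock_path):
    """Run `cmd` unless another instance holds the lock; return its exit code or None."""
    with open(lock_path, "w") as lock:
        try:
            # Non-blocking exclusive lock; released when the file is closed.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None  # a previous sync is still running
        return subprocess.run(cmd).returncode
```

From cron you would call something like `run_locked(["unison", "-quiet"], "/tmp/unison-sync.lock")`, and run `find_case_collisions` over the synced tree now and then to catch trouble before Unison does.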
Beyond that, I'm looking to start using git to version both my code and my textual data. I'm not intending to use git itself to sync the repositories; I'm going to use it for versioning only, and keep syncing using Unison. The reason is that I'm the only user, and for my own convenience I'd like the working copy to be synced. All I really need out of git is versioning anyway; I already have a workable solution for syncing.
Or how about, why on earth would I use something like CVS for files (movies, mp3 files, photos of my kids) that can be quite large and will never change?
I too am looking for things to help manage the huge piles of various files I have accumulated and am leaning towards something like beagle http://beagle-project.org/Main_Page and rsync/unison for backups.
Ultimately though I think dividing my files up into meaningful directories is a good start, especially if I start by putting everything that doesn't change into a subdirectory of a main directory named "Static".
I Am My Own Worst Enemy
Nightly (or more frequently, if you like) rsync to an OpenSolaris server running ZFS w/ Time Slider.
Quality versioned backups with little effort, plus data integrity (checksums built into the filesystem), compression, and (if desired) RAID-Z(2) goodness! In addition, the provided time slider interface allows easy browsing of versions.
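The client side of this setup is just a scheduled rsync; the versioning comes from ZFS snapshots on the server, not from rsync itself. As a rough sketch in Python, assuming rsync over ssh with key-based login already set up (the host name and paths in the usage note are placeholders):

```python
import subprocess


def rsync_command(source, host, dest, delete=False):
    """Build an rsync-over-ssh command that copies `source` to `host:dest`."""
    cmd = ["rsync", "-a", "-z"]  # -a: archive mode, -z: compress in transit
    if delete:
        # Also remove files on the server that no longer exist locally.
        cmd.append("--delete")
    cmd += [source, f"{host}:{dest}"]
    return cmd


def nightly_backup(source, host, dest):
    """Run the backup once and return rsync's exit code; schedule this from cron."""
    return subprocess.run(rsync_command(source, host, dest)).returncode
```

For example, `nightly_backup("/home/me/", "backupserver", "/tank/home/me/")` would push the home directory to a dataset on the server, where the snapshots then preserve older versions.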
Just my 2c...
I use shfs mounts by ( to make sure it stays mounted even if connection is interrupted) and ssh tunnels for everything else, with preshared keys to a central server / proxy, and rsnapshots for backup on the central server with hot swap drives.
This works on desktops, remote office, and for notebooks. I essentially don't trust my employees or myself to remember to encrypt everything or use "secure" protocols all the time, and so I remove the need to remember from the whole process. I can then focus on securing one system. Great if everything else is secure, but just in case. Very good for notebooks jumping from open wireless to open wireless systems, and also for keeping track of employees' activity in one location. I can log fairly easily everything they do or don't do (yeah, the 2-hour coffee break sticks out like a sore thumb in the logs).
Among other things this also has the nice side effect that should say a notebook or desktop be stolen, it will phone home as soon as it is connected to the internet and send detailed information about what it is doing.
Living in Chile
NFSv4 for home dirs has worked in our office, and when it works it does exactly what we wanted - it's beautiful, even. Lately I've been seeing more and more problems with new distros though. We have a Fedora 8 server (a decade-old desktop rocking a 500 MHz P3 and 128MiB of RAM, haha) and some clients which are running various Fedora releases. Fedora 8 and 9 were nearly perfect. The same settings, though, on Fedora 10 and now 11 have broken pulseaudio and skype, and will hang gnome-panel if any of its settings are changed. Fedora 11 seems to have some other stability issues on one client, but that may be a wiring issue.
Am I the only one experiencing this, or do y'all think it's some kind of trend? It could honestly just be that I messed up some settings or don't know what I'm doing, but F9 worked so well that I'm tempted to just go back to it. Ubuntu is of course an option too, but one I haven't explored much yet. With everything suggested here, I probably have a lot of options to look into. rsync works brilliantly for backups. Still, I would prefer NFS working right again, because the peace of mind of knowing that any one client on our network can go down without taking anyone's data with it, and that I can add a new client with so little work, has been really nice.
I'm open to suggestions, but since this isn't the 'ask slashdot' section, I'll just summarize what I can contribute to the thread: NFS, as eln says, works very well when your network is well-wired and stable, but is useless for home dirs on notebooks that will be used away from the LAN. And Fedora 10 and 11 have given me problems with NFS home dirs.