Slashdot Mirror


How Do You Sync & Manage Your Home Directories?

digitalderbs writes "A problem plaguing most people with multiple computers is the arduous task of synchronizing files between them: documents, pictures, code, or data. Everyone seems to have their own strategies, whether they involve USB drives, emailed attachments, rsync, or a distributed management system, all of which have varying degrees of success in implementing fast synchronization, interoperability, redundancy and versioning, and encryption. Myself, I've used unison for file synchronization and rsnapshot for backups between two Linux servers and a Mac OS X laptop. I've recently considered adding some sophistication by implementing a version control system like subversion, git, or bazaar, but have found some shortcomings in automating commits and pushing updates to all systems. What system do you use to manage your home directories, and how have they worked for you for managing small files (e.g. dot configs) and large (gigabyte binaries of data) together?"

32 of 421 comments

  1. Dropbox by snl2587 · · Score: 4, Interesting

    I recently started playing around with Dropbox for some folders smaller than my entire home directory and haven't yet run into any major problems. The versioning it provides is nice as well, and as a plus, deleted files that they still retain versions of don't count against the quota.

    1. Re:Dropbox by buchner.johannes · · Score: 4, Informative

      Have a look at Jake. Official website: Jake

      It is aimed at the average user (no server setup needed) and provides a syncing solution across the Internet with a nice UI. Free and open source, available for all operating systems.

      Check it out!

      --
      NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
    2. Re:Dropbox by darrylo · · Score: 3, Informative

      Yes, dropbox is very nice. I'll second the recommendation. Dropbox can also automatically keep previous versions of files around. Works on PC, Mac, and linux.

      If you need security, truecrypt with dropbox is nice. Dropbox supports incremental (delta) change file uploads/downloads, which makes large-ish truecrypt containers useful on dropbox. The only real limitations are that (1) you have to unmount the truecrypt container before synchronization can occur, and (2) you have to ensure, manually, that only one PC/Mac/linux box is accessing the truecrypt container at any one time.

      An alternative to dropbox is syncplicity, but I haven't tried it. The feature set looks similar, though.

      Another alternative is jungledisk, which uses Amazon S3 to store your data. The advantages here are that everything is encrypted with a key (stored only at your end, unless you enable the web interface), that you pay only for what you use, and that there's no limit on storage capacity (as long as you have money). Disadvantages include:

      • Incremental/delta file downloads don't exist (makes truecrypt hard to use).
      • Incremental file uploads exist, for an extra $1/month fee.
      • You pay for bandwidth, and the bandwidth costs can add up.
    3. Re:Dropbox by fluffernutter · · Score: 3, Interesting

      And also, I assume, you are too old to care how dropbox scans your files, where they end up, or what they know about you by looking at your files or you wouldn't use it.

      --
      Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
    4. Re:Dropbox by nabsltd · · Score: 4, Insightful

      When I forget my USB thumb drive, I can log in to my Dropbox account via the web interface from any computer as long as it has net access.

      What ever happened to all the true geeks on ./ ?

      Whenever I need a file, I log in to my webserver and download it. With dynamic IPs, you can get business Internet access for around $70/month for 5Mbps symmetric (cable or FIOS). Hard drives cost around $75/TB, and you can host this sort of thing on just about any computer you have sitting around that can run Linux.

      So, for about $75/month, you can securely store insane amounts of data, and get to it securely from just about anywhere. You could also upload from anywhere with just a tiny bit of web programming.

      I still haven't set up the versioning sort of thing, but there are quite literally hundreds of them that work across a LAN if they can get to the filesystem for both the source and the store.

    5. Re:Dropbox by Heir+Of+The+Mess · · Score: 3, Informative

      I tried LiveMesh too, but it would crash from time to time so it is no longer on my system

      I had problems with earlier versions of LiveMesh but I haven't had any problems for a while. For a few months from November 2008 to February 2009 I'd been using both DropBox and LiveMesh. Now I'm using just LiveMesh. For me what swung it for LiveMesh was:

      - Being able to sync any folder on my PC, e.g. I sync my favorites folder between 4 PCs
      - Being able to easily control which PCs get updated with what as I don't want everything synced between all my PCs
      - Being able to configure folders to be transferred just PC to PC, e.g. I have 30GB of family photos that I sync between my wife, myself, and my parents accounts on their respective PCs. I don't need these photos in online storage

      One tip though when using such sync tools: keep a backup, because if one person trashes the folder, it trashes everyone's folder.

      --
      Australian running a company that does C# / C++ / Java / SQL / Python / Mathematica
  2. The old-fashioned way? by clang_jangle · · Score: 4, Interesting

    I use multiple OS X, Linux, and FreeBSD machines daily. One cannot sync all home directory files, as all the config stuff differs between Gentoo, Debian, FreeBSD, Tiger, and Leopard. So it's mostly down to documents, graphics, and a few audio and video files. For the larger ones, I use a usb stick, the smaller ones I email to myself so they're always available via IMAP servers. But most of all I have a bootable, customized version of systemrescuecd installed on a 16GB usb stick, which at any given moment has all the currently important stuff I need. It works well enough for me.

    --
    Caveat Utilitor
  3. Different tools for different purposes by joe_cot · · Score: 5, Insightful
    • If you're keeping track of code, use a code repository. Subversion, GIT, Bazaar, etc.
    • If you're trying to keep config files, documents, pictures, etc synced, use DropBox.
    • For bookmarks, use one of the numerous Firefox bookmark syncing extensions, or the Del.icio.us extension (or use DropBox to sync your .mozilla/firefox folder).
    • For multi-GB files, use a portable hard drive, or rsync with a file server in your house/office

    I wouldn't recommend using one tool for every purpose. I wouldn't want to store multi-GB files in SVN, and I wouldn't want to store all my code on an external hard drive. Maybe using DropBox, or rsyncing with a server somewhere would work.

  4. backintime, and rdiff-backup by gardyloo · · Score: 4, Interesting

    For small backups, every ten minutes, I use backintime (based on rsync). For larger, nightly or more rare backups, I use rdiff-backup. Both work over the LAN, or to locally-mounted hard drives.

  5. USB drive by the_rajah · · Score: 4, Interesting

    I carry a 16 Gig USB flash drive with my working files on it. I've been using this method since the days of 100 Meg Zip drives and just keep upgrading the media. My flash drive is automatically backed up to my backup server at home in the middle of the night so, if I forget it at the office, I'm only a few hours behind. Besides, I can use free Logmein to log into the office computer and transfer a file if it's got new and important information on it. It works the same way in reverse if I forget it at home. Since my working files are on the USB drive which is also compatible with my Linux machines, it really doesn't make much difference which machine I plug it into. Did I mention encryption? That's a good idea in case you lose the drive if you've got any sensitive information on it.

    --
    "Do the Right Thing. It will gratify some people and astound the rest." - Mark Twain
  6. What about a FUSE FS powered by a MySQL DB? by necro351 · · Score: 3, Interesting

    FAST 2009 has a paper on semantic data management using a file system built on top of an object store powered by MySQL. Performance isn't great, but it uses a distributed file system solution to solve the synchronization issue in a very nice way (e.g., synchronize all albums with my iPod, all photos with my laptop and computer, etc...). You can specify rules and I liked it when I heard about it. However performance is actually important, despite their claim :).

    Perspective: Semantic Data Management for the Home
    Brandon Salmon, Carnegie Mellon University; Steven W. Schlosser, Intel Research Pittsburgh; Lorrie Faith Cranor and Gregory R. Ganger, Carnegie Mellon University
    HTML Paper: http://www.usenix.org/events/fast09/tech/full_papers/salmon/salmon_html/index.html
    PDF Paper: http://www.usenix.org/events/fast09/tech/full_papers/salmon/salmon.pdf
    Slides: http://www.usenix.org/events/fast09/tech/slides/salmon.pdf

    --
    --"You are your own God"--
  7. Re:Svn by MaskedSlacker · · Score: 4, Interesting

    I use git, with flashbake and cron to automate commits, and a simple cron job to automatically update a backup copy on an external hard drive.
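
A minimal sketch of the cron-driven commit step this comment describes. It is plain git, not flashbake, and it only stages and commits whatever changed in one repository. The function name `autocommit`, the commit message format, and the paths in the crontab comment are assumptions made for illustration.

```shell
# autocommit REPO_DIR
# Stage every change in REPO_DIR and commit it, but only when something
# actually changed. Hypothetical helper: plain git, not flashbake.
autocommit() (
    cd "$1" || exit 1
    git add -A || exit 1
    # "git diff --cached --quiet" exits 0 when nothing is staged.
    if git diff --cached --quiet; then
        echo "nothing to commit in $1"
    else
        git commit -q -m "auto-commit $(date '+%Y-%m-%d %H:%M')"
    fi
)

# Example crontab entry (assumed file locations), every 15 minutes:
# */15 * * * * /bin/sh -c '. "$HOME/bin/autocommit.sh"; autocommit "$HOME/notes"'
```

The function body runs in a subshell, so the `cd` does not leak into the calling shell. Copying the repository to an external drive would be a separate cron job.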

  8. Beyond Compare by Anonymous Coward · · Score: 3, Informative

    On the Windows side there is a great utility called Beyond Compare, around $30, that I have used to do this. I even had a small client once that could not afford real backup software, so we faked the backup using portable USB hard drives and the Beyond Compare utility to sync her server and desktop to the drives. It worked great, and the whole thing was done for under $200.

  9. Windows - SyncBack by Anonymous Coward · · Score: 4, Interesting

    I spent a long time tackling this, as I am situated at different locations on different days.

    I have 2 desktops and a laptop which must remain sync'd and encrypted. I use TrueCrypt for the encryption.

    On my Windows boxes - SyncBack handles it. It can be triggered on write or on insertion, or just periodically. Has version control support. Will sync over FTP (poorly) and can create zip files or burn CDs etc. It's a swiss army knife of sync tools.

    The key for getting the most out of a sync program is granularity. Inevitably, you'll have exceptions, and you don't want a PASS/FAIL result for your entire backup set. It works much better to sort files into categories and sync the individual groups than to try to make one profile that does your entire disk array. My 2 cents.
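
The granularity point can be sketched generically in shell. This is not SyncBack and does not model its profiles. It only shows the idea of running one sync per category and reporting each result on its own line. The function name `sync_groups` and the group names in the usage note are hypothetical.

```shell
# sync_groups SYNC_CMD GROUP...
# Run SYNC_CMD once per group and report each result separately, so one
# failing group does not hide the success of the others.
# SYNC_CMD is any command or shell function that takes a group name.
# The exit status is non-zero if at least one group failed.
sync_groups() (
    sync_cmd=$1
    shift
    failed=0
    for group in "$@"; do
        if "$sync_cmd" "$group"; then
            echo "OK   $group"
        else
            echo "FAIL $group"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
)
```

Called as `sync_groups my_sync documents photos code`, where `my_sync` stands for whatever tool does the actual copying, a failure in `photos` still leaves a clear OK line for the other two groups.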

  10. Re:always mount your home dir with NFS by Foldarn · · Score: 3, Insightful

    And when the server hosting your NFS share dies, so does your entire home directory on every PC. Check and mate.

  11. Re:Svn by WillKemp · · Score: 3, Informative

    If i'd elaborated, i wouldn't have made first post!

    However, i use subversion for two things - backup and syncing my development system with my remotely hosted web server. Neither of which is really how i "sync and manage home directories", but if i needed to do that subversion is what i would use.

    Some months back, i foolishly pointed out to my web hosting service that there was a serious security hole in the way their system (cpanel) was configured for subversion - and they killed the subversion service and haven't reinstated it. So i have to do 'svn update' over an sshfs virtual file system, which is mildly irritating.

    Anyway, i've got a single repository set up on my system and i check in all new web sites i'm working on. Then i check them out onto the server - and update the files on the server with 'svn update'. It's easy, reliable, and reasonably fast. It also makes backup nice and easy, as i just sync the repository with a mirror on an external hdd.

  12. Re:Windows users? by iamhigh · · Score: 3, Insightful

    Out of curiosity, what do you think AD does that provides anything close to what the author is asking?

    --
    No comprende? Let me type that a little slower for you...
  13. The internet never forgets. by kylben · · Score: 5, Funny

    I embed all my documents in porn and post them on various web forums. The recovery procedure involves spidering my spam folder. I recently found my high school history term paper in a jpg of Marilyn Chambers.

    --
    Insightful and funny are really the same thing, except one has a punch line.
  14. Unison by ashayh · · Score: 4, Interesting
  15. time machine is better by goombah99 · · Score: 4, Informative

    For backups I used to swear by rsync plus hardlinks. But since time machine came out it's oh so much much better. For one thing rsync is still a bit unstable on huge directory trees that contain lots of hard links. And it also boofs on some extended type attributes, forks and file types, though it keeps getting better (perhaps it's perfect now). Rsync + hardlinks also does not retain the ownership and privileges and ACLs faithfully either.

    But even if Rsync + hardlinks didn't have those troubles, time machine is so flawless it's just the thing to use. What is especially nice about time machine is the recovery and inspection process. It's not too hard to figure out what files changed (there's even a 3rd party gui application for this) and because this info is stored in meta data it's faster and more reliable to retrieve than a massive FIND command looking at time stamps. The time machine interface for partial recoveries is intuitive and easy to drill down. In many cases it's even application aware so you can drill not on the file system itself but on say your mail folders in the mail application. This is actually a pretty stunning achievement; it needs to be seen to be believed how paradigm shifting it is.

    And full recoveries could not be easier. You just boot off the CD and within ten clicks you have picked the source and destination and it has done a series of idiot checks. While that might not seem too amazing, it sure is comforting. Recovering from a backup is a mildly nerve-wracking process because there are lots of ways to goof and maybe even wreck your original (like oops, I didn't do a -delete, or I didn't tell it to reassign links, or worse I copied the wrong direction).

    Here's a super nice tip: you can have two disks operating with time machine that you rotate. Actually the best way I've found is to have one constantly attached, then on Fridays attach the other one, redirect time machine to it, let it back up all the changes since last Friday, then detach it and let time machine go back to the main disk.

    You can even use this as a way to sync your two computers, though it's better as a backup than as a sync. Have time machine back up just your home directory to a thumb drive, and take this from home to work. Plug it in at work and back it up. Then revert this to the backup from home. Now home and work are synced, plus, if there was one special file or two that was newer at work, well, you have that in the backup you made! (By the way, doing this kind of thing requires fiddling with the backup cookie so two computers can share the same repository. Google this if you want to know how.)

    --
    Some drink at the fountain of knowledge. Others just gargle.
  16. Re:Windows users? by cdub1900 · · Score: 3, Informative

    Windows Live Mesh
    https://www.mesh.com/

    "With Live Mesh, you can synchronize files with all of your devices, so you always have the latest versions handy. Access your files from any device or from the web, easily share them with others, and get notified whenever someone changes a file.

    Working on one computer, but need a program from another? No problem. Use Live Mesh to connect to your other computer and access its desktop as if you were sitting right in front of it."

  17. Re:Windows users? by Zocalo · · Score: 3, Interesting

    I use SyncToy at work to sync my laptop up with the network for a quick and dirty solution that just requires a simple replication of data, but I've found it to be less than satisfactory for more complex tasks and interminably slow when there is a large quantity of files in a sizeable directory structure.

    For home use (a mix of Linux and Windows boxes) where things are more involved I started using Unison for a cross platform solution but in the end settled on a simple RSync for the Linux data and SyncBack SE for the more complicated Windows stuff. SyncBack SE might not be free (it's $30), but it is lightning fast, extremely flexible and can handle very sophisticated synchronisation and backup tasks, including versioning, support for more than one target, remote targets (via FTP and email), bandwidth controls... Worth a look!

    --
    UNIX? They're not even circumcised! Savages!
  18. Re:Svn by morgauxo · · Score: 3, Informative

    The biggest shortcoming of CVS that I know is the lack of ability to rename a file. Yes, you can copy it then delete the original but CVS sees this as a new file with no revision history. If I understand correctly subversion was created by former CVS users to overcome a few shortcomings of CVS with this being the biggest one. Thus SVN has a similar "feel" though not identical commands to CVS and a superior feature set.

  19. The right tools for the job by Enahs · · Score: 3, Interesting

    I don't share EVERYTHING, but I share some things:

    • If I just need to go one way, I use rsync.
    • If I need 2-way sync but no versioning info, I use unison.
    • If I need n-way sync but no versioning info, I use unison with a central "untouchable" folder.
    • If I need versioning info, I use git.
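
The four rules above amount to a small decision table. As a rough sketch only, here it is as a shell function. The name `pick_tool` and its argument spellings are made up for illustration, and versioning is checked first on the assumption that it overrides the direction of the sync.

```shell
# pick_tool DIRECTION VERSIONING
#   DIRECTION:  one-way | two-way | n-way
#   VERSIONING: yes | no
# Prints the tool the list above would suggest (hypothetical helper).
pick_tool() {
    if [ "$2" = "yes" ]; then
        echo "git"
        return 0
    fi
    case "$1" in
        one-way) echo "rsync" ;;
        two-way) echo "unison" ;;
        n-way)   echo "unison via a central untouchable folder" ;;
        *)       echo "unknown direction: $1" >&2; return 1 ;;
    esac
}
```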
    --
    Stating on Slashdot that I like cheese since 1997.
  20. Unison; and maybe git in the future. by vyrus128 · · Score: 3, Interesting

    Currently? Just unison -quiet, running from cron. (I have it wrapped in a script that does locking, since Unison doesn't seem to lock against itself reliably, for reasons I don't understand.) I've had two problems worth watching out for:
    1) Try to avoid running it against NFS. It walks the entire synced area every time you sync. Local disk will be two orders of magnitude faster.
    2) Be careful syncing between case-sensitive and case-insensitive filesystems. Unison will start failing out if you ever create two files differing only in case.
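
A rough sketch of the two precautions above, not the poster's actual script. `with_lock` relies on `mkdir` being atomic so that two overlapping cron runs cannot both proceed, and `case_collisions` lists names that would clash on a case-insensitive filesystem. Both function names, the exit code 75, and the lock path in the usage note are assumptions.

```shell
# with_lock LOCKDIR COMMAND [ARG...]
# Run COMMAND only if LOCKDIR can be created. mkdir either creates the
# directory or fails, so only one caller gets the lock. Exits 75 when busy.
# Note: a run killed by a signal leaves LOCKDIR behind and it must then
# be removed by hand.
with_lock() (
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "already running (lock: $lockdir)" >&2
        exit 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    exit "$status"
)

# case_collisions DIR
# Print the lower-cased relative paths that occur more than once under DIR
# when case is ignored. File names containing newlines are not handled.
case_collisions() (
    cd "$1" || exit 1
    find . | tr '[:upper:]' '[:lower:]' | sort | uniq -d
)
```

A cron job could then run something like `with_lock /tmp/unison-sync.lock unison -quiet`, using the command exactly as quoted in the comment, and `case_collisions` could be run on the case-sensitive side before the first sync.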

    Beyond that, I'm looking to start using git to version both my code and my textual data. I'm not intending to use git itself to sync the repositories; I'm going to use it for versioning only, and keep syncing using Unison. The reason is because I'm the only user, and for my own convenience I'd like the working copy to be synced. All I really need out of git is versioning anyway; I already have a workable solution for syncing.

  21. Re:Svn by nizo · · Score: 3, Interesting

    Or how about, why on earth would I use something like CVS for files (movies, mp3 files, photos of my kids) that can be quite large and will never change?

    I too am looking for things to help manage the huge piles of various files I have accumulated and am leaning towards something like beagle http://beagle-project.org/Main_Page and rsync/unison for backups.

    Ultimately though I think dividing my files up into meaningful directories is a good start, especially if I start by putting everything that doesn't change into a subdirectory of a main directory named "Static".

  22. rsync + OpenSolaris (ZFS) w/time slider by EBorisch · · Score: 3, Interesting

    Nightly (or more frequently, if you like) rsync to an OpenSolaris server running ZFS w/ Time Slider.

    Quality versioned backups with little effort, plus data integrity (checksums built into the filesystem), compression, and (if desired) RAID-Z(2) goodness! In addition, the provided time slider interface allows easy browsing of versions.

    Just my 2c...

  23. shfs mounts by cron, rsnapshot by cenc · · Score: 3, Interesting

    I use shfs mounts by cron (to make sure it stays mounted even if the connection is interrupted) and ssh tunnels for everything else, with preshared keys to a central server / proxy, and rsnapshot for backup on the central server with hot swap drives.

    This works on desktops, remote office, and for notebooks. I essentially don't trust my employees or myself to remember to encrypt everything or use "secure" protocols all the time, and so I remove the need to remember from the whole process. I can then focus on securing one system. Great if everything else is secure, but just in case. Very good for notebooks jumping from open wireless to open wireless systems, and also keeping track of employees' activity in one location. I can log fairly easily everything they do or don't do (yea, the 2 hour coffee break sticks out like a sore thumb in the logs).

    Among other things this also has the nice side effect that should say a notebook or desktop be stolen, it will phone home as soon as it is connected to the internet and send detailed information about what it is doing.

  24. Re:Svn by xaxa · · Score: 3, Informative

    I have two main computers, desktop and server.

    File layout:
    desktop:Documents -- everything I want backed up regularly
    desktop:Server -- symlink to latest backup from server
    server:Documents -- a few server-specific files, and stuff I always want accessible (I turn my desktop off if I'm not using it).
    server:Desktop -- symlink to latest backup from desktop

    There is an @reboot cronjob on the desktop PC to backup the server, and tell the server to backup the desktop. I use the rsync --link-dest thing so I can have incremental backups (using hard links for files that haven't changed). There are a few other additions -- automatically deleting old backups (except keep a backup from every 10th day) and updating the symlink to the latest successful backup.

    The script is written in ZSH, to take advantage of the fantastic globbing that's available.
    The most important lines in the script are:

    older=($backups/$user/*(/om))
    ($older is now an array of directories, ordered newest-to-oldest).
    rsync --verbose -8 --archive --recursive --link-dest=${^older[1,20]} \
        --files-from=$scripts/${from}2$HOST-I-$user \
        --exclude-from=$scripts/${from}2$HOST-X-$user \
        $user@$from:/ $backups/$user/$date/

    (The variables like $from and $HOST are because I use the same script to copy some stuff to my laptop, but that has a small drive so I don't copy everything. I think the strange syntax after --link-dest expands the array like --link-dest=/dir/one --link-dest=/dir/two ... --link-dest=/dir/twenty)

    over2weeks=( $backups/*/???????[012346789]-????(/Omm+14) )
    end=$(( $#over2weeks - 5 ))
    rm -rf $over2weeks[1,$end]
    ($over2weeks is an array of directories, being backups not taken on the 5th, 15th or 25th day of the month, and at least 14 days old.
    $end is the length of the array minus 5)
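
For readers without zsh, the retention rule can be approximated in portable shell. This is a sketch under assumptions, not a translation of the script: it assumes backup directories are named like 20090615-0300 (inferred from the glob pattern above), it only prints candidates instead of deleting them, and it does not reproduce the step that spares the five newest matches. The names `is_keeper` and `list_prunable` are hypothetical.

```shell
# is_keeper NAME
# Succeeds when NAME has a 5 as its eighth character, i.e. a backup taken
# on the 5th, 15th or 25th under the assumed YYYYMMDD-HHMM naming.
is_keeper() {
    case "$1" in
        ???????5-????) return 0 ;;
        *)             return 1 ;;
    esac
}

# list_prunable BACKUP_DIR
# Print directories directly under BACKUP_DIR that were last modified more
# than 14 days ago and are not keepers. Nothing is deleted here.
# -mindepth and -maxdepth are GNU/BSD find extensions, not POSIX.
list_prunable() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +14 \
        ! -name '???????5-????' | sort
}
```

Reviewing the printed list before piping it to a delete step keeps a mistake in the pattern from costing a backup.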

  25. Subversion with a touch of bash by rpwoodbu · · Score: 4, Informative

    I have found that using Subversion (svn) with the aid of a bash script that is run manually actually works really well and provides a number of special advantages. Here's how I have it constructed:

    First, I don't actually make my whole home directory a svn checkout. I have a subdirectory in it that is the checkout, and my bash script ensures there are symlinks into it for the things I want sync'd. This makes it easy to have some differences between locations. In particular, I can have a different .bashrc for one machine than another, but keep them both in svn as separate files; it is just a matter of making the symlink point to the one I want to use in each location. My bash script will make the symlink if the file doesn't exist, and warn if the file does exist but isn't a symlink. It does this for a number of files.
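
The symlink rule in this paragraph (create the link if nothing is there, warn if a regular file is in the way) might look roughly like the sketch below. It is not the poster's script. The function name `ensure_link` and the file names in the usage note are assumptions.

```shell
# ensure_link TARGET LINK
# Create LINK pointing at TARGET when LINK does not exist yet.
# Leave an existing symlink alone, and only warn (never overwrite)
# when LINK exists but is something other than a symlink.
ensure_link() {
    if [ -L "$2" ]; then
        return 0
    elif [ -e "$2" ]; then
        echo "warning: $2 exists and is not a symlink" >&2
        return 1
    else
        ln -s "$1" "$2"
    fi
}
```

For example, `ensure_link "$HOME/homedir/bashrc.laptop" "$HOME/.bashrc"` would pick the laptop variant on one machine, while another machine links a different file from the same checkout.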

    Another benefit of this method is that I don't put all my files in one checkout. The core files I'll want in all my home directories (e.g. .bashrc, .vimrc, ssh .config and public keys, etc.) go in a checkout called "homedir". But my documents go elsewhere. And my sensitive files (e.g. private keys) go somewhere else still. I choose what is appropriate to install at each location (usually just the "homedir" checkout on boxes I don't own). My bash script detects which checkouts I have and does the appropriate steps.

    The bash script not only sets up the symlinks but it also does an "svn status" on each checkout so I'll know if there are any files I've created that I haven't added, or any files I've modified that I haven't committed. I prefer not to automate adds and commits. I'll definitely see any pending things when I run my sync script, and can simply do an "svn add" or "svn commit" as necessary.

    I also prefer not to automate the running of the sync script. I like being in control of my bandwidth usage, especially when connected via slow links (e.g. Verizon EV-DO, AT&T GPRS). Plus dealing with conflicts is much easier when it is interactive (although I can usually avoid that scenario). It also simplifies authentication to run it from my shell, as it can just use my ssh agent (which I forward, which is set up in my sync'd ssh config).

    The sync bash script takes care of a few other edge-case issues, like dealing with files in ~/.ssh that have to have certain permissions and whatnot. And I've taken care to ensure that the script doesn't just blow away files; it will warn if things don't look right, and leaves it to me to fix it.
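
One of the edge cases mentioned here, permissions under ~/.ssh, could be handled along these lines. This is a sketch using the conventional modes (700 for the directory, 600 for files, 644 for public keys), not the poster's code. The function name `fix_ssh_perms` is hypothetical.

```shell
# fix_ssh_perms SSH_DIR
# Tighten permissions after a checkout: directory 700, regular files 600,
# except *.pub files, which get 644. Subdirectories are left untouched.
fix_ssh_perms() {
    if [ ! -d "$1" ]; then
        echo "warning: $1 is not a directory" >&2
        return 1
    fi
    chmod 700 "$1" || return 1
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        case "$f" in
            *.pub) chmod 644 "$f" ;;
            *)     chmod 600 "$f" ;;
        esac
    done
}
```

Version control does not reliably carry these modes across checkouts, which is presumably why the sync script has to reapply them.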

    Using Subversion has another big advantage: it is likely to be installed already in many places. So when I'm given an account on someone's computer, I can usually get my environment just the way I like it in a few short steps:

    svn co svn+ssh://my.server.tld/my/path/to/svn/trunk/homedir ~/homedir
    ~/homedir/bin/mysync # This is my bash script to do the syncing
    # Correct any complaints about .bashrc not being a symlink and whatnot
    ~/homedir/bin/mysync
    # Log out and back in, or source .bashrc

    No fuss, no muss. No downloading some sync package and building it just to get your .bashrc or .vimrc on a random box, or asking the admin to install something. Subversion is usually there, and even if it isn't, most admins are happy to install it. Subversion deals well with binary files, and even large files. For bulk things (like a music library), I'm more likely to rsync it, partly because it is bulk, partly because it doesn't benefit from versioning, and partly because it only needs to be a unidirectional sync. I could easily add that to my sync script.

    I am simply in the habit of typing "mysync" from time to time (my .bashrc puts ~/bin/ in my $PATH). This works for me very nicely. Some people may prefer a little more automation, and of course my script could automatically do adds and commits, and even skip the log messages. But I prefer a bit more process; after all, this is my data we're talking about!

    If there is interest, I may post my sync script.

  26. Don't sync home. by dotwaffle · · Score: 3, Insightful

    I've done exactly the same as you, used every single tool under the sun, eventually settling on Unison until I realised I was being silly...

    Let's put it this way - just set up each computer how you want it, and sync the *data*, not the whole home directory.

    For instance, my Documents are synced with Dropbox (though tempted to move them to UbuntuOne), my development directories are generally stored in some kind of revision control (svn/bzr/git) and either not synced or at worst, unison-ed, and everything else just stays on the machine it was created on, and backed up with duplicity to a central fileserver hosted in France.

    When you realise that syncing home is *not* good, it suddenly becomes clear that what you need and what you want are completely different.

  27. Re:"Distributed homedirs" or "CVS'd configs"? by DEmmons · · Score: 3, Interesting

    NFSv4 for home dirs has worked in our office, and when it works it does exactly what we wanted - it's beautiful, even. Lately I've been seeing more and more problems with new distros though. We have a Fedora 8 server (a decade-old desktop rocking a 500MHz P3 and 128MiB of RAM, haha) and some clients which are running various Fedora releases. Fedora 8 and 9 were nearly perfect. The same settings, though, on Fedora 10 and now 11 have broken pulseaudio and skype, and will hang gnome-panel if any of its settings are changed. Fedora 11 seems to have some other stability issues on one client but that may be a wiring issue.

    Am I the only one experiencing this, or do y'all think it's some kind of trend? It could honestly just be that I messed up some settings or don't know what I'm doing, but F9 worked so well that I'm tempted to just go back to it. Ubuntu is of course an option too, but one I haven't explored much yet. With everything suggested here, I probably have a lot of options to look into. rsync works brilliantly for backups. Still, I would prefer NFS working right again, because the peace of mind of knowing that any one client on our network can go down without taking anyone's data with it, and that I can add a new client with so little work, has been really nice.

    I'm open to suggestions, but since this isn't the 'ask slashdot' section, I'll just summarize what I can contribute to the thread: NFS, as eln says, works very well when your network is well-wired and stable, but is useless for home dirs on notebooks that will be used away from the LAN. Also, Fedora 10 and 11 have given me problems with NFS home dirs.