Slashdot Mirror


Laptop/Server Data Synchronization?

gbr writes "I've been trying to automatically synchronize data between a laptop and a server. When the laptop is connected to the network, I want all writes to automatically propagate across to the server. When the laptop is disconnected I want the laptop user to continue working with the local data. When the laptop is reconnected, I want the data to automatically re-sync. The issue is, the data on the server may have changed as well, which needs to propagate back to the laptop. The data doesn't contain anything too special, no database tables etc. It does contain binary data such as executables and word processing documents. I've looked at ChironFS, Unison file sync, and drbd. ChironFS needs a manual rebuild if a connection fails, and the user needs to know which machine contains the correct data. Unison requires the user to initiate the synchronization process manually every time, and drbd is just not meant for the job at hand. How do you automatically, and invisibly to the user (except in the case of conflicts), synchronize between a laptop and a server?"

34 of 305 comments

  1. rsync by jshriverWVU · · Score: 5, Informative

    I do this often and rsync is wonderful for such a task.

    1. Re:rsync by pixel.jonah · · Score: 4, Informative

      I'd second rsync.

      I'd also take a look at Microsoft's SyncToy if you're on win***s.

    2. Re:rsync by Roarkk · · Score: 5, Informative
      rsync is part of the answer. If you're looking for a way to have multiple, incremental backups of laptops with unpredictable patterns of connecting to the network, BackupPC is the way to go.

      BackupPC is a high-performance, enterprise-grade system for backing up Linux and WinXX PCs and laptops to a server's disk. BackupPC is highly configurable and easy to install and maintain. Given the ever decreasing cost of disks and raid systems, it is now practical and cost effective to backup a large number of machines onto a server's local disk or network storage. This is what BackupPC does. For some sites, this might be the complete backup solution. For other sites, additional permanent archives could be created by periodically backing up the server to tape. A variety of Open Source systems are available for doing backup to tape. BackupPC is written in Perl and extracts backup data via SMB using Samba, tar over ssh/rsh/nfs, or rsync. It is robust, reliable, well documented and freely available as Open Source on SourceForge.
      By using pooling and compression, one client of mine is using BackupPC to back up over 1TB of data distributed among over 100 laptops to a 200GB filesystem on a central server. The network is polled every hour, and any system that hasn't been backed up in the last 24 hours is queued. Beautiful system.
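
      The pooling idea mentioned above (many laptops holding the same files, stored once on the server) can be sketched roughly as follows. This is only an illustration of content-addressed pooling, not BackupPC's actual implementation; the function name pool_file and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os
import shutil


def pool_file(src_path, pool_dir):
    """Store src_path in pool_dir under its SHA-256 digest; return the digest.

    Identical content coming from different machines maps to the same
    pool entry, so it is written to the server's disk only once.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as f:
        # Hash in 1 MiB chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    name = digest.hexdigest()
    dest = os.path.join(pool_dir, name)
    if not os.path.exists(dest):
        shutil.copyfile(src_path, dest)
    return name
```

      A real backup tool would also keep a per-machine index mapping each path to its digest, so that a restore can find the right pool entry.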
    3. Re:rsync by Anonymous Coward · · Score: 3, Funny

      I keep trying to install the n'sync CD but it just asks me whether I want to play the music or not.

    4. Re:rsync by frenetic3 · · Score: 5, Informative

      Take a look at Dropbox (http://getdropbox.com/; screencast at http://getdropbox.com/u/2/screencast.html) if you want something that's rsync-like but integrated into Windows and OS X. It's in beta (and full disclosure: I co-founded the company) but was designed precisely because there's nothing out there that does this well and is easy to use.

      --
      "Where are we going, and why am I in this handbasket?"
    5. Re:rsync by syphax · · Score: 3, Informative


      Unison is 2-way rsync. But as the poster noted, unison/rsync doesn't easily support automatic synching (that I know of)- you have to kick it off and then deal with any conflicts, etc., manually. I think the poster is looking for ideas of at least automating Unison/rsync (BTW does rsync support 2-way updating, as the poster explicitly mentions?).

      As someone who relies on running unison manually (too lazy to figure out how to automate on my Windows box), I'd be interested in relevant solutions.

      --
      Simple Unexpected Concrete Credible Emotional Stories
    6. Re:rsync by cduffy · · Score: 4, Insightful

      subversion is intended for a case where you have a single data store. A modern distributed SCM -- designed for disconnected use -- would make more sense.

      Personally, I play with bzr most frequently; it has a nifty Python interface (and an extensive plugin architecture) which makes it quite conducive to local scriptage. (As an example -- I have a local, filesystem-backed set of CA scripts which use bzr for transactional semantics; if a method is called which throws an exception, all filesystem changes are automatically rolled back; if a method succeeds, a commit is done to record the operation and [effectively] set a rollback point). The separation between working tree and repository is optional (by default, all working trees are also repositories -- much like BitKeeper in that respect), which makes it very handy for situations like this where you don't necessarily *have* a separate, central location which all nodes can always communicate with, and where the different trees are allowed and expected to temporarily diverge.
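
      The rollback-on-exception, keep-on-success behaviour described above can be approximated without any SCM at all. The sketch below does not use bzr or its Python API; it simply snapshots a directory with the standard library and restores it if the body raises. The name transactional_dir is made up for this example, and copying the whole tree is only reasonable for small directories.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def transactional_dir(path):
    """Roll back every change made under `path` if the body raises."""
    backup_root = tempfile.mkdtemp()
    backup = os.path.join(backup_root, "snapshot")
    shutil.copytree(path, backup)  # the rollback point
    try:
        yield path
    except Exception:
        # Something failed: discard the modified tree and restore the snapshot.
        shutil.rmtree(path)
        shutil.copytree(backup, path)
        raise
    finally:
        # Success or failure, the temporary snapshot is no longer needed.
        shutil.rmtree(backup_root, ignore_errors=True)
```

      A version control system improves on this by recording each successful operation as a commit instead of throwing the snapshot away.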

    7. Re:rsync by frenetic3 · · Score: 5, Interesting

      Apologies if it's in bad taste to reply to my own post, especially because it's about the product I'm working on, but here are some of Dropbox's differences/improvements over what people typically hack together themselves:

      - syncs continuously/watches the FS for file changes (no cron jobs needed -- things usually sync as quickly as they can be sent)
      - does binary diffing and only sends deltas (compressed & over SSL)
      - transparently archives past versions of all files (i.e. undelete/infinite undo)
      - syncs across any number of machines
      - lets you get to your files from the web
      - some more info @ http://venturebeat.com/2007/08/16/the-y-combinator-list/
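
      To give a feel for the "only sends deltas" item in the list above, here is a minimal sketch of a fixed-block delta. It is not Dropbox's algorithm (which the post does not describe) and it is simpler than rsync's rolling checksum; the 4096-byte block size and all function names are assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of the old version."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def make_delta(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return (new_length, {block_index: bytes}) holding only changed blocks."""
    changed = {}
    for index, start in enumerate(range(0, len(new_data), block_size)):
        block = new_data[start:start + block_size]
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[index]:
            changed[index] = block
    return len(new_data), changed


def apply_delta(old_data, delta, block_size=BLOCK_SIZE):
    """Rebuild the new version from the old bytes plus the changed blocks."""
    new_length, changed = delta
    # Start from the old bytes, truncated or zero-padded to the new length.
    out = bytearray(old_data[:new_length].ljust(new_length, b"\0"))
    for index, block in changed.items():
        start = index * block_size
        out[start:start + len(block)] = block
    return bytes(out)
```

      The weakness of fixed blocks is that inserting a single byte near the start shifts every later block, so everything after it is resent; rsync's rolling checksum exists to avoid exactly that.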

      We made it after hacking together our own rsync-based abominations and getting really annoyed that no one had solved this genre of problems in a way that normal people could use.

      Okay, I can stop shilling now. I was just excited that other people run into these problems.

      --
      "Where are we going, and why am I in this handbasket?"
    8. Re:rsync by dusty123 · · Score: 3, Informative

      I don't see how rsync solves this problem:

      AFAIK, rsync is only one-way, meaning that it overwrites and eventually deletes files. Have a try:

      mkdir d1 d2 # Create two directories (e.g. one on server, one on laptop)
      touch d1/foo.txt # Create an empty file
      rsync -r d1/ d2/ # Sync the directories
      echo "123" > d2/foo.txt # Now modify the file on d2 (e.g. laptop)
      rsync -r d1/ d2/ # Sync again
      cat d2/foo.txt # Ooops - foo.txt is empty!

      One possible way I experimented with is the following:

      - Integrate a rsync server -> laptop in the startup procedure of the laptop
      - Never modify a file on the server while working with the laptop
      - Integrate a rsync laptop -> server in the shutdown procedure of the laptop

      In theory this works, but in practice there are cases where you miss the shutdown/startup sync: when you have no network at startup (e.g. you took your laptop away from home and forgot to sync it), when your laptop crashes, when the network fails during shutdown, and numerous others. These lead to dangerous situations, e.g. if the rsync laptop->server fails during shutdown, a startup rsync may overwrite modified files.

      After losing some of my work, I decided to switch to unison, which is a 2-way sync and lets me decide how to resolve syncing problems.

      Nevertheless I'm not entirely happy with the situation - if I forget to sync, I have to resolve things manually, moreover the sync takes quite some time.

      In my special case, I have a WLAN connection to my server most of the time, so changes could be written immediately. So I'd favour some kind of network file system that has offline capabilities and can handle two-side modifications in some way. I thought about Coda but it seems to be far too complicated and unreliable and I don't know better alternatives.

      So I'm still stuck with my Unison solution, which is somewhat cumbersome, but works...
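
      The difference between a one-way copy and a two-way sync comes down to remembering what both sides looked like at the last successful sync. The sketch below shows that three-way comparison in the spirit of what Unison does with its archive of the last synced state; it is not Unison's code, and the dictionaries of path-to-hash and the action strings are assumptions for the example.

```python
def classify(base, laptop, server):
    """Compare two replicas against the state recorded at the last sync.

    Each argument maps a relative path to a content hash; a missing key
    means the file does not exist on that side. Returns {path: action}.
    """
    actions = {}
    for path in sorted(set(base) | set(laptop) | set(server)):
        b, l, s = base.get(path), laptop.get(path), server.get(path)
        if l == s:
            continue  # both sides already agree, nothing to do
        if l == b:
            actions[path] = "server -> laptop"  # only the server changed
        elif s == b:
            actions[path] = "laptop -> server"  # only the laptop changed
        else:
            actions[path] = "conflict"  # both changed differently: ask the user
    return actions
```

      In the foo.txt experiment above, the laptop copy differs from the recorded state while the server copy does not, so the result is "laptop -> server" instead of a silent overwrite. Deletions fall out of the same logic, because a missing file is just another state.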

    9. Re:rsync by deimtee · · Score: 3, Funny

      That's strange, n'sync CDs don't have any music on them.

      --
      I'm guessing that wasn't on their radar screen...
  2. iFolder? by belly69 · · Score: 4, Informative

    That sounds exactly like what Novell's iFolder is made for:

    http://www.ifolder.com/index.php/Home

    1. Re:iFolder? by killjoe · · Score: 4, Interesting

      >What do you mean if iFolder was mature?

      I don't know how mature the Novell version is, but the open source version is very far from being mature. In fact there hasn't been a stable release in more than two years and nobody knows if or when there will ever be a stable release.

      A while ago all the developers on iFolder either quit or were fired and the development was moved to India. Since then the pace of development has slowed down to a crawl while the new developers try to understand the code base and fix bug reports.

      Right now you can download something that is beta-ish but I certainly would not trust my mission critical data to it. If you want something that works you are going to have to pull off the trunk and compile it.

      --
      evil is as evil does
  3. my take by TheRealMindChild · · Score: 4, Insightful

    Man... You are late to the party. People have been struggling with this since the beginning of time (or so it seems). Especially database apps, where they need to work in "detached mode".

    I can't give you a flat-out solution, because all situations are different. But I can pass on a bit of wisdom. The most important thing for you to do is create business rules for your synchronization. If the data on the server has changed and you made changes offline, who gets priority? A file can fall into one of three categories: client changes get priority, server changes get priority, or the files get merged. I would stay away from the last one. If you want to keep things simple, I'd go for the "server changes get priority" approach. In short, if you took an "online" file "offline" and came back, and the server copy has changed since, your offline edits are abandoned. This gives heavily edited files a shorter "check out window" (even if you don't use a checkout system), and forces the person taking the file offline to coordinate with everyone else who may edit this file.
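
    Written down as code, a priority rule of this kind is only a few lines. The sketch below covers the two non-merge categories; the Policy enum and the resolve function are hypothetical names invented for this example, and versions are represented by any comparable value such as a content hash.

```python
from enum import Enum


class Policy(Enum):
    CLIENT_WINS = "client"
    SERVER_WINS = "server"


def resolve(base, client, server, policy=Policy.SERVER_WINS):
    """Pick the version to keep for one file.

    base is the version recorded when the file was taken offline;
    client and server are the versions found at reconnect time.
    Returns (kept, discarded); discarded is None unless an edit is dropped.
    """
    if client == server or client == base:
        return server, None  # no offline edit to lose
    if server == base:
        return client, None  # only the offline copy changed
    # Both sides changed since checkout: the business rule decides.
    if policy is Policy.SERVER_WINS:
        return server, client
    return client, server
```

    Returning the discarded version as well lets the caller park it somewhere (a "conflicts" folder, say) instead of deleting it outright, which softens the "offline edits are abandoned" rule.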

    --

    "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
  4. Windows might be good for something by wpanderson · · Score: 3, Interesting

    My Briefcase in Windows 95. It even has a cute ickle briefcase icon.

    Somewhat seriously, Offline Files in Win2K/XP is something I've yet to see done well on any other OS.

    --
    neuro at well dot com (when I post, it's my opinions, no-one elses)
    1. Re:Windows might be good for something by arivanov · · Score: 4, Informative

      Offline Files most likely derives its origins from Coda, which was designed to work for 100MB at most. It seems to inherit all of its problems when the data volumes become big. I have had to support an environment where people casually offlined 3-4GB documentation trees and it was falling over on a regular basis.

      Further to this, Offline Files has a number of fairly fundamental bugs in the actual implementation. It records both the IP and the name of the server somewhere when doing the offlining. As a result, if the name (but not the drive) or the IP changes, your entire offline tree goes south and stays offline. You can neither delete it nor reconnect it, and the only way of dealing with this is surgery to the network (aliasing IP addresses) until you reconnect. The only alternative is to rebuild the affected laptops from scratch.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
    2. Re:Windows might be good for something by dos · · Score: 4, Informative

      >It records both the IP and the name of the server somewhere when doing the offlining. As a result if the name (but not the drive) or the IP changes your entire offline tree goes south and stays offline.

      Go download csccmd 1.1 from Microsoft.

      csccmd /moveshare will take care of this.
  5. Coda by norkakn · · Score: 4, Insightful
  6. Re:Subversion by Iron+Condor · · Score: 4, Interesting

    At the risk of saying something stupid or blasphemous: why offer something that requires "writing some scripts"?

    If the OP wanted to "write some scripts" s/h/it could have done all the necessary work with a couple foreach...cp...end. Or, hey, rsync.

    I suspect the OP is wondering whether there isn't something out there that "just kinda works" and only needs intervention in case of a conflict.

    Knowing well that this will definitely be considered blasphemy: I've been using Windows' "briefcase" system since Win98. It does "kinda work". Most of the time. And requires work when there's a conflict. Which appears to be what the OP is looking for. Given that the OP doesn't seem to want to just go that route, the pertinent question appears to be what s/h/it is looking for that Mr. Gates' briefcases can't/won't do...

    --
    We're all born with nothing.
    If you die in debt, you're ahead.
  7. Windows & Make Available Offline by goofy183 · · Score: 4, Informative

    I'll likely get buried but here it goes:

    In Windows you can mark a folder on a network share as "Available Offline". Windows will copy all of the files to the local HD and if the server isn't available just work with the local copies. When the server is detected Windows will automatically sync the files and pop-up asking the user about conflicts (keep local / keep remote). When connected writes automatically go to both the local copy and the server.

    This is one of the few things that Windows gets right, and I haven't found a Linux or OS X solution that is nearly as nice.

    1. Re:Windows & Make Available Offline by Techman83 · · Score: 5, Informative

      A great solution till it breaks... believe me, it does break, and when it does, be prepared for heartache. There are ways to recover it, but I think it assumes too much and the potential to screw up is a big risk. Several users at my place of employment found this out the hard way and now we ban the usage of it. It's not so much finding the best tool, but managing the process overall and how to do that.. Well we are still in the process of developing that one!

      --
      # cat /dev/mem | strings | grep -i cat
      Damn, my RAM is full of cats. MEOW!!
  8. OS X Client & Server does a good job with this by phillymjs · · Score: 3, Insightful

    I'm not exactly sure what Apple uses under the hood to accomplish it. I don't think it's rsync, because I've fooled with the rsync built into OS X and I get errors frequently, but their home syncing works great.

    When you have a mobile user account (i.e. a network account with a local copy of the home folder on the workstation), it will sync every so often (frequency and exactly what is synced/skipped can be configured on the server end, and the user can kick it off manually from the client end). To the best of my knowledge, the sync is bidirectional, so if you log into another machine with a mobile account and modify the server copy, the changes will be reflected on the mobile copy at next sync. It makes my life easier because if a laptop user's machine gets lost, stolen, damaged, or destroyed, we've automatically got a backup copy of the data on it up to the last time it was synced.

    In the event of conflicts, the user is presented with a dialog asking which version to keep, including file size and modification date.

    Note that I'm not suggesting you throw out your existing hardware and buy Macs to get this feature, but maybe look into exactly how it's done on the Mac and see if you can duplicate it on your systems.

    ~Philly

  9. In OSX, portable home directories by shamborfosi · · Score: 5, Informative

    I have OSX laptops using portable home directories to do exactly what you are asking for.. a network home directory that is automagically sync'd to my laptop (thus making it portable). It works both ways, and I'm definitely happy with it. I'm not sure which OS you're using though. I wrote about how to do it in an article: Full Stack: Portable Home Directory over NFS on OSX authenticated via OpenLDAP on Debian Linux if you're interested. I also just got everything to work over AFP to an OSX server running open directory as well.. but haven't had time to write it up yet (btw, a lot fewer steps).

  10. commercial products. by deviator · · Score: 3, Interesting

    Offline Folders on a Windows client connected to a Windows server work reasonably well but sometimes get screwed up.

    Novell's iFolder is a very interesting alternative... runs on Linux/Apache/Java stack & only transmits changed blocks over an SSL connection.

    Other things worth looking into include Microsoft Groove, which lets you synchronize an entire workspace with yourself on other computers or with other people, and is relatively network & environment-independent (though Windows only).

  11. Synchronization Woes by JWSmythe · · Score: 5, Interesting

    A few people hit this one pretty well. rsync (and probably rsyncd).

        The more complex problem has been thrown at me a few times. What if it's not just one person?

        Say you have a repository of data that a dozen people may be working in. When they're all network connected, they're all dealing with the same file pool. When they take their off-line copies with them (unplugged laptops on vacation), they all make changes to the same files. Maybe mine is a one line change. Maybe one guy copy&pasted the first 3 chapters from War And Peace into a comment somewhere in the middle. Maybe another developer did some very intellectual looking changes but hosed some major functionality.

        When you start putting machines back on the network, who is right? The 6 guys who did real work are obviously right(ish), but they all made different changes. The very last change will end up being someone's 3 year old kid who was pounding on the keyboard right before daddy shut down the laptop, saving the new changes. Probably the last is the most recent, and right by most methods.

        It's not a pretty picture, and requires some intelligence to sort out the mess.

        The only "good" resolution I've found is to give logical authority to the changes. Bob is in charge of development. Any changes going into the development or production tree must clear him. He should be able to recognize that the 6 guys made changes, and diff them to come up with the common changes. The 3 chapters of War And Peace go by the wayside. And the guy with the 3 year old "developer" gets reprimanded.

        In the end, a good revision system and good backups are needed too. Something will slip through the cracks, and you'll need to roll back to something you hope is good.

        I take control over whatever I'm working on, so if I know I'll be working offline, I'll scp the data to my laptop, work on it on the road, and scp my changes up to the server when I'm done. Anyone else who may have worked in my project space in the duration should have known better. :)

    --
    Serious? Seriousness is well above my pay grade.
  12. Coda, AFS, InterMezzo by RAMMS+EIN · · Score: 3, Informative

    There have been some efforts in the area of networked filesystems with disconnected operations. I remember checking out AFS, Coda, and InterMezzo years ago. At the time, I found something wrong with each of them, but they may have improved since then. Of the three, I think Coda is your best bet.

    --
    Please correct me if I got my facts wrong.
  13. you already solved your problem by coaxial · · Score: 3, Informative

    Simply install unison or rsync or whatever and have the job kick off with whereami for linux (you'll have to find the main page yourself) or marco polo for macs.

  14. Re:Subversion by aldheorte · · Score: 5, Funny

    "s/h/it"

    You may want to reevaluate your approach to political correctness.

  15. SyncBackSE by __aalmrb3802 · · Score: 3, Informative

    If you're running Windows, I would recommend SyncBackSE (http://www.2brightsparks.com/syncback/sbse.html), which I expect you should be able to setup to do exactly what you asked.

  16. OpenAFS by sid77 · · Score: 3, Informative
    As said before there are many choices, each one with its own pros and cons, so I'll throw this one in: OpenAFS.

    As read from the main page:

    What is AFS?

    AFS is a distributed filesystem product, pioneered at Carnegie Mellon University and supported and developed as a product by Transarc Corporation (now IBM Pittsburgh Labs). It offers a client-server architecture for federated file sharing and replicated read-only content distribution, providing location independence, scalability, security, and transparent migration capabilities. AFS is available for a broad range of heterogeneous systems including UNIX, Linux, MacOS X, and Microsoft Windows
    Hope this helps, ciao
  17. Re:common refrain by oatworm · · Score: 3, Informative

    Not that it matters, but since you asked...

    Photoshop -> GIMP
    Avid -> LIVES - Note: I am not a video editor and have no idea if this program is any good.
    Quicken -> GNUCash, among others.

    I guess what I'm saying is that, based on your definition of "silly", there's quite a bit of silliness going on in the world today. *grin*

  18. Unison, Rsync & NTP by Colin+Smith · · Score: 3, Informative

    Unison can be scripted, added to a login script. As can rsync on windows. Alternatively you can add a polling batch file which wakes up every so often and checks to see if the server lives. (Yes, even on Windows)
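
    The "polling batch file" idea can be sketched in a few lines: check whether the server answers, and if it does, run whatever sync command you have settled on. Everything named here is an assumption for illustration: the function names, the five-minute interval, and the use of a plain TCP connect as the liveness test.

```python
import socket
import subprocess
import time


def server_is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def poll_and_sync(host, port, sync_command, interval=300, max_checks=None):
    """Run sync_command each time the server is reachable.

    Checks every `interval` seconds; max_checks=None means run forever.
    Returns the number of times the sync command was started.
    """
    checks = 0
    runs = 0
    while max_checks is None or checks < max_checks:
        if server_is_up(host, port):
            # The exit status is ignored here; a real script should log it.
            subprocess.run(sync_command, check=False)
            runs += 1
        checks += 1
        if max_checks is None or checks < max_checks:
            time.sleep(interval)
    return runs
```

    A call such as poll_and_sync("fileserver.example", 22, ["unison", "-batch", "laptop"]) would do the job, where the host name and the Unison profile name "laptop" are placeholders. In batch mode Unison skips conflicting files rather than asking, so conflicts still need a manual run.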

    Rsync can sync in both directions, but you decide one of the sides is the master and sync that one first, in the case of conflicts the master rules. It isn't possible to choose on a file by file basis at sync time as you can with Unison.

    Oh, and NTP is absolutely vital when doing any synchronisation.

    Basically. Either you do it manually and manage conflicts at sync time, or you do it automatically and define one of the sides as a master in the case of conflict. There's really no way round this, software just isn't sophisticated enough to decide what you're thinking.

    The truth is that filesystem syncing isn't ideal for a very dynamically updated file system. It is best used on fairly static filesystems or one way syncing. Documentation, backups and the like.

    --
    Deleted
    1. Re:Unison, Rsync & NTP by Colin+Smith · · Score: 4, Insightful

      Sure there is another way: newest file wins. But this means that any number of intermediate edits by arbitrary numbers of people will simply and silently be removed if someone updates an obsolete version between syncs.

      >nevertheless I couldn't find software that would support this, thus had to write it by myself...

      Eh? Rsync supports it. rsync -u ...
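
      For a single pair of files, "newest file wins" boils down to comparing modification times, which is also why synchronized clocks matter so much. The sketch below is not how rsync implements -u (that option only skips files that are newer on the receiving side); newest_wins is a hypothetical helper written for illustration.

```python
import os
import shutil


def newest_wins(path_a, path_b):
    """Copy whichever file has the later mtime over the other.

    Returns the path of the winner, or None if the mtimes are equal.
    The older file's contents are overwritten without any warning.
    """
    mtime_a = os.path.getmtime(path_a)
    mtime_b = os.path.getmtime(path_b)
    if mtime_a == mtime_b:
        return None
    if mtime_a > mtime_b:
        shutil.copy2(path_a, path_b)  # copy2 also carries the mtime across
        return path_a
    shutil.copy2(path_b, path_a)
    return path_b
```

      Nothing in this function can tell whether the losing file contained edits of its own, which is the silent data loss described above; a clock that is a few minutes off on one machine is enough to pick the wrong winner.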

      --
      Deleted
  19. Re:Subversion is for stupid and ugly people by cduffy · · Score: 3, Insightful

    git's chunk-based rename handling is interesting, but bzr's directory-level handling is closer to what most users expect. (Does the git behavior make more sense for the kernel source tree? Sure! Does it make more sense for Joe Blow's home directory? I'd need some convincing there).

    Out here in the Real World, folks setting up revision control systems need to count "stupid people" (read: artists and web designers who are too busy making art or designing web pages to care about revision control except inasmuch as it's a way to back up and distribute their work) among their customers. For a great many cases, subversion is Good Enough, and it has excellent Windows support, integration with just about every IDE under the sun, TortoiseSVN and other nice pretty hand-holdy tools available which simply aren't ubiquitous among SCMs written with hardcore users in mind (seemingly to the exclusion of those that aren't). SVN isn't distributed, which sucks. SVN has an ugly hack of an excuse for a rename handling algorithm, which sucks. SVN is slow as molasses compared to some of its competition and lacks merge tracking (and thus history-sensitive merging) and has a gawdawful working tree library and sucks in any number of other ways -- but it is a compelling replacement to CVS, and sometimes that's what the customer needs, no matter what shiny happy features ${YOUR_FAVORITE_SCM} may have and no matter how many ways SVN manages to annoy the power user.

    And trust me, I learned this one the hard way.

    As for the 4NT copy suggestion, the whole bidirectionality and rename handling arguments come into play.

  20. Re:Subversion by Godman · · Score: 3, Funny

    I sat here for about 10 minutes before realizing that s/h/it wasn't a regex joke and why it was actually funny... :-/

    --
    I have this really funny quote that I like to put here. Unfortunately, there's this really annoying thing called a char