Laptop/Server Data Synchronization?
gbr writes "I've been trying to automatically synchronize data between a laptop and a server. When the laptop is connected to the network, I want all writes to automatically propagate across to the server. When the laptop is disconnected I want the laptop user to continue working with the local data. When the laptop is reconnected, I want the data to automatically re-sync.
The issue is, the data on the server may have changed as well, which needs to propagate back to the laptop. The data doesn't contain anything too special, no database tables etc. It does contain binary data such as executables and word processing documents. I've looked at ChironFS, Unison file sync, and drbd. ChironFS needs a manual rebuild if a connection fails, and the user needs to know which machine contains the correct data. Unison requires the user to initiate the synchronization process manually every time, and drbd is just not meant for the job at hand. How do you automatically, and invisibly to the user (except in the case of conflicts), synchronize between a laptop and a server?"
I do this often and rsync is wonderful for such a task.
That sounds exactly like what Novell's iFolder is made for:
http://www.ifolder.com/index.php/Home
I use unison. Why would you need to run it manually every time? It can be run in batch mode. I am mostly using it for live backups these days rather than true bidirectional synchronization, so I could really use rsync and some scripts, but I've gotten pretty comfortable with unison.
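A rough sketch of what "batch mode" could look like when driven from a script. The `-batch` and `-prefer` options are real Unison preferences; the paths, the host name, and the helper names below are made up for illustration.

```python
import subprocess

def unison_batch_command(local_root, remote_root, prefer=None):
    """Build a unison command line that runs without prompting the user."""
    cmd = ["unison", local_root, remote_root, "-batch"]
    if prefer is not None:
        # -prefer names the root whose version wins a conflict.
        # Without it, batch mode leaves conflicting files untouched.
        cmd += ["-prefer", prefer]
    return cmd

def run_sync(local_root, remote_root, prefer=None):
    """Run the sync and return unison's exit status (0 means everything synced)."""
    result = subprocess.run(unison_batch_command(local_root, remote_root, prefer))
    return result.returncode

if __name__ == "__main__":
    # Hypothetical roots; unison must be installed on both machines.
    print(unison_batch_command("/home/gbr/data", "ssh://server//srv/data"))
```

Something like this could be called from cron or a login script, so the user only gets involved when files are left in conflict.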
Man... You are late to the party. People have been struggling with this since the beginning of time (or so it seems). Especially database apps, where they need to work in "detached mode".
I can't give you a flat out solution, because all situations are different. But I can pass on a bit of wisdom. The most important thing for you to do is create business rules for your synchronization. If the data on the server has changed and you made changes offline, who gets priority? Every file will fall into one of three categories: client changes get priority, server changes get priority, or merge the files. I would stay away from the last one. If you want to keep things simple, I'd go for the "server changes get priority" approach. In short, if you took an "online" file "offline" and came back, and the server copy has changed since, your offline edits are abandoned. This gives heavily edited files a shorter "check out window" (even if you don't use a checkout system), and forces the person taking the file offline to coordinate with everyone else who may edit the file.
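The business rules above can be written down as a small decision function. This is only a sketch under my own assumptions: the names `Policy` and `resolve` are invented, and something else is assumed to have already worked out whether each side changed while the laptop was offline.

```python
from enum import Enum

class Policy(Enum):
    CLIENT_WINS = "client"
    SERVER_WINS = "server"
    MERGE = "merge"

def resolve(policy, client_changed, server_changed):
    """Return which copy to keep: 'none', 'client', 'server', or 'manual'."""
    if not client_changed and not server_changed:
        return "none"      # nothing to do
    if client_changed and not server_changed:
        return "client"    # only the laptop copy moved, push it up
    if server_changed and not client_changed:
        return "server"    # only the server copy moved, pull it down
    # Both sides changed: this is the real conflict, so apply the rule.
    if policy is Policy.CLIENT_WINS:
        return "client"
    if policy is Policy.SERVER_WINS:
        return "server"    # offline edits are abandoned
    return "manual"        # merging needs a human (or a format-aware tool)
```

With `Policy.SERVER_WINS`, `resolve(Policy.SERVER_WINS, True, True)` gives `"server"`, which is the "offline edits are abandoned" case described above.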
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
Google's Super Secret Search Algorithm: SELECT @search_results FROM internet WHERE @search_results = 'good'
From the summary:
"The issue is, the data on the server may have changed as well, which needs to propagate back to the laptop."
So let me get this straight.. You have the old version of the file, somewhere. The new laptop version of file, somewhere. And the new server version of the file, somewhere. And you want the software to decide which to use and copy it to both the server and the laptop?
There are even more issues here, but it kinda sounds like you want some artificial intelligence that you can download.
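For what it's worth, the three-version comparison itself needs no intelligence, only a remembered copy (or hash) of the file as it was at the last sync. A minimal sketch, assuming the old version is still available; the function names are invented for the example:

```python
import hashlib

def digest(data: bytes) -> str:
    """Content hash, so binary files such as executables compare the same way as text."""
    return hashlib.sha256(data).hexdigest()

def classify(base: bytes, laptop: bytes, server: bytes) -> str:
    """Compare the last-synced version with the current laptop and server versions."""
    b, l, s = digest(base), digest(laptop), digest(server)
    if l == s:
        return "in_sync"      # both sides already agree
    if l == b:
        return "take_server"  # only the server changed
    if s == b:
        return "take_laptop"  # only the laptop changed
    return "conflict"         # both changed differently: ask the user
```

Only the last case needs a person to decide, which matches the "invisible except in the case of conflicts" requirement in the question.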
Aero
Please stop hurting America -- Jon Stewart
My Briefcase in Windows 95. It even has a cute ickle briefcase icon.
Somewhat seriously, Offline Files in Win2K/XP is something I've yet to see done well on any other OS.
neuro at well dot com (when I post, it's my opinions, no-one elses)
http://www.coda.cs.cmu.edu/
At the risk of saying something stupid or blasphemous: why offer something that requires "writing some scripts"?
If the OP wanted to "write some scripts" s/h/it could have done all the necessary work with a couple foreach...cp...end. Or, hey, rsync.
I suspect the OP is wondering whether there isn't something out there that "just kinda works" and only needs intervention in case of a conflict.
Knowing well that this will definitely be considered blasphemy: I've been using Windows' "briefcase" system since Win98. It does "kinda work". Most of the time. And requires work when there's a conflict. Which appears to be what the OP is looking for. Given that the OP doesn't seem to want to just go that route, the pertinent question appears to be what s/h/it is looking for that Mr. Gates' briefcases can't/won't do...
We're all born with nothing.
If you die in debt, you're ahead.
I'll likely get buried but here it goes:
In Windows you can mark a folder on a network share as "Available Offline". Windows will copy all of the files to the local HD and if the server isn't available just work with the local copies. When the server is detected Windows will automatically sync the files and pop-up asking the user about conflicts (keep local / keep remote). When connected writes automatically go to both the local copy and the server.
It's one of the few things that Windows gets right, and I haven't found a Linux or OS X solution that is nearly as nice.
I've been using Foldershare for several months now to synchronize several folders on three different machines. It has worked well so far and it is free. It's available at: https://www.foldershare.com/
I'm not exactly sure what Apple uses under the hood to accomplish it. I don't think it's rsync, because I've fooled with the rsync built into OS X and I get errors frequently, but their home syncing works great.
When you have a mobile user account (i.e. a network account with a local copy of the home folder on the workstation), it will sync every so often (frequency and exactly what is synced/skipped can be configured on the server end, and the user can kick it off manually from the client end). To the best of my knowledge, the sync is bidirectional, so if you log into another machine with a mobile account and modify the server copy, the changes will be reflected on the mobile copy at next sync. It makes my life easier because if a laptop user's machine gets lost, stolen, damaged, or destroyed, we've automatically got a backup copy of the data on it up to the last time it was synced.
In the event of conflicts, the user is presented with a dialog asking which version to keep, including file size and modification date.
Note that I'm not suggesting you throw out your existing hardware and buy Macs to get this feature, but maybe look into exactly how it's done on the Mac and see if you can duplicate it on your systems.
~Philly
I have OSX laptops using portable home directories to do exactly what you are asking for.. a network home directory that is automagically sync'd to my laptop (thus making it portable). It works both ways, and I'm definitely happy with it. I'm not sure which OS you're using though. I wrote about how to do it in an article: Full Stack: Portable Home Directory over NFS on OSX authenticated via OpenLDAP on Debian Linux if you're interested. I also just got everything to work over AFP to an OSX server running open directory as well.. but haven't had time to write it up yet (btw, a lot fewer steps).
Have a look at http://coda.cs.cmu.edu/ This is a disconnectable file system. It could be what you are looking for. Certainly, that is what I use for doing the same thing.
Maybe a two-way rsync tool made just for this purpose?
You might have to do A-B, A-C, A-B type syncs for more than 2 paths, unless you stick to a hub/spoke or cascading distribution model.
Not all conflicts are automatically resolved, by default.
http://www.cis.upenn.edu/~bcpierce/unison/
Good luck.
Offline Folders on a Windows client connected to a Windows server work reasonably well but sometimes get screwed up.
Novell's iFolder is a very interesting alternative... runs on Linux/Apache/Java stack & only transmits changed blocks over an SSL connection.
Other things worth looking into include Microsoft Groove, which lets you synchronize an entire workspace with yourself on other computers or with other people, and is relatively network & environment-independent (though Windows only).
A few people hit this one pretty well. rsync (and probably rsyncd).
:)
The more complex problem has been thrown at me a few times. What if it's not just one person?
Say you have a repository of data that a dozen people may be working in. When they're all network connected, they're all dealing with the same file pool. When they take their off-line copies with them (unplugged laptops on vacation), they all make changes to the same files. Maybe mine is a one line change. Maybe one guy copy&pasted the first 3 chapters from War And Peace into a comment somewhere in the middle. Maybe another developer did some very intellectual looking changes but hosed some major functionality.
When you start putting machines back on the network, who is right? The 6 guys who did real work are obviously right(ish), but they all made different changes. The very last change will end up being someone's 3 year old kid who was pounding on the keyboard right before daddy shut down the laptop, saving the new changes. Probably the last is the most recent, and right by most methods.
It's not a pretty picture, and requires some intelligence to sort out the mess.
The only "good" resolution I've found is to give logical authority to the changes. Bob is in charge of development. Any changes going into the development or production tree must clear him. He should be able to recognize that the 6 guys made changes, and diff them to come up with the common changes. The 3 chapters of War and Peace go by the wayside. And the guy with the 3 year old "developer" gets reprimanded.
In the end, a good revision system and good backups are needed too. Something will slip through the cracks, and you'll need to roll back to something you hope is good.
I take control over whatever I'm working on, so if I know I'll be working offline, I'll scp the data to my laptop, work on it on the road, and scp my changes up to the server when I'm done. Anyone else who may have worked in my project space in the duration should have known better.
Serious? Seriousness is well above my pay grade.
There have been some efforts in the area of networked filesystems with disconnected operations. I remember checking out AFS, Coda, and InterMezzo years ago. At the time, I found something wrong with each of them, but they may have improved since then. Of the three, I think Coda is your best bet.
Please correct me if I got my facts wrong.
Simply install unison or rsync or whatever and have the job kick off with whereami for linux (you'll have to find the main page yourself) or marco polo for macs.
"s/h/it"
You may want to reevaluate your approach to political correctness.
If you're running Windows, I would recommend SyncBackSE (http://www.2brightsparks.com/syncback/sbse.html), which I expect you should be able to setup to do exactly what you asked.
As read from the main page: What is AFS?
AFS is a distributed filesystem product, pioneered at Carnegie Mellon University and supported and developed as a product by Transarc Corporation (now IBM Pittsburgh Labs). It offers a client-server architecture for federated file sharing and replicated read-only content distribution, providing location independence, scalability, security, and transparent migration capabilities. AFS is available for a broad range of heterogeneous systems including UNIX, Linux, MacOS X, and Microsoft Windows.
Hope this helps, ciao
Not that it matters, but since you asked...
Photoshop -> GIMP
Avid -> LIVES - Note: I am not a video editor and have no idea if this program is any good.
Quicken -> GNUCash, among others.
I guess what I'm saying is that, based on your definition of "silly", there's quite a bit of silliness going on in the world today. *grin*
Unison can be scripted, added to a login script. As can rsync on windows. Alternatively you can add a polling batch file which wakes up every so often and checks to see if the server lives. (Yes, even on Windows)
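The polling idea might look roughly like this. It is a sketch, not a finished daemon: the host name, port, and function names are assumptions, and `on_reconnect` stands for whatever kicks off your Unison or rsync job.

```python
import socket
import time

def server_is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def poll(host, port, on_reconnect, interval=60.0, max_checks=None,
         probe=server_is_up, sleep=time.sleep):
    """Wake up every `interval` seconds; call on_reconnect when the server comes back."""
    was_up = False
    checks = 0
    while max_checks is None or checks < max_checks:
        up = probe(host, port)
        if up and not was_up:
            on_reconnect()  # server just (re)appeared: start the sync job
        was_up = up
        checks += 1
        sleep(interval)

if __name__ == "__main__":
    # Hypothetical server name; port 22 assumes the sync runs over ssh.
    poll("server.example", 22, lambda: print("server is back, syncing..."))
```

The sync fires only on the transition from "down" to "up", so a laptop that stays on the network does not re-run the job every minute.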
Rsync can sync in both directions, but you have to decide that one of the sides is the master and sync that one first; in the case of conflicts the master rules. It isn't possible to choose on a file by file basis at sync time as you can with Unison.
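A sketch of the "master first" two-pass rsync, assuming the server is the master. The paths and function names are invented; `-a` and `--ignore-existing` are standard rsync options. Note that deletions are not propagated by this scheme.

```python
import subprocess

def master_wins_commands(master, other):
    """Build the two rsync passes; a trailing slash means 'contents of this directory'."""
    m = master.rstrip("/") + "/"
    o = other.rstrip("/") + "/"
    return [
        # Pass 1: master -> other. Files that differ are overwritten by the master's copy.
        ["rsync", "-a", m, o],
        # Pass 2: other -> master. Only files the master does not have yet are copied.
        ["rsync", "-a", "--ignore-existing", o, m],
    ]

def run_master_wins(master, other):
    """Run both passes in order, stopping if the first one fails."""
    for cmd in master_wins_commands(master, other):
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Hypothetical locations: the server copy is the master, the laptop copy is the other side.
    for cmd in master_wins_commands("server:/srv/data", "/home/user/data"):
        print(" ".join(cmd))
```

Because the master is pushed first, any file edited on both sides ends up with the master's content, which is exactly the trade-off described above.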
Oh, and NTP is absolutely vital when doing any synchronisation.
Basically. Either you do it manually and manage conflicts at sync time, or you do it automatically and define one of the sides as a master in the case of conflict. There's really no way round this, software just isn't sophisticated enough to decide what you're thinking.
The truth is that filesystem syncing isn't ideal for a very dynamically updated file system. It is best used on fairly static filesystems or one way syncing. Documentation, backups and the like.
git's chunk-based rename handling is interesting, but bzr's directory-level handling is closer to what most users expect. (Does the git behavior make more sense for the kernel source tree? Sure! Does it make more sense for Joe Blow's home directory? I'd need some convincing there).
Out here in the Real World, folks setting up revision control systems need to count "stupid people" (read: artists and web designers who are too busy making art or designing web pages to care about revision control except inasmuch as it's a way to back up and distribute their work) among their customers. For a great many cases, subversion is Good Enough, and it has excellent Windows support, integration with just about every IDE under the sun, TortoiseSVN and other nice pretty hand-holdy tools available which simply aren't ubiquitous among SCMs written with hardcore users in mind (seemingly to the exclusion of those that aren't). SVN isn't distributed, which sucks. SVN has an ugly hack of an excuse for a rename handling algorithm, which sucks. SVN is slow as molasses compared to some of its competition and lacks merge tracking (and thus history-sensitive merging) and has a gawdawful working tree library and sucks in any number of other ways -- but it is a compelling replacement to CVS, and sometimes that's what the customer needs, no matter what shiny happy features ${YOUR_FAVORITE_SCM} may have and no matter how many ways SVN manages to annoy the power user.
And trust me, I learned this one the hard way.
As for the 4NT copy suggestion, the whole bidirectionality and rename handling arguments come into play.
I sat here for about 10 minutes before realizing that s/h/it wasn't a regex joke and why it was actually funny... :-/
I have this really funny quote that I like to put here. Unfortunately, there's this really annoying thing called a char
subject says it all.
You don't think enough... therefore you better not be!
There's a reason why SVN doesn't distribute well: it simply doesn't branch well.
And when it comes to branching, most of the GUI users just don't have a clue, which practically eliminates any chance there was for decent branching/merging.
SVN/Tortoise are good for one thing, snapshotting, and that would be better handled by the filesystem itself anyway. (Why is filesystem snapshotting STILL not mainstream, btw?)