Ask Slashdot: Temporary Backup Pouch?
An anonymous reader writes "It looks simple. I've got a laptop and a USB HDD for backups. With rsync, I only move changes to the USB HDD for subsequent backups. I'd like to move these changes to a more portable USB stick when I'm away, then sync again to the USB HDD when I get home. I figured with the normality of the pieces and the situation, there'd be an app for that, but no luck yet. I'm guessing one could make a hardlink parallel-backup on the laptop at the same time as the USB HDD backup. Then use find to detect changes between it and the actual filesystem when it's time to back up to the USB stick. But there would need to be a way to preserve paths, and a way to communicate deletions. So how about it? I'm joe-user with Ubuntu. I even use grsync for rsync. After several evenings of trying to figure this out, all I've got is a much better understanding of what hardlinks are and are not. What do the smart kids do? Three common pieces of hardware, and a simple-looking task."
What do the smart kids do?
The smart kids don't use linux.
Oh dear.
Hardlinks don't span storage devices. They are files that share the same inode on a single storage device. Soft links can span devices, but they are only pointers to a path, so "backup" using softlinks and you have a bunch of pointers to data that is on the original system. NOT on the thumb drive!
Use one of the backup packages out there, you are not at the point of rolling your own.
Not even close.
Never answer an anonymous letter. - Yogi Berra
Since you are an ubuntu user, and it looks like you just need a nice rsync front-end to handle backup of the same data to two different drives, I'll suggest unison-gtk.
Very nice, simple front-end, and will do what I think you need.
You can never know everything, and part of what you do know will always be wrong. Perhaps even the most important part.
Did you read the summary? Obviously Skydrive is of no use but there are several other alternatives that would be better suited to this purpose although if, as he says, it is for use while travelling an internet based system is useless. I would suggest reading up more on what you can do with dd and writing a couple of scripts to suit your needs. The problems are more likely to be around using USB but you can always write a script that puts a compressed file on your desktop that you manually copy to your USB stick. Old tech is normally most reliable.
I love stacking my barbecues in the shed at the end of summer - you can't beat a bit of grill on grill action.
I hesitate to offer this, because I've not experimented with it in the precise scenario you describe. However, being another Joe User with ubuntu, I took a look at rsync as a way to implement backups between my home PC and an Apple Time capsule that I was using as a secondary backup device.
After some tinkering I settled on Unison, which is available in the ubuntu repositories. It's essentially a file synchronizer in the same spirit as rsync (it uses an rsync-like transfer method rather than being a literal rsync front end), with a few bells and whistles. You get 2-way directory replication between your 'local' and 'remote' file systems [though they could both be local or both remote if you choose] and you can essentially script multiple different backups into the single interface. For example, I have "Office" for documents, spreadsheets and the like, "Photos", for camera images, "Music", and so on.
Like most tools, Unison is imperfect, but it's simple to use once set up. The key point with it, as with any product you put in this space, will be knowing and keeping track of your definitive data source. If you have a document that exists on both your local and backup systems, and you edit that file separately at each location, then run Unison, only the most chronologically recent copy will be preserved. To go beyond this level of functionality and get to something that can intelligently merge changes, I think you're going to need something more like a CVS tool... There are hugely expensive proprietary solutions (like Livelink), but I've not come across anyone using a good FOSS alternative. HTH...
I use DirSyncPro to automate my backup tasks. Not sure how to set it up for your particular task, or whether you can, but it might be worth looking into. A lot of options while still being easy to use.
I listen to both RIAA and non-RIAA stuff if I like the music, tangential business/politics notwithstanding.
Duplicity uses librsync to generate the changeset that rsync would use, then stores the change set. If you stored the change set to the USB drive, this could then be "restored" to the destination drive, perhaps? I don't know if there's any way to do this out of the box, or with a bit of scripting, or if this would need to be a whole new toolchain.
there are better solutions than rsync.
rdiff-backup
duplicity
for example.
i probably don't understand what you are trying to accomplish.
I can't see how an internet-based system would be useless. SkyDrive and Dropbox can both sync files when you get an internet connection. I am traveling too (have been for 4 months) and that's what I do, even while the internet is really crap at times. But it will get synced eventually, and it gets synced automatically without me doing anything. On top of that, de-duplication and only syncing the parts that need to be uploaded saves bandwidth.
rsync and other low level solutions are much more work and on top of that you need to carry around extra devices that might get destroyed too. But with SkyDrive or Dropbox the files will always be there no matter what happens.
this is what dump(8) does
As you say, the internet is really crap at times when you are travelling so why make life difficult? It is also fair to say that you obviously think of travelling as a bit of wandering around in the US. Once you broaden your horizons you will find that the internet is often not even an option.
Skydrive is not going to integrate with Ubuntu (have you read the summary yet?) so it is a stupid option whereas there is a dropbox client. It is still flakey and not going to be easy to use as required so he is still better off doing something that will work well and therefore get done regularly. If he is using some client for a service that sometimes works and sometimes doesn't you can guarantee that the time when he needs that backup will be one of the times that it did not work.
If all you are doing is copying files from your laptop to your USB HD, then prepare to be shattered.
A work colleague who did exactly the same as you went to restore some files that he had previously removed (space issue) and learnt the hard way.
He came to work and asked: How can I recover files from my HD? I can still open them from the HD but the images are all corrupt....
My advice was to ALWAYS have 3 (THREE) copies of any important data.
1) Live (Laptop)
2) First backup
3) Second backup, for when your first backup FAILS while you are trying to restore from it.
Full backups are best. Move the backup off site (Even in the back shed with an airtight sealed container).
First, ignore the people who encourage you not to try, and who point you in other directions. Sure, there are much better ways of doing this, but who cares? The whole point is that you should be able to do whatever you want -- and actually doing this is going to leave you _so_ much smarter, trust me.
Some douche criticized you for not knowing beforehand why hard links wouldn't work. . . . because, you know, you should have been born knowing everything about filesystems. To hell with him, sally forth on your journey of discovery, this can be hella fun and you'll get an awesome feeling of accomplishment.
First off, you're going to have trouble using rsync with the flash drive, because I assume your constraint is that you can't fit everything on the flash drive, it's only big enough to hold the differences.
Next, come to terms with the fact that you'll need to do some shell scripting. Maybe more than just some, maybe a lot, but you can do it.
I'd recommend cutting your hard drive in two -- through partitions or whatever -- to make sure that "system" is fully segmented from "data." No sense wasting all your time and effort getting backups of /proc/ and /dev/, or, hell, even /bin/ and /usr/. Those things aren't supposed to change all that much, so get your backups of /home/ and /var/ and /etc/ working first. Running system updates on the road is rarely worth it, and will be the least of your concerns if you end up needing to recover.
Next, remind yourself how rsync was originally intended to work at a high level. It takes checksums of chunks of files to see which chunks have changed, and only transfers the changed chunks over the wire in order to minimize network use. Only over time did it evolve to take on more tasks -- but you're not using it for its intended purpose to begin with, since you're not using any network here. So rsync might not have to be your solution while travelling unless you start rsyncing to a personal cloud or something -- but its first principles are definitely a help as you come up with your own design.
The premise is that, while travelling, you need to know exactly what files have changed since your last full backup, and you need to store those changes on the flash drive so that you can apply the changes to a system restored from the full backup you left at home. You won't be able to do a full restore while in the field, and you won't be able to roll back mistakes made without going home, but I don't think either of those constraints would surprise you too much, you likely came to terms with them already.
So, when doing the full backup at home, also store a full path/file listing with file timestamps and MD5 or CRC or TLA checksums either on your laptop or on the flash disk, preferably both.
Then, when running a "backup" in the field, have your shell script generate that same report again, and compare it against the report you made with the last full backup. If the script detects a new file, it should copy that file to the flash disk. If the script detects a changed timestamp, or a changed checksum, it should also copy over the file. When storing files on the flash disk, the script should create directories as necessary to preserve paths of changed/new files.
For bonus points, if the script detects a deleted file, it should add it to a list of files to be deleted. For extra bonus points, it should store file permissions and ownerships in its logfiles as replayable commands.
The script would do a terrible job at being "efficient" for renamed files, but same is true for rsync, so whatevs.
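For what it's worth, the manifest-and-compare idea above could be sketched roughly like this. It's only a minimal illustration, not a finished tool: the function names (make_manifest, field_backup) and the layout on the flash drive (a files/ tree plus deleted.list) are made up for the example, it compares checksums only (no timestamps, permissions or ownership), and it assumes GNU tools and file names without newlines.

```shell
#!/bin/sh
# Sketch only: names and flash-drive layout are hypothetical.

# make_manifest DIR: print one "md5  ./relative/path" line per file in DIR.
make_manifest() (
    cd "$1" || exit 1
    find . -type f -print0 | xargs -0 -r md5sum
)

# field_backup OLD_MANIFEST SRC DEST
# Copies new/changed files into DEST/files (paths preserved) and writes
# the paths that disappeared since OLD_MANIFEST to DEST/deleted.list.
field_backup() (
    old=$1
    src=$2
    dest=$3
    tmp=$(mktemp -d) || exit 1
    mkdir -p "$dest/files"

    LC_ALL=C sort "$old" > "$tmp/old"
    make_manifest "$src" | LC_ALL=C sort > "$tmp/new"

    # Lines found only in the new manifest: new files or changed checksums.
    # cut -c35- drops the 32-character checksum and its 2-character separator.
    LC_ALL=C comm -13 "$tmp/old" "$tmp/new" | cut -c35- |
    while IFS= read -r path; do
        mkdir -p "$dest/files/$(dirname "$path")"
        cp -p "$src/$path" "$dest/files/$path"
    done

    # Paths listed in the old manifest but missing now were deleted.
    cut -c35- "$tmp/old" | LC_ALL=C sort > "$tmp/old.paths"
    cut -c35- "$tmp/new" | LC_ALL=C sort > "$tmp/new.paths"
    LC_ALL=C comm -23 "$tmp/old.paths" "$tmp/new.paths" > "$dest/deleted.list"

    rm -rf "$tmp"
)
```

At home you would save the output of make_manifest right after the full backup, and on the road call field_backup with that saved manifest, the data directory, and the mount point of the stick. A renamed file shows up as one deletion plus one new file, as noted above.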
I built a very similar set of scripts for managing VMWare master disk images and diff files about ten years ago, and it took me two 7hr days of scripting/testing/documenting -- this should be a similar effort for a 10-yr-younger me. I learned *so* much in doing that back then that I'm jealous of the fun that you'll have in doing this.
Of course, document the hell out of your work. Post it on sourceforge or something, GPL it, put it on your resume.
It seems like the poster confuses two tasks: Backup and version control.
For the former, use archiving tools to perform full and incremental backups. How is it done? You could use find to list files with certain criteria, e.g. last modified timestamps. Pass that list to tar using the -T flag, where you also use -X to exclude files and directories like "*/.thumbnails" and "*/.[cC]ache*". Once the tar is done, use your favourite checksum tool (md5sum, shasum) to store a checksum of the archive in a separate file. Once you get home, move the archives, verify the checksums, and you're done.
As pointed out in the summary, deleted directories and files will be an issue, thus you perform a full backup from time to time. The time frame will depend on how big the changes are, and how much data you have. Personally, I've settled on every second week for /home, but that does not include huge files like RAW files from a DSLR.
As for version control, set up your own git repository with git init, copy it to the laptop with git clone, and you're ready to go. Pro tip: Make sure you name your home computer with a real or "fake" DNS name (e.g. in /etc/hosts, or ~/.ssh/config). You can then simply refer to it as "home" wherever you are, and tools like git and svn stay happy.
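A rough sketch of the find/tar/checksum recipe from this comment, under a few assumptions: GNU tar, absolute paths for the exclude file and the archive, file names without newlines, and a hypothetical helper name (archive_recent). It uses find -mtime as the "last modified" criterion and sha256sum as the checksum tool; md5sum would work the same way.

```shell
#!/bin/sh
# Sketch only: archive_recent is a made-up name, not an existing tool.

# archive_recent SRC DAYS EXCLUDES OUT
# Archives files under SRC modified within the last DAYS days, skipping
# anything matching the patterns listed in EXCLUDES (one per line).
# EXCLUDES and OUT should be absolute paths, because the function cds.
archive_recent() (
    src=$1
    days=$2
    excludes=$3
    out=$4
    list=$(mktemp) || exit 1
    cd "$src" || exit 1
    # Build the list of recently modified files, relative to SRC.
    find . -type f -mtime "-$days" -print > "$list"
    tar -cf "$out" -X "$excludes" -T "$list" || exit 1
    rm -f "$list"
    # Record the checksum beside the archive so it can be verified at home.
    cd "$(dirname "$out")" || exit 1
    sha256sum "$(basename "$out")" > "$(basename "$out").sha256"
)
```

Back home, after moving both files to the backup drive, running sha256sum -c on the .sha256 file from the directory that holds the archive should report OK before you trust the copy.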
If I understand your problem right, How about dar? It can make an empty archive of your main backup to act as a reference (just file info, no files). Then it makes archives relative to that, with just changed files. It can then apply the changes to the original dir, including deletions, if you need that.
Local backups via TimeMachine when your laptop is not connected to the backup disk.
Forgot to mention:
To accomplish this, you'll need to read up on:
- bash
- find
- grep
- awk
- sed
- md5sum
- chmod/chown
- mkdir -p
- diff/patch (for general reference, and also look up binary diffing tools)
Extra extra extra bonus points if you compress the changed files when storing them on the flash drive.
Did you read the summary?
Did you read his history.
Partofme is another of the Bonch/Sharklaser/Tech* etc etc sockpuppet team.
They're a group of Burson-Marsteller reputation managers working for Microsoft. They always get early postings so they can divert discussion to their pro-MS agenda.
If you want to stick with rsync for backups, what you want to do is get beyond having just a mirror on the external drive. After all, a backup should help you recover from mistakes, and mirroring will replicate your mistakes to the external drive too.
Instead, keep a series of backup snapshots on the external drive, representing your data at a certain point in time. Each rsync pass creates a new snapshot, like /mnt/backup/2012-05-20/ which represents your internal drive and shares common files by hardlink with older snapshots such as /mnt/backup/2012-05-19/ but doesn't have links to things that were deleted. Merging backups between two different external devices is then a matter of transferring around whichever dated snapshots you want to mirror or migrate between devices.
Here's a hint about how to implement one such snapshot:
mkdir /mnt/backup/2012-05-20
rsync -Ravx --link-dest=/mnt/backup/2012-05-19/. --exclude=/var/{tmp,cache} --exclude=/tmp /. /mnt/backup/2012-05-20/.
see the man page for more details, and the general inductive step is left as an exercise to the reader. In practice, this sort of backup has very little overhead for snapshots of non-changing files. I allocate approximately 150% of my source data volume for my backup volume to maintain a long-term history of many daily, weekly, and monthly backups. My script decimates older backups (with rm -rf /mnt/backup/YYYY-MM-DD) to turn a series of dailies into weeklies, weeklies into monthlies, etc.
The pathological case for this kind of backup is a huge file that slowly grows, such as /var/log/btmp on older Linux systems where logrotate did not rotate that file. An optimization for rotated logs is to make sure they get a name like basename.YYYY-MM-DD instead of basename.1, so that the name of older files doesn't change each time it rotates. Also use appropriate exclude patterns so you don't waste time and space backing up junk you don't need. You can even go the other way and only selectively retain stuff like:
rsync -Ravx --link-dest=/mnt/backup/2012-05-19/. /./etc /./boot /./home /./var/log /mnt/backup/2012-05-20/.
This is an example of the power of the -R (--relative) naming scheme understood by rsync. The position of the extra "." inside the source file paths is not a typo, but actually essential to the purpose of this example. Learn it. Live it. Love it.
They'd have the skinny on pouches for sure.
Actually, I'm not even a US citizen, and I travel in South East Asia. When talking about shitty internet, I know what shitty internet is. For example when I'm staying in Cambodia, internet can (and often does) go down for the whole day and night. It also happens often. The speed is also ridiculously slow. You can try to get around some of the downtimes by getting mobile internet for backup, but if there's a wider outage, there's nothing you can do.
Yet, I've found Dropbox to be the best backup solution. Files will get there eventually, and I don't need to do anything. There's also revision history of files, so if you upload corrupted files or something like that you can reverse it. You can access them from other computers in case your laptop goes poof (happened to me). And the most important thing - if you get robbed or lose your luggage, you will still have access to your files (and of course, I keep my laptop encrypted).
The good sides of online cloud backup far outweigh the negative ones or worries about bandwidth. Especially since most of the time the files that need backup aren't large. No one in their right mind would try to sync their media files.
I have not used it myself, but I believe that git-annex does exactly what the OP asks for: http://git-annex.branchable.com/
You can get started on fire.
If you mod me down the terrorists will have won
To detect the changes, you can utilise snapshotting (rsnapshot package perhaps?) to do it and it would allow you to see the changes from a day-to-day view.
All you need to then do is transfer the changes listed until the daily.1 (or appropriate folder) to your USB stick. It maintains paths, which is one of the items you're after.
As for deletions, I think a daily email of the snapshot.log file could list this information for you. I wouldn't want it more complicated than that I guess...
Hope this helps,
@chayharley
The good sides of online cloud backup far outweigh the negative ones
Until your cloud backup provider goes out of business or stops offering the service. You think rsync is a lot of work? Try keeping current on the status of Dropbox and SkyDrive services so you can pull your data before they disappear. I guarantee you that a properly stored external drive will outlive either of them.
Oh, and if you were trolling with that first post, kudos on playing it out so long.
timestamps.
could you:
store the timestamp of latest backup, using touch filename.
run find with -newer timestampfile as one of the params, can't remember others.
pipe output to tar.
http://dbaspot.com/shell/399852-creating-tar-file-only-files-have-changed-since-specified-date.html
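The touch/find/tar steps above might look something like this. Treat it as a sketch: backup_since_stamp is a made-up name, GNU find and tar are assumed (for -print0 and --null), the stamp and archive paths should be absolute, and the stamp file is expected to live outside the directory being backed up. Deleted files are not recorded by this approach.

```shell
#!/bin/sh
# Sketch only: backup_since_stamp is a hypothetical helper.

# backup_since_stamp SRC STAMP OUT
# Archives every file under SRC that is newer than the STAMP file,
# then moves the stamp forward for the next run.
backup_since_stamp() (
    src=$1
    stamp=$2
    out=$3
    # Mark the start of this run first, so files modified while tar is
    # running are picked up again next time.
    touch "$stamp.next" || exit 1
    cd "$src" || exit 1
    find . -type f -newer "$stamp" -print0 |
        tar --null -T - -cf "$out" || exit 1
    # Only advance the stamp once the archive has been written.
    mv "$stamp.next" "$stamp"
)
```

The stamp file has to exist before the first run; touching it right after a full backup at home gives the first incremental its starting point.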
Install Windows, use a briefcase
Windows hard links do not span volumes, you are probably thinking of junctions. See:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365006(v=vs.85).aspx
Best post!
Obviously Skydrive is of no use but there are several other alternatives that would be better suited to this purpose although if, as he says, it is for use while travelling an internet based system is useless.
That's why I liked Crashplan when i first saw it. This may sound like a sales pitch but I'm just a happy customer.
With Crashplan you can have multiple destinations for your backup set. I usually have three:
- same HD in case I accidentally deleted some files.
- USB HD for faster recovery in case my primary HD breaks.
- Online "in the cloud", in case my house burns down etc.
Crashplan detects when I plug in the USB HD and automatically starts updating the backup on it. If there's no internet the first two destinations will still keep me pretty safe. Once the internet is back it catches up on the cloud destination.
It works just fine on my Linux Mint laptop as well as my Windows desktop pc.
Or he could save himself a ton of grief and just use rdiff-backup, which happens to use librsync, produces incremental differential backups, stores said backups as files you can simply browse, works equally well on local and remote filesystems, and is dead simple to use. I've used it for years now on a ton of systems.
Write failed: Broken pipe
Try keeping current on the status of Dropbox and SkyDrive services so you can pull your data before they disappear.
Email? Twitter? Facebook? All kind of "push notification" technologies where you don't really need to do anything if you use them.
Besides, we are talking about Microsoft here. A company that has ridiculously long phase outs for their products as a standard practice so businesses feel safe using them (seriously, they announced that version 4.0 of Silverlight will see end of support two years from now). If there is any tech company in the world that you can trust not to just end support suddenly, it's Microsoft.
With backup drives and file transfers, I also tend to run into the problem that I have different UIDs on different systems. Maybe not such a problem for the OP, but you mention backups of /var which is typically full of files owned by system users (e.g. cups, and mysqld/apache if you use the laptop for web development).
Avantslash: low-bandwidth mobile slashdot.
Wait a minute, I thought they were working for Apple. Oh I'm so confused.
Sweet post. Yes indeed you get full five points for reading comprehension and reading between the lines. And of course for great advice. Have been AC on /. since a couple of months after it started, so no problem ignoring the inevitable chaff.
I'm just rather baffled that what I'm looking for isn't already done; figure I must be blind. And at 50 I'm getting slow and less interested in skinning the cat myself. But if it comes to that (we'll see what the thread looks like by morning) then now I've got great advice for the next step. Thank you so much.
I don't know if I'm one of the "smart kids" or not, but I'm a standard non-technical user and have found LuckyBackup or BackInTime run along with an online sync/backup service like DropBox or SpiderOak the most handy options.
Both LuckyBackup & BackInTime are GUI tools that set up rsync rules (even complicated ones) with an easy point-and-click interface, then schedule them in cron. They can do anything rsync can: synchronize the drives so the backup matches the current, or make a backup of everything present plus never delete anything, and they won't waste time/energy by backing up files that haven't changed;
LuckyBackup can be set to keep up to 99 snapshots of anything that changes, and they're structured in the exact same way as the original. BackInTime can have unlimited snapshots, and each backup is in a different nested folder by date/time, with unchanged files within each folder being hard links to the copy in the previous snapshot. Both programs just create file copies, not compressed archives.
Right now, I'm using LuckyBackup for my regular files, and I have BackInTime handling my writing directory so I can go back to an unlimited extent in case -- as happened once -- I realize that I had made a major change several months ago (more backup dates than LuckyBackup tolerates in snapshots) that turned out to be a horrible mistake, so I don't have to try to reconstruct the original from memory.
I use the web/online backup solution partly to keep my computers in sync without a thumbdrive. It's also because it acts as a free minute-by-minute backup with a few months of snapshots, so if an .odt file becomes corrupt while I'm working on it, I don't lose everything since the previous system backup. I lost about 30 hours of intense revisions a couple of years ago because the thumbdrive I was saving & transporting my files on had a glitch that evidently had messed up everything I'd been saving for a few days -- and as it turns out, it's not possible to extract text from a bad .odt file even with a hex editor.
Apathy Sucks, Nobody for President!
Interesting, since I used rdiff-backup in the past and found it a pain. If files are stored as diffs of diffs of diffs of diffs of a full copy, it is rather easy to corrupt the backup. These days, I make backups using rsync, with
For the first backup, omit the --link-dest argument. Only modified files are stored. As long as you don't have tons of big files that have only a few bytes changed, I don't see the advantage of rdiff-backup. However, it requires that the backup filesystem supports hard links (see my other comment on the use of a unix filesystem on a flash drive). When you come back home, you can do something similar (with --delete) to sync back to your regular backup drive.
The modify-window option is there because I have to back up Windows filesystems as well.
Replying to myself: of course, I realize that the OP cannot use a hard-link backup if the usb drive cannot hold all his important data. It's too long ago that I used rdiff-backup; can you reliably split the master backup and the differential backups to different filesystems (say the drive at home and the usb stick)? Preferably without risking corrupted backups if it involves manually merging diff trees.
From what I have parsed the OP wants to have a full back-up on a USB-HDD and the diffs on the USB-Flash, because the Flash is limited in size.
Just write two rsync (or grsync) scenarios: one for HDD and the other for the Flash. On the HDD you will have a directory that is a mirror copy of your laptop. On the Flash you will keep the diffs for the time between syncs to the HDD.
When at home:
1. rsync your laptop to the HDD (mirror).
2. copy the incremental stuff from the Flash to a separate directory (e.g. diff-2012-May-21) on the HDD, and wipe the Flash.
On the road:
just rsync diffs to that Flash.
I guess the recovery plan is quite obvious too. Should any _one_ of those three devices die, you are still good to go.
...a stunned silence fell upon the hall.
Look at the --delete option of rsync.
unison has already been suggested multiple times.
I used unison. It's perfect to sync from A to B (it only syncs the diffs) then modify B and later sync B to A
You also can modify A and B at the same time as long as it's not the same file, then sync and then A and B are identical.
You can even sync in cycles: A->B->C->A with modifications on all three directory trees and it still works
Unison also handles deletions on both sides fine.
Hint: use the -group -owner -times flags
Atari rules... ermm... ruled.
The first problem to consider is how you determine which files to backup. Filesystems like xfs, zfs, and btrfs have nice convenient ways to get a list of changed files (and for xfs and zfs, the contents of those files as well). For ext2/3/4 (and other older unixy filesystems) look at "dump". And of course, if you're working with a completely dumb filesystem, you can always use rsync (if your backup disk is remotely accessible) or some external/manual indexing to figure out what files to backup.
If your filesystem supports some form of dump (send for zfs), you can use that to create your incremental changes. If you only have a list of files, use tar, or rsync. If you want to keep a full backup on the same drive, you can use rsync's batch mode (see the manpage) to efficiently generate incremental backups, for filesystems that don't do a good job of that.
You don't want to hard link between your live tree and a backup tree. That will result in the changes showing up in both trees, obscuring the changes when you run a backup. It's a technique used with rsync for snapshotting, where two backups trees represent the state of the original filesystem at different times. To make that work, the links are broken for the files that differ between the two snapshots.
Or learn Perl. Perl can easily do the same things as bash, find, grep, awk, sed, ...
Avoiding having to learn all the intricacies of these tools was one of the main purposes of Perl.
Try MacBak; it works on Ubuntu as well:
https://github.com/daemonza/MacBak
No, he's made part of his username into a fake uid to make it look like he's been here longer. (hint: the second one is his uid).
For years I have used rsync scripts.
My problem was syncing a desktop and a laptop. So I made upload_to and download_from scripts to sync as needed.
I also try to keep a third master backup copy on a different server so all three are synced.
One problem comes when trying to work on both desktop and laptop simultaneously. Just map a drive and modify files on one side.
I think git has got what you seek. http://git-scm.com/
Wait. Stop scrolling for a sec. O.K. Thanks. - P
You could set up a robocopy script to do incremental backups. Then when you run it just set the 'edited after' property to only backup files since your last home backup. There are plenty of templates for this online as well as a GUI application if you prefer it.
Not to rain on your "old paranoid guy" parade, but dropbox (and I assume most others are too) IS local storage (sync'd to the cloud)...They go out of business, just uninstall, and install the next one. Probably works exactly the same.
In the grand internet tradition of answering a loosely related question which is no use at all to the asker, I will say that the "smart kids" might use something like ZFS, which almost handles this for you. (Take snapshots, save delta streams on your USB stick. Requires the backup to be a ZFS copy, not just the same files.)
Useless right now at least. But I've been pretty happy with switching my storage to ZFS, even if the Linux version sucks. (I mostly don't use the Linux version.) I'd recommend it to anyone who doesn't mind a bit of transitional pain.
Easiest solution is probably to use tar's incremental backups. The -g argument creates a relatively small file listing the files already backed up, so future incremental runs can skip them. If you keep those incremental (snapshot) files on the laptop, then you can put each day's actual tar backup on whatever devices you have handy.
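Here's roughly what the tar -g approach looks like in practice, assuming GNU tar (the -g / --listed-incremental option is a GNU extension). The function name and the file names in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch only: incremental_backup is a hypothetical wrapper around GNU tar.

# incremental_backup SRC SNAPSHOT OUT
# If SNAPSHOT does not exist yet, OUT becomes a full (level 0) archive and
# SNAPSHOT is created. On later runs with the same SNAPSHOT, OUT holds only
# what changed since the previous run. SNAPSHOT and OUT should be absolute
# paths, because the function cds.
incremental_backup() (
    src=$1
    snapshot=$2
    out=$3
    # Archive SRC by its last path component so member names stay short.
    cd "$(dirname "$src")" || exit 1
    tar -cf "$out" -g "$snapshot" "$(basename "$src")"
)
```

The snapshot file stays on the laptop, while each archive can go to the HDD or the stick. The GNU tar manual describes restoring by extracting the full archive and then each incremental in order, passing -g /dev/null on extraction.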
Back in 1993 when I was new to *NIX, I asked a seasoned sysadmin which scripting language I should learn. He started listing all those same tools .... then said - "or you could just learn perl."
I did and you sir are 100% correct. I know grep, find, ksh/sh, sed very well - my awk is extremely weak. Perl hasn't let me down all these years. It is still my go-to scripting language whether I'm hacking some crap code together OR performing a system design and pushing out a beautiful website with XML or json services (check out Perl-Dancer) in a few hours.
Perl can be written cross platform - damn you Windows file systems - with a little effort. I'm constantly hacking Windows Perl scripts using strawberry perl.
Buy a bigger flash drive! 64GB isn't too expensive.
If you are running Linux, then 64G should easily handle your OS and documents with many versions.
It is only video, audio and photos that will eat up lots of storage and you need to get those to a server in a different country ASAP anyway. Use rsync+ssh to do that for the big files.
I saw a 32G USB flash drive for $17 yesterday. Unless you are storing HiDef video, this should cover a few weeks overseas. Be certain to encrypt it AND your laptop, but leave a tiny OS that boots so you can show it to customs and border control.
Do not carry your passphrases with you. Keep the KeePassX data file in the cloud somewhere ... so it isn't on the machine at border crossings. You will not physically be able to decrypt the HDD this way.
and I know this is not what he asked for, but wouldn't the simplest solution be to purchase a second external drive (maybe an SSD for durability) and actually have a complete backup on the road... Or even just take his current external with him - he has it backed up in the cloud any way...
I ask because he never stated why that external drive was stuck at home..
If that won't do, another possible solution.
1) I don't see a need to sync the USB stick when he returns - just perform your usual backup when you return and only care about the USB stick if you have a failure before returning home.
If (1) is workable for him then: 2) how much data will he really need to backup when out-and-about? Can't you setup an 'away from home' backup profile that will only backup the things you'll be changing while away - document, current working folders, but skip the movies, porn and music for this backup (it's still at home and in the cloud anyway)..
Since he expects the deltas to fit on a USB stick, I'm assuming he's not wanting to backup a heap of video editing or some other hungry activity...
Never happened. True story.
Just do a clone with Git. You can track changes, deletions and it can resolve conflicts easily.
For 99.9% of all users a backup is simply that, a failsafe in case their main HD gets lost / damaged. So what if dropbox or skydrive suddenly were to go out of business (as unlikely as that is, you'd know in advance)? You suddenly lose access to that safety copy of your data and will know right away because the client cannot connect anymore. But you still have your primary copy of everything, nothing was lost, you can just switch providers or change your backup strategy. The chances that something would happen right then in the time-frame that the cloud provider fails and you make another copy with another provider are incredibly low. If you can't take that risk then you'd have a third backup anyways.
There are multiple "watch" apps out there in various languages that will run a script every time a directory changes.
Google "watching files with ruby"
Substitute ruby for python or perl or...
A fool throws a stone into a well and a thousand sages can not remove it.
if lsusb | grep
then
backup script
fi
and so on
This is how my backup script works: I do incremental backups to an external USB HDD if it is connected, and just create local restore points if it is not.
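A minimal sketch of that "external drive if present, local restore point otherwise" decision. Note the substitution: instead of parsing lsusb output, it checks whether a directory is an active mount point with mountpoint(1), which says more directly whether the drive is usable. The function name and both paths are hypothetical.

```shell
# pick_backup_target USB_MOUNT LOCAL_DIR
# Prints USB_MOUNT if a filesystem is mounted there, else LOCAL_DIR.
# (Function name and arguments are illustrative, not from the post above.)
pick_backup_target() {
    if mountpoint -q "$1" 2>/dev/null; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}

# Example use (paths are assumptions):
#   target=$(pick_backup_target /media/usbhdd "$HOME/.restore-points")
#   ...then run the actual backup command against "$target"
```

The backup command itself is left out on purpose, since the post does not say which tool it uses.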
Yes. All of my media from Microsoft still PlaysForSure.
Peter predicted that you would "deliberately forget" creation 2000 years ago...
I second the CrashPlan option. I've done the manual backing up to HDD's and USB Drives, but that way leads to madness.
If there is any tech company in the world that you can trust not just going to end support suddenly, it's Microsoft.
Kin?
Someone had suggested using Git and I was going to suggest the same. If you are only backing up documents then it should be easy enough to create repos on the USB HDD, Laptop and USB drive. You can then commit/merge changes between repos to keep in sync, perhaps use some shell scripts to ease administration. Also, I use a product called Super Flexible File Synchronizer to sync a subfolder on my laptop's filesystem with a WebDAV server. It's got lots of features and supports Linux, Windows and Mac. http://www.superflexible.com/
Except their hosting of Windows 7 Widgets. Already gone except for the top 10.
Agreed. If you wade into this you're going to want to use Perl.
I wrote pretty much exactly this about 15 years ago in OS/2 REXX of all things. Trying to use system tools will end up being way too slow; you're going to want to use an in-memory hash. REXX supported hashes before they were called hashes. (indirect array references, I think? can bash do that?)
If you do write this, please do share it somewhere.
Since you can't use dump(8) as others have pointed out, maybe you can do something with UnionFS. After you do your full backup to USB HDD and are about to leave on a trip, mount a unionfs over top of your critical filesystem(s)... Then every day, copy the union layer off to thumb-drive.
Yeah, Crashplan sounds nice, but...
Backup software written in Java, and you have to run it all the time? What a horrible resource hog.
Blowfish encryption with no option for AES? What a strange decision.
These are worryingly bad design choices which don't bode well for data recovery.
Try keeping current on the status of Dropbox and SkyDrive services so you can pull your data before they disappear.
You clearly have never used DropBox. It's just a shared folder that populates on every computer you install it on. If DropBox were to die right this instant, you would still have all of your data on every one of your computers - it would just stop syncing.
If you are worried about the revision history, you could pick one computer to run a rsync job between the DropBox folder and another folder of your choice, or if you are on a Mac just use TimeMachine, or if you are on Windows run something like Areca backup or any number of other free incremental backup solutions. You should be doing this anyway.
Then when DropBox goes out of business, you can switch to one of the several competitors out there and continue as before.
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
I'll take this one: Apple is just a puppet through which the will of Google is manifested. Even... though... Google is... much... smaller than... Apple... or something.
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Or you can learn all the utilities if you want to learn right and be a better sysadmin and work with other sysadmins.
Perl is for programmers that like to pretend they know how to be sysadmins and don't need to share their tools with other sysadmins.
If you want to "copy" junk with its metajunk from an ext3 filesystem on to a FAT32 filesystem, remember that you can always create an 8GB file with dd from /dev/zero, run mkfs.ext3 against that file, and then mount that file as an ext3 filesystem thanks to the loopback adapter. You won't be able to read that junk from a Windows machine, but you probably won't care, and if you create an 8GB file on a 16GB FAT32 flash disk, you'll still have 8GB of space available for use in Windows -- and Windows will be able to copy the 8GB filesystem file and stuff.
Someone else's explanation: http://nst.sourceforge.net/nst/docs/user/ch04s04.html
I would also like to second Crashplan. Mozy broke my cloud backup cherry, and Crashplan has been completely problem-free for me. I run it on every computer that I own and backup to the "cloud" as well as to my basement server. They have a great feature where you can send them a hard drive to seed your initial backup so that it doesn't take a month to do your initial backup.
Another thing that I do is install it on family and friends' computers whenever they ask me to fix them. I just point it to my basement server. That way when their hard drive crashes (and it always does - especially on laptops), it makes my life much easier. It uses some of my drive space, but even post-Thailand-flooding hard drives are pretty cheap relative to time.
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
I know people hate it, but consider setting up an auto-running batch script for backup upon plug-in. I've had no issues doing this from Windows to Linux.
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
There's shitty internet, and there's no internet. OP may be heading for the latter?
You might try bitpocket (see https://github.com/sickill/bitpocket or http://ku1ik.com/2011/07/18/bitpocket-as-a-dropbox-alternative.html ) or Unison.
- David A. Wheeler (see my Secure Programming HOWTO)
Isn't incremental tar just made for that ?
http://www.gnu.org/software/tar/manual/html_node/Incremental-Dumps.html
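A rough sketch of how GNU tar's incremental mode could map onto the HDD-plus-stick setup: a full (level 0) archive at home, then small level 1 archives on the road, each relative to the full one. It assumes GNU tar; the function names and file layout are made up for illustration. Restoring with --listed-incremental=/dev/null also removes files that were deleted between the two archives, which covers the "communicate deletions" part of the question.

```shell
# All function and file names below are illustrative assumptions.

# full_backup SRC_DIR ARCHIVE SNAPSHOT_FILE
# Level 0: archive everything and record directory state in SNAPSHOT_FILE.
full_backup() {
    rm -f "$3"
    tar --create --gzip --file="$2" --listed-incremental="$3" \
        -C "$(dirname "$1")" "$(basename "$1")"
}

# incremental_backup SRC_DIR ARCHIVE SNAPSHOT_FILE
# Level 1: archive only what changed since the full backup. tar updates the
# snapshot file it is given, so work on a copy to keep every incremental
# relative to level 0.
incremental_backup() {
    work=$(mktemp) || return 1
    cp "$3" "$work" &&
        tar --create --gzip --file="$2" --listed-incremental="$work" \
            -C "$(dirname "$1")" "$(basename "$1")"
    status=$?
    rm -f "$work"
    return "$status"
}

# restore_backups TARGET_DIR FULL_ARCHIVE [INCREMENTAL_ARCHIVE...]
# Extract in order; incremental extraction deletes files that no longer
# existed when the later archive was made.
restore_backups() {
    target=$1
    shift
    for archive in "$@"; do
        tar --extract --gzip --listed-incremental=/dev/null \
            --file="$archive" -C "$target" || return 1
    done
}
```

At home that would be full_backup to the HDD, on the road incremental_backup to the stick, with the snapshot file kept on the laptop.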
Thems fightin words. Perl is for everyone.
His account is obviously to troll the real kdawson, whose UID is 3715.
There is no -1 Disagree mod. Slashdot.org/faq defines mod options. USE IT.
Besides, we are talking about Microsoft here. A company that has ridiculously long phase outs for their products as a standard practice so businesses feel safe using them (snip) If there is any tech company in the world that you can trust not just going to end support suddenly, it's Microsoft.
"End support suddenly" is pretty much what they did with Office Live Small Business. A couple of emails of warning, then in about two weeks (maybe a month) the plug was pulled. No migration plan to move to Office 365. Oh, MSFT suggested moving to Office 365, but one would just have to start over. That's just one, personal example of a Microsoft business product that I have experience with (and I'm not too broken-hearted, other than the hassle, as their offering kind of sucked). Consumer-level products? Examples abound both from personal experience (PlaysForSure and MSN Music pop right to mind) and those of others on why you don't want to put too many eggs in the Microsoft basket.
MSFT probably won't be ending support for Windows or Office, but I'd be wary of trusting them with anything else.
Any data that is 'on the cloud' should not be the only residence for that data. If you store your only copy of data that you care about in the cloud, you're an idiot
Good-bye
find . -newer last_backup_timestamp | cpio -o > snapshot$(date +%Y%m%d) && touch last_backup_timestamp
((lambda (x) (x x)) (lambda (x) (x x))) http://www.endpointcomputing.com a scientific approach to custom computing.
Extra extra extra bonus points if you compress the changed files when storing them on the flash drive.
Hint for the bonus question: gzip ;)
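For what it's worth, here is one way the timestamp-plus-find idea and the gzip hint might be combined. It swaps cpio for tar (so gzip is just the -z flag) and assumes GNU find and GNU tar. The function name, the archive naming and the requirement that DEST_DIR lives outside SRC_DIR are all assumptions of this sketch. It does not record deletions.

```shell
# snapshot_changes SRC_DIR DEST_DIR
# Writes a gzip-compressed tar of files under SRC_DIR that changed since the
# previous run into DEST_DIR and prints the archive path.
# Assumes DEST_DIR exists and is not inside SRC_DIR.
snapshot_changes() {
    src=$1
    dest=$(cd "$2" && pwd) || return 1
    stamp="$dest/last_backup_timestamp"
    new_stamp="$dest/last_backup_timestamp.new"
    archive="$dest/snapshot-$(date +%Y%m%d-%H%M%S).tar.gz"

    # Take the new timestamp before scanning, so files modified while the
    # archive is being written are picked up again on the next run.
    touch "$new_stamp" || return 1

    if [ -f "$stamp" ]; then
        (cd "$src" && find . -type f -newer "$stamp" -print0)
    else
        # First run: no timestamp yet, so archive everything.
        (cd "$src" && find . -type f -print0)
    fi | tar -C "$src" --null -T - -czf "$archive" || return 1

    mv "$new_stamp" "$stamp"
    printf '%s\n' "$archive"
}
```

Paths inside the archive are relative to SRC_DIR, so extracting it over a restored copy of the tree puts each changed file back in place.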
Not everything that can be measured matters; Not everything that matters can be measured.
This is the only readworthy comment in this whole thread. Thank you for posting this!
Even better, just use both SkyDrive and Dropbox on the same folder. Problem solved.
Have 2 different rsync tasks: copy laptop to stick (for when you're on the go) and copy laptop to backup HDD (for when you get home). If your laptop breaks while on the go, you copy the stick to the HDD.
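A sketch of what those rsync tasks could look like as one small helper. The function name and the example paths are hypothetical. Be aware that --delete removes files from the destination that are gone from the source, so a mistyped destination can lose data.

```shell
# sync_tree SRC_DIR DEST_DIR
# Mirror SRC_DIR into DEST_DIR, preserving permissions and timestamps and
# deleting files in DEST_DIR that no longer exist in SRC_DIR.
sync_tree() {
    mkdir -p "$2" && rsync -a --delete "$1"/ "$2"/
}

# Example use (all paths are assumptions):
#   on the road:            sync_tree "$HOME/Documents" /media/stick/Documents
#   back home:              sync_tree "$HOME/Documents" /media/hdd/Documents
#   laptop died on the trip: sync_tree /media/stick/Documents /media/hdd/Documents
```

This needs the stick to hold the whole tree, not just the changes, which is the trade-off of this suggestion.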
If you're worried about losing all your data at customs or whatever, you need a networked system.
You can't have an 8GB file in a FAT32 filesystem. You'd need exFAT for that, but if you're going to use a filesystem that many existing Windows installations can't read, you might as well format the stick with a proper filesystem in the first place.
hotplugd + duid's and rsync (which you sort of already use) on OpenBSD makes this a cakewalk, the real kind, not the Iraq kind. Or find their equivalents in linux.
Perl is for anyone, sure, but it's certainly not for everyone.
If the author and the user are the same singular individual, and always will be, then sure, perl can be a fun toy and a timesaver.
But if you're mentoring junior sysadmins as part of your succession plan in a collaborative and evolving ecosystem, Perl is pretty much the worst choice available.
That's fine if you're using it to make your files more accessible, but he is suggesting that it be used for backups. If you need to access the data, it means you probably don't have a local copy.
The chances that something would happen right then in the time-frame that the cloud provider fails and you make another copy with another provider are incredibly low.
Noticing that your backups have failed isn't the time to go searching for another backup solution.
bzip is better than gzip if space is at a premium. There are even multi-core versions of bzip2 that are very efficient. You could also look at p7zip. If you want a really efficient compressor, try nanozip as well, although its page says it is still experimental, but it seems to be at the top of several compression benchmarks.
Make full "level 0" dumps at home to the big disk. Make delta "level 1" or incrementing levels dumps to the flash drive. Each level will back up everything that changed since the previous level.
Now, you want the backups at home to be a file copy - there's no reason you can't do that and then do level 1 dump backups on the road - ie, never actually make a level 0. Just do your rsync and then update /etc/dumpdates to reflect your rsync-level-0.
This is exactly what dump was designed to do, and it's going to be a lot easier than hacking rsync to fit.
If you have time to figure out what this guy needs, then you need a job or at least to be paid for doing the analysis. Geesh.
Why? One backup doesn't preclude the possibility of a second backup. Since hard disks fail, does that mean backing up to a hard disk is useless? Websites fail too, and from that you are concluding that backing up to websites is useless.
Bingo Dictionary - Pragmatist, n. A myopic idealist.
Yes, bash can do that, and I highly recommend learning not only bash but also plain Bourne Shell. Still, if you want to create an actual backup tool and release it, I'd go with Perl. If you want it to be cross platform and not limited to *nix systems, again, Perl.
If you haven't mastered the *nix tools mentioned, doing this with shell scripts and those tools, especially if you limit yourself to plain Bourne Shell (Heirloom Bourne Shell is something to look into if your system, like linux distros, has /bin/sh as a symlink to bash/dash), is a *great* learning experience. Another consideration for shell scripting vs. perl is speed and memory use. Weigh these to make the right choice.
Also, there can be other reasons against perl and for shell scripting: wanting to share it between *nix systems that might not have perl, for one. Shell scripting can be great for cross platform, but limits you to *nix likes, and often limits you to plain Bourne Shell. One of many things bash has but plain sh has not is hashes, and you can program a way around that, but it will be hackish, slow and resource eating - but it's also an awesome experience :)
In capitalist USA corporations control the government.
...also, some people can be jerks about these choices, like this poster replying about perl:
Perl is for programmers that like to pretend they know how to be sysadmins and don't need to share their tools with other sysadmins.
...that is such a load... while with shell scripting combined with *nix tools you can get far, there are limits - scripts are great when used where they best fit, and sometimes the cost of hackish solutions around limitations, the cost of taking tools beyond their optimal area, etc. is just too large to justify - except for doing something just for fun. Perl is a serious programming language, and even in a sysadmin's scripts perl can sometimes be better than shell scripts (especially if it's not even meant to be shared with others), but there is also a point where a project starts to look more like a program than a script (not going into debating the exact difference between those, I know anyone who can program probably knows what I mean), and that's one point where one should consider if shell scripting should give way to some other high level programming language. And as a high level programming language, perl has earned its place; the above quote has no place in the real world.
In capitalist USA corporations control the government.