Ask Slashdot: It's World Backup Day; How Do You Back Up?
MrSeb writes "Today is World Backup Day! The premise is that you back up your computers on March 31, so that you're not an April Fool if your hard drive crashes tomorrow. How do Slashdot users back up? RAID? Multiple RAIDs? If you're in LA, on a fault line, do you keep a redundant copy of your data in another geographic region?"
Simple. Redundancy backup.
It's a raid.
Apple hate aside, Time Machine is an amazingly excellent backup system.
It backs up to a Netgear ReadyNAS configured in RAID 5. Hourly, daily, weekly backups. I've never lost anything thanks to this great system.
On Linux I try to approximate this with BackupPC.
http://backuppc.sourceforge.net/
It is really an excellent piece of software, though nowhere near as refined, of course. You pretty much only get daily backups, though, since the Linux kernel does not track filesystem changes, so hourly backups would be prohibitively expensive.
It's easier to fight for one's principles than to live up to them.
With a loud beeping noise.
William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
Can we moderate this article flamebait?
Change is certain; progress is not obligatory.
I currently sync my files across three computers, each of which does a Time Machine backup. The files are also backed up via Jungledisk to Amazon S3. Occasionally I do full-disk images of things.
Files that would be inconvenient to lose, but which are not irreplaceable, are stored on a Drobo (redundant drive enclosure). This includes, for instance, my music library which could be reripped from CD.
Between 3 active computers I use, there's enough redundancy since they're rarely in the same place. SpiderOak manages absolutely, completely vital stuff (currently my thesis drafts).
But there's no real, constructive and useful pattern to it yet. The problem is less backups and more change management. Keeping copy-on-write sane on Windows is difficult, and migrating my server's XFS partition to ZFS is problematic, since doing so needs tons of storage that I presently can't afford.
The issue is far less "backups" and more "making them meaningful". Backing up is useless if I overwrite the media with the important changes, or it takes forever to dissect a working copy of the data.
My weekly backups: something like:
0 0 * * 0 /home/me/backup.sh
#### backup.sh #####
cp -r /home/me/* /dev/null
I haven't missed a backup yet :-)
Amateur. I take polaroids of my platters and store them in a safe deposit box.
I basically use this shell script once a week:
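drive=/backup/drive
bpaths=/some/paths
for d in $bpaths ; do
    # Archive name is the last path component of the directory
    dout=$(basename "$d")
    echo "Backing up $d as $dout"
    # Remove any previous archive for this path; the glob must sit outside the quotes to expand
    ionice -c3 rm -f "$drive/bkup/$dout".*
    # tar to stdout, compress, then encrypt (note the passphrase is visible in the process list)
    ionice -c3 tar -cf - "$d" | gzip -c | ionice -c3 openssl aes-256-cbc -salt -out "$drive/bkup/$dout.tgz.aes" -pass pass:"WouldntYouLikeToKnow"
done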
I then copy the data to my USB drive on my keychain if it's plugged in. (Hence the encryption.) I also have a scheduled task on my laptop to copy the data from my desktop the next day.
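An encrypted archive like this is only useful if it can be unpacked again. The following is a minimal restore sketch, assuming the archive was produced by a `tar | gzip | openssl aes-256-cbc -salt` pipeline like the one above. The function name `restore_archive` and the `BACKUP_PASS` environment variable are hypothetical, chosen for illustration; `-pass env:` is used so the passphrase stays off the command line. Archives written by an OpenSSL older than 1.1.0 may also need `-md md5`, since the default key-derivation digest changed in that release.

```shell
# Hypothetical helper: decrypt and unpack one archive produced by
#   tar -cf - DIR | gzip -c | openssl aes-256-cbc -salt -out FILE ...
# The passphrase is read from the exported variable BACKUP_PASS.
restore_archive() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # -d decrypts; gzip -dc decompresses to stdout; tar unpacks into $dest.
    openssl aes-256-cbc -d -in "$archive" -pass env:BACKUP_PASS \
        | gzip -dc \
        | tar -x -C "$dest" -f -
}
```

Usage would look something like `export BACKUP_PASS=...; restore_archive /backup/drive/bkup/paths.tgz.aes /tmp/restore-test`, where both paths are placeholders. Restoring into a scratch directory first avoids overwriting anything live.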
(T>t && O(n)--) == sqrt(666)
My house is full of Macs, so I use Time Machine for on-site backup - each machine has its own Time Machine drive dedicated to it. Each machine also runs nightly image backups using SuperDuper onto yet other drives dedicated to that purpose.
All info is also backed up offsite. I use CrashPlan Pro, which backs up over the net to their servers somewhere in the American Midwest (Milwaukee?) - in the event of a fire or a giant sinkhole opening up under my house, I can get the full contents of all my computers shipped to me within a few days on external hard drives.
Mudge
In theory, theory and practice are the same.
In practice, they're not.
Live on the edge guys...
When you boot up in the morning and it takes a little longer than usual, the heart beats a little faster and you think "OMG is the machine going to fail? My data will be gone". Or perhaps there's an electrical storm to liven your day up - "If that thunder gets any closer I might have to shut down the PC, but if lightning hits then everything's toast !".
These scenarios, and many others, all get the blood pumping in fear. If the computer /does/ boot or you /don't/ get toasted by bolts of electricity then the sense of relief is wonderful !
Try it - it's fun
;-)
while (true != false) process_more_stupid_code();
I use a secure distributed grid. The software is an open source tool, Tahoe LAFS (http://tahoe-lafs.org). The grid is composed of ~15 servers contributed by different people all over the world. There are a half dozen servers in various locations in the US, about the same number in Europe, and the remainder in Russia and the Ukraine.
My files are AES256-encrypted on my machine, split into 13 pieces using Reed-Solomon coding, any five of which are sufficient to reconstruct my files, and then those 13 pieces are distributed to the servers in the grid. I run daily backups, but since uploads to the grid are idempotent, only the changed or new files are stored. I also run a bi-weekly "repair" operation which checks all of my files (all versions, from all backup runs) to see if any of their pieces are lost. If so, it reconstructs the missing pieces and deploys them to servers in the grid. The individual servers in the grid are fairly reliable, but problems do happen, so repair is important.
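For a rough feel of what a 5-of-13 layout buys, the arithmetic can be sketched in shell. `share_stats` is a hypothetical helper, not part of Tahoe-LAFS. It prints the storage expansion factor (13/5) and the probability that fewer than five pieces survive, assuming each piece survives independently with the same probability. That probability (0.9 in the example call) is purely an illustrative assumption, not a measured property of this grid.

```shell
# share_stats K N P
#   K = pieces needed, N = pieces stored, P = assumed chance that any one
#   piece is still retrievable (independence between servers is assumed).
share_stats() {
    awk -v k="$1" -v n="$2" -v p="$3" 'BEGIN {
        loss = 0
        # The file is lost only if fewer than k of the n pieces survive.
        for (i = 0; i < k; i++) {
            c = 1                              # binomial coefficient C(n, i)
            for (j = 1; j <= i; j++) c = c * (n - i + j) / j
            loss += c * p^i * (1 - p)^(n - i)
        }
        printf "expansion=%.1f loss=%.2e\n", n / k, loss
    }'
}

share_stats 5 13 0.9
```

Under those assumptions the loss probability comes out well below one in a million between repairs, at the cost of storing 2.6 times the original data. Correlated failures would make the real figure worse.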
I get about 100 KBps net upload rate, so this isn't a good solution for backing up terabytes, and the occasional "surge" in my data generation (usually caused by a day of heavy photo-taking) often causes my "daily" backup to take a few days to run, but all in all it works very well.
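To put that upload rate in perspective, here is a small back-of-the-envelope sketch. It assumes "100 KBps" means 100,000 bytes per second sustained around the clock; `upload_days` is a hypothetical helper, and the 20 GB figure for a heavy photo day is an illustrative guess rather than a number from the post.

```shell
# upload_days BYTES RATE_BYTES_PER_SEC
#   Prints how many days a transfer takes at a constant rate.
upload_days() {
    awk -v bytes="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", bytes / rate / 86400 }'
}

upload_days 1000000000000 100000   # 1 TB (decimal)
upload_days 20000000000 100000     # 20 GB, e.g. a day of heavy photo-taking
```

A terabyte works out to several months of continuous uploading, while a photo surge of a few tens of gigabytes lands in the "few days" range described above.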
Should my server ever die, I only need two pieces of information to get all of my data back: The grid "introducer" URL, which will allow me to set up a new node connected to the grid, and my root "dircap", which is a ~100-byte string containing the identifier and decryption key for the root directory of my archive. That directory contains the decryption keys for the files and directories it references.
Since this grid is all volunteer-based, the only cost to me for this backup solution is the hardware and bandwidth I provide to my grid (I provide 1 TB of disk and grid usage consumes a fairly small fraction of my Comcast connection), plus the time I spend administering my server and checking to see that my backup and repair processes are running. Oh, and I also contribute (a little) to the Tahoe LAFS project, but that's due to interest, not a requirement.
I'm very, very happy with this solution.
BTW, the grid could use another 20 nodes or so, if anyone is interested. There's a fair amount of trust required of new members to the grid, though, so it might take us a while to vet new members. The trust is required not because other members of the grid might have access to files that are not their own, but because we need to verify that new members will behave appropriately -- providing their fair share of storage and bandwidth, and not consuming too much.
Anyone interested should check out the grid's policies and philosophy at: http://bigpig.org/twiki/bin/view/Main/WebHome. If all of that looks good, join the mailing list, introduce yourself and we'll consider allowing you to join the grid.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Think through what you're backing up and why. For most people a thumbdrive should be sufficient for personal data; software can be reinstalled as needed. If you have more data than will fit on a thumbdrive you need to look at what's important.
Really large volumes of data almost always are static; usually music, eBooks, or video which can just be backed up once on a DVD and put away. No need to keep copying that stuff over and over.
Backing up software projects is another issue. A remote versioning site is best. Working in Java you'll need all the space you can afford; for a language like Python an old floppy drive is sufficient.
All of our important files (even the kids' files) are on the server. It backs itself up automatically 3 times per week to external USB drives. I rotate the USB backup drives every few weeks. So we need do nothing special today, as the backup works fine.
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
World backup day? How about world test your restore day? All the backups in the world don't mean anything unless you test your restores and know your data.
(1st sig) If this were a snappy sig, you'd be reading it right now. (2nd sig) I'm a karma whore. >Insert FUD here
Killthre... I mean The Tao of Backup
rsnapshot seems to work pretty well for incremental rsync'd backups for me. It uses hard links to maintain the older snapshots, to save on total filesystem usage. It can do rsync over ssh for backing up remote servers/pushing local vital data to a safe remote location.
Local backup server uses Linux software RAID for good measure (5x1TB RAID 5 + 10x2TB RAID 6).
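rsnapshot keeps older snapshots cheap by hard-linking files that have not changed, so each snapshot looks like a full copy but only changed files take new space. The sketch below shows just that rotation idea, not rsnapshot's actual implementation. The `snapshot_rotate` name and the `snap.0`/`snap.1` directory layout are hypothetical, and GNU `cp` is assumed for the `-l` option.

```shell
# snapshot_rotate ROOT
#   Expects ROOT/snap.0 to hold the latest backup and turns it into
#   ROOT/snap.1 using hard links, so no file data is duplicated.
snapshot_rotate() {
    root=$1
    rm -rf "$root/snap.1"
    # -a preserves attributes, -l hard-links files instead of copying them.
    cp -al "$root/snap.0" "$root/snap.1"
}

demo=$(mktemp -d)
mkdir -p "$demo/snap.0"
echo "unchanged" > "$demo/snap.0/a.txt"
echo "version 1" > "$demo/snap.0/b.txt"
snapshot_rotate "$demo"
# The next backup run must replace changed files (write a new file, then
# rename it over the old one) rather than edit them in place, otherwise the
# older snapshot would change too.
echo "version 2" > "$demo/snap.0/b.txt.tmp"
mv "$demo/snap.0/b.txt.tmp" "$demo/snap.0/b.txt"
ls -li "$demo/snap.0" "$demo/snap.1"   # a.txt shares one inode, b.txt does not
rm -rf "$demo"
```

rsync replaces changed files by rename by default, which is why it pairs well with this kind of hard-linked snapshot tree.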
Backup is only half the problem. Restore is the other half. And indeed that's where I've usually had the most problems. The third problem is validating the restore. You always worry that you are either going to overwrite something on the restore target or miss something on the restore source and end up in an inconsistent state.
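One way to take some of the worry out of validating a restore is to restore into a scratch directory and compare checksums against the source. A minimal sketch, assuming GNU `sha256sum` is available; `tree_manifest` and `verify_restore` are hypothetical helper names. It compares file names and contents only, not permissions, timestamps, symlinks, or empty directories.

```shell
# tree_manifest DIR
#   Prints "checksum  ./relative/path" for every regular file, sorted by path.
tree_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# verify_restore SOURCE RESTORED
#   Succeeds only if both trees hold the same files with the same contents.
verify_restore() {
    src_manifest=$(tree_manifest "$1") || return 1
    dst_manifest=$(tree_manifest "$2") || return 1
    [ "$src_manifest" = "$dst_manifest" ]
}
```

For example, `verify_restore /home/me /tmp/restore-test && echo "restore matches"`, with both paths as placeholders. Files that changed since the backup ran will naturally show up as mismatches.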
Time Machine is revolutionary because it is so simple and seems to be almost flawless. I've had lots of backup systems over the years, including dump 0, but every one has been plagued with issues that arose when things were off normal. I've cobbled together all sorts of things with rsync and cpio, but the only thing that comes close to working as flawlessly as Time Machine is a NetApp.
At work, where I can control the remote servers securely on a closed network, I am able to use Time Machine for a remote backup. But at home I don't have a remote server I can target for the remote backup.
To do a remote backup at home I use CrashPlan. I looked at a lot of competitors like Mozy but settled on CrashPlan for two killer reasons. The giant problem with all these commercial backups is that while the incremental backups are simple over the net, the restore of a whole hard disk cannot be done over the net. You have to pay them to burn DVDs and send them to you. And that assumes you know what time period you want to recover.
Unlike all the other methods, CrashPlan lets you pick a buddy who also runs CrashPlan, and then you can back up your disks to each other's computers. If you need to do a massive restore you just drive over to your buddy's house, pick up the drive, bring it home, and restore locally. This also solves the problem of the first dump being too large to send over the net. You do it locally, then drop the drive off with your buddy.
Brilliant!! Plus, with CrashPlan you pay for the app once, not monthly.
I've used it for years now and it works very well and is very easy to set up. All your files are encrypted so buddies can't read each other's drives.
The only flaw with CrashPlan is that it runs in Java, so you have this instance of Java running 24/7, and not to put too fine a point on it: Java sucks. I don't know if it is CrashPlan or other things that run in the Java VM, but over the week it bloats up to 600MB to 800MB. The workaround is to kill the Java VM every few days. Empirically CrashPlan is robust enough to survive this and restart. But that's a really awful solution.
Some drink at the fountain of knowledge. Others just gargle.
Indeed it is not as efficient as it could be. However, using it is only slightly more complicated than "buy a usb hard drive and plug into computer"
An efficient, totally ideal process that no one actually bothers to use because it's either too complicated, or because it isn't actually licensed for your platform or whatever, is no backup system at all.
Also, ZFS is a filesystem that can be set up to preserve version information. It's not a backup while it's on the same disk....
Can you be Even More Awesome?!
Chinese espionage hackers do it all for us free. They copy our stuff over to their side. It's as off-site as you can get.
Table-ized A.I.
I have two systems I use.
For my servers, I use AMANDA with encrypted virtual tapes to do nightly backups. Shortly after the backups run, cron calls a shell script in order to copy the virtual tapes to an offsite location via rsync.
For my desktop PC, I don't need to back up as often, so I do a weekly backup via Windows Backup to a TrueCrypt volume on an external hard drive. When it's not being used to back up my PC, I keep the external hard drive at my office. I figure if something happens where both my office and home are destroyed, then at that moment I've got bigger problems to worry about than my data. :-)
Just my $.02...
I use both. Time Machine backs up to a deduplicated RAID-Z volume. When Time Machine backs up a file (e.g. a VM disk image or an 8MB stripe from a sparse image) with only a few small changes, the deduplication kicks in and means it only takes up a couple of blocks.
I am TheRaven on Soylent News
My backup for a multi-boot laptop that other solutions (e.g.: running from one OS) don't seem to work for:
1) Buy a second copy of your main hard-drive + USB Interface (SATA enclosure)
2) Boot Linux on computer using CD
3) Use dd to mirror the entire HD to the external HD. Run it before you go to bed, set up to shut down when done. Save stdout/stderr somewhere like a USB flash drive.
4) Wake up to a backup.
The advantage of this is when your hard drive fails, recovery is about 60 seconds away. Swap out one hard drive and you are done. Or you can recover specific files by just using the backup HD like a normal external HD, since everything is just under normal filesystems. If you'll be on business for a while take your second hard-drive with you (try to store somewhere it won't get stolen with laptop).
I actually keep two mirrors, partially because of travel and wanting to have one backup with me. This also makes sure that if your computer fails halfway through doing the mirror due to a power surge, it doesn't fry your original + your only mirror. Keep one at a friend's house or similar.
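The dd step can be sketched roughly as follows, assuming GNU `dd` and a target exactly the same size as the source (as with an identical second drive). `mirror_disk` is a hypothetical helper name. With real disks the arguments would be whole-disk device nodes, and swapping them would overwrite the original, so double-check which is which before running anything like this.

```shell
# mirror_disk SOURCE TARGET
#   Copies SOURCE onto TARGET block for block, then verifies the copy.
mirror_disk() {
    src=$1
    dst=$2
    # bs=4M uses large blocks for speed; conv=fsync flushes everything to
    # the target before dd exits.
    dd if="$src" of="$dst" bs=4M conv=fsync || return 1
    # cmp exits non-zero at the first differing byte (or a size mismatch).
    cmp "$src" "$dst"
}
```

From the live CD this might be run as `mirror_disk /dev/sdX /dev/sdY > /mnt/usb/mirror.log 2>&1; shutdown -h now`, where the device names and log path are placeholders for whatever your system actually shows.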
That's nothing. You should see my butterfly collection...
I'm surprised rdiff-backup hasn't been mentioned yet. It's a very nice piece of software, does incremental backups, and is easy to automate.
-- B.
This sig does in fact not have the property it claims not to have.
I have been using RAID for many years — RAID-1 at work as I only have two drives and don't need much storage space, and RAID-5 at home. A couple of years ago when I upgraded my computer at work, I downloaded at least three different backup systems to try out. The goals were simplicity of use, keeping historical versions of files, and relatively low storage space.
After setting up bacula, I never bothered with the other backup applications.
I found bacula to be highly flexible; it adapted very well to the set of many virtual machines I use and is the easiest to maintain. I just set it up once (or after any major re-partitioning) with a specific list of files and directories to back up or exclude, then practically forget about it. It's saved my files a number of times already from accidental deletion or overwriting, and I used it once for a full restore at home after upgrading my computer, including a new RAID array.
At work my excess hard drive space is enough to store all my full and incremental backups locally, but I also have it back up critical files to a corporate NFS server. At home I use LTO-4 tapes, which provide plenty of backup storage for over 2 terabytes of data; and whenever it runs a full backup I take the used tapes off-site for extra security.
I'm very excited (about backup software?) about this new backup program from an old buddy of Linus Torvalds':
http://liw.fi/obnam/
It seems like it will be the most featureful, forward-thinking backup software, ever: deduplication across multiple clients, compression, and strong encryption for untrusted storage. He's very keen on unit tests, too, and it has good verify and restore functions. I'm already using it for some things. GPL, of course, so no proprietary lock-in, ever.
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
WHS works a lot like Time Machine (I suppose). All our machines are backed up automatically every day to a server (in another building). And most importantly, the restore works and is surprisingly fast! I've had two machines go belly up and didn't lose any data.
Wanted: witty unique signature. Must be willing to relocate.