Best Home Backup Strategy Now?
jollyreaper writes "Technology moves quickly and what was conventional wisdom last year can be folly this year. But the one thing that's remained constant is hard drives are far too large to back up via conventional means. Tape is expensive and can be unreliable, though it certainly has its proponents. DVDs are just too small. There are prosumer devices like the Drobo, but it's still just a giant box of hard drives, basically RAID. And as we've all had drilled into our heads, 'RAID is not backup.' When last this topic came up on Slashdot, the consensus was that hard drives were the best way to back up hard drives. Back up your internal HDD to an external one, and if your data is really important, have two externals and swap one off-site once a week. Is there any better advice these days?"
Switching off-site backups every week is an unnecessary hassle. Back up to an external hard drive and an online backup service. Anything more than that is overkill unless you have really important data.
I decided that I have three main "categories of data":
- easily replaceable: This is stuff that is fairly easy to replace.. for instance I have ripped a huge portion of my DVD collection (for my own use). If I lost this data, it would not be a tragedy .. just a pain in the ass.
- hard to replace: This is stuff that does exist "out there".. but would not be easy to replace. This includes old TV shows that you can't buy or if you can are very hard to find.
- irreplaceable: Self-explanatory.. these are my documents, code, photos, etc. that could not be replaced if lost
I keep everything besides OS files on a file server running RAID 6 (double parity, so it survives two drive failures). This is the first layer. To me this is adequate to protect the "easily replaceable" stuff (which in my case constitutes a huge chunk of file space).
I back up everything in the "hard to replace" and "irreplaceable" categories to a separate hard disk (removable, but it stays in the system); so far 1TB has been enough to hold all this data. I make a secondary backup to a second removable drive and store this "off site". This secondary backup does not get updated very often, which is the trade-off I guess, but it provides a "last hope" if something crazy ever happened, like my house burning down.
Oh.. and backups are encrypted!
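A minimal sketch of that three-tier scheme as a lookup table, in case it helps to see it written down. The tier names and destination labels are made up for illustration and do not come from any real tool.

```python
# Hypothetical tier names and destination labels, chosen only for illustration.
PLAN = {
    "easily_replaceable": ["raid6_file_server"],
    "hard_to_replace": ["raid6_file_server", "local_backup_disk", "offsite_disk"],
    "irreplaceable": ["raid6_file_server", "local_backup_disk", "offsite_disk"],
}


def destinations_for(category):
    """Return the storage layers that should hold data in this category."""
    return PLAN[category]


def needs_offsite(category):
    """True if this category is important enough to get the off-site copy."""
    return "offsite_disk" in destinations_for(category)
```

The point of writing it out is that only the two smaller categories ever reach the backup disks, which is why 1TB is enough.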
rsync.
That's the protocol. Now what media do you recommend? Another hard drive?
Windows Home Server actually has very good backup options.
a) It allows for folder duplication on shared folders, protecting your shared files against a single hard drive failure.
b) It allows you to add a hard drive as a backup drive, basically to dump all the shared folders, which can then be taken offsite.
c) Jungle Disk has a WHS plugin (and there's an alternate Jungle Disk plugin on whsplus.com which is allegedly better), which provides your online protection.
Automated daily backups mated with Volume Shadowing mean that not only is your data safe, but previous versions are available too.
I've re-purposed a computer as a backup server, which lives at my parents' house. It runs Ubuntu, with ZFS running over FUSE. Each night, a cron job runs zpool scrub on my storage pool, and if there is a problem, it sends me a text.
My MacBook Pro will use Time Machine over NFS over SSH to make the actual backups from my dorm/wherever I happen to be.
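The nightly health check described above might look roughly like the sketch below. It assumes that `zpool status -x` prints "all pools are healthy" (or "pool 'NAME' is healthy") when nothing is wrong, and `notify` is a hypothetical callback standing in for whatever sends the text message. A scrub runs in the background, so a check like this would be scheduled some time after the scrub starts.

```python
import subprocess


def pool_needs_attention(status_output):
    """Return True unless `zpool status -x` reported the pool(s) healthy."""
    text = status_output.strip()
    # Assumed healthy messages: "all pools are healthy" or "pool 'tank' is healthy".
    return not (text == "all pools are healthy" or text.endswith("is healthy"))


def check_pools(notify):
    """Run `zpool status -x` and call notify(message) if anything looks wrong.

    `notify` is a hypothetical callable, e.g. one that mails an SMS gateway.
    """
    result = subprocess.run(
        ["zpool", "status", "-x"], capture_output=True, text=True
    )
    if result.returncode != 0 or pool_needs_attention(result.stdout):
        notify(result.stdout or result.stderr)
```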
Commence CDDL/GPL/BSD Flamewar.
-jX
Not really; keep doing it like that. For how to do that, read this: http://jwz.livejournal.com/801607.html
I'm kinda an 'option 1' guy, but stuff that's really important, I just burn on to DVD every so often.
The other option, now that most folk have half-decent connections, is to set up an rsync to a buddy's machine (and reciprocate), using encryption. You now have an automatic off-site backup.
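A rough sketch of the rsync-to-a-buddy idea, wrapped in Python so it can be run from a scheduled job. The host name and paths in the usage note are placeholders. Note that `-e ssh` only encrypts the transfer; encrypting the files as they sit on the other machine is a separate step that this sketch does not cover.

```python
import subprocess


def build_rsync_command(source, destination, delete=False):
    """Build an rsync invocation that copies `source` to `destination` over SSH.

    -a preserves permissions and timestamps, -z compresses in transit,
    and -e ssh tunnels the transfer through SSH.
    """
    cmd = ["rsync", "-a", "-z", "-e", "ssh"]
    if delete:
        # Also remove files on the destination that no longer exist at the source.
        cmd.append("--delete")
    cmd += [source, destination]
    return cmd


def run_backup(source, destination):
    """Run the transfer; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(source, destination), check=True)
```

Something like `run_backup("/home/me/", "buddy@example.org:/backups/me/")` would do the copy. Leaving `delete` off means an accidental deletion at home is not immediately repeated on the remote copy.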
http://backuppc.sourceforge.net/
Get an old P3 for free somewhere and load this up on it with a big disk or two for storage, put it on your network, and run it. That's what I do and it works like a charm. I went through all the options over the years, tape, DVDs, manual copying to a server.
BackupPC backs up all my Windows and Linux PCs. It backs up only what I tell it to, and it does both full and incremental backups. It's sort of a pain in the ass to set up (I use cygwin rsyncd on the Windows boxes and regular rsyncd on the Linux boxes), but it works well.
Only drawback is it is still on site.
Just because the backup solution _uses_ RAID doesn't mean the old adage applies to it. As long as you are using it as external backups all is well.
What that phrase IS telling you, however, is not to use RAID on the machine you want to back up and expect it to do what you want.
I think the OP's post arose from a misunderstanding of what "RAID is not backup" means.
The adage isn't an admonition not to use hard drives as a means of backing up data. Rather, it is concerned with the fact that any change to your data is committed to each duplicate volume in a RAID, so if you delete an important file, for example, it's just as gone as if you weren't running a RAID.
That's completely different from mirroring your drive onto an external hard drive and putting it on a shelf somewhere. If you delete a file on your live system, you can restore from that backup.
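A toy model of the difference, with plain dictionaries standing in for disks. It is not how any real RAID implementation works; it only illustrates that a mirror repeats every change immediately while a backup is a detached point-in-time copy.

```python
class Mirror:
    """Two 'disks' that receive every write and every delete at the same moment."""

    def __init__(self):
        self.disks = [{}, {}]

    def write(self, name, data):
        for disk in self.disks:
            disk[name] = data

    def delete(self, name):
        for disk in self.disks:
            disk.pop(name, None)


def take_backup(disk):
    """Point-in-time copy that later changes to the live disk cannot touch."""
    return dict(disk)


mirror = Mirror()
mirror.write("thesis.txt", "draft 3")
backup = take_backup(mirror.disks[0])  # the drive on the shelf
mirror.delete("thesis.txt")            # the accidental deletion

# The file is now gone from both halves of the mirror,
# but backup["thesis.txt"] still holds "draft 3".
```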
Are you joking? S3 is perhaps the most overpriced way to back up data.
You're paying at least $0.15/GB/month for the space, and then paying $0.10/GB transferred in and $0.17/GB transferred out.
So if you were to use 1TB of storage over 5 years filling it perhaps 3 times over that period and reading it 10x, it would cost $1800 for the space alone, $300 bw in, $1700 bw out, for a total price of $3800.
Meanwhile, you can get 1TB hard drives for $80 everyday (you could almost buy 50 of them for the price of your online service). I'd love to hear how you can twist the math around so badly that it looks like you're actually saving money! Ever considered a career in politics?
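For anyone who wants to check the arithmetic, here is a small sketch using the rates quoted above and assuming 1TB = 1000GB. Prices are kept in whole cents to avoid rounding noise. As far as I can tell, the $1800 storage figure matches twelve months of a full terabyte; keeping the terabyte full for all sixty months comes out higher, which only strengthens the point.

```python
# Rates as quoted in the post above, expressed in cents.
STORAGE_CENTS_PER_GB_MONTH = 15
TRANSFER_IN_CENTS_PER_GB = 10
TRANSFER_OUT_CENTS_PER_GB = 17


def s3_cost_dollars(stored_gb, months, uploaded_gb, downloaded_gb):
    """Total cost in dollars for storage plus transfer in and out."""
    cents = (
        stored_gb * months * STORAGE_CENTS_PER_GB_MONTH
        + uploaded_gb * TRANSFER_IN_CENTS_PER_GB
        + downloaded_gb * TRANSFER_OUT_CENTS_PER_GB
    )
    return cents / 100


# 1TB stored, uploaded 3 times (3000GB in), read 10 times (10000GB out).
one_year = s3_cost_dollars(1000, 12, 3000, 10000)    # 1800 + 300 + 1700
five_years = s3_cost_dollars(1000, 60, 3000, 10000)  # 9000 + 300 + 1700
```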
Once upon a time, the computer you wanted always cost (at least) $5,000.
This trend ended in the late 80's. All of a sudden, package system prices started trending seriously downwards, because due to Moore's law, computer speed started outrunning almost everything you'd want to run on it. Not true for certain specific apps, including graphics and games, but for office use it was perfectly fine.
I remember buying a 200 MB hard drive for $500 and thinking about what a great price it was.
Up until recently hard drives were one of the more expensive components left in a computer package. Now? Most are under $100. That's lower than tape backups used to be at their lowest prices. It's true, right now the best way to back up your hard drive is a second hard drive.
IMO the big question now is where that second hard drive will be. You can stick it in your computer and mirror your main drive in real time easily enough, but that means a virus or software issue will ruin both drives simultaneously. Better to sync them once a week? Perhaps.
Of course, this won't help you if there's a house fire. The fireproof hard drives are still darned expensive. Internet-based remote backup is great, if your broadband can handle it.
Cuneiform tablets work well for me. Don't store them in a flood zone, though.
Actually I read the summary and decided it was stupid. Sort of like, "I want to make a ham sandwich. Conventionally these contain bread and ham, but I'm an idiot so I want to make it from dog hair and epoxy resin".
Since when is tape unreliable? My DLT has an MTBF of 250,000 hours. I've used DLT, DDS, and Travan for years and I've seen far more HDD failures. I've seen plenty of tape drives fail, but not the tapes themselves. I trust my tapes far more than any spinning platter. Come to think of it, I trust my tapes more than any other backup I use (optical disc, HDD, and cloud). Once my station wagon full of tapes caught fire on the highway, but I blame that cheap-ass roach clip.
There are three kinds of data:
1. If you lose this data you will go to jail.
2. If you lose this data, your business will be impacted.
3. If you lose this data, you will have fewer options for entertainment.
#1 tends to be a megabyte or less.
#2 tends to be a few hundred megabytes of documents.
#3 tends to be terabytes.
My company has a PDF of every document that we've touched in the past decade (federal law requires this retention), and our entire business continuity backup fits easily on four LTO-4 tapes, plus one mostly empty tape that we rotate for offsite storage weekly. We've explored every backup system out there and this is by far the most cost-effective for us.
I don't understand why the OP claims "tape is unreliable", as I have not heard of a single instance of in-service failure of an LTO-4. As for it being expensive, it is, but before we went to tape we were using Firewire800 external drives, much more expensive than tape cartridges, and not as reliable as some people have been led to believe.
USB and FW external drives almost never fail as long as they are powered on. They fail in storage, which seemed pretty weird to me, since they should be able to sit on a warehouse shelf indefinitely. My low-sample unscientific data from experience says otherwise.
Since everybody is going from LTO-2/3 to LTO-4, you should be able to get LTO-3 transports pretty cheaply.
But my first advice is to identify the data in categories #1 and #2. You might realize that it's good practice in any case to store the important stuff separately, with its own priority. This is the hard part: identifying what's actually important. If you don't do this, then no matter what backup system you end up using, you're going to be burying the important stuff in the noise, introducing risk.
The OP also mentioned Drobo. I have a Drobo and I love it, but I must warn you that it's pretty slow, even with really fast drives. Don't expect to be able to copy a terabyte to it in less than 40 or 50 hours, even with firewire 800. This is the problem that drove us to tape, which is much faster than any filesystem we can feed it from.
-fb Everything not expressly forbidden is now mandatory.
Do I need a megabyte of backup capacity for every megabyte of storage? No, I decide what's important and how long it's important for.
Ghost Virtual Machine gives 15 gigs of Amazon.com data storage, and right now if you use the promotion code "launch" you get 10 gigs more as a bonus, for 25 gigs. If you want to give me a referral, my id is orion_blastar there, and each person you refer grants you 5 gigs more as a bonus.
Google Docs also has document storage but does not give as much as G.ho.st does. The Ghost Virtual Machine can access your Google Docs drive as well.
Here is a review of the top 5 online cloud storage sites so you can take your pick.
MyBloop offers unlimited free storage, but I am not 100% sure of that or their privacy policy.
Lifehacker talks about using your Yahoo Mail account for unlimited storage and also that Google's GMail almost offers the same service as well.
This is a bad idea. Quite apart from the ludicrous cost of an SSD, flash drives tend to fail all at once: bam, and all your data is gone. This is also why I do not use a USB key for backups.
On my system everything is dumped on a 2TB mirrored system (simply 2 x 2TB HDDs running Debian with software encryption + RAID and LVM) and periodically backed up to Blu-ray DLs in duplicate. At $10/disc from Japan (see eBay), two Verbatims back up 50GB in duplicate for $20.
Typically it takes 2-3 months to generate that amount, which means it's cost-effective. DVD-Rs (Taiyo Yudens) fill the gap if there is not enough data to justify a bunch of Blu-rays.
Really? I don't think you've looked at this very carefully... personally I use Mozy. It's a couple bucks a month, the initial upload took a week or so, but it was all backgrounded and I never even noticed (yes, you can turn your computer off, etc.). Daily incremental backups take just a few seconds. Retrieval is via downloading, if you just want a few files, or for some money ($50? I think?) they'll overnight you a couple of DVDs with your whole backup on them. So, it's cheap, requires absolutely no thinking on my part, is fire/meteor proof, and has unlimited storage. The choice was obvious, from my point of view.
We must lay out the kinds of failures and goals of a backup to determine how best to back up.
1. We would like to protect against mechanical drive failure. This can be done with a RAID.
1.5. We may also want to protect against the failure of other components of the computer. I recently had a computer die because its motherboard died, and it took about two weeks to get a new computer, and the new computer was a significant upgrade so it had SATA instead of IDE. In the meantime, I needed my data on other systems, and when the new computer came, I needed to borrow a USB-IDE bridge to recover some stuff that I wasn't backing up.
2. We would like to protect against accidental deletion of files, file corruption, or edits to a file that we have now reconsidered. This can be done with snapshotting. In source code, reconsidering an edit to a file is fairly common, and is the reason why most programming projects use revision control systems. Other options like nilfs or ZFS snapshots can also fill this goal. This goal is accomplished more easily if the backups are automatic and the backup device is live on the system.
Depending on your needs, this goal may be counterbalanced by a need to not retain the history of files for legal or other reasons, and this should inform your choice of backup strategy.
3. We would like to protect against filesystem corruption, whether by an OS bug, or by accidentally doing cat /dev/random > /dev/hda. This can be done by having an extra drive of some sort that isn't normally hooked up to the computer. Tape drives, CDs, and DVDs have traditionally fulfilled this purpose, and this is where the use of additional hard drives is being suggested. Remote backups via rsync can also accomplish this. For this I use git.
4. We would like to protect against natural disasters. For someone living in New Orleans, it would be nice to have a backup somewhere outside the path of Hurricane Katrina. Remote backups may be pretty much the only way to accomplish this, unless you're a frequent traveler and can hand-deliver backup media to remote locations.
5. In addition to any of the above, the code you use to create said backup may be buggy, or may become buggy or misconfigured over time. Checking the integrity and restorability of your backups after creating them, and keeping several (independent) previous versions of a backup, may help here.
You may not be concerned with the various modes of failure described here occurring simultaneously. For example, it may be unlikely that you need to deal with file system corruption at the same time that you regret one of the edits you made on your file. In that case, your offline backup device doesn't need to hold all of your snapshots.
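For point 5, one simple way to check integrity is to compare checksums of the originals against the backup copy. The sketch below is one possible approach, assuming the backup is a plain directory tree mirroring the source; it does not apply to archive formats or deduplicated stores, and a real restore test is still worth doing now and then.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, backup_dir):
    """Return relative paths that are missing from the backup or differ from it."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    bad = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            bad.append(str(rel))
    return bad
```

An empty list from `find_mismatches` means every source file has an identical copy in the backup. It says nothing about files that exist only in the backup.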
I hate to be that bitter old pessimist, but this has been debated to death and back here on Slashdot many times over. I swear, it should be in the FAQ by now.
All of the times this question has come up (feels like at least once a month), there have been many very good suggestions. Why should we rehash them for the nth time?
On the other hand, if you thought you could ask on /. you probably match this description...
If you had to ask on /., you already don't match the description.
"What about the other 95%?" Over the years I became an old and bitter sysadm... you know what ? They just need to do what the 5% did: Put their asses in a chair and Read The Fucking Manual... and read again, and again until they understand the subject.
That's not what they did.
First, they were born/nurtured in such a way to have above average technical aptitude.
Second, they were interested enough in how computers work to tinker and learn and gain a broad base of knowledge about their computer and OS.
Only then did they "put their asses in a chair and Read The Fucking Manual... and read again, and again until they understand the subject."
If you expect the 95% who did not go through the first two parts to skip right over into the third part, you're in dire need of taking your ass out of the chair and meeting some Real Fucking People.
No, I'm not user friendly, I do not need to be... people are asking me for help anyway.
Do what I do. Tell them, "yes, there's a way but it's rather complex. Do you want me to explain it?" The answer is almost always "no". Because they really don't want you to explain it, they want you to do it. If they say yes, you'll probably be asked to stop in less than 60 seconds.
And honestly I'm not all that worried about backing up with modern operating systems.
Modern operating systems don't protect you from:
Best thing at the moment for home backup is to mount an encrypted external hard drive and copy to it, then take it off-site. If you think that sounds over the top, then I predict one day you'll be sitting at your terminal saying "aw, shit".
"I want to make a ham sandwich. Conventionally these contain bread and ham, but I'm an idiot so I want to make it from dog hair and epoxy resin".
Leela: And that sandwich you're eating is made of old discarded sandwiches. Nothing just gets thrown away.
Fry: The future is disgusting!