Affordable Home Backups for 10-100G Systems?
MichaelJames asks: "Ok, I have my MP3's streaming, all our digital pictures up, and a file server running on one machine in the basement. What would be the best way to do simple backups of the system and data? Get a tape drive? Get a CDRW or DVDRW to back up the MP3s and pics, but use the old Zip drive for the file server data?" With drives in the 10-20 gig range only getting smaller and less expensive, what are we to do for backups, which have yet to scale well in the same range? For home systems with up to 100G of storage, what do you use to back up that much data, with a solution that's affordable to the average computer user? Have DVD writers become cheap enough for serious consideration as a backup medium?
Given that a 100G hard drive is cheaper than any removable media solution, why not just buy another hard drive and install it in a removable (not hot-swappable, just removable) rack?
Racks are $20 at my local Fry's, and inserts for other hard drives are $10.
Just get a lot (A LOT) of 1.44MB floppy disks...
----
WWJD...For a Klondike Bar?
Hey, I've got all these 5 1/4" floppy disks sitting in boxes in a back closet. I bet if I added them all up, they would amount to close to 100 GB.
CD-R/CD-RW are too slow and too small; plan on spending a day or so swapping disks. You can always mirror to another hard drive, get a basic RAID card or just use a Ghost-like program to do manual backups. But tape is still cheaper per megabyte and more reliable. Sure, you can damage a tape, but it's harder to do than with a hard drive. SCSI tape drives are more expensive than another drive, but fast enough, and allow you to keep multiple versions or copies of your backup. Try that with hard drives and you need arrays. Tape starts looking REAL cheap then.
Ignorance is the root of all evil.
I have a 100BaseT network, and a server computer that resides in a different room from the rest of my systems. I rotate backups using those aluminum drive caddies. A pair of 60G drives turned out to be MUCH cheaper than the equivalent size tape backup. Every day, I rotate out the drive at the end of the day, and swap with the other. The spare I keep in a fireproof safe. Just tarball the appropriate directories. Done. Poof. Much faster than the average DDS3 tape drive too. Runs at night and I don't even notice it.
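For anyone wanting to try the "just tarball the appropriate directories" step, here is a minimal sketch in Python. It is not the poster's actual script. The function name, the archive naming scheme, and the idea that `dest_dir` is wherever the caddy drive is mounted are all assumptions made for illustration.

```python
import tarfile
import time
from pathlib import Path


def make_tarball(directories, dest_dir, stamp=None):
    """Write one gzip-compressed tarball of `directories` into `dest_dir`.

    `dest_dir` stands in for the mount point of the removable drive
    (hypothetical; the post does not name one).
    """
    stamp = stamp or time.strftime("%Y%m%d")
    dest = Path(dest_dir) / f"backup-{stamp}.tar.gz"
    with tarfile.open(str(dest), "w:gz") as tar:
        for directory in directories:
            directory = Path(directory)
            # Store each tree under its own base name (e.g. home/...)
            # instead of under its absolute path.
            tar.add(str(directory), arcname=directory.name)
    return dest
```

Something like `make_tarball(["/home", "/etc"], "/mnt/caddy")` run from a nightly cron job would roughly match the routine described. Two directories with the same base name would collide inside the archive, so a real script would want a smarter `arcname`.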
I have to say that this is coming from someone with a total of around 280gig at the house, but...
Out of 100gig, how much do you really NEED to back up?
The vast majority of my space is taken up by MP3s (where I converted my CD collection), but that could easily be replaced. To tell you the truth, of the things that I would need (documents, pictures, etc), I could easily fit it all onto a CDR. Well, maybe two. (I take lots of pictures)
Basically it boils down to, do you really need to shell out the money for that extra drive?
:^)
'Life is like a spoonful of Drain-O, it feels good on the way down but leaves you feeling hollow inside'
Onstream 30 or 50 GB ADR Tape backup.
Pros:
Can be found for under $100
Linux Support!
Cons:
Tapes are expensive
I once worked at a place where we had a lightning storm. Within a week, about half of the hard drives had failed, out of about a dozen. RAID won't save you then. And how fast can you get replacement hard drives installed, anyway?
All the affected machines were plugged into good UPSes, too.
Moral of the story: Always use offline backups.
Why are you bothering to back up your data?
That may seem like a stupid question, but you need to consider the reasons you want to have a backup before you settle on a method.
Are you afraid of your drive failing? If so, then using a RAID solution should cover you.
Are you afraid of losing your whole system (perhaps due to lightning or theft)? If so, then your backup must be kept physically isolated from your system.
Are you afraid of accidentally deleting files (such as `rm -rf /` or a virus)? If so, then a RAID solution is useless.
Are you afraid of having your system down for an hour or two while you replace a drive? If so, then regardless of other issues, you need a RAID setup.
Do you want to use your MP3s with some other device? If so, you probably want CD-R copies.
Of course, there are other considerations that I haven't mentioned or thought of.
Clearly the answer, for easy backups of a 100G drive, is 21 iPods.
sulli
RTFJ.
Agreed that just backing up to another HD provides the best overall method for creating a complete backup of 100GB of disk storage.
However, I suspect that most users don't change a huge percentage of their HD's content on a daily basis, unless they are routinely d/l'ing or ripping MP3s and MPGs (and I note that when I do generate that kind of traffic, it is usually because I am making a compilation CD, and while this does generate a few GB of "new" files on my HD that day, that data doesn't need to be backed up because I've got the original CDs anyway).
As a result, it seems to me that a reasonable solution is to create a "baseline" backup, say to a CD or DVD, at system install time, when there is (relatively) little on the disk, and then each day (or week, depending on needs), do an incremental backup of changed data only to another CD.
This approach is obviously quite inefficient if you have a complete HD failure, in that you have to recreate the drive by starting with the first backup CD and then restoring EACH ONE thereafter until the final CD brings the disk to its last backed-up state. But for the more common problem of losing or corrupting an individual file, which is more likely to happen to a recently modified file than to one modified long ago, you are likely to be able to restore a last good version from within only a few CDs of the most recent incremental backup.
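The "changed data only" step of the baseline-plus-incrementals scheme above could be sketched roughly like this in Python. This is only an illustration: the function name is made up, and it assumes that file modification times are a good enough signal of "changed since the last backup", which is not true for files restored or copied with old timestamps.

```python
import os


def changed_since(root, last_backup_time):
    """Return files under `root` modified after `last_backup_time` (epoch seconds)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) > last_backup_time:
                    changed.append(path)
            except OSError:
                # File vanished or is unreadable; skip it rather than abort the run.
                continue
    return sorted(changed)
```

The list it returns is what would go onto that day's (or week's) incremental CD. The time of each run has to be recorded somewhere so the next run knows where to start.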
A lot of people have mentioned that disk to disk backup seems to be the best way to go.
I agree.
What hasn't been mentioned is rsync, which makes disk to (local or remote) disk backups fast and easy.
It is trivial to set up a second disk that is a "stale" mirror of your primary disk(s) that backs up nightly, and will boot off a floppy. This captures some of the advantage of RAID (quick recovery) while being an actual backup, not just fault tolerance.
Rsync can use ssh as a transport, so you can securely back up remote disks as well.
-Peter
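A rough sketch of the nightly rsync mirror Peter describes, wrapped in Python so it can be called from a cron job. The function names, paths, and host name are hypothetical. The rsync options used (`-a`, `--delete`, `-e ssh`) are standard ones, but whether `--delete` is wanted depends on how stale you want the mirror to stay.

```python
import subprocess


def rsync_mirror_command(source, dest, remote_host=None, delete=True):
    """Build the argv for a one-way rsync mirror of `source` into `dest`."""
    cmd = ["rsync", "-a"]        # -a: archive mode (recursive, keeps perms, times, links)
    if delete:
        cmd.append("--delete")   # drop files from the mirror that are gone at the source
    if remote_host:
        cmd += ["-e", "ssh"]     # use ssh as the transport
        source = f"{remote_host}:{source}"
    return cmd + [source, dest]


def run_mirror(source, dest, **kwargs):
    """Run the mirror; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(rsync_mirror_command(source, dest, **kwargs), check=True)
```

For example, `run_mirror("/home/", "/mnt/mirror/home/")` for a local second disk, or `run_mirror("/home/", "/backup/home/", remote_host="fileserver")` to pull from another machine. Note that with `--delete`, an accidental deletion on the primary disk reaches the mirror at the next run, so the window for noticing a mistake is only as long as the backup interval.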
RAID offers good protection for some things: hardware failure (i.e., HD crash) and uptime. Those aren't the only woes, however... You can lose data in a lot of ways:
Disaster (fire, quake, flood, nuff said)
Hardware failure (disk, controller, ...)
OS failure (FS corruption, ...)
Application failure (User space applications misbehaving, viruses, ...)
User failure (accidental deletes, experimental children - trust me on this one ;-)
RAID will protect you from the second, but will happily add nothing in case of any of the other failures. Backing up to another medium is a necessity.
Adding an extra disk (or two, or three), and some tar/cpio cronjobs will add basic protection. (No disaster recovery for you, unless it's off-site :)
Removable harddrives (firewire, frames, ...) are a plus, but more cumbersome.
Tape is considered a more 'trustworthy' backup medium because the mechanism and data storage are separated (ie: tape drive / tape), while in a HD it's in one single package, and it's not as easy to replace the logic board/stepper motor if this flunks. With tape it's easier: just get a new tapedrive.
Anyhow: don't rely on RAID to save your data - it won't.
Okay... I'll do the stupid things first, then you shy people follow.
[Zappa]
Is there something you are trying to keep secure?
Why do you want to keep your data safe?
Is an encryption device utilized with a harddrive or an application?
Where did you obtain all of your software?
Are you looking to copy to a device that has the ability to encrypt files?
If you are looking for a portable back-up device, why do you need it to be portable?
Do you travel extensively?
When you do travel, do you primarily travel by air?
Do you have a digital camera?
Do you have a mobile phone?
Have you ever encrypted an email message?
Have you ever deleted an email message?
If so, have you had data rewrite over the sector(s) containing such message?
What was the title of the last book you purchased?
"There ought to be limits to freedom"
I'm sharing my cable modem via 802.11 with all the neighbors and since I am the local "neighborhood helpdesk technician", they often come to me for advice. Recently, one of them wanted to know how to go about backing things up properly. It dawned on me that hard drive space is abundant and most people are buying much more than they need (the person in question has an 80 gig at about 20% capacity). So I worked out a deal so that everyone is backing up to each other's PC at night on a weekly basis. The 802.11b connection keeps drive thrashing to a minimum yet provides enough speed for complete backup on an overnight basis.
I should start charging for these ideas... Can't wait for the proliferation of freenet!
Life is the leading cause of death in America.
There are two ways you can go relatively cheaply, and IMHO a far better solution than CD-R or CD-RW.
Pick up a DLT2000XT (15gb native) off ebay for about $200. Tapes are dirt cheap, about $5/ea and the media is extremely durable, nearly indestructible.
Pick up a DLT1 (40gb native) off ebay, about $500. Tapes are moderately expensive at around $20/ea, but again the media is extremely durable.
DLT is industrial strength backup, the drives are built like tanks and the tapes can take incredible abuse.
It's all standard SCSI and works great with Linux, no problems whatsoever.
I considered buying hard drives for backups, but they are far too fragile for long-term backup and off-site storage. Most drives aren't designed to be spun up and down lots of times either.
The last thing you need is for your backup hard disk to go splat when you're trying to power it up to restore your main system after a data loss.
With DLT, this isn't likely to happen.
Bringing up the system is less of a problem with newer OSes, since you can usually, at minimum, get to your data. Configuring the database, webserver, and firewalling depends on how good you are with the OS. However, when I worked at a former company there was no real plan to get a working system back in place. We were using Novell with Arcserve -- unfortunately, you couldn't get to the data without a working system.
Next I usually try to segregate rapidly changing stuff from things that are pretty much static. E.g., my mp3 collection is relatively static. I occasionally buy a fresh CD and rip it, but I'm pretty much satisfied with my collection as it is. I put these on CDROM. It takes a while to create them, but it's cheap and safe. If you want to keep everything up to date, you can run a script to save only files not included on the CDROM.
Finally, I back up my constantly changing stuff such as CVS, MySQL database, etc. to 4mm tape. It's cheap (hardware and tape) and most drives are pretty well supported.
I'm classifiable as an audio addict, having taken my entire personal
collection of CD's and ripped them to MP3's at 320 kbit/s, and wanted to
have them stored in a central place, accessible from any machine in my
home. Currently this collection is at approximately 620 full CD's of
music, and I'm pushing right at, or just above the 80 gigabyte limit.
Now when you factor in personal files, financial records, games,
downloaded material, installation software you don't want to lose,
etc...etc... Well, see for yourself. Here's my space breakdown for the
partitions on my main file server Fumus (Smoke, in Latin):
fumus:/pub/mp3 # df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda3             3.0G  2.1G  804M  72% /
/dev/hda1             129M  6.8M  115M   6% /boot
/dev/hda5             9.8G  1.8M  9.3G   1% /home
/dev/hda6              20G   13G  6.3G  67% /pchome
/dev/hda8              40G   22G   17G  57% /pub
/dev/hdb1              75G   38G   33G  53% /pub/mp3
/dev/hda7             1.9G   20k  1.8G   1% /scratch
/dev/hdc1              74G   34G   40G  46% /pub/mp3_2
/dev/hdd1              74G   36G   37G  49% /pub/software
So, here's what I looked at:
Tape: For the size I'd need: Way WAY too expensive. When I brought
the media down into the range I'd afford, I'd be swapping tapes all week
to get a backup done. Not time effective.
CD-R: Faster, yes, but at 650 megabytes per media, same problem as
tape, only you've traded magnetic for optical.
Extra hard drives in the same machine: Originally, this is exactly what
I had done with a single file server running Reiser file systems in the
more experimental days. I got the scare (and lesson) of my life when
Reiser went a bit nuts, and started corrupting some of my data. I only
lost about one percent, but I vowed, never never NEVER again would I
backup data on a critical machine on live media in the same machine.
Okay, so here's what I finally DID select as my solution: A second
machine called Ignis (Fire in Latin) that uses the absolutely identical
configuration, right down to the types and number of drives, partition
sizes, everything. They both connect into my 100Mb network switch, and
Ignis rsync's from Fumus every hour on the hour thanks to scripts in
/etc/cron.hourly.
In fact, here's Ignis' /etc/cron.hourly/rsync_with_fumus script:
rsync -arul --one-file-system --quiet fumus:/pub/mp3_2 /pub
rsync -arul --one-file-system --quiet fumus:/pub/mp3 /pub
rsync -azrul --one-file-system --quiet --delete --force fumus:/pub/software /pub
rsync -azrul --one-file-system --quiet --delete --force fumus:/pub /
rsync -azrul --one-file-system --quiet --delete --force fumus:/pchome /
Is this a bit extreme? Yes. But... if, gods forbid, Fumus really does
let out its magic smoke, or Ignis does catch on fire, and the physical
media were actually damaged, hopefully the damage would be limited to
*one* case, and wouldn't end up taking both machines out. Then I really
would be crying the blues.
Oh yes, and each machine is on its own 900VA UPS. I'm not playing
THAT game. :-)
Not to distro-bait, but Debian in particular shines here because apt makes it so damn easy to bring a system back to the state you wanted. For myself I have created a meta-package (.deb) which does nothing but depend on the applications I want installed on every desktop system: galeon, gnucash, xchat, gaim, xmms, vim-gtk, and a handful of others. Then I back up my meta-package, all of 10k including a few shell scripts I wrote for myself. Install my meta-package on a new system, and voilà, apt fetches and installs every app that I need to continue working, dependencies included.
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
FYI, if there's anyone out there in retail sales working for a place that sells HP computers, you can log on to HP Info Lab and you can get a $400 rebate for the HP 100i DVD writer. There's also a $50 rebate available, which you can use in conjunction with the $400 rebate. The $50 rebate is available to the general public.
This brings the price of a DVD burner down to $150, since the drive is 600 coon skins before rebate. At that price, and if the PlayStation 2 drops in price this Christmas, I sense lots of burned games in my future...
sig?
RAID is to provide either additional speed and/or hotswappable capability. RAID really stinks as a backup, since RAID doesn't care when some program deletes most of the hard drive, when some user removes too many files, or when the OS barfs. Sure, RAID will save your DATA if one HDD fails, as long as whatever caused it to fail didn't affect the other drive, but for the reasons already listed, this doesn't mean RAID is a valid method of backup.
A HDD in an external enclosure, however, could be considered a valid backup. For true redundancy, you'd better have two drives you swap, and you'd better be doing surface tests regularly. A drive, properly treated, should last many, many years. Also, you could combine a drive with monthly or quad-yearly backups to CD-R; just make sure you do your research on the inks used in CD-R disks, as some don't last as long as others.
Just my $.02
That worked really well for backing up our 80MB drives onto stacks of 1.44MB floppies, since you would really only need to insert about 5-10 floppies during your weekly backup, just to get the files that changed.
So why not just do incremental backups onto CD-Rs? Even with 100GB of archives, most of those are static. You probably won't need to use more than one CDR per week (maybe two) to track the changes. It's cheap, relatively painless if you've got the right software (and it wouldn't be hard to throw together incremental backup/recovery scripts in Perl if you're into that sort of thing), and you've probably already got a CD burner.
If less than 650MB of files change in a week, the rest of the CDR can be filled up with files that were on earlier CDRs (this way your backup set can remain finite and you can throw out the earlier CDRs as they become obsolete. Or if you keep them all, you can reconstruct that state of your hard drive at *any* time, not just at the last backup.) This seems ideal to me--why is everyone else talking about expensive solutions like tape drives, DVD-RWs, and second hard drives?
I have a positive modifier on Troll. When I mod someone Troll their karma should go UP!
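The "fill the rest of the CDR with files from earlier CDRs" idea in the comment above could be planned with something like the following. This is a simplified sketch, not a tested backup tool: the function name and the (path, size) tuple format are invented for illustration, and 650 MB is treated as the nominal capacity without allowing for filesystem overhead on the disc.

```python
CDR_BYTES = 650 * 1024 * 1024  # nominal 650 MB disc; real usable capacity varies


def plan_disc(changed, older, capacity=CDR_BYTES):
    """Pick files for one disc: changed files first, then older ones as filler.

    `changed` and `older` are lists of (path, size_in_bytes) tuples.
    Returns (chosen_paths, bytes_used). Files that do not fit are skipped,
    so a caller would need a second disc for any leftovers.
    """
    chosen, used = [], 0
    for path, size in list(changed) + list(older):
        if used + size <= capacity:
            chosen.append(path)
            used += size
    return chosen, used
```

If `older` is ordered so that files from the oldest surviving disc come first, each new disc gradually absorbs the contents of the oldest one, which is what lets the early discs be thrown out.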
I have to wonder (first of all) why in the heck anyone would need to have 100GB of disk space on a home system. But then I have five systems networked together and have more storage than I would have thought sane a few years ago, though I have a bit of a ways to go before I run into the poster's backup problem. It wasn't too long ago that, if you could afford 100GB, you could probably afford a SCSI array controller that would let you do a lot of RAID, hot swapping, automatic drive replacement, etc. With today's cheap disk prices you don't have to be wealthy to have an ocean of disk space. (I can remember the days when we thought having 900MB on a MicroVAX II was extravagant.)
You could always do it the traditional way and get some tape drives. Unfortunately, they're much more expensive than you might think when you have to backup that much disk space. You certainly wouldn't want to go cheap and be feeding 90m DAT cartridges into a drive all night (it'll start feeling like you're backing up to floppies before long). A good high capacity tape drive can get, what, 20GB onto a single cartridge? Not bad. And I think that at this point in time, tape is more cost effective than DVD-R. (Something tells me that the MPAA, and maybe the RIAA, will try to keep it that way too.)
Mirroring disks can be helpful. Hard disks are getting cheaper and cheaper. Heck, it's almost scary how much disk space you get in a typical PC sold at Best Buy nowadays (and without a backup device; it's almost criminal). If you're running mirrored disks you'll forestall the inevitable disk crash that takes all your data with it. Question for the Linux folks using the `md' driver: Does it allow adding a third member to a mirrorset? And, if so, can it be done while the system is `live'? (The third member gets removed and taken offsite in case there's a disaster.)
One final thought: The poster wasn't actually running a 100GB filesystem were they? I'm thinking that a power glitch could cause a world record to be set for the longest fsck-on-reboot run. Plus I'd think that backing up such a beast would be a challenge. I tend to keep my filesystem sizes no larger than what I can fit on a single tape cartridge... just to make life simple. (I'm used to having to pipe `df' commands through `more' at work so I don't mind lots of mount points. :-) )
CUR ALLOC 20195.....5804M
So, is this Fibonacci backup method?
Babies are cute because they have to be.
Since when is a harddrive not a semipermanent media that can be easily taken off site? I'm surprised this comment got modded up so high. And since when are tapes such a reliable media compared to a hard disk? So burn-in the drive for a few days before using it for backups. And use a S.M.A.R.T. utility to diagnose the drive before each backup to reduce the chance that something is getting ready to fail.
Your best option is to put all data on a 2-disk mirrored RAID and use another drive as a removable for an off-site or fire-safe backup. The probability of 3 hard disks failing simultaneously, one not in use, is so incredibly small it's laughable. And for that non-zero chance, if it happens, you can pay to have the spindle of one of the failed drives transferred to a new drive in a clean room.
True -- but given the article's "affordable", "home", "10-100GB" parameters, I'd be quite happy regarding hard drives as a real solution.
Don't expect one hard drive to last you 10 years, because 10 years from now, systems with 40-pin IDE won't exist. (And likewise, neither will readers for the tapes you purchase today. When was the last time you saw a DC600 cartridge tape drive available?)
If you're talking longterm storage, leave your "backup" drive somewhere secure, and expect to replace it every 3-5 years. (That'll probably be a 500G serial IDE drive 5 years from now, a terabyte-range solid-state device 10 years from now, and a petabyte-range holocube 20 years from now.)
This may not be the best way to do it, but it works for me...
I have a "backup" hard drive in my server. This drive is always unmounted so that there is no chance of filesystem corruption from the operating system.
I just use a crontab to run a simple script that mounts the drive and copies the specified backup files to it, then unmounts it. The same method, slightly modified, could be used to back up this same backup disk to another location on the network at regular intervals.
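A sketch of what such a mount, copy, unmount script might look like, written in Python. The poster's own script isn't shown, so everything here is an assumption: the function name, the device and mount point in the usage note, and the use of `cp -a` for the copy. The `run` parameter exists only so the logic can be exercised without root privileges.

```python
import subprocess


def backup_to_unmounted_disk(device, mountpoint, sources, run=subprocess.run):
    """Mount the backup disk, copy `sources` onto it, and always unmount again."""
    run(["mount", device, mountpoint], check=True)
    try:
        for src in sources:
            # cp -a preserves ownership, permissions, timestamps and symlinks
            run(["cp", "-a", src, mountpoint], check=True)
    finally:
        # Unmount even if a copy failed, so the disk never stays mounted
        run(["umount", mountpoint], check=True)
```

Called from root's crontab as something like `backup_to_unmounted_disk("/dev/hdc1", "/mnt/backup", ["/home", "/etc"])`. The try/finally is the important part: the whole point of the scheme is that the backup disk spends almost all of its time unmounted.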
Being one of the maintainers of Amanda (www.amanda.org), I'd always been of the opinion that tape backups were the only way to do backups seriously.
The recent explosion in disk capacities and decrease in prices got me to rethink this, just when it came the time for me to set up a home office. When I compared the cost of a reasonably-good tape drive and a number of tapes large enough for me to get at least a month of backups in rotation, and computed how many 60GB disks I could buy with that money, the solution was clear.
I ended up setting up 3 machines with 4x60GB each. They're all on RAID 5, such that if any single disk fails, the machine keeps running (actually, I have /boot on RAID 1 over the 4 disks and / on RAID 1 over 2 of the disks and an alternate root to test upgrades over the other 2, but you get the point). This got me blazingly fast disk access, that tapes would never help me get :-)
I get all my backup-worthy data rsynced over to the other machines daily or so. I plan to start playing with InterMezzo soon, so that I don't have to remember to run these backups, and so that I don't run these backups in the wrong direction.
But that's not all. With the mind-boggling amount of disk space I could afford, I could (actually, I will, but you get the idea) set up Amanda to backup interesting portions of my home directory to disk, and also replicate this to at least another of my local machines. Such backups can use software compression, such that they don't take as much space as live data. Also, I intend to use another form of compression: instead of backing up CVS trees (I've got loads of check outs), I'm going to back up only local changes to files, so that, in case of disaster, I can still download the original CVS tree and re-apply patches. But this is still a plan, not something I've got running.
Finally, I've got yet another disk on a remote site, to which I rsync not only the interesting portions of my data, but also my backups. I could convince someone else to run this remote backup site for me by offering this person the speed up of RAID 0 over two disks (one of those mine). As for keeping the secrecy of the data on this remote backup site, I'd just get the backup files encrypted, no big deal.
I can strongly recommend this solution: I got pretty much as much data safety as could be expected from a tape-based backup, without any of the hassle of having to switch tapes and moving them off-site and back on-site, and with the bonus of very fast access to local data, unlikely downtime and fast recovery except in case of total disaster (i.e., having all of my local machines fail, in which case I'd have to either download my backups from the remote site over the net or, more likely, take a replacement machine over to the remote backup site and copy files over a fast local network connection, or from disk to disk).
As for getting 4 IDE disks into a single machine, don't even think of using only the 2 IDE controllers that come on most motherboards these days (for RAID set-ups, you really want one IDE disk per controller). There are a few good motherboards that come with 4 IDE controllers, so that you can even have a CD-ROM and/or a CD-RW in addition to the 4 disks. If you can't find such a motherboard that suits your needs, you can always get one of those PCI cards that adds 2 IDE controllers to your machine.
As for the problem of fitting so many disks in a standard ATX chassis, it can be done. Cooling may be a problem, but a good cooler has been good enough.
All in all, I'm very happy with this arrangement. It was not cheap, but it was not as expensive as a tape-based solution, and it's far more flexible, way faster and it doesn't require any baby-sitting after you get it going. And I can keep far more backup history than I thought was going to be possible.