Happy World Backup Day
An anonymous reader writes "Easter isn't the only thing some people are celebrating today. Today is also World Backup Day. What steps have you taken to be able to resurrect your data, instead of having it go to eternal oblivion?"
World Restore Day?
Wait a sec. I should think it would be "Restore" day. At least for those of the various Christian persuasions.
Faster! Faster! Faster would be better!
I put all my data in a cave and sealed the entrance with a big rock -- but three days later it was all gone.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
The First Three Rules of Computing
1 - Backup
2 - Backup
3 - See Rules 1 & 2
Of course, I have no idea how to back up a world. What kind of media would you use, cosmic string recorders?
I've committed one and zero to memory.
I'll be able to regenerate all the data using just those two numbers.
Sleep your way to a whiter smile...date a dentist!
Me too, however, recalling the order of them is something I'm still working on.
Table-ized A.I.
I managed to go 16 years in the IT world, first as a sys admin and now up through an awesome mid-level management position, without any serious data management scares. (And by 'awesome', I mean I work for demoralizing leadership and I've hit a glass ceiling which will force me to go find another company to work for if I want any shot at career advancement.) I've always made sure there's many, many layers of redundancy and good processes in place.
That was until three weeks ago.
We use Microsoft DFS to sync data between two sites. Because of some other things going on, we had to turn DFS off for 3 weeks. We thought we had everyone transitioned to using the "master" file repository, the one that gets backed up every night, etc, etc. The day we turned DFS back on, all hell broke loose.
Oh - and this is fairly important stuff: 10 years worth of CAD, design, and legal paperwork. It's a few terabytes worth. For our medium-size company, this is basically everything that we hold near and dear.
The first thing that happened is DFS completely puked and completely trashed BOTH filesystems. Fantastic, Microsoft - what a wonderful piece of shit DFS is. Fairly quickly we had to face some data integrity issues. First, we discovered apparently there was a fella at the remote site that was using the copy of files there. Great.. through a fairly manual process we were able to retrieve most of his changes to the dataset. Next, we fairly quickly gave up on trying to fix the DFS - on the advice of Microsoft it seemed to be fairly hopeless.
This is where shit gets real.
Our head sys admin had been troubleshooting an issue where a drive in a RAID'ed NAS backup device had failed. All the other backups had been shifted to other NAS devices, but that backup was so large that it apparently had just been failing. While looking for that, we also discovered the quarterly backup from December had failed (that's the point where I wanted to put on my manager hat and go rip someone a new one, but decided that probably wouldn't be the most productive thing at the moment and could save that little teachable moment asskicking until after we were out of the woods.) Now, the sys admin hadn't been completely foolish: before turning DFS back on he had run some full backups using a different NAS device.
In a f*cking brilliant stroke of disastrous luck, when we went to perform the recovery we discovered that RAID array on the backup NAS device also had corruption.
Now, how bad the corruption was and what exactly that meant remained to be seen. The backups had completed without error; it was the NAS filesystem itself that was throwing the errors. The NAS was still running and our backup software seemed to recognize the backup catalogs on it. Ok, other than what seemed to be one potentially corrupt backup, it was seeming like the next best case scenario was a quarterly backup from September, and I was also staring at a complete set of disks from 2010, dreading the thought of bringing them back online. Well, with nothing to do other than try a restore, we pressed the button.
That's when I went home mid-morning, chainsmoked four cigarettes on my porch and wondered what would happen if everything went south. In other words, I was contemplating my next job.
Lo and behold, the restore worked. We had to merge all kinds of things back together to get a complete copy reassembled, then we still had to get DFS working (which took four days of syncing over the WAN.) When it was all said and done, it looked like there were just two files from one set of changes that we couldn't recover.
I think I'll go double check on the backup jobs now.
----- obSig
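For anyone else about to go double check their backup jobs: a failed job that nobody notices, like the December quarterly in the story above, is the sort of thing a small script can flag. What follows is only a minimal sketch in Python. The function name, the 26-hour threshold and the example path are all made up for illustration and are not from anyone's actual setup.

```python
import os
import time


def check_backup(path, max_age_hours=26, min_bytes=1, now=None):
    """Return a list of problems found with the backup file at `path`.

    An empty list means the file exists, is not empty, and was written
    recently. All thresholds here are illustrative assumptions.
    """
    now = time.time() if now is None else now
    if not os.path.isfile(path):
        return ["missing: " + path]

    problems = []
    info = os.stat(path)
    if info.st_size < min_bytes:
        problems.append("too small: %d bytes" % info.st_size)

    # Age is measured from the file's last modification time.
    age_hours = (now - info.st_mtime) / 3600.0
    if age_hours > max_age_hours:
        problems.append("stale: last written %.1f hours ago" % age_hours)
    return problems


if __name__ == "__main__":
    # Hypothetical path; point this at wherever your backup job writes.
    for problem in check_backup("/mnt/nas/backups/nightly.bak"):
        print("WARNING:", problem)
```

This only tells you the job produced a recent, non-empty file. It says nothing about whether the file can actually be restored, which, as the story shows, you only find out by trying a restore.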
Amazon Glacier has really changed my backup strategy since this time last year - I now push all my own generated content (i.e. pictures, documents, things I could never get back if I lost everything) up to Glacier using the free Windows client, Fast Glacier. In February I was charged $0.13 by Amazon for storing ~8 GB of data. I tend to push new content up as and when I create it (for example, after I process holiday snaps, or get back from a day out).
Day to day file changes are now handled by Windows 8's File History feature where my changes are pushed to a small NAS (Dlink DNS-320) in my shed (technically off site?) over a Homeplug AV ethernet link. For added security I use the legacy Windows Backup application (still present in Windows 8) to create roughly monthly snapshots of the system, which I store on a 320 GB external HDD. This drive is one of two which go back and forth to my parents' house each time I go and visit. These disks are encrypted using Microsoft Bitlocker drive encryption.
I should get around to properly encrypting my NAS in the shed, I've been looking at encfs.
No need to do anything. When disaster strikes just wait three days and it simply restores itself. Shortly afterwards the data ascends into The Cloud and becomes available forever and ever. Hallelujah!
Or user stupidity erases the vital data? Or malware starts corrupting your files? Or a disaster destroys the whole computer?
RAID is a great solution to hard drive failure, but it doesn't cover all of the other things that might go wrong. For that, you need a proper off-line backup that can protect you against user or OS problems, ideally one that's located far enough away to recover your data in the event of a disaster. RAID is best in addition to, not as a replacement for, true backups.
There's no point in questioning authority if you aren't going to listen to the answers.
Automated incremental backup of the headless servers at home, every two days (and I check the backup logs regularly). The backup disks are cycled every 4 weeks: the existing set goes to an insulated box in the garage (a separate heated building), while the previous disks come in and start with a full backup. Our 4 workstations at home all get backed up to local USB disks, but these are merely for convenience - important files are always kept on the servers, where they belong.
You don't belong on this planet.
Seriously, I run RAID, cross-machine mirroring, then do daily backups, with the logs emailed to me each morning. Periodic external media copies to DVD and USB devices. In my case, I have incentive, though. I used to work for a big-name backup software company and knew of design flaws that meant that a certain percentage of backups would write out defective data. And got burned in later years when I was compelled to use the product for my later employer. Because the RAID arrays would blow a disk the minute I'd leave on vacation, then blow a second one before I got back to replace it. And the restore would fail.
For a long time I used TAR scripts, because unlike the fancy expensive commercial products, I could always count on being able to use a tarball as long as the media itself was undamaged.
Ironically, this is the weekend I started learning Bacula. Tar is reliable, but it doesn't manage media catalogs.
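The tar-script approach mentioned above can be sketched in a few lines of Python using the standard `tarfile` module. To be clear, this is not the poster's script, just a minimal illustration of the idea: write a tarball, then read every file back out of it so you know the archive is at least readable. The function names are invented for this example.

```python
import os
import tarfile


def verify_tarball(archive_path):
    """Read every regular file in the archive; return how many were read.

    Reading the full contents forces decompression, so a truncated or
    corrupt archive raises an exception here instead of at restore time.
    """
    count = 0
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                with tar.extractfile(member) as f:
                    while f.read(1024 * 1024):
                        pass
                count += 1
    return count


def make_tarball(source_dir, archive_path):
    """Write source_dir to a gzip-compressed tarball, then verify it.

    archive_path should live outside source_dir, otherwise the archive
    would try to include itself.
    """
    top = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=top)
    return verify_tarball(archive_path)
```

As the parent says, this gives you a reliable archive but no catalog of which media holds what. That bookkeeping is what tools like Bacula add on top.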
I embed the most important data in Bitcoin transactions, and let the geek world mirror the blockchain.
Escher was the first MC and Giger invented the HR department.
Thanks a lot for writing up this suggestion. I had no idea Amazon Glacier was only a penny per gigabyte, and thus finally a realistic way for me to back up virtual machines offsite (using only my available slow home upload bandwidth). Which got me searching on the net...
CloudGates.net does indeed look like a useful service.
A search engine led me to a free Windows client called FastGlacier http://fastglacier.com/faq.aspx
This technote from 'AWS Blog' explains how to use the more standard and better documented Amazon S3 Data buckets to automatically offload data after a specified time to Amazon Glacier storage. The trick is to create a lifecycle rule. I'm inclined to try this, once I get myself better organized, although CloudGates also looks very worthy. Kudos! http://aws.typepad.com/aws/2012/11/archive-s3-to-glacier.html
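For the curious, the lifecycle rule described above might look roughly like this when built in Python. Treat it as a sketch under assumptions: the dictionary follows the shape that, as far as I know, boto3's `put_bucket_lifecycle_configuration` call accepts, while the rule ID, the `vm-images/` prefix, the 30-day delay and the bucket name are hypothetical values picked for illustration. The blog post itself walks through the console rather than code.

```python
import json


def glacier_lifecycle(prefix, days, rule_id="archive-to-glacier"):
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` days old."""
    return {
        "Rules": [
            {
                "ID": rule_id,                      # hypothetical rule name
                "Filter": {"Prefix": prefix},       # which objects the rule covers
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


if __name__ == "__main__":
    cfg = glacier_lifecycle("vm-images/", 30)
    print(json.dumps(cfg, indent=2))
    # Applying it needs the third-party boto3 package and AWS credentials,
    # so it is left commented out; "my-backups" is a made-up bucket name:
    # import boto3
    # boto3.client("s3").put_bucket_lifecycle_configuration(
    #     Bucket="my-backups", LifecycleConfiguration=cfg)
```

The appeal of doing it this way is that uploads go to ordinary S3 with ordinary tools, and the move to cheaper Glacier storage happens on Amazon's side without tying up home bandwidth a second time.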
Happy World Backup Day!
You can't be ahead of the curve, if you're stuck in a loop.