Why Mirroring Is Not a Backup Solution
Craig writes "Journalspace.com has fallen and can't get up. The post on their site describes how their entire database was overwritten through either some inconceivable OS or application bug, or more likely a malicious act. Regardless of how the data was lost, their undoing appears to have been that they treated drive mirroring as a backup and have now paid the ultimate price for not having point-in-time backups of the data that was their business." The site had been in business since 2002 and had an Alexa page rank of 106,881. Quantcast said they had 14,000 monthly visitors recently. No word on how many thousands of bloggers' entire output has evaporated.
DUH!
While this mirrors previous comments, it's not really a backup solution.
Mirroring, RAID, grid, whatever. At some point, you want your data safe and secure on something not physically attached to any power source.
And that's why your IT department actually needs funding. Sleep tight.
That is one reason why mirroring isn't a backup, and why backups should ideally be off-line.
If I have nothing to hide, don't search me
Excellent! We can use their demise as yet another cautionary tale.
It's really unfortunate that this happened. If they had simply had a backup snapshot of the DB they could have restored it. RAID only saves you from disk failures. It doesn't work on OS/user failures.
Unfortunately this is the kind of thing you tend to learn from experience (either yours or someone else). It's very easy to think "RAID 1 = disks are safe".
Just like a database cluster wouldn't have saved them. A clustering database can save you from load, or you can swap servers if a disk goes bad. But when someone issues "DELETE * FROM..." the other cluster nodes start to happily run the same thing and now you have 2 (or 3 or 10 or...) empty database boxes.
I hope those bloggers had a backup of some sort of their own.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
The rules of backups:
1. Backup all your data
2. Backup frequently
3. Take some backups off-site
4. Keep some old backups
5. Test your backups
6. Secure your backups
7. Perform integrity checking
I am experiencing a strange phenomenon. The jaw-drop reflex has been popping my mouth open for several minutes and won't stop. If I focus I can close it, but then it pops open again. wow.
The cost of that cleanup, of course, will be borne by taxpayers, not industry.
Important note: don't hire the IT dude with Journalspace.com on his resume.
Considering how complete and unrecoverable the loss is, they have no idea who their users are. The accounts would have to be recreated from scratch, but who would try? Their users have no reason to ever trust them again. Journalspace would have a difficult time wooing back their original users, and no new user would seriously consider using them.
Bowing out is the only recourse, but I'm glad they're considering releasing their source code.
They also purposely blocked archive.org via a robots.txt exclusion, so the bloggers can't use that to try and recover some of their blogs.
In today's world where primary storage and protection storage are well-defined, and where entire industry grew around it (examples: NetApp, Data Domain), one is hard-pressed to understand the reason for such a debacle. The reading of the note referred to in the article leads me to believe, unfortunately, that Journalspace's IT department did not understand the difference.
It is sometimes considered a bad form to say something bad about fellow techies. We prefer to look for 'outside' causes. Still, to learn and avoid the same problems in the future, one has to admit his mistakes first. This paragraph from the Journalspace's page:
The value of such a setup is that if one drive fails, the server keeps running, using the remaining drive. Since the remaining drive has a copy of the data on the other drive, the data is intact. The administrator simply replaces the drive that's gone bad, and the server is back to operating with two redundant drives.
makes me believe there is a denial going on.
End anonymous moderation and posting on
See mirroring is like...well a mirror. If you stand before one and stick a fork in your eye your mirror-image does the same. In real time. Analogies are there for a reason.
Looks like at least some content is still in Google's cache, those looking to salvage their journals should act quickly.
You can limit google's search results to a particular site by using the "site:domainname.com" search term (example) and then click the "Cached" links of each result to see Google's copy.
There's also a Greasemonkey script for Firefox that can automatically add Google Cache links next to page links, so you can navigate from one cached page to another easier.
You don't just need backups. You need to TEST them. Having a backup run every night is nice and all; but if the tapes are unreadable and no error was reported, or if you're doing it wrong and the backup is corrupted and you only find out when you come to restore ....
DR is Disaster Recovery
HA is High Availability
I tried Googling, but the only results I got were a medical office in Chinatown.....
"City hall" in German is "Rathaus" Kinda explains a few things......