Recovering a Wrecked RAID
Dr. Eggman writes "Tom's Hardware recently posted an article specifying how the professionals at Kroll Ontrack recover data from a RAID array that has suffered a hard drive failure, allowing for recovery of even RAID 5 arrays suffering two failures. The article is quick to warn this is costly, however, and points out the different types of hard drive failures that occur, only some of which are repairable. Ultimately the article concludes that consistent backups and other good practices are the best solution. Still, it provides an interesting look into the world of data after death."
RAID 5 is great, though expensive when done right. RAID 6 is better, though it costs more and performs worse, many controllers won't do it, and you lose two drives' worth of capacity to parity. If your data is truly critical, you should have backups done VERY often, as well as a RAID 50. This way you are far less likely to lose data, though you need at least two RAID 5 sets of 3 drives each, striped together. This requires, at minimum, 6 drives. There are also VRAIDs, which allow you to lose drives until you hit the watermark of your data. This technology is usually reserved for SAN systems.
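For anyone who wants to check the drive math, here is a rough sketch. It assumes equal-sized drives and the usual definitions (RAID 5 gives up one drive to parity, RAID 6 gives up two, RAID 50 stripes across several RAID 5 sets). The function name and the `groups` parameter are made up for illustration.

```python
def usable_drives(level: str, total_drives: int, groups: int = 2) -> int:
    """Return how many drives' worth of capacity is left for data.

    Assumes every drive is the same size. `groups` only matters for RAID 50,
    where it is the number of RAID 5 sets being striped together.
    """
    if level == "raid5":
        if total_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return total_drives - 1  # one drive's worth of parity
    if level == "raid6":
        if total_drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return total_drives - 2  # two drives' worth of parity
    if level == "raid50":
        if groups < 2 or total_drives % groups != 0:
            raise ValueError("RAID 50 needs 2+ equal-sized RAID 5 sets")
        if total_drives // groups < 3:
            raise ValueError("each RAID 5 set needs at least 3 drives")
        return total_drives - groups  # one parity drive per RAID 5 set
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level in ("raid5", "raid6", "raid50"):
        print(level, usable_drives(level, 6))
```

With six drives, RAID 6 and a two-set RAID 50 both leave four drives of usable space. The difference is which failure combinations they survive: RAID 6 survives any two failures, while RAID 50 survives two only if they land in different sets.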
That's true, but the most common cause of data loss on a RAID system that I've seen is when a disk fails, and people leave it there for days or even weeks without bothering to replace it.
When a disk fails in a RAID, it needs to be replaced IMMEDIATELY. A RAID system with a failed disk is a disaster waiting to happen. I've been in smaller shops that don't even have spare disks around. When a disk failed, they would order a disk at that point and have it shipped.
You should always have plenty of spare disks around, and you should replace disks as soon as they fail. A double disk failure is rare, but the longer you put off replacing a failed disk, the more likely it becomes.
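To put a rough number on "the longer you put it off, the more likely it becomes", here is a back-of-the-envelope sketch. It assumes each surviving drive fails independently at a constant rate, and the 3% annual failure rate is a placeholder I picked, not a measured figure. Real drives from the same batch tend to fail together, so treat the result as optimistic.

```python
import math


def second_failure_probability(
    surviving_drives: int,
    days_degraded: float,
    annual_failure_rate: float = 0.03,  # assumed placeholder, not measured
) -> float:
    """Chance that at least one surviving drive fails while the array is degraded.

    Model (an assumption): independent drives with a constant hazard rate,
    derived so that one drive fails within a year with probability
    `annual_failure_rate`.
    """
    daily_hazard = -math.log(1.0 - annual_failure_rate) / 365.0
    return 1.0 - math.exp(-surviving_drives * daily_hazard * days_degraded)


if __name__ == "__main__":
    # Compare swapping the disk the same day with waiting for a shipment.
    for days in (1, 7, 21):
        p = second_failure_probability(surviving_drives=4, days_degraded=days)
        print(f"{days:>2} days degraded: {p:.4%}")
```

Under this model the risk grows almost linearly with the number of days you wait, which is the whole argument for keeping spares on the shelf.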
OK, this is for the very extreme (and rare) cases where the disk is physically very damaged. Most of the time, you'll find that available tools are enough. See http://en.wikipedia.org/wiki/SpinRite, for example. It has worked for me, but:
1. Copy the entire disk contents first. 'Low-level' disk-to-disk dup utilities (Seagate...) can work fine here.
2. Be prepared to wait.
Of course, if your disk is on its way out, the intensive reading (and writing, in the case of SpinRite) may accelerate its demise. Keep the disk at a constant, cool temperature (stick it in a domestic freezer if you've no aircon).
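The "copy the entire disk first" step can be sketched roughly like this. It is a minimal illustration, not a replacement for a real imaging tool: it reads the source block by block, zero-fills any block that fails to read, and keeps going. It assumes the source size can be found by seeking to the end (true for regular files and for block devices on Linux). The function name and block size are my own choices.

```python
import os


def image_disk(src_path: str, dst_path: str, block_size: int = 64 * 1024) -> list:
    """Copy src to dst block by block, zero-filling blocks that cannot be read.

    Returns the byte offsets of blocks that were unreadable or came back short,
    so you know which regions of the image are not trustworthy.
    """
    bad_offsets = []
    # buffering=0 gives raw reads, so one bad block does not poison a larger buffer.
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(want) or b""
            except OSError:
                data = b""  # read error: treat the whole block as lost
            if len(data) < want:
                bad_offsets.append(offset)
            # Pad with zeros so every later block stays at its original offset.
            dst.write(data.ljust(want, b"\0"))
            offset += want
    return bad_offsets
```

Run recovery tools against the image rather than the dying disk, so every extra pass hits healthy hardware.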
It's not that expensive with the price of drives these days. The nice thing about a mirror is that if your controller (or something else, if you have a software RAID) dies, you can mount one of the drives on its own. After dealing with a failed controller, I'm glad to fork out a little more money for the peace of mind.
I'm a big fan of the hard drive->freezer method. It has been alleged that putting a broken hard drive into a freezer can sometimes make the data readable again for a short period of time.
This is good reading:
http://storagemojo.com/?p=383
Short synopsis for those who don't want to read it: The rebuild process is intense enough to cause secondary failures in many more cases than you'd think. Just because you haven't seen it yet doesn't mean it is rare across the overall population, and sysadmins are paid to be prepared.
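One way to see why rebuilds are risky is the unrecoverable read error (URE) arithmetic. A rebuild has to read every bit on every surviving drive, so the more data you read, the more likely you are to hit an error. A rough sketch follows. The 1-in-10^14-bits rate is the figure commonly quoted on consumer drive spec sheets and is an assumption here, as is treating errors as independent. The function name is made up.

```python
import math


def rebuild_read_error_probability(
    surviving_drives: int,
    drive_size_tb: float,
    ure_rate_per_bit: float = 1e-14,  # assumed spec-sheet figure
) -> float:
    """Chance of hitting at least one unrecoverable read error during a rebuild.

    Assumes the rebuild reads every bit of every surviving drive and that
    read errors are independent with a fixed per-bit probability.
    """
    bits_read = surviving_drives * drive_size_tb * 1e12 * 8
    # log1p/expm1 keep precision when the per-bit rate is tiny.
    return -math.expm1(bits_read * math.log1p(-ure_rate_per_bit))


if __name__ == "__main__":
    for size_tb in (0.5, 1.0, 2.0):
        p = rebuild_read_error_probability(surviving_drives=4, drive_size_tb=size_tb)
        print(f"4 surviving drives of {size_tb} TB: {p:.1%}")
```

This covers only read errors, not a second drive dying outright, so the two risks stack on top of each other.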
The rest of your post is arguable, but it's more a matter of opinion and practice than anything else.
With the two drives on separate channels, mirrored writes can be done in parallel.
*.intellitxt.com is blocked in my adblock list. Makes hundreds of sites more readable.
As long as you know how the RAID config was set up (stripe size), most disk recovery programs will do the job just fine. GetDataBack NTFS is a functional and simple tool to use as long as you know how the disks were set up, including RAID 5. I've rebuilt 3 RAID 5s and a shitload of 0s, 1s, and 01s. You should see the look on some of these people's faces after you're done (with all 18+ hours of it...). The problem I usually find is that once you've recovered the data, the customer is under the impression that you *fixed* the disk and they can keep on using it without replacing it. So yeah, it's not a big deal; it's just a question of how much time you want to spend and how much time you have to finish the job.
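For the curious, the core trick behind rebuilding a RAID 5 with one dead member is plain XOR. In every stripe, the missing chunk (data or parity) equals the XOR of the chunks at the same offset on all the surviving members. A minimal sketch follows. It is not how GetDataBack or any other tool is implemented, and the function name is made up.

```python
from functools import reduce


def xor_chunks(chunks: list) -> bytes:
    """XOR equal-length byte strings together, byte by byte."""
    if not chunks:
        raise ValueError("need at least one chunk")
    length = len(chunks[0])
    if any(len(chunk) != length for chunk in chunks):
        raise ValueError("all chunks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


if __name__ == "__main__":
    # One stripe of a 4-disk RAID 5: three data chunks plus one parity chunk.
    d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
    parity = xor_chunks([d0, d1, d2])

    # Pretend the disk holding d1 died; rebuild its chunk from the survivors.
    rebuilt = xor_chunks([d0, d2, parity])
    print(rebuilt == d1)
```

The XOR gives you back the dead member's raw contents. You still need the stripe size and the parity rotation to stitch the chunks back into the right logical order, which is why knowing how the array was set up matters so much.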
If the motherboard fails and is replaced, won't the disks be overwritten when reconfiguring the array?
If you use a reputable controller (i.e. one that costs more than your entire motherboard), it will read the configuration off the disks instead of overwriting them.
As much as this stuff is cool, it's going to be insanely expensive to restore data from these guys.
Data integrity and uptime are served by RAID 5. If that's not good enough, then it should be backed with mirroring (RAID 5+1) or some form of dual-parity RAID (RAID-DP from NetApp, etc.).
But data gets lost or corrupted, even without disk failures. Backups are the place where data recovery is done. DO YOUR BACKUPS!