Slashdot Mirror


Recovering a Wrecked RAID

Dr. Eggman writes "Tom's Hardware recently posted an article specifying how the professionals at Kroll Ontrack recover data from a RAID array that has suffered a hard drive failure, allowing for recovery of even RAID 5 arrays suffering two failures. The article is quick to warn this is costly, however, and points out the different types of hard drive failures that occur, only some of which are repairable. Ultimately the article concludes that consistent backups and other good practices are the best solution. Still, it provides an interesting look into the world of data after death."

34 of 175 comments

  1. Re:RAID5. by Anarke_Incarnate · · Score: 3, Informative

    RAID 5 is great, though expensive when done right. RAID 6 is better, though has less performance, as well as additional cost. Many controllers will not do RAID 6, and you lose 2 drives to parity. If your data is truly critical, you should have backups done VERY often, as well as a RAID 50. This way you are far less likely to lose data, though you have to have a stripe of at least 3 drives, in a mirror. This requires at minimum, 6 drives. There are also VRAIDs, which allow for you to lose drives until you hit the watermark of your data. This technology is usually reserved for SAN systems.
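
    To make the capacity and drive-count trade-offs concrete, here is a rough sketch using the textbook definitions of each level. It treats RAID 50 as a stripe across several RAID 5 groups; the function name, the `groups` parameter and the 300 GB example drives are illustrative assumptions, and real controllers may impose other limits.

```python
def raid_summary(level, drives, drive_gb, groups=2):
    """Return (usable_gb, failures_always_survived) for a RAID level.

    Textbook definitions only. `groups` is the number of RAID 5 sets
    striped together for RAID 50 (an assumption; controllers vary).
    """
    if level == "1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_gb, drives - 1          # every drive is a full copy
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_gb, 1    # one drive's worth of parity
    if level == "6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) * drive_gb, 2    # two drives' worth of parity
    if level == "10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even count of 4+ drives")
        return (drives // 2) * drive_gb, 1   # second loss may hit the same pair
    if level == "50":
        if drives % groups or drives // groups < 3:
            raise ValueError("each RAID 5 group needs 3+ drives")
        return (drives - groups) * drive_gb, 1  # one parity drive per group
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level in ("5", "6", "10", "50"):
        print(level, raid_summary(level, drives=6, drive_gb=300))
```

    The second number is the count of failures the array is guaranteed to survive; RAID 10 and RAID 50 can survive more if the failures happen to land in different pairs or groups.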

  2. Re:RAID5. by networkBoy · · Score: 4, Insightful

    For DB's and home use a mirror set is usually best. For homes because it is simple, for DB servers because it is fast.
    My home setup is a pair of 300 gig drives in a mirror, with another 1.6TB for other storage. Stuff that is important is on the mirror, and is differentially backed up to DVD regularly.
    Stuff on the mass array is available in original form (my DVD and CD library that's been ripped) or is backed up whenever it changes, which is not often (my code library, for example). Active code and my wife's thesis are on the raid. Supporting documents for the thesis are on DVD and mass storage, as are old code projects that I may borrow from for functionality in a new project. The old projects (and likely several versions of them) are off on DVD in a safe deposit box, with the rest of my backups.

    Safe deposit boxes are awesome. I have one that can store 600 cds in cake boxes and it only costs $120/year. Dirt cheap for climate controlled fireproof storage.
    -nB

    --
    whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
  3. Re:RAID5. by drinkypoo · · Score: 4, Insightful

    RAID 50? Why not RAID 10? If you're already mirroring, the RAID 5 will probably not afford you much additional protection, and it has the effect of making writes slower.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  4. Long-winded advertisement^H^H^H^H^H^H^H^H article. by QuietLagoon · · Score: 4, Insightful

    It takes far too many pages to say what could actually fit in a page or two.

  5. FOR THE LAST FREAKIN' TIME... by canUbeleiveIT · · Score: 4, Insightful

    Never put all of your eggs in one little basket (RAID or otherwise)! For the love of God, if your data is critical, you need a backup *and* an offsite backup. At least one of each. There are no exceptions to this rule.

    1. Re:FOR THE LAST FREAKIN' TIME... by eln · · Score: 4, Informative

      That's true, but the most common cause of data loss on a RAID system that I've seen is when a disk fails, and people leave it there for days or even weeks without bothering to replace it.

      When a disk fails in a RAID, it needs to be replaced IMMEDIATELY. A RAID system with a failed disk is a disaster waiting to happen. I've been in smaller shops that don't even have spare disks around. When a disk failed, they would order a disk at that point and have it shipped.

      You should always have plenty of spare disks around, and you should replace disks as soon as they fail. A double disk failure is rare, but the longer you put off replacing a failed disk, the more likely it becomes.
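
      A back-of-the-envelope sketch of why the delay matters. It assumes each surviving drive fails independently at a constant rate derived from an annual failure rate (AFR); the 3% default is an illustrative assumption, not a measured figure. As other comments in this thread point out, drives from the same batch or on the same power supply tend to fail together, so treat the result as a lower bound.

```python
import math


def second_failure_probability(surviving_drives, hours_degraded, afr=0.03):
    """Chance that at least one surviving drive fails while degraded.

    Assumes independent failures at a constant hourly rate derived
    from `afr` (annual failure rate, an assumed illustrative value).
    """
    rate_per_hour = -math.log(1.0 - afr) / 8760.0   # 8760 hours per year
    return 1.0 - math.exp(-surviving_drives * rate_per_hour * hours_degraded)


if __name__ == "__main__":
    # 5-drive RAID 5 with one dead drive: 4 survivors must all hold out.
    for label, hours in (("4 hours", 4), ("3 days", 72), ("3 weeks", 504)):
        p = second_failure_probability(4, hours)
        print(f"degraded for {label}: {p:.4%}")
```

      Under this model the risk grows almost linearly with the time spent degraded, which is the point being made above: a disk swapped in hours is a very different bet from one left for weeks.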

    2. Re:FOR THE LAST FREAKIN' TIME... by BagOBones · · Score: 2, Insightful

      I agree they must be replaced ASAP. However I don't keep drives on hand when my vendor can get a new disk to my desk within 4 hours of my call at no charge.

      --
      EA David Gardner -"... but the consumers have proven that actually what they want is fun."
  6. Software RAID by Kludge · · Score: 3, Insightful

    People often pooh-pooh software RAID (it is more of a pain to manage). But when it comes to recovery, it's what you want. You know the disk format and have the tools. Of course, you really shouldn't have to recover; you should keep good backups or another mirror if it's that important.

  7. Gotta love Tom's articles by sjbe · · Score: 4, Insightful

    Could these articles be any more annoying to read?

    They painstakingly

    NEXT PAGE

    pull data

    NEXT PAGE

    off the

    NEXT PAGE

    damaged drive

  8. Could have mentioned other options by Bearhouse · · Score: 2, Informative

    OK, this is for the very extreme (and rare) cases where the disk is physically very damaged. Most of the time, you'll find that available tools are enough. See http://en.wikipedia.org/wiki/SpinRite, for example. It has worked for me, but: 1. Copy the entire disk contents first. 'Low-level' disk-to-disk duplication utilities (Seagate...) can work fine here. 2. Be prepared to wait. Of course, if your disk is on its way out, the intensive reading (and writing, in the case of SpinRite) may accelerate its demise. Keep the disk at a constant, cool temperature (stick it in a domestic freezer if you've no aircon).

    1. Re:Could have mentioned other options by goarilla · · Score: 2, Interesting

      What's wrong with popping in a live CD like SystemRescueCd (http://www.sysresccd.org/Main_Page/)
      and using dd to take an image of the disk, or Ghost (but IIRC Ghost uses dd)?
      I have been able to successfully recover 99% of a crashed, broken, badly partitioned hard drive that way numerous times.
      Of course I do not claim to have the expertise of Ontrack, but I've done this for quite
      a few friends, and since not everybody can pay what they ask for their service, I can understand
      why they get drives that have first been subjected to a DIY recovery.

      But why do we need all these expensive consumer disk recovery tools that often do not work correctly?
      I must agree that this article is mainly advertising, but that is to be expected.
      I mean, the dude works at that company; he's kind of obligated to praise the so-called 'superiority' of their own proprietary tools.

      Granted, I don't have a clean room, but the area in that so-called clean room doesn't seem so clean,
      and the platters hanging at the top right of the other picture don't strike me as a good idea either.
      That obviously wasn't a clean room, and I would at least encourage the use of a static bag anyway.
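
      For readers who have not used dd this way, here is a minimal sketch of the idea behind `dd conv=noerror,sync`: copy the source block by block, and when a block cannot be read, record it and write zeros so everything after it stays at the right offset. The function name and the `read_block` hook are illustrative assumptions, and `os.pread` is Unix-only. This is not a substitute for a real imaging tool.

```python
import os


def image_with_skips(src_path, dst_path, block_size=4096, read_block=None):
    """Copy src to dst block by block; unreadable blocks become zeros.

    Returns the byte offsets of blocks that could not be read.
    `read_block` exists only so failures can be simulated in tests.
    """
    if read_block is None:
        read_block = os.pread          # (fd, length, offset) -> bytes
    bad_offsets = []
    fd = os.open(src_path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        with open(dst_path, "wb") as out:
            offset = 0
            while offset < size:
                want = min(block_size, size - offset)
                try:
                    data = read_block(fd, want, offset)
                except OSError:
                    data = b""
                    bad_offsets.append(offset)
                # Pad failed or short reads with zeros to keep alignment.
                out.write(data.ljust(want, b"\0"))
                offset += want
    finally:
        os.close(fd)
    return bad_offsets
```

      Working from the image afterwards, rather than from the failing drive, means any repair attempt can be repeated without stressing the hardware again.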

  9. Printer Friendly by TubeSteak · · Score: 3, Insightful

    http://www.tomshardware.com/2007/02/14/raid_recovery/print.html

    I don't know why TH has printer friendly pages that they don't ever link to.

    --
    [Fuck Beta]
    o0t!
  10. Questionable advice from Tom's by greg1104 · · Score: 5, Interesting

    I have a concern with the recommendations given in the introduction:

    We assume that all hard drives will be handled with care, so they should be installed in suitable drive bays. If you use multiple drives, we recommend removable drive frame solutions, which help reduce vibration transfer onto the computer chassis and even back to individual hard drives. Make sure that your system has sufficient ventilation, so high speed hard drives won't overheat.

    I've found the removable drive frames available for cheap consumer hardware to be total crap. The metal enclosure keeps heat close to the drive, and the tiny fans used don't move nearly as much air past the drive as when it's inside the case, being cooled by the airflow of the case fans. The drive temperature is therefore higher even under the best conditions. In addition, the smaller fans fill with gunk quickly and as a result wear out faster than larger ones, leading regularly to a drive trapped in an uncooled box.

    I've used enclosures from Promise, Enermax, and several other companies whose products were so bad I tried to forget their names; all had fans that instantly became the least reliable part of the entire system once I installed the drive frame, and I wasn't happy with the drive's temperature from day one.

    I don't think the person making this comment at Tom's ever keeps systems running long enough to realize the long-term issues that come with anything cheaper than server-grade drive enclosures for hard drives. I'd welcome suggestions for a better quality product in this category. It's a hard subject to cover, because by the time you've had several units set up for a year or two to gather useful data on how rugged they are, the product is obsolete; not something any review site I'm aware of is set up to cover.

  11. IntelliTXT too by Skadet · · Score: 3, Insightful
    Yeah, between that and IntelliTXT, I pretty much gave up.

    What if your hard drive decides to enter the Elysian Fields in this very moment? Sure, you could simply get a new hard drive to substitute for the defective one with a quick run to your favorite hardware store. And with last night's backup you might even reconstruct your installation quickly. But what if you don't have a backup? We have experienced the truth to be more like this: many users don't even have a backup, or it simply is too old and thus useless for recovering any useful files at all. In case of real hard drive damage, only a professional data recovery specialist can help you - say bye-bye to your vacation savings!
    Anyone remember when Tom's Hardware was good?
    1. Re:IntelliTXT too by operagost · · Score: 2, Insightful

      Besides dedicating only about 10% of the page to actual content, the grammar is actually even worse than it used to be. Don't they have any native English-speaking editors?

      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
    2. Re:IntelliTXT too by PitaBred · · Score: 2, Informative

      *.intellitxt.com is blocked in my adblock list. Makes hundreds of sites more readable.

  12. Re:RAID5. by jandrese · · Score: 4, Insightful

    Guys, if you're doing regular backups and have a cold spare handy then RAID5 is typically more than enough. Two drive failures are exceedingly rare unless you have some sort of controller fault (and that will typically hit all of your drives anyway). Don't forget about the write penalty either, RAID 5 has a fairly stiff penalty, but RAID 6 is even worse. If you're talking about RAID5_0 or RAID6_0 you're almost certainly doing it wrong or planning for a day when you can't buy replacement hard drives (nuclear holocaust?).

    To put it another way: What do you think your chances are of having a second drive failure in the few hours it takes you to replace the drive and rebuild it? Even if that does happen you just lose the data up until your last backup (a day at most).

    Most professional installations I do are RAID1_0, because people are building the RAID array for the performance, not the cost. Since you're using crappy 80GB HDDs, I'm guessing you're going for cost, which makes it strange that you're thinking about a RAID6_0 solution at all (the controller alone won't be cheap for that). If you work the odds I think you'll find that it's just not worth it to build a RAID6_0, especially given the write penalty and complexity (complexity is your enemy with this, complexity means bugs, which can undermine your entire effort).
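
    The write penalty can be put into rough numbers with the usual rule of thumb for small random writes: each logical write costs 2 disk operations on RAID 10, 4 on RAID 5 and 6 on RAID 6. The sketch below applies that rule; the 100 IOPS per drive figure is an illustrative assumption, and caching controllers or full-stripe writes will change the picture.

```python
# Disk operations per small random write (common rule-of-thumb values).
WRITE_PENALTY = {"10": 2, "5": 4, "6": 6}


def random_write_iops(level, drives, iops_per_drive=100):
    """Estimate array-wide random write IOPS for a RAID level.

    `iops_per_drive` is an assumed figure for a single spindle.
    """
    return drives * iops_per_drive / WRITE_PENALTY[level]


if __name__ == "__main__":
    for level in ("10", "5", "6"):
        print(f"RAID {level}: {random_write_iops(level, drives=6):.0f} write IOPS")
```

    With the same six drives, this estimate gives RAID 10 twice the random write throughput of RAID 5 and three times that of RAID 6.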

    --

    I read the internet for the articles.
  13. Gibson the Hack by spun · · Score: 3, Insightful

    SpinRite is a Steve Gibson product. Steve Gibson is a pompous blowhard with few real skills. There are plenty of other ways to do a low level copy of a disk.

    --
    - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  14. Re:RAID5. by endoftheroadmatt · · Score: 2, Informative

    It's not that expensive with the price of drives these days. The nice thing about a mirror is that if your controller (or something else if you have a software raid) dies you can mount one of the drives on its own. After dealing with a failed controller, I'm glad to fork out a little more money for the peace of mind.

  15. Backing up HDDs is very hard by AmiMoJo · · Score: 2, Interesting

    With recent articles on HDDs not being very good for redundancy (because they often fail at the same time if they are from the same batch, or fail because of things like electrical spikes which affect all drives in an array) it is clear that HDDs are not an ideal backup medium. I use an external 2.5" HDD which is totally disconnected from the PC and everything else when not in use (to avoid power surges etc), but only for critical data as my machine has 1.2TB of HDD storage.

    Optical discs are a joke - 4.3GB is just not enough. Larger formats exist but are relatively expensive. Tape is expensive per MB and slow, plus it isn't random access and not suited to anything but slow full backups. MO is too small and expensive.

    It seems like the best bet is something like a Century Tower - basically a USB enclosure that can take up to 4/8 drives. Keep it totally disconnected when not in use, and use RAID 0 mirroring with drives from different manufacturers.

    --
    const int one = 65536; (Silvermoon, Texture.cs)
    SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    1. Re:Backing up HDDs is very hard by operagost · · Score: 3, Informative

      Optical discs are a joke - 4.3GB is just not enough. Larger formats exist but are relatively expensive. Tape is expensive per MB and slow, plus it isn't random access and not suited to anything but slow full backups.
      Your knowledge is out of date. For example, a SuperDLT 640 backs up at 32 MB/s with compression. Slower than a disk, but not "slow". Sequential access: well that's a given. Only suited for full backups? That's news to my company. Even daily incrementals and differentials are usually hundreds of megabytes or a few GB, which negates the small spool-up time of the tape. Besides, most modern tapes now store metadata on an internal chip so that an on-tape index does not need to be searched.

      use RAID 0 mirroring
      RAID 0 is striping. You probably mean RAID 10 or RAID 0+1.
      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
  16. Cheap Solution by SwabTheDeck · · Score: 2, Informative

    I'm a big fan of the hard drive->freezer method. It has been alleged that putting a broken hard drive into a freezer can sometimes make the data readable again for a short period of time.

  17. sorry, that's wrong... by HalfOfOne · · Score: 2, Informative

    This is good reading:
    http://storagemojo.com/?p=383

    Short synopsis for those who don't want to read it: The rebuild process is intense enough to cause secondary failures in many more cases than you'd think. That you haven't seen it yet is not indicative of the overall population, and sysadmins are paid to be prepared.

    The rest of your post is arguable, but it's more a matter of opinion and practice than anything else.

  18. Re:RAID5. by Intron · · Score: 3, Informative

    With the two drives on separate channels, mirrored writes can be done in parallel.

    --
    Intron: the portion of DNA which expresses nothing useful.
  19. backups by hurfy · · Score: 2

    Besides having a backup not connected to the system, I found simply having a spare disk to steal the circuit board off of to be a life saver :)

    I miss the old bigfoot drives we had, everyone said they had problems with them but it was always (in our case) the board that died NOT the disk. I saved a couple of those by swapping in a board for a 1 hour recovery.

    If you buy several HDs for RAID or whatever, buy one more and stick it on a shelf for a rainy day. Along with a few utilities you can do 3/4 of what they do for $100 instead of $1000+

  20. The easy way to fix a wrecked raid by StreetStealth · · Score: 2, Funny

    Really, a wrecked raid can be fixed pretty easily if you have enough warlocks to get everyone a soulstone.

    --
    Your mind is clear / The things that you fear / Will fade with how much you / Believe what you hear
  21. Terminology update by sasdrtx · · Score: 2, Funny

    When a disk fails in a RAID, it becomes an AID.

    Or in the case of RAID-1, it becomes just an ID.

    --
    Most people don't even think inside the box.
  22. Lunch by Seraphim_72 · · Score: 3, Interesting

    I attended a small conference where the Kroll VP of Data Recovery was speaking. He came in, his assistant set up his power point stuff, made sure the projector was right etc. He then gave a very interesting talk about what Kroll could pull off of a drive, despite what had been done to it. By way of example he showed a slide of a burnt and bent hard drive - that came out of the sky when the shuttle broke up. They recovered 99% of the data on that drive. He also mentioned that they do the data recovery for all of the spook organizations in D.C.

    When we broke for lunch I got to sit at his table and we got to ask him all sorts of questions about their processes. He mentioned they have things they use that they have never patented because it would be too much of a leg up for both the competition and those that seek to destroy data. We tried to get him to tell us what we would have to do to a drive to make it unreadable. Mostly his answers to our "Surely this would make the data unreadable" queries were "You would think that would work wouldn't you?" Someone referenced his assistant who was sitting next to him and the VP said:

    "Him? No, no, no. (laughs) He is not my assistant, in fact he doesn't work for me at all. He is a lawyer for the company and is here to make sure I don't say anything I am not supposed to." The assistant then gave us one of those 'I could eat you alive' lawyer smiles.

    I walked out secure in the knowledge that short of melting the platters down the data can *always* be recovered.

    Sera

    --
    Slashdot, where armchair scientists get shouted down and armchair theologians get modded up.
    1. Re:Lunch by DamnStupidElf · · Score: 2

      I walked out secure in the knowledge that short of melting the platters down the data can *always* be recovered.

      Encrypt. I guarantee that even if the NSA can break AES, they won't do it for anything short of top secret cases that will never see the light of day. Breaking random drives encrypted with AES or any other modern cipher would disclose their ability to break that cipher and no one would use it anymore, removing their advantage.

  23. Truly astonishing...but so simple by hrrY · · Score: 2, Informative

    As long as you know how the RAID config was set up (stripe size), most disk recovery programs will do the job just fine. GetDataBack NTFS is a functional and simple tool to use as long as you know how the disks were set up. That includes RAID5: I've rebuilt 3 RAID5s and a shitload of 0s, 1s, and 01s. You should see the look on some of these people's faces after you're done (with all 18+ hrs of it...). The problem I usually find is that once you've recovered the data, the customer is under the impression that you *fixed* the disk and they can keep on using it without replacing it. So yeah, it's not a big deal; it's just a question of how much time you want to spend and how much time you have to finish the job.
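
    The arithmetic behind single-parity (RAID 5) recovery is plain XOR: the parity block is the XOR of the data blocks in a stripe, so any one missing block is the XOR of everything that survived. The sketch below shows only that step, with made-up block contents. A real rebuild also has to know the stripe size, the disk order and how parity rotates across the drives, which is what the recovery tools need you to tell them.

```python
def xor_blocks(blocks):
    """XOR equal-sized byte blocks together, byte by byte."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must be the same size")
    result = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


if __name__ == "__main__":
    # One stripe on a 4-drive array: three data blocks plus parity.
    d0, d1, d2 = b"wife's t", b"hesis.do", b"c v17 ok"
    parity = xor_blocks([d0, d1, d2])
    # The drive holding d1 dies; rebuild its block from the survivors.
    rebuilt = xor_blocks([d0, d2, parity])
    print(rebuilt == d1)
```

    The same property explains why a second failure is fatal for RAID 5: with two blocks missing from a stripe, one XOR equation is no longer enough to solve for both.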

  24. Data recovery is never an easy process by cyanics · · Score: 3, Interesting

    Last week, I did a data recovery for a client that had multiple disk head crashes from a power outage, or a kick or something. The drives were click-seeking, which for the most part is unrecoverable.

    Popped in a Helix disk, and checked what the MFT was doing. Lo and behold, no MFT, no boot sector, and a huge list of bad sectors. Basically, the crash had resulted in a bad sector in the bad sector table, and all over the first portion of the disk.

    These were 200GB disks, but eventually I was able to get a sector repair program to read through and do a non-destructive repair. The data was safe, but was now corrupt. The next step was to repair the data, and I was finally able to just use chkdsk for that.

    Eventually, it was back to real data, and was able to push the data over to a new replacement hard drive.

    Told the client to invest in RAID 1, but seriously doubt they would be willing to spend that $100 for the RAID. Instead, they prefer to pay $1000 for a repair.

    BACKUPS. make lots of BACKUPS. RAID your stuff, and get those backups offsite. Do them regularly. Seriously, it would save your ass if something happens. For example, I have a LAN HD that is parked out in a shed in my backyard. Total cost $200, and has already saved my ass 2x.

  25. Re:Newbie Question... by Wesley+Felter · · Score: 2, Informative

    If the motherboard fails and is replaced, won't the disks be overwritten when reconfiguring the array?

    If you use a reputable controller (i.e. one that costs more than your entire motherboard), it will read the configuration off the disks instead of overwriting them.

  26. RAID5 is good, not flawless by swordgeek · · Score: 2, Informative

    As much as this stuff is cool, it's going to be insanely expensive to restore data from these guys.

    Data integrity and uptime are served by RAID5. If it's not good enough, then it should be backed with mirroring (RAID5+0) or some form of dual-parity RAID (RAID-DP from NetApp, etc.).

    But data gets lost or corrupted, even without disk failures. Backups are the place where data recovery is done. DO YOUR BACKUPS!

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  27. Re:RAID5. by v1 · · Score: 2, Interesting

    I have 3TB of storage here. 1TB of that is in a 5 x 250 hardware raid-5. In this case it's a stand-alone enclosure with firewire/usb ports on the back. I chose this because it's easily portable to another machine, so if the server buys the farm, I simply unplug it and walk it to the next server and jack it in. Also if the computer crashes, there is less risk of corruption of data since the raid box is handling parity calculation. It also does not hammer the server if I have to do a rebuild.

    The other 2TB is the large, more easily replaceable data, mostly video media. Those are arranged as single drives. I am not stupid enough to try to stripe them, I don't like the idea of any 1 of 8 drives failing leading to total data loss on all drives.

    I have a script that runs once a week and reads 100% of all blocks on all drives, and emails me if a bad block was found. I replace the drive immediately. To date I have yet to lose a byte.
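
    A weekly read-everything check along these lines could look roughly like the sketch below. This is not the poster's script: the function names, the addresses and the drive path in the usage note are all hypothetical, and it only builds the alert message rather than sending it.

```python
from email.message import EmailMessage


def scan_for_bad_blocks(path, block_size=1024 * 1024):
    """Read every block of `path`; return offsets that raised I/O errors."""
    bad_offsets = []
    offset = 0
    with open(path, "rb", buffering=0) as dev:
        while True:
            try:
                chunk = dev.read(block_size)
            except OSError:
                bad_offsets.append(offset)
                offset += block_size      # skip past the unreadable block
                dev.seek(offset)
                continue
            if not chunk:                 # end of file or device
                break
            offset += len(chunk)
    return bad_offsets


def build_alert(bad_by_drive, sender, recipient):
    """Return an EmailMessage listing bad blocks, or None if all is clean."""
    failing = {drive: offs for drive, offs in bad_by_drive.items() if offs}
    if not failing:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Bad blocks found on {len(failing)} drive(s)"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [f"{drive}: unreadable at offsets {offs}"
             for drive, offs in sorted(failing.items())]
    msg.set_content("\n".join(lines))
    return msg
```

    Run from cron as root, it would scan each device (for example a hypothetical /dev/sdb), pass the results to `build_alert`, and hand any returned message to `smtplib.SMTP(...).send_message(msg)`; the mail server to use is left as an assumption.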

    Closest call I had was a few years ago before the raid5, when I had a pair of software mirrors. I had a server crash that wiped one drive's partition table and wiped the other drive's directory. Neither alone was fixable by any utility I tried. I ended up doing some really scary things with dd and xxd in a terminal to reconstruct the partition table from scratch on one drive and install firewire drivers on it so I could get at my data. I am very thankful for having a very high level of knowledge of partition tables and basic directory layouts; most people would have had to cough up a stack of benjamins to get that fixed.

    My near future plans are to buy two 1tb lacie bigdick (they have a horrible failure rate, never use them for critical anything) and use that to back up the loose drives. Long term I plan to get a larger raid5 box to phase out the old raid as primary storage, and convert it to cover some of the less critical storage. Right now the cheapest and simplest backup plan for us seems to be to buy large cheap drives and keep 1 or 2 complete clones.

    --
    I work for the Department of Redundancy Department.