Slashdot Mirror


Everything You Know About Disks Is Wrong

modapi writes "Google's wasn't the best storage paper at FAST '07. Another, more provocative paper looking at real-world results from 100,000 disk drives got the 'Best Paper' award. Bianca Schroeder, of CMU's Parallel Data Lab, submitted 'Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?' The paper crushes a number of (what we now know to be) myths about disks, such as vendor MTBF validity, 'consumer' vs. 'enterprise' drive reliability (spoiler: no difference), and RAID 5 assumptions. StorageMojo has a good summary of the paper's key points."
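For a sense of scale, the datasheet figure in the paper's title can be turned into a nominal annualized failure rate (AFR). The sketch below is back-of-the-envelope arithmetic, not anything taken from the paper: it assumes a constant failure rate and uses the linear approximation AFR ≈ hours per year / MTTF, and the function names are made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours of continuous operation


def nominal_afr(mttf_hours: float) -> float:
    """Annualized failure rate implied by a datasheet MTTF.

    Assumes a constant failure rate; for large MTTF values the linear
    approximation is very close to the exact 1 - exp(-8760 / MTTF).
    """
    return HOURS_PER_YEAR / mttf_hours


def expected_failures_per_year(drive_count: int, mttf_hours: float) -> float:
    """Expected yearly drive replacements for a fleet, under the same assumption."""
    return drive_count * nominal_afr(mttf_hours)


if __name__ == "__main__":
    mttf = 1_000_000  # hours, the figure from the paper's title
    print(f"Nominal AFR: {nominal_afr(mttf):.2%}")
    print(f"Expected failures per year in 100,000 drives: "
          f"{expected_failures_per_year(100_000, mttf):.0f}")
```

Under these assumptions, an MTTF of 1,000,000 hours works out to a bit under 1% of drives per year. That is only what the datasheet number implies; the point of the paper is that field replacement rates do not match it.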

1 of 330 comments

  1. Infant Mortality and stuff by jmorris42 · Score: 0, Troll

    > Um, but doesn't the summary of the paper say that there is no infant mortality effect, and that
    > failure rates increase with time, and thus the bathtub curve doesn't actually apply?

    That may be the new 'theory', but we all know about theory vs. reality. Here in reality, if you put a couple of dozen new drives into service, you keep one or two spare hard drives on hand to replace the ones that WILL fail in the first week, especially with the consumer-grade drives typical of workstation deployments. If you only have one dud out of twenty, it was a good rollout.

    And as for some of the other assertions in this paper (well, the summary; I haven't read this one yet and still want to reread the Google paper. Need more hours in a day... bah!)...

    > Costly FC and SCSI drives are more reliable than cheap SATA drives.

    Sorta. Again, real world vs. theory. Try banging the hell out of an off-the-shelf consumer drive 24/7/365 and see how long it holds up. Yeah, thought so. Hope you didn't have anything important on that paperweight.

    > RAID 5 is safe because the odds of two drives failing in the same RAID set are so low.

    This one should bother ya if you are relying too heavily on the 'infallibility' of RAID 5. Remember kids, drives fail from two major groups of causes: internal and external. If a power event kills one drive in the array, the odds are pretty low that only one is dead; you just might not KNOW about #2 yet. And filesystem corruption will be faithfully mirrored onto the array. Obey the 1st Commandment: "Thou Shalt Make Backups."
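The "odds are so low" assumption can be put into numbers. The sketch below is an illustration only: it treats drive failures as independent with a constant failure rate, which is exactly the assumption a shared power event breaks. The array size, rebuild time, and function name are hypothetical values picked for the example.

```python
import math


def p_second_failure(drives_in_set: int, rebuild_hours: float,
                     mttf_hours: float) -> float:
    """Chance that at least one surviving drive fails during a rebuild.

    Assumes independent failures at a constant rate of 1 / mttf_hours
    per drive. Correlated causes (power, heat, a bad batch) make the
    real risk higher than this estimate.
    """
    surviving = drives_in_set - 1
    return 1.0 - math.exp(-surviving * rebuild_hours / mttf_hours)


if __name__ == "__main__":
    # Hypothetical 8-drive RAID 5 set with a 24-hour rebuild window.
    for mttf in (1_000_000, 250_000):
        p = p_second_failure(8, 24, mttf)
        print(f"MTTF {mttf:>9,} h: P(second failure during rebuild) = {p:.6f}")
```

The independent-failure number comes out tiny, which is why RAID 5 looks safe on paper. Once failures share a cause, this model no longer applies and the estimate is only a lower bound.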

    --
    Democrat delenda est