Slashdot Mirror


Ask Slashdot: How Do You Test Storage Media?

First time accepted submitter g7a writes "I've been given the task of testing new hardware for use in our servers. For memory, I can run it through things such as memtest for a few days to ascertain whether there are any issues with the new memory. However, I've hit a bit of a brick wall when it comes to testing hard disks; there seems to be no definitive method for doing so. Aside from the obvious S.M.A.R.T. tests (i.e. long offline), are there any systems out there for testing hard disks to a similar level to that of memtest? Or any tried and tested methods for testing storage media?"

2 of 297 comments

  1. badblocks by Janek+Kozicki · · Score: 4, Interesting

    badblocks -c 10240 -s -w -t random -v /dev/sda1

    That's my standard test for all HDDs. (Note that -w is a destructive write test: it overwrites everything on the target.)

    --
    #
    #\ @ ? Colonize Mars
    #
  2. Re:Why? by v1 · · Score: 4, Interesting

    "The point is to know whether it's faulty now at the time of arrival rather than 2 weeks down the line where it becomes a problem."

    I would disagree. I believe it's best to be able to identify the first moment a hard drive starts to have problems, rather than the condition it's in when you get it.

    One reason is that most of your hard drives will eventually develop a problem, and only a small fraction of the drives you buy will arrive defective.

    Another reason is that nothing of value is on the new drive; you are risking only the purchase price. A year from now, it may hold important, possibly irreplaceable things, or at least things that are inconvenient to replace.

    I run a piece of custom software I wrote that does a slow "disk crawl", reading ~100 MB every 5 minutes. Over the course of a month it has read every block on the drive, and then it starts over. I get an email if an I/O error OR slow performance is encountered. I store a lot here; I have somewhere around 25 TB of storage under the roof at home. Over the years I've been notified ~8 times of a failing drive. In all cases I was able to replace it before it became inaccessible. One of them failed to spin up ever again the day after I removed it from service. I consider this a very good system, and am surprised not to see a similar commercial offering. (It's a 5,600-line bash script!)
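
    For readers who want to try the idea, one step of such a crawl might look roughly like the following. This is only a minimal sketch, not the 5,600-line script described above: the function names crawl_step and days_per_pass, the ok/slow/error status strings, and the time threshold are all assumptions made for illustration, and it assumes bash with GNU dd (for bs=1M).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a slow "disk crawl" step; all names here are
# illustrative assumptions, not taken from the poster's script.

# crawl_step TARGET OFFSET_MB CHUNK_MB MAX_SECONDS
# Reads CHUNK_MB MiB from TARGET starting at OFFSET_MB MiB and prints:
#   ok     the read succeeded within MAX_SECONDS
#   slow   the read succeeded but took longer than MAX_SECONDS
#   error  dd reported a failure (for example an I/O error)
crawl_step() {
    local target=$1 offset_mb=$2 chunk_mb=$3 max_seconds=$4
    local start end
    start=$(date +%s)
    if ! dd if="$target" of=/dev/null bs=1M \
            skip="$offset_mb" count="$chunk_mb" 2>/dev/null; then
        echo error
        return 1
    fi
    end=$(date +%s)
    if [ $((end - start)) -gt "$max_seconds" ]; then
        echo slow
    else
        echo ok
    fi
}

# days_per_pass SIZE_MB CHUNK_MB INTERVAL_MIN
# Prints how many days (rounded up) one full pass over SIZE_MB takes
# when CHUNK_MB is read every INTERVAL_MIN minutes.
days_per_pass() {
    local size_mb=$1 chunk_mb=$2 interval_min=$3
    local steps=$(( (size_mb + chunk_mb - 1) / chunk_mb ))
    local minutes=$(( steps * interval_min ))
    echo $(( (minutes + 1439) / 1440 ))
}

# Example loop, left commented out because /dev/sdX and the address are
# placeholders; wrap-around at the end of the device is omitted:
# offset=0
# while true; do
#     status=$(crawl_step /dev/sdX "$offset" 100 30) || true
#     [ "$status" = ok ] || echo "crawl: $status at ${offset} MiB" \
#         | mail -s "disk warning" admin@example.com
#     offset=$((offset + 100))
#     sleep 300
# done
```

    At 100 MB every 5 minutes, days_per_pass 1000000 100 5 works out to 35 days for a 1 TB drive, which roughly matches the "about a month" figure above. On a real device you would probably also want to bypass the page cache (GNU dd offers iflag=direct) so that every read actually hits the disk.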

    SMART is only useful to possibly confirm that a drive has a problem. Only a fool relies on it to notify them when there's a problem. I've probably replaced somewhere around 750 hard drives here at work, and of those, under a dozen were still accessible and displaying a SMART failure. Many times I've had SMART toggle to failed while I was doing data recovery to a replacement drive, as I was fighting my way through I/O errors. Got some Cpt Obvious going on there I think.

    --
    I work for the Department of Redundancy Department.