Slashdot Mirror


Ask Slashdot: Do You Test Your New Hard Drives?

An anonymous reader writes "Any Slashdot thread about drive failure is loaded with good advice about EOL — but what about the beginning? Do you normally test your new purchases as thoroughly as you test old, suspect drives? Has your testing followed the proverbial 'bathtub' curve of a lot of early failures, but with those that survive the first month surviving for years? And have you had any return problems with new failed drives, because you re-partitioned it, or 'ran Linux,' or used stress-test apps?"

55 of 348 comments

  1. Heh by Deekin_Scalesinger · · Score: 4, Insightful

    Like, never. Out of the box and away she goes...good luck to thee!

    --
    "As the intrepid kobold companion continues his journey, he begins to wonder... if priests raises dead, why anybody die?
    1. Re:Heh by JMJimmy · · Score: 4, Insightful

      Add to the above:

      HDD tools are useless. I recently tried a bunch of them - they all reported my HDD in perfect condition... while it was doing the click of death. HDD failed within a week.

    2. Re:Heh by PlusFiveTroll · · Score: 3, Informative

      Sounds more like your hard drive's S.M.A.R.T. was useless. The tools can only report what the drive tells them; if SMART isn't telling them about reallocated sectors, resets, or whatever other terrible malfunction, then they are left in the dark.

    3. Re:Heh by hairyfeet · · Score: 4, Interesting

      The problem is the best damned tool ever made for testing drives hasn't been updated in years and now won't work on drives bigger than 500GB. I am of course talking about Spinrite. With Spinrite on level 2 you just bypass the firmware, write patterns of zeroes and ones, and then read back what it reports; if it's spitting errors right off the bat then you know to send it back. Problem is Gibson hasn't updated the thing since '06, so it can't handle drives bigger than 500GB, which makes it all but useless today.

      So if anybody has found something that works like Spinrite but handles large drives, I too would like to know. I get drives coming into the shop from all over the place with ZERO history, so I don't know if they've been barely used or thoroughly abused, and having a tool I can run on them would be a big help.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    4. Re:Heh by hairyfeet · · Score: 3, Informative

      That's nice, an OS used by less than 2% of the entire planet has some tool that reports what SMART is telling it, no different than a billion freeware programs for Windows. Just FYI, but I can think of about a dozen freeware programs that will do the same damned thing in Windows, INCLUDING the email, so it's not exactly like you've got anything to brag about, Ms AC.

      Now I'm gonna spell out what the REAL problem is, which any guy who has spent time in the trenches will tell you, and that is SMART SUCKS ASS and for several years has been more about covering bad batches for the HDD OEMs than about actually telling you something is going bad. I have had drives in the shop that sounded like an angle grinder bouncing on pavement where SMART said "Nope, nothing wrong here la la la" while the thing just ground and sputtered. It's the most fucking pointless diagnostic tool there is.

      What we NEED is a replacement for Spinrite, something that bypasses the lying SMART and just runs a pass of zeroes and ones on the drive and reports a simple pass/fail on the read/writes. Spinrite was fucking brilliant for this: it would give you a layout of the entire drive with red for sectors that failed to report the correct data back and blue for clean, so it took just a second to glance at the readout to spot a drive that was buggy out of the box. But nobody has updated the tool in years, so it's useless now since it can't do SATA 6 or drives above 500GB.

      So how about it, FOSS devs? Here are the requirements: bypass SMART, do a single R/W cycle, report results. That's ALL it has to do, and so far nobody has stepped up to the plate. Damned near every shop I knew, including mine, had bought a copy of Spinrite, so there is good money to be made there if you are willing to put in the work. It's a niche, but it's a niche with money: builders, repair shops and gamers would all love to hand you money for this tool, so get on it and report back when it's done, okay?

      --
      ACs don't waste your time replying, your posts are never seen by me.
    5. Re:Heh by Runaway1956 · · Score: 2

      I saw nothing about any burn in tests in the GP post. The guy has a couple of scripts running to ensure that A) he is made aware of impending hard disk problems, and B) his data is backed up in the event of a hard disk problem.

      Reading comprehension 101, available at a community college near you.

      Unless, of course, you're just trolling a Linux user. In which case, feel free to continue making a fool of yourself.

      --
      "Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br
    6. Re:Heh by JMJimmy · · Score: 3, Informative

      No, not SMART. I did a full range of tests with all suites on top of SMART (surface tests, etc.)

      The only HDD tool I trust is the ancient one from GRC.

    7. Re:Heh by greg1104 · · Score: 5, Interesting

      Spinrite hasn't been useful for years. There's a good analysis of why at "Does SpinRite do what it claims to do?". Everything the program does can be done more efficiently with a simpler program run from a Linux boot CD. And the fact that it takes so long is a problem--you want to get data off a dying drive as quickly as possible. Here's what I wrote on that question years ago, and the rise of SSDs makes this even more true now:

      SpinRite was a great program in the era it was written, a long time ago. Back then, it would do black magic to recover drives that were seemingly toast, by being more persistent than the drive firmware itself was.

      But here in 2009, it's worthless. Modern drives do complicated sector mapping and testing on their own, and SpinRite is way too old to know how to trigger those correctly on all the drives out there. What you should do instead is learn how to use smartmontools, probably via a Linux boot CD (since the main time you need them is when the drive is already toast).

      My usual routine when a drive starts to go bad is to back its data up using dd, run smartmontools to see what errors it's reporting, trigger a self-test and check the errors again, and then launch into the manufacturer's recovery software to see if the problem can be corrected by it. The idea that SpinRite knows more about the drive than the interface provided by SMART and the manufacturer tools is at least ten years obsolete. Also, getting the information into the SMART logs helps if you need to RMA the drive as defective, something SpinRite doesn't help you with.

      Note that the occasional reports you see that SpinRite "fixes" problems are coincidence. If you access a sector on a modern drive that is bad, the drive will often remap it for you from the spares kept around for that purpose. All SpinRite did was access the bad sector, it didn't actually repair anything. This is why you still get these anecdotal "it worked for me" reports related to it--the same thing would have been much better accomplished with a SMART scan.

    8. Re:Heh by SuperTechnoNerd · · Score: 4, Interesting

      You have to interpret the data correctly. Looking at seek error rate and raw read errors tells you if the heads are positioning accurately. Run the drive hard (read/write patterns) and watch the temperature. And of course if you start seeing a non-zero pending or reallocated sector count, you know the end is near. Also watch the spin-up time, which will increase as a drive gets older. (I rarely shut the RAID server down so this is less important.) I have smartd email and text me any time things start to get out of a happy place. I do nightly quick tests and weekly extended tests. SMART is useful - if you're smart about it...
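
A minimal sketch of the "non-zero pending or reallocated count" check described above. It assumes the usual ten-column attribute table printed by `smartctl -A`, with the raw value in the last column; the function name is made up for illustration, and which attributes a given drive reports varies by vendor.

```shell
# Reads `smartctl -A` output on stdin. Prints every sector-health attribute
# (5 = reallocated, 197 = pending, 198 = offline uncorrectable) whose raw
# count is above zero, and returns 1 if any was found.
smart_sector_check() {
    awk '
        $1 == 5 || $1 == 197 || $1 == 198 {
            if ($10 + 0 > 0) { printf "%s raw=%s\n", $2, $10; bad = 1 }
        }
        END { exit bad + 0 }
    '
}

# Example (the device name is a placeholder):
#   smartctl -A /dev/sdX | smart_sector_check || echo "drive needs attention"
```

A clean result here only means the drive has not admitted to anything yet, which is the point several posters make about SMART.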

    9. Re:Heh by Burpmaster · · Score: 4, Informative

      What you want is just 'badblocks -w '.
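
For anyone who has not used it, a hedged sketch of a wrapper around that command. The `-w` test is destructive: it overwrites every block (by default with four patterns, reading each back), so it only makes sense on a drive with nothing on it. The wrapper name, the `DRY_RUN` switch, the log file name and `/dev/sdX` are all assumptions for illustration.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Destructive write test of a whole drive. ERASES the target.
badblocks_write_test() {
    dev=$1
    # Refuse to touch a device that shows up in the mount table.
    if grep -q "^$dev" /proc/mounts 2>/dev/null; then
        echo "refusing: $dev or one of its partitions is mounted" >&2
        return 1
    fi
    # -w write-mode test, -s show progress, -v verbose,
    # -b 4096 use 4 KiB blocks, -o write the list of bad blocks to a file
    run badblocks -wsv -b 4096 -o badblocks.log "$dev"
}

# Example: DRY_RUN=1 badblocks_write_test /dev/sdX
```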

    10. Re:Heh by greg1104 · · Score: 4, Interesting

      SMART is a part of the modern drive's firmware. You can't bypass it. Anyone who tells you otherwise--such as the makers of Spinrite--is lying to you in order to sell a product.

      The quality of SMART implementation varies significantly based on the manufacturer. Anecdotally, I have 3 failed Western Digital drives here that flat out lie about the drive's errors. Running the tool needed to generate an RMA does a full SMART scan of the drive, remaps some bad sectors, and then says everything is good. But it's not--each drive is still broken, in a way the firmware seems downright evasive about. Try to use it again, it doesn't take long until another failure. It does seem like the sole purpose of SMART and its associated utilities on WD drives is to keep people from returning a bad drive, by providing a gatekeeper in that process that never says there's a problem.

      Most of my serious installations avoid WD drives like the plague for this reason. I think that Seagate's drives are probably less reliable overall than WD nowadays. Regardless I prefer them, simply because the firmware is more honest about the errors that do happen. Drives fail and I plan for that. What I can't deal with is drives that fail but don't admit it.

      The reason "RAID edition" firmware is available is to provide a drive that isn't supposed to be as evasive about errors. It may be that some WD RAID edition models don't have the problem I'm describing. I soured on them as a brand before those became mainstream.

    11. Re:Heh by Culture20 · · Score: 4, Informative

      My usual routine when a drive starts to go bad is to back its data up using dd

      ddrescue is the tool for backing up a failing drive unless you really want to manually check every failed sector read then restart a new dd (skipping to the next sector).
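
A rough sketch of the two-pass pattern commonly used with GNU ddrescue: grab everything that reads easily first, then go back for the bad areas, with the mapfile remembering what is still missing between runs. The function name, the `DRY_RUN` switch, the file names and the retry count of 3 are assumptions, not anything the poster specified.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi; }

# usage: rescue_drive <source-device> <image-file> <mapfile>
rescue_drive() {
    src=$1; img=$2; map=$3
    # Pass 1: copy the readable areas quickly; -n skips the slow scraping phase.
    run ddrescue -n "$src" "$img" "$map" &&
    # Pass 2: retry the bad areas; -d uses direct disc access, -r3 makes
    # up to three retry passes over the sectors that failed.
    run ddrescue -d -r3 "$src" "$img" "$map"
}

# Example: DRY_RUN=1 rescue_drive /dev/sdX rescue.img rescue.map
```

Because the mapfile records progress, an interrupted run can be restarted with the same arguments instead of starting over.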

    12. Re:Heh by koinu · · Score: 2

      Why does SMART suck?

      When I watch the SMART values and events I can tell about 3 weeks in advance before a hard drive fails. Also, the manufacturers watch the SMART values to check if a replacement can be offered or if you made some mistake.

      To me SMART does not lie, but reports too much. It reports every replaced sector which is totally unimportant, especially when you buy a new hard drive, you will find faulty sectors in 50% of cases (quite normal). The hard drive with few faulty sectors on day one will function for decades correctly.

    13. Re:Heh by BLKMGK · · Score: 3, Informative

      Not exactly useless... There's a preclear script that many unRAID users use to beat up their drives while monitoring SMART. It doesn't just look at SMART for a thumbs up or down but monitors the various parameters that SMART throws out. Users run this multiple times in a row and find bad drives fairly regularly. I will admit that I've not been running it, but judging from the number of folks who have been finding it useful and from the fact that warranties seem to be getting ever shorter, I may begin doing so. I use a decent number of the 3TB drives that are always going on sale and I'm starting to think I'm tempting fate by not testing them. I've gotten spoiled in that my unRAID box covers my ass in the event of a failure, but I see too damn many reports of new drives going toes up to not be concerned. I have 3 drives sitting on the shelf waiting to be loaded and I may beat them up beforehand just to be sure they won't screw me when I least expect it...

      --
      Build it, Drive it, Improve it! Hybridz.org
    14. Re:Heh by PhunkySchtuff · · Score: 2

      Get enterprise series drives, not consumer drives. One difference is the firmware is a lot more up-front about errors, rather than trying to hide them and carry on as if everything is OK.
      In a RAID, you're going to want to fail a drive as soon as it starts to play up, whereas the average consumer wants a drive that doesn't turn around and die at the first small error, where it can remap sectors and pretend that nothing happened.
      Part of the reason enterprise drives cost more, when they're often the same, or very similar, physical hardware is that the price includes the better warranty...

    15. Re:Heh by thegarbz · · Score: 4, Informative

      No, not SMART. I did a full range of tests with all suites on top of SMART (surface tests, etc.)

      The only HDD tool I trust is the ancient one from GRC.

      That is absolutely laughable. Spinrite is about as good at interfacing with a modern drive as an old 16-bit DOS program is at squeezing every ounce of performance out of a 64-bit processor. It had its purpose in its day. These days running it will likely do more harm than good.

      Not to mention that if your drive is at the end of life running a program that is widely known to give it a most horrendous thrashing is probably not a good idea.

    16. Re:Heh by LordLimecat · · Score: 2

      Not useless, just not a good indicator of a drive NOT being near death. It's a great indicator to confirm that the drive IS dying--if you see, for instance, 500 bad sectors, you may want to prepare to replace that drive.

    17. Re:Heh by washu_k · · Score: 2

      Running spinrite against an SSD is one of the clearest ways of showing that it is complete BS. It will report all sorts of things about the drive that are clearly impossible. It won't error or give no data, it clearly makes things up about the drive.

      Another good BS test for spinrite is to run it against a non-ATA drive that is still BIOS accessible. A booted USB flash drive is the best, but something like a modern SCSI/SAS controller works as well. It's clearly impossible for spinrite to access such a device directly, yet it still reports all sorts of things it simply could not see. No errors or blank data, it again makes shit up and displays it.

    18. Re:Heh by Anonymous Coward · · Score: 2, Interesting

      Agreed. I just recovered a very messed up 120GB drive with gnu ddrescue. It took over 7 days to read, but only lost 300MB of data. Very happy with the results.

    19. Re:Heh by Pentium100 · · Score: 3, Interesting

      MHDD works best for me for testing the drive. Spinrite (and ddrescue) is good for data recovery, but not that good for testing. I had one drive that had a lot of sectors that were good, except that the drive took 10-30 seconds to read them, making the PC extremely slow (Windows would drop to PIO mode and be slow even when reading the good sectors). Chkdsk didn't detect anything, Spinrite didn't detect anything, only MHDD showed lots of slow sectors (I later made a list and manually marked them as bad; getting a 2.5" IDE drive is not that easy or fast, so it will have to do until then).

    20. Re:Heh by Nutria · · Score: 2

      Computer hardware is cheap

      Relative to 10 years ago, but $150 here, $100 there and $75 somewhere else add up for an impoverished college student, or a middle class family with other expenses out the wazoo to pay.

      --
      "I don't know, therefore Aliens" Wafflebox1
    21. Re:Heh by toddestan · · Score: 2

      I wouldn't ignore it. While SMART saying everything is okay doesn't mean much, SMART telling you that there is a problem is a definite reason for concern.

    22. Re:Heh by jon_doh2.0 · · Score: 2

      God, the grammar Nazis are breading on Slashdot. Fuck off!

  2. Re:SSDs by roc97007 · · Score: 5, Insightful

    > Who cares about HDDs anymore these days?

    Anyone with a need for a massive amount of storage space.

    --
    Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
  3. dban followed by smartctl by X0563511 · · Score: 3, Interesting

    If dban can write out every sector and not have smartctl show any pending sectors after the fact (and the average speed of the dban wipe was normal) then you've got good chances the drive will be fine.

    --
    For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    1. Re:dban followed by smartctl by bill_mcgonigle · · Score: 5, Interesting

      Yes, this. I do it online:

      dd if=/dev/zero of=/dev/sdX bs=8M

      and then check smartctl. If I'm making a really big zpool, I fill them up and let ZFS fail out the turkeys:

      dd if=/dev/zero of=/tank/zeros.dd bs=8M
      zpool scrub tank

      If I'm building a 30-drive storage server for a client I'll often see 1-2 fail out. Better to catch them now than when they're deployed (especially with the crap warranties on spinning rust these days). I need to order in staggered lots anyway, so having 10% overhead helps keep things moving along.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    2. Re:dban followed by smartctl by mathew7 · · Score: 2

      Now that I searched for it, it seems I used dd. But anyway, here it is:

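      #!/bin/sh
      # Usage: pass the whole-disk device as $1. The write pass erases it.
      hdparm -i "$1"
      smartctl -a "$1"
      date
      # First full read pass
      time dd if="$1" of=/dev/null bs=1M
      date
      smartctl -a "$1"
      # Write pass: zero every sector
      time dd if=/dev/zero of="$1" bs=1M
      date
      smartctl -a "$1"
      # Second full read pass
      time dd if="$1" of=/dev/null bs=1M
      smartctl -a "$1"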

      I run the script followed by "| tee result.txt". In case you want to change to dd_rescue, bear in mind that it outputs a lot of data (progress) which should not be redirected.

  4. Used to never test by AK+Marc · · Score: 2

    My first help desk job included every computer in the company. We had a server drive fail, so I had Compaq send a replacement. The new arrival didn't work. So then I spent more time looking at RAID configuration and such, but we got a second replacement. That one didn't work either. But I tested it on arrival. The third replacement worked fine, just when I was worried it was something stupid I was missing. Two DOA RMAs for the same part. And yes, that's happened to me again since that first time.

    I test every "used" part as if it's suspect. The question was about new, but they are still new to me.

    1. Re:Used to never test by PlusFiveTroll · · Score: 3, Interesting

      Two DOA of the same part isn't out of the question, a good amount of the time the same part number is from the same batch, which may suffer from the same manufacturing defects. I see things like that pretty often in batches of disks that fall out of RAIDs.

  5. smartmontools by WD · · Score: 5, Informative

    Set up the smartd.conf file to do the example short-test daily and long-test weekly, and email you when something is fishy. It's a trivial amount of effort, resulting in a significant amount of peace of mind. (In many cases, you'll have some amount of warning before your drive kicks the bucket and it's too late)
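
For reference, a line along these lines is roughly what that looks like. It follows the example shipped in the sample smartd.conf; the device name and the mail address are placeholders, and the file's location varies by distribution.

```
# Monitor all attributes, enable automatic offline testing and attribute
# autosave, run a short self-test every day between 2 and 3 am and a long
# self-test on Saturdays between 3 and 4 am, and mail warnings.
/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03) -m admin@example.com
```

smartd has to be restarted (or sent a reload) to pick up the change.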

    1. Re:smartmontools by Deekin_Scalesinger · · Score: 5, Funny

      This should be modded up for your username alone lol

      --
      "As the intrepid kobold companion continues his journey, he begins to wonder... if priests raises dead, why anybody die?
  6. Lifetime of bathtubs by cvtan · · Score: 2

    Old bathtubs lasted longer than old hard drives. Now it's the other way around.

    --
    Sorry, but gray text on gray background is making my eyes bleed.
  7. Yes! Especially before adding them to an array. by Anonymous Coward · · Score: 5, Interesting

    I run some ZFS systems at work. With the current version of the filesystem, you can expand the zpools but you can't shrink them, so adding a bad drive causes immediate problems.

    I've found that some drives are completely functional but write at extremely slow rates: maybe 10% of normal. With typical consumer drives, maybe 1/20 is like this. To ensure I don't put a slow drive into a production zpool array of disks, I always make a small test zpool consisting of just the new batch of drives and stress-test them.

    This catches not only obviously bad drives, but also the slow or otherwise odd ones.
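
One way to spot the slow ones once each drive's write rate has been measured is to compare every drive against the batch median. This is only a sketch: the function name, the input format (one "name rate" pair per line) and the 50% cut-off are assumptions, and the measuring itself (with dd, a test pool, or anything else) is left out.

```shell
# Reads "name MB_per_s" lines on stdin and prints the drives whose rate is
# below a fraction (default 0.5) of the median rate of the batch.
flag_slow_drives() {
    frac=${1:-0.5}
    sort -k2,2n | awk -v frac="$frac" '
        { name[NR] = $1; rate[NR] = $2 }
        END {
            if (NR == 0) exit
            # Median of the sorted rates (mean of the middle pair if NR is even).
            if (NR % 2) { med = rate[(NR + 1) / 2] }
            else        { med = (rate[NR / 2] + rate[NR / 2 + 1]) / 2 }
            for (i = 1; i <= NR; i++)
                if (rate[i] < frac * med)
                    printf "%s %s MB/s (median %s)\n", name[i], rate[i], med
        }
    '
}

# Example:
#   printf 'sda 140\nsdb 135\nsdc 14\nsdd 138\nsde 142\n' | flag_slow_drives
```

Using the median rather than the mean keeps one very slow drive from dragging the reference point down with it.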

  8. Re:Yes, it's happened. by ArchieBunker · · Score: 2

    Sounds like a really old troll.

    --
    Only the State obtains its revenue by coercion. - Murray Rothbard
  9. Re:SSDs by White+Flame · · Score: 3, Insightful

    Not really. People usually don't modify gigantic footprints of data per day, so standard incremental backup strategies are still very applicable. Most of the large data tends to be read-only over time, typically media, archives, large installation files, etc.

  10. Re:Yes. I mean no. by Anonymous Coward · · Score: 2, Funny

    Let me guess... if it sank to the bottom it was a good drive, but if it floated it was a bad drive and needed to be burnt at the stake.

  11. Murphy's Law of Testing by White+Flame · · Score: 2

    Trying to coax an error will never reveal one. Only when you start using it "for real" will the problem manifest.

  12. Re:SSDs by cpghost · · Score: 3, Informative

    Who cares about HDDs anymore these days?

    We do here at work. We need some modest 120+ TB of storage right now, and 30% of that content is highly dynamic (PostgreSQL databases). Anything but data center quality HDD would be silly, not to mention unreliable as hell and heavily expensive. SSDs are just for laptops or so, not for real data storage requirements.

    --
    cpghost at Cordula's Web.
  13. SMART + badblocks by SuperBanana · · Score: 5, Interesting

    I run smartctl and capture the registers, then run badblocks, and compare smartctl's output to the pre-bad-blocks check.

    If there are any remapped blocks, the drive goes back, as the factory should have remapped the initial defects already, and that means new failed blocks in the first few hours of operation.
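
A sketch of that before/after comparison, assuming the two captures were saved from `smartctl -A` and that the drive reports attribute 5 (Reallocated_Sector_Ct) with its raw count in the tenth column. The function and file names are hypothetical.

```shell
# Print the raw value of one SMART attribute from a saved `smartctl -A` capture.
# usage: smart_raw <attribute-id> <capture-file>
smart_raw() {
    awk -v id="$1" '$1 == id { print $10; exit }' "$2"
}

# Succeeds only if the reallocated-sector count did not grow between captures.
# usage: no_new_remaps <before-file> <after-file>
no_new_remaps() {
    before=$(smart_raw 5 "$1")
    after=$(smart_raw 5 "$2")
    echo "Reallocated_Sector_Ct: before=$before after=$after"
    [ "${after:-0}" -le "${before:-0}" ]
}

# Example (device name is a placeholder; badblocks -w erases the drive):
#   smartctl -A /dev/sdX > before.txt
#   badblocks -wsv /dev/sdX
#   smartctl -A /dev/sdX > after.txt
#   no_new_remaps before.txt after.txt || echo "send it back"
```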

    1. Re:SMART + badblocks by rrohbeck · · Score: 2

      That's the right way to do it but manufacturers increasingly don't accept returns for a single or few bad blocks. They say that's acceptable.
      The reason is probably that it's too time consuming to test the entire surface with the high capacities but mostly unchanged transfer rates that we see.
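
The arithmetic behind that is easy to check. A small sketch, assuming decimal units (1 GB = 1000 MB) and a constant sustained rate, which real drives do not quite hold from the outer tracks to the inner ones; the figures in the example are illustrative, not measurements.

```shell
# Hours needed for one sequential pass over a drive.
# usage: pass_hours <capacity in GB> <sustained rate in MB/s>
pass_hours() {
    awk -v gb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / mbps / 3600 }'
}

# Example: a 3 TB drive at 150 MB/s
#   pass_hours 3000 150
```

That is a single read or write pass. A four-pattern `badblocks -w` run writes and reads the whole surface four times each, so it takes roughly eight times as long.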

  14. Re:SSDs by aaarrrgggh · · Score: 3, Insightful

    Rebuild time. It takes our hardware raids about 24 hours to rebuild, and software raids about 72 hours. If the disk failure isn't detected immediately, even with RAID-6 you are pushing your luck.

    RAID is not backup.

  15. Re:SSDs by PlusFiveTroll · · Score: 3, Insightful

    Depending on your definition of reliable and long term, people still use tapes.

  16. My testing methodology by dpidcoe · · Score: 2

    I thoroughly test any new hdd I get for my desktop PC:

    The first thing I do is format it and install windows. If that works, then we know the drive isn't DOA
    From there I torture test it by copying several hundred gigabytes of software and movies, as well as installing some more programs.
    After that, I let it run for a few months, using it normally. If it crashes during that time, then I know it was bad.

  17. Re:SSDs by cpghost · · Score: 4, Interesting

    Actually, our only use for SSDs currently is ZILs (ZFS intent logs), and we're evaluating whether to put PostgreSQL transaction logs on an SSD, but that's another story. Our main storage farm is still HDD-based.

    --
    cpghost at Cordula's Web.
  18. Plug it in by mbone · · Score: 2

    Testing is simple - plug it in, and run it till it fails. Might as well use it in the mean-time.

  19. Re:SSDs by roc97007 · · Score: 2

    At two companies I managed IP libraries (massive amounts of photographs and drawings used in catalogs and advertisements). The data changes only slowly, and (depending on usage) seasonally, so incremental backups are very much practical. But that's not really the issue.

    This is important. Raid protects you from certain kinds of failures, usually limited to the mechanical or electrical failure of a single hard drive. (More protection can be had by nesting raid levels, but for most installations this is the case.) Raid does not protect you from a wide variety of failures including data corruption from a bad controller or application bug, systemic failure of the raid appliance (example: a catastrophic power supply failure taking out multiple drives) operator-induced data loss, either accidental or malicious, or environmental catastrophe. If your data is important, there is still no substitute for backing up your data and sending it to a remote site. Even geosynch won't necessarily help if you're synching bad data to the only remote copy. And, I'm not yet convinced that syncing to "the cloud" is a good idea.

    Mind you, backups don't have to be to tape. I'm a photographer when I'm not a geek, and I typically keep tens of thousands of photographs online on my workstation. As backup to tape, DVD or even blu-ray isn't really practical, I back up to a series of hard drives using one of those plug-in hard drive toasters, then carefully store them elsewhere, disconnected from the computer. Disaster recovery is a set of drives in a safe at a friend's house.

    There are examples where backups aren't necessary. I worked with one array that was essentially a huge cache for 1-800 calls, and a complete wipe would only mean that customers would see a delay on the next call as their particular part of the cache was rebuilt. But for the most part, depending on raid instead of a properly implemented backup solution is a really bad idea.

    --
    Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
  20. Re:SSDs by hairyfeet · · Score: 4, Interesting

    Unless you are using SLC, which is getting harder to find and more expensive every day you are really pushing your luck. The problem is the hot/crazy scale when it comes to these drives, specifically the fact that nobody has figured out how to lick the controller issue. For those that haven't run into it yet (lucky bastards) the controller issue will cause a drive to suddenly fail without ANY warning and unlike how the SSDs are always bragged on to "fail safe" into a read only mode what actually happens is when the controller fails the whole drive is completely dead, it won't even show up in BIOS/UEFI.

    So until somebody figures out how to lick the controller problem (and when they do, the money they make will truly be insane), or comes up with the idea that I have been advocating for years of putting a second, cheaper ARM controller on the board designed to take over as a read-only backup while you get your data out, I'd be seriously leery of trusting any data I cared about to an SSD, not without spinning rust backups at the very least. The controller bug seems to bite every OEM on the ass; I have seen it from Intel to OCZ and it's always the same. Push the button and poof! Data all gone with the drive. And of course since you can't get your data off or even wipe it, you have to hope they don't send it to some third world country for refurb where they help themselves to your data. Because of this I don't think my customers have even used 10% of their warranties for fear of the data falling into the wrong hands: great for the OEMs, which rarely have to make good on warranties, not so good for the customer.

    --
    ACs don't waste your time replying, your posts are never seen by me.
  21. Wrong Approach by nuckfuts · · Score: 4, Insightful

    I've been dealing with hardware failures for 20+ years. What I've learned is that disasters WILL happen, regardless of what preventive measures are in place. So I shifted my focus toward recoverability. To me, the important question is "When something catastrophic happens, how quickly and easily can I put things back in working order?"

    Since I use RAID where appropriate, and more importantly, I am positively fanatic about frequent, full, and tested backups, the only concern I have when a hard drive dies is whether I'm still entitled to a warranty replacement.
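
One cheap way to make "tested" concrete is to record checksums of the source tree and re-check a restored copy against them. This is only a sketch of that idea, not the poster's procedure; it assumes GNU coreutils `sha256sum`, a manifest stored outside the source tree, and hypothetical function and path names.

```shell
# Record a checksum for every file under a source tree.
# usage: make_manifest <source-dir> <manifest-file>
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + ) > "$2"
}

# Re-check a copy of the tree against the manifest.
# Returns non-zero if any file is missing or differs.
# usage: verify_copy <copy-dir> <manifest-file>
verify_copy() {
    ( cd "$1" && sha256sum --quiet -c - ) < "$2"
}

# Example:
#   make_manifest /data /var/backups/data.sha256
#   verify_copy /mnt/restore-test /var/backups/data.sha256 || echo "backup is bad"
```

It proves the bytes came back intact, which is not quite the same as proving the restored system works, but it catches the silent failures.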

  22. Re:never had early failure by hairyfeet · · Score: 2

    Then you, sir, are either the luckiest bastard on the planet or haven't bought any Seagate drives above 500GB, because I've seen so many Seagate 1TB and above drives dead OOTB or very soon after leaving the box that I won't even touch them anymore.

    There is a reason this guy is asking this question: it's because we are now down to just 2 makers of drives and the Seagates are Russian roulette with your data. Most likely he has seen that the new Seagates are selling for as low as $50 a TB online and wants more space but can see all the horror stories in the feedback and wants some way to help mitigate the risk.

    But I'm sorry friend, the only way I've found to mitigate the risk is to avoid Seagate like an STD, even with WD drives often double the price of the Seagate, because while the WDs seem to have about a 1 in 15 failure rate, with the Seagates, depending on the size (1TB-2TB the worst, 3TB better but not great), you are looking at as bad as a 1 in 3 chance of failure. With failures THAT high, which frankly I hadn't seen since the big Maxtor mess of 2002, I just would avoid Seagate for anything I gave a shit about, as it's just not worth the risk.

    --
    ACs don't waste your time replying, your posts are never seen by me.
  23. Don't degauss it to start with by andy+the+engineer · · Score: 2

    On Black Friday I bought a 1 TB drive at Office Depot, and of course they waved the box over their anti-theft degausser. I asked for a different drive and told them that they shouldn't do that with drives. The girl gave me the look we all have seen, but the boy behind her actually agreed with me and they gave me a drive out of the cage and let me leave the store with the alarm blaring. I've just about filled it up already and it's been working fine.

    --
    Jack of all trades, master of some.
  24. Re:SSDs by drsmithy · · Score: 4, Funny

    Holy crap. Twenty 3T spindles in a single array ? What do you do to de-stress ? Run between cars on a highway ?

  25. Try to break the disk before you lose your data by ncw · · Score: 2

    Stress testing hard disks is a particular bugbear of mine, after having some really bad luck with early hard disks. Over the 15 years that I've been doing it I've had to send back loads of hard disks and flash cards because they failed my tests, either breaking completely or returning single bit errors in your data. Mostly the manufacturers will take disks back if you can get their stupid Windows program to return an error code. Sometimes it takes a bit of arguing but ultimately the manufacturers want to keep you happy. Flash disks with single bit errors are the hardest to send back in my experience.

    Here is the latest generation of my stress testing code (re-written in Go recently): https://github.com/ncw/stressdisk

    (Interestingly the stressdisk program sometimes finds bad ram in your computer too!)

    I generally thrash every new hard disk or memory card for 24 hours to see if I can break it before trusting any data to it!
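
The basic idea of a write-then-verify test can be sketched in a few lines of shell. This is not how stressdisk is implemented, just an illustration of the principle; the function name and paths are made up, and `sha256sum` is assumed to be available.

```shell
# Write <size-MiB> of random data to <path>, read it back, compare checksums.
# usage: write_verify <path> <size-MiB>
write_verify() {
    path=$1; mib=$2
    # Checksum the data on its way to the disk...
    written=$(dd if=/dev/urandom bs=1048576 count="$mib" 2>/dev/null |
              tee "$path" | sha256sum | cut -d' ' -f1)
    sync
    # ...and again when reading it back.
    readback=$(sha256sum < "$path" | cut -d' ' -f1)
    if [ "$written" = "$readback" ]; then
        echo "PASS $path"
    else
        echo "FAIL $path"
        return 1
    fi
}

# Example: write_verify /mnt/newdisk/pattern.bin 1024
```

One caveat: unless the file is much larger than RAM, or the page cache is dropped between the two steps, the read-back is likely to be served from memory rather than from the platters, so a small run mostly tests the plumbing.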

    I also run a long smart test too.

    Somewhat paranoid, yes, but I really, really hate losing data!

    --
    Every man for himself, all in favour say "I"
  26. Re:SSDs by dinfinity · · Score: 2

    Please. Quoting Jeff Atwood as an authoritative source on SSDs?
    Some anecdotal evidence and a subsequent admission of buying from the brand known for the highest failure rate in SSDs isn't going to convince anyone.
    I'd like to see some proper statistics before I believe anything you say.

    The most reliable statistics I've seen show SSDs performing as well as or better than HDDs when it comes to failing. I haven't seen any statistics on what percentage of failing drives did so spontaneously, completely, without warning and without any possibility for repair.

    Mind you, I'm not claiming they don't. Just that I haven't seen any evidence beyond some anecdotes. And well, anybody that trusts a single drive with important data is an idiot or ignorant anyway.

  27. Re:SSDs by war4peace · · Score: 2

    Exactly this.
    I know a (very large) Data Center belonging to a (very large) company which started replacing their HDDs with SSDs. The price difference isn't even that large; price-per-GB for a server-grade 15K RPM SAS was negligibly close to SSD price. And the advantages are really there: (much) lower heat produced, less noise, less space taken, less energy consumed. Even with a similar failure rate, the advantages are there.

    --
    ...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
  28. Re:never had early failure by Wolfrider · · Score: 2

    --If I were you, I would look into the following:

    o Test all drives before putting them into production - either with SMART long test, or linux 'badblocks'

    o Cooling - is it adequate enough?

    o Powerful enough Power supply ++ UPS (essential these days)

    o Mount all drives with "noatime" option in Linux, or in XP and later:
    ' fsutil behavior set disablelastaccess 1 ' and reboot

    o Spin down all HDs when not in use.

    --I do all of the above, and my drives last for years and years. Just sayin'
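
For the Linux side of that list, a small sketch that checks whether a mount point already has `noatime`, with the remount and spin-down commands shown as comments. The function name, `/mnt/data` and `/dev/sdX` are placeholders; the spin-down timeout of 120 is an arbitrary choice.

```shell
# Reads a mount table in /proc/mounts format on stdin and succeeds if the
# given mount point carries the noatime option.
# usage: has_noatime <mount-point> < /proc/mounts
has_noatime() {
    awk -v mp="$1" '
        $2 == mp {
            found = 1
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++) if (opt[i] == "noatime") ok = 1
        }
        END { exit !(found && ok) }
    '
}

# Example:
#   has_noatime /mnt/data < /proc/mounts || mount -o remount,noatime /mnt/data
#   hdparm -S 120 /dev/sdX    # spin down after 10 idle minutes (120 x 5 s)
```

To make `noatime` survive a reboot it also has to go into the options column of the matching /etc/fstab entry.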

    --
    .
    == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??