Slashdot Mirror


Why RAID 5 Stops Working In 2009

Lally Singh recommends a ZDNet piece predicting the imminent demise of RAID 5, noting that increasing storage and non-decreasing probability of disk failure will collide in a year or so. This reader adds, "Apparently, RAID 6 isn't far behind. I'll keep the ZFS plug short. Go ZFS. There, that was it." "Disk drive capacities double every 18-24 months. We have 1 TB drives now, and in 2009 we'll have 2 TB drives. With a 7-drive RAID 5 disk failure, you'll have 6 remaining 2 TB drives. As the RAID controller is busily reading through those 6 disks to reconstruct the data from the failed drive, it is almost certain it will see an [unrecoverable read error]. So the read fails ... The message 'we can't read this RAID volume' travels up the chain of command until an error message is presented on the screen. 12 TB of your carefully protected — you thought! — data is gone. Oh, you didn't back it up to tape? Bummer!"
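
For readers who want to check the rebuild arithmetic, here is a rough sketch. It assumes a spec-sheet rate of one unrecoverable read error per 10^14 bits read (a figure commonly quoted for consumer SATA drives, not stated in the summary above) and treats errors as independent per bit, which real drives do not strictly obey. The function name and parameters are made up for illustration.

```python
import math

def p_rebuild_hits_ure(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error (URE) while
    reading every bit of the surviving drives once during a rebuild.

    ure_per_bit is an ASSUMED spec-sheet rate (1 error per 1e14 bits);
    errors are ASSUMED independent from bit to bit.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8  # decimal TB -> bits
    # 1 - (1 - p)^n, written with log1p/expm1 so the tiny p is not lost
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Six surviving 2 TB drives (the scenario in the summary)
print(round(p_rebuild_hits_ure(6, 2.0), 2))
# Six surviving 1 TB drives (a 7-drive array of today's disks)
print(round(p_rebuild_hits_ure(6, 1.0), 2))
```

Under these assumptions the first case comes out a little above 60% and the second a little under 40%: short of certainty, but far too high for data you care about. A drive that beats its spec sheet, or errors that cluster rather than spread evenly, would shift the numbers.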

803 comments

  1. Carefully protected? by Whiney+Mac+Fanboy · · Score: 5, Insightful

    12 TB of your carefully protected - you thought! - data is gone. Oh, you didn't back it up to tape? Bummer!

    If it wasn't backed up to an offsite location, then it wasn't carefully protected.

    --
    There are shills on slashdot. Apparently, I'm one of them.
    1. Re:Carefully protected? by rhathar · · Score: 3, Interesting

      "Safe" production data should be in a SAN environment anyways. RAID 5 on top of RAID 10 with nightly replays/screenshots and multi-tiered read/writes over an array of disks.

      --
      http://www.chaotickingdoms.com
    2. Re:Carefully protected? by SatanicPuppy · · Score: 5, Insightful

      Yea, because we all backup 12TB of home data to an offsite location. Mine is my private evil island, and I've bioengineered flying death monkeys to carry the tapes for me. They make 11 trips a day. I'm hoping for 12 trips with the next generation of monkeys, but they're starting to want coffee breaks.

      I'm sorry, but I'm getting seriously tired of people looking down from the pedestal of how it "ought" to be done, how you do it at work, how you would do it if you had 20k to blow on a backup solution, and trying to apply that to the home user. Even the tape comment in the summary is horseshit, because even exceptionally savvy home users are not going to pay for a tape drive and enough tapes to archive serious data, much less handle shipping the backups offsite professionally.

      This is serious news. As it stands, the home user that actually sets up a RAID 5 raid is in the top percentile for actually giving a crap about home data. Once that becomes a non-issue, then the point has come when a reasonable backup is out of reach of 99% of private individuals. This, at the same time as more and more people are actually needing a decent solution.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    3. Re:Carefully protected? by networkBoy · · Score: 2, Interesting

      True.
      Also FWIW I only run RAID 1 and JBOD.
      For things that must be on-line, or are destined for JBOD but not yet archived to backup media, they are located on one of the RAID volumes. For everything else it's off to JBOD, where things are better than RAID5

      Why?

      I have 6 TB of JBOD storage and 600 GB (2x300 GB volumes) of RAID 1. If I striped the JBOD into 6TB (7 drives) and one drive failed, all the near-line data would be virtually off-line (and certainly read-only) while the array re-built. With JBOD, should a disk fail, I pop in a replacement, grab the stack of DVDs from the local backup, and plug the data back in. All the other near-line data is still available, and it honestly takes about the same amount of effort and time as re-building a stripe set w/ parity. Never mind that I've had a read error on rebuilds before and had to re-do the entire array from scratch anyway.

      While my system would not work in an environment where the files on the JBOD change often, they are basically .archive anyway, so handling them by way of staging on RAID1 pending copy to DVD and storing on JBOD works fine.

      Naturally this system also really gives an incentive to keep up on the backups, with no false sense of security of having files on a RAID5...
      -nB

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    4. Re:Carefully protected? by networkBoy · · Score: 3, Informative

      you know the other solution is to not use RAID5 with these big drives, or to go to RAID1, or to actually back up the data you want to save to DVD and accept a disk failure will cost you the rest.

      Now, while 1TB onto DVDs seems like quite a chore (and I'll admit it's not trivial), some level of data staging can help out immensely, as well as incrementally backing up files, not trying to actually get a full drive snapshot.

      Say you backup like this:
      my pictures as of 21oct2008
      my documents (except pictures and videos) as of 22 oct2008
      etc.
      while you will still lose data in a disk failure, your loss can be mitigated, especially if you only try to backup what is important. With digital cameras I would argue that home movies and pictures are the two biggest data consumers that people couldn't backup to a single dvd and that they would be genuinely distressed to lose.
      -nB

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    5. Re:Carefully protected? by Linux_ho · · Score: 1

      I think I saw a 1TB external USB drive for $180 the other day. Off-site doesn't need to be difficult or expensive, and it's worth the effort if you care about your data.

      --
      include $sig;
      1;
    6. Re:Carefully protected? by Anonymous Coward · · Score: 1, Interesting

      If I striped the JBOD into 6TB (7 drives) and one drive failed all the near-line data would be virtually off-line (and certainly read-only) while the array re-built.

      What kind of crappy raid array is that? Better raid arrays will model & predict performance under degraded conditions like failure & rebuilding. They certainly don't stop or go read-only during a rebuild.

      When tour groups of non-tech people come by the server room, I used to emphasize reliability by pulling a hard disk out of a running server, hand it to them, and then put it back in the server. The server doesn't skip a beat (and these were common off-the-shelf Dell rackmount servers costing $2,500 or so).

      Aside from automated alarms paging some IT people, no one would notice.

    7. Re:Carefully protected? by Whiney+Mac+Fanboy · · Score: 4, Insightful

      Oh come on. Do you have 12TB of home data? Seriously? And if you do, it's not that hard to have another 12TB of external USB drives at some relative's place.

      I've got about 500GB of data that I care about at home & the whole lot's backed up onto a terabyte external HDD at my Dad's. It's not that hard.

      If you think raid is protecting your data, you're crazy.

      --
      There are shills on slashdot. Apparently, I'm one of them.
    8. Re:Carefully protected? by cbreaker · · Score: 1

      I'm with you on this, but the problem with many RAID5 systems is that you usually purchase all of the drives at once, so it increases the likelihood of a double-disk failure since all the drives are the same age.

      Mind you, I've worked in IT for over 15 years and I've never had the fortune to experience a double disk failure.

      On my own system at home, I have three RAID5 sets on two servers with hardware raid SATA controllers (Accusys controllers - they're really nice!) If I were to experience a disk failure, I'll turn off the server and go get a replacement disk before turning it back on.

      I also replicate all of the important data to a friend; we both have file servers, we both run Server 2003 R2 (Well, I run 2008 now) with DFS Replication, and we use an OpenVPN tunnel between us. This way, even if we had a bad disk failure we'll be okay.

      In reality, you should have backups of critical data because I'm much more afraid of a fire or flood than a double-disk failure.

      --
      - It's not the Macs I hate. It's Digg users. -
    9. Re:Carefully protected? by Urza9814 · · Score: 1

      Um, they said RAID5 is dead. What about, say, RAID 1? That's still good for backing up. Or do as I do and have your OS run a nightly backup on to a different hard drive. I have 2 500GB drives. One I use, and one is so that Mandriva can record the changes every night.

    10. Re:Carefully protected? by SatanicPuppy · · Score: 4, Insightful

      Yea, but DVD is transient crap. How long will those last? A few years? You cannot rely on home-burned optical media for long term storage, and while burning 12 terabytes of information on to one set of 1446 dvds (double layer) may not seem like a big deal, having to do it every three years for the rest of your life is bound to get old.
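
      The disc count above checks out if you take a terabyte as 1024 GB and a double-layer DVD as 8.5 GB. Both conventions are inferred from the number quoted, not stated in the post, and the helper name is made up for illustration:

```python
import math

def discs_needed(data_gb, disc_gb):
    """Whole discs required to hold data_gb, at disc_gb per disc."""
    return math.ceil(data_gb / disc_gb)

# 12 TB taken as 12 * 1024 GB, on 8.5 GB double-layer DVDs (ASSUMED conventions)
print(discs_needed(12 * 1024, 8.5))  # 1446
```

      Swap in 4.7 for single-layer discs and the stack gets close to twice as tall.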

      For any serious storage you need magnetic media, and though we all hate tape, 5 year old tape is about a million times more reliable than a hard drive that hasn't been plugged in in 5 years.

      So either you need tape in the sort of quantity that the private user cannot justify, or you're going to have to spring for a hefty RAID and arrange for another one like it as a backup. Offsite if you're lucky, but it's probably just going to be out in your garage/basement/tool shed.

      Now, what do you do if you can't rely on RAID? No other storage is as reliable and cheap as the hard drive. ZFS and RAID-Z may solve the problem, but they may not...You can still have failures, and as hard disk sizes increase, the amount of data jeopardized by a single failure increases as well.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    11. Re:Carefully protected? by blahplusplus · · Score: 1

      "This is serious news. As it stands, the home user that actually sets up a RAID 5 raid is in the top percentile for actually giving a crap about home data. Once that becomes a non-issue, then the point has come when a reasonable backup is out of reach of 99% of private individuals"

      This is why they made DVD, Blu-ray, USB thumb drives, and flash memory (flash is getting mighty big now, with 16-32GB SDHC cards); then there are portable hard drives that connect via USB/FireWire, and personal SAN solutions are starting to appear. But it's highly likely that within the next 5 years flash technology will make a lot of talk about backup moot, since online services can take care of storing pictures and videos (usually the biggest portion), which leaves the user to back up just his higher-quality pics and videos (should he/she be savvy enough to begin with). I wouldn't be surprised if most people use sharing sites like Google, Flickr, Photobucket, and YouTube to store their videos/pictures and don't worry about having to back them up. Much of what people have is disposable or highly redundant and easy to redownload.

      Most important data people want to backup is not a lot of data, unless you're talking about pictures and video and this could be done with a lot less if they were savvy about it. But... the truth of the matter is things will get better as the generations go by, as more and more people grow up with technology there will be less and less of a learning curve over time.

    12. Re:Carefully protected? by HTH+NE1 · · Score: 1

      Oh come on. Do you have 12TB of home data? Seriously? And if you do, it's not that hard to have another 12TB of external USB drives at some relative's place.

      Only if you've established the habit of buying your drives in pairs while you were amassing your data.

      Sometimes having an expensive computer system doesn't mean you have money; it means you had money.

      --
      Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
    13. Re:Carefully protected? by Bandman · · Score: 1

      This weakness is really only an issue once you start getting towards an immense amount of data. As the blog entry states, one failure is statistically likely during 14TB of data recovery (not storage volume, data volume). So 7 1TB drives in a full RAID5 array have a 50% chance of failure during recovery if one of the drives were to fail.

      Home users won't be affected, at least at this point. I imagine that by the time it does become an issue, manufacturing tech will have caught up and it'll be 14PB.

      I did cover this earlier today, too.

    14. Re:Carefully protected? by Ephemeriis · · Score: 1

      I'm sorry, but I'm getting seriously tired of people looking down from the pedestal of how it "ought" to be done, how you do it at work, how you would do it if you had 20k to blow on a backup solution, and trying to apply that to the home user. Even the tape comment in the summary is horseshit, because even exceptionally savvy home users are not going to pay for a tape drive and enough tapes to archive serious data, much less handle shipping the backups offsite professionally.

      This is serious news. As it stands, the home user that actually sets up a RAID 5 raid is in the top percentile for actually giving a crap about home data. Once that becomes a non-issue, then the point has come when a reasonable backup is out of reach of 99% of private individuals. This, at the same time as more and more people are actually needing a decent solution.

      First off, I have a hard time seeing a home user coming up with 12 TB of data anytime soon. Sure, it's possible... But I find it unlikely. I also find it unlikely that your average home user is going to be savvy enough to even know what a RAID-5 is. We've got plenty of home users who've been keeping all their precious family photos on a single SD card with absolutely no backup at all.

      And if your home user does actually have 12 TB of data, in a RAID-5, that's not going to be a cheap pile of hardware. If they've got that kind of money to spend on storage, why wouldn't they spend some money to actually protect that data?

      Next up, a RAID (1, 5, 10, various combinations there-of) does not protect you from the single biggest threat to your data - user error. A RAID will not make it any easier to recover data that you've accidentally deleted.

      Accidentally delete a pile of Timmy's graduation photos? Tough. Even if you've got a working RAID-5 you aren't going to have an easy time getting them back.

      How 'bout your house burns down with your RAID-5 in it? Your data is still toast, even if it was on a RAID.

      Anyone who cares about their data - home user or business - needs an offsite copy of it. Doesn't need to be a tape... You could dump it to a NAS and then unplug the thing and stick it in a safe deposit box. Or you can print everything out and mail it to your uncle. Or you can burn a pile of DVDs and hide them throughout the woods. But unless you've got your data offsite it is not protected.

      As far as getting a backup working... These days it's pretty damn trivial.

      If you don't have a ton of data you can just burn it to DVDs easily enough.

      If you've got more data you could get yourself an external HDD, or a few USB flash drives, or a cheap NAS and dump the data to it.

      You can also get an LTO1 drive for about $1,000 if you really wanted to go with a tape.

      --
      "Work is the curse of the drinking classes." -Oscar Wilde
    15. Re:Carefully protected? by sholsinger · · Score: 5, Funny

      Next they'll want to unionize. At that point you've lost everything.

    16. Re:Carefully protected? by DrVxD · · Score: 4, Funny

      Oh come on. Do you have 12TB of home data? Seriously? And if you do, it's not that hard to have another 12TB of external USB drives at some relative's place.

      Not all of us have relatives, you insensitive...[URE]

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    17. Re:Carefully protected? by Bandman · · Score: 1

      Nice shock value, but I'm not sure I'd trust to fate to rebuild the array every time. The risk/reward ratio isn't in balance to me.

    18. Re:Carefully protected? by Ucklak · · Score: 1

      The CD format has been here for 27 years with no signs of leaving soon.
      I suspect that the DVD format will be the same.

      Your average tape drive for a home user lasts about 3 years before a newer, incompatible version comes along to replace it.
      I have never seen a commercial tape drive last longer than 7 years.
      YMMV

      --
      if you steal from one source, that is plagiarism, if you steal from many, well, that's just research.
    19. Re:Carefully protected? by SatanicPuppy · · Score: 1

      I actually have an RRRAID...A redundant redundant redundant array of inexpensive disks. I may lose 1 raid. I may even lose 2. But I probably won't lose 3. But that solution is WAY out of reach for the average consumer, and is only possible for me because the amount of data I have on hand doesn't change very quickly.

      Even 1TB is a problem, and that is within the reach of consumers these days. And if you think your external HDD is protecting your data, you're crazy. The failure on those is single point, and that's more likely on an external drive that gets moved around than on any internal drive. Beyond that, I'm sure your rotational policy is lax; everyone's is, so what you're really saying is you have some of your data backed up. Depending on how often you back up, you may only lose a month or two.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    20. Re:Carefully protected? by Free+the+Cowards · · Score: 1

      Buying a computer system you cannot afford to properly use is crazy. Yes, some people are crazy, and those crazy people are going to lose data, but there's no sense in defending it.

      --
      If you mod me Overrated, you are admitting that you have no penis.
    21. Re:Carefully protected? by Vu1turEMaN · · Score: 0

      What would you recommend then for a 30-user non-profit business with a 136GB SCSI drive and tape backups they have been running (but not realizing that they weren't completely backing up everything)? That's the crap I just inherited...I was thinking of moving to 2 SATA drives in RAID 1 (the SCSI drive sounds like it's gonna die soon) and fixing the bloated folders so that tape backups will work again. Differential appended tape backups M-R and a normal backup on Friday after the workday seems like the best idea.

    22. Re:Carefully protected? by MBCook · · Score: 2, Insightful

      Good points. While magnetic media is problematic, SSDs are going to become a very viable option for the home backup (compared to stacks of DVDs or the possible reliability of old magnetic HDs).

      --
      Comment forecast: Bits of genius surrounded by a sea of mediocrity.
    23. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Why are the only options for the home user RAID or DVD?

      I back up critical documents at home to an external hard drive once a week, and the whole system (minus documents, which are stored on a separate partition) about once a month. The rest of the time, the external hard drive is off, saving wear and tear on the drive.

      If a document is really, really important, then I'll also put it on a DVD, on a secure server, or on the flash drive that I keep in my lockbox with my important documents.

      I'll admit that this probably works best for me because I only work on documents on my home computer a couple of days a week (and the rest of the time I mostly use it for web browsing), but I think it's worth considering, and takes a lot less time than trying to burn large documents to DVD.

    24. Re:Carefully protected? by Anonymous Coward · · Score: 0

      This article is stupid. RAID was never intended as permanent storage of digital assets since its inception in 1987. Also, drive capacities don't matter when read errors occur, which is a natural effect of a media error (that in most cases can be recovered). Sure, individual drive capacities do increase over time; however, spinning (mechanical) disks will always be prone to failure. A more intelligible article might point out that the storage industry is moving more to SSDs (Solid State Disks) by next year. There are no moving parts whatsoever, so MTBF will increase big time and RAID volumes will stay up longer, since mechanical failure, the number 1 issue affecting spinning disks today, is exactly what SSDs mitigate.

    25. Re:Carefully protected? by Carrion+Creeper · · Score: 1

      My semi-cheap fire/water safe with a disconnected backup drive inside beats a RAID array any day.

      The fire safe is not that complicated, and I can put in and take out mismatched drives any time without rebuilding.

      Also the fire safe is much harder to carry away, not prone to power surges, and much less likely to catch on fire.

    26. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I had a 6TB RAID 5 array, 3 drives failed (not 2, 3) and I lost it all.

      I now have a RAID 6 array with a hot spare. Scripts back up the *important* 100GB or so. The rest can always be downloaded again (so in a way I do have an offsite backup)

      Tapes? Bluray is cheaper...

    27. Re:Carefully protected? by Fulcrum+of+Evil · · Score: 2, Informative

      Read the post again - he said that home burned DVDs are good for 3 years, tops. This is called media life.

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
    28. Re:Carefully protected? by MadMorf · · Score: 1

      Mind you, I've worked in IT for over 15 years and I've never had the fortune to experience a double disk failure.

      Lucky you.

      I work as a TSE for one of the major storage companies. I see double disk failures at least weekly, usually more often.

      This is where the difference between a home-brewed solution and a commercial solution comes in.

      Two and a half years of working this kind of case at least weekly and I've never lost any data, because we have the tools to deal with these issues.

      So, ideally, mirrored raid for availability. Snapshots to protect against deletion and off-site replication for DR.

      No tapes needed.

    29. Re:Carefully protected? by grahamd0 · · Score: 2, Insightful

      Yea, but DVD is transient crap. How long will those last?

      But DVD is *cheap* transient crap, and perfectly adequate for home backups.

      I've got something in the area of 200GB of data on the machine which I'm currently using to type this, but very little of that data has any intrinsic or sentimental value to me. Most of it is applications and games that could easily be reinstalled from the original media or re-downloaded. A DVD or two could easily hold all of the data I *need* and even cheap optical media will outlive this machine's usefulness.

    30. Re:Carefully protected? by poot_rootbeer · · Score: 1

      The CD format has been here for 27 years with no signs of leaving soon.

      And the media of some of those commercial CD releases from the early- to mid-1980s has degraded to the point where the music on them can't be listened to anymore.

      You don't even want to know the problems I've had trying to get data off of CD-Rs I've burned, and the oldest of those is from "only" 1999.

      CD/DVD drives will be around for another 20 years, but what good are they if your archival media succumbed to bit rot years ago?

    31. Re:Carefully protected? by aaarrrgggh · · Score: 1

      I think the real problem is that the scale of small business data storage will quickly outpace the capability of RAID5.

      But the real issue to the summary is that a 6+1 array is less reliable than a single drive. True for almost anything, including the UPS.

    32. Re:Carefully protected? by SatanicPuppy · · Score: 4, Informative

      I've got a mainframe circa 1984 that's been using the same type of drive since 1989. Last year we pulled all the year-end financial numbers off the yearly backups dating back to that point. Zero failed tapes.

      Consumer-grade CDs and DVDs use a photosensitive dye to record information. It can degrade in anywhere between 2 to 5 years...Longer if you keep it in a cool dark place, but not 20 years.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    33. Re:Carefully protected? by Hadlock · · Score: 4, Interesting

      I can't vouch for DVD-R but I have el-cheapo store brand CD-Rs that I backed up my MP3 collection to 11 years ago and they work just fine. My solution is this:
       
      Back everything up that's not media (mp3/video) every 6 months to CD-R, and once a year, copy all my old data onto a new hard drive that's 20+% larger than the one I bought last year and unplug the old one. I have 11 old hard drives sitting in the closet should I ever need that data, and the likelihood of a hard drive failing in the first year (after the first 30 days) is phenomenally low. Any document that I CAN'T lose between now and the next CD-R backup goes on a thumb drive or its own CD-R and/or I email it to myself.

      --
      moox. for a new generation.
    34. Re:Carefully protected? by WhatAmIDoingHere · · Score: 1

      Within reach of consumers, sure, but show me a consumer who filled his Dell's 120 gig hard drive with important (non-porn) documents. My Mother's computer has a 160 gig hard drive in it. She has less than 64 megs of important data she needs to keep backed up, so she uses a thumbdrive that she stores in a fire safe. I'm thinking that's closer to the average consumer than a 1TB drive full of "I CAN'T LOSE THIS" data.

      --
      Not a Twitter sockpuppet... but I wish I was.
    35. Re:Carefully protected? by WhatAmIDoingHere · · Score: 5, Insightful

      RAID is NOT a back-up solution. RAID is a "oh shit my hard drive failed" solution.

      --
      Not a Twitter sockpuppet... but I wish I was.
    36. Re:Carefully protected? by Hadlock · · Score: 1

      Was it a FW800 thumb drive? I'm not going to do the math but even over USB 2.0 sustained you're looking at 300 Mb/s (which is probably above the drive's write speed unless you have a RAID array of thumb drives)
       
      looking at that link it looks like it's about 10 min/1.5GB, or 11 hours (roughly)
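
      For anyone who does want to do the math, a quick sketch using the 10 minutes per 1.5 GB rate quoted above. The 100 GB total is a guess picked because it lands near the 11-hour figure (the actual size isn't stated in this post), and the function name is made up:

```python
def copy_hours(size_gb, gb_per_chunk=1.5, minutes_per_chunk=10.0):
    """Hours to copy size_gb at a steady rate of gb_per_chunk every
    minutes_per_chunk minutes (rate taken from the post above)."""
    return size_gb / gb_per_chunk * minutes_per_chunk / 60.0

# 100 GB is a HYPOTHETICAL total, not a number from the thread
print(round(copy_hours(100), 1))  # 11.1
```

      At that rate a full terabyte would take roughly ten times as long, so several days of continuous copying.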

      --
      moox. for a new generation.
    37. Re:Carefully protected? by SatanicPuppy · · Score: 1

      If you've got an HD camcorder you can fill that up with three hours of video. I know people whose iPods have that much data on them.

      I'm not saying everyone has multiple TBs of info lying around but 1TB isn't ridiculous these days, and 1TB is pretty much impossible for joe user to back up without using another hard drive.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    38. Re:Carefully protected? by WhatAmIDoingHere · · Score: 3, Funny

      "Or you can burn a pile of DVDs and hide them throughout the woods."

      Patent that right NOW, I think we've got a winner to replace RAID-5.

      --
      Not a Twitter sockpuppet... but I wish I was.
    39. Re:Carefully protected? by Facegarden · · Score: 5, Funny

      Buying a computer system you cannot afford to properly use is crazy. Yes, some people are crazy, and those crazy people are going to lose data, but there's no sense in defending it.

      Well, i guess i'm crazy, i have 3TB of space on my home PC, and no way to back it all up offsite. I do have some important folders from one drive automatically copy to another drive periodically, so if one drive dies the other will be okay, but if i lose them both or the place burns down or i get a nasty virus, it's all going to hell.
      Most of my space is taken up by pirated... err... backed up... HD movies. And porn, lots of porn.
      Either way, i'm not too worried if i lose that, it's just the things i back up i really care about.
      The thing is, i was going to RAID 3 of the drives into a secure 1TB array, but now i hear all these issues with RAID and i worry that it may be WORSE than just copying over the files periodically. I want a DROBO but those are expensive as hell.

      This article has inspired me to look into Tape Backup but i worry that it's not cost effective (i haven't looked yet).

      I should fill up some tapes with a few hundred gigs of porn, write "confidential" on them, and stash them in a bag, under some bush, across the street from HP near my apartment. I'm sure some curious person would come looking, only to discover their contents and wonder why the hell someone went to all that trouble....

      God i'm strange.
      -Taylor

      --
      Worldwide Military budgets: $2100 billion. Worldwide Space Exploration budgets: $38 billion. Really, world? Really?
    40. Re:Carefully protected? by WhatAmIDoingHere · · Score: 0, Redundant

      A "Redundant Arrays of Inexpensive Disks" array?

      I'd like to buy one of those, do you mind if I stop at the automated teller machine machine first? Let me just type in my personal identification number number...

      --
      Not a Twitter sockpuppet... but I wish I was.
    41. Re:Carefully protected? by Wesley+Felter · · Score: 4, Insightful

      SSDs are going to become a very viable option for the home backup

      Yeah, I love paying much more for my backup than for my primary storage.

    42. Re:Carefully protected? by quarkscat · · Score: 1

      Anyone that relies on tape for backup of a RAID array of any modern size (TeraBytes) has got to have A HUGE backup window, as well as (likely) a Tape Jukebox. Considering what Tape Jukeboxes cost, as well as the tape Media costs, setting up a newer, faster, denser RAID array for backups is far cheaper AND faster. If you think that the MTBF for modern disk drives are problematic, consider the complex mechanical structure of a Tape Jukebox (or ANY jukebox!). There are two excellent filesystems that handle virtualization and hot on-line backup of RAID array data: SGI's XFS and SUN's ZFS. Between the use of hot standby spares (disks), redundant RAID array controllers, robust filesystems, and off-site backups, there is no reason for Tape Jukeboxes or Tape Backups anymore. Moore's Law applies better to disk densities (and MTBF) than to tape densities and Jukeboxes: Better / Faster / Cheaper WINS.

    43. Re:Carefully protected? by WhatAmIDoingHere · · Score: 1

      Yeah, because Joe The Plumber needs to record a 3 hour epic HD movie?

      He backs his quicken info, his turbo tax info, and maybe a couple megs of pictures.

      That's the thing that bugs me with slashdot, people always think that the average user is like them. The average user still has a VCR that blinks 12:00.

      --
      Not a Twitter sockpuppet... but I wish I was.
    44. Re:Carefully protected? by Nimey · · Score: 1

      You can just periodically burn the most critical data to a DVD set and store it in a safe-deposit box at your bank with your valuable papers. That's off-site enough if your area's not prone to earthquakes or nuclear warfare.

      --
      Hail Eris, full of mischief...

      E pluribus sanguinem
    45. Re:Carefully protected? by rtechie · · Score: 1

      First off, I have a hard time seeing a home user coming up with 12 TB of data anytime soon.

      How about 250 GB of data? That's the size of entry-level HDs nowadays. Burn it to 15 DVDs?

      Next up, a RAID (1, 5, 10, various combinations there-of) does not protect you from the single biggest threat to your data - user error.

      The extra size and speed granted by a RAID 5 or 10 array would make it easier to use undelete software on top of Windows. I know this from personal experience.

      But you're basically right, all RAID does is protect you from hard drive failure. Not your house catching on fire or your desktop getting stolen.

      You could dump it to a NAS and then unplug the thing and stick it in a safe deposit box. Or you can print everything out and mail it to your uncle. Or you can burn a pile of DVDs and hide them throughout the woods. But unless you've got your data offsite it is not protected.

      None of these stupid suggestions are online, so they're useless. For home users, the ONLY realistic offsite backup solution is online backup over the internet. And it's expensive and time-consuming for large amounts of data.

      If you've got more data you could get yourself an external HDD, or a few USB flash drives, or a cheap NAS and dump the data to it.

      Backing up to an external HDD or NAS is backing up hard disks with other hard disks at the same location. I fail to see how this is significantly different from RAID (except slower).

      RAID + as much internet backup as you can afford is THE solution for home users.

    46. Re:Carefully protected? by Anonymous Coward · · Score: 0

      raid is NOT backup

    47. Re:Carefully protected? by Martin+Blank · · Score: 1

      My experience with hard drives is that they generally either fail fast (like in the first 30 days) or after a few years. I've only had a small fraction fail in between. I tend to not trust a drive completely until it's been operational for at least a couple of months.

      --
      You can never go home again... but I guess you can shop there.
    48. Re:Carefully protected? by street+struttin' · · Score: 1

      Here's a decent solution. Back up your taxes, email, and budget files. Say fuck-all to your porn and if your 10 TB of necro-beastiality or whatever goes missing, you'll just have to download it all again from whatever depraved place you would get such a thing. Not that I would know....

    49. Re:Carefully protected? by nzgeek · · Score: 1

      Unfortunately tape is NOT that reliable.

      I can't find the link now, but I remember reading that NASA had so much information on old tape that they couldn't read it out of the tapes fast enough to completely copy all the tapes before the tapes reached the end of their usable lives, or fell foul of mold/rot.

    50. Re:Carefully protected? by ChrisA90278 · · Score: 1

      "Yea, because we all backup 12TB of home data to an offsite location."

      If you really had 12TB of data you cared about, of course you'd back it up. When home users get to the point of having 12TB of data and can afford the 12TB disk, then they will be able to afford the 12TB backup drives. After all, a pair of backups only triples the total cost.

      Today I have 1TB of data at home. Each back up only costs $150. I can afford to own several of these backup drives. I rotate them so as to never over write my last good backup.
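      The rotation idea can be sketched in a few lines: always overwrite the drive holding the oldest backup, so the most recent good copy is never the one being written. This is only an illustration; the drive labels and dates are made up, and `next_drive` is a hypothetical helper, not part of any backup tool.

```python
from datetime import date

def next_drive(last_backup: dict[str, date]) -> str:
    """Return the label of the drive whose backup is oldest."""
    return min(last_backup, key=last_backup.get)

# Hypothetical rotation set of three external drives.
drives = {
    "A": date(2008, 10, 5),
    "B": date(2008, 10, 12),
    "C": date(2008, 10, 19),
}
target = next_drive(drives)  # the drive to overwrite tonight
```

      With three or more drives in the set, the newest backup and at least one older one stay untouched while the oldest is rewritten.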

      In ten years I might say the same thing but replace 1TB with 12TB. and I'm sure the backup devices will cost about the same or maybe less.

    51. Re:Carefully protected? by Carrion+Creeper · · Score: 1

      I thought the article was all about how we need RAID arrays because just one isn't good enough? Did I miss something?

    52. Re:Carefully protected? by Whiney+Mac+Fanboy · · Score: 1

      And if you think your external HDD is protecting your data, you're crazy. The failure on those is single point,

      No, it's not a single point. Because the data is always present in two places. On my home raid & on the external HDD.

      You're right that my rotational policy is lax - but my important data is photos. I tend to keep them in multiple points until rotation.

      --
      There are shills on slashdot. Apparently, I'm one of them.
    53. Re:Carefully protected? by Urza9814 · · Score: 1

      As is what I'm doing - I'm not backing up to external media, I'm not backing up to a different server, I'm backing up to a harddrive sitting right below the one I'm backing up from. Though I suppose it also covers 'oh shit I shouldn't have deleted those' scenarios, which RAID wouldn't. But anything more serious than that or hard drive failure and I'm screwed.

    54. Re:Carefully protected? by SatanicPuppy · · Score: 4, Insightful

      Sure, right now. The first hard drive I ever bought was 8 megabytes and cost 600 dollars. 4 years ago I bought a 1gb usb flash drive for 300 dollars, now they're running 10-20 bucks.

      In a few years solid state will be something I'm looking at VERY seriously. It has serious potential for long term storage. Yea, it's too expensive...right now...But in the long run it's the most promising thing out there.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    55. Re:Carefully protected? by jandrese · · Score: 1

      I want to know where you find offsite backup for 12TB of data daily at only $20k.

      --

      I read the internet for the articles.
    56. Re:Carefully protected? by jandrese · · Score: 1, Insightful

      Then I've got great news for you about tape drives.

      --

      I read the internet for the articles.
    57. Re:Carefully protected? by Wesley+Felter · · Score: 2, Insightful

      I agree that SSDs are inevitable... for primary storage. Once I've switched my laptop over to SSD I'll still use a hard disk for backup, though.

    58. Re:Carefully protected? by binarylarry · · Score: 5, Funny

      That's why serious IT people use Fedex.

      --
      Mod me down, my New Earth Global Warmingist friends!
    59. Re:Carefully protected? by mlts · · Score: 3, Interesting

      I just wish all the density improvements that hard disks get would propagate to tape. Tape used to be a decent backup mechanism, matching hard disk capacities, but in recent time, tape drives that have the ability to back up a modern hard disk are priced well out of reach for most home users. Pretty much, you are looking at several thousand as your ticket of entry for the mechanism, not to mention the card and a dedicated computer because tape drives have to run at full speed, or they get "shoe-shining" errors, similar to buffer underruns in a CD burn, where the drive has to stop, back up, write the data again and continue on, shortening tape life.

      I'd like to see some media company make a tape drive that has a decently sized RAM buffer (1-2GB), USB 2, USB 3, or perhaps eSATA for an interface port, and bundled with some decent backup software that offers AES encryption (Backup Exec, BRU, or Retrospect are good utilities that all have stood the test of time.)

      Of course, disk engineering and tape engineering are solving different problems. Tape heads always touch the actual tape while the disk heads do not touch the platter unless bumped. Tape also has more real estate than disk, but tape needs a *lot* more error correction because cartridges are expected to last decades and still have data easily retrievable from them.

    60. Re:Carefully protected? by Levvie · · Score: 1

      In raid world data volume = storage volume. A raid system is not aware of the data that is on its array.
      Rebuilding after a disk failure means you want to read every single bit on every working drive in the array, even if the filesystem is empty.
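      To put a rough number on what a full read of the surviving drives means, here is a minimal sketch. It assumes the commonly quoted consumer-drive spec of one unrecoverable read error per 10^14 bits read (that rate is an assumption, it is not stated in this comment) and treats errors as independent per bit, which real drives only approximate. The 6 x 2 TB figure is the rebuild scenario from the story summary.

```python
import math

def p_read_error(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error over bytes_read.

    ure_per_bit is an assumed spec-sheet rate, not a measured one.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# 7-drive RAID 5 with 2 TB drives: all 6 survivors are read end to end.
surviving_bytes = 6 * 2e12
print(f"{p_read_error(surviving_bytes):.0%}")
```

      Because the whole array is read regardless of how full the filesystem is, the result depends only on drive size and count, not on how much data you actually stored.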

    61. Re:Carefully protected? by DrVxD · · Score: 1

      What would you recommend then for a 30user non-profit business with a 136gb scsi drive and tape backups they have been running (but not realizing that they weren't completely backing up everything)? Thats the crap I just inherited...I was thinking moving to 2 sata drives in raid 1(the scsi drive sounds like it's gonna die soon) and fixing the bloated folders so that tape backups will work again. Differential appended tape backups M-R and a normal backup on friday after the workday seems like the best idea.

      I'd probably go along with your Raid 1 solution as the primary storage.

      For the backups, I'd probably forget tapes altogether - I'd go for a couple of cheap-ish, easily portable NAS enclosures. Run your proposed backup cycle (i.e. weekly full backup, daily incrementals) direct onto one of them overnight, then during the day mirror that onto the other one. Somebody takes the second unit home at night as an off-site. (Obviously, three or more enclosures would be preferable, since it means that at least one of your off-sites is always actually off-site!)
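      The cycle described above could be decided by something as small as the sketch below. It assumes a Friday full backup and incrementals on the other workdays, as in the quoted plan; `backup_kind` is a hypothetical helper and not tied to any particular backup product.

```python
from datetime import date

def backup_kind(day: date, full_weekday: int = 4) -> str:
    """Full backup on full_weekday (4 = Friday), incremental on other workdays."""
    if day.weekday() >= 5:          # Saturday/Sunday: office closed, nothing to run
        return "none"
    return "full" if day.weekday() == full_weekday else "incremental"

kind = backup_kind(date(2008, 10, 24))  # a Friday
```

      A nightly job would call this once and pick the matching backup mode before mirroring to the second enclosure.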

      Every so often (maybe monthly), back up the NAS to DVD as a belt-and-braces measure. It all kind of depends on how important the data is, and how much they can afford to spend.

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    62. Re:Carefully protected? by aaamr · · Score: 1

      I use a pair of 1 TB drives on a QNAP NAS in a RAID-1 mirrored config. For home use. I backup all kinds of important data to the network share, and yes, I feel pretty safe that my data is protected.

      I actually lost a drive, and sure enough, once the replacement was obtained, everything went back to its mirrored normal operation.

      I used to backup to tape, but these days, why bother? With a decent NAS, you cover the bases, and it's a lot easier to deal with -- especially for a home user.

    63. Re:Carefully protected? by drunkennewfiemidget · · Score: 1

      I've run my servers on RAID 1 and my important shit is on RAID 1 now. Simplistic, but I've had exactly 4 harddrive failures with this set up, and every time, I was able to replace the failing drive, and rebuild the mirror, and no data was lost.

      I keep a backup encrypted off-site as well, but I have depended on the RAID to protect my data, and it's never failed me yet.

      So I guess I'm crazy.

    64. Re:Carefully protected? by SatanicPuppy · · Score: 1

      Sure, if you only need 16gb of info, then almost any backup solution will meet your needs. Sign up for a couple of gmail accounts, and mail it to yourself. Pay Amazon 2 bucks a month to store it in S3...Hell, if you trust Amazon not to lose your data (debatable) they'd only charge 1,843.20 cents a month to store your 12TB (not counting the 1,228.80 they'd charge you when you uploaded it).

      It's a problem of scale. 1gb is trivial. 1,024GB is difficult. 12,288GB is obscenely difficult. Reliable, redundant, offsite storage is nearly impossible for that quantity of data for anyone except a decent sized corporation. If you put together the amount of storage I deal with at work, its between 10-20TB, but the amount we back up in the hardcore offsite manner is under 100gigs.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    65. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Yea, because we all backup 12TB of home data to an offsite location. Mine is my private evil island, and I've bioengineered flying death monkeys to carry the tapes for me. They make 11 trips a day. I'm hoping for 12 trips with the next generation of monkeys, but they're starting to want coffee breaks.

      Just wait until the flying death monkeys discover that they can unionize.
      You might even have to bring in those flying paramedic monkeys as scabs to break the strike.

    66. Re:Carefully protected? by st0rmshad0w · · Score: 1

      If you have that much data and it's that important then you should have either planned your backups as your storage needs increased, or found yourself an online backup provider and pay for adequate storage.

      EVERY home user I've ever dealt with is well aware of that by the time I finish dealing with them.

      You either do it right or quit whining.

      You pay to take care of all your other big investments (house, car, boat, kids, pets, etc) so why should it come as any sort of shock that (gasp!) having a PC and accumulating data might require some upkeep costs?

    67. Re:Carefully protected? by CrimsonAvenger · · Score: 1

      She has less than 64 megs of important data she needs to keep backed up, so she uses a thumbdrive that she stores in a fire safe. I'm thinking that's closer to the average consumer than a 1TB drive full of "I CAN'T LOSE THIS" data.

      Mostly true this year. Probably true next year. And the year after. Beyond that, I'm not willing to bet, really. Data grows to fill available space, and will continue to do so into the indefinite future....

      --

      "I do not agree with what you say, but I will defend to the death your right to say it"
    68. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I have essentially the same thing at home. I do keep /home on a 4-disk reiserfs RAID5 array for the past few years, and have never had any problems with it even through 1-2 single drive failures. mdadm is my friend.

      You might want to give drdb a try: http://www.drdb.org/ . But I like using RAID5 more for some extra performance benefits.

      Most of my personal files are periodically rsync'd to a trusted friend's server, overnight using bandwidth limiting to be polite. My large collection of digicam photos are organized by year backed up to DVD, (someday I might mail a copy to my relatives). Some of the better pr0n and rare downloaded content gets backed up to DVD as well, but mostly to make room for more content since I don't really access it all that much.

      I have a few big folders for CDs and DVDs filled with stuff to satisfy my pack-rat mentality.

    69. Re:Carefully protected? by SatanicPuppy · · Score: 1

      Most of that was probably reel tape; no doubt that crap went to hell in record time...It's exposed to the air in multiple places, people actually TOUCHED chunks of it at various times...And if they're having mold issues, it's sitting in a humid warehouse somewhere.

      Our stuff is in a nice climate controlled safe, and it's all DDS tape and newer, the sort that doesn't get crud in it in normal usage.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    70. Re:Carefully protected? by Dewser · · Score: 1
      um, I think this was geared towards the corporate infrastructure. I've gone on a lot of technical pre-sale visits and many small businesses that have heavy storage don't back it up. Then they ask about online backups. We still recommend tape or tape libraries.

      As for what I do at home, for one I do not even have close to a TB of data. But I just backup to disk. Having a copy of data stored elsewhere is better than nothing. What home users should do is buy a couple inexpensive external drives and do regular backups of their computer HDDs, at least their personal DATA. Get a firebox safe and store the drives in that.

      What I recommend back to the corporate/small business owners is to get at least 2 weeks worth of media as well as monthly tapes. Store the monthlies off site. They can spend the money on Iron Mountain or they can keep it in someone's home safe. (I don't recommend the home safe, but whatever works!). The more data you have the larger the backup solution will be. But then you have to ask yourself how much is this data worth to the company?? Suddenly $20K is not looking too bad.

      --
      Dewser - all around techy "In the immortal words of Socrates - 'I drank what?'"
    71. Re:Carefully protected? by Anonymous Coward · · Score: 0

      You would be better spending your money on plain disks and using external drives for backup. Your data would be safer.

    72. Re:Carefully protected? by j-cloth · · Score: 1

      Joe the plumber probably has a kid or 2 and a few home movies of them which may have been recorded in HD from the camera he got Mrs. Joe for Christmas. Those movies are probably more important to him than your porn is to you. You're right. Normal people aren't like slashdotters, their data is likely more valuable.

    73. Re:Carefully protected? by stonecypher · · Score: 1

      The governing observation of the appropriate length of backup survival is data utility, not original machine utility. I still want data that I have on 8" disk, which is probably still viable, but I can't find a drive for. The machine it's from is long since un-useful, but the data is not.

      Check your scale before using it.

      --
      StoneCypher is Full of BS
    74. Re:Carefully protected? by Vu1turEMaN · · Score: 0

      unfortunately, we have 20 unused tapes already bought, and my IT budget would be eaten up for the next 6 months if I did 2 NASs. The best part is that I'm just an intern...they haven't hired a steady IT person since 2002 when the dell server was bought lol cause none of their interns could solve any problems. So far I've done a VPN, fixed 10min logins, and made every printer accessible in ten days, so I'm a god to them haha I'll probably get 4x 160gb sata drives (2 for server, 2 for single NAS) and still run the tape backups after the NAS is done copying it.

    75. Re:Carefully protected? by BronsCon · · Score: 1

      Sometimes it seems as though Slashdot is becoming a new type of RAID... Redundant Array of Incoherent Drivel.

      --
      APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
    76. Re:Carefully protected? by ushering05401 · · Score: 1

      It is not just consumer grade CDs and DVDs. The ancient IBM/AIX system I used to manage was rock solid on committing valid backups to media.

      My current home server is a multiprocessor Dell (the type that is marketed to small businesses) - internal tape etc.. After going round and round with Dell over the constant stream of inexplicably borked backup tapes I demanded a partial credit, returned both the original and replacement tape drives to them, and moved to an external raid array that I image on a fairly regular basis.

      I have a hard time believing I was being sold the same hardware that even a mid-size company would have received from Dell.

    77. Re:Carefully protected? by shadoelord · · Score: 1

      Backup media and procedures are really based on what you are doing and protecting. I've had bitrot on tape drives many times so I switched to archival quality dvd's for a little more and store them in a dry / dark place. They are estimated at 100years or more. (btw, the main reason CDRs are bad is because their recording surface was the same you wrote on with markers, hence easier to scratch off).

      All files get parity files to aid in recovery if there is bit error. After a certain amount of time we just shred the DVD's though - which is why we don't need anything beyond 5 years.

      Btw, my mp3 CDRs have been in my car for 8 years and still work :)

      --
      this is my sig, there are many like it, but this one is mine.
    78. Re:Carefully protected? by Bandman · · Score: 1

      Really? I'd never thought of that, but I suppose it makes sense. Does the controller actually remap all of the individual blocks, or how does that work?

    79. Re:Carefully protected? by rbanffy · · Score: 2, Insightful

      "If you think raid is protecting your data, you're crazy."

      BTW, RAID will do nothing if you accidentally "sudo rm -rf /" it.

    80. Re:Carefully protected? by glitch23 · · Score: 1

      Yea, because we all backup 12TB of home data to an offsite location.

      Yea, because we all have 12TB of home data to backup. If you do then I'm sure you also are being affected by Comcast's new throughput limit. I only have 2 RAIDs setup: 0 for the C: and 1 for my data. I don't do backups. It may come back to bite me in the ass or it may not. The RAID at least helps somewhat though. The mirror set is only 320GB though and it isn't full. If you have terabytes of data to backup then you probably have a professional system to manage it and to perform the actual backup. I've never had a true backup system through the 11 years I've had data. Up until 2006 I didn't have a RAID array setup either. 99.99% of home users can ignore this article, even me. If I simply restated what you already said in the rest of your post (that I didn't quote) I apologize.

      --
      this nation, under God, shall have a new birth of freedom. -- Lincoln, Gettysburg Address
    81. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I have a Raid 5 NAS setup at home with a USB port on the front of it and a 'backup' button next to it. I carry a 16gb usb flash drive on my keychain that I sync with my /backup export on the NAS nightly. The only reason I can manage to actually do this is because its dead easy: plug in usb stick, press button.

      Simple solution or am I tricking myself into false security?

    82. Re:Carefully protected? by WhatAmIDoingHere · · Score: 1

      Oh, sure, I'm not saying she won't need more in the future. What I am saying is that she's a good representation of the current 'average user.' But thumbdrives are now 16-32 gigs. Even if she backed up every single picture she took EVER, she'd still use less than 1/4 of that.

      --
      Not a Twitter sockpuppet... but I wish I was.
    83. Re:Carefully protected? by Deathlizard · · Score: 1

      Exactly.

      Admins have to understand that RAID (especially RAID-5) is a "Nines" solution and not a backup solution.

      If your server has to be up 99.999% of the time and data is not mission critical, then a single RAID server is fine. If Data is of the utmost importance and downtime can be tolerated, 2 RAIDless servers mirroring each other with a storage solution to offsite backup in case The Wizard of OZ decides to make a visit, or the Towering Inferno is being filmed in your datacenter is a solid solution.

      Of course if you got the capital, It's best to combine the above solutions and get the best of both worlds.

    84. Re:Carefully protected? by myz24 · · Score: 2, Interesting

      While I generally agree, I have burned CD-R, CD-RW and DVD+/-R that are all older than 3 years. I haven't had one fail completely just yet. I've come across a couple here or there that have issues reading some parts, but not a complete failure right on day 1,096 as so many people like to claim. One thing that helps is to actually burn at a lower speed.

    85. Re:Carefully protected? by myz24 · · Score: 1

      I used to have that issue with the old DAT tape drives. After a few weeks of about a 40% success rate I replaced the tapes with a different brand. Changing tape brands fixed the issue.

      I'm just saying that sometimes the drive really does work better with a certain tape.

    86. Re:Carefully protected? by myz24 · · Score: 1

      How do you store data offsite?

    87. Re:Carefully protected? by myz24 · · Score: 1

      Not to mention Joe the Plumber is a lazy bastard who still hasn't finished the video chronicling his child's first year and said child is nearly 2 now....I wonder where I saved that

    88. Re:Carefully protected? by Anonymous Coward · · Score: 0

      It's called online backup as low as .15/GB or DVD's if you want it offline. Obviously at the scale of 12 TB both are somewhat ludicrous, but you were talking about home users - who typically have under 100 GB's of data they truly need to back up enough to pay for a backup solution of some kind.

      If you have 12 TB of data at home you've got one hell of a lot of porn. Generally 12 TB of data is enough that some sort of dollar figure is attached to it in terms of collecting it, or that you make off of it, and hopefully that figure is high enough to afford also having a backup of it, or your business model sucks.

    89. Re:Carefully protected? by myz24 · · Score: 2, Interesting

      I don't mean to come off like another one of those "Mac people" but I don't agree that RAID + internet backup is the solution for home users. I think RAID + a realistic backup program is the solution for home users. Time Machine, despite its flamboyant, marketing-friendly name, really is a slick way to do backup.

      I'm an all out IT guy, love Linux, can tolerate Windows but Time Machine is by far the best backup solution I have used at home yet. My backup sets are typically 30-40MB from hour to hour if I'm using the computer. Uploading that much data every hour would be a pain.

      The reason I like Time Machine is that it is automatic, provides a level of versioning and allows multiple methods for restoring data. I can do a full bare metal restore, install then restore or just take the drive to another Mac or Linux machine and copy off the files I want, from whatever point in time available.

    90. Re:Carefully protected? by Velorium · · Score: 1

      Did you miss the part where he said "going to become"? SSD will eventually fall in price, as is the trend with most products.

    91. Re:Carefully protected? by Free+the+Cowards · · Score: 1

      It's not crazy to not be able to properly back up every single piece of data you have, because it's not all that important. I divide my data into three categories:

      1. Whatever
      2. Really don't want to lose this
      3. If I lose this then I am totally fucked

      Everything gets backed up to a second drive I have in my computer. I use Apple's Time Machine for this, obviously there are a lot of good solutions for this out there since this is the easy way. This step is also entirely optional. I do it simply because it means that if my main drive dies, my downtime will be measured in hours instead of days. Since I use this computer for my job that's important, but from a pure data integrity point of view I could skip it.

      Categories 2 and 3 get backed up off site to a server I have an account on.

      Category 3 gets backed up to multiple locations, including encrypted backups e-mailed to a gmail account. The stuff in category 3 is small enough (mainly personal source code and some other documents) that the space available in a gmail account is more than sufficient.

      Now, if you actually have 3TB of must-never-lose stuff, then you have a big problem. But from your description it sounds like most of your stuff fits into category 1, and you ought to be able to set up offsite backups for categories 2 and 3 without spending a bunch of cash you may not have.

      --
      If you mod me Overrated, you are admitting that you have no penis.
    92. Re:Carefully protected? by Gr8Apes · · Score: 3, Informative

      "Safe" production data ...with nightly replays/screenshots ...

      Exactly. You make backups, no matter what. Anyone that relies on RAID for backups will get what they deserve, sooner than later.

      RAID and SANs are for uptime (reliability) and/or performance. SANs with snapshots and RAID with backups are for data recovery.

      --
      The cesspool just got a check and balance.
    93. Re:Carefully protected? by boner · · Score: 2, Interesting

      SSDs should not be considered a viable option for long term storage just yet. Keep in mind that Flash cells are memory arrays and as such are susceptible to ionizing radiation that can and will flip bits. Store a Flash drive long enough and there will be bit errors beyond the capacity of the on-board CRC/ECC to correct.

      If you insist on using SSDs at least use them with ZFS.

    94. Re:Carefully protected? by Anonymous Coward · · Score: 0

      First off, I have a hard time seeing a home user coming up with 12 TB of data anytime soon.

      Well, somebody has to help Netflix back up their data.

    95. Re:Carefully protected? by NuclearError · · Score: 1

      O yeah? Whose basement are YOU living in?

      --
      Nuclear engineers build weapons. Civil engineers build targets.
    96. Re:Carefully protected? by ObjetDart · · Score: 0

      Hell, if you trust Amazon not to lose your data (debatable) they'd only charge 1,843.20 cents a month to store your 12TB

      I think you slipped a decimal place or two. 12TB is 12,288 GB, and S3 charges $0.15 per GB per month, so that makes it 1,843.20 dollars per month. Unless I'm missing something.
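      The arithmetic is easy to check. A minimal sketch, using the $0.15 per GB per month rate quoted in this comment and 1 TB = 1,024 GB; `monthly_storage_cost` is just an illustrative name, and transfer charges are left out.

```python
def monthly_storage_cost(terabytes: float, dollars_per_gb: float = 0.15) -> float:
    """Monthly storage bill in dollars, counting 1 TB as 1,024 GB."""
    gigabytes = terabytes * 1024
    return gigabytes * dollars_per_gb

cost = monthly_storage_cost(12)
print(f"${cost:,.2f} per month")  # prints "$1,843.20 per month"
```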

      --
      I read Usenet for the articles.
    97. Re:Carefully protected? by jaxtherat · · Score: 5, Insightful

      I love how you use the language "get what they deserve".

      What about my situation, where I have to store ~ 1TB of unique data per office in 3 offices that are roughly 1000 km apart and I have to keep everything backed up with a budget of less than ~AU$ 4000 IN TOTAL?

      I have to run 4 x 1TB RAID arrays on the file servers and use rsync to synchronise all the data between the offices nightly "effectively" doing offsites, and have a 3 TB linux NAS (also using RAID 5) for incrementals at the main site.

      That is all I can afford, and I feel that I'm doing my best for my employer given my budget and still maintaining my professional integrity as a sysad.

      Why do I "get what they deserve" when I can't afford the necessary LTO4 drives, servers and tapes (I worked it out I'd need ~ AU$ 30,000) to do it any other way?

      --
      http://www.zombieapocalypse.tv/
    98. Re:Carefully protected? by bluefoxlucid · · Score: 1

      And if you work for the Department of Government or the Super Mega Corporation of the World, that data might be worth a $2 billion opportunity cost over the next 5 years -- and paying some small town firm $150,000 to manufacture a custom hardware device to read the damn thing is justified, since you probably have more than 2KiB of data on one disk somewhere and will be using it every once in a while, and it'll ensure smooth operation without hiccuping 10 or 15 $80k employees for 3% of their time on top of several hours of stalled productivity in the entire 200,000 person work force each year (ouch ouch ouch!!!!!!).

    99. Re:Carefully protected? by camperdave · · Score: 5, Funny

      Keep in mind that Flash cells are memory arrays and as such are susceptible to ionizing radiation that can and will flip bits.

      That's okay. We'll just gang them together in a RAID 5 configuration.

      --
      When our name is on the back of your car, we're behind you all the way!
    100. Re:Carefully protected? by Anonymous Coward · · Score: 0

      ((300MBits / 8) / 1 024) * 60 = 2.2 Gbytes / min

      A 1TB drive should take around (1 024) / (2.2/min) = ~7.8 hours.

      How do you get 10 min/1.5GB, or 11 hours?

      PS: The iPod Shuffle's SSD is much slower write speed than a normal HDD.
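      For anyone who wants to redo the numbers, the same calculation as a small sketch. It assumes the link runs at its full rated speed the whole time with no protocol overhead, which is optimistic; `hours_to_copy` is an illustrative name.

```python
def hours_to_copy(size_gb: float, link_mbit_per_s: float) -> float:
    """Hours to move size_gb over a link, assuming sustained line rate."""
    mb_per_s = link_mbit_per_s / 8        # megabits -> megabytes per second
    gb_per_min = mb_per_s / 1024 * 60     # megabytes/s -> gigabytes per minute
    return size_gb / gb_per_min / 60

hours = hours_to_copy(1024, 300)  # 1 TB over a 300 Mbit/s link
```

      Plugging in a lower sustained rate shows how quickly the estimate stretches past the 7.8 hour figure.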

    101. Re:Carefully protected? by boner · · Score: 1

      LOL,

      good luck....

    102. Re:Carefully protected? by ajkst1 · · Score: 5, Informative

      I have to echo this comment. RAID is not a backup. It is a form of redundancy. Nothing is stopping that system from losing two drives and completely losing your data. RAID simply allows you to keep working after a SINGLE disk failure. If you're not making backups of your critical data and relying on RAID to save your behind, you're insane.
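      Why one lost disk is survivable but two are not can be shown with plain XOR parity. This is a simplified sketch of a single-parity scheme in the RAID 5 style; real controllers rotate parity across the disks and work on much larger stripes, and the block contents below are made up.

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe: three data blocks plus one parity block.
data = [b"disk0...", b"disk1...", b"disk2..."]
parity = xor_blocks(data)

# One disk lost: XOR of the survivors and the parity gives the block back.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

      With two blocks missing, the single parity equation has two unknowns, so nothing can be rebuilt and only a separate backup gets the data back.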

    103. Re:Carefully protected? by Lukey+Boy · · Score: 2, Informative

      Tape can still be pretty decent for off-siting and DR. I managed to get recently at work an LTO4 drive in a 24-slot library; each tape is 800 gigabytes uncompressed (and most are about 1.2 with native compression), plus the drive does native AES encryption so every tape that goes offsite is protected in that way. It wasn't cheap, but it didn't break the bank by any means. Oh, and I can write at about 170mb/s to the drive.

    104. Re:Carefully protected? by HairyCanary · · Score: 1

      Archival grade DVDs are not terribly expensive and should last a lot longer than regular DVDs, and they claim a lot longer than 20 years as well.

    105. Re:Carefully protected? by failedlogic · · Score: 1

      Why would an HDD that hasn't been plugged in for a few years be less reliable than a tape?

    106. Re:Carefully protected? by afabbro · · Score: 1

      But DVD is *cheap* transient crap, and perfectly adequate for home backups.

      LOL! Sure, if you have less than 2GB of data. Because more than that, and I'm not willing to sit and feed discs.

      There is, unfortunately, no good tape backup system for home if you have any real volume. I'm sure not able to buy LTO or something like that for home.

      The best you can do is write to a set of encrypted external hard drives and rotate them offsite.

      --
      Advice: on VPS providers
    107. Re:Carefully protected? by RedBear · · Score: 1

      I want a DROBO but those are expensive as hell.

      Seriously? The Drobo is the most economical form of protected storage I've ever found. Especially the previous generation USB-only model, it's being sold for just $349 most places. One terabyte drives are down to about $135 each now. Four drives gives you 2.7TB of actual protected storage. Configure the Drobo as a 16TB volume and you can just keep upgrading the drives over the next few years as the prices come down on 1.5TB and 2TB drives. Pull the old drive, put the new one in. Wait for the light to go green and repeat. Simple.

      The next closest thing to the Drobo in price and function is the ReadyNAS NV+ which is still almost $1,000 (with no drives!) and still just uses standard RAID levels, whereas I think the Drobo has something more like a ZFS filesystem and should be less likely to puke and lose the entire array. I can't say that for sure, but they do say that you can pull out your drives and place them in a different Drobo, or even rearrange the drives in a different order and it will still work. That kind of adaptability is a good sign, I think.

      So, for the price and what it does, I would never complain about the Drobo being expensive. And I would seriously recommend it to anyone who needs to store up to 2.7TB of data semi-safely. It's really the cheapest way at the moment. I'm sure it's not totally fail-proof but it sure beats a bunch of individual drives holding non-backed-up data.
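
The 2.7TB figure for four 1TB drives can be reproduced with a rough sketch. It assumes single-drive redundancy costs the capacity of the largest drive (a rule of thumb, not a vendor specification) and that drives are sold in decimal terabytes while the operating system reports binary units. The function name is hypothetical.

```python
def protected_capacity_tib(drive_sizes_tb):
    """Approximate usable space with single-drive redundancy.

    Assumption: usable space = total capacity minus the largest drive.
    Sizes are given in decimal TB (10**12 bytes); the result is in
    binary TiB (2**40 bytes), which is what most OSes display as "TB".
    """
    usable_bytes = (sum(drive_sizes_tb) - max(drive_sizes_tb)) * 10**12
    return usable_bytes / 2**40


if __name__ == "__main__":
    print(f"{protected_capacity_tib([1, 1, 1, 1]):.1f}")
```

The same function can be fed mixed sizes, such as `[1, 1, 1.5, 2]`, to estimate what a staged upgrade would yield under the same assumption.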

    108. Re:Carefully protected? by Anonymous Coward · · Score: 0

      The DROBO is just a custom controller/rebuilder built on RAID 5, so it will have the same problem.

    109. Re:Carefully protected? by Firehed · · Score: 2, Interesting

      A very quick check puts an LTO4 tape drive at an entry point of $3700, plus media and actually interfacing it with a system. Most people (companies) with a budget that allows for that kind of hardware not only have such a system in place, but have someone on staff who knows how to avoid the problems that RAID5 can/will bring down the road. And that's fine for businesses. However, RAID5 is reasonably cost-effective for home users as well (at least until offsite via Amazon S3 and the like becomes practical, which is entirely dependent on how fast internet connection uplink speeds are), and much more likely to be employed by someone who isn't aware of these kinds of risks.

      So, as someone who is clearly pretty well-versed in backup-related tech, do you have any ideas that would work for a home user who doesn't live on a yacht?

      --
      How are sites slashdotted when nobody reads TFAs?
    110. Re:Carefully protected? by totally+bogus+dude · · Score: 4, Insightful

      If you're replicating data between all three offices (and a fourth backup system?) then you are making backups. The vitriol is aimed at people who set up a RAID-5 array and then say "hooray my data is protected forevermore!".

      Tape systems, especially high capacity tapes, are very expensive, and even those are prone to failures. Online backups to other hard drives are the only affordable means of backing up today's high capacity, low cost hard drives. To do it properly though, you need to make sure you do have separate physical locations for protection from natural disasters, fires, etc. Which you have.

      The only concern your system may have is: how do you handle corrupted data, or user error? If you've got a TB of data at each site it's unlikely that mistakes will be noticed quickly, so after the nightly synchronisation all your backups will now have the corrupt data and when someone realises in a month's time that someone deleted a file they shouldn't have or saved crap data over a file, how do you restore it? Hopefully your incremental backups can be used to recover the most recent good copy of the data, but how long do you keep those for?

    111. Re:Carefully protected? by toddestan · · Score: 1

      Why even bother with DVDs? With a 1.5TB drive, you're paying about 8 cents per GB. That's dirt cheap. Why not pay 16 cents per GB, and put the data on two drives? It's so cheap that I'm not even bothering with the RAID1 or external drive question - I just pay 24 cents per GB and do both! I can't see why not.

    112. Re:Carefully protected? by EastCoastSurfer · · Score: 1

      That is all I can afford, and I feel that I'm doing my best for my employer given my budget and still maintaining my professional integrity as a sysad.

      As long as you have fully explained the risk to your employer they do end up getting what they deserve. They have made a decision about how much their data is worth to them. In this case roughly $4000.

      Business operations ask me for things all the time at work. "Can you do X?" My response always is "Of course given infinite time and resources." That way we immediately bring the conversation to what they really need and how much they are willing to spend. Backups never seem to be an issue as they will pretty much write blank checks to make sure the data is safe.

    113. Re:Carefully protected? by mabhatter654 · · Score: 1

      Media life ends the same way whether the media itself dies or you can't get a working drive for a reasonable amount of time or money.

    114. Re:Carefully protected? by kimvette · · Score: 4, Informative

      I have CD-Rs dating back to 1994 or 1995 that are just fine -- and they're off-brand media too. "Good" media was $12 to $20 per CD then, and "cheap" media was $7.00 per CD.

      I have DVD-Rs dating back to 2002 or 2003 -- again, just fine.

      While it's good to be cautious, some in here are crying wolf regarding optical media.

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    115. Re:Carefully protected? by PitaBred · · Score: 1

      Amen. That's why I back up to the RAID array, and absolutely critical stuff is compressed, encrypted and copied to a remote server so that it's not all in the same place at home.

    116. Re:Carefully protected? by jaxtherat · · Score: 4, Interesting

      Judging by the budget you quoted, it's a combination of all of the above: you are a crappy sysadmin for a crappy company with limited growth potential.

      Sigh. *ignores flamebait*

      Anyway, here's the actual reality of the situation:

      I'm a not brilliant (but certainly not crappy either) sysad who is working for a company that has rapidly expanded to the point where they need a full time sysad, and then felt the kaboom of the subprime mortgage debacle, since they consult to the property market. Hence why my original upgrade budget got shrunk big time.

      The company BOTH cares about their data AND can't afford a proper backup system.

      --
      http://www.zombieapocalypse.tv/
    117. Re:Carefully protected? by Firehed · · Score: 1

      Even if most of your data fits into category 1, that doesn't mean it's not a pain in the ass to deal with losing it. I've been ripping all of my DVDs to full-quality h264 files (something that works out to iTunes compatible, really couldn't tell you what anymore) and while it wouldn't be the end of the world if it was lost, it would still be a tremendous pain in the ass to re-rip all of that stuff. Porn, whatever, it only gets used once or twice anyways... (I don't think I have ANY saved right now, if you can believe it) but I've got a ton of other media where it would be a considerable inconvenience to re-rip. I don't know about your internet connection, but any offsite system for me would have to be a sneakernet rather than an online service of some sort, and the cost of that is thoroughly offputting.

      Never mind my photo collection, which could easily grow faster than my ripped DVDs on a busy day, especially once you figure in copies for editing. I'd have to use a four-point scale (don't care, inconvenient, better not lose this, can't lose it) and while the media would get a 2, the photos would rank between a 3 and a 4. And even 100GB can be quite daunting to back up for a home user at the "this should only be unavailable for more than a few minutes in the event of a very extended power outage or the house catching fire" level.

      --
      How are sites slashdotted when nobody reads TFAs?
    118. Re:Carefully protected? by Anonymous Coward · · Score: 0

      bingo

    119. Re:Carefully protected? by rgmoore · · Score: 1

      None of these stupid suggestions are online, so they're useless. For home users, the ONLY realistic offsite backup solution is online backup over the internet.

      Give me a break. Copying your data to an external hard drive and keeping it elsewhere is a perfectly realistic solution. You just need to find a workable off-site location, like your desk at work or a friend's house. That's exactly what I do. It's not automated- I have to move the drive myself- so it's only good for snapshots, but it's still a workable solution. If my house burns down or a thief steals my PC, I'll lose at most a week or two of data. I may miss that week's worth of data, but it will be a much smaller loss than the decade plus of data that I'll save.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

    120. Re:Carefully protected? by jaxtherat · · Score: 1

      If you're replicating data between all three offices (and a fourth backup system?) then you are making backups. The vitriol is aimed at people who set up a RAID-5 array and then say "hooray my data is protected forevermore!".

      Yes, I'll be replicating the data between 3 sites and using bacula to do incremental backups to the 1 NAS. This seems to be my only option for the price point.

      The only concern your system may have is: how do you handle corrupted data, or user error? If you've got a TB of data at each site it's unlikely that mistakes will be noticed quickly, so after the nightly synchronisation all your backups will now have the corrupt data and when someone realises in a month's time that someone deleted a file they shouldn't have or saved crap data over a file, how do you restore it? Hopefully your incremental backups can be used to recover the most recent good copy of the data, but how long do you keep those for?

      I'm hoping that I can do one monthly level 0 to the NAS, and then keep 1 month's worth of incremental backups after that.

      Since, as you say, there's nothing to protect us from the "dumb user armed with a Delete key" beyond that 1 month safe window, I'm amending the QA/QC and staff manuals to basically explain to the management team that if a user does this, we're screwed, but this is all we can afford so you have to live with that.

      I'm pretty sure I have all angles covered, so I'll just have to see how I go implementing this :/

      --
      http://www.zombieapocalypse.tv/
    121. Re:Carefully protected? by darkpixel2k · · Score: 5, Funny

      The company BOTH cares about their data AND can't afford a proper backup system.

      In this case, linux has one last resort for you:
      sudo apt-get install bible

      darkpixel@hoth:~$ bible
      bible: Debian/BRS Release 4.18, $Date: 2005/01/23 11:29:22 $
      Hit '?' for help.

      -snip-

      bible(KJV) [Gen1:1]> ec3:6

      Ecclesiastes 3

      6 A time to get, and a time to lose; a time to keep, and a time to cast away;
      bible(KJV) [Ec3:6]>


      Mainly pay attention to that whole '...and a time to lose' part.

      --
      There's no place like ::1 (I've completed my transition to IPv6)
    122. Re:Carefully protected? by nabsltd · · Score: 1

      Insanely high bitrate (48Mbps) HD video is 20GB/hour.

      How can you fill up a 120GB hard drive with 3 hours of that?
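
A quick sketch of the bitrate arithmetic in this comment. It assumes a constant video bitrate and decimal units (1 GB = 1000 MB); both function names are illustrative only.

```python
def gb_per_hour(mbit_per_s: float) -> float:
    """Storage consumed per hour of video at a constant bitrate (decimal GB)."""
    mb_per_s = mbit_per_s / 8           # megabits -> megabytes
    return mb_per_s * 3600 / 1000       # MB/s -> GB/hour


def hours_to_fill(drive_gb: float, mbit_per_s: float) -> float:
    """Hours of video at the given bitrate that fit on a drive of drive_gb."""
    return drive_gb / gb_per_hour(mbit_per_s)


if __name__ == "__main__":
    print(f"{gb_per_hour(48):.1f} GB/hour")
    print(f"{hours_to_fill(120, 48):.1f} hours to fill 120 GB")
```

At 48 Mbps the drive holds more than five hours, which is the point the comment is making.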

    123. Re:Carefully protected? by Lukey+Boy · · Score: 3, Informative

      Sadly no. I have a ton of things to back up at home and just use Bacula with a ton of DVD-RWs. It's not really ideal. I keep scouring eBay and Craig's List for an LTO1 or 2 drive but I haven't had any luck getting something under a thousand dollars. I've looked at S3, rsync.net, and a few others, but they're all way too expensive for me.

    124. Re:Carefully protected? by mabhatter654 · · Score: 1

      iTunes... once you start with TV shows, the disk fills up very fast, and you have to back your own stuff up; Apple frowns upon letting you download all of it over again.

    125. Re:Carefully protected? by tengu1sd · · Score: 5, Insightful
      >>>The company BOTH cares about their data AND can't afford a proper backup system.

      It can be that the company cares, but doesn't care enough to budget for potential data recovery. All you can do is make sure the risks are explained, with budget options, and keep a well-documented paper trail to cover your nether regions. Been there, done that. The typical response is that backups are not important, until a failure and a few days of uncertainty are forced upon the company.

      Having the same, potentially corrupted, data at multiple sites mitigates the loss of a disk, or even the loss of a single site. User error or database corruption can wind up copied over your good data. Needing to go back more than a day or two may not be practical in a disk-to-disk backup environment.

      It's a part of the system manager's role to spell out potential problems in easy-to-understand PowerPoint sound bites and show what options are available. The better you can do this, the more toys you'll have to play with.

    126. Re:Carefully protected? by mabhatter654 · · Score: 1

      Something like Time Machine is a second copy of one computer. If the computer dies, the chances of the drive dying at the same time are slim, and vice versa. With something like Drobo, now you can have several machines in one location, and if one drive of the array fails it can be replaced without losing data. Good enough for most people as long as the data is synced well.

    127. Re:Carefully protected? by AvitarX · · Score: 1

      What I do at home is have a 320 GB HD on my computer.

      I use pdumpfs to snapshot my disk daily to a similar external USB (it can fit plenty of snapshots since it is only /home and excludes downloads I have about 100GB to play with). When I get something important (like unload vacation photos from a camera), I put the current backup drive in the fire safe, and use the one from the safe for backups (clearing it first).

      This cost me about $300.00 (100 each for the drives and the fire safe). I am protected from deletion fairly well (until 2 backup swaps), from any drive failing (1 day lost), and my long term backup gets used enough that I can hopefully catch it failing before there is a fire and it is my only copy.

      When I upgrade my HD, or computer, I will get new drives, so I should be protected from 10 year old disk being my only copy too. That upgrade will cost me a few hundred extra as I buy additional drives though.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    128. Re:Carefully protected? by S-100 · · Score: 1

      ...5 year old tape is about a million times more reliable than a hard drive that hasn't been plugged in in 5 years...

      Got anything to back up that statement?

    129. Re:Carefully protected? by that+this+is+not+und · · Score: 1

      You can't find an 8" drive? They aren't that rare...

    130. Re:Carefully protected? by repvik · · Score: 1

      There is no such thing as "can't afford a proper backup system". If you can't afford a proper backup system (as a company), you don't value your data.

    131. Re:Carefully protected? by repvik · · Score: 1

      The problem is, you don't know the shelf life of the media until it fails. Within three years you are reasonably secure. After that, who knows.

    132. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Want a home one?
      I'll give you one:
      2X LINKSYS NAS200 and rsync them

    133. Re:Carefully protected? by discogravy · · Score: 1

      dude, NAFTA or H1B your way into foreign monkeys, they're way cheaper. Give 'em all a USB cheapo drive with your data and be all "fly my pretties!". Sure they have different accents, but if you're listening to what your monkeys say, it's not like your data is all that important anyway.

    134. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Or 2X LINKSYS NSLU2 NET STORAGE LINK USB 2.0 + RJ45

    135. Re:Carefully protected? by LVSlushdat · · Score: 1

      Don't know about that.. I'd agree the old style tapes were flaky as heck... the old MammothII "miniVHS" 8mm tape cartridges were the pits.. But these newer LTO1/2/3/4 tapes are something else.... At work, we bought an Overland NEO2000 tape library and two HP LTO3 drives back in Nov 2006, and one as yet unused cleaning tape. We were advised by Overland/HP to set up the library to automatically run the cleaning tape when the drives requested it.. So far, the drives have not requested any cleaning, and we're rapidly coming up on two years of backing up over 5TB/wk on the system. Feeling nervous as heck to go so long without cleaning/maintenance, we've pestered HP/Overland about every six months as to why the system hasn't cleaned itself yet.. They keep saying... don't manually clean, let the drive request cleaning... and so far, neither drive has.. And having come from a DLT/MammothII system before the LTOs, where tapes came out of the wrapper with large quantities of soft r/w errors, and hard r/w errors were common and would often kill your job dead, these LTOs are worth their weight in gold in my opinion....

      --
      THANK YOU, Edward Snowden!! Americans owe you a debt of gratitude (whether they know it or not..)
    136. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Someone on Slashdot reads the bible?!?

    137. Re:Carefully protected? by Gr8Apes · · Score: 1

      I've had RAID 1, and 5 systems up and running in one form or another since 96. I also keep a backup done at least monthly.

      I'm looking at moving to RAID 0 for my performance disk, and just a backup drive for storage. Disks these days are fine for file serving in the home. I'm not running a multi-thousand user file system after all.

      --
      The cesspool just got a check and balance.
    138. Re:Carefully protected? by Gr8Apes · · Score: 1

      To be fair, depending upon which RAID you're using, you can lose 1, 2, or even up to n/2 disks in your arrays and still maintain your data. But that still won't protect you from del * or rm -rf.

      --
      The cesspool just got a check and balance.
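
The "1, 2, or even up to n/2" claim above can be written down as a small lookup. This is a sketch of best-case fault tolerance for common RAID levels on a healthy array; the function name and the level strings are made up for illustration, and nothing here protects against deleted or corrupted files.

```python
def max_disk_failures(level: str, n_disks: int) -> int:
    """Best-case number of failed disks a healthy array can survive."""
    if level == "raid0":
        return 0                # striping only, no redundancy
    if level == "raid1":
        return n_disks - 1      # every disk holds a full copy
    if level == "raid5":
        return 1                # single parity
    if level == "raid6":
        return 2                # double parity
    if level == "raid10":
        # Up to one disk per mirror pair; two failures in the SAME pair
        # still lose the array, so this is the best case only.
        return n_disks // 2
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for level in ("raid0", "raid5", "raid6", "raid10"):
        print(level, max_disk_failures(level, 8))
```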
    139. Re:Carefully protected? by TooMuchToDo · · Score: 1

      We use Amazon S3 in production storing close to 100TB of data, and it works like a god damn champ. YMMV.

    140. Re:Carefully protected? by localtoast · · Score: 1

      Agree totally. There needs to be something to bridge the gap when people have to save costs. Part of that is the right user experience on these appliance boxes. If you have a low-end NAS (running RAID-5), the real problem is that the appliance doesn't have the right user experience to promote best practices and guide the naive home user to back up the NAS system itself. The user is left to think that once data is on the NAS, it is "safe". In addition, the problem is compounded in that on a few low-end NAS boxes, the user experience promotes backing up desktop machines to the NAS. Example: USR8700

    141. Re:Carefully protected? by TooMuchToDo · · Score: 1

      On pricewatch.com, 1TB sata drives dropped under $100 the other day. Woohoo!

    142. Re:Carefully protected? by Gr8Apes · · Score: 3, Informative

      External TB drives are around $150 bucks. Buy several. Make rotating copies. It's doable on your budget. (We're in the same boat, btw, and that was our solution for the dev machines)

      However, the real issue is your employer has decided on the budget, and what you do with it is how well you're protected. Sometimes we don't get a Fibre NAS with remote backup, no matter how much we want it. Sometimes we have to get by with the old rsync, dd, or pure copy or even tar/zip with rotating media. (Anything less is suicide)

      --
      The cesspool just got a check and balance.
    143. Re:Carefully protected? by TooMuchToDo · · Score: 1

      You could always pay Flickr $25/year to upload unlimited photos to them.

    144. Re:Carefully protected? by Gr8Apes · · Score: 1

      with the cost of disks what they are, as discussed above, there is no such thing as not being able to afford a "proper" backup process in his case. It sucks to do it as the sysad, but it is doable on his budget. (actually, after one or two nights, if he really is a sysad, he'd have the entire process scripted and debugged, and would be sitting at home drinking a beer while the entire automated process ran at night, so the suckage would be in setting up the scripts)

      --
      The cesspool just got a check and balance.
    145. Re:Carefully protected? by TooMuchToDo · · Score: 1

      I actually just saw a 128GB SSD on pricewatch for $400, which isn't a horrible price IMHO. Yea for price drops on advancing technologies!

    146. Re:Carefully protected? by Anonymous Coward · · Score: 0

      That is the old way of protecting data. The data is becoming too large for that nonsense. Encrypt, randomly split data at the bit so no discernible data traverses the network or lands on a single disk. Geo-disperse within synchronous distances and apply a keyed HMAC so the integrity of the data is always protected. Secure, safe and highly available.
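
For the keyed-HMAC part of this suggestion, a minimal sketch using Python's standard `hmac` and `hashlib` modules follows. It covers integrity checking only; the encryption, splitting, and geographic dispersal steps are not shown, and the helper names, key, and sample data are made up for illustration.

```python
import hashlib
import hmac


def tag_chunk(key: bytes, chunk: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over one stored chunk."""
    return hmac.new(key, chunk, hashlib.sha256).digest()


def verify_chunk(key: bytes, chunk: bytes, tag: bytes) -> bool:
    """Return True if the chunk still matches the tag stored with it."""
    return hmac.compare_digest(tag_chunk(key, chunk), tag)


if __name__ == "__main__":
    key = b"example-key-not-for-production"   # hypothetical key
    chunk = b"some backed-up bytes"
    tag = tag_chunk(key, chunk)
    print(verify_chunk(key, chunk, tag))            # unmodified chunk
    print(verify_chunk(key, chunk + b"!", tag))     # tampered chunk
```

An HMAC detects corruption or tampering but cannot repair it, so it complements redundancy rather than replacing it.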

    147. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Congratu-fucking-lations. We all have 13 year old CD's. Point is, many of us who have 13 year old CD's also have 13 year old CD's that are garbage. Using CD's as a backup media, especially when you plan to take data "offline", is stupid. On the other hand I've never had tape go bad (after having the data properly verified). So you go on and use your unreliable backup method. Good Luck.

    148. Re:Carefully protected? by 0100010001010011 · · Score: 2, Funny

      It's ok, he just works for Verizon.

    149. Re:Carefully protected? by A+Friendly+Troll · · Score: 1

      Of course, things would be a bit better if the format war didn't happen and if we could get to choose to backup to Blu-Ray or HD-DVD, the prices of both being affordable by now... Thanks, Sony!

    150. Re:Carefully protected? by PCMeister · · Score: 1

      While an example of good maintenance practices, it doesn't mention the fact that many things of yesteryear were Built-to-last(tm)!! This serves as a perfect example of that.

      Unfortunately, that moniker has no place in this disposable society we live in. Users only start realizing the delicate nature of higher capacity drives after it crashes and burns.

      Another point that I haven't seen mentioned thus far is the increased difficulty of recovering data from drives approaching and surpassing the terabyte threshold.

      One piece of advice: Try not to be a digital pack-rat!!

    151. Re:Carefully protected? by Trogre · · Score: 1

      You should. The fact that the fastest, cheapest storage solution is also the least reliable should give you some clue to that.

      --
      "Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
    152. Re:Carefully protected? by MrNaz · · Score: 3, Funny

      Must be the admin for a Windows server.

      --
      I hate printers.
    153. Re:Carefully protected? by mjwx · · Score: 1

      If you think raid is protecting your data, you're crazy.

      RAID is protecting your live data; if a server that requires 100% uptime during business hours throws a disk, your only protection is RAID.

      Please allow me to correct your mistake: RAID != backup. They are two different concepts that have similar characteristics. RAID provides redundancy for data availability and reliability. Backups provide protection against long term data loss. RAID is protecting me from immediate data loss, meaning I can get back to business in minutes as opposed to the hours it would take to roll 200GB of data off a backup tape.

      The article is BS; consumer level HDDs have no bearing on enterprise level hardware where RAID is being used. If you have a 2TB array in the SME/Enterprise world, that means you paid A$10K plus for high speed disks and redundant RAID controllers, and it is probably comprised of about 6 to 8 disks and 2 controllers. This setup is important because if it throws a disk you don't have time to do a full recovery. What you need is for the system to run until you get a replacement disk and have time to power down the system and put the disks in (outside of business hours), which means you need redundant systems to kick in as soon as any problem occurs.

      A 73GB 10K RPM disk for an IBM server is US$250, so the threat of cheap 1 TB disks is non-existent. I wish people would stop confusing consumer hardware with enterprise systems; they are completely different. I have 2 x 320 GB disks in my home PC, which mostly contain videos and programs I've downloaded. It's mirrored, so if I lose a disk I'm OK. If I lose both disks I'm not going to cry over it because it doesn't cost me a lot to replace it (most of the videos I have on their original optical media anyway), but in business I'd have to be certifiably insane not to use RAID on any server, because it does cost a lot of money to rebuild a server and get it back to operational status if the server throws a disk (happens more often than most non-techs think).

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    154. Re:Carefully protected? by Fweeky · · Score: 1

      Yeah, and plenty of what I burned from 1999 onwards works fine. Significant chunks of it don't, though; some of it's degraded so far there are sizable holes in the metallic layer. Some work fine in one drive but are unreadable in another, others are readable but only very slowly. Sure, some of it's cheap-and-cheerful noname media, but quite a lot of it is high quality stuff that touts its long lifespan.

      And these days they're pathetically tiny; 4.7GB? Great, I can get a pack of 100 for about 17p/each; 3.5p/GB, but only usable once, and using it all involves playing with my CD tray 200 times.

      Or, I could get another HD; say, £40 for 500GB, or £80 for 1TB; 8p/GB. It's all completely accessible after plugging it in (and that's trivial with hot-swap bays), it's much faster for sequential and random access, it can all be reused and rewritten, and the drive can be repurposed for on-the-shelf backup or day to day use as I see fit.

      These days, the only use I have for optical media is OS installation; I have a pack of CD-RWs and DVD-RWs I can reuse, and that's it. My CD-R and DVD-R piles haven't been touched in years.
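
The per-gigabyte comparison above is easy to reproduce. This sketch uses only the prices quoted in the comment (17p per 4.7GB DVD, £40 for 500GB, £80 for 1TB) and a made-up helper name; it ignores drives, cases, and the time spent feeding discs.

```python
def pence_per_gb(price_pence: float, capacity_gb: float) -> float:
    """Raw media cost per gigabyte, in pence."""
    return price_pence / capacity_gb


if __name__ == "__main__":
    dvd = pence_per_gb(17, 4.7)          # one 4.7GB DVD at 17p
    hdd_500 = pence_per_gb(4000, 500)    # £40 for 500GB
    hdd_1tb = pence_per_gb(8000, 1000)   # £80 for 1TB
    print(f"DVD: {dvd:.1f}p/GB, 500GB HD: {hdd_500:.1f}p/GB, 1TB HD: {hdd_1tb:.1f}p/GB")
```

The DVD figure lands close to the comment's rough 3.5p/GB; the write-once media is about half the raw price, and the comment's argument is that reusability and convenience outweigh that gap.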

    155. Re:Carefully protected? by TheLink · · Score: 1

      That's fine if you only have a very few GB's to backup.

      If you have TBs to backup, it's a bad idea to use conventional optical disks. They're slow, flaky and low density.

      If you're a small business, buy a bunch of 500GB (or even 1TB) drives (they're cheaper per GB nowadays) and use those as your backup media. Just make sure you don't drop them ;).

      USD60+ for 500GB, USD1K buys you a fair bit of "media".

      --
    156. Re:Carefully protected? by mindstrm · · Score: 1

      I'm conflicted about this... I do understand the low-budget need for lots of space; however, if there is one area IT generally fails at, especially in small business, it is assessing risk.

      More spindles on smaller drives is more expensive, but it's more manageable, and therefore more reliable. If they need to store terabytes of data, they need to store it correctly - the trick is in defining and selling what "correct" is.

    157. Re:Carefully protected? by Eivind · · Score: 1

      Making offsite copies using rsync is a perfectly adequate way of doing backups assuming it's competently done.

      *not* making backups typically means the data exists in a single copy on a raid-5 array and *only* there. If the array is toast, the data is gone.

      People who do that do indeed deserve what is coming to them. It's not anywhere close to what you're doing though; if one of your arrays were toast, you'd have a complete copy at each of the other offices, and another at the linux-nas. That's completely different.

      There's no rule that backups must be on tape. There's just a rule that they must EXIST, and that it's better if they're not physically in the same building as the primary data.

    158. Re:Carefully protected? by Anonymous Coward · · Score: 0

      SRS? Who actually has 12TB of irreplaceable data on their home computer. With the speed that you can download all your pirated music and videos that you haven't watched or heard in months, because there is 12TB of them, there isn't a need to back up all that crap. Buy a couple of external hard drives if you are really paranoid and back up your character sheets and family photos on there.

    159. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Highly doubtful it's in HD. It's probably DVD. In that case, just burn extra copies with blank TY discs.

      If you got money for a blu-ray drive and HD camcorder, well money isn't really a problem is it? Sounds like a problem for 1% of the population.

    160. Re:Carefully protected? by TheLink · · Score: 1

      How many hours worth of home movies? I doubt Joe the Plumber records home videos in HD.

      A 2 hour DVD is considered decent quality and is about 5GB.

      A 120GB drive can hold about 48 hours worth of movies.

      I bet most of those home movies aren't longer than 30 minutes.

      For perspective, a 500GB drive is about USD60.

      Normal people don't think about backups much. I had to remind a friend to backup his vacation photos etc.

      --
    161. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I'd like to know what's the "high quality" stuff you bought. Almost all retail brands don't make their own media.

    162. Re:Carefully protected? by Firrenzi · · Score: 1

      Mine is my private evil island

      Do you zip it? ZIP!

      oh sorry, you're not scott.

      --
      The Tao that can be named is not the Tao
    163. Re:Carefully protected? by drsmithy · · Score: 1

      When tour groups of non-tech people come by the server room, I used to emphasize reliability by pulling a hard disk out of a running server, hand it to them, and then put it back in the server. The server doesn't skip a beat (and these were common off-the-shelf Dell rackmount servers costing $2,500 or so).

      Presumably you only did this until someone competent saw it happen and had you fired?

    164. Re:Carefully protected? by JayAEU · · Score: 1

      I'm hoping that I can do a monthly level 0 to the NAS, and then keep 1 month's worth of incremental backups after that.

      Since, as you say, there's nothing to protect us from the "dumb user armed with a Delete key" beyond that 1 month safe window, I'm amending the QA/QC and staff manuals to basically explain to the management team that if a user does this, we're screwed, but this is all we can afford so you have to live with that.

      Well, you might want to have a closer look at software like BackupPC (http://backuppc.sourceforge.net/), which basically does backups using rsync or SMB and keeps versions of all files back as long as you like. All you need is a moderately powered PC running Linux and a few disks.

      It's a very space-efficient system, which even allows you to dump the stored backups to DVD or another portable hard disk for offsite keeping.

    165. Re:Carefully protected? by dbIII · · Score: 4, Funny

      I'll tell you that I was pretty serious when Fedex put a forklift tine through the front of a server they were shipping.

    166. Re:Carefully protected? by mlts · · Score: 1

      It's sad to say, I'm in the same boat. As a student graduating this semester, I don't have the cash for a modern (read $2000+) tape drive, so instead, I use a good backup program that can support moving data between backup volumes. I then back up to a Samba server which has a software RAID array. Then, every quarter or so, I buy 200-300 DVD+Rs (the single-layer kind, as the double-layer ones are still a tad expensive for the capacity increase), then burn all the data on the backup volumes on that server over a couple of weeks. Those go offsite immediately, and stay offsite.

      For me, a backup doesn't just consist of critical documents (which also are copied onto dedicated media and saved), but the ability to completely bare metal restore a machine. This has saved me probably hours, if not days, of time when I have had hard disk failure, and restoring the machine (after replacing the drive) consists of booting a restore CD, partitioning, mounting the samba server, clicking "restore all", restarting, and calling it done. Also, there are always files that are outside the document directories that are needed, and sometimes lost when just backing up only home directories. CD key files and license keys are good examples of this.

      I'm sure some company with the ability to do high volume sales would rake in the bucks if they made a backup device for the SOHO market, especially if the media had a long shelf life and a decent capacity (250GB native, preferably 500-750GB, so a modern computer can be completely backed up to one cartridge.) Unfortunately, hard disks are not it. Drop a tape, dust it off, put it back on the shelf, nobody will be the wiser. Drop a hard disk, you have a good chance of kissing all the data on it goodbye. Especially if the platters are ceramic.

      Another issue with hard disks is that (for the most part) they can't be set read-only via a physical switch. Even the most redundant RAID system is going to lose its data if some malware decides to zap it. A backup system needs a way to make media read-only, so if one is restoring a file on a compromised machine, the backup tape will not be damaged.

    167. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Of course, disk engineering and tape engineering are solving different problems. Tape heads always touch the actual tape while the disk heads do not touch the platter unless bumped. Tape also has more real estate than disk, but tape needs a *lot* more error correction because cartridges are expected to last decades and still have data easily retrievable from them.

      IMHO, considering physical constraints you mentioned above, magnetic tape is possibly a technological dead end. It basically boils down to: "they should design special purpose hard discs meant for backup only".

    168. Re:Carefully protected? by dbIII · · Score: 1
      You don't actually have to get everything.

      It really depends upon the data. I have over 5TB of stuff that is never backed up as a whole, just on a file basis when necessary (usually every week for an active project). All of this data was generated from original information on tape, CD, DVD or portable drive. The transformations that were applied to the data are on other drives and add up to around 200GB, so they are easy to back up on tape and mirror onto other systems. When a disaster happens it would take a couple of days to regenerate some of the data (assuming that project was never backed up) and a week or two to get absolutely everything back. However, quite a lot of it would not be needed for months and most useful stuff could be back in hours. The nature of plenty of disk storage is you get it filled with a lot of unnecessary stuff.

      The important thing is not necessarily to back up everything but to have a plan. However, be sure to cover contingencies like people getting incredibly angry when the mp3 collection they tried to hide in /tmp doesn't come back, and other stupidities. If you don't know for sure that something is unimportant, make sure you've got it.

    169. Re:Carefully protected? by somersault · · Score: 1

      Yeah, I think when he talks about budget he means they don't have the budget for a multiloading tape system that would be necessary to run that kind of script. Scripting is the easy part; convincing management to give him funds sounds like the awkward one. If the company can't afford to back up their data, it's like someone buying a Ferrari but not leaving enough budget for the insurance, so in reality they can't really afford a Ferrari.

      We've had 2 different Dell fileservers and both ended up failing when trying to recover after a disk failure. Both times we had stuff that wasn't on the tape backup list for some reason or another, and had to pay to get the disks recovered -_- It's far cheaper and simpler to recover a single disk than an entire RAID array. So to my mind, unless you're looking for the performance, using a large single disk and tape backups is the safest, most cost-effective backup option for a small business.

      --
      which is totally what she said
    170. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I was implying that if you're adminning a Windows server, you need to pray more. "Funny" was the mod I was aiming for but I'll take an "Insightful", if that's all that's on offer :P

    171. Re:Carefully protected? by o'reor · · Score: 0

      RAID is NOT a back-up solution. RAID is a "oh shit my hard drive failed" solution.

      That is true, but consider this: if a RAID array was built initially with 5 brand-new disks, then one of the disks fails after 5 years, what happens? If you try to rebuild your data from the other (still working) disks, you will put these disks under unusual stress, therefore greatly increasing the risk of one more failing (since all of them are now close to, or past, the MTTF). Then your array turns into a huge brick.

      So what are you going to do? Are you going to replace them at a rate of one every year to try and spread the risks? Are you going to buy twice the number of disks you need from the start, since when they fail after 4 or 5 years, you are not likely to find the very same model on the market any longer?

      When you take everything into account, RAID-10 or RAID-1 sounds much more reasonable in terms of both risks and cost.

      --
      In Soviet Russia, our new overlords are belong to all your base.
    172. Re:Carefully protected? by OzRoy · · Score: 1

      How are you doing the replication?

      At the beginning of last year we had a similar sort of problem, but they wanted the data to be kept as closely in sync between the sites as possible. With the amount of data we had, Rsync and Unison were too slow.

      The only tools we could find that would do almost everything we wanted were for windows. We ended up going for a Windows file server and using DFS-R.

      Were you able to find anything decent for Linux?

    173. Re:Carefully protected? by Kjella · · Score: 1

      Half the problem is technology, the other is the chicken-and-egg that consumers don't use them. Small businesses in general don't tend to create much data unless they're in the media business, you can store a lot of office documents in a few gigabytes. That means that if you buy a tape drive you're considered to be Big Business with lots and lots of Important Data and so you should also be able to pay the Big Bucks.

      Truth be told, most consumers don't care about single-bit failures in most of their data. You know those 2342 pictures you got of aunt Hilda? Well gee, one of those broke for some reason and it's a bit of a shame but you'll live. The loss isn't anywhere near linear, even if you lost 90% that is much, much better than losing all of them. Applications can be reinstalled, it's only important documents that can really bite you but by then the risk/reward ratio just isn't there.

      What most people want is backup, be it multiple copies on different disks internally, multiple copies using an external disk (on or offline, on or offsite) or multiple copies using a remote site (over internet or external disk). Most of it doesn't even need any fancy incremental backup or versioning; for example, digicam pics just need to be copied and it's done. Checksumming and resolving bit conflicts between these copies just isn't an issue as far as I've experienced. Pretty much everyone I've met has been like "OMG my disk died I'm so screwed".

      RAID1 + 2 external disks that swap on being offsite (bring fresh copy to offsite and swap, not the other way around, so one disk is always offsite) is usually enough. If you've got really important documents on it, just make a few copies in the file system so if anything bad happens, you'll find a good version even if the bit-corrupted copy has spread to all your backups. This is FAR from an enterprise-class setup. But it's better than most people have and easy to understand, with no advanced tricks home users would have to understand. Plus it keeps a little incentive that "hmm, I'm going to where the other disk is stored, maybe I should make a fresh backup?" so the external copy doesn't collect too much dust.

      --
      Live today, because you never know what tomorrow brings
    174. Re:Carefully protected? by FromellaSlob · · Score: 1

      a company that has rapidly expanded to the point where they need a full time sysad, and then felt the kaboom of the subprime mortgage debacle, since they consult to the property market.

      That pretty much covers the "crappy company with limited growth potential" angle.

    175. Re:Carefully protected? by SlashDev · · Score: 1

      Backups are not the same as RAID. RAID simply protects against disk failures, a backup tape protects against failures AND restores data that has been deleted. RAID cannot restore deleted data.

      --

      TOP DSLR Cameras Reviews of the top DSLRs
    176. Re:Carefully protected? by merauder · · Score: 1

      That's why they have archival-quality CDs/DVDs, which have a shelf life of 100 or more years.

      --

      ..and knowing is half the battle.

    177. Re:Carefully protected? by FireFury03 · · Score: 1

      As it stands, the home user that actually sets up a RAID 5 raid is in the top percentile for actually giving a crap about home data.

      RAID is not, nor has it ever been a replacement for backups. RAID provides continuity of service in the event of certain types of hardware failure, backups provide a way of recovering your data in the event of a much wider range of hardware, software or user failures. If you don't need continuity of service then you are far better off investing the money you would've spent on RAID into a backup solution instead.

      RAID will not protect you from many causes of data loss, such as some rogue program/user trashing your data, or your PSU going bang and taking out all your drives (yes, I've seen this happen - breaking it to a customer that they have lost all their data because they ignored advice about taking backups since they had a RAID is great fun...).

      I question your assertion that home users need to back up 12TB of data - very few home users even *have* 12TB of data, let alone 12TB of important data. My backups fit on about 3 DVDs - I don't bother to back up stuff I don't need to back up. Things like my music collection can be re-ripped from CD, yes it'd be a pain but I have decided that I would prefer risking a re-rip rather than having to back it up. Similarly, my operating systems and applications do not need to be backed up since in the event of a failure I would reinstall the software from scratch rather than recover from a backup. The *vast* majority of data on my disks is stuff I haven't created myself and can just be downloaded again.

    178. Re:Carefully protected? by amRadioHed · · Score: 1

      It seems like a great idea, but I tried it several years back and never was able to find a way to overcome the mischievous gnome issue. If you manage to keep them off of your data I think you'd have a winner.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    179. Re:Carefully protected? by rikkards · · Score: 1

      That's ok. I had a loadmaster put 4 LAV tires on top of a small server rack.
      Fortunately it was in a Pelican case but it could have been bad. Servers had been built and configured. Where they were going doesn't exactly have a local vendor rep. If it did, he would definitely be carrying an AK.

    180. Re:Carefully protected? by Sobrique · · Score: 1
      Not as easy as you think actually.

      Even large storage arrays are using 'really big' disks, which means long rebuild times and less 'duplication'. I've had a controller fault in one of my arrays which meant 16 of my 300GB drives were offline for 24 hours or so. Nothing died because of that, but with a failure rate of 1-2 drives per month, the odds of a double-disk RAID 5 failure were significantly higher than I liked.

      Our replica array was fine, so we weren't faced with data loss, but mean time between failures becomes much more significant when you're talking about multi-hour drive rebuilds.
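      As a rough illustration of why the rebuild window matters, here is a minimal sketch that assumes drive failures arrive independently at a constant rate (a Poisson model, which the comment does not state). It uses only the two figures given above, "1-2 drives per month" and "24 hours or so"; the function name is hypothetical, and the result is the chance of any further failure across the whole array, not specifically in the same RAID group.

```python
import math

def p_failure_in_window(failures_per_month, window_hours, hours_per_month=30 * 24):
    """Probability of at least one drive failure during the window,
    assuming failures form a Poisson process with a constant rate."""
    expected = failures_per_month * window_hours / hours_per_month
    return -math.expm1(-expected)  # equals 1 - exp(-expected)

for rate in (1, 2):
    print(f"{rate} failure(s)/month, 24 h window: {p_failure_in_window(rate, 24):.1%}")
```

      Under these assumptions the figure lands at a few percent per 24-hour window, and it grows almost linearly as the rebuild time stretches out.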

    181. Re:Carefully protected? by Welsh+Dwarf · · Score: 1

      Where I work, we keep them forever. We don't have the volumes that the GP has (we handle 100GB full backups monthly).

      All full backups are stored on an external disk; every 3 months the disk is changed, so nothing is ever erased.

      Incrementals/differentials cover changes during the month.

      If you only noticed after 6 weeks that you screwed up, either your work was on the internal SVN Document repo, or you have to sort it out yourself by reverting to the last full.

      For the GP, this kind of setup would run at 4*1TB Disks monthly, or at 2000EUR monthly. That's the cost of peace of mind.

      --
      Ask 8 slackers a question, get 10 answers (a citation, but I can't remember from who)
    182. Re:Carefully protected? by Capt+James+McCarthy · · Score: 1

      What about my situation, where I have to store ~ 1TB of unique data per office in 3 offices that are roughly 1000 km apart and I have to keep everything backed up with a budget of less than ~AU$ 4000 IN TOTAL?

      Well, it appears as though your employer has put a price on his/her data. That's fine as long as your employer understands that as the cost goes down, the chance of data loss goes up. So the real issue is that the ones who write the checks should understand how much value they place on their data.

      --
      There are no loopholes. It's either legal or it's not.
    183. Re:Carefully protected? by Capt+James+McCarthy · · Score: 1

      Yea, because we all backup 12TB of home data to an offsite location.

      Duh. It's called Pr0n.

      --
      There are no loopholes. It's either legal or it's not.
    184. Re:Carefully protected? by theaveng · · Score: 1, Interesting

      >>>As the RAID controller is busily reading through those 6 disks to reconstruct the data from the failed drive, it is almost certain it will see an [unrecoverable read error].
      >>>

      This is a load of crap. The computer wouldn't just give up. It would make a second attempt to read that bit, and do so successfully. One bad read does not necessarily mean that spot on the disc is permanently damaged.

      Furthermore, even if that bit is lost, it depends upon what kind of data was damaged. If it's an MP3 or MPEG or JPEG, one lost bit is not going to be visible to the viewer. The human ear and eye are not sensitive enough to detect that small an error, especially with lossy-compressed sounds and images. If it's a Word doc, then you might get the word "progrbm" instead of "program". The document is still usable even with that misspelling. I would hope RAID controllers are intelligent enough to not throw away 99.999999999999% of the data and declare it "unrecoverable" just because of one lost bit.

      --
      FOX NEWS.com should be BANNED from television and internet. Have the Congress take it over and give us Truespeak.
    185. Re:Carefully protected? by dropadrop · · Score: 1

      Exactly. The important thing is that all risks are clearly presented. The people making the decisions should be able to calculate a rough estimate of the expenses to the business in case of a problem (service unavailable for "X" time means services "Y" and "Z" are also unavailable and workers "a, b, c, d" can't do their job. This will cost the company the workers' salaries, lost sales, lost clients due to not being able to fulfill contracts, etc.). If they make the calculations based on this and decide not to proceed, they at least know what is coming to them.

      Being a small company, it's probably your job to identify the implications of each device's failure / downtime and present them clearly to the management. Then they can base their decisions on that.

    186. Re:Carefully protected? by mikael_j · · Score: 1

      1920*1080*3 = 6,220,800 bytes per frame; at 30fps that's just under 180 MiB per second when doing uncompressed. Even uncompressed PAL video at 25 fps is 720*576*25*3 ~= 29.66 MiB/s, and I have personally had projects where I've been rendering several minutes of uncompressed PAL frames from Maya. Obviously this is a bit of a special case, but my point is that when you're working with video the default tends to be to use uncompressed video (even if it's just the scratch file in your video editing app's saved projects it can still be a lot).
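      The arithmetic is easy to check with a short sketch. It assumes 3 bytes per pixel (8-bit RGB) as in the figures above and takes 1 MiB as 1024*1024 bytes; the function name is made up for illustration.

```python
MIB = 1024 * 1024

def uncompressed_rate(width, height, fps, bytes_per_pixel=3):
    """Return (bytes per frame, MiB per second) for uncompressed video."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes, frame_bytes * fps / MIB

hd_frame, hd_rate = uncompressed_rate(1920, 1080, 30)
pal_frame, pal_rate = uncompressed_rate(720, 576, 25)
print(f"1080p at 30 fps: {hd_frame:,} bytes/frame, {hd_rate:.2f} MiB/s")
print(f"PAL at 25 fps:   {pal_frame:,} bytes/frame, {pal_rate:.2f} MiB/s")
```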

      /Mikael

      --
      Greylisting is to SMTP as NAT is to IPv4
    187. Re:Carefully protected? by Anonymous Coward · · Score: 0

      OK, try this: EVault, Iron Mountain, etc. Swap your flying death monkeys for encrypted data packets. The real bitch is the initial backup, which can take months over small internet pipes!

    188. Re:Carefully protected? by numbski · · Score: 1

      Okay, I'll bite.

      I have a (recently failed) company that was network engineering and OSS development, so feel free to razz me on the "failed" part.

      That out of the way - ideally you want 2 offsite backups at locations geographically separated enough that should a catastrophe occur, the data is "safe". We had a Coraid ATA over Ethernet setup (which is for sale now, btw), and something like 15TB of disk space accessible over network. Great.

      Now what? I had another in Montreal (I'm in St. Louis), and I did what I could to keep both a real-time data sync and an incremental off-site, but dang it's hard. Unless you have the budget for a monstrosity of a tape jukebox (which we didn't... no amount of wiggle room would have fixed that), it felt like such a futile cause. The only thing I could manage to do was use an external FireWire drive, and *try* to push the daily incremental off onto that drive to take with me. Sometimes it would work, others it wouldn't.

      So what do you do in the world of open source that fixes this issue?

      --

      Karma: Chameleon (mostly due to the fact that you come and go).

    189. Re:Carefully protected? by farnsaw · · Score: 1

      Backup location    : Protection Against
      Drive Mirroring    : Single Drive Failure
      External HD        : Server Meltdown
      Safety Deposit     : Building burns down
      Different City     : Natural Disaster (Flood / Earthquake)
      Different State    : Natural Disaster (Hurricane)
      Different Country  : War
      Orbit              : Global Thermonuclear War
      Voyager Spacecraft : Alien Invasion

      --
      "Computer Scientists can count to 1024 on their fingers" (non-mutant, non-mutilated, human computer scientists)
    190. Re:Carefully protected? by Sobrique · · Score: 1
      Ah that's the thing I love about IT. Starting with someone coming to you with a spec for something fantastic, and watching their dreams crumble into dust as you point out that everything is possible, but THIS thing will cost them.

      But you're correct that backups are one of the places where we can pretty reliably get funding. People don't like losing data, especially when there are regulations that require them to have it, or get fined a lot.

      But that's actually kind of odd, as there's such a huge mountain of 'stale' data accumulating on our filesystems. I wouldn't be surprised to find I could implement a rolling delete of anything that hasn't been accessed in a year, and no one would ever notice.

    191. Re:Carefully protected? by Gr8Apes · · Score: 1

      The post alluded to discussed using a set of single large disks to back up RAID arrays in rotation. That would come in under his budget, and meet all his backup requirements.

      --
      The cesspool just got a check and balance.
    192. Re:Carefully protected? by Anonymous Coward · · Score: 0

      With the amount of data we had, Rsync and Unison were too slow.

      Don't you mean your link between sites is too slow? Using rsync comes pretty close to just moving the minimum necessary bits.

    193. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Yea, but DVD is transient crap. How long will those last? A few years?

      DVDs cost about... nothing. It's insanely trivial to burn a new copy of your data even every 6 months, if you are worried about it.

      Tapes are incredibly inconvenient for "home users". If nothing else, you can't walk into your local computer shop and buy a tape drive (or backup tapes). A quick search shows tape drives cost between $500 and $4000. You can buy recordable DVDs in most supermarkets now, and you can buy DVD-burning drives for less than $50.

      It'd probably be cheaper (although utterly inconvenient) to encrypt your important data well, burn it to a few hundred DVDs and give them to a few hundred people..

    194. Re:Carefully protected? by Whiney+Mac+Fanboy · · Score: 1

      You could always pay Flickr $25/year to upload unlimited photos to them.

      Sure, I could - but the flickr interface is a PITA compared to my way....

      --
      There are shills on slashdot. Apparently, I'm one of them.
    195. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I actually run intercontinental backups from my home server to a server I have back with my parents on the other side of the world.

      In fact, I then set up a read-only share to allow my parents' Windows machine to see the family photos directories. It avoids clogging up email and shrinking photos to avoid mail limits on their end, and all they have to do is browse to the share. Works well.

    196. Re:Carefully protected? by YourExperiment · · Score: 1

      Work out what files are really important to you. I did this a while back, and I discovered there were only a few GB worth that I'd be really annoyed to lose (out of 1.5TB of total storage). Now I back up these files to an 8GB USB key every day, and carry them with me everywhere I go. With encryption courtesy of TrueCrypt, I don't even have to worry about losing it.

    197. Re:Carefully protected? by somersault · · Score: 1

      Yeah, I guess I read it wrong. I do something like that myself in addition to tape backups, but I don't take the disk offsite; it's more for convenience, so that we can just set up the disk as a temporary network share if the RAID array needs rebuilding or whatever.

      --
      which is totally what she said
    198. Re:Carefully protected? by boomer_rehfield · · Score: 1

      >I'm sorry, but I'm getting seriously tired of people looking down from the pedestal of how it "ought" to be done, how you do it at work, how you would do it if you had 20k to blow on a backup solution, and trying to apply that to the home user.

      No one is looking down from their pedestal, because this _is_ how it should be done. This isn't geared to the home user, unless you have a second house that you consider an 'offsite location.' Nowhere in the article does it discuss home users, and if you have a 12TB RAID 5 sitting under your desk at home, well, I don't know what to say about that other than it's not the norm. Or even close.

      --
      Carpe Canem - Seize the Dog
    199. Re:Carefully protected? by marcosdumay · · Score: 1

      Disk-to-disk backup is quite cheap nowadays, and will also protect you against an accidental 'rm * .o'. RAID is still for increasing uptime, access speed and partition sizes.

    200. Re:Carefully protected? by dreamchaser · · Score: 1

      My really important data goes onto a disk monthly and is stored in my safe deposit box at the bank, along with other important items (original will, a few heirlooms, stock option paperwork [toilet paper these days], etc.)

    201. Re:Carefully protected? by domatic · · Score: 1

      Rsync IS efficient with the bare transfer of the data. I suspect his problem is with the time rsync can take to find all of the changed files in a large set of directories and files. I mitigate this somewhat by rsyncing sets of subdirectories.
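      One way to split a big tree into per-subdirectory rsync jobs can be sketched as follows. This is an illustration, not the poster's actual setup: the helper names and the destination string are hypothetical, it only handles top-level subdirectories (files sitting directly in the root are not covered), and it defaults to printing the commands rather than running them. The only rsync option used is `-a` (archive mode).

```python
import os
import subprocess

def rsync_commands(src_root, dest):
    """Build one `rsync -a` command per top-level subdirectory of src_root."""
    commands = []
    with os.scandir(src_root) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_dir(follow_symlinks=False):
                # Trailing slash: copy the directory's contents into dest/<name>/
                commands.append(
                    ["rsync", "-a", entry.path + "/", f"{dest}/{entry.name}/"]
                )
    return commands

def run_all(commands, dry_run=True):
    """Print each command, or run them one after another if dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

      Each job then only has to walk its own subtree, so the sets can be scheduled at different times or frequencies, for example `run_all(rsync_commands("/data", "backuphost:/mirror"))` with both paths being placeholders.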

    202. Re:Carefully protected? by Lumpy · · Score: 1

      Your data is not worth more than AU$4000, so why bother backing it up?

      Honestly, if your management thinks your data is worthless then why are you busting your chops to back it up?

      Or do you FAIL to educate and inform management that the backup systems are inadequate and will fail and that they will have data loss? Your professional integrity is not rubber-banding things together; your professionalism is informing management that the system in place is inadequate and will fail.

      --
      Do not look at laser with remaining good eye.
    203. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Long term? Flash memory is not permanent. In fact it degrades faster than DVD.

    204. Re:Carefully protected? by Overzeetop · · Score: 1

      I have about 2.5TB of home data. Of course, at least 2.2TB of that is backed up on commercially pressed CDs and DVDs, and another 0.2TB is in media without originals (but is available on the net, carefully retained by others with similar tastes in music and video). If my drive fails, I'm looking at several days to recover from the originals, but at least the unrecoverable "core" things I have to really worry about backing up can be put on about 4-5 DVD-DLs. I don't, of course. I have two external drives I swap between home and the office, each carrying a recent backup of both. I have also uploaded to ADrive, though I have my doubts that they are really a long-term, reliable solution.

      As for the main array, it's something like RAID4, if I read the unRaid docs properly. Single parity drive; one drive failure gets repaired, two failed drives lose one drive's worth of data. Actually, now that I come to think of it, I'm not sure which drive I would lose if I lost two data drives, or if I get a choice. BRB...

      --
      Is it just my observation, or are there way too many stupid people in the world?
    205. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I use Carbonite to back up my data offsite. I only use it for things I have created (documents, pictures, etc.), not for all media. It's easy to use and gives me peace of mind.

      http://www.carbonite.com/

      Try it, you might like it!

    206. Re:Carefully protected? by Ephemeriis · · Score: 1

      First off, I have a hard time seeing a home user coming up with 12 TB of data anytime soon.

      How about 250 GB of data? That's the size of entry-level HDs nowadays. Burn it to 15 DVDs?

      Actually, that's not a bad idea. And it isn't all that hard to do. Decent burner software like Nero will split the backup and ask for additional discs as you go. A relatively fast burner isn't that expensive, and you can keep working while your discs burn. I'll routinely kick it off on a Saturday morning while I'm doing housework, checking email, reading the news, and playing games. I just have to swap out discs periodically. Not a big deal. Takes a little while, but I don't have to give it my undivided attention.

      You could dump it to a NAS and then unplug the thing and stick it in a safe deposit box. Or you can print everything out and mail it to your uncle. Or you can burn a pile of DVDs and hide them throughout the woods. But unless you've got your data offsite it is not protected.

      None of these stupid suggestions are online, so they're useless. For home users, the ONLY realistic offsite backup solution is online backup over the internet. And it's expensive and time-consuming for large amounts of data.

      I was being sarcastic... But I don't see why it is unreasonable to expect home users to make use of an offline backup. It isn't much different than keeping copies of birth certificates, titles, deeds, treasured photos, and whatever else in a safe deposit box. I mean...everyone has a safe deposit box, right? I make a trip to the bank at least once a week to drop something in my safe deposit box. How else do you ensure that in the event of a fire/burglary/whatever you can go on with your life? Important data doesn't have to be digital.

      If you've got more data you could get yourself an external HDD, or a few USB flash drives, or a cheap NAS and dump the data to it.

      Backing up to an external HDD or NAS is backing up hard disks with other hard disks at the same location. I fail to see how this is significantly different from RAID (except slower).

      Sure, you're still vulnerable to a HDD failure, which is why I suggested a NAS or a few USB flash drives. But it gets an offline/offsite copy of your data, which makes it harder to accidentally delete something. And it will also protect you from fire/burglary/whatever. Yes, it's a pain... but this is important data, right?

      RAID + as much internet backup as you can afford is THE solution for home users.

      I wouldn't know. I've seen some on-line backup solutions... I've supported a client or two with them... But I've never made use of them myself.

      I've got about 100 GB of data that is actually important, which gets backed up about once a week to DVD and stuffed in a safe deposit box. I've got another 200 GB or so of data that isn't really important, but I'd rather not lose it (save games, movies, music, etc.) - which gets backed up to DVD every couple months.

      I've been playing with the idea of buying myself a tape drive. They aren't cheap... About $1,000 for an LTO1... But it would make my backups a lot easier, and therefore more frequent, and therefore I'd feel safer about the whole thing. An LTO1 only holds about 200 GB compressed, which I've already exceeded, but you can always throw in a second (or third, or fourth...) tape.

      --
      "Work is the curse of the drinking classes." -Oscar Wilde
    207. Re:Carefully protected? by Limecron · · Score: 3, Informative

      "Unrecoverable" implies that it is not possible to read the data anymore.

      Also, data on the disk is addressed by sectors, so if one fails, this means you typically have at least 512 bytes lost.

      It's true that even that might not completely break some kind of large media file, but you have to remember that RAID5 is a layer below your file system data, so if an error occurs when it's trying to rebuild itself, it will not be able to give you your data back.

      You might be able to recover a lot of your data from an error of this kind, but don't count on the RAID implementation to do it for you.
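
      For a rough sense of the odds the article is talking about, here is a back-of-the-envelope sketch. It assumes a drive spec of one unrecoverable read error per 10^14 bits read (a figure commonly printed on consumer SATA data sheets, not something stated in this thread), independent errors, and decimal terabytes. The function name is made up for illustration.

```python
import math

def p_rebuild_hits_ure(bytes_to_read, bits_per_error=1e14):
    """Chance of at least one unrecoverable read error while reading
    bytes_to_read, assuming independent errors at 1 per bits_per_error bits."""
    bits = bytes_to_read * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way for tiny p
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))

# Rebuilding a 7-drive RAID 5 of 2 TB disks means reading the 6 survivors.
survivors_bytes = 6 * 2e12
print(f"{p_rebuild_hits_ure(survivors_bytes):.0%}")
```

      Under those assumptions the chance of hitting at least one bad sector during the rebuild comes out around 62%, which is short of "almost certain" but still worse than a coin flip. With a 10^15 spec it drops to roughly 9%.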

    208. Re:Carefully protected? by Sobrique · · Score: 1
      LTO4 will do 800GB of raw capacity (vendors quote between 2:1 and 3:1 with compression).

      It'll also do anything between 30MB/s and 120MB/s on a decent drive.

      They're at a price that's more per gig than you'd pay for a cheap SATA drive for your home system, but less than you'd pay for the similar amount of high end drives.

      I think they're keeping up quite nicely actually, and we still use it extensively. But for home use? No, a cheap SATA drive is 'good enough' as my backup. But then, I don't cry too much if my porn and mp3 archive is partially unrecoverable, but my employer has stuff that it considers significantly more valuable.

    209. Re:Carefully protected? by plumby · · Score: 1

      And even 100GB can be quite daunting to back up for a home user at the "this should only be unavailable for more than a few minutes in the event of a very extended power outage or the house catching fire" level.

      Few individuals, I suspect, have 100GB of data that they can't survive without for a few minutes when their house is on fire. Plenty will have 100GB of "I really need this data back at some point in the next few weeks" type stuff (photos, music etc).

      I've got several 10s of GB of photos (it was around 40 last time I checked, but I've taken a fair few since then) in that category, and for that I use Carbonite (pretty cheap, very easy to run). There's a few other services available like that, and if your data really is precious it's certainly worth the money.

    210. Re:Carefully protected? by igb · · Score: 1

      It's been moderated `funny', but I recently plugged eight 2GB USB memory sticks into a pair of hubs and ran ZFS over it (RAID 0+1, roughly). That's 8GB of storage that would resist most failure modes, I think. Detecting bit-flips requires checksums, which of course ZFS offers.
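
      To show why the checksum matters: a plain mirror that disagrees with itself can't tell you which side is right, but a checksum stored apart from the data can. This is only a toy sketch of that idea, not how ZFS is actually implemented. SHA-256 from the standard library stands in for whatever checksum the pool uses, and the function names are made up.

```python
import hashlib

def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

def read_mirrored(copies, expected):
    """Return the first copy whose checksum matches, plus the indexes of bad copies."""
    bad = [i for i, c in enumerate(copies) if checksum(c) != expected]
    good = [c for c in copies if checksum(c) == expected]
    if not good:
        raise IOError("all copies fail their checksum")
    return good[0], bad

original = b"family photos, 2008"
stored_sum = checksum(original)          # kept separately from the data

flipped = bytearray(original)
flipped[0] ^= 0x01                       # a single bit-flip on one stick
data, bad = read_mirrored([bytes(flipped), original], stored_sum)
print(data == original, bad)             # True [0]
```

      The read still succeeds from the healthy stick, and the bad copy is identified so it could be rewritten.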

    211. Re:Carefully protected? by Mattsson · · Score: 1

      A one to one, off site disk-duplication is only a disaster-backup.
      You can still make an archive-backup to disk, but you'll use as much data-space as with a tape-backup.
      If you need fifty 800GB tapes to keep a good tape-backup, you'll need 40TB disk-space to be sure that you can keep the same backup-scheme to disk.

      --
      /.Mattsson - My native language is not English, so please don't whine over linguistic errors. (That's lame anyway...)
    212. Re:Carefully protected? by cdrudge · · Score: 1

      Yeah, because Joe The Plumber needs to record a 3 hour epic HD movie?

      No, Joe the Plumber doesn't need to record a 3 hour epic HD movie. But he might want to keep 3 hours worth of memories of his kids growing up. Or his wedding. Or a family vacation. Or some combination of thousands of other life events.

    213. Re:Carefully protected? by techess · · Score: 5, Interesting

      I always love it when Fed-Ex destroys something and then tries to hide it. One day I walked past the shipping office and I smelled the very strong odor of hydraulic oil coming from the room. I take a look inside since we shouldn't be receiving anything that has hydraulic oil in it. I found a bunch of boxes with the local Detroit Airport logo all over them and sealed with DET labeled tape. The cardboard was completely soaked through with the oil.

      I carefully opened one of the boxes and found it contained servers! It appears that the original boxes got in some sort of accident at the airport and were completely soaked. At the airport Fed-Ex or the baggage handlers did us a "favor" and re-boxed everything. The servers were so coated (and filled) that even the new boxes were completely soaked through and the bottoms of the boxes were starting to pull apart. The Fed-Ex guy (so we wouldn't refuse them) dropped them off at lunch and then got some random person in the hall to sign off on it.

      We had to pay for new servers to be built ASAP and shipped overnight (UPS this time) at huge cost for us. Since someone had signed off on the package we then had a very long fight to get Fed-Ex to pay for the equipment they destroyed. We never got the extra cost for the overnight shipping and the rush build reimbursed.

      --
      Don't anthropomorphize computers. They *hate* that.
    214. Re:Carefully protected? by Jellybob · · Score: 1

      So either you need tape in the sort of quantity that the private user cannot justify, or you're going to have to spring for a hefty RAID and arrange for another one like it as a backup. Offsite if you're lucky, but it's probably just going to be out in your garage/basement/tool shed.

      Or you could use a service like S3. If you're really paranoid about Amazon going out of business, or losing your data, pick four different services and back up to all of them.

    215. Re:Carefully protected? by QuantumRiff · · Score: 1

      A point of seriousness to your humor... Sun has a video demonstrating ZFS using a bunch of USB Flash drives plugged into a system. Pretty cool!

      --

      What are we going to do tonight Brain?
    216. Re:Carefully protected? by rwyoder · · Score: 1

      Well, i guess i'm crazy, i have 3TB of space on my home PC, and no way to back it all up offsite. I do have some important folders from one drive automatically copy to another drive periodically, so if one drive dies the other will be okay, but if i lose them both or the place burns down or i get a nasty virus, it's all going to hell. Most of my space is taken up by pirated... err... backed up... HD movies. And porn, lots of porn...

      Ummm...if you're looking for offsite storage, you just lemme know, m'kay?

    217. Re:Carefully protected? by Sangui5 · · Score: 1

      You could try upline; they have plans starting at $60/yr: https://www.upline.com/plans/index.shtml

    218. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I would suggest holographic storage disks as a viable answer. Sadly, they are quite expensive at this time, but then again so were CDs when first introduced. I paid $800 for my first Teac 1x CD Burner and about two bucks each for the blank media.

      Holostorage hasn't even made it to that point yet, but it is likely to do so before the RAID-6 general failure times predicted by TFA. Eventually we will be looking at buying $10 storage media with faster-than-disk read/write speed in the 1-4TB per disc range.

      Furthermore, most of these holostorage companies use something like RAID5 internally on the disc itself, so all data is written to more than one location or can be regenerated in some way. Without the redundancy we'd be looking at 10TB per disc.

      With flash drives taking over for hard drives in the fast access department, and hard drives moving into the large storage realm previously occupied by massive tape libraries, we definitely need either significant advances in tape or in a new format like HVD to fill the role of long term archival storage.

      I also remain very skeptical that some simple mathematical improvements to redundancy algorithms wouldn't be able to solve this RAID failure problem permanently.

    219. Re:Carefully protected? by Lukey+Boy · · Score: 1

      I replied to another post, but I still prefer local methods for backups because there's an upfront cost for tapes or DVDs but no monthly cost, and when backing up and restoring large amounts of data I don't have to do it over the Internet.

    220. Re:Carefully protected? by theaveng · · Score: 1

      "Unrecoverable" is a misused word by the original article. "Unreadable" is the correct terminology, and all that means is that the read sensor failed to read that bit. Pass the read sensor over that same area a second time, and it should work just fine.

      --
      FOX NEWS.com should be BANNED from television and internet. Have the Congress take it over and give us Truespeak.
    221. Re:Carefully protected? by RobBebop · · Score: 1

      exceptionally savvy home users are not going to pay for a tape drive and enough tapes to archive serious data, much less handle shipping the backups offsite professionally.

      The costs of buying the equipment aside, I think home users could cart a backup tape to a nearby friend or family member once a week to ensure that they don't lose all their porn if their building goes up in flames.

      --
      Support the 30 Hour Work Week!!!
    222. Re:Carefully protected? by roman_mir · · Score: 1

      Not 12, but I have 4TBx4TB in RAID 1 configuration at home. So hopefully RAID 1 will be a bit better protected than RAID 5. I am not going after speed, only after reliability and size.

    223. Re:Carefully protected? by jeffmeden · · Score: 1

      If they are in fact only willing to spend $4000 (annually?) to protect their data, how in god's name can they have 4 TB of data that's worth backing up? Either they are severely over-leveraged on risk (hint: they are related to the subprime debacle) OR the data just isn't that important, like keeping around email archives from '90-'99 that should have been offlined a LONG time ago to make room on the routine backup system for more important things.

      Here is why people are questioning your career choices: If you have $4000 you don't say "this is the best I can do to meet your needs of x amount of data", you say "$4000 will get you protection for y amount of data, if you want more you pay more."

    224. Re:Carefully protected? by jebrew · · Score: 1

      Ugh, where's the +1 'that sucks'? How do you mod this? Insightful? Interesting? I guess interesting.

    225. Re:Carefully protected? by Brett+Diamond · · Score: 1
      CDs and DVDs are photosensitive when creating the media. The media loses its ability to change over time, thus a blank CD or DVD will no longer be reliable for recording new data after a few years. However, once data has been burned on the disc, the previously photosensitive layer becomes stable and should last longer than the user who did the burning. In other words, blank discs go bad whereas burned discs remain fine and dandy.

      Discs are susceptible to physical damage (scratches, etc.) whereas tape is susceptible to magnetic damage and wear-and-tear. Unlike SatanicPuppy, I have experienced loss of data on tapes, resulting in a non-insignificant cost to replace.

      Also, examining the problem from the cost side, DVD media is currently roughly half the cost per byte of tapes and the DVD burners are orders of magnitude less expensive than tape drives. Granted, each disc holds orders of magnitude less data; still, finding the disc containing the required data and retrieving it will take less time than performing the same with a tape (although most of the tape access time requires minimal user interaction).

      And if you don't mind spending the money, you can get a DVD jukebox for roughly the same cost as a decent tape drive, providing the same storage capacity for less cost and using media which is universally readable by any computer made in the last five years (ignoring some silly mini-laptops).

      But the truth is that, for the majority of users, a simple DVD backup solution is perfectly acceptable, leveraging their existing hardware to provide a long-term, reliable backup of their important data. Combine this with a RAID-Z (or RAID-Z2 for the paranoid) on drives that attempt to warn when sensing that their days are numbered (S.M.A.R.T., etc.) and the chance of losing data is close to zero.

    226. Re:Carefully protected? by mapsjanhere · · Score: 1

      I'd say the question is not when the last DVD burned in 2000 will become unreadable, but when the first one will die. And the odds are, if you ran a complete backup of all your kids' pictures to CD-R or similar 10 years ago, one of the 10 CDs is probably irretrievable. And that's where the 3 year rule comes from.
      I personally use a "snapshot every month" to a portable drive stored off-site to get at least some true back-ups in our office. After I found out people were using the back-up server to archive and delete, instead of just backing up data kept on their desktops, this seemed advisable. One day I'm sure I'll get a real sysadmin with a budget and back-up tools (right after the flying pig ski races in hell).

      --
      I'm aging rapidly, I bought a new game and had no idea if my machine was good for it.
    227. Re:Carefully protected? by Darth_brooks · · Score: 1

      Sure, right now. The first hard drive I ever bought was 8 megabytes and cost 600 dollars.

      So...regarding your lawn. Does it bother you that I'm on it? Does the very thought of my intrusion seethe you? Are you consumed in the middle of the night by the thought that something...someone could be within its confines and you would be unaware?

      --
      There are some people that if they don't know, you can't tell 'em.
    228. Re:Carefully protected? by Medievalist · · Score: 1

      A one to one, off site disk-duplication is only a disaster-backup.
      You can still make an archive-backup to disk, but you'll use as much data-space as with a tape-backup.
      If you need fifty 800GB tapes to keep a good tape-backup, you'll need 40TB disk-space to be sure that you can keep the same backup-scheme to disk.

      rsync -F --link-dest

      Passwordless scp the batch to another site where your DR copy is kept on a big RAID-10 coraid. If you have more than two sites, ring 'em & sync 'em so all sites have backup archives of all other sites.

      Tapes are for companies without multiple remote sites... otherwise they are not worth the expense.

    229. Re:Carefully protected? by FourthLaw · · Score: 1

      RAID 5 on top of RAID 10 with nightly replays/screenshots and multi-tiered read/writes over an array of disks.

      Is that RAID 50 or RAID 15? Real data protection uses 6 RAID 6 arrays that are all placed into a RAID 6 configuration...uh...wait...

      --
      Skilled in differentiating ravens from a writing desks.
    230. Re:Carefully protected? by clodney · · Score: 1

      The arrogance and narrow mindedness of this attitude is astonishing:

      Your data is not worth more than $4000AU. so why bother backing it up?

      I have never yet met a small business owner that didn't have more needs than they have cash to cover them. Usually that means accepting some risks (and then lying awake at night worrying about them). Perhaps they could budget 6K for backup instead of 4K, but if they do that a key employee leaves because they didn't get a raise, and revenue tanks and the company goes under. Perhaps the money not spent on backup is going to health insurance, because the owner believes that if he has to make a choice between healthy people and healthy data he sleeps better at night when he chooses healthy people.

      By all means make management aware of the risks they are running and outline alternatives. But don't pretend that the owner can wave a magic wand and make the budget suddenly appear.

    231. Re:Carefully protected? by nabsltd · · Score: 1

      HD camcorders don't use uncompressed video.

      Your example is pretty much the only time that uncompressed HD is regularly used...when rendering CGI. Even when editing HD, all you do is mark cut points and when you splice scenes together, most of the compressed frames are copied unchanged, with only a few frames around the join subject to de-compression and re-compression.

    232. Re:Carefully protected? by jsight · · Score: 1

      Rsync IS efficient with the bare transfer of the data. I suspect his problem is with the time rsync can take to find all of the changed files in a large set of directories and files. I mitigate this somewhat by rsyncing sets of subdirectories.

      And don't forget the massive memory usage of rsync either. I've seen it at over 1GB when syncing large trees of files.

    233. Re:Carefully protected? by SatanicPuppy · · Score: 1

      My bad. I was thinking fifteen cents a gig, and just typed cents instead of dollars.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    234. Re:Carefully protected? by Medievalist · · Score: 1

      Run rsync -F --link-dest every night, push the batches out to the remote servers, and apply them. Use RAID10 or better on the storage end. Publish the backup archive to the end users with samba for self-service restores; if you get the switches right on the rsync job, file protection and permissions management are done for you, since files retain their original attributes. Use passwordless ssh (with large keys) as the transport for scp and rsync, and put a little bit of filtering on the client end to quiesce or ascii-dump any live databases and to prevent the keys being abused for other purposes.

      http://www.mikerubel.org/computers/rsync_snapshots/

      If you're using MS-windows on the server end, it gets exponentially harder, so you want to avoid that if you can.
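
      For anyone wondering what --link-dest buys you, the trick is that every nightly directory looks like a full backup, but files that didn't change are hard links to last night's copy and take no extra space. Here is a small Python sketch of that idea, not rsync itself. The snapshot() helper and the file names are made up, and it compares file contents, whereas rsync by default decides "unchanged" from size and modification time.

```python
import filecmp
import os
import shutil
import tempfile

def snapshot(source, dest, previous=None):
    """Copy source into dest; files unchanged since `previous` become hard links."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)        # unchanged: no extra data blocks used
            else:
                shutil.copy2(src, new)   # new or changed: store a fresh copy

def write(path, text):
    with open(path, "w") as f:
        f.write(text)

# Demo in a scratch directory: two nightly snapshots of a tiny "home" tree.
work = tempfile.mkdtemp()
home = os.path.join(work, "home")
os.makedirs(home)
write(os.path.join(home, "photo.txt"), "never changes")
write(os.path.join(home, "notes.txt"), "monday")

night1 = os.path.join(work, "night1")
night2 = os.path.join(work, "night2")
snapshot(home, night1)
write(os.path.join(home, "notes.txt"), "tuesday, edited")
snapshot(home, night2, previous=night1)

shared = os.path.samefile(os.path.join(night1, "photo.txt"),
                          os.path.join(night2, "photo.txt"))
print("photo.txt shared between snapshots:", shared)
with open(os.path.join(night1, "notes.txt")) as f:
    print("night1 still has:", f.read())
```

      The unchanged file is stored once and shows up in both nights, while the edited file keeps its old version in night1. That is what makes self-service restores from any date cheap.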

    235. Re:Carefully protected? by Anonymous Coward · · Score: 0

      God i'm strange.
      -Taylor

      god you're about the norm :p

    236. Re:Carefully protected? by TheoMurpse · · Score: 1

      Except that I have home-burned CDs from 10 years ago that still get read properly. In fact, the only home-burned CD from 10 years ago that doesn't still work for me is one that I scratched something terrible.

    237. Re:Carefully protected? by mikael_j · · Score: 1

      Heh, all I know is that when I've recorded regular PAL video it's been uncompressed, never had access to (or the need to use) an HD camera and I assumed that "real" HD cameras would also record uncompressed video.

      That said, my point was that it's pretty easy to grow video projects into absurd sizes even when doing "regular" editing.

      /Mikael

      --
      Greylisting is to SMTP as NAT is to IPv4
    238. Re:Carefully protected? by gravis777 · · Score: 1

      Yes, but seriously, how many people back up 12 terabytes?

      I have about 50 gigs of data that I absolutely do not want to lose (growing at about 10 gig a year now that I have full TV resolution on my point and shoot camera). Basically, this is pictures. My word documents and databases could all fit on a single CD. I think I am extreme for most users.

      All the rest of the storage space I have (2 terabytes) goes to storing movies off of my DVR, or installing applications (CS4 itself was over a 10 gig install), or for scratch space for projects. No, all the rest of that data is non-crucial, and does not need to be backed up.

      As Blu-Ray media becomes more popular, I can easily back up 50 gig to a single disc. So, that goes from burning your 1446 DVDs every few years to burning a couple of Blu-Ray discs every year. Much more manageable.

      Throw in a cheap 100 gig drive, and put it in an offsite storage facility with your BluRay media, and you are fine. Shoot, most of my photos I have 4 copies laying around. I have lost pictures before, and have learned from my mistakes.

      Most of my friends have as well, and back up pictures to online picture sites, or burn discs as well.

      In the office, wow, let's see, you have your servers, the backup servers, the daily snapshots, tape backups, and then the tapes that we ship to offsite facilities. And we have a data-recovery service we sometimes use (expensive, but effective). A bad sector on a HD should not take down an entire RAID array - you are talking worst-case scenarios, and if you DO lose data, you have your backup servers, and then your tape backups. If you don't, I doubt you are SOX compliant.

      Truthfully, this sounds like a bunch of FUD to me. You have ALWAYS had this issue with RAID-5, which is WHY you have backup plans in place. And IF you are geeky enough to be running RAID at home, you should understand the importance of backup anyways.

    239. Re:Carefully protected? by penguinbrat · · Score: 1

      "If you're replicating data between all three offices (and a fourth backup system?) then you are making backups."

      Replicating data is NOT backing anything up - I learned this the hard way almost a decade ago. The system I was working on was a huge LDAP system, replicating across 20+ servers - everyone assumed that there was more than enough redundancy going on, until that fateful day when the front end triggered some obscure bug somewhere that started deleting the data from the bottom up. And guess what? The replication worked just fine - now all 20 servers were deleting the same data :-(

      You NEED to have snapshots of your data, no two ways around it.
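
      A toy illustration of the difference, in case it isn't obvious. The directory entries and function name are invented, and this is nothing like a real LDAP setup, just the shape of the failure.

```python
import copy

# Three replicas of a toy directory; the entries are made-up examples.
replicas = [{"uid=alice": "Alice", "uid=bob": "Bob"} for _ in range(3)]

def replicate_delete(key):
    """Replication faithfully applies the same change everywhere."""
    for replica in replicas:
        replica.pop(key, None)

# A snapshot is a point-in-time copy that later changes cannot touch.
snapshot = copy.deepcopy(replicas[0])

# The buggy front end starts deleting entries...
for key in list(replicas[0]):
    replicate_delete(key)

print(replicas)   # every replica is now empty
print(snapshot)   # the snapshot still has both entries
```

      Replication protects you from losing a server. Only the snapshot protects you from a bad write that every server obediently repeats.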

    240. Re:Carefully protected? by mmullings · · Score: 1

      The controllers I've used have the option of 'force rebuilding in event of a read error'. Sure after a disk check there were some bad files that got placed in the 'found.001' folder, but guess what, I was lucky enough that the files were old and useless and the good stuff rebuilt 100%.
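
      Roughly what a forced rebuild amounts to, as I understand it: each stripe is recomputed by XORing the surviving blocks, and a stripe where a survivor can't be read is left as a hole instead of aborting the whole array. This is a toy sketch with a made-up layout (parity always last, which real RAID 5 doesn't do), and what a given controller actually writes into the hole is an assumption I can't vouch for.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

def forced_rebuild(stripes, dead_disk, read_errors):
    """Rebuild the dead disk's block in each stripe from the survivors.

    stripes:     list of stripes, each a list of blocks (data blocks + parity)
    read_errors: set of (stripe_index, disk_index) pairs that cannot be read
    Returns one entry per stripe: the rebuilt block, or None if that stripe is lost.
    """
    rebuilt = []
    for s, stripe in enumerate(stripes):
        survivors = [d for d in range(len(stripe)) if d != dead_disk]
        if any((s, d) in read_errors for d in survivors):
            rebuilt.append(None)          # hole: this stripe can't be recomputed
        else:
            rebuilt.append(xor_blocks([stripe[d] for d in survivors]))
    return rebuilt

def make_stripe(*data):
    return list(data) + [xor_blocks(data)]   # parity goes last in this toy layout

stripes = [make_stripe(b"good", b"data"),
           make_stripe(b"old ", b"junk"),
           make_stripe(b"more", b"good")]

# Disk 0 died; disk 1 throws a read error in stripe 1 during the rebuild.
result = forced_rebuild(stripes, dead_disk=0, read_errors={(1, 1)})
print(result)   # [b'good', None, b'more']
```

      One stripe is gone and whatever file lived there is damaged, but everything else comes back, which matches what I saw.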

      --
      I remember when MOD was an audio format, and DOS wasn't a network attack....
    241. Re:Carefully protected? by Bob-taro · · Score: 1

      "Unrecoverable" is a misused word by the original article. "Unreadable" is the correct terminology, and all that means is that the read sensor failed to read that bit. Pass the read sensor over that same area a second time, and it should work just fine.

      Can you cite a source for this? There seems to be some disagreement.

      --
      Prov 9:8 Do not rebuke mockers or they will hate you; rebuke the wise and they will love you.
    242. Re:Carefully protected? by Philosinfinity · · Score: 1

      Obviously, this is not an indictment against you, but the fact of the matter is that every company needs to evaluate the value of their data and their risk of data loss and budget accordingly. HOWEVER, part of the job of being a sysadmin is making sure that the company realizes the value of the data at hand. Consider these points:

      • A company that doesn't understand the value of their data cannot possibly put value in the person who ensures access and protection of that data.
      • A company that doesn't understand their level of exposure/risk as it relates to data loss will assuredly blame the person they entrusted with the protection of their data if a loss occurs.
      • A good sysadmin accepts the fact that some of their battles could cause unemployment, but that RGE (Resume Generating Event) is better than the catastrophic failure kind.

      Your first effort as a sysadmin is to get the company to define things like RTO, RPO, acceptable data loss, system criticality, data protection budget, and other things that fall into the "business perspective" so you can craft your processes and plans. Then, you should be requiring management to sign off on those documents. The thing to keep in mind is that, regardless of the circumstances of business, if a company is not willing to spend sufficient funds to protect their data, even after understanding the risks involved... well they just don't value their data, then.

    243. Re:Carefully protected? by TooMuchToDo · · Score: 1

      True, but their storage system is many times more reliable than your way....

    244. Re:Carefully protected? by lazyforker · · Score: 1

      Backups are not important to anyone.

      Restores/recovery are. I really feel sorry for 1165473 - he/she is stuck in a tough spot. The only advice I can offer is that he/she needs to do some calculations to demonstrate the cost to the business of rebuilding/re-acquiring data.

      Another poster mentioned creating a paper trail: that's nice but blatant ass-covering won't make any friends or influence decision-makers. Highlighting the risk, with a valid assessment of the potential loss to the business, and presenting various solutions (at different price points) would help the business owners/managers make a reasoned decision.

      Not only are you covering your ass, you're also offering solutions - which you as a sysadmin are best placed to recommend.

      It sounds like a lot of meaningless paperwork but the business owners need to understand the objective risk to the business and the cost associated with recovery vs the cost of various solutions. Here's a chance for you to exercise your ingenuity, creativity and sysadmin skills - and demonstrate your value and commitment to the business.

    245. Re:Carefully protected? by penguinbrat · · Score: 2, Insightful

      RAID is a backup - just backing up the hardware and NOT the data...

    246. Re:Carefully protected? by Applekid · · Score: 1

      The company BOTH cares about their data AND can't afford a proper backup system.

      If they don't pony up, they can't care that much about their data. I can understand not wanting to paint a gloomy picture for the bosses, but gotta face reality: a data catastrophe can sink you.

      A University of Texas study found that 43 percent of companies experiencing a catastrophic data loss never recover, and half of them go out of business within two years. According to DTI/Price Waterhouse Coopers, 70 percent of small firms that experience a major data loss go out of business within a year.

      I'd recommend to GP that they stop sugar-coating it for the brass. Times may be tough but if they can't find the budget for adequate protection and the roll of the dice wills it, they'll be out of a job and NOBODY there will have budget anymore.

      --
      More Twoson than Cupertino
    247. Re:Carefully protected? by torkus · · Score: 1

      Except you're talking about a multi-hundred-disc backup set where you need 100% reliability. I'd put good money that if you had 500 CD-Rs from 1994 there would be at least a few bad ones by now. Heck, the burn failure rate is > 1:500 and not every error is detectable unless you do a read-verify on the disc immediately post-burn.

      --
      You can get rich if you own a politician, but you have to be rich to buy one in the first place.
    248. Re:Carefully protected? by grahamsz · · Score: 1

      Backups never seem to be an issue as they will pretty much write blank checks to make sure the data is safe.

      Maybe almost true.

      Still the thing that bit us in the ass once was the recovery time. A software problem brought down a customer's production database.

      The backups were running fine, we lost about 2 hours of transactions (which was considered acceptable risk) but it still took about 12 hours to get the whole system working and back online again.

      A good solution may not be in the budget, but don't forget to quote the recovery times for different types of backup. Particularly if you are doing some kind of incremental offsite copy - getting that all back onsite in a hurry isn't easy.
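
      A quick way to put numbers on that when you quote options. The 500 GB data set, the link speeds, and the 80% efficiency factor are all assumptions for illustration, and the function name is made up. Plug in your own.

```python
def restore_hours(data_gb, link_mbit_per_s, efficiency=0.8):
    """Hours to pull data_gb back over a link, at a given fraction of line rate."""
    megabits = data_gb * 8 * 1000          # decimal GB -> megabits
    return megabits / (link_mbit_per_s * efficiency) / 3600

for mbit in (10, 100, 1000):
    print(f"{mbit:>5} Mbit/s: {restore_hours(500, mbit):6.1f} h")
```

      That's transfer time only. It says nothing about applying incrementals, replaying logs, or checking the result, which in our case was where most of the 12 hours went.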

    249. Re:Carefully protected? by torkus · · Score: 1

      Yes, 10 years ago you could get a 1GB drive. Now you can get a 1TB drive for ~1/4 the price of that 1GB drive.

      So SSD is dropping in price while growing in capacity, but magnetic media is doing the same. It's going to be quite some time before it's cheaper to use solid-state media than magnetic media for raw storage (we will leave the price/performance ratio out of this since we're talking about backup).

      Oh and as density increases you realise the charge stored in "non-volatile" flash ram gets smaller and smaller to the point where you're going to be looking at bit errors in that arena after a year or 5 of unused storage...right?

      --
      You can get rich if you own a politician, but you have to be rich to buy one in the first place.
    250. Re:Carefully protected? by sjames · · Score: 1

      That's the really ugly part of it. It's bad enough that they disclaim responsibility for their own negligence unless you buy insurance from them, but then even after that, even though they had full control over the level of risk, they bend over backwards to disclaim responsibility.

      My favorite was when they rammed a forklift through a 1U server (rendering the back V shaped) and they claimed 'inadequate packaging'.

      Unfortunately, it's not just FedEx. In another case, I shipped a router in a wooden crate and the shipper dropped it down several flights of stairs (as the recipient watched). First they tried to claim it was fine when they delivered it!!! Then they claimed that old favorite 'inadequate packaging' (even though it actually made it OK for the first flight or two). Since the backplane cracked, I'm guessing it was the sudden stop at the end that did it in.

      The recipient was actually watching as they dropped a router down several flights of stairs.

    251. Re:Carefully protected? by guruevi · · Score: 1

      I have an even smaller budget since I work for an educational institute in an only-research department that only gets money when research grants are being spent in that department (local pilot grants that max out at $10,000, NIH for the rest).

      There is plenty of stuff out there for the 'cheap' sysadmin. I run a few Apple XRAIDs over FibreChannel. They are (were) the cheapest in the industry with very good specs and if you can afford a few Apple XServe's you get XSAN on there and a FibreChannel switch (on my wishlist for winter-een-mas) and you have a very easy to maintain expandable SAN.

      I was looking into a $12,000 solution for a backup (40 TB total storage to go over with about 500 GB changed per week). Backups would take about a week to complete and require me to switch tapes once in a while. But then I found the Norco DS cases which have 12 slots (10 if you want to maximize usage on a single controller card) and for less than USD 1500 you can get a 10TB eSATA array. The performance is pretty crappy (1 eSATA link per 5 drives). I use it for backups and using rdiff-backup over rsh I have a fully functional (weekly) mirror with weekly diffs in a weekend (starts Friday evening at 5:00pm and ends early Monday morning) but I should be able to get better performance if we get a new fileserver (currently dual PowerPC).

      --
      Custom electronics and digital signage for your business: www.evcircuits.com
    252. Re:Carefully protected? by sjames · · Score: 2, Interesting

      A remarkable number of RAID units throw a tantrum and refuse to even keep trying at the first sign of real trouble. That's why I prefer to use the Linux soft RAID over various hardware RAIDs. At least the layout is well documented so I have a chance of putting most of it back together later.

    253. Re:Carefully protected? by The_reformant · · Score: 1

      I use the internet to back up all my data; that way other people will pay for it to be stored in perpetuity. Sprinkle in enough redundancy and you're good to go. By the way:

      Shopping List -- 14/3/2005
      Bread
      Milk
      Peppers
      Chicken
      Mince
      Pasta
      Rice
      Pasta Sauce
      Noodles


      To-do list
      Finish writing xmas cards
      Phone drummer ---done this no car = no good
      back up data from last 2 months (in progress)

      --
      I have discovered a truly remarkable sig which this post is too small to contain.
    254. Re:Carefully protected? by danomac · · Score: 1

      Sadly no. I have a ton of things to back up at home and just use Bacula with a ton of DVD-RWs. It's not really ideal.

      I'd say it'd be cheaper to get a really cheap PC and run linux raid + rsync over a gigabit LAN for home users...

      Tape drives (good high capacity ones) aren't going to be affordable for home users for who knows how long. You could probably build three or four cheap systems with enough disk space over the LAN for one of those tape systems. Only problem then is offsite backups... better hope the house doesn't burn down.

    255. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Those movies ARE my porn you insensitive clod.

    256. Re:Carefully protected? by Doctor+Faustus · · Score: 1

      Or you can burn a pile of DVDs and hide them throughout the woods.
      Isn't that what the Reiser file system does now?

    257. Re:Carefully protected? by JWSmythe · · Score: 1

          What's sad is, it doesn't matter what shipper you use, stuff can get mangled.

          We've shipped rackmount servers that turn out more round than square when they've arrived. We've received some that were skewered by the forklift. Oil soaked boxes. Water soaked boxes. Unidentified liquid soaked boxes (I don't question, I washed my hands carefully).

          The more often you ship to and from more places, the more damage you'll see.

          And yes, it's hard to get anyone to settle a claim. They want the extra money for the insurance, they don't want to pay that money out. You'd be better off if the package disappeared entirely.

      --
      Serious? Seriousness is well above my pay grade.
    258. Re:Carefully protected? by kimvette · · Score: 1

      . . . so says the anonymous coward troll. :)

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    259. Re:Carefully protected? by JWSmythe · · Score: 1

          I was going to recommend that. :)

          Actually, you can just back up your config and data directories to tape, and have the full shebang on tape. :)

          I had set up two pretty cool servers (4 500GB SATA drives as RAID5 = 1.5TB/ea), and a third machine with a tape jukebox on it for off-off-site storage. The site where the backup servers are is different from where their data is.

          Then the jukebox died. Hrm.

          But, the Linux machines are running strong still, so the data is still backed up.

          I back up my own personal servers the same way. It's very effective.

      --
      Serious? Seriousness is well above my pay grade.
    260. Re:Carefully protected? by Anonymous Coward · · Score: 0

      The people making the decisions should be able to calculate a ruff estimate...

      Dammit, Scooby! If I've told you once I've told you a hundred times - wipe your paws before using the keyboard, and stop slobbering on the monitor! Bad dog! No!

    261. Re:Carefully protected? by GSloop · · Score: 1

      rdiff-backup is your friend.

      Reverse diffs and all their goodness.

      Not as good as full individual copies on tape, but given the situation, not bad either.

      -Greg

    262. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Presumably you only did this until someone competent saw it happen and had you fired ?

      No, I was authorized to do that (and I got it in writing). But management didn't want to take the risk of another disk failing during the rebuild.

    263. Re:Carefully protected? by Anonymous Coward · · Score: 0

      >This is serious news.

      It's not.  It's an article written one and a half years ago by a dumbshit who knows some basics of RAID5 but lacks knowledge of modern RAID controllers, and has no hands-on experience rebuilding dirty RAIDs.

      One and a half years later Slashdot has a slow news day, picks up on the bearded doom story and some dumb sheep without any actual knowledge or experience start chanting the doom of RAID5 with him.

      Beeeeh!  RAID5 is going to die!  Beeeeh!

      Data scrubbing will prevent the UER problem during rebuild, and even *if* the rebuild were to fail you could simply copy 99.999% to 100% of the data from the dirty array to some place else.

      --
      C u in 2010
      Raid5

    264. Re:Carefully protected? by HTH+NE1 · · Score: 1

      Buying a computer system you cannot afford to properly use is crazy. Yes, some people are crazy, and those crazy people are going to lose data, but there's no sense in defending it.

      Yeah, well, who gets it right the first time every time? The first time you start working with large DV files, you're just trying to keep up with the storage demand, and next thing you know it's going to be a huge expense just to double your capacity once for a single backup. Instead, you just figure out what data you can move off-line awhile to some external volumes. Especially if it started just as a hobby and only becomes a business for the occasional wedding video.

      I want a DROBO but those are expensive as hell.

      I picked up my drobo from a seller on Amazon and have been very happy with it. Now the same USB model can be obtained for even less than I paid directly from the manufacturer. For me it was a natural choice as I was replacing three internal 500 GB drives (RAID 0) with three internal 1 TB drives (separate volumes) and wanted an enclosure with a migration path to larger drives with mixed capacities. (I'd love to have a PATA drobo as well for my even older drives, but they don't make them.)

      And with the SDK available, you can install programs and services that run on the drobo itself or on a droboshare managing it as a NAS.

      --
      Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
    265. Re:Carefully protected? by EastCoastSurfer · · Score: 1

      Having experienced all sorts of failures by this point, recovery time is one of the big questions I pose to management. If a system can be down for 24 - 48 hours b/c of a failure, then we can do things a lot cheaper/differently than if you need to be back up and running in 1 hour. In the first case we can wait for new parts (obviously we keep spare HDs onsite, but not stuff like back planes and RAID card batteries ugh...lol) to be shipped and arrive and in the other you need to have a hot spare system sitting idly by. Even then, if you have a huge backup to restore you may need to have a replicative system up and running all the time that you can fail to if needed. Luckily most systems that I've had to manage allowed for 1-2 days downtime in the case of a major failure.

    266. Re:Carefully protected? by noidentity · · Score: 1

      I have to mock this comment. Backup is not a preventative. It is a form of redundancy. Nothing is stopping that system from losing the main and backup volumes and completely losing your data. Backups simply allow you to recover after a SINGLE volume failure.

    267. Re:Carefully protected? by techess · · Score: 1

      Good point. I think all shipping companies have their problems. What ticked me off in this case was that the items were insured and Fed-Ex tried to get out of paying because someone signed off on the items. (Someone who isn't our receiving person and who never saw the boxes). They claimed that since someone signed off on the delivery the damage must have come after it was delivered.

      It took a lot of photographs on our part and a letter from the vendor stating they did not ship our stuff in DET labeled boxes before we got our money.

      --
      Don't anthropomorphize computers. They *hate* that.
    268. Re:Carefully protected? by ThePromenader · · Score: 1

      I've opted for twin server-managed raid5 arrays clustered using drbd - should one disk in an array go down - or the server managing the array go down - the 'twin' server/raid5 array will take over until the 'down' array's repair cycle is complete. Once this is done, the newer data will be written to the 'repaired' twin when it comes back up. This is the cheapest/most reliable solution I could find - with HA thrown in in the bargain.

      --

      No, no sig. Really.

      ThePromenader
    269. Re:Carefully protected? by Cramer · · Score: 1

      Like it doesn't freak the hell out on the first error? Put down the crack pipe and step away from the server room. Linux is just as bad, if not worse, than hardware systems in the face of errors.

      Linux software raid is easier to deal with, but that's simply because you're on the inside. It's pretty easy to make it shut up and reassemble the array anyway. You can do the same on a number of hardware systems, but it's always an undocumented "internal" process that has to be entered via morse code with a paper clip -- log in to a serial port you didn't know it had with a password you don't have to run commands even the source code doesn't document.

    270. Re:Carefully protected? by WhatAmIDoingHere · · Score: 1

      Well, if you're keeping your data backed up on a tight enough schedule, this is something you don't have to worry about.

      "Oh, one drive failed, I'll replace it." becoming "Oh god, they're ALL failing, what do I do?!" isn't an issue when your data is also stored on tape (or something).

      That being said, I'm considering replacing 2 smaller older hard drives with a pair of 1TB drives in a RAID-1, since that much data on one drive scares the hell out of me. (And it isn't anything worth backing up, really, it's a lot of stuff I could probably download again.)

      --
      Not a Twitter sockpuppet... but I wish I was.
    271. Re:Carefully protected? by dbrutus · · Score: 1

      The media's so cheap now that you just need to copy to a new set of media every few years and you've avoided the problem, no?

    272. Re:Carefully protected? by Cramer · · Score: 1

      I submit you just don't know where to shop. eBay may get bad looks, but when you don't have the budget for new gear from CDW, it's a great source of stuff. I don't recommend buying used tapes (or hard drives in general) for important backups, but a used tape drive or library still has a great deal of life left in it. You don't need top of the line LTO16 technology to back up TBs of data -- and a TB of storage doesn't mean a TB of data to back up. I do it every week with SDLT320 (160GB) (onsite) and LTO2 (200G) (offsite) tapes. I used to do it with AIT-2 (50G) tapes. And IMO, the key factor in backups is speed, not size of each volume; backups need to complete within 4-6 hours. When it takes 8-10 hrs or more, you need to start thinking about more drives and/or faster technology.
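
      The speed-versus-window point can be checked with simple arithmetic. A minimal sketch, assuming decimal units (1 GB = 1000 MB) and a sustained transfer rate with no tape changes or stalls; the function names and the 30 MB/s figure are illustrative assumptions, not specs quoted in this thread:

```python
def required_mb_per_s(data_gb, window_hours):
    """Sustained throughput (MB/s) needed to move data_gb within window_hours."""
    return data_gb * 1000 / (window_hours * 3600)


def hours_needed(data_gb, mb_per_s):
    """Hours needed to move data_gb at a sustained mb_per_s."""
    return data_gb * 1000 / mb_per_s / 3600


# 1 TB of changed data inside a 6-hour window
print(round(required_mb_per_s(1000, 6), 1))

# The same 1 TB at an assumed (hypothetical) 30 MB/s sustained rate
print(round(hours_needed(1000, 30), 1))
```

      If the second number lands past the window you can live with, that is the cue to add drives or move to faster technology rather than bigger tapes.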

    273. Re:Carefully protected? by Cramer · · Score: 1

      And I have commercially manufactured DVDs less than 2 years old that are no longer playable -- the layers have fused. (one of them is my DVD of Army of Darkness!)

    274. Re:Carefully protected? by raap · · Score: 1

      The (relatively) new option --recusive should reduce the memory usage.

    275. Re:Carefully protected? by Cramer · · Score: 1

      Actually, most of NASA's issues were not the tapes themselves, but finding hardware capable of reading them. A properly cared for tape will last decades (manufacturers quote 30yrs MINIMUM) -- even a moderately cared for tape will last multiple decades (the ~20yr old tape that's been sitting in my kitchen for the past 11 years was perfectly readable last year when I found a compat drive.)

      However, they do have a finite lifetime. The plastic ribbon that is the backbone of the tape will eventually degrade, crack, and fall apart. Seeing how the tape has to be under tension to be read, it will become unusable long before the magnetic information is lost. Yet, the most common "failure" is having tapes without a functional drive to read them.

    276. Re:Carefully protected? by Cramer · · Score: 1

      I have a $200 LTO2 drive (HH, external, with SCSI card) from eBay. Unless things have changed a lot in the last year, you just aren't looking in the right places often enough.

    277. Re:Carefully protected? by Anonymous Coward · · Score: 0

      I don't even have to worry about losing it.

      Except when your USB key falls out of your pocket while you're running for the bus or removing snow from your car.

    278. Re:Carefully protected? by Lukey+Boy · · Score: 1

      Wow, nice! I haven't looked in a couple of months, but I'll definitely keep my eye out. Fortunately I already have a SCSI card.

    279. Re:Carefully protected? by sjames · · Score: 1

      That was kinda my point! Why would I prefer the dane brammage of having to beat the magic procedure out of someone's tech support w/ a rubber hose when I could be following the documented Linux procedures?

      Further, should those fail, the on-disk layout in Linux is documented so I can write a small program to put as much of it as possible back together using raw disk commands if necessary.

      If I can't have robust, I at least want the option to glue the pieces back together!

    280. Re:Carefully protected? by Mjec · · Score: 1

      If you've got investors coming around, I'd do it. Just think of all the extra hardware you can buy with all that VC...

      --
      "But everyone should know everything." -markab
    281. Re:Carefully protected? by growse · · Score: 1

      Why on earth can't they stick a gigabit ethernet port on it? That alone stops me buying it.

      And no, the silly "spend an extra wadge of cash on this base thing to sit it on" is not a good solution.

      --
      There is nothing interesting going on at my blog
    282. Re:Carefully protected? by Cramer · · Score: 1

      So do I, boxes of them actually. But it was part of the collection, so one more in the box.

    283. Re:Carefully protected? by tigersha · · Score: 1

      Been there, done that. Once reconstructed a RAID array by dd-ing the disks to another, bigger one and writing a C program to reconstruct a single image. Worked like a charm too; it was a Mylex controller.
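
      For anyone wondering what such a reconstruction program boils down to: RAID 5 parity is the XOR of the data blocks in a stripe, so any one missing block is the XOR of the survivors. A toy sketch in Python rather than C, and not the program described above; it ignores everything controller-specific (stripe size, parity rotation, metadata offsets), and the function name is made up for illustration:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] died: rebuild it from the survivors
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

      The hard part in practice is working out the layout so you know which blocks belong to the same stripe; the XOR itself is the easy bit.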

      --
      The dangers of excessive individualism are nothing compared to the oppressiveness of excessive collectivism
    284. Re:Carefully protected? by Anonymous Coward · · Score: 0

      Father, I ask that you bring this person into the knowledge of Your Son, and save him.
      In Jesus' Name

    285. Re:Carefully protected? by rtechie · · Score: 1

      How about 250 GB of data? That's the size of entry-level HDs nowadays. Burn it to 15 DVDs?

      Actually, that's not a bad idea.

      It's a terrible idea. For one, I miscounted. It would take about 30 DVD-9s or 60 DVD-Rs. That's a backup process that takes a bare minimum of 240 minutes of constant disc swapping. And the DVD-9s cost at least $1 each, so that's a minimum of $30. NOBODY is going to do this every week.
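
      The disc counts are easy to sanity-check. A rough sketch, assuming nominal capacities of 4.7 GB for a single-layer DVD-R and 8.5 GB for a dual-layer DVD-9 (decimal GB, filesystem overhead ignored) and 8 minutes per disc, which is what 240 minutes over 30 discs implies; the names are made up for illustration:

```python
import math

DVD_R_GB = 4.7   # nominal single-layer capacity (assumed)
DVD_9_GB = 8.5   # nominal dual-layer capacity (assumed)


def discs_needed(data_gb, disc_gb):
    """Whole discs required to hold data_gb, ignoring filesystem overhead."""
    return math.ceil(data_gb / disc_gb)


def backup_cost(data_gb, disc_gb, minutes_per_disc, dollars_per_disc):
    """Return (disc count, total minutes, total dollars) for one full backup."""
    n = discs_needed(data_gb, disc_gb)
    return n, n * minutes_per_disc, n * dollars_per_disc


print(backup_cost(250, DVD_9_GB, 8, 1.0))
print(discs_needed(250, DVD_R_GB))
```

      With those nominal capacities the DVD-9 case matches the 30 discs, 240 minutes and $30 above; the single-layer count comes out at 54, so the figure of 60 is a looser round-up, and the conclusion does not change either way.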

      I mean...everyone has a safe deposit box, right? I make a trip to the bank at least once a week to drop something in my safe deposit box.

      Certainly not. I don't have one. I don't think that most people do, and even if they did I seriously doubt they'd be willing to shuffle backup tapes out of their safe deposit boxes every week (another 4 hour ordeal), or an external hard drive for that matter.

      But an external hard drive placed in a safe deposit box is the only reasonable suggestion you've made thus far. I still think online backup is vastly superior because it's much more likely to see actual USE and be up to date.

      I've got about 100 GB of data that is actually important, which gets backed up about once a week to DVD and stuffed in a safe deposit box. I've got another 200 GB or so of data that isn't really important, but I'd rather not lose it (save games, movies, music, etc.) - which gets backed up to DVD every couple months.

      You're obsessive, at least relative to 99% of home users. I would never do crap like this, I don't have the free time.

      I've been playing with the idea of buying myself a tape drive.

      In my opinion, it's a waste of money for pretty much everyone. $1000 buys you a lot of hard drives for a lot of redundancy. Make 'em external if you want. Still vastly faster and MORE RELIABLE (tape drives fail constantly, the MTBF is less than a year IME) to use more hard drives. Robotic tape arrays are typically the LEAST robust equipment in a datacenter.

    286. Re:Carefully protected? by YourExperiment · · Score: 1

      Er, no, that's my point, I don't care if it falls out of my pocket. It's encrypted, so it's no use to anyone else. With the price of USB keys, it's hardly going to break the bank to buy another. And it's a backup, not an original, so I haven't lost any data.

      Besides, it's in my wallet. How's it going to fall out of there?

    287. Re:Carefully protected? by ITEric · · Score: 1

      For any home user on a budget or for short-term data backups, optical media is more than adequate (though it would be a pain to backup large amounts of data, I'm sure). Combine that with periodic backups to (an) external hard drive(s) and you have a simple solution that would work for many users without the expense of magnetic media. If you cycle the newer hard drives from previous external hard drive backups into the RAID array, wouldn't it decrease the chances of failures and/or read errors?

      --
      The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' but 'That's funny...
    288. Re:Carefully protected? by Limecron · · Score: 1

      The abbreviation used is URE, and so you're saying that URE stands for "Unreadable Read Error"? Err...

    289. Re:Carefully protected? by iowannaski · · Score: 1

      I run a few Apple XRAIDs over FibreChannel. They are (were) the cheapest in the industry with very good specs and if you can afford a few Apple XServe's you get XSAN on there and a FibreChannel switch (on my wishlist for winter-een-mas) and you have a very easy to maintain expandable SAN.

      I am an Xsan administrator, and I will grant you that Xsan is cheap and easy to maintain - for a cluster filesystem.

      If you don't need a cluster filesystem (and you probably don't), Xsan is a ridiculously complicated, difficult to maintain, and expensive substitute for a nice, cheap, NAS solution.

      --
      i forget
    290. Re:Carefully protected? by RedBear · · Score: 1

      Why on earth can't they stick a gigabit ethernet port on it? That alone stops me buying it.

      And no, the silly "spend an extra wadge of cash on this base thing to sit it on" is not a good solution.

      Again, you can buy the next least expensive device that comes with a gigabit Ethernet port, the ReadyNAS NV+. It's only $1,000 compared to the USB Drobo + DroboShare which together costs $550, or $700 if you get the new FireWire Drobo. The DroboShare also runs Linux and will supposedly have the ability to run applications like BitTorrent soon. I think the "silly" DroboShare "base thing" is priced just about right for what it can do. It's basically a small file server that needs no configuration and costs only $199. I don't think you can find or create anything cheaper with comparable features. It's not just some dumb Ethernet card module that should cost $49.

      Looks like the Drobo Apps just became available a few days ago. Thanks for reminding me to check! Christmas comes early this year!

      Drobo Apps: http://www.drobo.com/droboapps/

    291. Re:Carefully protected? by Anonymous Coward · · Score: 0

      You can always get mugged.

      But as long as you have more backups in other places, you'll be alright. I like your style.

  2. Dont worry too much by Gat0r30y · · Score: 1

    When HDD's move to bigger sectors - there should be better error recovery reducing the probability of unrecoverable read errors. Right? Ok, I'm moving to ZFS.

    --
    Prediction: The real iPhone killer is going to be sex robots from Japan. Think about it.
    1. Re:Dont worry too much by tepples · · Score: 1, Insightful

      When HDD's move to bigger sectors - there should be better error recovery reducing the probability of unrecoverable read errors. Right?

      Not if what fails is the drive motor.

    2. Re:Dont worry too much by SatanicPuppy · · Score: 5, Informative

      The real issue is one that anyone who has ever had to recover a multi-drive array can tell you instantly: if one drive fails, and the other drive was bought at the same time, and has had a nearly identical usage pattern, the odds of the other drive failing are well above average.

      I once had a single drive fail in a 24 disk array. The disks were arranged, RAID 5, in groups of 3, glued together by Veritas (from back before it got bought by crappy symantec). By the time the smoke cleared we had replaced 19 out of 24 drives. They had all been bought at the same time, and as they thrashed rebuilding their failed buddies, they started dying themselves. The remaining 5 drives we replaced anyway, just because.

      That's a worst case, but multiple failures are far from uncommon, and very few people correctly cycle in new drives periodically to reduce the chance of a mass failure.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    3. Re:Dont worry too much by Angus+McNitt · · Score: 5, Insightful

      ... very few people correctly cycle in new drives periodically to reduce the chance of a mass failure.

      That is also because very few people buy a RAID setup piecemeal. Most end up buying a solution, fully populated. The idea of swapping out some drives as you go, or growing your RAID over time doesn't always look good, either to the PHBs who usually run the budget, or to the vendor. We had a vendor trying to sell us an iSCSI SAN device tell us that varying the drive lots and dates increased the chances of failure. Needless to say we went elsewhere.

      When we bought the RAID array for our Exchange box, this is going back a few years, everybody looked at me like an idiot because I asked for drives with different lot numbers. It was the best I could do as buying over time was not an option. HP was actually pretty cool about this request and out of 8 disks, no 3 have the same lot number or manufacture date.

      Of course we are also running RAID on that machine for non-backup and do a nightly replication, so your mileage may vary.

      --
      "To Do Is To Be" - Socrates, "To Be Is To Do" - Sartre, "Do Be Do Be Do" - Sinatra
    4. Re:Dont worry too much by afidel · · Score: 1

      Actually, yes. Or at least that's one of the ways my SAN vendor is dealing with increasing drive size. They are the first vendor to enable ANSI T10-DIF end to end checksumming which includes additional bits per block (AFAIR it's the same as mainframe drives have been using all along). They have also made the recoverable element the surface rather than the drive so recovery times are several times faster. Check them out.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    5. Re:Dont worry too much by Anonymous Coward · · Score: 0

      Indeed, but did you lose any data?

      RAID 5 is invaluable to protecting from disk failures. You're absolutely right: There's a REAL risk of double-disk failures with RAID sets because they are usually purchased together, were manufactured in the same group, and follow the same usage patterns.

      You can't really be secure in ANY RAID configuration. Even if you had two mirrored mirror sets, you could lose your data. The only way to be sure you don't is to have good backups.

      Good backups are a problem, but for Joe Home User you can get 1TB external disks for so cheap it's scary, and many times these disks come with "push to backup" buttons you can use to instantly back up all your shit.

      For me, I have too much to back up so I replicate all my important stuff (about 2TB) to a friend. He does the same. We use DFS Replication. It works like a frigging champ (it's my favorite Microsoft thing ever!) and it's the only way for the both of us to ensure that if the building burns, we're protected.

      Multiple disk failures aren't uncommon, however I've been responsible for... an uncountable number of RAID sets (including many housed on Symms, Clariions, DAS, and other types of disk systems) and I've personally never experienced a double disk failure on a RAID 5. Maybe I'm just lucky..

    6. Re:Dont worry too much by LarsG · · Score: 1

      and the other drive was bought at the same time

      Sequential serial numbers too, to be *really* safe, right? ;-p

      --
      If J.K.R wrote Windows: Puteulanus fenestra mortalis!
    7. Re:Dont worry too much by Bandman · · Score: 1

      wow, I don't envy that period of time in your life, to be sure.

    8. Re:Dont worry too much by Wesley+Felter · · Score: 1

      The chance of a double motor failure is much lower than a motor failure + uncorrectable read error, which is what the article is about.
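
      To put a rough number on the read-error half of that, here is a back-of-the-envelope sketch. It assumes the commonly quoted consumer-drive spec of one unrecoverable read error per 10^14 bits read (an assumption, it is not stated in this thread) and treats every bit as an independent trial, which real drives do not strictly obey. The function name and defaults are made up for illustration:

```python
import math


def p_ure_during_rebuild(surviving_drives, drive_bytes, ure_rate_bits=1e14):
    """Probability of at least one unrecoverable read error while reading
    every surviving drive in full, assuming independent per-bit errors."""
    bits_read = surviving_drives * drive_bytes * 8
    # P(no error) = (1 - 1/rate) ** bits_read; log1p/expm1 keep the
    # arithmetic accurate for such a tiny per-bit probability.
    return -math.expm1(bits_read * math.log1p(-1.0 / ure_rate_bits))


# The summary's scenario: 7-drive RAID 5, one drive dead, six 2 TB survivors
print(round(p_ure_during_rebuild(6, 2e12), 2))

# Same array with an assumed 1-in-10^15 error rate
print(round(p_ure_during_rebuild(6, 2e12, ure_rate_bits=1e15), 2))
```

      Under these assumptions the first case comes out near 0.62 and the second near 0.09, so the answer depends heavily on which spec figure you believe and on how closely real drives follow it.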

    9. Re:Dont worry too much by SatanicPuppy · · Score: 1

      Yep. The drives we had were all sequential serial numbers...They were good drives, IBM Ultrastar's, which were a benchmark for reliability before Hitachi came along, and the little bastards held up. We didn't lose any data (and we had a nightly backup, so no biggie), though the whole experience probably stripped a year off my life.

      But I agree completely; I can't imagine trying to convince my boss to cycle out a few thousand dollars worth of working drives a year, even though it's the way it ought to be done.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    10. Re:Dont worry too much by DigitalCrackPipe · · Score: 1

      replaced 19 out of 24 drives

      As they were glued in sets of 3, that conjured an image of someone separating the drives with a crowbar and cursing like a sailor.

    11. Re:Dont worry too much by ACMENEWSLLC · · Score: 1

      Been there. Your good vendors will set you up with RAID drives from different batches / run dates. In fact, if it's a drive that is expected to be RAID, a good drive manufacturer will mix up a batch of drives to a vendor. I've seen WD do that. You still might lose two drives at the same time, but your odds are a little better that there will be enough time between failures to rebuild the failed drive.

      The last problem I had was a bad RAID controller. What happens when the controller itself goes bad? So I need two controllers? A RAID 5 on each, RAID 1 between them?

      But what if it doesn't fail gracefully? What if the controller just dumps bad data to one set of drives without catching it? My HP DL360 did that. The IO bus on the card got full and Windows ran out of queuing room, losing data, killing the PC.

      And then what happens if you have a fire, and your backup server is in the same computer room? You run your full backup on Friday. Tapes stay in until Monday. Fire on Sat. So now you have to go back 1 week and apply every incremental?

      So many contingencies so little time.

      ZFS was mentioned. In Windows I have some critical data backed up to a remote location via DFS. The replication is one way, to a location about a thousand miles away. The remote replica is a copy only, it is not enabled for DFS redirection /reading.

      That gets me automatic offsite syncing of Windows server files. Packeteer makes it run low priority. DFS doesn't work well enough to make that a live partition for fail over. DFS will redirect clients that are 1Gb/s to the LAN DFS across the 1.5Mb/s link too often even with sites and networks setup correctly. But I could manually fail over to it or recover from it in a major DR situation.

    12. Re:Dont worry too much by BitZtream · · Score: 1

      One could argue that you should have kept those 5 drives going till they failed so that next time around your failures would be spread out a little more rather than at one time.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    13. Re:Dont worry too much by nine-times · · Score: 1

      The last problem I had was a bad RAID controller. What happens when the controller itself goes bad? So I need two controllers? A RAID 5 on each, RAID 1 between them?

      That's why it's a good idea to get as redundant as you can afford (completely separate systems at different locations, if possible). And of course, you should be doing a backup anyway. Backups protect from more than just hardware failures. It's a good thing to remember that there's only so safe you can get on a set budget.

      What lots of people forget about this sort of security (as well as security in general), is that you can't ever be completely safe. You're just managing your risks, categorizing some as acceptable and some as unacceptable, and then determining what you can do within your budget to eliminate the unacceptable risks. But when you're all done, there's still going to be a possibility that you'll lose all of your data.

    14. Re:Dont worry too much by Anonymous Coward · · Score: 0

      I have a software raid 6 with a few Western Digital drives and a couple Samsung drives. Of course I did that because a couple of the Western Digital drives died an early death. (Yes, long exhaustive formats still have a use, unfortunately.)

      I more or less intend this to be my safe storage, with critical things getting mirrored off site via the occasional use of a portable hard drive and a normally unpowered Linux box with old hard drives in it. I plan to just format each with an ordinary ext3 file system, and just put whatever fits on each, without using LVM or something that might be awkward to put back together in the event of a failure. I'm guessing in future I might expand the raid array with larger capacity drives, then at some point I'd have changed the size of all the drives and can then expand out the array. The previously used drives that still work could then be used for the backup solution to replace smaller capacity drives. Of course, I know none of this is as good as tape and all that, but I only have so much time.

      Probably the scariest bit is when you're changing the size of the raid array or expanding the file system, but then I'm fairly new to actually using raid, so I suppose I'll get used to it.

      The key perhaps non obvious point here is, if you want a bit more reliability you could mix drives from different manufacturers. It probably will hurt your speed, but if that doesn't matter, you should, in theory, reduce the odds of most of your drives failing at the same exact time. This is also why I am using software raid, since, in theory, I should be able to replace the motherboard and still have it all work...

    15. Re:Dont worry too much by Anonymous Coward · · Score: 0

      this is guaranteed systemic problem

      you either:

      a-got a bad batch

      b-got bad power

      c-got bad cooling

      d-you're a dumbass

      e-your coworkers and you are dumbasses

      f-profit

    16. Re:Dont worry too much by Anonymous Coward · · Score: 0

      I have a raid 6 with 16 drives. I bought all the drives at once and made sure they all had the same firmware. So in the extreme case of 3 or more drives failing at once, all I need is one functional PCB to ghost the failed drives, insert the new drives into the raid and bring the raid back online. Done.

    17. Re:Dont worry too much by oblivionboy · · Score: 1

      Seagate Barracuda 7200s? I saw this happen before once in the mid 90s with a friend of mine. Something like four out of six went pretty much all at once...

    18. Re:Dont worry too much by leonroy · · Score: 1

      An Apple engineer explained that the reason Apple charge a premium over other suppliers for extra hard disks and memory is that:
      1. They validate the part.
      2. For hard disks at least, no two disks shipped with your Apple machine will have the same lot number. The idea is that this reduces the chance of multiple disks from a faulty batch going bad at the same time.

    19. Re:Dont worry too much by SuseLover · · Score: 1

      The real issue is one that anyone who has ever had to recover a multi-drive array can tell you instantly: if one drive fails, and the other drive was bought at the same time, and has had a nearly identical usage pattern, the odds of the other drive failing are well above average.

      Which is why when buying drives for an array you should specify that they all be from different manufacturing lot numbers. I have seen this before, if there was a mfg. glitch a whole batch of drives can be affected by the same problem. Sun Micro did this by default for us when we got their arrays, I don't know if they still do though. Plus a good vendor "burns in" the drives to weed out infant mortality failures.

    20. Re:Dont worry too much by SatanicPuppy · · Score: 1

      At that point the MTBF was longer than the projected lifespan of the system, so we didn't bother. One of the drives we put in actually does have some kind of intermittent fault; I still get a blip in my logs every now and then.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    21. Re:Dont worry too much by Phantom+Gremlin · · Score: 1

      By the time the smoke cleared we had replaced 19 out of 24 drives.

      My "gut feel" is that the failure rate you encountered was much too high to attribute to simple wear-out related drive failures.

      I suspect something common, like a bad power supply, somehow damaged the drives. You were wise to replace all. But you probably should have scrapped the entire enclosure.

  3. Backup by Anonymous Coward · · Score: 2, Informative

    RAID is not, and has never been, a substitute for backups.

    1. Re:Backup by PitaBred · · Score: 1

      Nope. It's a great place for backups to go to, though. Run RAID on one machine, back up your others to it. The most important thing is to never keep any data that you will mind being gone in only one place. And critical stuff (which typically doesn't change), compressed, encrypted and shipped off to another physical location, either a relative's house or web server space or something.

  4. At least..... by stun · · Score: 0

    I am not running Windows.......oh wait nvm

  5. RAID != Backup by vlad_petric · · Score: 3, Insightful

    I mean, WTF? Many people regard RAID as something magical that will keep their data no matter what happens. Well ... it's not.

    Furthermore, for many enterprise applications disk size is not the main concern, but rather I/O throughput and reliability. Few need 7 disks of 2 TB in RAID5.

    --

    The Raven

    1. Re:RAID != Backup by Anonymous Coward · · Score: 4, Insightful

      Furthermore, for many enterprise applications disk size is not the main concern, but rather I/O throughput and reliability. Few need 7 disks of 2 TB in RAID5.

      Some of us do need a large amount of reasonably priced storage with fast read speed & slower write speed. This pattern of data access is extremely common for all sorts of applications.

      And this raid 5 "problem" is simply the fact that modern sata disks have a certain error rate. But as the amount of data becomes huge, it becomes very likely that errors will occur when rebuilding a failed disk. But errors can also occur during normal operation!

      The problem is that sata disks have gotten a lot bigger without the error rate dropping.

      So you have a few choices:

      - use more reliable disks (like scsi/sas) which reduce the error rate even further
      - use a raid geometry that is more tolerant of errors (like raid 6)
      - use a file system that is more tolerant of errors
      - replicate & backup your data

    2. Re:RAID != Backup by MBCook · · Score: 3, Insightful

      I've always understood it as RAID exists to keep you running either during the 'outage' (i.e. until a new disk is built) or at least long enough to shut things down safely and coherently (as opposed to computer just locking up or some such).

      It's designed to give you redundancy until you fix the problem. It's designed to let you limp along. It's not designed to be a backup solution.

      As others have mentioned: if you want a backup set of hard drives, you run RAID 10 or 15 or something where you have two(+) full copies of your data. And even that won't work in many situations (i.e. computer suddenly finds itself in a flood).

      All that said, the guy has a possible point. How long would it take to build a new 1TB drive into an array? That could be problematic.

      There is a reason SANs and other such things have 2+ hot spares in them.

      --
      Comment forecast: Bits of genius surrounded by a sea of mediocrity.
    3. Re:RAID != Backup by Rayeth · · Score: 1

      I never understand why this idea persists. RAID is more useful to me to increase write speeds when moving large files across multiple drives, than it is for having fault tolerance.

    4. Re:RAID != Backup by cbreaker · · Score: 1

      Wait - WHO SAID THAT?

      I don't think I've ever met anyone that thought RAID was a replacement for backups. Have you? Wait don't answer that, I don't need a made-up story.

      And I beg to differ on the "many enterprises concern most with disk speed" - no. Even small companies now have large data needs, and the very first thing to consider on any storage solution is usable disk space - because if there's not enough space then it doesn't work does it?

      Performance is a close second, and reliability is simply taken for granted. You're always going to use a RAID set. It just depends on how much performance you need and how much you can spend.

      Storage capacity is always #1 on the list.

      --
      - It's not the Macs I hate. It's Digg users. -
    5. Re:RAID != Backup by Walpurgiss · · Score: 4, Informative

      I run a raid5 with 1TB disks. Growing the array from 3 to 4 took around 4 hours, 4 to 5 took maybe 8 or 10, 5 to 7 took something like 30 hours I guess.

      But that's growing from a previous capacity to a larger capacity.
      Using mdadm to fake a failure by removing and re-adding a single drive, the recovery time was generally 4-5 hours.
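
      The rebuild figure above can be turned into a rough rate and extrapolated to larger disks. This is only a back-of-the-envelope sketch: it assumes decimal terabytes, takes 4.5 hours as the midpoint of the quoted range, and assumes the rebuild rate stays constant as disk size grows, which may not hold in practice. The function names are made up for illustration.

```python
TB = 1e12  # decimal terabyte, as drive vendors count it


def implied_throughput_mb_s(capacity_bytes: float, hours: float) -> float:
    """Average rebuild rate in MB/s implied by rebuilding one disk in `hours`."""
    return capacity_bytes / (hours * 3600) / 1e6


def rebuild_hours(capacity_bytes: float, throughput_mb_s: float) -> float:
    """Hours needed to rebuild one disk of `capacity_bytes` at a steady rate."""
    return capacity_bytes / (throughput_mb_s * 1e6) / 3600


# 1 TB rebuilt in about 4.5 hours (midpoint of the 4-5 hours quoted above).
rate = implied_throughput_mb_s(1 * TB, 4.5)
print(f"implied rebuild rate: {rate:.1f} MB/s")

# Same rate applied to a 2 TB disk: the rebuild window simply doubles.
print(f"2 TB rebuild at that rate: {rebuild_hours(2 * TB, rate):.1f} hours")
```

      The point of the extrapolation is that the window during which the array has no redundancy grows linearly with disk size unless rebuild throughput improves too.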

    6. Re:RAID != Backup by Anonymous Coward · · Score: 0

      BINGO!!!

      Someone on slashdot actually gets it! Kudos, man!

    7. Re:RAID != Backup by AngelofDeath-02 · · Score: 1

      I use a raid 5 as a lazy man's backup.

      I'm well aware that if one drive dies, the others probably aren't far behind, and may well die while they restore a replacement.

      I'm willing to accept this risk because I use the raid5 as more of a convenient large storage medium that is less likely to disappear.

      Of course, it's 6 disks for 1.3 terabytes... But I can no longer afford to upgrade it. I could buy 1 drive and back it up.

      --
      No, I am not an English major. My posts are subject to typos and incorrect grammar. Do not expect perfection.
    8. Re:RAID != Backup by Pentium100 · · Score: 1

      I do not know much about RAID, but if the read error occurred during rebuild, wouldn't just that sector/cluster be lost and not the entire array?

    9. Re:RAID != Backup by WuphonsReach · · Score: 1

      If you use SoftwareRAID in Linux, don't do a single RAID-6 array across your disks. Instead, divide the disk up into parts (say 1/4 of the disk for each) and do four separate RAID-6 arrays.

      Now, obviously, if a disk goes completely kaput - you're in the standard situation. But a single read-error on a single-disk, will only knock out one of the four RAID-6 arrays. And since the array only spans 1/4 of the disk, it rebuilds faster. Hopefully before a 2nd error occurs.

      It doesn't help much for situations where you need one BIG disk. (And you could always LVM across multiple RAID arrays.) But there could be some pretty good advantages in other situations (such as making the stripe size on one of the arrays smaller than the stripe size on the rest of the arrays).
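
      As a rough illustration of the layout described above, the sketch below only builds the mdadm command lines for one RAID-6 array per partition slot; it does not run them. It assumes each disk has already been split into equal partitions, and the device names (sda..sdd), the md numbering, and the helper name are placeholders, not anything from the post.

```python
from typing import List


def mdadm_create_commands(disks: List[str], parts_per_disk: int,
                          level: int = 6) -> List[List[str]]:
    """Build one `mdadm --create` argv per partition slot.

    Slot k (1-based) combines partition k of every disk into /dev/md(k-1).
    Nothing is executed here; the caller decides what to do with the lists.
    """
    commands = []
    for slot in range(1, parts_per_disk + 1):
        members = [f"/dev/{disk}{slot}" for disk in disks]
        commands.append([
            "mdadm", "--create", f"/dev/md{slot - 1}",
            f"--level={level}",
            f"--raid-devices={len(members)}",
            *members,
        ])
    return commands


# Hypothetical example: four disks, each pre-partitioned into four slices.
for argv in mdadm_create_commands(["sda", "sdb", "sdc", "sdd"], 4):
    print(" ".join(argv))
```

      Printing the commands first makes it easy to review the device list before anything destructive is run.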

      --
      Wolde you bothe eate your cake, and have your cake?
    10. Re:RAID != Backup by blai · · Score: 1

      I agree. If you're running RAID 5 at home for backup, you should seriously consider some backup software which backs up to a hard drive that does not run 24/7.

      --
      In soviet Russia, God creates you!
    11. Re:RAID != Backup by MassacrE · · Score: 1

      The problem is that the odds of a nonrecoverable read error are changing while the size goes up. So say my new raid5 4x1.5 TB has a drive fail. Odds are about 15%, based on a 1 in 1x10^14 chance of an unrecoverable read error per bit read, that there will be data corruption while trying to build the new disk. The issue is that all data must be valid and readable for raid5 to fully recover from failure. A mirror would only need (one of) the mirrored disks to be readable to recover.
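
      The estimate above depends heavily on how much data the rebuild actually has to read. A minimal sketch of the arithmetic, assuming every bit read fails independently with probability 1 in 10^14 (a simplification of how drives really fail) and decimal terabytes; the function name is made up for illustration:

```python
import math

TB = 1e12  # decimal terabyte


def p_unrecoverable_read(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error while reading bytes_read.

    Computes 1 - (1 - p)^bits via log1p/expm1 to stay accurate for tiny p.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


# 4 x 1.5 TB RAID 5 with one dead drive: the three survivors are read in full.
print(f"3 x 1.5 TB read: {p_unrecoverable_read(3 * 1.5 * TB):.1%}")

# The article's case: 7 x 2 TB RAID 5, six survivors read in full.
print(f"6 x 2 TB read:   {p_unrecoverable_read(6 * 2 * TB):.1%}")
```

      Under these assumptions, reading all three surviving 1.5 TB disks gives roughly 30 percent; a figure near 15 percent corresponds to reading about 2 TB, for instance if the array is only partly full and the rebuild skips unused space.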

      Also, recovery reads are no different from regular reads - making an offsite backup for your data also has that chance of encountering a read failure, although the controller should be able to recover based on the redundant disk.

      Of course, if capacities continue to improve without better error recovery, it'll take four years for a mirrored setup to catch up to the same poor odds.

    12. Re:RAID != Backup by Anonymous Coward · · Score: 1, Informative

      Actually, that flood just happened in our server room. The 15 ton AC/dehumidifier crapped out and started dumping water like nobody's business. That happened either late Friday night or sometime over the weekend. By the time we got in on Monday morning, the water was about half an inch deep. About 400 gallons later, we shop-vac'd enough water that we could leave the server room to air dry.

      Luckily nothing fried, but just letting you know freak problems do occur, and you could end up with a flooded/incinerated server room.

    13. Re:RAID != Backup by cbreaker · · Score: 1

      You don't count - you understand the risk =)

      You're running 300GB disks right? It's amazing, I have a 7 disk RAID 5 array on my file server at home and it's got 4.1TB =) I put it together about two months ago. I'd wanted to do it for a LONG time but until those 750's came down to the right price (about $100 each) I just couldn't justify it. The server also has four 500's in it (I mean, you can buy a 500GB for nearly $50 now!)

      The cost comes with the RAID controllers. I wanted good performance so I got a hardware RAID card for it and the SATA RAID cards ain't cheap. I settled on a 12-port Accusys card. I love the thing. Couldn't be happier with it. I purchased a 4-port version for my VM host.

      No way I can really back that up so I replicate a good portion of it to a friend's file server at his house. There was a lot of pre-staging involved, obviously, but DFS Replication works wonderfully for keeping things up to date in both directions.

      --
      - It's not the Macs I hate. It's Digg users. -
    14. Re:RAID != Backup by AngelofDeath-02 · · Score: 1

      I think they are 250 gigs

      let me log in and check ... 7 disks, 250 gig.

      I personally went with software raid, and I'm pretty happy with it. I've had a few system failures in the past and was able to easily recover them because any linux kernel 2.4 or later will auto detect and initialize the raid, in any order. I've also had a tendency of rebuilding the system every year or so - but I'm breaking out of that mold. It runs, I have quiet fans, and the drives spin down after 10 minutes of inactivity. I'm happy ^_^

      Also, there's the unexpected benefit of being able to grow the raid by swapping them out individually for larger disks! Although they are IDE, I'd love to get SATA. I just can't afford to buy new drives - but the comp to run it is really only about 150$

      --
      No, I am not an English major. My posts are subject to typos and incorrect grammar. Do not expect perfection.
    15. Re:RAID != Backup by truthful+cynic · · Score: 1

      You are growing horizontally (by adding drives), therefore, you are increasing the probability for failure as you are growing the RAID set.

      One super neat thing that isn't mentioned much about ZFS is that you can expand a set by replacing the current drives with larger drives - when you replace all the drives in your set, you will see the additional space. So, next year, when I upgrade my 3x1tb with the 2+tb drives that will be out, I'll be able to get a 3x2+tb set with no downtime. Anyone who thinks that one can afford the time to copy data to a bigger filesystem for more space is nuts in the days of TB drives.

    16. Re:RAID != Backup by Anonymous Coward · · Score: 0

      I use two disks and run rsync between them once a day. The result is somewhere between a backup and RAID1, since a fsckup on one disk won't be immediately replicated to the other. Good enough for home use.

    17. Re:RAID != Backup by Walpurgiss · · Score: 1

      That is a really excellent feature for ZFS. I'll have to look into that when I'm ready to ditch these drives. Just read today about the WD GP drives' retarded load/unload cycling every 8s, and saw with smartmon that one of my set has over 550,000 load cycles over 6342 power-on hours, when the drive's specs show the lifetime as 300,000.
      I expect that drive in the set to fail soon, and I don't plan to buy this green power garbage again.

    18. Re:RAID != Backup by jimicus · · Score: 1

      I do not know much about RAID, but if the read error occurred during rebuild, wouldn't just that sector/cluster be lost and not the entire array?

      Well, there's two problems there.

      First, there's no such thing as one error. Once you start to see errors, they generally multiply like rabbits on viagra.

      Second, where has that error occurred? If it's in the middle of a 2GB data file which forms part of the backing store for your accounts database, there's a strong chance you'll have to recover the whole database. Even if your DBMS can repair the file corruption, how exactly are you going to explain to the finance director that one or more transactions in the accounts system may no longer be correct and you're not sure which transaction it is or how many are affected?

    19. Re:RAID != Backup by Pentium100 · · Score: 1

      From the article:

      SATA drives are commonly specified with an unrecoverable read error rate (URE) of 10^14. Which means that once every 100,000,000,000,000 bits, the disk will very politely tell you that, so sorry, but I really, truly can't read that sector back to you.

      So, the array will know where this error is, and the error will most likely be contained in a single sector. Yes, if the bad sector was in the file system structure (for example $MFT, or the FAT (but it has a second copy)) or in the middle of an important file (and software could not correct it) then you are out of luck. If it happened in an unused part of the disk, then you would just have a few kilobytes of unusable space (because most likely the system would mark the sector as bad).

      But the chances of it happening and being a really uncorrectable error are lower than 1 in 12TB

    20. Re:RAID != Backup by cbreaker · · Score: 1

      NewEgg has some 8 port SATA controllers (non RAID) for pretty cheap. I recently purchased one and it was a PCI-X card that also worked fine in a normal PCI slot. The card is nice and fast even over PCI (because PCI isn't THAT slow.)

      I originally went software RAID too. I did this because what happens if the RAID controller dies? That's a big problem if it's 5 years from now and I can't get the same controller anymore.

      The big reason I went with HW was not only the performance (which wasn't excellent under Windows) but the fact that if the machine doesn't shut down cleanly the machine will have to do an entire rescan of the array. With a 7 disk 750GB SATA array, this was taking about 20 hours. This happens on Linux too. It's normal for software RAID but can be avoided on HW RAID.

      I don't have a complete battery and generator power solution, so I got a HW raid card. The Accusys cards actually have a little bit of flash on them that can keep track of things so if you power off it won't have to rebuild, even without a battery unit.

      Software RAID can be a lot more flexible than HW RAID but HW RAID has a lot of benefits so I went that way.

      --
      - It's not the Macs I hate. It's Digg users. -
    21. Re:RAID != Backup by castanaveras · · Score: 1

      Then you have 4 RAIDs competing for the spindles. That is going to _suck_ performance wise.

    22. Re:RAID != Backup by dannys42 · · Score: 1

      In fact hardware RAID is _only_ really useful for throughput. If you care about data integrity during a failure, RAID is really bad (even when mirroring). The reason is because RAID doesn't specify how the data is actually stored on the disk. If you're using a hardware RAID controller and your controller dies, you've lost all your data. You don't always even have the option of buying a new controller, as firmware can change even with the same model.

      It's for this reason that I only trust software RAID (linux in particular, of course), whenever I do use RAID. I know I can always take the disk and load it up on another Linux machine without difficulty. The same can't be said for any hardware RAID format. (please correct me if I'm wrong).

    23. Re:RAID != Backup by Anonymous Coward · · Score: 0

      if you plan on consistently growing your array, picking raid 4 might be a better solution.

  6. RAID is fine, stupid admins are not! by Anonymous Coward · · Score: 0

    RAID 5 and "carefully protected"?

    RAID 5, as well as RAID 6, is nothing more than an attempt to add some amount of redundancy without sacrificing too much space. Go RAID 1 instead with the same number of disks. Also do off-site mirroring of all your data.

    And if you get an "unrecoverable read error" after a drive failure, it means the administrator should get fired, as he was too stupid to type "echo check > sync_action" followed later by "cat mismatch_cnt".
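
    The two commands quoted above act on the Linux md sysfs files for an array, typically under /sys/block/mdX/md/. A minimal sketch of the same scrub steps, assuming that layout; the helper names are hypothetical, the directory is passed in rather than hard-coded, and writing to the real sysfs files needs root:

```python
from pathlib import Path


def start_check(md_dir: str) -> None:
    """Equivalent of `echo check > sync_action`: ask md to scrub the array."""
    (Path(md_dir) / "sync_action").write_text("check\n")


def current_action(md_dir: str) -> str:
    """Return what md is doing right now, e.g. 'idle' or 'check'."""
    return (Path(md_dir) / "sync_action").read_text().strip()


def read_mismatch_count(md_dir: str) -> int:
    """Equivalent of `cat mismatch_cnt`: inconsistencies found by the last check."""
    return int((Path(md_dir) / "mismatch_cnt").read_text().strip())


# Example use on a real system (assumed array name md0, run as root):
#   start_check("/sys/block/md0/md")
#   ... wait until current_action("/sys/block/md0/md") == "idle" ...
#   print(read_mismatch_count("/sys/block/md0/md"))
```

    Running a check like this on a schedule forces every sector to be read while the array still has redundancy, so latent read errors surface before a rebuild depends on them.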

    1. Re:RAID is fine, stupid admins are not! by DrVxD · · Score: 3, Insightful

      RAID 5, as well as RAID 6, is nothing more than an attempt to add some amount of redundancy without sacrificing too much space. Go RAID 1 instead with the same number of disks.

      As far as I'm concerned, RAID 5 really has no redeeming features (it's slow, not particularly safe, but lulls people into a false sense of security).

      From a data integrity perspective, though, RAID6 is a better solution than RAID1.

      Given arrays of equal sizes, with RAID6 your data can survive the loss of *any* two disks; with RAID1, if you lose two disks which happen to be a mirrored pair, then you're hosed.

      But, as you point out, RAIDn doesn't really qualify as "carefully protected"

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    2. Re:RAID is fine, stupid admins are not! by Bandman · · Score: 1

      The problem with RAID6 is that it's not supported by most controllers, especially if it's a year or two old. My new EMC AX4-5 doesn't even support it. I'm doing RAID10 on it, though 6 would have given me some more usable space. I just set it up with two hotspares so I feel alright about the possibility of an error.

    3. Re:RAID is fine, stupid admins are not! by Anonymous Coward · · Score: 0

      Given that RAID1 is pretty much for two disks only, I fail to see how any form of RAID will prevent data loss when you lose two disks.

    4. Re:RAID is fine, stupid admins are not! by 4D6963 · · Score: 1

      Wait wait, your RAID 6 vs RAID 1 claim, that's bullshit. Of course RAID 6 can survive the loss of any two disks, it requires you to have at least 4 of them! If you have a RAID 1 array of 4 disks then you can lose any 3 disks. RAID 1 is the best thing from a data integrity perspective, because you can lose any disk in your array but one. It's just awfully inefficient space and performance wise, well, except for reading, then it's great.

      --
      You just got troll'd!
    5. Re:RAID is fine, stupid admins are not! by rthille · · Score: 1

      With 4 drives and raid 6, you get 50% storage and any two drives can fail.
      With 4 drives and raid 1(x4) (does anything even support 4x1 mirroring?), you get 25% storage and any 3 drives can fail.
      With 4 drives and raid 1+0 (a stripe of two 2-drive mirrors), you get 50% storage, but if the wrong two drives fail, it takes out the array.

      With 8 drives and raid 6, you get 75% storage and any two drives can fail.
      with 8 drives and raid 1(x8?), you get 12.5% storage and any 7 drives can fail.
      with 8 drives and raid 1+0, you'd get 50% storage, and still be screwed if the wrong two drives failed.
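
      The figures above follow from simple formulas. A small sketch that reproduces them, assuming equal-sized drives, "raid1" meaning a single n-way mirror, and "raid10" meaning a stripe of 2-drive mirrors; the function name and level labels are made up for illustration:

```python
from typing import Tuple


def raid_summary(level: str, n: int) -> Tuple[float, int]:
    """Return (usable fraction of raw capacity, failures survived in the worst case)."""
    if level == "raid5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n - 1) / n, 1
    if level == "raid6":
        if n < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (n - 2) / n, 2
    if level == "raid1":  # one n-way mirror
        if n < 2:
            raise ValueError("a mirror needs at least 2 drives")
        return 1 / n, n - 1
    if level == "raid10":  # stripe of 2-drive mirrors
        if n < 4 or n % 2:
            raise ValueError("RAID 1+0 needs an even number of drives, at least 4")
        # Only one failure is guaranteed survivable: a second failure
        # on the partner of the first dead drive kills the array.
        return 0.5, 1
    raise ValueError(f"unknown level: {level}")


for n in (4, 8):
    for level in ("raid6", "raid1", "raid10"):
        usable, survives = raid_summary(level, n)
        print(f"{n} drives, {level}: {usable:.1%} usable, any {survives} drive(s) can fail")
```

      Note that the RAID 1+0 number is the worst case; it can survive more failures as long as no mirror pair loses both members.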

      --
      Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
    6. Re:RAID is fine, stupid admins are not! by DrVxD · · Score: 1

      Given that RAID1 is pretty much for two disks only

      No, RAID1 will work with any even number of disks. If you have an array with 4 drives in, you've got 2 pairs of mirrors. In the dual-failure case, you have two scenarios:
      1) 2 "paired" drives fail. The data on that half is gone, but the other half is still mirrored, so you've only lost half your data.
      2) 2 "unpaired" drives fail - you survived, you just don't have mirrors any more.
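
      The two dual-failure scenarios above can be counted by brute force. A minimal sketch, assuming drives are mirrored in fixed pairs (0,1), (2,3), and so on, and that every pair of simultaneous failures is equally likely (real failures are often correlated, so treat the result as a lower bound on the risk). The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def mirror_partner(disk: int) -> int:
    """Disks are paired (0,1), (2,3), ...; flipping the low bit gives the partner."""
    return disk ^ 1


def fatal_pair_fraction(n_disks: int) -> Fraction:
    """Fraction of two-disk failure combinations that hit both halves of one mirror."""
    if n_disks < 4 or n_disks % 2:
        raise ValueError("need an even number of disks, at least 4")
    pairs = list(combinations(range(n_disks), 2))
    fatal = sum(1 for a, b in pairs if mirror_partner(a) == b)
    return Fraction(fatal, len(pairs))


for n in (4, 8, 16):
    print(f"{n} disks: {fatal_pair_fraction(n)} of double failures lose a mirror pair")
```

      The fraction works out to 1/(n-1), so the paired layout gets less likely to lose data on a double failure as the array grows, while a dual-parity layout survives every double failure regardless of size.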

      I fail to see how any form of RAID will prevent data loss when you lose two disks.

      Broadly speaking, it's down to having better data distribution. You've got an extra set of parity data to rebuild the array from. Wikipedia is your friend. As the number of drives in the array increases, RAID6 gets even more attractive than RAID1 (From a cost & data integrity POV, at least - but RAID1 will generally give you better performance)

      Of course, it's theoretically possible to treat half of a RAID1 pair as a "hot backup" - just pull it out, plug in a fresh drive, and cart it off to use as your off-site. I wouldn't recommend that as your primary backup strategy though...

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    7. Re:RAID is fine, stupid admins are not! by DrVxD · · Score: 1

      Maybe you missed the bit where I said "arrays of equal sizes", thus causing you to spout bullshit?

      It smells to me like you're talking about a 4-way mirror; I don't see how a 4-way mirror gives you the same storage ("equal size") as a 4-drive RAID6 configuration. (and, as has been pointed out elsewhere, I don't think anything actually supports 4-way mirrors)

      If you've got 4x1TB drives, you can have a 2TB RAID1 config, or a 2TB RAID6 config. The RAID1 config will be faster, sure - but in the face of 2 drive failures, there's a good chance you've lost half your data (and if your file system spans the whole array, that could potentially mean losing all of it). With RAID6, you'll still have it all there.

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    8. Re:RAID is fine, stupid admins are not! by Anonymous Coward · · Score: 0

      No, RAID 1 is still better. He said RAID 1 with the same number of disks.

      So if you have five disks in a RAID 5, you can have one fail. If you have five disks in a RAID 6, two can fail. If you have five disks in a RAID 1, four can fail.

      If you have the money, a 3+ drive RAID 1 is superior to a RAID 6. I say you need money because you lose a lot of space in that kind of set up.

      Recovery is also much faster. No special calculations are needed.

    9. Re:RAID is fine, stupid admins are not! by prefect42 · · Score: 1

      I think that's a little harsh. Most people's experiences of RAID5 are based around rubbish RAID controllers, which to be honest are possibly in the majority.

      RAID5 is a compromise, same as any other RAID level. With a good battery backed controller (as opposed to a bad battery backed controller, or one without) it's reasonably fast and reasonably safe. For some definition of reasonably. And the real winner, is it doesn't waste huge amounts of disk, so you can convince people to use it.

      --

      jh

    10. Re:RAID is fine, stupid admins are not! by 4D6963 · · Score: 1

      If you've got 4x1TB drives, you can have a 2TB RAID1 config

      No you don't, you blithering twit. That's just not RAID 1 you're talking about, that's something else, like RAID 0+1/1+0. You cannot have a 2 TB RAID 1 array with 1 TB disks, only a 1 TB array. Do some reading and shut the fuck up.

      Oh, and while we're at it, there's a space between numbers and their units (it's "1 TB" not "1TB") and it's "RAID 1" not "RAID1".

      --
      You just got troll'd!
    11. Re:RAID is fine, stupid admins are not! by DrVxD · · Score: 1

      *sigh* Arithmetic really isn't your strong point, is it?
      Drive 0 mirrors Drive 1 = 1 TB
      Drive 2 mirrors Drive 3 = 1 TB

      1+1=2, so a total of 2TB. Really, look it up

      But since you've descended to (well, more "started with" rather than actually "descended to") ad hominem attacks, it's perfectly clear that there's no point in trying to have a reasonable discourse with you.

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    12. Re:RAID is fine, stupid admins are not! by 4D6963 · · Score: 1

      Wow, reading the Wikipedia article I kindly pointed you to in between two insults isn't your strong point is it? What you're talking about is NOT RAID 1 (sounds more like RAID 10 depending on the relationship between your pairs of RAID 1 couples). You can criticise nested RAID levels all you want, but don't pick one and call it RAID 1 when it's not.

      Imbecile.

      --
      You just got troll'd!
  7. What. by DanWS6 · · Score: 3, Insightful

    The problem with Raid 5 is that the more drives you have the higher probability you have that more than one drive dies. That's why you have multiple raid 5 arrays of 4 disks maximum instead of one array of 7 disks.
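
    The claim above can be illustrated with a simple binomial model. This sketch assumes drives fail independently with the same probability over some fixed window; the 3% figure is a made-up placeholder, not a measured failure rate, and real drives from one batch tend to fail in a correlated way, which makes things worse than this model suggests.

```python
def p_two_or_more_fail(n_drives: int, p_fail: float) -> float:
    """Probability that at least two of n independent drives fail in the window."""
    p_none = (1 - p_fail) ** n_drives
    p_exactly_one = n_drives * p_fail * (1 - p_fail) ** (n_drives - 1)
    return 1 - p_none - p_exactly_one


P_FAIL = 0.03  # hypothetical per-drive failure probability for the window

for n in (4, 7):
    print(f"{n}-drive RAID 5: {p_two_or_more_fail(n, P_FAIL):.2%} chance of losing 2+ drives")
```

    With these placeholder numbers the 7-drive array is a bit over three times as likely to lose two drives as the 4-drive one. Splitting into several small arrays lowers the risk per array, at the cost of one parity disk for each.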

    1. Re:What. by Anonymous Coward · · Score: 0

      The problem with Raid 5 is that the more drives you have the higher probability you have that more than one drive dies. That's why you have multiple raid 5 arrays of 4 disks maximum instead of one array of 7 disks.

      But what am I going to do with my 7000 2GB disks then?

    2. Re:What. by Bandman · · Score: 1

      RAID-01? ;-)

  8. Oh look, noobs. by Anonymous Coward · · Score: 1, Insightful

    If you use RAID to 'protect' your data, you clearly don't value your data at all.

    While the interesting bit of this article is the coming demise of RAID 5, what you should be bringing away with it is, if RAID is all that stands between you and data loss, you're a noob.

    1. Re:Oh look, noobs. by cbreaker · · Score: 1

      At home, I use RAID to protect my non-replicated data between backups.

      At work, I use RAID to protect my data between backups and to help prevent down time due to a disk failure.

      RAID is an excellent tool to protect yourself. There's multiple levels of protection and RAID is just one of them.

      --
      - It's not the Macs I hate. It's Digg users. -
    2. Re:Oh look, noobs. by Anonymous Coward · · Score: 0

      As a thought experiment. If we were to replace all single-disk storage in PCs over the world with RAID arrays overnight (e.g. two-disk RAID 1), what effect do you think that would have on the amount of data lost?

      I'm not saying it's a replacement for backup, but it does arguably protect the data stored against certain types of failures.

    3. Re:Oh look, noobs. by Migity · · Score: 1

      At home, I use Raid to kill bugs :)

  9. Redundant Array of Irrelevant Data by mschuyler · · Score: 1, Interesting

    That's what RAID stands for. It's a nice idea in theory, as long as the disks remain cheap, but I've never trusted them to work properly and had more than one break on me. "All you have to do is unplug the bad disk, plug in a good one in its place, and in a few minutes all will be hunky dory." Bzzt. Wrong. Thanks for playing.

    Backup every day to tape, to another disk entirely on a different machine, to R/W DVD, twice a day if you have to, or all of the above--anywhere else but the machine itself. RAID: the accident waiting to happen. Yeah, I'm paranoid. It comes from experience.

    --
    How about a moderation of -1 pedantic.
    1. Re:Redundant Array of Irrelevant Data by Bandman · · Score: 1

      Reminds me of the definition of a lie:

      A poor substitute for the truth, but the only one discovered to date

  10. Just double-up on everythign by realmolo · · Score: 3, Informative

    If you have one RAID5 box, just build another one that replicates it. Use that for your "hot backup". Then back that up to tape, if you must.

    Storage is so cheap these days (especially if you don't need super-fast speeds and can use regular SATA drives), that you might as well just go crazy with mirroring/replicating all your drives all over the place for fault-tolerance and disaster-recovery.

    1. Re:Just double-up on everythign by Anonymous Coward · · Score: 1

      Consumer storage is cheap. Enterprise still costs around $10,000/TB in fast SAS storage that is fully featured.

      It sucks.

    2. Re:Just double-up on everythign by Anonymous Coward · · Score: 0

      That's a bit overcomplicated. A much simpler scheme is to use raid for local storage and back up any documents saved on the array regularly. The rest of the data on most systems is available from bit torrent and doesn't particularly need a backup.

    3. Re:Just double-up on everythign by cbreaker · · Score: 1

      It's cheap - yes. For drives from NewEgg.

      I can go and buy a bunch of 500GB disks for around $55 each now. It's amazing!

      But what happens if you need Fiber Channel performance and share LUNs for Clustering or VMware? Your SATA disks from NewEgg won't help you.

      Even low-end SAN systems with iSCSI connectivity aren't exactly free. It's not really about the cost of each disk, it's the cost of the management unit and the hard disk bays. Not everyone has the ability to just drop in a bunch of grey boxes running FreeNAS. Sometimes you need an actual SAN and they cost money even when you use SATA disks.

      --
      - It's not the Macs I hate. It's Digg users. -
    4. Re:Just double-up on everythign by realmolo · · Score: 1

      You guys ever hear of OpenFiler?

      Yes, the big-iron SANs cost a lot. Too much. But you can build your own for not that much money. Yeah, you lose some of the cool management features. But so what? They're so cheap, that for the price of one commercial SAN, you can build TEN of the things.

      Unless you *really* need the performance of expensive drives, SATA drives are fine. Most companies don't need super-fast storage, they just need enough to saturate their network. And they need LOTS of space. It's just storage, man.

    5. Re:Just double-up on everythign by tylernt · · Score: 1

      I wouldn't advise replacing your FC SAN with consumer level hardware, but you certainly don't need FC SAN performance for backups. At work we did exactly what the OP suggested, and that was stick 4 big IDE drives in an old workstation (this was back before SATA was common), install Windows and enable NTFS compression, and schedule a Task to fire off a little after quitting time. So what if it's slow? As long as the backup finishes before people come in to work the next morning, who cares? If you use a diff-style backup like SyncToy or Rsync, everything after your first backup goes pretty quick.

      --
      DRM 'manages access' in the same way that a prison 'manages freedom'
    6. Re:Just double-up on everythign by cbreaker · · Score: 2, Interesting

      Well, I did mention FreeNAS so that lends itself to the possibility that I *probably* know what OpenFiler is.

      SATA disks actually aren't fine for a lot of applications. Any SINGLE app, I'll bite. But for most VMware installations where you have over 10 virtual machines (that are actually USED in production) your SATA disks might not cut it. Or they might be fine. It really depends.

      It's not about disk transfer speed, it's about IOPS. The 10 or 15K SAS/FC disks will get your data faster. And that's what it's all about. Nearly all normal infrastructure-type servers (File servers, e-mail, normal-use databases, etc) require a lot of IOPS but don't really care about throughput. It takes basically the same amount of time to fetch 4k as it does to fetch 1MB.

      I'd love to be able to offer an OpenFiler solution to our customers, and I'm pushing for it for some of our smaller clients that want to go virtual, but it's not an easy sell. For home, it's great. For a one-off project or for a non-critical backup system, sure. Production? I trust it, but I live in the real world where our customers don't.
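
      A rough sketch of why spindle speed drives IOPS: for small random reads, service time is dominated by seek time plus rotational latency, and the transfer itself barely registers. The seek times below are illustrative assumptions, not figures from any datasheet or from this thread.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def random_iops(rpm, avg_seek_ms):
    """Estimate small-block random IOPS as 1 / (seek + rotational latency)."""
    service_time_ms = avg_seek_ms + rotational_latency_ms(rpm)
    return 1000.0 / service_time_ms


# Hypothetical drive classes; the seek times are assumed for illustration only.
drives = {
    "7200 rpm SATA": (7200, 8.5),
    "10K rpm SAS": (10_000, 4.5),
    "15K rpm SAS/FC": (15_000, 3.5),
}

for name, (rpm, seek_ms) in drives.items():
    print(f"{name}: about {random_iops(rpm, seek_ms):.0f} random IOPS per spindle")
```

      Whatever seek times you plug in, the rotational term alone halves when going from 7200 to 15K rpm, which is the effect described above.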

      --
      - It's not the Macs I hate. It's Digg users. -
    7. Re:Just double-up on everythign by cbreaker · · Score: 1

      I agree with you completely, but a disk-based backup system wasn't the intended target for the discussion.

      I have to disagree with you about Synctoy. I absolutely hate synctoy. Use a disk snapshot system instead, such as Backup Exec System Recovery (i.e. Ghost 14 for servers.) You can do incremental with that and you can even preserve bootability and all metadata.

      --
      - It's not the Macs I hate. It's Digg users. -
    8. Re:Just double-up on everythign by Jaime2 · · Score: 1

      I get dual-port SAS 15K drives for $2/GB for my SAN. If you add the cost of the SAN and the fibre channel switches, then I can get two switches, a 12 bay dual controller SAN, and 12 450GB drives for $30,000. That's about $6,000/TB using RAID 6, $10,000/TB at RAID 10. The second batch of drives runs about a third of the first batch after the investment in all the FC infrastructure. HP has a new SAS SAN if the initial sticker shock of fibre channel is too much. You can get the first TB for under $10,000 and add to it for about $3,000/TB. You can't hook a ton of hosts to it, so it's more suited for a monster VMWare cluster than for a general purpose workload.
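
      The per-TB figures can be sanity-checked with a short sketch. It assumes all 12 drives sit in one RAID group with no hot spares, decimal terabytes (1 TB = 1000 GB), and the quoted $30,000 spread over usable capacity only; the function names are made up for this example.

```python
def usable_tb(drives, drive_gb, level):
    """Usable capacity in decimal TB for a single RAID group."""
    if level == "raid6":
        data_drives = drives - 2      # two drives' worth of parity
    elif level == "raid10":
        data_drives = drives // 2     # every drive is mirrored
    else:
        raise ValueError(f"unsupported level: {level}")
    return data_drives * drive_gb / 1000.0


def cost_per_usable_tb(total_cost, drives, drive_gb, level):
    """Total system cost divided by usable capacity."""
    return total_cost / usable_tb(drives, drive_gb, level)


for level in ("raid6", "raid10"):
    tb = usable_tb(12, 450, level)
    dollars = cost_per_usable_tb(30_000, 12, 450, level)
    print(f"{level}: {tb:.1f} TB usable, about ${dollars:,.0f} per TB")
```

      This lands in the same ballpark as the rounded numbers quoted above; the exact result shifts with spares and formatting overhead.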

    9. Re:Just double-up on everythign by Thundersnatch · · Score: 1

      Yes, the big-iron SANs cost a lot. Too much. But you can build your own for not that much money. Yeah, you lose some of the cool management features. But so what? They're so cheap, that for the price of one commercial SAN, you can build TEN of the things.

      Building ten OpenFilers makes the problems that need solving worse, not better. You have islands of storage, 10x the failure probability, and 10x the management expense.

      SANs are about provisioning and protecting storage in a manageable, policy-driven way, as well as enabling features like clustering, thin provisioning, differential replication, zero-loss failover, VSS-aware snapshots, and a million other things no FOSS solution even comes close to doing. We paid >>$100K for ~10 TB of clustered iSCSI storage, and it has been worth every penny. We have had zero downtime and zero data loss since implementation four years ago, all the while performing upgrades to the SAN adding new features and increasing performance. In testing we yanked the power from one of the storage modules (really just a rebadged HP DL320S) and everything hummed along nicely, and re-synced in minutes when power was restored.

      Contrary to Slashrone lore, IT purchasing managers are not idiots. We pay vendors for these expensive toys because they are actually better in every way and more valuable in every way than your home-brew Linux Frankenserver.

    10. Re:Just double-up on everythign by Isao · · Score: 2, Interesting

      Good first thought, but the idea that keeps hanging in the periphery of the discussion above is that if you consolidate massive storage into a single LUN like that, it takes too long to back it up. The controllers simply can't move the data off fast enough. This is why in production systems you never see RAID LUNs maxed out. (Another reason is to distribute your transactions across multiple I/O channels.)

      EMC and its smaller rivals make a fortune on clever array technology that allows you to perform "snap clones" of LUNs that can later be backed off to offline storage at a lower rate. As long as it can be done before the next "snap" window, you're OK. Otherwise, reduce the LUN size and stand up more robots.

    11. Re:Just double-up on everythign by Abcd1234 · · Score: 1

      If you have one RAID5 box, just build another one that replicates it.

      Or just use something more reliable, like RAID 10, and toss out all of RAID5's performance problems along with it. As you say, storage is cheap, and the chief advantage that RAID5 offers (storage efficiency) seems like a minor concern compared to its disadvantages.

  11. You're missing the point. by Polarina · · Score: 2, Informative

    A RAID 5 setup is only a precaution in case of a hardware failure. It serves as no excuse for not having backed up your data.
    And the topic is also flawed - RAID 5 doesn't have any self-destruct mechanism.

    1. Re:You're missing the point. by mabhatter654 · · Score: 1

      The topic is not flawed. In a heavily worn RAID set near the end of its drives' lives, it does happen that when one drive fails, another (from the same batch, in the same operating conditions!) dies from the extra wear and tear of the rebuild. I think this happened to one of our servers at work, and they had to restore the whole thing from tape on a backup server.

  12. 7 2TB Disks in RAID 5???????? by quantumplacet · · Score: 1

    This story is just ridiculous. It clearly states that this doesn't affect Enterprise users, as their URE rate is lower and unless they're idiots they use smaller drives. What home users will have 7 disk RAID 5 arrays of 2TB disks? Is this really a large enough percentage of RAID5 users to call for the death of it?

    1. Re:7 2TB Disks in RAID 5???????? by cong06 · · Score: 2, Informative

      The main point of the article is to point out a problem that is going to eventually occur. If you read the article he mentions that later on with large enough hard drives, everyone will require a RAID set up with their "Dell manufactured" Computer. (assuming Dell hands out >>2-4TB disks to their average user)

    2. Re:7 2TB Disks in RAID 5???????? by pin0chet · · Score: 1

      You'd be surprised how quickly 50GB Blu-Ray images add up. I have 6 TB in a RAID 10 currently, and within 18 months I am confident I'll be at twice that, at least. With 1.5TB disks for $189 these days, who doesn't gradually amass terabytes of data?

    3. Re:7 2TB Disks in RAID 5???????? by cbreaker · · Score: 1

      You can use large disks in the enterprise without being an idiot.

      In fact, the size of the disk has absolutely nothing to do with failure rates.

      Basically stated, unless the data is used rarely or by a single host, you'd need a lot more spindles just to get the performance you'd need from that much data consolidated in one spot.

      --
      - It's not the Macs I hate. It's Digg users. -
    4. Re:7 2TB Disks in RAID 5???????? by Lershac · · Score: 1

      people with a life.

      --
      Chuck
    5. Re:7 2TB Disks in RAID 5???????? by mabhatter654 · · Score: 1

      Enterprise drives tend to stick to the 146GB limit and go for speed and smaller size, simply because using cheaper 1TB disks sounds nice but that's too much data to recover in one failure. What are the chances of another drive having a different problem or a corrupt file when you try to rebuild by the TB... you have to account for that Murphy guy. Like you said also, the extra spindles help for moving the data around during recovery as well.

    6. Re:7 2TB Disks in RAID 5???????? by cbreaker · · Score: 1

      That's quite untrue.

      The most common new SAS disks are 300GB. 400GB Fiber Channel disks are also gaining a lot of popularity, and I can't tell you how many people are using 500GB - 1TB SATA disks for more "bulkish" type storage.

      The only reason why the SAS/FC disks are smaller in capacity is because of the rotational speed of the spindle. It's a lot more difficult to make a reliable 10k or 15k disk at high capacities. The tolerances are unbelievable. As the processes get better, as the machines get better, they are able to produce higher capacity, high speed disks.

      IOPS are more important than capacity of an individual disk in many cases, and the faster the rotation the faster the disk can find and retrieve your data.

      If Seagate or Hitachi made a 1TB 15K disk you better believe people would be buying the hell out of them.

      --
      - It's not the Macs I hate. It's Digg users. -
  13. I fail to see the great insight by wintermute000 · · Score: 1

    If you put 7 disks in a single RAID5 without backup then it's called bad design and bad implementation.

    This has always been true regardless of disk size/speed.

    As above posters have pointed out, once you get past 4 disks the non-ZFS way to go is multiple blocks of RAID-(whatever number is appropriate for your scenario).

    Though ZFS is awesome, and if your OS/hardware supports it 100% there is little reason to stick with RAID.

    1. Re:I fail to see the great insight by cbreaker · · Score: 1

      Yea, except for the fact that ZFS isn't fast when you're using all the nifty features.

      Normal server-class machines (and many workstations) have hardware RAID controllers, which do all of the parity calculations themselves. ZFS in "RAID" mode is done all in software, so it's got the same disadvantages as traditional RAID in software.

      Software RAID like what is used in ZFS has a lot of advantages. It's like what LVM in Linux does - you can RAID individual partitions. But they all have the large performance hit.

      Perhaps there could exist a "ZFS Accelerator" card that would do the parity checking for you, but as far as I'm aware that doesn't exist. On a big beefy Sun box it might not be a problem but on your desktop PC it will be.

      Then there's the issue of bus utilization. While it's true that PCIe is a fast bus, you're dealing with quite a bit less data over the bus by using a hardware RAID controller. The only thing that hits the system bus is the actual I/O. With a software RAID, all of the I/O, plus all of the parity reads/writes, must also traverse the bus.

      ZFS shows promise but I'll stick with hardware accelerated RAID until such time as software RAID poses no penalties.

      --
      - It's not the Macs I hate. It's Digg users. -
    2. Re:I fail to see the great insight by boner · · Score: 1

      Actually, when you take into account that most processors will be multi-core in the future, it makes more sense. RAID controllers are not scaling as fast as processors, so the performance gap will close pretty soon, without noticeable impact on your desktop. Giving ZFS more memory will speed it up even more (and memory is pretty cheap too at around $25/ GB).

    3. Re:I fail to see the great insight by cbreaker · · Score: 1

      Sorry, but you're not correct. A CPU is a general purpose processor that can do all sorts of things. A RAID controller will have a high speed XOR/Parity engine where that's all it does.

      Just like how the fastest CPU can't even touch the performance of a $60 graphics card (when it comes to rendering 3D graphics,) a CPU is also not as fast as a RAID card with performing parity calculations.

      Well, maybe a CPU can do it faster, but you'll use a nice chunk of your CPU capacity on disk I/O. Not to mention you have to shove more data back and forth through the system bus because you have to access each disk individually and send/receive the parity chunks.

      New RAID controller chips hit the market all the time, each much faster than the previous. That's why we're seeing most RAID controllers support RAID 6 now - the better performing chips in these things are now fast enough to do the additional parity calculations.

      Maybe some day it won't matter but right now, it just does.
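
      For anyone wondering what the "XOR/Parity engine" actually computes, here is a minimal sketch of single-parity (RAID 5 style) protection: the parity block is the XOR of the data blocks in a stripe, and any one missing block is the XOR of everything that survives. The block contents and names are made up; real controllers work on fixed-size stripes and rotate parity across disks, which this leaves out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks from one stripe (all the same length).
data = [b"disk0 AA", b"disk1 BB", b"disk2 CC"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] has died: rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

      The work per byte is tiny; the argument in this thread is about where that work runs and how much data has to cross the bus to get there.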

      --
      - It's not the Macs I hate. It's Digg users. -
    4. Re:I fail to see the great insight by boner · · Score: 1

      You are correct, RAID controller chips are more efficient in doing the parity calculations. But that is also their limitation - the point I was trying to get across is that using a CPU for ZFS is a trade-off with respect to functionality (higher levels of protection against errors, with more data movement) versus the RAID 5/6 capability of a controller (less functionality at greater data-efficiency).

      Your mileage may vary...

    5. Re:I fail to see the great insight by asaul · · Score: 1

      Why do you believe there is a penalty for ZFS doing checksums on CPU and not on some open-to-corruption-and-complexity accelerator card?

      I can quite easily push my home RAID-Z setup (4x500G SATA-II) to 120MB/s - limited by PCI bandwidth - the CPU usage for that is fairly minimal on an Athlon64 3400. I have a friend using PCI-e who got benchmarks of nearly 200Mb/s write speeds and 480Mb/s read, on not reasonable home PC motherboards. It flogs $1200 Hardware RAID cards with cheap $30 SATA cards.

      Granted it uses more CPU than UFS would, but in most cases multi-core boxes make that negligible and so far for me it has never been an issue.

      --
      "If everybody is thinking alike, somebody isn't thinking" - Gen. George S. Patton
    6. Re:I fail to see the great insight by cbreaker · · Score: 1

      Ohh that's utter bullshit, and even more so with your original FUD "open-to-corruption-and-complexity" bullshit line.

      With a $400 SATA RAID card I can achieve 250MB/s with six 750GB SATA disks (RAID5) - copying to a 5-disk 500GB SATA array (RAID5) on the same controller. My CPU time is nearly zero. The SAME CONFIGURATION with software RAID and JBOD (the original setup of this machine) were 1/2 the performance and 40% CPU time on the quad-core Opteron.

      Software RAID uses a lot of resources that are better offloaded to something else if you can. So with your little home server you can get decent speeds, awesome.

      I don't think I'll be recommending that all our clients switch to software RAID because of some anecdotal evidence.

      Hey whatever. Go on with your bad self.

      --
      - It's not the Macs I hate. It's Digg users. -
  14. !news by Anonymous Coward · · Score: 0

    July 18th, 2007

  15. Testable assertion by merreborn · · Score: 3, Interesting

    But even today a 7 drive RAID 5 with 1 TB disks has a 50% chance of a rebuild failure. RAID 5 is reaching the end of its useful life.

    This is trivially testable. Any slashdotters have experience rebuilding 7TB RAID 5 arrays?

    You'd think, if this were really an issue, we'd be hearing stories from the front lines of this happening with increasing frequency. Instead we have a blog post based entirely on theory, without a single real-world example for corroboration.

    What's more, who even uses RAID 5 anymore? I thought it was all RAID 10 and whatnot these days.

    1. Re:Testable assertion by Anonymous Coward · · Score: 0

      My RAID goes up to 11.

    2. Re:Testable assertion by mbone · · Score: 1

      Yeah, this sounds like FUD to me, although I have no data one way or the other. I know lots of people with lots of video, and I haven't heard any screams about this, so I am inclined to doubt.

    3. Re:Testable assertion by theendlessnow · · Score: 4, Informative

      I have large RAID 5's and RAID 6's... I generally don't have any RAID columns over 8TB. I HAVE had drive failures. Yes... I'm talking cheapo SATA drives. No... I have not seen the problem this article presents. Do I backup critical data? Yes. The only time I lost a column was due to a firmware bug which caused a rebuild to fail. Took awhile to restore from backup, but that was about the extent of the damage. I would call this article FUD... deceptive FUD, but very much FUD.

    4. Re:Testable assertion by MrPerfekt · · Score: 1

      What's more, who even uses RAID 5 anymore? I thought it was all RAID 10 and whatnot these days.

      Just because the number is higher doesn't mean it's 'better'. RAID levels are to be chosen based on what (performance, size, redundancy) is specifically required. One size most definitely does not fit all. RAID 10 is really just RAID 1+0 anyway, which is to say a stripe of mirrors.

      For many 'cold storage' or online-archiving cases, RAID 5 is desired because it offers the greatest redundancy and storage combination with not bad performance. I have many RAID 5 arrays at home and work. They typically consist of drives under 500GB each though and consist of 6 drives or under. The (old) article brings up valid points though and it is something that should be considered.

      ZFS does offer some more protections like block-level checksums but at the heart of it still consists of striping. Sun does recommend good practices on using ZFS on an X4500 (Thumper) like keeping raid groups (which are then concatenated into a pool) smaller than 9 drives and using double-parity like in RAID 6.

      The big hurdle here is that until ZFS makes it into the stock Linux kernel tree, adoption will be limited. (Full) Mac OS X support will definitely help on that front.

      --
      I just wasted your mod points! HA!
    5. Re:Testable assertion by mveloso · · Score: 1

      Actually, this happened at my old company occasionally. There were ways around rebuild errors, all of which were manual. In a few cases I remember the content wasn't recoverable, and they had to transfer all the content from another site.

      All you need is a bad batch of drives, and your whole infrastructure is hosed.

    6. Re:Testable assertion by pin0chet · · Score: 1

      For not much more money RAID 10 means significantly greater performance and a good deal more redundancy to boot. With 1.5TB for $189 these days, it doesn't make a whole lot of sense to do RAID 5 unless you're penny pinching and performance is of no concern.

    7. Re:Testable assertion by Anonymous Coward · · Score: 0

      I've not done it myself, but I have designed 26TB RAID5 arrays for pre-tape backup storage on an EMC - not a DMXx. Obviously, EMC did the work, not me. The purpose is for backup/recovery and disaster recovery use. Tape is too slow to write and read - the backup windows weren't being met.

      Active storage used by the servers is all RAID10 with 128GB of cache. Yes, that's xxxGB of cache.

      OTOH, I do have a cheapo 4 disk external disk array at home, but only with 4x320GB disks. I use Linux SW RAID5. I've had failures in the last 2 years since it was built. Putting in a new disk and adding it to the array via mdadm cmds sorta just worked.

      I'd love to use ZFS, but not until support is included in the default Linux kernel. I was burned by being an early adopter of JFS before it was added to the kernel. Recovery was not possible after a failure. I've learned.

      Someone else says that DVD isn't trustworthy for long term storage. I've been using DVD with PAR2 recovery files for 10 years to backup my DVD collection. Even when there is a failure, I can reconstruct the missing parts with `par2` and burn a replacement backup. I've never had a complete disk fault. Out of 500+ DVDs, only 3 have failed so far. I did just view a movie on disk 001 this weekend. No issues.

    8. Re:Testable assertion by Bandman · · Score: 2, Interesting

      It really only deals with SATA drives (SAS probably has lower failure rates) and it only becomes a statistical issue with mammoth amounts of data (the rate quoted in the article is 1 unrecoverable read error per 10^14 bits read, which is roughly 12TB)

    9. Re:Testable assertion by merreborn · · Score: 1

      Just because the number is higher doesn't mean it's 'better'. RAID levels are to be chosen based on what (performance, size, redundancy) is specifically required. One size most definitely does not fit all. RAID 10 is really just RAID 1+0 anyway, which is to say a stripe of mirrors.

      Of course. But the article specifically deals with fault tolerance, and RAID 10 will be more fault tolerant than RAID 5 -- especially with respect to the failure mode discussed in TFA.

      That was what I was trying to get at: for installations where this issue actually matters, migration to a more fault-tolerant RAID configuration is an easy, albeit somewhat costly solution. This isn't an "Oh no, RAID 5 is useless, we have no other options!" moment. Options have been available for years.

    10. Re:Testable assertion by Torg · · Score: 1

      Yes, quite a lot actually, as I do storage for a living.

      First off, RAID 6 protects against failures of hard drives that were all made and installed at the same time. That is the situation if you bought the array, as an array, from a vendor. But if you built it yourself from assorted disks, the chance of parallel failure is negligible. And as far as RAID 6 goes, I spend a considerable amount of time converting 6+2 RAID 6 arrays into 7+1 RAID 5 arrays. In these cases 12TB would be very small.

      I also have my own disk arrays at home. Less for enterprise storage and more for those HD ATSC recordings my myth box makes. In it is a 4TB 5x1TB SATA array with a hardware raid card (really it is driver assisted). Aside from the slow write performance (it is expected with raid5) they run fine.

      Reading the posting, I wonder to myself if he has ever watched a RAID array rebuild, or for that matter even watched a SAN-attached array work. Yes, it takes time to rebuild (depending on the size of the disk). But other than slower access times while the array is rebuilding, it is rather transparent to the hosts using the array (not to mention it is pretty hard to saturate an enterprise RAID array).

      Why someone would mirror a set of mirrors (1+1 or 11) is beyond my understanding. Many of my customers use raid 1+0 (for speed). I have even seen customers use raid 5+1, but we tend to call that "paranoid raid".

    11. Re:Testable assertion by coolsnowmen · · Score: 1

      What's more, who even uses RAID 5 anymore? I thought it was all RAID 10 and whatnot these days.

      I do.

      Everything is a trade off. Using the most connections I can in my tower I have 4-5 disks. Using raid 5 I lose 20-33% of the potential space while using raid 1+0 or 0+1 (aka raid 10) I lose 50%.
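
      The percentages above fall straight out of the layouts, as this small sketch shows. It assumes one RAID 5 set with a single drive's worth of parity and a RAID 10 built from two-way mirrors; the helper names are hypothetical.

```python
from fractions import Fraction


def raid5_overhead(drives):
    """Fraction of raw capacity spent on parity in a single RAID 5 set."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return Fraction(1, drives)


def raid10_overhead(drives):
    """Fraction of raw capacity spent on mirror copies in RAID 10."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return Fraction(1, 2)


for n in (3, 4, 5):
    print(f"{n} drives: RAID 5 loses {float(raid5_overhead(n)):.0%} of raw space")
print(f"4 drives: RAID 10 loses {float(raid10_overhead(4)):.0%} of raw space")
```

      Note that RAID 10 cannot use an odd drive at all, so a 5-bay case effectively leaves one bay idle.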

    12. Re:Testable assertion by jerkychew · · Score: 1

      It's a bit ahead of its time but it's hardly far-fetched.

      I work in what one would call an enterprise-level data center. We have about a quarter petabyte of usable storage for about 10,000 users, not counting backups. Even though we have several TB in EMC SAN devices, due to the nature of our business we have quite a bit of one-off server builds with local storage. RAID 5 with no hot-swap spare is the economical way to go when you have a theoretical 7 day per week backup schedule. So it's hardly RAID 10 and whatnot, and I don't see us using RAID 10 on any one-off machines for quite some time.

      As for calling the 7TB array scenario "trivially testable", it's not. In the Enterprise world, server disks are just now breaking the 750GB barrier and we probably won't see 1TB disks widely used for at least two years. That's not to say that 1TB (and soon to be greater) server disks don't exist, they do, it's just that they're almost prohibitively expensive from a dollar per gig standpoint.

      However, that day is coming. 99 percent of our servers are 2U 4-disk servers in RAID 5. Once we standardize on 1TB disks (We're currently only up to 146GB in the majority of cases) we're going to see scenarios similar to what this article describes. It's a bit of a way off for us, but it's definitely food for thought.

      I wouldn't call this article trivially testable, I'd call it prescient.

    13. Re:Testable assertion by NerveGas · · Score: 1

      I haven't done a 7TB array, but I've done a fair number of 3 and 4TB arrays... and if they're implemented on quality hardware, it's not bad at all. 7TB wouldn't be that bad.

      --
      Oh, you're not stuck, you're just unable to let go of the onion rings.
    14. Re:Testable assertion by cbreaker · · Score: 1

      Lots of people use RAID 5. Well, let me correct that. MOST people use RAID 5. It's by far the most commonly used RAID system because of a good balance of protection, performance, and cost.

      Only the most demanding systems will go 10, or the most paranoid admins. But for most things and most admins, RAID 5 works well. RAID 6 is becoming more popular, too - but it's basically the same thing. Instead of RAID 5 with a hot-spare, folks use RAID 6 without one.

      You should always follow common sense and don't build 14 disk arrays and make sure your backups work.

      RAID 5 is still relevant and will remain so for the foreseeable future.

      --
      - It's not the Macs I hate. It's Digg users. -
    15. Re:Testable assertion by Anonymous Coward · · Score: 0

      No, raid 5 performance is not tolerable for people that care about performance. That's like saying a yugo has tolerable performance.

    16. Re:Testable assertion by TooMuchToDo · · Score: 1

      It'll be a while till ZFS makes it into the Linux kernel, considering the licensing is currently incompatible with the GPL. *sigh* I wish ZFS would make it to Linux faster.

    17. Re:Testable assertion by TooMuchToDo · · Score: 1

      Having that much space in so many servers (1TB drives, assuming hundreds to thousands of servers) just screams for licensing the software Amazon uses for S3. You would then be able to efficiently use all of that available disk, while having the low level management functions handled by whatever S3 uses to manage IOAPI.

    18. Re:Testable assertion by Ost99 · · Score: 1

      Desktop drives have 1 in 1E14 URE
      Cheap RAID SATA drives have 1 in 1E15 and enterprise disks have 1 in 1E16.

      Unless you use 1-2TB desktop drives in your RAID setup, you'll be fine.
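
      A back-of-the-envelope sketch of the rebuild risk behind these ratings. It assumes the article's 7-drive RAID 5 of 2TB disks (6 survivors read in full), decimal terabytes, and independent bit errors occurring at exactly the rated URE figure. Datasheet ratings are worst-case bounds, so treat this as a pessimistic model, not a measurement.

```python
import math


def rebuild_failure_probability(surviving_drives, drive_tb, bits_per_ure):
    """P(at least one URE) while reading every surviving drive end to end."""
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - 1/bits_per_ure) ** bits_read, computed without losing precision.
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_ure))


ratings = {
    "desktop (1 in 1E14)": 1e14,
    "RAID-class SATA (1 in 1E15)": 1e15,
    "enterprise (1 in 1E16)": 1e16,
}

for label, bits_per_ure in ratings.items():
    p = rebuild_failure_probability(6, 2.0, bits_per_ure)
    print(f"{label}: {p:.1%} chance of a URE during rebuild")
```

      Under these assumptions the 1E14 class comes out at roughly a 60% chance of hitting a URE during the rebuild, while the 1E15 and 1E16 classes drop to under 10% and under 1% respectively.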

      --
      ---- Sig. gone.
    19. Re:Testable assertion by chenjeru · · Score: 1

      At our studio, we run several 24x500GB-drive RAID6 DAS arrays for top tier storage. The backup system is comprised of redundant NexentaStor heads with a ZFS RAID-Z virtualizing across 16 storage nodes through iSCSI, each node a 4x1TB 1U box in RAID5 (which makes a RAIDZ-55?). The RAIDZ55 holds unstructured data plus a VTL partition, which gets a scheduled dump to LTO4 and taken offsite.

      --
      Even if you're on the right track, you'll get run over if you just sit there. - Will Rogers
    20. Re:Testable assertion by chenjeru · · Score: 1

      ...forgot to mention the point: in this system we've had only 3 disk failures in the past 4 years and these were all in the RAID6 DAS arrays. Everything was rebuilt without a blip. On one of our really old 6-disk servers we had a RAID controller fail and we lost some data, but fortunately the important stuff was properly backed up. Not to harp about it, but the people saying that RAID is not a backup are right on.

      --
      Even if you're on the right track, you'll get run over if you just sit there. - Will Rogers
    21. Re:Testable assertion by Bandman · · Score: 1

      [CITATION NEEDED]

      No, I'm really hoping you're right, but where did you find those numbers?

    22. Re:Testable assertion by domatic · · Score: 1

      Patents have been slapped on it that you have a license to if you use it under the CDDL. This means that it can't even be re-implemented on Linux. Sun is deliberately coy when asked about those patents. Sun doesn't want ZFS (natively) on Linux under any circumstances.

    23. Re:Testable assertion by Ost99 · · Score: 1

      Seagate ES.2 (Cheap SATA RAID drive) 1 in 1E15: Link
      Seagate Cheetah 15k.6 1 in 1E16: Link

      --
      ---- Sig. gone.
    24. Re:Testable assertion by Bandman · · Score: 1

      Nice, thanks a lot!

    25. Re:Testable assertion by Abcd1234 · · Score: 1

      Using raid 5 I lose 20-33% of the potential space while using raid 1+0 or 0+1 (aka raid 10) I lose 50%.

      Yeah, but in a world of $189 1TB drives... *who cares*? RAID5 is slower, harder to grow, has silent failure conditions, and its one advantage is that it's 20% more space efficient, in a world where the $/GB ratio is so ridiculously low it's not even worth thinking about.

    26. Re:Testable assertion by coolsnowmen · · Score: 1

      You aren't getting it.
      I only have space in my case for 5 drives. I can use RAID 5 on 1TB drives and get 4TB of usable space. RAID 10 has to be used on an even number of drives (actually partitions), so I could only have 2TB on my computer. So, if I want >2TB of space on my computer, I cannot use RAID 10, and so I use RAID 5.

      And arguing that "it is only $189": who are you to tell me where I should spend my money? If I can reduce my $/GB cost with software, what kind of businessman are you to tell me that is stupid?

      I can't see how a raid5 byte failure would be any different than raid 10. (tell me?)

      Growing? Well, not really a factor, because I couldn't really add 2 drives to my case easily.

      Slower? Yeah, but that is why things are a trade-off: slow and cheap (lower $/GB) vs. faster and more expensive.

    27. Re:Testable assertion by Abcd1234 · · Score: 1

      You aren't getting it.

      No, I'm getting it. The problem is you've apparently never considered an external storage chassis. You can get a 6-drive eSATA enclosure for under $300 these days, and combined with the drives in your computer, that's more than sufficient. And as an added bonus, the drives will last longer as they'll probably be better ventilated, and they'll be easily hot-swappable to boot.

      'course, you could also just get a chassis that's designed for large, backend storage solutions, rather than trying to retask an old ATX graybox. Honestly, what kind of businessman are you that you aren't willing to shell out a few bucks for real storage integrity?

      I can't see how a raid5 byte failure would be any different than raid 10. (tell me?)

      Simple. If any two drives in a RAID5 go, you're hosed. With a RAID 10, two drives *in the same mirror* must go, which is drastically less likely. Furthermore, if any one drive in your array goes, you don't experience degraded performance. And when you hot-swap a replacement drive, you won't undergo massive performance degradation as the RAID rebuilds. And finally, unlike RAID5, RAID10 isn't subject to silent error.

      Honestly, RAID10 is superior to RAID5 in *every way*, save for cost, and given the $/GB levels these days, I really see no justification for RAID5.
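
      To put a number on "drastically less likely", here is a small enumeration sketch. It assumes exactly two drives fail, every pair is equally likely, and RAID 10 is built from adjacent two-way mirrors (drives 0+1, 2+3, ...). Drives from the same batch may well fail together, so the equal-likelihood assumption is generous.

```python
from fractions import Fraction
from itertools import combinations


def fatal_fraction_raid5(drives):
    """With single parity, any two failed drives lose the array."""
    return Fraction(1)


def fatal_fraction_raid10(drives):
    """Share of two-drive failures that take out both halves of one mirror."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    mirrors = {(i, i + 1) for i in range(0, drives, 2)}
    pairs = list(combinations(range(drives), 2))
    fatal = sum(1 for pair in pairs if pair in mirrors)
    return Fraction(fatal, len(pairs))


for n in (4, 6, 8):
    print(f"{n} drives: RAID 5 {fatal_fraction_raid5(n)}, RAID 10 {fatal_fraction_raid10(n)}")
```

      The RAID 10 figure works out to 1/(n-1) for n drives, so the advantage grows with the size of the array.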

    28. Re:Testable assertion by coolsnowmen · · Score: 1

      So your solution to my problem is buy more drives, and more enclosures...

      It is not news to me that more money can buy me better stuff.

    29. Re:Testable assertion by Abcd1234 · · Score: 1

      So your solution to my problem is buy more drives, and more enclosures...

      Well... yeah, naturally. :) Chassis and drives are relatively cheap. For the rather modest outlay, you get far better fault tolerance, far greater performance, more flexibility, and you eliminate the spectre of silent errors.

      Of course, as always, it's a cost-benefit tradeoff. I guess my point is, because of these truly massive drives, that tradeoff more frequently leads to non-RAID-5 solutions these days. The problem is people have it in their heads that RAID5 is simply *the* storage model to use, without considering all the advantages and disadvantages associated with it... I mean, you, yourself, apparently didn't even understand the different failure modes and rates between RAID-5 and RAID-10, and it's that kind of information that is absolutely key when selecting an appropriate RAID level for a given application.

  16. RAID doesn't protect against your worst enemy by YesIAmAScript · · Score: 0, Redundant

    rm -r *

    Seriously, you're kidding yourself if you think RAID is protecting you.

    --
    http://lkml.org/lkml/2005/8/20/95
    1. Re:RAID doesn't protect against your worst enemy by cbreaker · · Score: 0, Redundant

      Easily solved by file system snapshots.. which you should be doing for an important file server no matter what operating system you're using.

      --
      - It's not the Macs I hate. It's Digg users. -
    2. Re:RAID doesn't protect against your worst enemy by SatanicPuppy · · Score: 5, Insightful

      Wow, how incite-ful. Doesn't matter what the discussion is, some geek is bound to weigh in with all the shortcomings of any idea.

      Newsflash: there is no perfect backup! No method is foolproof, especially when it's bound to be boring as hell, and you've got an inevitable human factor. You get lazy moving the tapes offsite, you put off fixing a dead drive because there are 4 others, you wipe your main partition upgrading your distro and forget that your CRON rsync script uses the handy --delete flag, and BOOM wipes out your backup.

      Shit happens. Pointing out what we all already know doesn't do anything helpful.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    3. Re:RAID doesn't protect against your worst enemy by ShakaUVM · · Score: 1

      >>Seriously, you're kidding yourself if you think RAID is protecting you.

      It's a question of what type of faults you care about. Right now, I have a 500GB RAID0 setup to boot and play games off of, an internal 250GB hard drive that mirrors my important folders every night, an external drive that both mirrors my important documents and is used when I shuttle between my two home offices, and an FTP server in a different city that I back up most of my small and important files to (I ain't sending my Snoop Dogg discography over my DSL line). Each has a different failure mode and risk factor. The RAID0 is most vulnerable to hard drive death (though it's been running since '04 without any problems, and my HD health monitors show it in good shape), the second internal drive (which would have been a RAID5 drive, except RAID5 sucks ass on my controller) is vulnerable to theft (steal the RAID0, steal the extra drive too), the USB drive is also vulnerable to theft, though perhaps a different kind of thief (it can be stolen from my car, for example), and the FTP server is slow as shit, but relatively secure.

      I'm relatively lazy about doing backups, but having automated stuff to handle all of it makes it not much of a hassle.

    4. Re:RAID doesn't protect against your worst enemy by lucas+teh+geek · · Score: 5, Insightful

      RAID doesn't protect against your worst enemy
      rm -r *

      nor is it supposed to. not being a moron seems to have protected me from "my worst enemy" just fine. RAID has protected me from random disk failures. seems to be working as designed

      --
      TIAEAE!
    5. Re:RAID doesn't protect against your worst enemy by EvilRyry · · Score: 0, Redundant

      Unfortunately very few file systems actually implement them in a form that's usable under heavy loads.

    6. Re:RAID doesn't protect against your worst enemy by postbigbang · · Score: 1

      RAID 0 is only redundancy, and doesn't do anything to protect data in any way beyond what the file system might do. RAID 1, or mirrored drives, is costly (2x drives). RAID5 permits one drive (min 3, but more is desirable and effective) to fail. Above RAID 5, the ACM paper that initially described all this, sayeth not, but it's generally a hot-spare.
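      Why RAID 5 survives one dead drive can be shown with plain XOR parity. This is a toy sketch, not any controller's implementation: real arrays rotate parity across the drives and work on much larger stripes, and the block contents here are invented.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"disk", b"one!", b"two?"]   # one block from each of three data drives
parity = xor_blocks(data)            # the block stored as parity

# The second drive dies: rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])            # True
```

      Lose two blocks from the same stripe and there is no longer enough information to solve for either, which is the single-failure limit mentioned above.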

      We have several TB; we back it up on an occasional basis to a slower drive array and it takes time. Subsequent backups are only delta, and it's surprisingly a small amount of data that needs backing up. We keep ISOs ready to go, too, in case we need to burn local laptops or servers-- online-- ready for PXE boot.

      If someone steals the SAN, we have a backup and an insurance agent. Fortunately, it's not AIG.

      --
      ---- Teach Peace. It's Cheaper Than War.
    7. Re:RAID doesn't protect against your worst enemy by 4D6963 · · Score: 1

      Which is why I only run Windows, so this evil command cannot be run. Problem solved!

      By that same logic can we also say that you're kidding yourself to think you're safe if you're using anything short of a nested RAID spread across 3 different locations?

      --
      You just got troll'd!
    8. Re:RAID doesn't protect against your worst enemy by Kleen13 · · Score: 5, Funny

      (though it's been running since '04 without any problems, and my HD health monitors show it in good shape)

      Oh man.... you didn't just say that out loud did you???

      --
      That sinking feeling deep in your gut when you KNOW you screwed up bad summed up with: {head desk} {head desk}
    9. Re:RAID doesn't protect against your worst enemy by Junior+J.+Junior+III · · Score: 5, Funny

      My data backup scheme is to steganographically embed my entire filesystem into nude pictures of Sarah Palin, and then upload them to usenet.

      --
      You see? You see? Your stupid minds! Stupid! Stupid!
    10. Re:RAID doesn't protect against your worst enemy by Kleen13 · · Score: 1

      LOL, thanks for saying that.

      --
      That sinking feeling deep in your gut when you KNOW you screwed up bad summed up with: {head desk} {head desk}
    11. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      The Egyptians found a way to preserve their message over thousands of years, surely we can come up with something. :)

    12. Re:RAID doesn't protect against your worst enemy by SatanicPuppy · · Score: 2, Insightful

      The vast majority of Egypt's writings were stored on perishable papyrus, not carved or painted on stone. Of all that they ever wrote or stored, we have but the tiniest fraction remaining.

      If we lost technology today, there would be nothing left but paper in 20 years. In a thousand, there wouldn't even be much paper.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    13. Re:RAID doesn't protect against your worst enemy by reboot246 · · Score: 4, Funny

      That's why I chisel all my data (ones and zeros) onto stone tablets. In a few years the pile of stones will be taller than Everest. :)

    14. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 4, Informative

      Redundancy... You keep using that word. I do not think it means what you think it means.

      RAID 0, pseudo-ironically, is not redundant at all. RAID 1, often called mirroring, are the arrays that are redundant.

    15. Re:RAID doesn't protect against your worst enemy by johanatan · · Score: 0

      I'm glad the 'not being a moron' thing worked out for you. But, what would you suggest to those in the audience that cannot claim the same. :-)

    16. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0, Troll

      Insightful? lol

    17. Re:RAID doesn't protect against your worst enemy by Renderer+of+Evil · · Score: 5, Funny

      That's why I chisel all my data (ones and zeros) onto stone tablets. In a few years the pile of stones will be taller than Everest. :)

      And in a thousand years some bearded guy will discover a couple of those stones, come down the mountain and will base a religion around it. These things are cyclical.

    18. Re:RAID doesn't protect against your worst enemy by postbigbang · · Score: 2, Insightful

      If you source the original term 'RAID', it goes to an ACM article describing Redundant Arrays of Inexpensive Disks. In RAID 0, which is actually a marketing term, there's striping, but no redundancy that can infer the contents of a missing member of the array. From the perspective of availability, it has none. As you cite, RAID 1 is a mirrored pair, usually the same type of drive, and it also is likely the fastest RAID-- and most expensive in terms of available net data after redundancy for availability. There is also no RAID 6...10, as these are marketing terms, too.

      --
      ---- Teach Peace. It's Cheaper Than War.
    19. Re:RAID doesn't protect against your worst enemy by ushering05401 · · Score: 2, Insightful

      "Shit happens. Pointing out what we all already know doesn't do anything helpful."

      Actually, it gives posters like you a chance to remind everyone else that shit happens.

      I believe there would be many fewer frustrated/bitter IT workers if more people meditated on the fact that shit just happens. In today's marketplace it is usually IT left holding the bag when things go south anyhow... gotta get acclimated to that and roll on.

      Anyhow, I doubt there are many IT veterans not familiar with really expensive, really borked backup systems. Smarter people than me have observed that as technology progresses, existing strategies either age or mature. The ones that age become brittle, and the ones that mature become more robust...

      Corporate suits usually ensure that both aged and mature technologies will be flogged on long past their rational retirement dates.

    20. Re:RAID doesn't protect against your worst enemy by Samantha+Wright · · Score: 3, Funny

      And look what happened? Netcraft is already half way to confirming the demise of alt.binaries!

      --
      Bio questions? Ask me to start a Q&A journal. Computer analogies available for most topics!
    21. Re:RAID doesn't protect against your worst enemy by MyLongNickName · · Score: 1

      However, when adding that much data into one photo, there are a few, slightly noticeable changes.

      --
      See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
    22. Re:RAID doesn't protect against your worst enemy by SatanicPuppy · · Score: 3, Funny

      Let's hope he discovers some porn this time...

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    23. Re:RAID doesn't protect against your worst enemy by tkw954 · · Score: 5, Funny

      rm -r *

      That doesn't work for me. Try

      sudo rm -rf /*

    24. Re:RAID doesn't protect against your worst enemy by davolfman · · Score: 1

      I vote for silver halide contact prints of data encoded on high-res inkjet transparencies stored in a fireproof box.

    25. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      but alias rm='rm -i' does..

      (wright tool for the wright job)

    26. Re:RAID doesn't protect against your worst enemy by Waffle+Iron · · Score: 2, Funny

      The Egyptians found a way to preserve their message over thousands of years, surely we can come up with something. :)

      And they would have saved future generations from vast amounts of confusion and effort, if they'd only been a little more diligent backing up their pyramid construction HOWTO files.

    27. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 4, Funny

      You leave RMS out of this!

    28. Re:RAID doesn't protect against your worst enemy by h4rdc0d3 · · Score: 1

      alias rm='rm -i'

      :)

    29. Re:RAID doesn't protect against your worst enemy by m.ducharme · · Score: 1

      what is an Everest pile in, oh say Libraries of Congress?

      --
      Rule of Slashdot #0: You and people like you are not representative of the larger population. - A.C.
    30. Re:RAID doesn't protect against your worst enemy by YesIAmAScript · · Score: 1

      Newsflash:
      RAID is NOT backups.

      It's not a perfect backup. It's not an imperfect backup. It's not a backup.

      Apparently it isn't completely useless to point this out, because not everyone seems to know it.

      Why does a guy who says there's no point in pointing out that nothing is perfect spend a paragraph explaining why nothing is perfect?

      It's pretty shocking to me that your post is considered insightful.

      --
      http://lkml.org/lkml/2005/8/20/95
    31. Re:RAID doesn't protect against your worst enemy by mobets · · Score: 2

      oops, missed funny and hit overrated.
      Sorry about that. Too bad this will remove some good mods up above.

      --

      It was me, I did it, I moved your cheese
    32. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      Are you saying that in a thousand years there will be a religion based on 2girls1cup.flv in hexadecimal? Sign me up!

    33. Re:RAID doesn't protect against your worst enemy by cbreaker · · Score: 4, Informative

      Well, Windows does. Taking a snapshot of NTFS, even on a heavily used 1TB+ file server, takes only a few seconds, and under normal operation the file system is still fast.

      NTFS is actually a pretty good file system. It's probably because it was originally designed by IBM.

      --
      - It's not the Macs I hate. It's Digg users. -
    34. Re:RAID doesn't protect against your worst enemy by srw · · Score: 3, Informative

      "This time?"

      Ah, I see you've never read "Song of Songs"

    35. Re:RAID doesn't protect against your worst enemy by Slashdot+Parent · · Score: 3, Interesting

      No method is foolproof, especially when it's bound to be boring as hell, and you've got an inevitable human factor. You get lazy moving the tapes offsite, you put off fixing a dead drive because there are 4 others, you wipe your main partition upgrading your distro and forget that your CRON rsync script uses the handy --delete flag, and BOOM wipes out your backup.

      Jesus Christ, you must be one unlucky soul. Do you live your entire life in a worst-case scenario?

      The system that I use for data storage is as follows:

      1. 2TB NAS that uses a scrubbed (if you don't know what that means, look it up) Linux Software RAID
      2. Anything important goes into a directory hierarchy that is backed up automatically via rsnapshot (in other words, one botched snapshot isn't going to leave me up a creek without a backup).
      3. Each week, my rsnapshot directory is automatically encrypted (and thus compressed) with gpg and uploaded to Amazon S3. My rsnapshot directory currently occupies about 3GB of space after gpg's automatic compression.
      4. The 5th oldest backup in S3 is automatically deleted.
      5. When I think of it, I burn my rsnapshot directory to DVD and my wife takes it into her office and leaves it there.
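      The rotation in step 4 can be sketched like this. It is only an illustration: it assumes "the 5th oldest is deleted" means that only the four newest weekly archives are kept, and the file names are hypothetical. It has nothing to do with rsnapshot's or S3's own tooling.

```python
def archives_to_delete(names, keep=4):
    """Return the archive names that fall outside the newest `keep`.

    Names are assumed to sort chronologically, e.g. because they
    embed an ISO date.
    """
    ordered = sorted(names, reverse=True)   # newest first
    return ordered[keep:]


# Hypothetical weekly uploads, oldest to newest.
weekly = [
    "snap-2008-09-28.gpg",
    "snap-2008-10-05.gpg",
    "snap-2008-10-12.gpg",
    "snap-2008-10-19.gpg",
    "snap-2008-10-26.gpg",
]
print(archives_to_delete(weekly))   # ['snap-2008-09-28.gpg']
```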

      This system may not be foolproof (what is?), but it is pretty frickin' safe, and costs me roughly $3 or $4 per month. Not too shabby for what I would consider to be a fairly robust backup system for a home user.

      I suppose the biggest challenge is deciding what goes into rsnapshot. If my RAID array suffered a massive failure, I would definitely lose data. But this is mostly video content, and really, if I lose my mythtv shows, it is not exactly as catastrophic as if I lost, say, my quickbooks data.

      There are a lot of things that keep me awake at night, but loss of important data is not one of them.

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
    36. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      Hope you encrypted them first.

    37. Re:RAID doesn't protect against your worst enemy by miscGeek · · Score: 1

      Ummm, brb, quickly looks at his backup script

      --
      May the source be with you!
    38. Re:RAID doesn't protect against your worst enemy by ShakaUVM · · Score: 1

      >>RAID 0 is only redundancy, and doesn't do anything to protect data in any way beyond what the file system might do.

      ??

      RAID0 is striping. It's not only not redundant, but it lowers your MTBF on your aggregate drive.
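      A back-of-the-envelope version of the MTBF point, assuming independent drives with constant failure rates (a simplification). The 500,000-hour figure is purely illustrative and not a spec for any particular drive.

```python
def stripe_mtbf_hours(drive_mtbf_hours: float, drives: int) -> float:
    """MTBF of a stripe set that is lost as soon as any one member fails."""
    # Failure rates add, so the mean time between failures divides by the drive count.
    return drive_mtbf_hours / drives


print(stripe_mtbf_hours(500_000, 2))   # 250000.0
```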

      >>If someone steals the SAN, we have a backup and an insurance agent. Fortunately, it's not AIG.

      Yeah, theft is really my biggest worry. With two drives getting backups (one inside the machine, one external), a thief coming in and stealing the lot is my worst case scenario. I can't automatically backup to the FTP, so I have to manually remember to do it from time to time, and I'm pretty lazy about that kind of stuff.

    39. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 3, Informative

      rm -r *

      That doesn't work for me. Try

      sudo rm -rf /*

      hell, if you want to lose data, you've gotta at LEAST use dd. rm just removes the directory entries; your data is still on the disk, you just can't access it. run

      dd if=/dev/urandom of=/dev/sda

      (or whatever disk you want to lose) and then see how many data recovery places will turn you away. the level of data recovery available to the public is pretty crappy, there's a guy offering a reasonably big prize to any data recovery company (or anyone at all i guess) who can recover data from a disk he zeroed with dd and hasn't had any takers yet. i wish i could find the link

    40. Re:RAID doesn't protect against your worst enemy by Alarindris · · Score: 3, Funny

      #%^@%!#$!!!! The second one works!!

    41. Re:RAID doesn't protect against your worst enemy by johanatan · · Score: 1

      Mod parent up.

    42. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      usenet closed, sorry.

    43. Re:RAID doesn't protect against your worst enemy by g0rAngA · · Score: 1

      I have been saved by RAID5 before, when one of my drives died a horrible death. A few hours later, I was back up and running. I am very glad I did that.

      However, I once accidentally did
      chown -R user:users /
      instead of on *. The RAID didn't help with that at all, and I consider it quite lucky that I was able to recover from it at all.

    44. Re:RAID doesn't protect against your worst enemy by Anpheus · · Score: 1

      Uhm, my computer is pretty stable if I keep it at room temperature, low humidity, and unplugged.

      And we'd only need one of 'my computer' to survive to read almost every hard disk currently manufactured. Only thing mine doesn't have is a software emulated raid controller and SCSI backplane. So I may not be able to recover but is easy.

    45. Re:RAID doesn't protect against your worst enemy by drsmithy · · Score: 1

      NTFS is actually a pretty good file system. It's probably because it was originally designed by IBM.

      NTFS was designed and built in-house at Microsoft specifically for Windows NT.

    46. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      That'll work, I think. Would you test it for us please, and let us know ... Thanks

    47. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      >NTFS is actually a pretty good file system.
      >It's probably because it was originally designed by IBM.

      No, it was not. Microsoft designed OS/2's HPFS and they had more experience when creating NTFS.

    48. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      I hope he finds one with porn on it. Preferabl

    49. Re:RAID doesn't protect against your worst enemy by NormalVisual · · Score: 1

      Uhm, my computer is pretty stable if I keep it at room temperature, low humidity, and unplugged.

      Pompeii and Herculaneum probably had an environment that was pretty conducive to working electronics in their heyday too.

      And we'd only need one of 'my computer' to survive to read almost every hard disk currently manufactured.

      You also need to have an idea about how it works and how to make it run, otherwise it's possible to damage it beyond repair in the process of learning about it. It'd suck if they applied 500 VAC at 120 cycles and blew up the power supply, or all the capacitors in the machine had dried out and were non-functional, or it had suffered some other kind of irreparable physical damage. Long-term preservation of information isn't something trivial to achieve, and I think the parent poster is correct in pointing out that it's something we should really start thinking about.

      --
      Please stand clear of the doors, por favor mantenganse alejado de las puertas
    50. Re:RAID doesn't protect against your worst enemy by NormalVisual · · Score: 1

      And in a thousand years some bearded guy will discover a couple of those stones, come down the mountain and will base a religion around it.

      I don't think RMS will be around then, and the cult of the GPL won't either. :-)

      --
      Please stand clear of the doors, por favor mantenganse alejado de las puertas
    51. Re:RAID doesn't protect against your worst enemy by YesIAmAScript · · Score: 1

      It's a metaphor. Other programs can do the deleting for you (remember the itunes installer that deleted the contents of your HDD?). Or you can instead just do something super smart like "I'll write a script to rename all these .jpgs to .jpegs" and accidentally write a script that renames them all to the same thing, thus deleting most of them.
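      For what it's worth, the rename accident is easy to guard against. A minimal sketch, assuming a flat directory of .jpg files; the function name is made up, and the key points are that each target name is derived from its own source name and that nothing is ever overwritten.

```python
from pathlib import Path


def rename_jpg_to_jpeg(directory: Path) -> list[Path]:
    """Rename every *.jpg in `directory` to *.jpeg, one target per source."""
    renamed = []
    # sorted() materialises the listing before anything is renamed.
    for src in sorted(directory.glob("*.jpg")):
        dst = src.with_suffix(".jpeg")   # target derived from this source's own name
        if dst.exists():
            # Path.rename can silently replace an existing file, so check first.
            raise FileExistsError(f"{dst} already exists, not overwriting")
        src.rename(dst)
        renamed.append(dst)
    return renamed
```

      None of this makes RAID a backup, of course; it just removes one way of shooting yourself in the foot.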

      If you think your data is safe because you're using RAID, you're setting yourself up to fall.

      --
      http://lkml.org/lkml/2005/8/20/95
    52. Re:RAID doesn't protect against your worst enemy by Doug+Neal · · Score: 2, Insightful

      I'm glad the 'not being a moron' thing worked out for you. But, what would you suggest to those in the audience that cannot claim the same. :-)

      OS X?

    53. Re:RAID doesn't protect against your worst enemy by ezzzD55J · · Score: 1

      Speaking of redundancy.

      Why are you repeating the GP's post to itself?

    54. Re:RAID doesn't protect against your worst enemy by funkatron · · Score: 1

      You can't modify your own home directory?

      --
      "Welcome to our world. We are the wasted youth. And we are the future too." Yes, I know these are stupid lyrics.
    55. Re:RAID doesn't protect against your worst enemy by Sique · · Score: 1

      That's why it is called RAID Level Zero: Zero Redundancy.

      --
      .sig: Sique *sigh*
    56. Re:RAID doesn't protect against your worst enemy by alan.briolat · · Score: 1

      I actually did this once - just a simple case of tab-whoring and doing things too fast for my own good (it was rm -rf . in my case, in my home directory).

      It was a good experience in that ever since that point I've actually been running daily incremental and weekly full backups of everything important to somewhere a normal user has no read/write access. And by 'important', I mean documents etc., not the 500GB of media. It's all about risk vs. cost, and I just don't have anywhere to backup that much data to, when most of it can be reacquired. In fact, I just back up a list of my media instead to make that process easier.

      There's a lot of 'ideal solution' stuff being thrown around here, but the fact is that it's not practical for 99.9% of people. Most home user data is not worth the £10000 an ideal backup solution would cost.

      --
      I swear we should be allowed to give mod points to sigs... "-1, Offtopic"
    57. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      there's a guy offering a reasonably big prize to any data recovery company (or anyone at all i guess) who can recover data from a disk he zeroed with dd and hasn't had any takers yet. i wish i could find the link

      I think slashdotters agreed that that was the "lamestchallenge"?

    58. Re:RAID doesn't protect against your worst enemy by isorox · · Score: 1

      Redundancy... You keep using that word. I do not think it means what you think it means.

      RAID 0, pseudo-ironically, is not redundant at all. RAID 1, often called mirroring, are the arrays that are redundant.

      The R in raid 0 stands for "Risky"

    59. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      That doesn't seem to be doing anything.

    60. Re:RAID doesn't protect against your worst enemy by laffer1 · · Score: 2, Insightful

      Yes, that's what time machine is for. Sadly, my mac is the best backed up machine here. I have an external seagate drive hooked up with time machine and average around a month of backup points. I also burn things on DVD twice a year I can't live without like my iTunes collection. I really wish blu-ray would pick up on Macs for backup purposes. I could backup my iTunes with 3 50GB BD discs. 135GB of data to backup on 8GB DVDs?

      Tapes are cost prohibitive and optical hasn't kept up with hard drive capacity. I remember when I could backup my whole computer on 2 CDs. Now, even with BD I'd need 5 discs.

      Optical discs have their own problems, but I like to have backups on at least two different types of media. Since tapes are expensive and I've had terrible luck with them professionally, I'd like to stick to optical when possible.

    61. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      bah, half measures at best, try
      "sudo dd if=/dev/random of=/dev/sda"
      for some real Data Destruction

    62. Re:RAID doesn't protect against your worst enemy by Darth_brooks · · Score: 1

      That's nothing. He was just telling me about his best girl back in Iowa, and they're gonna get married just as soon as he gets back home. He's also only three days from retirement.

      I'd say more, but he's got to change into his red shirt for an away team mission consisting exclusively of him, McCoy, Spock, and Kirk.

      Did he say whether or not he was using an ACME brand RAID controller?

      --
      There are some people that if they don't know, you can't tell 'em.
    63. Re:RAID doesn't protect against your worst enemy by HAKdragon · · Score: 1

      So RMS = Moses 2.0?

      --
      "Our opponent is an alien starship packed with atomic bombs. We have a protractor."
    64. Re:RAID doesn't protect against your worst enemy by TheoMurpse · · Score: 1

      Let me get this straight: you embed your file system into copies of portions of your file system? Is that redundant, recursive, or meta?

    65. Re:RAID doesn't protect against your worst enemy by cbreaker · · Score: 1

      NTFS shares its lineage with HPFS, which was part of OS/2 - originally a joint IBM and Microsoft venture.

      --
      - It's not the Macs I hate. It's Digg users. -
    66. Re:RAID doesn't protect against your worst enemy by Zashi · · Score: 1

      And our god said unto us, let us ejaculate onto the face of our woman, with our bestfriend, and his neighbor, and his neighbor's nephew, and the mailman.......

      --
      Skiffy is Spiffy, but Ort is tort.
    67. Re:RAID doesn't protect against your worst enemy by decsnake · · Score: 1

      Actually, NTFS is based on the VMS file system. The architect of NTFS came over to MS from DEC along with Dave Cutler, who was the architect/project mgr of NT.

    68. Re:RAID doesn't protect against your worst enemy by SatanicPuppy · · Score: 1

      4 years is too long. You need to start rotating in some new drives...Even the very best drives don't offer warranty replacement past 5 years.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    69. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      does usenet still exist ?

    70. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      rm -rf * ( don't forget to use the force :) )

    71. Re:RAID doesn't protect against your worst enemy by Phroggy · · Score: 1

      I hate distros that do that by default. I'm not going to confirm every single file I want to delete, so when I delete a directory, instead of just using rm -r, I'm just going to use rm -rf instead. Now, not only have I completely negated any benefit of using rm -i, but I've also lost the confirmation for files I might want to be warned about (I don't encounter this often, but I think it warns you about deleting files you don't have write access to, but can still delete because you have write access to the parent directory and the sticky bit is off).

      --
      $x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
      $x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
    72. Re:RAID doesn't protect against your worst enemy by ShakaUVM · · Score: 1

      Hmm, how do you rotate in drives in a RAID0 configuration? I know how to do it in a RAID1 or RAID5 setup, but I don't see how to do it in RAID0 without doing something like cloning the drives.

    73. Re:RAID doesn't protect against your worst enemy by SatanicPuppy · · Score: 1

      You've got a bunch of 4 year old drives in Raid 0? Jesus. I'd be afraid to run a defragger or reboot. The average service life of a hard drive these days is considered to be 3-5 years, so it would be a good idea to make a backup of anything you care about.

      Yea, you need to duplicate them. Might be easier to just buy one modern drive that is bigger than all the older drives, and copy all the data to the new drive.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    74. Re:RAID doesn't protect against your worst enemy by Xenna · · Score: 1

      That's similar to what I do for my home data (I back up to a colo server though), but I think you might be missing one possibly important thing. You should guard against corrupted files in your backup, which could propagate through your snapshots by running rsync with the -c flag (compares files by a hash of their contents instead of just size and timestamp).
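      The difference between the two comparison modes can be sketched as below. This only mimics the idea: the function names are made up, SHA-256 is my choice for the example rather than whatever rsync uses internally, and timestamps are compared at whole-second resolution for simplicity.

```python
import hashlib
from pathlib import Path


def quick_same(a: Path, b: Path) -> bool:
    """Size-and-mtime comparison, in the spirit of rsync's default quick check."""
    sa, sb = a.stat(), b.stat()
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def content_same(a: Path, b: Path) -> bool:
    """Compare by hashing the file contents, in the spirit of rsync -c."""
    def digest(path: Path) -> str:
        h = hashlib.sha256()
        with path.open("rb") as f:
            # Read in 1 MiB chunks so large files do not have to fit in memory.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()
    return digest(a) == digest(b)
```

      A file whose bytes rot on disk keeps its size and timestamp, so `quick_same` still says "same" while `content_same` does not. The price is reading every byte on both sides.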

      I leave my MythTV shows in the considerably less secure hands of a Linux MD RAID 5 array. I'm surprised you have only 3GB to back up. My digital photos alone are 29GB and I'm not a big photographer. The size of each of my snapshots is 162 GB.

      X.

    75. Re:RAID doesn't protect against your worst enemy by fuffer · · Score: 1

      And there will still be some geek from Slashdot yelling from atop of the mountain that if you had any skills you'd have a backup set of stones located in a climate-controlled underground bunker, and that, by the way, backing up data is important!

    76. Re:RAID doesn't protect against your worst enemy by Slashdot+Parent · · Score: 1

      You should guard against corrupted files in your backup, which could propagate through your snapshots by running rsync with the -c flag (compares files by a hash of their contents instead of just size and timestamp).

      My backup server is an old P3 machine. I wonder if it could even hash all of my files before the next day's snapshot got kicked off? :)

      I'm surprised you have only 3GB to backup, My digital photo's alone are 29GB and I'm not a big photographer. The size of each of my snapshots is 162 GB.

      I know some people who store the RAW file for every photo that they take. I delete what I don't want. So far, I haven't regretted my decision, but maybe some day I will, and maybe then I'll spring for lightroom or something.

      It's 3GB compressed, by the way. I have no idea where I'd find 162GB of data that I want to back up. High def (ahem) home movies, perhaps?

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
    77. Re:RAID doesn't protect against your worst enemy by Xenna · · Score: 1

      Home movies as well, although I wouldn't call them hi-def. I also backup my music collection. I may have lost some of the original CD's ;)

      I don't do the -c thing either, but I keep thinking I should. It's probably going to take a real long time indeed. Perhaps it's worth keeping an MD5 database in both locations and checking them regularly. Wouldn't the tripwire tool be good for that?
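
The "MD5 database in both locations" idea can be sketched in a few lines. This is only an illustration, not how rsync or tripwire work internally; the function names `build_manifest` and `compare_manifests` are made up for the example.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file's path (relative to root) to the MD5 hex digest of its contents."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digest = hashlib.md5()
            with open(full, "rb") as fh:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def compare_manifests(source, backup):
    """Return (missing, mismatched): paths absent from backup, and paths whose hashes differ."""
    missing = sorted(p for p in source if p not in backup)
    mismatched = sorted(p for p in source if p in backup and source[p] != backup[p])
    return missing, mismatched
```

Each side would build its own manifest locally and only the manifests get compared, so the slow hashing never has to cross the network.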

    78. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      Except that by taking a snapshot you aren't creating a copy of the data, which is why it's so quick.

      Snapshot = metadata of changes, not multiple copies, so you can only roll back if you have the original volume as well as the tracking of changes (the metadata).

      Lose the original volume and then see what you can get back from the VSS snapshot: nothing.

      VSS is intended to work with backup to make sure that point-in-time copies are consistent (which takes a data copy from the mounted volume + the metadata) and to roll back to earlier states of the disk blocks.
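
A toy model of the point being made, assuming a simple copy-on-write scheme. The `Volume` class is hypothetical and is not how VSS is actually implemented; it only shows why a snapshot is not a second copy.

```python
class Volume:
    """Toy block device: a list of blocks plus at most one copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.saved = None  # block index -> contents at snapshot time

    def snapshot(self):
        # Taking a snapshot copies no data, which is why it is instant.
        self.saved = {}

    def write(self, index, data):
        # Preserve the old contents the first time a block changes after the snapshot.
        if self.saved is not None and index not in self.saved:
            self.saved[index] = self.blocks[index]
        self.blocks[index] = data

    def snapshot_view(self):
        # Point-in-time image = preserved old blocks + unchanged blocks from the live volume.
        return [self.saved.get(i, block) for i, block in enumerate(self.blocks)]
```

After one block is overwritten, `saved` holds just that one old block. Everything else in the point-in-time image still lives only on the original volume, so losing the volume loses the snapshot too.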

    79. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      that was it, but i could have sworn the prize wasnt so piddly when I last saw it

    80. Re:RAID doesn't protect against your worst enemy by treeves · · Score: 1

      But that was from God via Solomon, not via Moses.

      --
      ...the future crusty old bastards are already drinking the Kool-Aid.
    81. Re:RAID doesn't protect against your worst enemy by Insightfill · · Score: 1

      (though it's been running since '04 without any problems, and my HD health monitors show it in good shape)

      Oh man.... you didn't just say that out loud did you???

      Maybe: he used parens, so it counts maybe as mumbling, maybe as just thinking it.

    82. Re:RAID doesn't protect against your worst enemy by badkarmadayaccount · · Score: 1

      s/IBM/DEC programmers employed at M$/g

      --
      I know tobacco is bad for you, so I smoke weed with crack.
    83. Re:RAID doesn't protect against your worst enemy by ShakaUVM · · Score: 1

      Yeah, as I said, I have nightly backups to another internal drive and to an external 1TB USB drive (which mirrors the whole RAID0 drive). Mainly, I use RAID0 since it makes a very noticeable difference in boot times and load times in games. It might die, but I shouldn't lose any data.

    84. Re:RAID doesn't protect against your worst enemy by Anonymous Coward · · Score: 0

      You mean these idiots? A whole $500 prize when the cost of recovery will equal at least a few multiples of ten (or exponents of the base) over the "big prize" isn't a winner.

      It's glory hounding when you know that anyone with enough brains to know what a "dd" is won't accept your challenge just based on cost alone.

      Good luck on that piece of urban crapology...

    85. Re:RAID doesn't protect against your worst enemy by drsmithy · · Score: 1

      Actually, HPFS and NTFS are completely different. However, even if they weren't it wouldn't matter, because HPFS was also created by Microsoft.

  17. Raid decay by Anonymous Coward · · Score: 0

    Sure, everyone should use at least Raid 6 in production; at least it's an improvement over the classic Raid 5 with a hot-spare.

    But the big problem with Raid isn't disk failure, it's disk decay, and a major reason that's a problem is the lack of hashing on most modern filesystems.

    They basically don't check that what you put somewhere is what you get back, which means that the Raid can decay slowly and your data will just corrupt. Sure, it's still Raid; it's just that the distributed data is corrupt.

    1. Re:Raid decay by cbreaker · · Score: 1

      This is mitigated quite a bit by hardware RAID controllers, SMART, and data validation.

      I've personally never experienced this "RAID decay" you speak of in the last 15 years of working with storage systems. And some of the arrays our customers have running consist of very old disks and controllers.

      --
      - It's not the Macs I hate. It's Digg users. -
  18. Yay, stories from July 2007! by MrPerfekt · · Score: 1

    I'm going to troll ridiculously old articles and post them to Slashdot and hope the editors don't notice... oh cool, they didn't here either!

    --
    I just wasted your mod points! HA!
    1. Re:Yay, stories from July 2007! by Bandman · · Score: 1

      Interesting. I didn't notice the date earlier. It actually came across the RSS stream earlier from ZDNet. Even if it is old, it's an interesting topic that, if true, will have dire consequences before too long.

  19. Sounds.. well. Stupid by EdIII · · Score: 4, Insightful

    I can see a lot of people getting into a tizzy over this. The RAID 5 this guy is talking about is controlled by one STUPID controller.

    There are a lot of methods, and patented technology that prevent just the situation he is talking about. Here is just one example:

    PerfectRAID(TM) is Promise's patented RAID data protection technology; a suite of data protection and redundancy features built into every Promise RAID product.

            * Predictive Data Migration (PDM): Replace un-healthy disk member in array and keep array on normal status during the data transition between healthy HD and replaced HD.
            * Bad Sector Mapping and Media Patrol: These features scan the system's drive media to ensure that even bad physical drives do not impact data availability
            * Array Error Recovery: Data recovery from bad sector or failed HD for redundant RAID
            * RAID 5/6 inconsistent data Prevent (Write Hole Table)
            * Data content Error Prevent (Read/Write Check Table)
            * Physical Drive Error Recovery
            * SMART support
            * Hard/Soft Reset to recover HD from bad status.
            * HD Powercontrol to recover HD from hung status.
            * NVRAM event logging

    RAID is not perfect, not by any stretch, but if you use it properly it will serve its purpose quite nicely. If your data is that critical, having it on a single RAID is ill-advised anyway. If you are talking about databases, then RAID 10 is preferable, and replicating the databases across multiple sites even more so.

  20. Smells Like FUD. by sexconker · · Score: 4, Insightful

    What is this article about?

    They say that since there is more data, you're more likely to encounter problems during a rebuild.

    The issue isn't with RAID, it's with the file system. Use larger blocks/sectors.

    Losing all of your data requires you to have a shitty RAID controller. A decent one will reconstruct what it can.

    The odds of you encountering a physical issue increases as capacity increases, and decreases as reliability increases. In theory, the 1 TB and up drives are pretty reliable. Anything worth protecting should be on server-grade hard drives anyway.

    The likelihood of a physical problem popping up during your rebuild is no higher with new drives than it was with old drives. I haven't noticed my larger drives failing at higher rates than my older, smaller drives. I haven't heard of them failing at higher rates.

    Remember, folks, RAID is a redundant array of inexpensive disks. The purpose of RAID is to be fault-tolerant, in the sense that a few failures don't put you out of production. You also get the nice bonus of being able to lump a bunch of drives together to get a larger total capacity.

    RAID is not a backup solution.

    RAID 5 and RAID 6, specifically, are still viable solutions for most setups. If you want more reliability, go with RAID 1+0, RAID 5+0, whatever.

    Choosing the right RAID level has always depended on your needs, setup, budget, and priorities.

    Smells like FUD.

    1. Re:Smells Like FUD. by cbreaker · · Score: 1

      That's another funny one I see sometimes.

      While I agree with your post and you obviously have a little experience with these matters (versus a lot of folks on Slashdot - it's surprising) there's no such thing as "server grade" when it comes to the quality of a hard drive.

      Do you think that they build "server" drives in a clean room, and "desktop" drives in a slightly less clean room?

      Hard drives are manufactured right next to one another. Some will be SATA. Some will be SAS. Others will be Fiber Channel. ALL have the same incredible tolerances and accurate measurements. Hard drives are a miracle of modern technology.

      The MTBF for hard drives is more of a warranty than a fact. There's no correlation between failure rates in server disks versus PC's.

      --
      - It's not the Macs I hate. It's Digg users. -
    2. Re:Smells Like FUD. by Trogre · · Score: 1

      I notice you mention RAID controllers. Is this entirely for performance reasons or is there something else there?

      The reason I ask is that my experience has led me to conclude that the best RAID is software. I've had battery-backed up RAID controllers fail on me and take out the DATA (not just a disk, but the data on the whole array) in an attempt at auto-recovery.

      These days I boot my OS off a single drive, which then joins a RAID 1 mirror. The critical data is kept on a RAID-5 volume, which can of course encompass partitions on the first two drives.

      I've done this on at least half a dozen servers, both Windows Server and Linux. And the Linux raid tools are parsecs ahead of what Windows offers. They really are.

      --
      "Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
    3. Re:Smells Like FUD. by Fastolfe · · Score: 1

      The odds of you encountering a physical issue increases as capacity increases

      I think perhaps you misunderstand what the article is trying to say. Consider that not all "physical issues" result in an immediately detectable error on the drive. If these occur at rates proportional to capacity, what happens when a RAID array suffers a drive failure, and the data on the remaining (large) drives must be read, in its entirety, to rebuild a replacement drive?

      Statistically speaking, there is a point where if errors accumulate at a rate proportional to capacity, the odds of a cascade of failures being detected during a rebuild make RAID 5 and 6 far less useful.

      An obvious solution, though, is to do regular scans of the data on your drives, to try and pick out and work around errors as they accumulate, instead of noticing the data is MIA when you need it for a rebuild.
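
To make the rebuild problem concrete, here is a minimal sketch of single-parity (RAID 5 style) reconstruction, where the parity block is the XOR of the data blocks in a stripe. The helper names are invented for the example, and a real controller does far more than this.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild the single missing block (None) in a stripe of data blocks plus parity.

    Raises ValueError unless exactly one block is unavailable, which covers the
    case where a surviving disk returns a read error during the rebuild.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(f"cannot rebuild: {len(missing)} blocks unavailable")
    survivors = [block for block in stripe if block is not None]
    return xor_blocks(survivors)
```

With one block gone, XOR-ing the survivors gives it back. With a second block unreadable in the same stripe there is nothing left to solve for, which is why finding latent errors by scanning before a drive dies matters.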

    4. Re:Smells Like FUD. by sexconker · · Score: 1

      Um, server class hard drives:

      Are tested more thoroughly
      Have longer MTBFs
      Typically are NOT on the bleeding edge of capacity (or speed)
      Typically use SAS, SCSI, infiniband/fiber channel/whatever, which are less prone to errors, logically and physically (I want to kill the person who designed the SATA connectors)
      Have better warranties

      True, warranties are after the fact, but you can often get next-day replacements shipped to you free, and you can even get a good rate (or free, depending on how much $$$ you're worth to them) on data recovery.

      I think there probably is a correlation between failure rates in servers vs pcs. Keep in mind you also have to think about the usage pattern.

      Everything aside, would YOU load up a production server with a bunch of ExcelStore drives?

    5. Re:Smells Like FUD. by sexconker · · Score: 1

      Software RAID is trash in terms of performance.
      Software RAID is trash if you need to be able to recover your data, and your RAID chipset is more than a year old (good luck at getting support).

      Hardware RAID controllers are much better. You do have to get a good one, though. And you get tons of support, and even the old kit gets updates and documentation.

      Windows/Linux? RAID should be nearly invisible to the OS.

    6. Re:Smells Like FUD. by sexconker · · Score: 1

      The obvious solution is to use a decent RAID controller.

      All the good ones offer automated HD scanning and repairing, remapping of bad sectors, various recovery options during rebuilds (if errors on the "good" drives are detected), etc.

    7. Re:Smells Like FUD. by cbreaker · · Score: 1

      Wow, you really don't know what you're talking about!

      MTBF is mostly marketing. It's not a guarantee, and drives will go above or below the MTBF. There's no possible way for a manufacturer to know when these drives will fail. It's an estimate. They give "high end" disks a higher number but the desktop drives are built with exactly the same quality control.

      "Server" drives are absolutely "bleeding" edge. The reason they don't have as big of capacity (usually) is because of the spindle speeds. Your average server disk is a 10k or 15k disk. It's more difficult to make higher capacity disks with spindle speeds that high. So, you end up with faster spindles, and lower capacity. As the manufacturing process improves, you see larger capacities in higher spindle speeds.

      SAS and SATA use the same connector and are electrically compatible. I guess you didn't know that. So there goes that theory.

      I don't love the fragility of the SATA/SAS connector but I love the interface. A standard connector between disks, standard positioning on the disk itself, and electrical compatibility mean that almost all SAS controllers and SAS drive enclosures can also use SATA disks. You can have one array with SAS, one array with larger SATA disks, all in one enclosure with one controller.

      The warranties aren't always better. Seagate, for instance, warranties all of their desktop drives for 5 years, and they have an excellent returns system.

      I have no idea what you're talking about for data recovery. No hard drive manufacturer I know of offers data recovery for warranty work. They warranty the part, not the data - it's up to YOU to provide yourself with recoverability of failed disks.

      Why don't you take a look at Google's hard drive study and see for yourself. http://research.google.com/archive/disk_failures.pdf

      I wouldn't use ExcelStore drives, but I would, and DO, use Seagate, Samsung, and Hitachi SATA drives in production.

      It seems as though you've picked up on the "common" thoughts about hard drives and you believe it as fact. It's just not. Read up on this stuff and don't rely on Internet people so much for the facts.

      --
      - It's not the Macs I hate. It's Digg users. -
    8. Re:Smells Like FUD. by sexconker · · Score: 1

      Yeah, YOU actually don't know what you're talking about.

      "MTBF is mostly marketing. It's not a guarantee, and drives will go above or below the MTBF"

      Of course they will go above or below the MTBF. It's the MEAN time between failures.

      SATA and SAS do not use the same connector. SAS is backwards compatible with SATA. Both are STANDARDS, which define the communication.

      You've never dealt with server class hardware and vendors, have you? You pay out the ass, but you get real support. You get a real warranty. Someone sold you a storage solution and it's not working right? They'll ship out a human to you. Hard drive borked and your RAID can't rebuild? They'll do data recovery for you.

      Google's hard drive study? Please. They're the people who built their storage solution around using the oldest, cheapest drives they could find, and simply tossed them on failure.

      From your own fucking link:

      Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18]. Our results do not contradict this fact.
      For example, Figure 2 changes significantly when we normalize failure rates per each drive model. Most age-related results are impacted by drive vintages.
      However, in this paper, we do not show a breakdown of drives per manufacturer, model, or vintage due to the proprietary nature of these data.

      So what are you trying to prove with that link? The link says drive models, manufacturers and age all play a key role in the failure rate.
      They go on to say that they won't be breaking down that data for us, since it's of a "proprietary nature".

      R
      E
      T
      A
      R
      D

    9. Re:Smells Like FUD. by raynet · · Score: 1

      Why do you need a RAID chipset if you are using software RAID? Software RAID also usually gives you more options to recover data than hardware RAIDs do (you can recover from multidisk failures on RAID5 by using different disk arrays for different range of sectors etc). And when you are using software RAID you can always plug the drives to another machine and they will just work, but if your hardware RAID controller fails, it might be difficult to get a compatible controller to recover the data.

      --
      - Raynet --> .
    10. Re:Smells Like FUD. by sexconker · · Score: 1

      You need a hardware RAID controller because you WOULDN'T be using software RAID, since it's trash.

      Hardware RAID is much more robust.

      You can't seriously be saying software RAID has better recovery options and support and compatibility than a good hardware controller, can you?

    11. Re:Smells Like FUD. by Anonymous Coward · · Score: 0

      Haha what a shithead you are.

      I actually have worked for both HP and EMC. We offered absolutely NO data recovery services as part of the disk warranty. Prove to me that ANY manufacturer does this and I'll bite.

      SAS isn't "backward" compatible with SATA. Both interfaces were created at the same time to ensure that you could plug both types of disks into a hot plug bay. SAS disks might have a "shim" between the power and data ports, but you CAN use the same cabling if you want to with a small connector. The interfaces are 100% electrically compatible.

      Many FC (This is Fiber Channel, just to let you know) disk systems will allow for SATA disks to plug, requiring a small transceiver type of circuit board in the drive caddy. The Clariion series uses these. Similar system on the HP EVA series.

      --(read this part)--

      WHAT THE GOOGLE STUDY ACTUALLY PROVES:

      *Models* have nothing to do with drive *type*. What they're saying is that (for example) the Seagate ST4040505 (not a real model number) might be a lot less reliable than the ST5060393 - both drives of the same capacity and interface.

      All they're saying is that some manufacturing runs, and some manufacturers, have different levels of quality and drive life expectancy.

      The interface has NO BEARING ON DRIVE LIFE. Read this again.

      Do you actually believe that Seagate or Hitachi have different factories for pushing out SAS/FC than for SATA, and that the SATA factory isn't as good? That's so stupid I really don't know what to say. Well, asshat comes to mind.

      I don't understand why you've continued to argue this, and I've obviously struck a nerve because you resorted to lashing out. You're some 12 year olds' Internet Hero!

      F
      A
      I
      L

    12. Re:Smells Like FUD. by cbreaker · · Score: 1

      Are you kidding me?

      Software RAID is a lot more flexible. You can plug the disks into ANY machine running the right OS and get your data up and running.

      You can also RAID just PARTS of your disk, like with Linux LVM.

      You can get new features with a software patch.

      There's a lot of software out there to recover data from failed Windows software RAID sets.

      Versus hardware, where you have a single vendor that you need to deal with to help you, and that you might need to acquire the same exact model controller if yours fails. This could be a problem if your controller is several years old.

      Hardware RAID is a lot faster (in most cases) but that's really the ONLY benefit of it.

      --
      - It's not the Macs I hate. It's Digg users. -
    13. Re:Smells Like FUD. by cbreaker · · Score: 1

      PS. The funny thing is that I actually agreed with your original post, but not about the failure rate of "enterprise" disks versus desktop disks.

      You're a very ignorant person. I'm glad I don't know you in real life.

      --
      - It's not the Macs I hate. It's Digg users. -
    14. Re:Smells Like FUD. by sexconker · · Score: 1

      Software RAID, as in, PURE SOFTWARE (opposed to a chipset on a motherboard)?

      Performance is so bad in that scenario that it's not feasible for most serious deployments.

      But yeah, you're right, that IS more flexible, and yeah, you can plug it into any machine.

      Most RAID controller manufacturers have a very good track record when it comes to supporting, replacing, and documenting older hardware. Sure, there's a cost involved, but if you have any concern for performance, it's well worth it.

    15. Re:Smells Like FUD. by Anonymous Coward · · Score: 0

      You got owned.
      You're a fucking moron, and you posted a link to support yourself, that actually contradicts your statements.
      Face it.

  21. Taking published stats too seriously? by Vellmont · · Score: 2, Interesting

    The whole argument boils down to the published URE rate being both accurate, and a foregone conclusion. Will disk makers _really_ make drives that have a sector failure for every 2 terabytes, or will they improve whatever technology is causing these URE's to be much more rare? (if the rate was real in the first place).

    --
    AccountKiller
    1. Re:Taking published stats too seriously? by Ost99 · · Score: 1

      URE of 1 in 1E14 only applies to desktop drives.
      More expensive drives have a lower URE rate.

      Example:
      Seagate desktop drives 7200.11 have 1 in 1E14 URE
      The RAID version ES.2 of the same drive has 1 in 1E15 URE, and their enterprise disks (15k SCSI / SAS) have 1 in 1E16 URE.

      The article is just FUD
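
For anyone who wants to plug in their own numbers, here is a rough sketch of the arithmetic. It assumes bit errors are independent and occur at exactly the published rate, which is a simplification, and it uses the article's example of six surviving 2 TB drives; the function name is made up.

```python
import math


def p_read_error(bytes_read, bits_per_error):
    """Probability of at least one unrecoverable read error while reading bytes_read bytes.

    Assumes independent bit errors at a rate of one per bits_per_error bits.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


rebuild_bytes = 6 * 2 * 10**12  # six surviving 2 TB drives, as in the article's example
for rate in (1e14, 1e15, 1e16):
    print(f"1 in {rate:.0e}: {p_read_error(rebuild_bytes, rate):.1%}")
```

Under those assumptions a full 12 TB read comes out at roughly 62% for 1 in 1E14, about 9% for 1 in 1E15, and about 1% for 1 in 1E16, so the drive class changes the picture a lot.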

      --
      ---- Sig. gone.
  22. RAID Is not a Backup !!!!! by mbone · · Score: 4, Insightful

    How many times does this have to be said.

    RAID is not a backup. RAID is designed to protect against hardware failures. It can also increase your I/O speed, which is more important in some cases. Backups are different.

    Depending on what you are doing, you may or may not need a RAID, but you definitely need backups.

    1. Re:RAID Is not a Backup !!!!! by blair1q · · Score: 0

      The chances that your RAID will have a double failure causing your data to be lost are just about the same as the chances that your RAID will have a single failure and your tape backup also has a failure.

      You can back up your RAID, but that'd be like backing up your backup tapes.

      Feel free. Tape-hangers need jobs.

    2. Re:RAID Is not a Backup !!!!! by bendodge · · Score: 1

      Read the article. Once you get into the 2TB and higher range, RAID5 won't protect much against hardware failures. As a previous poster noted, expecting even savvy home users or SMBs to keep offsite tape backups of a 7-disk array is expensive and unrealistic.

      --
      The government can't save you.
    3. Re:RAID Is not a Backup !!!!! by DrVxD · · Score: 1

      The chances that your RAID will have a double failure causing your data to be lost are just about the same as the chances that your RAID will have a single failure and your tape backup also has a failure.

      Only if you assume a single backup tape. Anyone who's even half-way serious about keeping their data intact will have multiple backup tapes. (Once upon a time we called it "grandfather-father-son" - but these days we tend to go for more generations than that). For our critical systems, we have multiple copies of *every* backup (two in the data centre for ease of restoration, and a minimum of four offsite - 2 each in 2 locations). Even on databases that are replicated globally.
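
For readers who haven't met "grandfather-father-son", a minimal sketch of one possible rotation rule follows. The schedule is an assumption made up for illustration (monthly on the 1st, weekly on Sundays, daily otherwise), not the policy described above.

```python
from datetime import date


def tape_tier(day):
    """Pick the tape set for a backup taken on `day` under a simple GFS rotation.

    Assumed policy (hypothetical, adjust to taste): the first day of the month
    goes to a monthly "grandfather" tape, other Sundays go to a weekly
    "father" tape, and every remaining day goes to a daily "son" tape.
    """
    if day.day == 1:
        return "grandfather"
    if day.weekday() == 6:  # Monday is 0, Sunday is 6
        return "father"
    return "son"
```

The daily tapes get reused most often and the monthly ones least, so a corruption that goes unnoticed for a few weeks can still be rolled back past.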

      You can back up your RAID, but that'd be like backing up your backup tapes.

      Exactly like backing up your backup tapes - an eminently sensible strategy if your data means anything to you.

      I'm a little less fastidious about it when it comes to backing up my data at home - but every time I burn a backup DVD, I burn at least two copies of it. Once bitten, twice shy.

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    4. Re:RAID Is not a Backup !!!!! by JesseMcDonald · · Score: 1

      If the only issue were hardware failures that would be correct. RAID and backups are designed to protect against different failure modes, however. RAID isn't going to help you when all the drives are sent the same incorrect signal, whether that takes the form of OS- or application-level data corruption or user error.

      As the GP said, RAID gives you quick, up-to-date -- but temporary -- recovery in the event of a single disk failure (or two failures for RAID 6), and/or improved speed during normal use. You still need the backups to protect against data corruption and user error. With backups you can recover most of your data after any failure in the live system; with just RAID limited hardware failures are the only thing you can recover from.

      --
      "The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
    5. Re:RAID Is not a Backup !!!!! by nrozema · · Score: 1

      RAID is not a backup. RAID is designed to protect against hardware failures.

      If the solution to a disk failure in multi-terabyte RAID 5 arrays becomes "restore from tape" as the article implies, then the usefulness of RAID for protecting against hardware failures is pretty much nil. Seems that's what the article was about, not backups.

    6. Re:RAID Is not a Backup !!!!! by petermgreen · · Score: 1

      The chances that your RAID will have a double failure causing your data to be lost are just about the same as the chances that your RAID will have a single failure and your tape backup also has a failure.

      The thing is, even if you believe that double disk failures in a RAID are unlikely, there are many other threats to data that RAID DOES NOT protect from. For example:

      * Accidental deletion
      * theft or destruction (e.g. by fire) of all the drives in the array
      * PSU failure killing the controllers on all the drives in the array at once. (this one is probably recoverable but not without a lot of pain)

      And a sensible tape backup strategy won't reuse the same tape for every backup. It will cycle the tapes so that at any time there is always at least one tape, and preferably more, offsite.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    7. Re:RAID Is not a Backup !!!!! by cbreaker · · Score: 1

      I agree.

      But these days, I believe everyone should use RAID if they can. RAID 5 is a great mix of performance, reliability, and price. Who wants to deal with going to the backup system to recover data when a single drive fails, when it could have been avoided completely with a RAID?

      --
      - It's not the Macs I hate. It's Digg users. -
    8. Re:RAID Is not a Backup !!!!! by Anonymous Coward · · Score: 0

      It needs to be said again because I don't think it's already been said AT LEAST 20 TIMES! Geez...since this is /. you really didn't need to ask.

    9. Re:RAID Is not a Backup !!!!! by Hyppy · · Score: 1

      Feel free to not back up your data. Recovery specialists love to charge you up the arse to get bits and pieces of your precious data back after you accidentally deleted it.

      I don't need a backup. I have RAID, right?

  23. Can I tell you where to insert your plug? by hacker · · Score: 0, Flamebait

    "I'll keep the ZFS plug short."

    Start with a short plug, because where I'll be asking you to put your ZFS "plug" will most-definitely hurt if your "plug" is any larger...

    1. Lacking in file system utilities (yes, fsck IS necessary even on healthy filesystems, especially on desktops and portables)
    2. License-incompatible with anything worth running it on, other than Solaris itself... which is NOT worth running (see #1 above)
    3. Proprietary and full of patented technologies (see #2 above)

    Need I go on? There are plenty more reasons.

    We'll have a viable replacement soon enough, which is already designed to have quite a few more features that ZFS does not have, and cannot deliver in its current incarnation.

    1. Re:Can I tell you where to insert your plug? by jumon · · Score: 0, Troll

      Lacking in file system utilities (yes, fsck IS necessary even on healthy filesystems, especially on desktops and portables)

      So, why is fsck needed? If you'd actually investigated ZFS, you'd know it's built to never need such an archaic utility. Either it works or it doesn't, so if it does, then no external checks are needed.

    2. Re:Can I tell you where to insert your plug? by hacker · · Score: 1

      If you'd actually used ZFS in production on multi-terabyte filesystems, you'd know precisely why fsck is needed. Add to that, running it on laptops and machines where imminent failures are common, and filesystems have to be able to recover from that.

      See here and Sun's silly response as to why they neglected to provide fsck support for ZFS in their OS.

      From a previous mini-thread on the matter. See the full thread for plenty of other examples in defense of fsck utilities on "robust" filesystems.

      "ZFS has checksums and will find errors, but only will be able to self-heal the errors in a redundant configuration. On a single disk, ZFS will find the error thanks to checksums but will not be able to recover your data. Since ZFS was mainly designed for systems that will use redundant configurations, it may have sense there, but desktops are not never going to do such things. IMO the ZFS people were a bit elitist here - "let's going to build a filesystem so good that we won't need a fsck". But in the real world you _are_ going to need a fsck util. Only in excepcional and very rare cases, but you're going to need it."

    3. Re:Can I tell you where to insert your plug? by jp10558 · · Score: 1

      I've set up an OpenSolaris raidz2 array with 9 disks at home for my backup (that is, I backup my main system that is JBOD to the raid array, so it's my extra copy). I'm not that versed in Solaris, and slowly learning (its built-in SMB was much easier for me than Samba has been), but I really know Red Hat better. What's coming to replace it (zfs) that's better? Is it going to work on CentOS?

      --
      Opera, Proxomitron-Grypen,GPG 0x0A1C6EE3
    4. Re:Can I tell you where to insert your plug? by pyite · · Score: 4, Informative

      Wow. I love your FUD. If you're going to lie, at least make it seem truthful.

      Lacking in file system utilities (yes, fsck IS necessary even on healthy filesystems, especially on desktops and portables)

      Why no fsck? And if you really feel the need to do something:

      zpool scrub <pool_name>

      License-incompatible with anything worth running it on, other than Solaris itself... which is NOT worth running (see #1 above)

      What you mean to say is "Some Operating Systems whose merits can be debated are license incompatible with the license of ZFS." FreeBSD can implement ZFS. Why can't Linux? Because of its license, not that of ZFS.

      --

      "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

    5. Re:Can I tell you where to insert your plug? by Anonymous Coward · · Score: 0

      I agree that it's not for use in a non-redundant configuration. I do have a 2 terabyte configuration with ZFS and I've yet to need any such utility. True, it's new and doesn't have the years of experience I've had with UFS and its variants, but the idea seems valid.

      You may not believe them, but they have stated that they have not seen any issues with unknown pathologies in their 5 years of development, and the loss you quote did state that they were doing some serious voodoo with the filesystem to get it into that state. I think that any filesystem should let you "get what you can" when it's just too hosed to work right. I'd rather get dangerous access to some data than nothing at all. Thanks for the insight!

    6. Re:Can I tell you where to insert your plug? by hacker · · Score: 1

      "Why no fsck? And if you really feel the need to do something:"

      You DID see my previous reply, right? It has plenty (hundreds) of anecdotal references as to why fsck is absolutely required, especially when dealing with redundant disks and larger filesystems.

      "Why can't Linux? Because of its license, not that of ZFS."

      Sure, as long as you consider that the CDDL takes rights away that the GPL grants. See here for a MUCH better explanation than I can do justice with.

    7. Re:Can I tell you where to insert your plug? by pyite · · Score: 2, Informative

      You DID see my previous reply, right?

      Yes, I did. It quotes an explanation that you can only fix errors in a redundant configuration. Considering that the whole basis for this discussion is RAID-5, I think that's a feasible thing. However, metadata is written in multiple places, so if you want a ZFS fsck to correct a corrupted superblock, it's kinda silly since that superblock is written in multiple places anyway. Also, you can tell ZFS to do a manual scrub (as I showed), which has the advantage of running while the array is running, so you can cron script it and still keep the array available.

      I'm not going to argue license points. The fact is that ZFS is under an open source license and so is Linux. Sun had every right to use their own license.

      --

      "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

    8. Re:Can I tell you where to insert your plug? by boner · · Score: 1

      The problem with anecdotal references is exactly that, they are anecdotal. You can trawl through any number of Slashdot discussions to find opinions that support, oppose, ridicule, pontificate, or elevate to religion any point of your choosing.

      When it comes to making engineering decisions these anecdotal references have zero value.

    9. Re:Can I tell you where to insert your plug? by hacker · · Score: 1

      "When it comes to making engineering decisions these anecdotal references have zero value."

      ...except that 100% of them were posted by professional system administrators who are paid to support and maintain systems using these technologies. None of those were from any posts on Slashdot.

      Now, whether any of those career sysadmins are also posting on Slashdot under different names from the ones they used on Google Groups, I don't know.

    10. Re:Can I tell you where to insert your plug? by asaul · · Score: 1

      I really hate that this particular bit of FUD keeps floating around. Seriously - look at the ZFS design, it has so much integrity and self checking built into it that the only two cases for errors are:

      1. Data corruption on disk due to a corrupted write or bit rot. Depending on the case it may or may not be recoverable from another location in the pool. If the fault is in the uberblock you are screwed, but that's the same as any other filesystem whose root data is stuffed. In the non-repairable case it simply gives you an IO error and flags an FMA event that this file/object is screwed. In any filesystem that is a blow-away-and-restore event. Same as any other filesystem, you might be able to rm and save some data, but generally it's gone and it's recovery time.

      2. A ZFS software failure of some sort - bad metadata written to disk with valid checksums for example. In this case the actual fix is to the ZFS module and application of a patch, and the short term fix once the bug is discovered is probably a repair tool from Sun, unless the fix involves ZFS being updated so it is able to self correct this fault.

      Which brings us to fsck. Everything that fsck does is based on previously encountered behaviour with a known pathology and a possible method for repair. Typically this is addressing known forms of corruption caused by the filesystem itself and on-disk inconsistency (writes lost at poweroff etc). Sure, XFS had no tool initially, but once the problems were found one was written. UFS has 20 years of experience put into fsck.

      Many of these cases are already handled by the inherent design of ZFS - self checking data structures, checksums throughout, copy-on-write disk updates.

      So, unless you can find an inherent flaw in the design of ZFS, or a bug in the code, you have no case for needing a fsck utility more advanced than zpool scrub. It may be the case that one day there will be a zfs repair command - but until such a need arises why be so paranoid?

      Most ZFS "oh fuck" events I have heard of have either been hardware issues (expensive RAID arrays doing silent data corruption) or some extreme cases of testing abuse. In my professional experience with it over the last 2 years I have only seen one issue, caused by bad non-ECC memory on my home PC.

      --
      "If everybody is thinking alike, somebody isn't thinking" - Gen. George S. Patton
  24. so wrong by Anonymous Coward · · Score: 0

    12 TB of your carefully protected -- you thought! -- data is gone. Oh, you didn't back it up to tape? Bummer!

    Ummm... Wow.

    A RAID is no substitute for a backup. A RAID cannot recover a file that you accidentally deleted, for example. A RAID can't be used to rebuild your server if the building burns down. If you aren't using some kind of offsite backup, your data is not carefully protected.

    RAIDs are handy for giving you a little more reliability. If one HDD fails you can usually recover without any downtime.

    RAIDs also give you much better speed over a single drive.

    RAIDs can give you increased capacity over a single drive.

    But a RAID is not a substitute for a backup. Never was, never will be.

  25. The problem is time, not reliability by petes_PoV · · Score: 2, Interesting
    The larger the drives, the longer it takes to resilver (rebuild the RAID) the array. During this time performance takes a real hit - no matter what the vendors tell you, it's unavoidable: you simply must copy all that data.

    In practice, this means that while your array is rebuilding, your performance SLAs go out of the window. If this is for an interactive server, such as a TP database or web service you end up with lots of complaints and a large backlog of work.

    The result is that as disks get bigger, the recovery takes longer. This is what makes RAID less desirable, not the possibility of a subsequent failure - that can always be worked around.

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
    1. Re:The problem is time, not reliability by Anonymous Coward · · Score: 0

      petes,
      So you're saying that having the data/server app available at half of normal speed is MORE of a problem than having the server completely down while you replace drives and restore from tape?

      You live in an odd universe friend.

      I suspect that even if you have all the hardware at hand and have done it before, your replace & restore cycle is going to be as long or longer than the automatic raid rebuild to a hot spare.

    2. Re:The problem is time, not reliability by petes_PoV · · Score: 1

      So you're saying that having the data/server app available at half of normal speed is MORE of a problem than having the server completely down while you replace drives and restore from tape?

      Not at all. Simply that the more data you need to write (to a fresh/replaced RAID element) the longer it takes. Any conclusions you draw from that are your own work, not mine. I have my own set of corollaries and remedies for this situation.

      --
      politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  26. ZFS is not Obama! by theendlessnow · · Score: 1

    It's implied in the Slashdot description that ZFS solves the problem of drive failure. It does not. Just want to make that clear. In fact, I'd argue that there is actually more risk inside of ZFS with regards to the actual problem presented here.... for those that believe all is doom and gloom with regards to RAID.

  27. Doesn't make sense to me by Anonymous Coward · · Score: 0

    I don't understand how the spec'd URE of a single SATA drive can be translated across multiple drives whose combined capacity happens to match that URE. I would think that the success or failure of each drive is independent of the others and the fact that you have 6 2-terabyte drives doesn't mean that you have to have a URE. You'd think that you'd have to have a URE on 7 2-terabyte drives in any configuration if that were true.
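
    For anyone who wants to check the numbers, here is a rough Python sketch. It assumes the 1-in-10^14 spec-sheet rate applies independently to every bit read, which is an idealisation; the function and variable names are made up for illustration. Under that assumption the answer depends only on the total number of bits read, so six independent 2 TB drives behave like one 12 TB read, and a URE is likely but not guaranteed.

```python
import math

def p_at_least_one_ure(bytes_read, ure_rate_per_bit=1e-14):
    """Probability of at least one unrecoverable read error (URE).

    Assumes each bit read fails independently with the same probability.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way for tiny p.
    return -math.expm1(bits * math.log1p(-ure_rate_per_bit))

TB = 10**12
p_one_drive = p_at_least_one_ure(2 * TB)        # reading one 2 TB drive
p_six_drives = p_at_least_one_ure(6 * 2 * TB)   # reading six 2 TB drives

print(f"one 2 TB drive:  {p_one_drive:.2f}")
print(f"six 2 TB drives: {p_six_drives:.2f}")

# Treating the six drives as independent gives the same answer.
print(f"combined from single drives: {1 - (1 - p_one_drive) ** 6:.2f}")
```

    So with these assumptions the rebuild is more likely than not to hit a URE, but it is not "almost certain".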

  28. RAID is about avoiding PRODUCTION downtime. by khasim · · Score: 2, Informative

    Spell it out for everyone.

    RAID won't save your data if there is a fire.
    Or if you delete a file.
    Or if two drives fail.
    Or a thousand other scenarios.

    All RAID does is prevent the system from going down when a single drive fails (except RAID 0). Thus giving everyone in the office time to finish up their important work and log out for the day so you can swap the drive. Or, if you're brave, swap the drive during regular work hours.

    For the home user (not working on huge graphic files) RAID 1 (mirroring) should be sufficient. As long as it is paired with another EXTERNAL hard drive that you copy your important information to. And leave with your brother or something. I'm talking family photos and such. Your tax information should be small enough to fit on a USB drive.

    If your computer completely failed TODAY what would be the really irreplaceable files on it?

    Back those up. Then store them with a friend or someone in your family.

    There, problem solved.

    1. Re:RAID is about avoiding PRODUCTION downtime. by Anonymous Coward · · Score: 0

      With hot swap, you don't have to feel so 'brave' during office hours. :) But, yes, RAID (with hot replacement) turns the inevitable hard drive failure from "potentially several days work" to "potentially several minutes work". And IT staff have enough work to do, thank you.

      Hard drive failures are inevitable. But it is only those that RAID defends against.

      Anyway, a lot of the problems are solved - all that's required is a simple 'read every sector on the disk every so often to check'. You could do this yourself: 'dd if=/dev/somevolume of=/dev/null'
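
      The same "read everything and see what breaks" idea as a small Python sketch, in case you want the failing offsets reported. This is only an illustration: read_scan is a made-up helper, the demo runs against a temporary file rather than a real volume, and reading an actual device would need suitable permissions.

```python
import os
import tempfile

def read_scan(path, chunk_size=1024 * 1024):
    """Read `path` sequentially; return (bytes_read, offsets_with_errors)."""
    bad_offsets = []
    total = 0
    offset = 0
    with open(path, "rb", buffering=0) as dev:
        while True:
            try:
                chunk = dev.read(chunk_size)
            except OSError:
                # Note the failing offset, then skip ahead one chunk.
                bad_offsets.append(offset)
                offset += chunk_size
                dev.seek(offset)
                continue
            if not chunk:
                break  # end of file or device
            total += len(chunk)
            offset += len(chunk)
    return total, bad_offsets

# Demo on a throwaway file standing in for the volume.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(3 * 1024 * 1024 + 17))
demo_total, demo_bad = read_scan(tmp.name)
os.unlink(tmp.name)
print(demo_total, demo_bad)
```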

  29. Raid 5 failing, Raid 6 not far behind... by Anonymous Coward · · Score: 0

    ...clearly, Raid 7 is needed.

  30. RAID5 isn't a false sense of security by cbreaker · · Score: 1

    RAID 5 isn't a false sense of security. It actually DOES protect you from a disk failure.

    I made the decision about two years ago that all disks at home will be either mirrored or RAID5. Disks are so dirt cheap that there's no reason not to.

    RAID doesn't remove the need for some sort of backup solution. But if you can only trust yourself to do backups when you're being risky with your data, go ahead: I'll happily avoid dealing with restoring data and all that bullshit after a single disk failure, and you can sink your time into doing it all manually.

    --
    - It's not the Macs I hate. It's Digg users. -
    1. Re:RAID5 isn't a false sense of security by networkBoy · · Score: 1

      True, it does protect from disk failure. The false sense I was referring to was the inherent "I'll do that later, the RAID will cover me for a few days" bit that none of us want to admit, but we are all susceptible to.

      As to the previous poster: Yes it's crappy when your volume goes off-line or read-only. Unfortunately such is the case when you have no budget and are on underperforming hardware, but that's all you can afford.

      Note that my application is *not* enterprise, or even departmental (though it should be treated as such). It is managing a non-trivial collection of data-sets for a masters thesis in social sciences. (not mine, I'm just the data mule).

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    2. Re:RAID5 isn't a false sense of security by MightyYar · · Score: 2, Insightful

      I do the same thing, but I want to warn you...

      I've had TWO occasions where it has failed me. Once, a lightning strike that zotched both drives. The second time a rubber isolator failed in the case and the master drive fell onto the backup.

      In both cases the bad spots in the two drives were different so I got back most of my data, but now I use Mozy as well as mirroring. I REALLLLLLLY don't want to lose all of my digital photos. :)

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
  31. Re:Illin with the panicillin? by Anonymous Coward · · Score: 0

    can someone tell what's the meaning of this shit?

  32. RAID6 = Win by MukiMuki · · Score: 3, Insightful

    Scrub once a week, or once every two weeks.

    RAID6 isn't about losing any two disks, it's about having two parity stripes. It's about being able to survive sector errors without any worry.

    It's about losing ONE drive and still have enough parity to replace it without any errors.

    RAID6 on 5 drives is retarded, tho, because it leaves you absurdly close to RAID1 in kept space. RAID6 is for when you have 8-10 drives. At that point you barely notice the (N - 2) effect and you have a fast (provided your processor can handle it all) chunk of throughput along with an incredibly reliable system. Well, N-3 with a hotswap.

    Personally, I think I'd go RAID-Z2 via ZFS if only because it's a little bit sturdier a filesystem to begin with.
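
    The "(N - 2) effect" above is easy to put numbers on. A quick sketch, assuming equal-sized drives and plain two-way mirroring for the RAID1 comparison (usable_fraction is a made-up name):

```python
def usable_fraction(level, n_drives):
    """Fraction of raw capacity left for data, assuming equal-sized drives."""
    if level == "raid1":   # two-way mirroring: half the raw space
        return 0.5
    if level == "raid5":   # one drive's worth of parity
        return (n_drives - 1) / n_drives
    if level == "raid6":   # two drives' worth of parity
        return (n_drives - 2) / n_drives
    raise ValueError(f"unknown level: {level}")

for n in (5, 8, 10):
    print(f"{n} drives: RAID5 {usable_fraction('raid5', n):.0%}, "
          f"RAID6 {usable_fraction('raid6', n):.0%}, "
          f"RAID1 {usable_fraction('raid1', n):.0%}")
```

    On 5 drives RAID6 keeps 60% of the raw space against 50% for mirroring; on 10 drives it keeps 80%.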

  33. In Soviet Russia... by InSovietRussiaTroll · · Score: 0

    They recycle old news forever!

  34. Re:Illin with the panicillin? by Anonymous Coward · · Score: 0

    Probably someone attempting the 1000 monkeys on a thousand keyboards theory? They have made quite some progress don't you think?

  35. For fuck sake, at least read the summary by nedlohs · · Score: 1

    Sure, expecting the "editor" to actually look at the article is excessive. But:

    "Disk drive capacities double every 18-24 months. We have 1 TB drives now, and in 2009 we'll have 2 TB drives."

    Is an obvious indication that this article is old, since 18-24 months away puts you in 2010 now...

    1. Re:For fuck sake, at least read the summary by Anonymous Coward · · Score: 0

      It doesn't state when 1TB came out, so if it came out 6 months prior, then 2TB could be achieved by close of 2009.

      We have 1.5TB drives right now...

    2. Re:For fuck sake, at least read the summary by nedlohs · · Score: 1

      Hence the use of the word "indication" and not the words "incontrovertible proof".

      Once you see the obvious indication you click the link to check the date or if too lazy to do that don't post just in case...

  36. PANIC! by Anonymous Coward · · Score: 0

    12 TB of your carefully protected -- you thought! -- data is gone.

    oh noes!!! it will be even worse when you have 13TB instead of 12TB!!!!

    now read that BS article and downmod this "news"...

  37. RAID-10 by tonytnnt · · Score: 2, Insightful

    RAID-10 ftw? Expensive I know, but at least you have a full layer of redundancy rather than just a parity drive.

    1. Re:RAID-10 by anonobomber · · Score: 2, Interesting

      With RAID 10 you still can have 2 drives fail and lose all your data. Though if you're lucky you'll have the second failure on the same side of the mirrored portion in which case you'll still have your data.

    2. Re:RAID-10 by tonytnnt · · Score: 1

      RAID 01 then?

    3. Re:RAID-10 by Lost+Penguin · · Score: 1

      RAID 55?
      Too Expensive to use nine drives for the capacity of two?
      The question is how vital is the data, divided by your own laziness to perform backups.

      (I back it up every 5 minutes to /dev/null!)

      --
      I am the unwilling control for my Origin.
    4. Re:RAID-10 by Anonymous Coward · · Score: 0

      No, that's RAID 0+1.
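
      To make the difference in this sub-thread concrete, here is a small Python sketch that enumerates every two-drive failure on a 4-drive array. The drive numbering and groupings are assumptions for illustration, and naming conventions vary between vendors: "RAID 10" here means a stripe over mirrored pairs, "RAID 0+1" a mirror of two stripe sets.

```python
from itertools import combinations

def survives_raid10(failed, mirror_pairs):
    """Stripe over mirrors: data survives unless some pair loses both drives."""
    return all(not set(pair) <= failed for pair in mirror_pairs)

def survives_raid01(failed, stripe_sets):
    """Mirror of stripes: data survives if at least one stripe set is intact."""
    return any(not (set(stripe) & failed) for stripe in stripe_sets)

drives = range(4)
pairs = [(0, 1), (2, 3)]     # RAID 10: two mirrored pairs, striped together
stripes = [(0, 1), (2, 3)]   # RAID 0+1: two 2-drive stripes, mirrored

two_failures = [set(c) for c in combinations(drives, 2)]
ok10 = sum(survives_raid10(f, pairs) for f in two_failures)
ok01 = sum(survives_raid01(f, stripes) for f in two_failures)

print(f"RAID 10  survives {ok10} of {len(two_failures)} two-drive failures")
print(f"RAID 0+1 survives {ok01} of {len(two_failures)} two-drive failures")
```

      Neither layout survives every double failure, but the stripe-over-mirrors layout tolerates more of them.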

  38. Not true by my experience... by mr_rarr · · Score: 1

    Well what the author is saying is not true by my experience..

    When I configure a new server with RAID, before handing over the box, I will test every single RAID 1/5/10 set by pulling one drive out at a time for a couple of minutes, then putting it back in and waiting until the rebuild is complete.

    I have worked with multiple Dell PowerVault MD1000s using 15 1TB SATA drives, and never have I run into an issue of not being able to complete the rebuild because of the issue mentioned in the article.

    I can tell you that drives will fail, but if you have the right monitoring in place to catch when a drive fails or is about to fail, and the right RAID solution in place, you can consider your data pretty safe. RAID is not a backup and not 100%, but chances are that a RAID solution will surely save your butt when a disk dies and you just don't have the time to rebuild / restore a box.

  39. 1 in 10^14 bit is not what I observe by gweihir · · Score: 4, Informative

    My observed error rate with about 4TB of storage is much, much lower. I did run a full surface scan every 15 days for two years and did not have a single read error in about two years. (The hardware has since been decommissioned and replaced by 5 RAID6 arrays with 4TB each.)

    So, I did read roughly 100 times 4TB. That is 400TB = 3.2 * 10^15 bits with 0 errors. That does not take into account normal read from the disks, which should be substantially more.
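
    A rough check of what the spec-sheet figure would predict for that much reading, as a Python sketch. It assumes errors are independent and uses a Poisson approximation; the variable names are only for illustration.

```python
import math

TB = 10**12
bits_read = 100 * 4 * TB * 8    # about 100 full scans of 4 TB
ure_rate = 1e-14                # spec-sheet figure, per bit read

# Expected number of unrecoverable read errors at the quoted rate.
expected_errors = bits_read * ure_rate

# Chance of seeing none at all, using a Poisson approximation.
p_zero_errors = math.exp(-expected_errors)

print(f"bits read:       {bits_read:.1e}")
print(f"expected errors: {expected_errors:.0f}")
print(f"P(zero errors):  {p_zero_errors:.1e}")
```

    If the 1 in 10^14 figure were the real per-bit rate for these drives, a clean record over this much data would be wildly improbable.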

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re: 1 in 10^14 bit is not what I observe by Free+the+Cowards · · Score: 3, Interesting

      Modern drives make extensive use of error-correcting codes. It's not that expensive, space-wise, to have a code which can recover from problems to almost any desired degree of confidence. I'd be shocked if any hard drive manufacturer wasn't using an ECC that gave their devices a very near zero chance of any user experiencing a corrupted read for the entire lifetime of the drive.

      --
      If you mod me Overrated, you are admitting that you have no penis.
    2. Re: 1 in 10^14 bit is not what I observe by hellwig · · Score: 2, Interesting

      Yeah, as far as I can tell, the numbers the author used only relate to every 12TB of data read, and have absolutely nothing to do with RAID. Therefore, for every 12TB of data read, there will be an unrecoverable error. That means 50% of all 6TB RAID rebuilds fail. 25% of all 3TB RAID rebuilds, etc... At these rates, RAID was never a viable option.

      I don't know how much data is transferred over the internet every second, but I have to imagine this results in hundreds of thousands of files lost every day (due to URE). In fact, I conjecture that the rate of files being lost is outpacing the rate of files being created; soon we will have a total information blackout due to more people reading data than creating data.

      That, or the author's numbers are bullshit and he's misinterpreting the results.

      --
      Eggs
      Milk
      Bread
      Cat Litter
      Soda
      ...
    3. Re: 1 in 10^14 bit is not what I observe by jimicus · · Score: 1

      So, I did read roughly 100 times 4TB. That is 400TB = 3.2 * 10^15 bits with 0 errors. That does not take into account normal read from the disks, which should be substantially more.

      Nor does it take into account the invisible sector remapping that any reasonably modern drive will do.

    4. Re: 1 in 10^14 bit is not what I observe by Anonymous Coward · · Score: 0

      Have you compared your data against a checksum ? Zero read errors reported by the drive does NOT mean zero errors.

      My ZFS raidz array suffered one checksum error this year with only 500GB of data. No read errors were reported.

    5. Re: 1 in 10^14 bit is not what I observe by Sobrique · · Score: 1
      Not quite. That's the expected unrecoverable error rate published by the drive manufacturers. That doesn't mean it's mandatory that you'll have one every 12TB, it just makes it increasingly likely that one will happen.

      Nor does it mean that's the probability of a complete RAID failure - an unrecoverable error is one thingy that can't be read properly. On a RAID, that's corrected by the RAID. With the RAID down... well, oops, you've got a corrupt file (well, more likely you've a corrupt stripe). But the rest of your 12TB isn't affected.

    6. Re: 1 in 10^14 bit is not what I observe by gweihir · · Score: 1

      Nor does it take into account the invisible sector remapping that any reasonably modern drive will do.

      Indeed. It is quite possible that my bi-weekly surface scan actually dramatically lowers the disk uncorrectable error rate, because remapping takes place at an early stage when borderline sectors are still readable. On the other hand, this type of "disc scrubbing" seems to be fairly standard on high-end RAID controllers. Come to think of it, Linux software RAID under, e.g., Debian, also does scrubbing (complete array consistency check) once a week or so in the default installation. So disc scrubbing is nothing exotic in a reasonable RAID and needs to be taken into account when estimating reliability.

      Leaves me with the impression that the author of the article is not very competent with regard to the subject matter.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    7. Re: 1 in 10^14 bit is not what I observe by toby · · Score: 1

      I'd be shocked if any hard drive manufacturer wasn't using an ECC that gave their devices a very near zero chance of any user experiencing a corrupted read for the entire lifetime of the drive.

      For suitably large values of "near zero"!

      Corrupted reads have a variety of causes. For example, sometimes it is reading the wrong sector. RAID-1 will happily deliver the bad data. For any RAID level, *writing* the wrong sector (which can also happen with non zero probability) is catastrophic. Data can also be corrupted at any stage after leaving the media: cable, controller, etc.

      ZFS (but not RAID) can protect against all of the above error modes, and will additionally self heal (fix the data on disk). It does not assume any component, or even error reporting, is reliable.

      --
      you had me at #!
    8. Re: 1 in 10^14 bit is not what I observe by Nynaeve · · Score: 1

      Read this PDF. In particular, page 13:
      Measurements at CERN
      Wrote a simple application to write/verify 1GB file
      * Write 1MB, sleep 1 second, etc. until 1GB has been written
      * Read 1MB, verify, sleep 1 second, etc.
      Ran on 3000 rack servers with HW RAID card
      After 3 weeks, found 152 instances of silent data corruption

      Reading/Writing 1MB/sec for three weeks has 152 SILENT errors. They don't mention detected errors, but it has to be more than the silent ones.

    9. Re: 1 in 10^14 bit is not what I observe by gweihir · · Score: 1

      Silent data corruption on the disk is next to impossible. The error correcting code will detect corruption reliably. I do not have exact numbers, but an undetected error will be several orders of magnitude less likely than a detected one. In addition, an undetected error needs a lot of bits to be flipped on disk (because of the ECC hamming distance) in just the right way.

      However, silent data corruption in memory or data-path is possible. In memory, I have observed something like 2 bit-errors per year in an application that transferred about 1 GB/hour (with non-ECC memory). That is about one error every 5TB, detected by bad bzip2 checksum. Incidentally my original posting also referred to disks carrying this type of data with intrinsic checksums and huge bit-error blowup (due to decompression), with the disks typically being more than 50% full, so I am sure there was no significant number of silent errors I missed. I have had the very rare silent transient error, where re-reading the data resulted in an error-free read. I expect these were also due to memory corruption.

      The CERN data is 2.7 * 10^15 bytes with 152 errors. That would be one error every 18 TB. This is highly unlikely to have been the disks, but it is entirely possible that this is due to in-memory corruption. Of course there is a risk of corruption happening in the disk memory (it is not really possible on, e.g., a SATA bus), but HDDs use SRAM, which has far better reliability than DRAM. It also is smaller and defective cells would show up more frequently.
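
      The two per-error figures above can be reproduced with a few lines of Python. The inputs are just the numbers quoted in this thread, not re-derived from the CERN slides, and the variable names are for illustration only.

```python
TB = 10**12
GB = 10**9

# CERN figures as quoted in this thread.
cern_bytes = 2.7e15
cern_silent_errors = 152
cern_tb_per_error = cern_bytes / cern_silent_errors / TB

# Non-ECC memory anecdote: about 1 GB/hour, roughly 2 bit errors per year.
bytes_per_year = 1 * GB * 24 * 365
memory_tb_per_error = bytes_per_year / 2 / TB

print(f"CERN:   one silent error per {cern_tb_per_error:.1f} TB")
print(f"memory: one bit error per {memory_tb_per_error:.1f} TB")
```

      Both land within an order of magnitude of each other, which is at least consistent with the memory/data-path explanation, though it proves nothing on its own.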

      In addition, the CERN data does not analyze the nature of the error, nor the position, nor the distribution on the machines. It does not state what busses, interfaces and memory were used as main memory and as memory on the RAID cards. It does not state what RAID cards were used. Basically it is useful to show silent errors happen, but cannot explain where they happen or why. It could, for example, have been one single weak bit in one of these 3000 machines. (I had one of these once. Gave about one bit error per day and was a bitch to find in a 24 computer cluster with process migration for load balancing. I finally needed to disable the migration altogether.) It also has no evidence at all that additional on-disk checksums would have helped and generally cannot satisfy even low scientific standards. I do not blame the authors though. Very few people have ventured into these regions of shaky reliability and have the education and experience to understand what is going on.

      Personally, I think checksums on disk are only useful if they are added by the application that produces the data as early as possible. In addition, main memory and caches should have ECC or at least parity, for this to work well. A far better solution is to do a verify read. (Difficult with modern disk buffering...) Silent on-disk failure is something I expect to not be an issue for the foreseeable future.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    10. Re: 1 in 10^14 bit is not what I observe by Nynaeve · · Score: 1

      A bit of googling leads to this PDF which I got from this link. These are more descriptive than the ZFS slide.

    11. Re: 1 in 10^14 bit is not what I observe by gweihir · · Score: 1

      Interesting. Seems that silent corruption is mainly not a disk problem, just as I thought.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  40. My solution by SuperQ · · Score: 2, Insightful

    I'm in the process of building a new 8x 1T array. I'm not using any fancy raid card. Just a LSI 1068E chipset with a SAS expander to handle LOTS (16 slots in the case, using 8 right now).

    I'm not putting the entire thing into one big array. I'm breaking up each 1T drive into ~100GB slices that I will use to build several overlapping arrays. Each MD device will be no more than 4-5 slices. This way if an error occurs on one disk in one part of a disk I will have a higher probability of recovery.

    I may also use RAID 6 to give me more chance of rebuilding.

    Disk errors tend to not be whole disk errors, just small broken physical parts of a single disk.

    SMART will give me more chance to detect and replace dying drives.

  41. What's your beef with RAID 5? by cbreaker · · Score: 3, Insightful

    Seriously - what's the problem with RAID 5? It's not a FALSE sense of security: It actually DOES prevent data loss or down time on a single disk failure. If you're a moron, you're creating 14 disk arrays. If you're smart, you keep it to 7 disks at the very most.

    RAID 5 is great. It's fast, unless you have a shit controller without enough cache. It's going to prevent down time on a single disk failure (which is overwhelmingly the most common type of failure) and it doesn't cost you too much capacity.

    Usually I'm more concerned with a fire or flood than a double-disk failure.

    RAID 6 is good, but you get the same (actually worse) performance hit over RAID 5. More parity calculations. You can lose any two disks, which is nice, and if you can spare the space, go for it!

    I don't see RAID 6 as being all that much more of a big deal over RAID 5 and actually it shouldn't really have its own number since it's exactly the same technology and parity system as 5. It should be RAID 5.1 or something. Or maybe RAID5+1. The only reason it's become more available now is because controllers have gotten fast enough to deal with the additional parity.

    --
    - It's not the Macs I hate. It's Digg users. -
    1. Re:What's your beef with RAID 5? by Luminary+Crush · · Score: 1

      All RAID6 is not created equal.

      RAID6 is a general term used to describe any system with TWO INDEPENDENT types of redundancy such that any two disks in a RAID group can be lost without data loss.

      This is not RAID5.1. RAID6 has two separate calculations.

      Within RAID6 there are different ways to create that second level of disk redundancy. Some schemes use a separate calculation of interleaved parity similar to RAID5. Others (e.g. NetApp) use what they call RAID-DP, which is a form of RAID6-protection using a RAID4 with two parity disks. This is *not* interleaved parity, so it's significantly faster than interleaved RAID6 and faster than RAID5.

    2. Re:What's your beef with RAID 5? by Anonymous Coward · · Score: 0

      Are you sure about that? I was under the impression that the second syndrome used an entirely different formula than simple xors.

    3. Re:What's your beef with RAID 5? by cbreaker · · Score: 1

      Really? Because I thought the main reason for not going RAID 4 was the performance hit of having a single disk getting hammered with parity information all the time.

      RAID 5 is always preferred because it offers the same protection but distributes the parity for better performance.

      --
      - It's not the Macs I hate. It's Digg users. -
    4. Re:What's your beef with RAID 5? by cbreaker · · Score: 1

      The actual implementation may vary, but in its simplest form it's the same as RAID 5 with two sets of distributed parity.

      --
      - It's not the Macs I hate. It's Digg users. -
    5. Re:What's your beef with RAID 5? by phasm42 · · Score: 1

      I like to think of it the opposite way: RAID 5 is a special-case implementation of RAID 6. The parity calculation in RAID 5 works out to be a simple XOR, but the second (and third, fourth, etc) parity calculations in RAID 6 with two or more parities are significantly more difficult.

      If you're interested in implementation, check out The mathematics of RAID-6 (pdf) by H. Peter Anvin, and A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems (pdf) by James S. Plank (I used these as references to write a RAID-6 implementation in Java).
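
      A minimal sketch of that difference, assuming the GF(2^8) construction described in the Anvin paper (generator 2, reduction polynomial 0x11d) and working on a single byte per disk. The function names are made up for illustration, and this is not the Java implementation mentioned above.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by x^8+x^4+x^3+x^2+1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def syndromes(data):
    """Return (P, Q) for one byte taken from each data disk."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d                        # P: plain XOR, as in RAID 5
        q ^= gf_mul(gf_pow(2, i), d)  # Q: disk i is weighted by 2^i
    return p, q


def recover_from_q(data, lost, q):
    """Rebuild data[lost] from Q alone (the case where P is gone too)."""
    partial = q
    for i, d in enumerate(data):
        if i != lost:
            partial ^= gf_mul(gf_pow(2, i), d)
    inverse = gf_pow(gf_pow(2, lost), 254)  # x^254 equals 1/x in GF(2^8)
    return gf_mul(inverse, partial)


data = [0x12, 0x34, 0x56, 0x78]   # hypothetical bytes, one per data disk
p, q = syndromes(data)
rebuilt = recover_from_q(data, 2, q)
```

      Here P needs nothing but XOR, while getting a byte back out of Q takes a field multiplication and an inverse, which is where the extra controller work goes.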

      --
      "No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner
    6. Re:What's your beef with RAID 5? by cbreaker · · Score: 1

      I'm actually quite familiar with RAID-like systems. I've been using Usenet for a long time =)

      I've actually read the second link before, but I'll read the first one just because I love this stuff.

      --
      - It's not the Macs I hate. It's Digg users. -
    7. Re:What's your beef with RAID 5? by phasm42 · · Score: 1

      I'm actually quite familiar with RAID-like systems.

      I guess I should've looked at your recent comment history :p

      --
      "No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner
  42. Error Detection and Correction by Moof123 · · Score: 1

    In communications design, such as cell phones or digital TV, channels with low reliability (fading, burst interference, etc) are tasked with getting much better overall bit error rates than you'd think you could given all the crap spewed into the RF spectrum. I'm kind of confused by why the same techniques of forward error correction, interleaving, and such aren't employed more aggressively for hard drives (maybe they are more than I am thinking, maybe that's how you get to 10^-12 in the first place?). 10^-12 bit error rate is phenomenally good compared to what most digital communications devices deal with.

    Typically you throw away 20-30% of your available channel with extra bits due to the encoding (imagine hardware encoding by the hard drive as it writes the bits), but you are guaranteed that if you can get most of the bits right (I'm talking 99%, not 99.9999999%) you can get the original data back, or at least know that you didn't. Interleaving spreads the bits around, so one dead sector (or 10 in a row) can easily be recovered automatically.
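
    A rough sketch of the interleaving part only, assuming a simple block interleaver: `depth` codewords are laid out as rows and written out column by column. The sizes are arbitrary, and the error-correcting code that would fix the one remaining bad symbol per codeword is left out.

```python
def interleave(symbols, depth):
    """Emit `depth` equal-length codewords column by column."""
    n = len(symbols) // depth  # codeword length
    return [symbols[row * n + col] for col in range(n) for row in range(depth)]


def deinterleave(symbols, depth):
    """Undo interleave(): put every symbol back into its codeword."""
    n = len(symbols) // depth
    return [symbols[col * depth + row] for row in range(depth) for col in range(n)]


depth, length = 10, 8                    # 10 codewords of 8 symbols each
original = list(range(depth * length))

written = interleave(original, depth)
written[20:30] = [None] * 10             # a burst: 10 bad symbols in a row

read_back = deinterleave(written, depth)
errors_per_codeword = [
    read_back[row * length:(row + 1) * length].count(None)
    for row in range(depth)
]
```

    With these numbers the burst of 10 lands as a single bad symbol in each of the 10 codewords, which is the kind of damage a modest error-correcting code is built to repair.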

  43. UnRaid? by speedingant · · Score: 1

    I use Unraid for all my data storage. The parity and data aren't striped across all drives; only one drive is parity. If I lose two disks, I only lose the data on those two disks. I've got 4TB of home storage at the moment, but will eventually scale it. Best thing is that it runs on most hardware, and boots off a memory stick. It happens to run Reiser FS though, so when a HD dies, it really makes your life hard to get any data back. But that's why you have backups, right?

  44. Poor summary by Anonymous Coward · · Score: 0

    If one read error occurs in reconstruction of the array you lose the piece of data it's tied to - not everything. Still get to keep 99.999% of it.

    *eye roll*

  45. Dumbass. by cbreaker · · Score: 2, Insightful

    I guess you should be considered a new age Luddite?

    Are you the same guy that always waits for SP1 before using any software? I thought so.

    RAID is a proven technology and it's used in nearly all business IT systems from big to tiny.

    RAID isn't meant as a replacement to backups. It's one PART of the entire system of preventing unnecessary data loss, and more importantly, down time. You can keep on running your server while the failed disk is replaced and rebuilt.

    So, while I eat Cheetos and surf Slashdot while that RAID array rebuilds itself, you can go ahead and recover your old data from last night all day long while people bitch at you for not using the technology that's been around since the inception of the hard drive.

    If you actually did have the experience you claim, you'd slap yourself for such a stupid fucking post.

    --
    - It's not the Macs I hate. It's Digg users. -
    1. Re:Dumbass. by v1 · · Score: 1

      RAID isn't meant as a replacement to backups

      AMEN! It's unbelievable how many people consider RAID their backup. Raid is to help mitigate hardware failure and protect uptime.

      My system here does a complete block read of all attached devices (not VOLUMES) twice a month, hunting for bad/slow blocks. When one slice fails is not the time to find out that there's an IO error somewhere on another slice. I typically get an email about once every 3 months by the scripts, that another failing drive has been tagged. Have yet to lose a byte.
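
      A rough sketch of what such a scan could look like, not the actual scripts described here. The function name, the 1 MiB block size and the one-second "slow" threshold are all assumptions; reading a raw device normally requires root.

```python
import time


def scan_blocks(path, block_size=1 << 20, slow_seconds=1.0):
    """Read `path` front to back; return (blocks_read, problems).

    problems is a list of (byte_offset, reason) for reads that raised an
    I/O error or took longer than slow_seconds.
    """
    problems = []
    blocks_read = 0
    offset = 0
    with open(path, "rb", buffering=0) as dev:
        while True:
            started = time.monotonic()
            try:
                chunk = dev.read(block_size)
            except OSError:
                problems.append((offset, "io-error"))
                offset += block_size
                dev.seek(offset)  # skip the bad block and carry on
                continue
            if not chunk:
                break             # end of device
            if time.monotonic() - started > slow_seconds:
                problems.append((offset, "slow"))
            blocks_read += 1
            offset += len(chunk)
    return blocks_read, problems
```

      Pointing it at each member device rather than at the assembled volume matches the "devices, not VOLUMES" idea: the goal is to touch every physical block while the redundancy is still intact.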

      --
      I work for the Department of Redundancy Department.
    2. Re:Dumbass. by cbreaker · · Score: 1

      A lot of RAID controllers do this on their own, too. I have a couple SATA RAID cards at home (Accusys) that will scrub periodically and make sure the parity is accurate. If not, it will find the offending drive, mark it as bad and alert you.

      Most RAID controllers do that type of thing now. Modern controllers aren't just dumb parity calculators anymore.

      --
      - It's not the Macs I hate. It's Digg users. -
    3. Re:Dumbass. by v1 · · Score: 1

      We just became a reseller of the Drobo. There's a nice system. Perfect for people that don't want to learn anything. Just plug it in, and in about 5 minutes it's set up and going. It lets you know if there's a problem, turns on an idiot light, you feed it another drive, and forget about it.

      --
      I work for the Department of Redundancy Department.
  46. Confessions of a reformed RAID addict by rs79 · · Score: 5, Funny

    You get your first RAID controller from a trusted friend. "Here" he says "try this" and hands you a Mylex board. It has a 64 bit bus and 3 SCSI LVD connectors. Oooh. That looks fast. So you start ebaying drives, cables, adapters, more controllers, the inevitable megawatt power supply and you mess around with raid 1, raid 0, raid 1+0 and raid 5. Suddenly every system falls prey to RAIDMANIA; eventually for yourself you build a system with 3 controllers, with 3 busses each and a drive on each one of 9 busses. With a controller for swap, one for data and one for the system will Windows now be fast? Yeah, sorta. Those drives sure are quiet - from a click-click busy noise perspective, NOT from a "sounds like a jet airplane when running" perspective. Heat is an issue, too.

    http://rs79.vrx.net/works/photoblog/2005/Sep/15/DSCF0007s.jpg

    But oh my are the failure modes spectacular.

    I just use a laptop now and make several sets of backup DVDs or just copy to spare drives. I love RAID to death. But it's really only marginally worth the effort in the real world. But if you need fast, OMG.

    --
    Need Mercedes parts ?
    1. Re:Confessions of a reformed RAID addict by Grey_14 · · Score: 1

      Ahh, the memories, you're describing my first experience with RAID almost exactly, (Though I ran it on linux of course :P)

    2. Re:Confessions of a reformed RAID addict by Anonymous Coward · · Score: 0

      Stop smoking and clean that fan.

    3. Re:Confessions of a reformed RAID addict by rs79 · · Score: 1

      Some good did come out of it. I standardized on IBM ServeRAID controllers at work each running a pair of redundant drives. I've been more than happy with that, hasn't even hiccuped once in 4 years under FreeBSD. Under XP at home either a drive would go bad, or a controller would go bad or the boot sector would go missing or Windows would just shit itself. This happened every month for 8 months till I gave up and just used 3 or 4 unraided SCSI LVD drives per machine and haven't had a problem since.

      --
      Need Mercedes parts ?
  47. Not to Worry by PingPongBoy · · Score: 1

    I have read and written about 50 TB with my single, non-redundant 160 GB hard drive and I have never lost so much as one bit. This is just a store-bought hard disk without even a hint of longevity. It was the cheapest drive in terms of bytes for the buck - even cheaper than used drives, which tend to be slightly overpriced for people who want to spend next to nothing. I verify all my reads and writes because I handle some large files - no data loss has ever occurred, even though I do not treat my drive lightly. I transport my drive from place to place, run it from various power sources, and use it fearlessly during thunderstorms. Zero information has gone missing.

    Therefore, with a RAID system of redundant 50 TB disks, I could fear nothing.

    --
    Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.
  48. Techies imitating politicians by iminplaya · · Score: 1

    Is this the way it's going to be from now on? Always in full "crisis" mode? OMG! Y2K! No more IPv4 addresses! Spam! Spam! Spam! Spam! Spam! Great for job security, I guess. This whole fear factor thing is getting out of hand.

    --
    What?
    1. Re:Techies imitating politicians by OrangeTide · · Score: 1

      Thus is the state of modern journalism. If you don't sound the klaxon then nobody will hear you.

      --
      “Common sense is not so common.” — Voltaire
    2. Re:Techies imitating politicians by iminplaya · · Score: 1

      How can anybody hear anything in all the racket? What can be done with a billion klaxons going off? The only solution left is to cover your ears.

      --
      What?
    3. Re:Techies imitating politicians by OrangeTide · · Score: 1

      I'm about ready to give up trying to shout over them.

      --
      “Common sense is not so common.” — Voltaire
  49. RTFA by IceCreamGuy · · Score: 1

    Nobody is saying that RAID is a backup. RAID is there to keep you up and running in a business environment when a drive fails, which is, as the author puts it, inevitable. Then he goes on to statistically prove that, while rebuilding an array of currently relevant size for a large business (as in many TB of data), you will almost certainly not be able to recover your array to a healthy state because of an unavoidable, highly probable read error on one of your "healthy" disks. Of course you have a fucking backup of your production 12 TB RAID array. He said what he did about tape backups to drive home the point, which is that your shit will be down, out of production, thereby making the fact that you had your data in RAID 5 completely pointless. The author has a good fucking point: RAID 5 is statistically useless when dealing with disks that large.

  50. Does RAID = Super Fast Read Times? by rea1l1 · · Score: 0

    If I have 5 disks each with the exact same data, is it possible that when I want to access a file it will ask for 1/5 of the data from each drive? Wouldn't this increase drive life and increase read speed by 5x?

    1. Re:Does RAID = Super Fast Read Times? by Sobrique · · Score: 1
      Depends what type of RAID. RAID1 is mirrored disks. Your data is written to two (or more) drives.

      So you have a very large overhead on your disk use - you get one disk worth of storage, using two disks for it.

      However you also get a lot of resilience - you can afford to lose one drive, with no overhead/data loss.

      And you get good read performance - as you surmise, if it can be read from two places, then you read from both drives at once, for a faster overall transfer. 5 copies of the drive, means ... well, not _quite_ 5x transfer rate, but very nearly. (You'll probably hit problems elsewhere in your system, as the combined drive transfer rate exceeds system bus throughput)

      It doesn't help for write performance though, because you still have to write that file to every drive.

      The article is talking about RAID 5 though. That doesn't actually write data in multiple locations in the same sense. It's a bit faster, because a file might have 100k on one disk, 100k on the next, and 100k on the next and so forth - meaning you can read the whole file a bit faster, but it's not a linear speed increase. It can also service multiple file requests faster, as a randomly distributed data request will be using all the drives at once.

  51. 1 Controller Error from Failure + Year Old Story by backtick · · Score: 2, Insightful

    First off, isn't this story a year+ old? Sheesh.

    Second off, if you're worried about URE on X number of disks, what about a single capacitor cooking off on the raid controller? No serious data is stored on a single raid controller system, without good backups or another raid'd system on completely unique hardware. Yes, if you put a lot of disk on one controller and have a failure you have a higher risk of *another* failure. That's why important data doesn't depend on *only* RAID, and why lots of places use mirroring, replication, data shuttling, etc. This isn't new. Most folks that can't afford to rebuild from backups or from a mirror'd remote device also couldn't have used 12TB for anything *but* bulk offline file storage because it's slower than christmas VS a 'real' storage array. Using it for the uber HD DVR? Great. Oh no, you lose X-files's last episodes. This isn't banking data we're talking here.

  52. All data is not of equal value. by John+Hasler · · Score: 3, Insightful

    Prioritize your data. I cannot believe that a home user has 12TB of important stuff. Back up your critical records both on site and off [1]. Back up the important stuff on site with whatever is convenient. Let the rest go hang.

    [1] Use DVDs in the unlikely event you have that much critical data. Few home users will have a critical need for that stuff beyond the life of the media. Any that do can copy it over every five years, and take the opportunity to delete the obsolete stuff.

    --
    Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
    1. Re:All data is not of equal value. by Anonymous Coward · · Score: 0

      Right. I would estimate that for 90% of home users a fatal crash is a PITA, but not even near catastrophic. After a reinstall most things can be re-built/re-collected in rather short time.
      The most annoying would probably be your home digital photos only stored on your hard drive, you'll cry a bit but after that life goes on as usual.
      Most other things, such as the 5000 saved spam messages in your inbox will probably even feel good to finally get rid of, even though you won't admit it.

  53. Raid 6 still good up to 5EB by Anonymous Coward · · Score: 0

    With regards to raid 6 being insufficient for *single* drive failure recovery:
    Raid 6 allows you to recover from any number of read errors as long as no more than one disk in a stripe has a read error.
    The probability of multiple failures on any given byte is thereby:
    1-( (1-10^-14)^7 + 7*(10^-14)*(1-10^-14)^6) =~ 2.1*10^-27

    Therefore, to have a better chance of getting a failed raid 6 than of winning the lottery, you'd need disks in the 5EB range ( (1-2.1*10^-27)^X=1-10^-8 gives X=~ 5*10^18).
    In these circumstances, the probabilities are negligible in comparison with those of a second total drive failure during recovery.
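
    The arithmetic above can be checked with a short sketch. It takes the post's own figures as given (a 10^-14 error probability per read, 7 disks, a 10^-8 "lottery" threshold) and uses exact fractions, because ordinary floats would round a difference of about 10^-27 away to zero.

```python
from fractions import Fraction
import math

p = Fraction(1, 10**14)  # per-read error probability, as in the post
disks = 7

# P(no disk errors) + P(exactly one disk errors) at a given position
survivable = (1 - p) ** disks + disks * p * (1 - p) ** (disks - 1)
multi_failure = float(1 - survivable)

# Solve (1 - multi_failure)^X = 1 - 1e-8 for X
x = math.log1p(-1e-8) / math.log1p(-multi_failure)
```

    `multi_failure` comes out at roughly 2.1e-27 and `x` at a little under 5e18, in line with the numbers quoted above.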

  54. Scrub your arrays by macemoneta · · Score: 4, Interesting

    This is why you scrub your RAID arrays once a week. If you're using software RAID on Linux, for example:

    echo check > /sys/block/md0/md/sync_action

    The above will scrub array md0 and initiate sector reallocation if needed. You do this while you have redundancy so the bad data can be recovered. Over time, weak sectors get reallocated from the spare bands, and when you do have a failure the probability of a secondary failure is very low over the interval needed for drive replacement.

    Most non-crap hardware controllers also provide this function. Read the documentation.

    --

    Can You Say Linux? I Knew That You Could.

    1. Re:Scrub your arrays by kyubre · · Score: 3, Informative

      I worked at Maxtor up till 2006 and had the privilege of being able to play with several RAID controllers, which coincidentally is how I got started with Linux at home (software RAID). At the time, and mind you I only had 160 GB and 250 GB drives to play with, I built a number of RAID-5 arrays up to 2 TB.

      When people think about RAID failure, they generally think about a hardware failure: a sector that can't be read, etc. Those are only the "obvious" problems. Even under ideal conditions, the 1e15 - 1e17 error rates published by the disk drive vendors also include data errors that ARE NOT detected in hardware. It does not take a sector read failure to generate a data miscompare.

      What I found back in '06 is that with a 2 TB RAID 5 made up of 8 drives, there was about a 10% probability of a RAID data failure every time the array was read, sector by sector, for the entire 2 TB span. That implies that in the event of a real disk failure, there was about a 10% probability that the rebuild would fail because of an otherwise undetected data read error.

      I am not sure where the state of the art is with Linux software RAID, and perhaps the "scrub" operation mentioned above does the trick, but the biggest failing in the RAID systems I have used is that when a data error occurs, the algorithms don't/didn't calculate the missing block and write it back to the failing device, giving it a chance to push off the sector in error. Most disk drives can "heal" most of the common problems in a RAID system. What's missing is background grooming that deals with a missing data slice and gives the device the chance to recover from it, while alerting the admin that a problem was "handled".

      It's not the 3%/year hard disk failure we should be worried about, it's the corrected error rate. 1e15 is very unforgiving when you are talking about terabytes... As long as RAID doesn't do the "right thing" and try to recapture the missing data, RAID-5 is in trouble.

      --
      Nothing evolves faster than the word of god in the minds of men who think themselves divinely inspired.
    2. Re:Scrub your arrays by mzs · · Score: 1

      Or just replace an old drive with a new drive periodically so that no drive is ever more than two years old.

    3. Re:Scrub your arrays by Slashdot+Parent · · Score: 1

      My understanding is that the RAID scrubbing attempts to read and verify the checksum of every sector of every physical partition in your array. Any read that fails gets the block recomputed, rewritten, and reread. If that fails, then the disk is marked as failed and is removed from the array.

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
    4. Re:Scrub your arrays by kyubre · · Score: 1

      I haven't checked out software RAID on Linux since '06. Made some desperate pleas to the Kernel mailing list, but could not take part as an employee from Maxtor (GPL, lawyers, patents, "oh my")... It's the right strategy, will have to go back and play with it. However, it should be an activity that is always running in the background without requiring an Admin or cron job to fire it off...

      --
      Nothing evolves faster than the word of god in the minds of men who think themselves divinely inspired.
    5. Re:Scrub your arrays by JayAEU · · Score: 1

      Or just replace an old drive with a new drive periodically so that no drive is ever more than two years old.

      This is exactly what I've been doing for the past few years. Every 1 - 2 years, I buy whatever disk has the best GB/$ ratio and is bigger than my currently smallest drive. This gets integrated into the array, ensuring that all checksums are actually rechecked. In the process, I grow the RAID array to make use of the new space, thus keeping track of Moore's law in a way.

      The old drive still gets used for experimentation purposes in other PCs or is sold off on your favourite auction site after intensive scrubbing with dd.

    6. Re:Scrub your arrays by Slashdot+Parent · · Score: 1

      it should be an activity that is always running in the back ground without requiring an Admin or cron job to fire it off...

      I agree with you that that ought to be an option. Perhaps some low-priority process that slowly and continuously scrubs the arrays.

      I guess the idea of a cronjob that kicks off this process is that it's a balance between early detection of a failed disk, and subjecting your disks to excessive, (perhaps) unnecessary usage.

      Anyhow, as I said in a different post, the monthly scrub of all arrays is the default behavior in Debian. You'd have to intentionally disable it to be unprotected.

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
  55. RAID 5 for home user?! by Phizzle · · Score: 1, Insightful

    Isn't it more cost effective to do RAID 1, with a nightly backup to an external? At least in my home, I do not require mission critical hot-swapping capabilities. Then again I only have 3x 1TB hard drives. Also, after RTFAing: the author of the article assumes that an unrecoverable read error corrupts your RAID array. It does not; typically your bad sector gets added to the list and mapped out of being used. Speaking of used, the article assumes that the entire drive is being used, but if the error is on a part of the drive not covered with data, this is also a non-issue.

    --
    I will not be pushed, filed, stamped, indexed, briefed, debriefed or numbered. My life is my own.
    1. Re:RAID 5 for home user?! by Sobrique · · Score: 1
      Do you mean RAID 1? That's mirrored disk, and that's generally more expensive than RAID 5 - you need (at least) twice as much raw storage, as usable storage.

      So you'd stick 2x 1TB drives in a RAID1 to get 1TB of usable storage.

      2 disk RAID5 doesn't make much sense, as RAID 5 'costs' one disk out of the set, but you could do a 2+1 for 2TB of usable, RAID protected storage, on your 3x 1TB drives.

      RAID5 is a cost compromise between the factor 2 that you need to do mirroring, and the resilience it provides. A 4+1 RAID costs 1 disk, but protects 4, and you need two failures out of 5 to take it down. RAID1 needs at least two failures - you need both mirrors to fail, so it's possible that even with half your drives dead, you're still ok.
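
      The capacity side of that trade-off, as a small sketch. The helper name is made up, it assumes equal-sized disks, and it treats RAID 1 as a single mirror set; the RAID 6 line is included only for comparison with the rest of the thread.

```python
def usable_tb(level, disks, size_tb):
    """Usable capacity in TB for `disks` equal drives of `size_tb` each."""
    if level == 1:
        return size_tb                # every disk holds the same data
    if level == 5:
        return (disks - 1) * size_tb  # one disk's worth goes to parity
    if level == 6:
        return (disks - 2) * size_tb  # two disks' worth go to parity
    raise ValueError("only RAID 1, 5 and 6 are covered in this sketch")


mirror_pair = usable_tb(1, 2, 1)   # 2x 1TB mirrored
raid5_2plus1 = usable_tb(5, 3, 1)  # the 2+1 set on 3x 1TB
raid5_4plus1 = usable_tb(5, 5, 1)  # the 4+1 set
```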

      For my purposes as a home user, I've got a really quite small set of files that I actively care about losing. Most of my storage is 'assorted junk' like game installs, videos, music files, that kind of thing.

  56. Simple formula by Anonymous Coward · · Score: 0

    How much is your data worth? Spend accordingly.

    $ = D/?

    money spent on backup is equal to value of data divided by %variable%

  57. RAID 6 by Anonymous Coward · · Score: 0

    "Apparently, RAID 6 isn't far behind"

    RAID 6 is not new, 3ware has had it for some time now.

  58. It doesn't matter for a home user by Anonymous Coward · · Score: 0

    I'm self employed, and my computer(s) hold both personal home stuff as well as work files. My work files mainly consist of translated documents in Word, Excel, PPT and PDF format. My accounting data is a couple MB in size, although I have it all printed out on paper due to legal requirements.

    That said... for any home user with 12TB of data (12TB!? WTF!?), I'm willing to bet that the important stuff, like me, will all fit on a DVD. Maybe even a CD. Make a habit to burn a new DVD/CD on Friday evening, and keep up to 5 generations. That's a maximum of $5 per month spent on backups. Dirt cheap, and relatively effective. If you don't want to lose more than a day's worth of data, just make duplicate copies to a thumbdrive whenever you feel the need.

    The majority of files on a large HDD that are so big that DVD/CD backups aren't realistic, are files that probably will cause an inconvenience at best when a disk fails. (Again, I'm talking about home users, not corporate users.) Movies, music, home videos, Games, etc. You have the original media, yes? I'm betting that anyone that has more than 2TB of storage, and is nearly saturating it, is doing a lot of P2P downloading. Disregarding the legality of it for a moment, this data can be gotten again. It will take time, but it isn't lost for good.

    If you have more than 100GB of data that is critical (I'm not talking about getting your panties in a knot over a few lost seasons of a TV show), then it's time to start thinking really, really hard about investing in a serious (and very expensive) backup system. Spare hard drives seem to be cheaper for smaller volumes of data. Just have 5 external HDDs where you can dump everything. If you have larger volumes of data, there's tape (also expensive). It's a matter of how much you value your data.

    If you're self employed doing 3D modeling and rendering or other media related work where your livelihood is at the mercy of large volumes of data being intact, you need to make the necessary investment. If your sales don't rake in enough money to cover the essential equipment you need for your trade, you need to re-examine your business model.

    Long story short, most people don't have 12TB of important data. Most people don't have 12GB of data for that matter! Put in that perspective, backups aren't hard, or expensive.

  59. Sets clock to 1911 by Junior+J.+Junior+III · · Score: 1

    I should be safe now, right?

    --
    You see? You see? Your stupid minds! Stupid! Stupid!
  60. Dumb article by Lost+Penguin · · Score: 1

    RAID is not a backup solution.
    For those administrators who think it is, you should keep your resume on the array.

    --
    I am the unwilling control for my Origin.
  61. ZFS has its own issues by Anarke_Incarnate · · Score: 1, Offtopic
    I have commented on them a lot, namely the default block size for reads as well as the ARC buffer size issues that need kernel tuning to fix, but SHOULD be fixed, and after only 5 updates to Solaris 10.....

    btrfs should be an option on Linux, for those who care to go that route

  62. Punch Cards by vldragon · · Score: 3, Funny

    I used to use the old punch card system to backup my data. Sure it takes a while but it was totally worth it... Until one day while attempting to move the many boxes full of carefully sorted cards I fell down the steps and the cards went everywhere. I learned from that mistake and started writing everything down on paper... Lots o' 1's and 0's, my hand hurt.. A lot. But there was a fire at my off site :( so I had to resort to the ultimate old school back up. A chisel and a rock... a really really big rock.

    --
    Eating the brains of your enemies does not make you smarter. But it's still fun.
  63. Don't panic! by Joce640k · · Score: 4, Insightful

    RAID 5 will still be orders of magnitude more reliable than just having a single disk.

    --
    No sig today...
    1. Re:Don't panic! by Anonymous Coward · · Score: 5, Insightful

      No, it won't. That's the point of this not-news article. It's getting to the point where (due to the size of the disks) a rebuild takes longer than the statistically "safe" window between individual disk failures. Two disks kick it in the same timeframe (the chance of which increases as you add disks) and you're screwed.

      A poorly designed multi-disk storage system can easily be worse than a single disk.

    2. Re:Don't panic! by nine-times · · Score: 4, Informative

      How reliable RAID5 is depends, because actually the more disks you have, the greater the likelihood that one of them will fail in any set period of time. So obviously if you have a RAID 0 of lots of disks, then there is a much better chance that the RAID will fail than that any particular disk will fail.

      So the purpose of RAID5 is not so much to make it orders of magnitude more reliable than just having a single disk, but rather to mitigate the increased risk that would come from having a RAID0. So you'd have to calculate, for the number of disks and the failure rate of any particular drive, what are the chances of having 2 drives fail at the same time (given a certain response rate to drive failure). If you have enough drives and a slow enough response to disk failures, it's at least theoretically possible (I haven't done the math) that a single drive is safer.
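
      A back-of-the-envelope version of that math, as a sketch under loud assumptions: a 3% annual failure rate per drive (the figure quoted elsewhere in this thread), a hypothetical 24-hour window to replace and rebuild, independent failures at a constant rate, and no read errors during the rebuild.

```python
import math


def p_any_failure(disks, afr, hours=8760.0):
    """Chance that at least one of `disks` drives fails within `hours`."""
    rate_per_hour = -math.log1p(-afr) / 8760.0  # constant failure rate
    return -math.expm1(-disks * rate_per_hour * hours)


afr = 0.03            # assumed annual failure rate per drive
disks = 7
rebuild_hours = 24.0  # hypothetical replace-and-rebuild window

p_first = p_any_failure(disks, afr)                      # some drive dies this year
p_second = p_any_failure(disks - 1, afr, rebuild_hours)  # another dies mid-rebuild
p_raid5_loss = p_first * p_second                        # rough yearly RAID 5 loss
p_raid0_loss = p_first                                   # RAID 0 dies with the first drive
```

      With these particular inputs the RAID 5 figure lands well below the 3% of a single drive, while the RAID 0 figure is around 19%. A slower response or more drives pushes `p_second` up, and the sketch leaves out the read-error problem the article is actually about.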

    3. Re:Don't panic! by BrokenHalo · · Score: 0, Troll

      A poorly designed multi-disk storage system can easily be worse than a single disk.

      Certainly. Though I can't speak for RAID-5 from personal experience, never having used it. I've always thought of that as so complicated, I find it difficult to trust, especially since there is no parity checking.

    4. Re:Don't panic! by bstone · · Score: 4, Insightful

      Using the same failure rate figures as the article, you WILL get an unrecoverable read error each and every time you back up your 12 TB of data. You will be able to recover from the single block failure because of the RAID 5 setup.

      With that kind of error rate, drive manufacturers will be forced to design to higher standards, they won't be able to sell drives that fail at that rate.

    5. Re:Don't panic! by Sillygates · · Score: 5, Insightful

      The mathematical theory behind raid5 is not complicated at all. http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5

      And there is parity, that's how raid5 works.

      You are probably referring to "silent" errors, which for performance reasons aren't read/detected by most raid5 implementations. And in reality there is little reason to actively read parity, unless they are running/recovering in degraded mode: Sure, you'll be informed that there is data corruption, but there is no way to tell whether the parity, or the original data is at fault (though it's true, some implementations will scrub/update the parity to match the original data on an occasional basis).

      I don't see a single set of raid5 disks as a backup solution by any measure though (disk reliability is only one aspect of this, hardware/driver/filesystem bugs can also cause hard or impossible to detect corruption), but it is a great 'best effort' to prevent a bit of downtime on high availability disks.

      --
      I fear the Y2038 bug
    6. Re:Don't panic! by Tracy+Reed · · Score: 1, Informative

      You seem to misunderstand the article. They are saying that if you need 12T of storage RAID 5 is not reliable. You would be better off with a single 12T disk if such a thing existed.

      With 7 brand new disks, you have ~20% chance of seeing a disk failure each year.

      SATA drives are commonly specified with an unrecoverable read error rate (URE) of 10^14. Which means that once every 100,000,000,000,000 bits, the disk will very politely tell you that, so sorry, but I really, truly can't read that sector back to you.

      So now you can't rebuild your array. And there is a 20% chance of this happening every year. If you had a single disk your chance of total disk failure averages 3%. In this case you are better off having one disk and making good backups. Or perhaps a mirror or even a 3-way mirror if the system is smart enough to read data off of the other disk in the event that one returns a URE.

    7. Re:Don't panic! by Allador · · Score: 4, Insightful

      You seem to misunderstand the article. They are saying that if you need 12T of storage RAID 5 is not reliable. You would be better off with a single 12T disk if such a thing existed.

      That's not what the article says at all.

      The article says that if you build your RAID arrays from the biggest disks available (which no one with half a brain does) like 1-3TB drives, and you have them filled, then the numbers come out as presented.

      But there's a reason why no one on the planet builds important raid arrays out of 1TB drives. Rebuild time is too long.

      This is also one of the big reasons why you see so many 73GB and 140GB SAS/SATA drives in raid arrays, and why server storage drives don't grow anything like as fast as consumer garbage drives.

    8. Re:Don't panic! by Eivind · · Score: 5, Insightful

      Yes. It's amazing that the article presents the basic point so horribly poorly. The problem is not the capacity of the disks.

      The problem is that the capacity has been growing faster than the transfer bandwidth. Thus it takes a longer and longer time to read (or write) a complete disk. This gives a larger window for double-failure.

      Simple as that.
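      To put rough numbers on that window, here is a small Python sketch. The capacities and sustained transfer rates are hypothetical round figures picked for illustration, not measurements of any particular drive.

```python
def full_read_hours(capacity_bytes, throughput_bytes_per_s):
    """Hours needed to stream an entire disk once at a sustained rate."""
    return capacity_bytes / throughput_bytes_per_s / 3600

GB = 10 ** 9
MB = 10 ** 6

# Hypothetical drive generations: (capacity, sustained throughput)
generations = [(73 * GB, 60 * MB), (1000 * GB, 100 * MB), (2000 * GB, 100 * MB)]
for capacity, rate in generations:
    hours = full_read_hours(capacity, rate)
    print(f"{capacity // GB:>5} GB at {rate // MB} MB/s: {hours:.1f} h")
```

      A real rebuild on a busy array will usually take longer than this best case, since the disks are also serving normal I/O.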

    9. Re:Don't panic! by Solra+Bizna · · Score: 1

      there is no way to tell whether the parity, or the original data is at fault

      Incorrect.

      Take a four-disk RAID-5 array, with disks A, B, C, and P. There are four "copies" of the data: {A,B,C}, {B^C^P,B,C}, {A,A^C^P,C}, and {A,B,A^B^P}.

      Randomly flip a bit on disk C. You've just changed the first three copies, but not the fourth one. In this situation, it is possible to say with reasonable certainty that the three copies that agree are wrong and the fourth one is correct.

      -:sigma.SB

      --
      WARN
      THERE IS ANOTHER SYSTEM
    10. Re:Don't panic! by drsmithy · · Score: 3, Informative

      The problem is that the capacity has been growing faster than the transfer bandwidth. Thus it takes a longer and longer time to read (or write) a complete disk. This gives a larger window for double-failure.

      No, the point is that (statistically speaking) you can't actually read all of the data without having another read error.

      Whether you read it all at 100MB/sec or 10MB/sec (ie: how long it takes) is irrelevant (within reason). The problem is that published URE rates are such that you "will" have at least one during the rebuild (because of the amount of data).

      The solution, as outlined by a few other posters, is more intelligent RAID5 implementations that don't take an entire disk offline just because of a single sector read error (some already act like this, most don't).

    11. Re:Don't panic! by NormalVisual · · Score: 2, Insightful

      This is also one of the big reasons why you see so many 73GB and 140GB SAS/SATA drives in raid arrays

      Didn't you mean SAS/SCSI? Most of the servers I've seen with smaller disks have been one of those, at rather brisk spindle speeds.

      --
      Please stand clear of the doors, por favor mantenganse alejado de las puertas
    12. Re:Don't panic! by Allador · · Score: 1

      Yes, good catch. That's exactly what I meant.

    13. Re:Don't panic! by Sobrique · · Score: 1
      Depends if you're comparing the complete data loss, or any data loss figures.

      If I take 5 drives, and slap some data on each of them without any form of RAID whatsoever, one failure will cost me some data. But it won't cost me _all_ my data - I'll still have 4 drives with some stuff on them, that's as usable as it was.

      If I RAID5 across those 5 drives, I lose a bit of space for doing so, but what I gain is that no one failure will cost me data. It requires two drives to go down, within the time it takes to repair the RAID. That time is ... well, quite variable - some systems use hot spares that come online immediately and start rebuilding. Others you'll have to wait a week as you go buy a new disk, swap out the failure and do the rebuild by hand.

      So you may be right - you lose more data if a RAID 5 goes down, vs. the JBOD. You're also losing performance on your raid5 when it's degraded, because it's having to do on the fly parity calculations. This is entirely dependent on how long the replacement operation takes you though, and the probability of a second failure in the set during that window.

      And of course, the relative acceptability of 'partial' data loss - I don't know about you, but it doesn't actually matter to me if I'm doing a restore of 1 drive or 5, my service is still hosed, but there's some environments where it might be OK. Restore 1/5th of your users mailboxes from last night's backup, as the other 4/5ths are working fine and haven't had a service interruption.

    14. Re:Don't panic! by segedunum · · Score: 1
      You've got the wrong end of the stick somewhat.

      The article says that if you build your RAID arrays from the biggest disks available (which no one with half a brain does) like 1-3TB drives, and you have them filled, then the numbers come out as presented.

      It's the total storage that is being talked about in the article, not the size of the disks, so no, the article doesn't say that. The problem is not the size of drives but the storage requirements that exist today. Regardless of how many drives you have then you are very, very likely to have a failure somewhere due to the amount of data that is being stored. In reality, if you have a few 1TB disks versus several smaller drives then it really is swings and roundabouts whether you are more at risk from having a failure because of having more drives. It's not inconceivable that you are at greater risk.

      You can't get away from this by having smaller drives in larger quantities. It's the storage requirements that are now a problem.

      But there's a reason why no one on the planet builds important raid arrays out of 1TB drives. Rebuild time is too long.

      RAID arrays with lots and lots and lots of disks, yes - just about. Rebuild time is too long. However, RAID arrays with a few drives can get away with it comfortably and mitigate some risk with 5EE or 6. However, if you are storing terabyte upon terabyte of data with smaller disks then a not insubstantial rebuild time is going to be seen there as well and you've got a greater chance of it happening with more disks.

      This is also one of the big reasons why you see so many 73GB and 140GB SAS/SATA drives in raid arrays

      No, that's not the reason at all. The reason is because such drives are more expensive to produce, have much higher spindle speeds and need greater performance. This dramatically reduces the kinds of storage you can have. If you have terabytes of data kicking around then you are going to need a lot more drives here, and as such, the chances of you encountering a failure go up as you increase the number of drives to achieve the storage requirement you have.

      Either way, storage requirements today are increasing the chances of failure quite a bit regardless of whether you make up your multi-terabyte storage from fewer larger disks or more smaller disks. What we need is a more reliable storage technology within disk drives at an affordable price because that's where the problem is. The biggest candidate for that are SSDs.

    15. Re:Don't panic! by Lumpy · · Score: 1

      Yeah and statistically my 72 drive raid 50 is so unreliable that it fails every 3 minutes.

      It's an article about raid predicting doom written by a guy that knows nothing about raid.

      Your note is the kicker: a POORLY designed system CAN be worse. You are 100% correct. Some people do raid 50's in a way that if you yank one powervault cable the raid eat's it's self. because they are too cheap to raid 5 the powervaults. so they buy 2 of them, and then they dont hire an expert at $140.00 an hour they have one of the MCSE idiots at $18.00 an hour configure it over the next week and then wonder why it ate it's self when they had a failure.

      --
      Do not look at laser with remaining good eye.
    16. Re:Don't panic! by Lumpy · · Score: 1

      I need to correct you....

      "more intelligent RAID5 implementations that don't take an entire disk offline just because of a single sector read error (some already act like this, most don't)."

      Most DO.. most real raid cards don't. Problem is that most raid cards sold are complete crap software-based junk from weird companies. People buy that $129.00 economy raid card and don't buy the $980.00 raid card.

      Guess what that good card will work most of the time (It's got a ram cache and a battery backup for that ram for a reason kiddies.)

      so most raid cards sold are low grade junk, and people expect them to work like the real deal..... they dont. It's only an illusion of raid.

      --
      Do not look at laser with remaining good eye.
    17. Re:Don't panic! by Anonymous Coward · · Score: 1, Insightful
      eat's it's self

      Aaaah it burns! It burns! Make it stop, please, make it stop!

    18. Re:Don't panic! by QuantumRiff · · Score: 1

      I don't think the Original poster understood the problem all that well either. By their own description of the problem, it would be impossible to back the data up to tape without error as well. Imagine if every time you did a backup, a bit got flipped somewhere.

      --

      What are we going to do tonight Brain?
    19. Re:Don't panic! by drerwk · · Score: 1

      I agree with what you say; but there is the issue of what causes the failure. Suppose that the disks are overheating; then the fact that it takes 10 times as long to read them will increase the likelihood of a failure during that longer time. Your argument assumes that there is already a problematic sector which is just waiting to be read and cause an error, while GP's argument seems based on a new failure being a function of time, so that a longer time increases the likelihood of a new failure.

    20. Re:Don't panic! by sarkeizen · · Score: 2, Insightful

      It's an article about raid predicting doom written by a guy that knows nothing about raid.

      He's correct in most things. I'm just not sure I agree with him on his dates and although I expect your example is supposed to be funny it's probably better to pick one that applies. If you read the article you'll see that depending on how many drives you have per RAID5 unit your error rate may be acceptable. However Robin makes the pretty observant point that you are essentially paying more for less protection as raid drives grow in size.

      So things he's correct on:

      Drives fail (enterprise or otherwise) at about 3% per year.
      URE do occur but the 1 per 12TB of data read quantity is for SATA drives.

      Questionable things:

      RAID controllers probably don't read the entire surface during a rebuild but rather just the parity portions of the disk. This means in a RAID5 of 1TB disks. You are reading 1TB of data. Which would likely mean that you have a 1 in 12 chance of getting an URE. This may be an acceptable risk for some.

      The assertion that it's the "end of raid 5" is a little severe. A RAID50 mitigates the risk and the functions for calculating your parity data can be extended arbitrarily HOWEVER this is always at the expense of performance.

      The rate of disk growth may not follow the predicted pattern.

      Red Herrings(?):

      Does the controller take the array offline if it encounters an URE during rebuild or does it continue? This may change the result from being a system halt to data corruption but neither are unacceptable in the enterprise IMHO.

      The good argument underlying "doomsday dates" is that it seems reasonable that drive size is increasing at a much faster rate than these two figures are decreasing. Which means as storage needs grow the size of drives deployed will also likely grow but there is now an extra expense to consider.

    21. Re:Don't panic! by torkus · · Score: 1

      Just to jump in here...you can't read parity data and invent the missing bits. You need to read the n-1 data bits + parity to work out your missing bit.

      I do agree though that this is little more than a doomsday article - the techie's version of 'What's in your hand right now that may be killing your children - News at 10!' nonsense. If you really did get 1 unrecoverable error out of every 12TB read we'd have an awful lot more data loss on personal computers.

      --
      You can get rich if you own a politician, but you have to be rich to buy one in the first place.
    22. Re:Don't panic! by Anonymous Coward · · Score: 0

      eat's it's self

      Seriously? I've given up trying to correct people on the its vs. it's issue, but eat's? WTF?

      Do you just randomly sprinkle an apostrophe here, there and everywhere?

    23. Re:Don't panic! by rgviza · · Score: 1

      >A poorly designed multi-disk storage system

      or unmonitored one ; )

      People are really stupid sometimes.

      -Viz

      --
      Don't kid yourself. It's the size of the regexp AND how you use it that counts.
    24. Re:Don't panic! by sjames · · Score: 1

      Exactly. The UNreliability of RAID5 approaches that of a single disk as the size increases. RAID6 improves your odds, but at even larger sizes, the same principles bring it down too.

      The odds can be helped a bit by using disks of different ages in the RAID (though for the largest sizes, there are no older disks) so they at least don't all reach the far end of the bathtub curve at the same time.

      Faster I/O helps by reducing the window where another failure causes loss, but capacity is growing a lot faster than speed and to some extent, faster means less reliable.

    25. Re:Don't panic! by BlackSnake112 · · Score: 1

      Wouldn't those massive cached systems help? DEC, er, Compaq, now HP used to sell an external hard drive array. Each shelf could hold 8 hard drives or power supplies and the box could hold six shelves. We usually set them up with six drives per shelf and two power supplies per shelf for fail over. There were two controllers at the bottom, SCSI or fiber (depending on which controller you ordered). I remember adding 2GB of memory for cache for each controller. The controllers had a total of 4GB of memory. This was back in 1999. I would not be surprised (if these are still sold) to see 8-16+ GB of memory on those controllers today.

      If you add a lot of cache on the controllers shouldn't that help with the write speed for the drives?

    26. Re:Don't panic! by sjames · · Score: 1

      The URE rate isn't as large a factor as it might appear. You can't sell large disks where a complete read will inevitably turn up a failure. It's why you don't see double height 5TB disks now.

      For RAID, the issue is the odds of a fault developing during the rebuild time (including the time to replace the disk if there isn't a hot spare). Since capacity is growing faster than reliability and transfer speed, it's getting to the point where the odds are too high that a RAID5 will suffer a second fault before the spare can be written. Of course, those odds have never been zero.

      In the old days where drives were small compared to the transfer rate and failure rate, losing a disk in a RAID5 meant order a new disk now. These days you need to have that spare on hand. Some keep it as a hot spare, but personally, I prefer to just use RAID6 in that case (might as well keep it just a bit hotter!) and have a cold spare on hand and ready to go. If the location is unattended, it's best to go RAID6 with a hot spare and an immediate service call should a drive fail.

    27. Re:Don't panic! by sjames · · Score: 1

      Actually, it won't. Cache reduces the latency for sequential partial reads, but won't help when you must stream more data than the cache size because the cache can be filled no faster than the max read speed of the drive and cannot be drained any faster than the max write speed of the drive (when doing a drive to drive copy).

      In other words, the calculations all assume that you have enough cache so that you can utilize the drive's max transfer. Having less cache than that WILL slow things down though.

    28. Re:Don't panic! by Allador · · Score: 1

      It's the total storage that is being talked about in the article, not the size of the disks, so no, the article doesn't say that. The problem is not the size of drives but the storage requirements that exist today. Regardless of how many drives you have then you are very, very likely to have a failure somewhere due to the amount of data that is being stored.

      I'm not sure how you get that from the article.

      The core issue that it's talking about (which is old news) is the ratio between rebuild time in a disk failure and error rates.

      With very large disks, even with a very low error rate, if you read the entire disk's worth of content, you're bound to hit an error (statistically). The total number of spindles also increases failure rate, but that doesn't matter. If you keep your spindles reasonably sized, then your rebuild time for a single disk failure is low, and your risk of a two-drive failure is low.

      The article's core argument isnt about having errors, its about the likelihood of getting a second drive failure while the array is rebuilding as a result of the first. Mathematically, this has absolutely nothing to do with total storage size, and ONLY has to do with the spindle size of the failed drive.

      However, if you are storing terabyte upon terabyte of data with smaller disks then a not insubstantial rebuild time is going to be seen there as well and you've got a greater chance of it happening with more disks.

      How exactly do you get this? Rebuild time has zero to do with total storage size and everything to do with the size of the failed disk (and the controller/bus/drive speed, but we assume thats equal).

      It's a simple function of the failed disk size, the variable of total storage size isnt even part of the equation.

      No, that's not the reason at all. The reason is because such drives are more expensive to produce, have much higher spindle speeds and need greater performance.

      That is absolutely the reason. Even if Dell or HP offered servers with 1TB SAS drives, I would never buy them that way. It's a newb mistake to build important raid arrays out of drives that size, due to the rebuild time and overall performance. In a raid array you get performance out of additional spindles and being able to serve multiple simultaneous read requests off of that array (due to some reads only having to hit one or two disks).

      If you fill it with 1TB or 2TB drives, then not only will your rebuild time be unacceptably high, but you lose performance gains, as your contention against the same spindle for a set of requests goes up.

      This dramatically reduces the kinds of storage you can have. If you have terabytes of data kicking around then you are going to need a lot more drives here, and as such, the chances of you encountering a failure go up as you increase the number of drives to achieve the storage requirement you have.

      Which is why when you need really large storage, you build pools of volumes made out of multiple arrays.

      There's a reason why all your high end nas and san vendors build their systems this way. They're made up of reasonable sized disks, grouped in redundant arrays of 5-7 (mostly), and then these groups are all pooled together.

      It's done this way not only because that makes for more expensive products, but because it works! It's the practical way to build these things.

      What we need is a more reliable storage technology within disk drives at an affordable price because that's where the problem is. The biggest candidate for that are SSDs.

      I do agree that we need something better, but I would say new kinds of storage software is the way of it, like what ZFS has done, and what google does with their massively redundant storage. SSD's may help (though I'm not sure how their real-world failure rate compares to spinning disks), but I think these sorts of new approaches will be the important jump.

    29. Re:Don't panic! by nine-times · · Score: 1
      Well I'm really saying 2 things:
      1. The purpose of RAID5 isn't to minimize the risk of failure, but only to mitigate an increased risk of failure. If you want to minimize the risk of failure, then you maximize redundancy, and will probably prefer to choose RAID1 (or RAID 10, or more generally something that uses mirroring rather than parity).
      2. It's (at least technically) possible for a RAID5 to have a greater likelihood of failure and any data loss than a single drive.

      So to exaggerate the situation, imagine you could choose to use a single 1TB drive, or a RAID5 of 1,001 1GB drives. It's true that with the 1TB drive, you'd only need one drive to fail in order to lose your data, whereas with the RAID you'd need 2 drives to fail. However, I'd bet that the RAID is much more likely to fail in a given time period.

      In case it's not immediately apparent why, let's imagine that all the drives you're talking about had a .1 chance of failure within 10 years. So if those odds played out at the end of 10 years, then it's likely that 100 of those drives in the RAID will have gone bad within 10 years, and all you need is for 2 of those 100 to go bad at roughly the same time. Make sense?

      On the other hand, I agree that if you're going to use a small number of disks, RAID5 may be safer than using those disks completely independently. If you had 3 500GB disks and only needed 1TB, but wanted a little extra safety, then RAID5 would be a good choice.
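      The 1,001-drive thought experiment can be roughed out in Python. This is a back-of-the-envelope sketch, assuming independent drives with a constant failure rate and a hypothetical one-day rebuild window; neither assumption comes from real drive data.

```python
import math

def expected_failures(n_drives, p_fail):
    """Expected number of drives that fail over the whole period."""
    return n_drives * p_fail

def p_double_failure(n_drives, p_fail, years, rebuild_days):
    """Rough chance that, at some point in `years`, a second drive dies while
    a first one is still being rebuilt.

    Assumes independent drives with a constant failure rate, which real
    drives do not strictly follow.
    """
    rate_per_day = -math.log(1 - p_fail) / (years * 365)  # per-drive hazard
    # Chance that one of the other drives fails inside one rebuild window
    p_overlap = 1 - math.exp(-(n_drives - 1) * rate_per_day * rebuild_days)
    # Treat each expected first failure as an independent trial
    return 1 - (1 - p_overlap) ** expected_failures(n_drives, p_fail)

print(expected_failures(1001, 0.1))                     # about 100 dead drives
print(round(p_double_failure(1001, 0.1, 10, 1), 2))     # RAID5 of 1,001 drives
print(0.1)                                              # the single 1TB drive
```

      With these made-up inputs the big array is far more likely to lose data over 10 years than the single drive, and a longer rebuild window only makes it worse.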

    30. Re:Don't panic! by evilbessie · · Score: 1

      RAID controllers probably don't read the entire surface during a rebuild but rather just the parity portions of the disk. This means in a RAID5 of 1TB disks. You are reading 1TB of data. Which would likely mean that you have a 1 in 12 chance of getting an URE. This may be an acceptable risk for some.
      How exactly do you expect the parity partition, which being RAID 5 is split across every disk, to work out what the 'lost' data is without reading all the corresponding stripes on the other disks?

    31. Re:Don't panic! by sarkeizen · · Score: 1

      Just to jump in here...you can't read parity data and invent the missing bits. You need to read the n-1 data bits + parity to work out your missing bit.

      Well said. I was pretty low on the sleep when I wrote this. I should have remembered this :-)

      If you really did get 1 unrecoverable error out of every 12TB read we'd have an awful lot more data loss on personal computers.

      I admit I find the value counter-intuitive too but at the same time I acknowledge that given the size of data that I generally move about on my hard drive it seems plausible that these errors are beneath our ability to detect.

      So if you're with me so far try bounding things on the basis of something that you would have experienced vs something you would have heard of.

      For example, I've never opened a Word file to find a sector-sized error, but Word files are generally less than 20 MB.

      Even given that I shuffle hundreds of Word files around on my drive, I'd still have to do it over 600,000 times to be guaranteed an URE. On the other hand, if most people shuffle hundreds of small documents across their hard drives, then the odds of knowing someone who has had this problem are only around the one in one or two hundred mark. I've certainly met people who have had corrupted files, and it's difficult to discount this as a possible cause.

      Now clearly bigger files represent a higher likelihood, so moving up the chain let's look at the 1GB videos I routinely have on my hard drive. Currently I can't have more than about 100 or so of these on my drive but even so I'd still need to move about 12,000 of these to guarantee an error. Most of these use some form of lossy compression. So even though I am more likely to encounter a read error it also seems much less likely that I would notice. Not to mention that there are some upper layer facilities to fix problems with these files.

      Let's take the biggest amount of personal data I tend to move: My RAID. I had about 1TB of data on a RAID 5 across eight 200GB discs. I moved this to a 1.5 TB RAID 5 on 4 500GB discs and then from there to a 3TB RAID 50 on 8 500GB discs.

      There was only a 1 in 12 chance of any data corruption in the first move, the same in the second move. Still an 84% chance of never having a single sector hurt.

      That combined with info from places like CERN

      It seems a reasonable conclusion to me.
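      For anyone who wants to redo the arithmetic, a short Python sketch follows. It assumes independent read errors at the quoted one per 10^14 bits, so the counts are the number of files read per expected error rather than a guarantee, and the names are just for illustration.

```python
import math

BITS_PER_ERROR = 1e14  # quoted SATA spec: one URE per 10^14 bits read

def p_ure(bytes_read):
    """Chance of at least one URE when reading bytes_read (independent errors assumed)."""
    return 1 - math.exp(-bytes_read * 8 / BITS_PER_ERROR)

MB, GB, TB = 10 ** 6, 10 ** 9, 10 ** 12

bytes_per_error = BITS_PER_ERROR / 8      # 12.5 TB read per expected error
print(bytes_per_error / (20 * MB))        # 20 MB files read per expected URE
print(bytes_per_error / GB)               # 1 GB files read per expected URE

p_move = p_ure(1 * TB)                    # one 1 TB array migration
print(round(p_move, 3))                   # chance of a URE in one move
print(round((1 - p_move) ** 2, 3))        # chance both moves come through clean
```

      The results land close to the rounded figures above: roughly 625,000 small files or 12,500 videos per expected error, about a 1 in 13 chance per 1 TB move, and about an 85% chance that both moves finish without a single URE.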

    32. Re:Don't panic! by electrostatic · · Score: 1

      The 12 TB figure in TFA is derived from the error rate: "SATA drives are commonly specified with an unrecoverable read error rate (URE) of 10^14." If the URE is improved - say, by devoting more bits to error correction - then the numbers get better. I expect the drive producers know this and will act accordingly.

    33. Re:Don't panic! by Anonymous Coward · · Score: 0

      You are creating data out of nothing and your example hides that by being extra complex. If you flip a bit in C, none of your "copies" will agree, since A != B^C'^P and B != A^C'^P and C' != A^B^P. You have zero information telling you which disk A, B, C or P had a flipped bit. You can't create information out of nothing.
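      This is easy to check on a toy stripe. The Python sketch below uses arbitrary example bytes for three data disks plus XOR parity, flips the same bit on each disk in turn, and compares the resulting parity mismatch; the names are illustrative only.

```python
def syndrome(a, b, c, p):
    """XOR of a whole stripe; zero when data and parity are consistent."""
    return a ^ b ^ c ^ p

# Arbitrary example bytes for three data disks, with parity P = A ^ B ^ C
A, B, C = 0b10110010, 0b01101100, 0b11100001
P = A ^ B ^ C
flip = 0b00001000  # a single flipped bit

# Corrupt each disk in turn and record the resulting syndrome
results = {
    "A": syndrome(A ^ flip, B, C, P),
    "B": syndrome(A, B ^ flip, C, P),
    "C": syndrome(A, B, C ^ flip, P),
    "P": syndrome(A, B, C, P ^ flip),
}
print(results)  # {'A': 8, 'B': 8, 'C': 8, 'P': 8}
```

      The mismatch shows which bit position in the stripe is wrong, but it is identical no matter which disk was corrupted, so single parity alone cannot say which disk to distrust.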

    34. Re:Don't panic! by sarkeizen · · Score: 1

      Sorry if this was unclear. I expect that all of the parity data is read but that doesn't mean the entire array is read.

      I got the idea from Robin's article that a 12TB (13 1TB Drive) SATA RAID 5 would be guaranteed to experience an URE during rebuild.

      But since it's only reading the parity data (and the data to reconstruct the "lost" parity partition) The odds are much better (but that doesn't mean acceptable) i.e. 1 in 12 chance of a URE.

    35. Re:Don't panic! by sarkeizen · · Score: 1

      And...obviously I'm on drugs since the XOR has to be of the parity+remaining devices.

      I still blame sleep dep here.

    36. Re:Don't panic! by evilbessie · · Score: 1

      No, you miss the point entirely. Parity data can give you all the data back, but ONLY if you know what the other bits you have are. Parity is a checksum*, so let's assume you have 3 disks: for any given disk slice you have 2 parts data to one part parity. If you add the corresponding bit values of each of the data parts, keeping only a 1-bit answer, e.g. 1100 + 1010 = 0110, this gives you the third part, the parity. So to replace any missing disk, which will contain both data and parity, you need to know what the parity is to work out any missing data, and you also need to know both bits of data (when they happen to exist on the surviving drives) to fill in the missing parity.

      | 1 | 1 | 0 | 0 | data 1
      | + | + | + | + |
      | 1 | 0 | 1 | 0 | data 2
      | = | = | = | = |
      | 0 | 1 | 1 | 0 | parity

      His point is also only valid if you have 12TB of data, because the HDD manufacturers say the failure rate is 1 per 12TB of data (using some other measure).

      *May be the wrong word but I can't think of a better one right now.

      I may be wrong about how parity is actually worked out (I can't be bothered to check wikipedia) but it is essentially along those lines, the data is manipulated in such a way that the system can take 1 missing part and reconstruct from what they have, and this means that they MUST read all the data from ALL the remaining drives.
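      The 3-disk example above works out as follows in Python. This is a minimal sketch of single XOR parity on one 4-bit slice, with made-up function names; real controllers work on whole stripes and rotate the parity across disks.

```python
from functools import reduce
from operator import xor

def parity(blocks):
    """XOR all blocks together (bitwise add, keeping only a 1-bit answer per column)."""
    return reduce(xor, blocks, 0)

def rebuild_missing(surviving_blocks):
    """Recover the one missing block; needs EVERY surviving block in the stripe."""
    return parity(surviving_blocks)

data1, data2 = 0b1100, 0b1010
p = parity([data1, data2])

print(format(p, "04b"))                             # 0110
print(format(rebuild_missing([data2, p]), "04b"))   # 1100, the lost data 1
print(format(rebuild_missing([data1, p]), "04b"))   # 1010, the lost data 2
```

      Leaving any surviving block out of the XOR gives the wrong answer, which is why a rebuild has to read the corresponding stripe from all remaining drives.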

    37. Re:Don't panic! by noidentity · · Score: 1

      Using the same failure rate figures as the article, you WILL get an unrecoverable read error each and every time you back up your 12 TB of data. You will be able to recover from the single block failure because of the RAID 5 setup.

      Isn't this assuming that you read all 12 TB of data each time you back up? If you're only copying changed/new files, you should be reading lots less.

    38. Re:Don't panic! by operagost · · Score: 1

      Only Adaptec seems to have the guts to sell a software (BIOS) RAID card and charge $129 for it. Also remember that we're really concerned with the enterprise and not toys. Most controllers sold to business are SAS or SCSI, not SATA. I don't know of a single BIOS-software SAS or SCSI RAID card.

      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
    39. Re:Don't panic! by operagost · · Score: 1

      They replaced the HSG/HSC line with the MSA. Same concept, but they changed the OS so now all my HS skills are useless :-/

      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
    40. Re:Don't panic! by Cramer · · Score: 1

      You see 70G and 140G drives because they are very, very, very fast and very, very reliable. While rebuild times are important, performance is often far more important, and that means more spindles. For example, my 42 drive (6 shelves * 7 drives * 18G) FC array can continuously flood a dual attached 4G FC controller. (using linux software raid... each shelf is a raid5 volume striped into a 6 volume raid0. And those are 11 year old, fully functional, FC-1 drives.)
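      As a quick capacity check on that layout, here is a tiny Python sketch. It assumes each shelf gives up exactly one drive's worth of space to parity and ignores filesystem and metadata overhead; the function name is made up.

```python
def raid50_usable_gb(shelves, drives_per_shelf, drive_gb):
    """Usable capacity when each shelf is a RAID 5 set (one drive's worth of
    parity per shelf) and the shelves are striped together with RAID 0."""
    return shelves * (drives_per_shelf - 1) * drive_gb

raw_gb = 6 * 7 * 18
usable_gb = raid50_usable_gb(6, 7, 18)
print(raw_gb, usable_gb)  # 756 648
```

      So the 42 spindles buy throughput rather than space: six drives' worth of capacity goes to parity, one per shelf.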

    41. Re:Don't panic! by Allador · · Score: 1

      I completely agree, and talked about that a bit more in my response to segedunum above.

      So yes, agreed, the way to make bigger storage available is to do as you did and keep the drives reasonably sized, use more spindles, and make arrays built of other smaller raid arrays (ie, just like you said).

      Variations of what you describe is how you get it from NetApp and many SAN vendors as well. Pools are built from shelves which each have their own internal redundancy, and in some systems you also have fully hot-spare shelves available too.

      Anyway, I'm completely agreeing with you.

    42. Re:Don't panic! by Cramer · · Score: 1

      Actually, I tried making a 42 drive RAID5, but Linux wouldn't have it -- the limit was 26 (md). Dmraid will do it, but keeping parity limited to one shelf is a lot faster.

    43. Re:Don't panic! by segedunum · · Score: 1

      The article's core argument isnt about having errors, its about the likelihood of getting a second drive failure while the array is rebuilding as a result of the first. Mathematically, this has absolutely nothing to do with total storage size

      To achieve the same storage size with smaller disks (comparing like with like) then you need more of them, and mathematically, there's more chance that more of them will start going at any one time the more you have, and during a rebuild procedure. Your redundancy will need to increase exponentially with that the more drives you have, as your storage requirements increase - and everyone's has over the past few years. There is a point at which that, in these multi-terabyte RAID array times (which is why the article talks about 1TB disks), becomes counter-productive unless you start using larger drives.

      Ultimately, that is why the article talks about 1TB drives and up, but I think the example of using a 7 or 8 drive RAID 5 array is, quite frankly, stupid. Even in a 4 disk array you would want at least RAID 6 or some form of RAID 5EE with a decent hot space. I don't trust hot spares because having a drive sitting around doing nothing that you won't know anything about until you have a failure is not a great idea.

      How exactly do you get this? Rebuild time has zero to do with total storage size and everything to do with the size of the failed disk (and the controller/bus/drive speed, but we assume thats equal).

      The problem is, to get an equivalent storage size (comparing like with like) you will need to use more disks, statistically there is more chance of more drives going through failures and you will spend just as much time, if not more, rebuilding the array (or multiple arrays). The array might take less time to rebuild as a one-off task, even allowing for the fact that you have more disks to rebuild (increasing wear and tear incidentally), but you will do it more often and spend more time rebuilding overall. That has only got worse as array sizes have increased.

      That is absolutely the reason. Even if Dell or HP offered servers with 1TB SAS drives, I would never buy them that way. It's a newb mistake to build important raid arrays out of drives that size, due to the rebuild time and overall performance.

      You buy SAS and SCSI drives of that size for performance and reliability, and you buy them for a specific workload that matches that. The trade-off for that performance and reliability is that you spend money on that rather than the storage size. Mind you, expensive SCSI and SAS drives are only for the gullible and I have seen no evidence that they are more reliable. They could end up making a 1TB drive of that quality, but it would be colossally expensive and no one would buy it. The storage size of a drive doesn't correlate well to the reliability of the drive. A 150GB drive is not inherently more reliable than a 1TB drive, other than you have perhaps slightly more chance with more storage space of seeing a failure. The more storage you have and the more disks you have then you have an even greater chance of seeing multiple failures.

      Personally, I find that 'enterprise' SCSI and SAS drives are simply a rip-off for the gullible who are willing to pay the premium, get less storage and have to buy more drives and build more arrays to get an equivalent level of storage. Google certainly doesn't use them. You're buying snake oil basically.

      In a raid array you get performance out of additional spindles and being able to serve multiple simultaneous read requests off of that array (due to some reads only having to hit one or two disks).

      Indeed you do, but in a much larger array more disks (and more arrays) become counter productive from a management point of view. What you need are larger drives.

      Which is why when you need really large storage, you build pool

    44. Re:Don't panic! by Allador · · Score: 1

      To achieve the same storage size with smaller disks (comparing like with like) then you need more of them, and mathematically, there's more chance that more of them will start going at any one time the more you have, and during a rebuild procedure. Your redundancy will need to increase exponentially with that the more drives you have, as your storage requirements increase - and everyone's has over the past few years.

      What you describe isn't how it works in the real world though. You just don't keep adding drives to a single array, you build pools of storage based on redundant arrays in the sweet spot size (i.e., 5-7 spindles, in my experience). So no, the risk of drive failure does _not_ keep going up. The risk is at a 'shelf' level, and that's the level you build redundancy into. Then you pool multiple shelves together.

      This is how it actually works in the real world in many, many systems. And works well, and highly reliably.

      Also, in that situation, the redundancy does NOT grow exponentially. It grows linearly within ranges, and then you have a discontiguous cost jump as you move to a bigger class of storage. Then it grows linearly within that group again.

      I don't trust hot spares because having a drive sitting around doing nothing that you won't know anything about until you have a failure is not a great idea.

      That's why you have patrols/scrubbing/whatever. The drives are constantly being tested and beat on, even if they're not being used. Including the hot spares.

      The problem is, to get an equivalent storage size (comparing like with like) you will need to use more disks, statistically there is more chance of more drives going through failures and you will spend just as much time, if not more, rebuilding the array (or multiple arrays). The array might take less time to rebuild as a one-off task, even allowing for the fact that you have more disks to rebuild (increasing wear and tear incidentally), but you will do it more often and spend more time rebuilding overall. That has only got worse as array sizes have increased.

      In actuality, since you have your redundancy in a storage unit that is sized right for your situation, it doesn't actually work out as you describe. Yes, you may end up replacing more disks as your storage size grows, but you lose disks at a fairly predictable rate within a redundant unit of storage, and the likelihood of losing 2 disks in one unit is very small, due to small arrays and small disks.

      Mind you, expensive SCSI and SAS drives are only for the gullible and I have seen no evidence that they are more reliable. The storage size of a drive doesn't correlate well to the reliability of the drive. A 150GB drive is not inherently more reliable than a 1TB drive, other than you have perhaps slightly more chance with more storage space of seeing a failure.

      There are differences. Less so now than there were in the IDE/PATA vs. SCSI days.

      There are MANY consumer class drives that are literally not rated to be run 24x7. The designer says so clearly on the box (ie, the ibm death-stars).

      The differences are less now, but in the past the cheaper consumer drives used a different kind of bearing that didn't hold up so well.

      The biggest thing you're paying for typically in consumer-class vs. enterprise class is the amount of circuitry/smarts on the drive. Again, in the SCSI/IDE days, this was very distinct. SCSI drives had all sorts of smarts, could do re-ordering and optimizations via TCQ, etc. IDE drives were just dumb slaves for the controller, as a comparison.

      Nowadays, even some SATA drives have NCQ, which can be quite useful in certain kinds of workloads.

      But heck, just pick up an enterprise class drive in one hand, and a typical consumer drive in another. The former will weigh several times the latter, as it contains much more metal. Built to tighter tolerances, better heat dissipation due to the she

    45. Re:Don't panic! by Sillygates · · Score: 1

      I stand by the first guy's reply to this. With raid5, and a random silent bit error, there is no way to tell which bit is wrong.

      A1^B1^C1 = P1 randomly flip bit C1, and you get A1^B1^C1 != P1, whoops, there is an error, but there is no indication where the error is.

      Raid5 covers this case: A1^B1^[OFFLINE](?) = P1

      In this case you can easily use the P1 bit to reconstruct C1. Really, there is only one copy of the data, with the bits XORed across (bit P gets flipped for every 1 bit seen across A, B and C, and stays the same for every 0 bit). In terms of disk space, you have to think about it like this: you are losing one disk to parity. If raid5 worked that way, you would need to lose two disks to parity, and that's not raid5...
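
      A minimal sketch of the two cases above, assuming one small byte string per disk (the block values and helper name are made up for illustration). It shows that XOR parity rebuilds a block when you know which disk is gone, but only reports "something is wrong" when a bit flips silently.

```python
from functools import reduce
from operator import xor

def parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

# Hypothetical stripe: three data blocks plus their parity block.
a, b, c = b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"
p = parity([a, b, c])

# Case 1: disk C is known to be offline, so C = A ^ B ^ P.
rebuilt_c = parity([a, b, p])

# Case 2: one bit of C flips silently. The stripe no longer checks out...
bad_c = bytes([c[0] ^ 0x01]) + c[1:]
stripe_ok = parity([a, b, bad_c]) == p
syndrome = parity([a, b, bad_c, p])

# ...but the very same mismatch appears if the flipped bit lives on A instead,
# so the parity check alone cannot say which disk to distrust.
bad_a = bytes([a[0] ^ 0x01]) + a[1:]
same_syndrome = parity([bad_a, b, c, p]) == syndrome
```

      The syndrome tells you which bit position disagrees, not which disk holds the wrong copy of it.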

      Someone with mod points should really mod up the previous reply....

      --
      I fear the Y2038 bug
    46. Re:Don't panic! by Sillygates · · Score: 1

      In your example, the 4 copies with bit c being "incorrect"
      Data layouts:
      1: {A1,B1,C1}
      2: {B2^C2^P2,B2,C2}
      3: {A3,A3^C3^P3,C3}
      4: {A4,B4,A4^B4^P4}

      Flipping "C" (detected errors):
      A1 != B2^C2^P2
      B1 != A3^C3^P3
      C1 != A4^B4^P4

      Flipping "B" (detected errors):
      A1 != B2^C2^P2
      B1 != A3^C3^P3
      C1 != A4^B4^P4
      From this, any of the bits (examples shown with B and C), or the parity, can be assumed to be wrong.

      --
      I fear the Y2038 bug
    47. Re:Don't panic! by Guspaz · · Score: 1

      There was a reason that ZFS/RAID-Z was mentioned; when you have a checksum of every block, you know which bit was flipped because all but one of your checksums match. This limits the damage if you have no parity (RAID-Z with one failed disk, RAID-Z2 with two failed disks, etc.)
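
      A rough sketch of the idea, not ZFS's actual on-disk format: SHA-256 stands in for whatever checksum the filesystem uses, and the block contents and helper names are invented. With a checksum stored per block, the one block whose checksum no longer matches is the bad one, and if parity is still available it can be rebuilt from the rest.

```python
import hashlib
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(xor, col) for col in zip(*blocks))

def checksum(block):
    """Stand-in block checksum (an assumption, chosen for illustration)."""
    return hashlib.sha256(block).digest()

# Hypothetical stripe of three 8-byte data blocks.
data = [b"alpha...", b"bravo...", b"charlie."]
parity = xor_blocks(data)
stored_sums = [checksum(d) for d in data]

# Silent corruption: one bit flips in block 1 with no error from the drive.
on_disk = list(data)
on_disk[1] = bytes([on_disk[1][0] ^ 0x04]) + on_disk[1][1:]

# The checksums point at the damaged block...
bad = [i for i, blk in enumerate(on_disk) if checksum(blk) != stored_sums[i]]

# ...and parity plus the good blocks rebuild it.
good_blocks = [blk for i, blk in enumerate(on_disk) if i != bad[0]]
repaired = xor_blocks(good_blocks + [parity])
```

      Without parity left, the last step is not possible, but the damage is still confined to the one block that was identified.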

      The only thing keeping me from using ZFS/RAID-Z is the inability to grow an array by adding more disks (similarly sized or otherwise). You'd think this would be kind of useful, being able to take a 3x1TB RAID-Z array and add another to get a 4x1TB RAID-Z array.

    48. Re:Don't panic! by Guspaz · · Score: 1

      It's got a ram cache and a battery backup for that ram for a reason kiddies.

      Yeah, because of the write-hole; that's got nothing to do with the original issue. Such issues are also solved on a software level via atomic writes by the filesystem at substantially less cost.

  64. RAID6 is far better. by DamnStupidElf · · Score: 2, Informative

    Not only are there two parity drives, but the operating system can perform automatic scanning of the drives to ensure that all data and parity disks are correct and silently correct any errors that occur on only one disk. It only takes a few days to scan 12 TB, and if this is done often enough the probability of two failed disks plus a previously undetected unrecoverable error on a third disk is quite a bit lower than the failure rate for RAID5. RAID5 volumes can be automatically scanned, but if corruption is detected there's no way to know which of the disks was actually incorrect, barring an actual message from the hard disk. Silent corruption is a much bigger enemy of RAID5 than RAID6.
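
    To illustrate why two parities can pinpoint a single silently corrupted disk, here is a toy sketch with one byte per disk. It assumes the common P+Q construction over GF(2^8) with polynomial 0x11d and generator 2. The function names are made up, and real implementations work on whole blocks with lookup tables.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def gf_pow2(i):
    """Return 2**i in GF(2^8)."""
    value = 1
    for _ in range(i):
        value = gf_mul(value, 2)
    return value

def syndromes(data):
    """P is the plain XOR; Q weights the byte on disk i by 2**i."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow2(i), d)
    return p, q

def locate_error(data, p, q):
    """Return None if clean, 'P' or 'Q' for a bad parity byte,
    or the index of the single corrupted data byte."""
    p_now, q_now = syndromes(data)
    sp, sq = p ^ p_now, q ^ q_now
    if sp == 0 and sq == 0:
        return None
    if sq == 0:
        return "P"
    if sp == 0:
        return "Q"
    for z in range(len(data)):
        # A single error e on data disk z gives sp = e and sq = 2**z * e.
        if gf_mul(gf_pow2(z), sp) == sq:
            return z
    raise ValueError("more than one disk disagrees")

data = [0x12, 0x34, 0x56, 0x78]
p, q = syndromes(data)
corrupted = list(data)
corrupted[2] ^= 0x40                      # silent flip on data disk 2
where = locate_error(corrupted, p, q)
fixed = corrupted[where] ^ (p ^ syndromes(corrupted)[0])
```

    With only P (the RAID5 case) the loop has nothing to compare against, which is the limitation described above.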

    I don't know why the article focuses on RAID5; RAID1 or RAID10 will have exactly the same issues at a slightly lower frequency than RAID5, but more frequently than RAID6.

    Ultimately, the solution is simply more redundancy, or more reliable hardware. RAID with 3 parity disks is not much slower than RAID6, and dedicated hardware or increasing CPU speed will take care of that faster than drive speeds increase.

  65. Raid 5 - Kills Drives Dead(tm) by fortapocalypse · · Score: 5, Funny

    RAID???!!! Aaaaaaah! (Drive dies.)

  66. Not just more UREs, but slow fsck too! by nullchar · · Score: 1

    These are only a few of the changes in disk hardware that will occur over the next decade. What do these changes mean for file systems? First, fsck will take a lot longer in absolute terms, because disk capacity is larger, but disk bandwidth is relatively smaller, and seek time is relatively much larger. Fsck on multi-terabyte file systems today can easily take 2 days, and in the future it will take even longer! Second, the increasing number of I/O errors means that fsck is going to happen a lot more often - and journaling won't help. Existing file systems simply weren't designed with this kind of I/O error frequency in mind.

    These problems aren't theoretical - they are already affecting systems that you care about. Recently, the main server for Linux kernel source, kernel.org, suffered file system corruption from a failure at the RAID level. It took over a week for fsck to repair the (ext3) file system, when it would have taken far less time to restore from backup.

    article from 2006: http://lwn.net/Articles/190222/

  67. The sky is falling! by graviplana · · Score: 0

    Nothing to see here. Move along.

    --
    "Time is nothing; timing is everything."
  68. I'm convinced. by m.dillon · · Score: 4, Interesting

    I have to say, the ZFS folks have convinced me. There are simply too many places where bit rot can creep in these days even when the drive itself is perfect. The fact that the drive is not perfect just puts a big exclamation point on the issue. Add other problems into the fray, such as phantom writes (which have also been demonstrated to occur), and it gets very scary very quickly.

    I don't agree with ZFS's race-to-root block updating scheme for filesystem integrity but I do agree with the necessity of not completely trusting the block storage subsystem and of building checks into the filesystem data structures themselves.

    Even more specifically, if one is managing very large amounts of data one needs a way to validate that the filesystem contains what it is supposed to contain. It simply isn't possible to do that with storage-system logic. The filesystem itself must contain sufficient information to make validation possible. The filesystem itself must contain CRCs and hierarchical validation mechanisms to have a proper end-to-end check. I plan on making some adjustments to HAMMER to fix some holes in validation checking that I missed in the first round.

    -Matt

    1. Re:I'm convinced. by Anonymous Coward · · Score: 0

      Yup. I have a MySQL slave that asynchronously replicates queries from a master server. After running for several months the slave stopped due to a failed query that had a single bit flipped in the query string: it was trying to execute TPDATE rather than UPDATE (or something along those lines).

      Scary shit.

  69. anonymous coward by Anonymous Coward · · Score: 0

    gee I sure hope someone invents a way to bind raid clusters together soon. Oh it's called storage area network? Great! Oh we can even bind those together with things like SVC? Awesome! Needless to say, I don't get this article. Using the biggest disks you can get is a dumb move anyway and has never been necessary.

  70. zdnet isn't real news by Anonymous Coward · · Score: 0

    Can we just stop pretending ZDNet is a news source please?

  71. Re:Sounds.. well. Stupid by Anonymous Coward · · Score: 0

    PerfectRAID(TM) is Promise's patented RAID data protection technology [...]

    RAID is not perfect, not by any stretch, [...]

    But I thought...? ;P

  72. 3-drive RAID-1 good until at least 2038 by tomhudson · · Score: 1

    Can tolerate a complete drive failure + hundreds of unrecoverable reads per drive on the two remaining drives. The larger the disk, the less likely that both remaining drives will fail on the same sector, so larger drives are an advantage, not a disadvantage, compared to data split across drives that has to be "rebuilt" from the parity info ...
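
    A back-of-the-envelope sketch of that claim, under assumptions that are mine rather than the poster's: unreadable sectors land uniformly at random and independently on each drive (real defects tend to cluster), sectors are 512 bytes, and "hundreds" is taken as 200 per drive.

```python
def p_shared_bad_sector(sectors, bad_per_drive):
    """Chance that two drives, each with `bad_per_drive` random bad sectors,
    have at least one bad sector number in common."""
    p_clear = 1.0
    for j in range(bad_per_drive):
        # Each bad sector on the second drive must miss all of the first drive's.
        p_clear *= (sectors - bad_per_drive - j) / (sectors - j)
    return 1.0 - p_clear

SECTOR_BYTES = 512                         # assumed sector size
one_tb = 10**12 // SECTOR_BYTES
two_tb = 2 * 10**12 // SECTOR_BYTES

p_1tb = p_shared_bad_sector(one_tb, 200)
p_2tb = p_shared_bad_sector(two_tb, 200)
print(f"1 TB mirrors: {p_1tb:.2e}, 2 TB mirrors: {p_2tb:.2e}")
```

    Under these assumptions the overlap chance is on the order of 1 in 50,000 for 1 TB drives and halves when the capacity doubles, which is the direction the post argues.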

  73. Please test, but note the date by AySz88 · · Score: 1

    I concur but in a less-condescending way: if there are any people that are already on the way towards building a giant RAID, can someone please give this a try to see what actually happens? (That is, fill the drives with test data, make the array rebuild once or twice, and see if bad stuff happens.) The article is based off the URE spec, but maybe real-world URE rate is lower.

    But to be fair to the article, note the date on the post: July 18th, 2007. It wasn't "trivially testable" at the time; terabyte drives weren't exactly cheap yet.

    1. Re:Please test, but note the date by AySz88 · · Score: 1

      (I forgot, don't forget the drives' spec'ed URE rate needs to be 1:10^14 for things to go theoretically downhill; there are a few random terabyte drives with 1:10^15 floating out there.... And yes, I know the "once or twice" won't give a statistically significant result; maybe up to a dozen times if you want to be more precise...)
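
      For reference, a small sketch of the arithmetic behind the spec-sheet argument. It assumes the quoted URE rate applies independently to every bit read and that a terabyte is 10^12 bytes. The function name is made up.

```python
import math

def p_at_least_one_ure(bytes_read, bits_per_error):
    """Probability of one or more unrecoverable read errors while reading
    `bytes_read` bytes, given one expected error per `bits_per_error` bits."""
    bits = bytes_read * 8
    # log1p/expm1 keep precision when the per-bit probability is tiny.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))

rebuild_bytes = 6 * 2 * 10**12             # six surviving 2 TB drives

p_1e14 = p_at_least_one_ure(rebuild_bytes, 1e14)   # roughly 0.62
p_1e15 = p_at_least_one_ure(rebuild_bytes, 1e15)   # roughly 0.09
print(f"1:10^14 -> {p_1e14:.2f}, 1:10^15 -> {p_1e15:.2f}")
```

      So by the spec alone the 1:10^14 case is closer to a coin flip than a certainty, and a 1:10^15 drive changes the picture a lot. Whether real drives behave like the spec is exactly what the test above would show.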

  74. first step or misstep by weirdcrashingnoises · · Score: 1

    if your first step is to use the largest drive available in a performance raid, then your first step is probably a misstep.

    it's better to use those raids with the smaller-yet-faster drives, ie the 10k+ rpm raptors or perhaps some yet to be seen faster better (stronger harder?) SSD's and use the latest ultra large HDDs in a simple redundant raid setup for backups.

    multi raids ftw

    --
    sigs... don't talk to me about sigs....
  75. RAID disk failures in general by Coolhand2120 · · Score: 1

    The theory of more disks = more disk failures sounds totally logical. But in practice it does not work at all. For 5 years I ran 5 servers with various IDE RAID5 and RAID1 solutions (Promise, HighPoint). There was a total of about 20 IDE disks. I see a disk failure about once every two months. About 3 years ago I added a Dell PowerEdge 2600 running TWO SCSI U160 disks on a SCSI RAID1. A single disk fails about three times a year on the Dell. I found a cheap NetApp F760 NAS. It has three disk shelves of 72GB fiber channel drives for a total of 28 disks (2TB) making up 4 RAID 4 volumes. I've had this for a little over a year running iSCSI for database servers and have yet to see a single disk failure. NetApp uses a technology, WAFL, that is conceptually similar to ZFS and in fact has been in production for more than a decade. But I digress.

    My point here is, the number of disk failures in a particular IT system cannot be generalized upon. There is no global rule for disk failures. My guess is there are so many different reasons for failure that it is practically impossible to predict how a system will behave without looking at the system itself, not at the cloud of disk failures. In my case I had a bunch of IDE disks failing at one rate, a bunch of SCSI U160's failing even more frequently, and a whole lot more fiber channel disks that have yet to fail at all!

    Also, the whole premise of the article emphasizes the failure of RAID5 - then says enterprises won't be affected - but what typical home user even uses a RAID 5? If I were going to give my mother a RAID it would be a RAID 1, not a RAID 5. Furthermore the typical user doesn't even know what RAID is! The typical user still thinks his single HDD is safe! We (geeks) have a long way to go in educating the typical user before we get to the RAID5 is unsafe part, which is untrue anyway. A good disk controller will recover the data in your RAID 5, even with a URE rate of 1 in 10^14. As with most generalizations and statistics, this one is clearly false. I'm sure Seagate loves that Samsung's failure rate affects their drives' failure rate somehow! The title would be better phrased "Crappy disks and crappy disk controllers are crappy". Hmmmm, yes, I like that much better. The previous title was boring and pedantic.

    1. Re:RAID disk failures in general by brezel · · Score: 1

      i really wonder where those numbers come from too. in my company there are about 50 servers with about 400 sata, scsi and sas drives (including san/nas/external sas-attached storage and so on, all speeds, all sizes from 16gig to 1TB) and within the last 6 years i have been working there i saw exactly 5 drives dying. we exchange most hardware in 3-5 year-cycles, depending on how important that piece of hardware is.

      most of our storage are databases and video/image data which is very frequently written/read and re/overwritten. all in all i think we have somewhere around 25 TB of storage.

      i _really_ don't believe that article ^^

    2. Re:RAID disk failures in general by Sobrique · · Score: 1
      I can tell you how MTBF scales in a SAN - I've got a pair of 1400 drive storage arrays.

      Approximate failure rate is 1-2 drives per month. That seems high, until you start looking at expected failure rates, and then you realise that actually that's lower than the manufacturer's spec failure rate.

      However even with such a failure rate, it's no big deal - we're on 7+1 RAID sets (where we're not RAID1), so rather than 'another drive fail' we're on 'another drive in this set of 8 disks, between the first one failing and the hot spare rebuilding'.
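
      A rough sketch of both numbers, with inputs that are assumptions rather than anything from this post: a 1,000,000 hour spec MTTF, a constant failure rate, and a 24 hour window to rebuild onto the hot spare.

```python
import math

HOURS_PER_MONTH = 8760 / 12

def expected_failures_per_month(drives, mttf_hours):
    """Expected drive failures per month across a fleet."""
    return drives * HOURS_PER_MONTH / mttf_hours

def p_second_failure(surviving_drives, rebuild_hours, mttf_hours):
    """Chance that any surviving drive in the set fails during the rebuild."""
    return -math.expm1(-surviving_drives * rebuild_hours / mttf_hours)

MTTF = 1_000_000                           # assumed spec MTTF, in hours

fleet = expected_failures_per_month(2 * 1400, MTTF)
risk = p_second_failure(7, 24, MTTF)       # 7 survivors in a 7+1 set
print(f"fleet: {fleet:.2f} failures/month, per-rebuild risk: {risk:.2e}")
```

      With those inputs the fleet figure comes out at about two drives a month, while the chance of a second whole-drive failure inside one set during one rebuild is under 1 in 5,000. This covers whole-drive failures only, not read errors hit during the rebuild.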

      We've just recently had a backend controller problem, which meant 16 drives went offline (in at least 16 different raid sets, for that very reason), which was more than we could hot spare. Thankfully our luck held, and we didn't get another failure in any of the other drives in the 16 raid sets (112 drives).

      But that wasn't actually a drive fault, but rather a controller problem. It just highlights though, that your RAID group does you no good if multiple drives are using _any_ shared components beyond a certain point, as the failure rate of _anything_ can screw with your system. (OK, so they don't tend to do so in a 'data loss' sense)

  76. Am I smoking crack??? by Anonymous Coward · · Score: 0

    So am I on crack? What am I missing when they say MTTF @ 1,000,000 hours... By my calculation that's 114 years... Even if what the report says is true and the MTTF is more like 300,000 hours, that's still 34 years. I keep a stack of spare SATA drives on hand at work because they go bad so frequently and that's in a climate controlled power smoothed room. I'd say over the last two years I've replaced 6 drives against 14 servers. Granted I haven't had a bad one in quite awhile (knock on glued particle wood fibers w/ a simulated cherry veneer).

    So what am I missing???
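
    One way to read the number, sketched under the usual assumption of a constant failure rate during a drive's service life: MTTF is a population statistic, so it is more useful as an annualized failure rate than as a lifetime. The function name is made up.

```python
import math

HOURS_PER_YEAR = 8760

def annualized_failure_rate(mttf_hours):
    """Chance that a given drive fails within one year of continuous use."""
    return -math.expm1(-HOURS_PER_YEAR / mttf_hours)

for mttf in (1_000_000, 300_000):
    years = mttf / HOURS_PER_YEAR
    afr = annualized_failure_rate(mttf)
    print(f"MTTF {mttf:>9,} h = {years:5.1f} years, AFR = {afr:.2%}")
```

    The 114 and 34 year figures are right as arithmetic. Read as rates they come to a bit under 1% and a bit under 3% of drives per year, so a room full of drives sees regular failures even though no single drive is expected to last a century.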

  77. That's pretty alarmist... by NerveGas · · Score: 1

    Assuming that you'll have another drive fail before it can rebuild is pretty alarmist. Sure, it can happen... I've *seen* it happen. But it's not the norm.

    Most people who run RAID5 are running pretty poor hardware implementations. But on a board with Raptors, via multiple (quality) SATA controllers each connected via PCI-E (avoiding bus contention), I've seen RAID5 rebuilds of over 200 MB/sec. That's a pretty far cry from something like 8 drives hooked to a 32-bit, 33MHz PCI SX8, and getting a tenth of that.
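
    To put those rates in terms of exposure time, a quick sketch. It assumes a 1 TB replacement drive and treats the quoted figures as the sustained rate at which the replacement is written, and both of those are my assumptions.

```python
def rebuild_hours(drive_bytes, bytes_per_second):
    """Hours needed to rewrite a whole replacement drive at a steady rate."""
    return drive_bytes / bytes_per_second / 3600

DRIVE = 10**12                             # assumed 1 TB drive

fast = rebuild_hours(DRIVE, 200 * 10**6)   # 200 MB/s
slow = rebuild_hours(DRIVE, 20 * 10**6)    # a tenth of that
print(f"200 MB/s: {fast:.1f} h, 20 MB/s: {slow:.1f} h")
```

    That is roughly an hour and a half versus most of a day running degraded, which is the window a second failure has to land in.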

    --
    Oh, you're not stuck, you're just unable to let go of the onion rings.
  78. Ok, I'll take the ZFS bait by Matt+Perry · · Score: 1

    I'll keep the ZFS plug short. Go ZFS. There, that was it.

    Isn't ZFS a filesystem? Why would I care about what filesystem I am using when I am trying to protect my data from disk failures?

    --
    Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
    1. Re:Ok, I'll take the ZFS bait by pyite · · Score: 2, Informative

      Isn't ZFS a filesystem? Why would I care about what filesystem I am using when I am trying to protect my data from disk failures?

      Because it's a file system, volume management, and redundancy all rolled into one combined with native NFS and SMB sharing, iSCSI support, etc. etc.

      --

      "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

  79. Does this make any sense? by cyberjock1980 · · Score: 1

    First they say that these new improved lower failure-rate drives along with bigger disks are going to kill RAID5. Then, they try to tell us that if 1 drive failed it's likely you will lose the array due to an unrecoverable error on the remaining array drives. So which is it? Are hard drives becoming more reliable or more error prone? You can't have it both ways. This sounds like complete crap to me. I'll still be using them at home, and probably 5 years from now too.

  80. Precisely by Sycraft-fu · · Score: 1

    RAID-5 is UPTIME protection, not DATA protection. What I mean by that is if you have a non-redundant configuration (single disk, RAID-0, JBOD) and a disk fails, your shit is down. You can't use your system any more until you get the problem fixed. Then, it is a rebuild situation. So you can be facing a few days of downtime, maybe more. That can be real annoying.

    Well RAID-1 and RAID-5 solve that to a large degree. Now a disk failure doesn't cause a system failure. A disk fails, and provided you can get a replacement before another one fails, and it is a good bet you can, and there is no problem. You continue operating, maybe with a bit of a speed reduction but that's all.

    That's why I have a RAID-5 at home. Used to run a RAID-0, since I keep good backups. Well, then a disk failed on me. Didn't lose any data, but I was sitting around with no computer to use, and it was Friday evening which meant that any order from an online shop wouldn't be there till Tuesday even if ordered with fast shipping. I elected to go buy disks locally, despite the premium charged by CompUSA, but this time got enough for a RAID-5.

    The issue was never data loss, the issue was that my system was out and would remain so for quite awhile. If that is an issue for you, then RAID is a good answer (since drives are probably the sole most likely thing to fail other than fans). However backups are a completely different matter.

    1. Re:Precisely by grahamsz · · Score: 1

      And provided you can get a replacement before another one fails, and it is a good bet you can, and there is no problem

      If it's for any kind of business use then I'd strongly recommend keeping hot spares.

      I've seen two arrays that have run for years and then suffered two drive failures in the space of a week, and one of them happened while FedEx was en route with the first replacement.

      Fortunately I wasn't in IT there, but some of the engineers in the company posited that one disk was a controller failure and actually unsoldered the faulty controller and replaced it with one from a new drive. Remarkably it came back up.

  81. So? by Anonymous Coward · · Score: 0

    The reality is this:

    * The home user who creates a RAID-5 array will continue to make a RAID-5 array and risk the data loss.

    * The businesses that use RAID-5 arrays will likely fall into one of three groups. Note, I'm not saying that a business won't use all three of these groups either.

        1. Lots of small capacity disks for performance (with or without off-site backup)

        2. Fewer larger capacity disks (most likely with offsite backup)

        3. RAID 5+1 users... data security by mirroring every drive in the set.

    Personally, at home, I use a RAID-5 array. I'm fully aware of the implications and the likelihood that if I lose one drive, there's a chance a second one will die while I'm rebuilding... that's OK with me, however, since I've gained 2 things.

    1. Some redundancy to prevent complete data loss in the event of an outage.

    2. Higher performance disk (for reads), which makes a difference when I launch stuff remotely. Believe it or not, RAID-5 over a GigE connection is faster than local disk.

  82. Re:Sounds.. well. Stupid by cbreaker · · Score: 1

    Indeed.

    Personally, I believe the correct answer to ensuring data recoverability is RAID together with real-time replication. You can usually accomplish this with a very acceptable price-point.

    RAID is important to prevent down time due to a single disk failure, and replication prevents loss of data due to an array failure.

    Personally I think RAID5/6 will be around for a very long time because it works and there's actually people out there that use it correctly (versus SO MANY people on Slashdot, apparently.)

    Gosh, there's even one guy a few posts up that's claiming "The biggest problem with RAID is DECAY." Holy crap. Any RAID card made in the last 10 years will periodically scrub the disks and make sure the parity is correct - or else it will mark a disk as bad.

    --
    - It's not the Macs I hate. It's Digg users. -
  83. Re:Raid5 =FAIL. BoynondRaid =FTW! by Anonymous Coward · · Score: 0

    dumbass boyond raid is all you need atm gook bye you =FAIL //

    d-_-b

  84. My RAID setup to prevent data loss. by BitZtream · · Score: 1

    Raid 1+0 on the machine - We want a safety net to deal with minor failures and keep the machine online, and the drives need to be fast.
    Automatic backups by the hosting company, to local and offsite tape, twice a day, I'm told verified at both locations after written to tape against the original snapshot.
    Automatic backups synced to our company office, once a day, verified against md5 and sha hashes of the data when the original backup was created after it is at the office.
    Copies brought offsite to a safety deposit box whenever I feel like doing it, just in case everything else fails.
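
    The hash check in that list could look something like this sketch. It is not our actual tooling: the function names are invented, and SHA-256 is my stand-in for the unspecified "sha" hash.

```python
import hashlib

def file_digests(path, chunk_size=1 << 20):
    """Return (md5, sha256) hex digests of a file, read in chunks."""
    md5 = hashlib.md5()
    sha = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha.update(chunk)
    return md5.hexdigest(), sha.hexdigest()

def backup_matches(path, recorded_digests):
    """True if the copy at `path` still matches the digests recorded
    when the original backup was created."""
    return file_digests(path) == tuple(recorded_digests)
```

    Record the digests when the backup is taken, ship them alongside it, and run backup_matches() on the copy once it lands at the office. A mismatch on either hash means the copy is not what was written.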

    Are we safe from all harm? No, there's a good chance that a nuclear detonation near our office, which is near the primary datacenter, will get most of our backups instantly, and possibly get the secondary backups provided by the hosting provider. However, as I told the president of our company, if that happens, I'm really not going to give a damn about restoring our data.

    And ... the whole thing was screwed because no one noticed some of our reports were acting wonky in time to keep the last good backup (last snapshot of the database before the server or OS screwed up the file itself) from cycling out. This was the time when that random old backup laying in the safety deposit box saved us.

    When that event occurred, since we had NEVER gone to the safety deposit box before, no one even thought about it when we were in the initial panic failure mode. It was actually a few days later when discussing the 'data loss' that the guy who actually controls the safety deposit box said 'What about the backups I take offsite?' and finally my obviously slow brain kicked in and said 'duh'.

    For the record, I have since turned in my sysadmin card for obvious reasons and become a developer, and I no longer believe in backups, that's what the sysadmin is for! Now I make the problem worse with bad code rather than being responsible for and protecting someone else's data.

    Sorry, what was my point again?

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  85. Debian does this by default by Slashdot+Parent · · Score: 1

    Just an FYI, the default behavior in Debian is to scrub all of your software RAID arrays monthly via a cronjob that comes with the mdadm package (see the checkarray script).
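
    For anyone who wants to poke at this by hand, a small sketch of the same thing through the kernel's md sysfs files (sync_action and mismatch_cnt). The function names and the md0 default are just for illustration, writing to sync_action needs root, and this is not what the checkarray script itself does line for line.

```python
from pathlib import Path

def md_scrub_status(array="md0", sysfs_root="/sys/block"):
    """Return (current sync action, mismatch count) for a software RAID array."""
    md_dir = Path(sysfs_root) / array / "md"
    action = (md_dir / "sync_action").read_text().strip()
    mismatches = int((md_dir / "mismatch_cnt").read_text().strip())
    return action, mismatches

def request_check(array="md0", sysfs_root="/sys/block"):
    """Ask the kernel to start a read-only consistency check of the array."""
    (Path(sysfs_root) / array / "md" / "sync_action").write_text("check\n")
```

    After a check finishes, a nonzero mismatch count is the cue to look closer. The sysfs_root parameter is only there so the functions can be tried against a scratch directory.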

    I wouldn't be surprised if other distros do something similar.

    --
    They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
  86. The Black Swan by jschmerge · · Score: 4, Interesting

    A Black Swan is an event that is highly improbable, but statistically probable.

    Yes, it is possible for a drive in a RAID 5 array to become absolutely inoperable, and for one of the other drives to have a read failure at the same time. This is highly unlikely though, and is not the Black Swan. The math used to calculate the likelihood of these two events occurring at the same time is faulty. The MTBF metric for hard drives is measured in 'soft failures'; this is very different from a 'hard failure'.

    The difference between the two types of failures is that a soft failure, while a serious error, is something that the controlling operating system can work around if it detects it. It is extremely unlikely that a hard drive will exhibit a hard failure without having several soft failures first. It is even more unlikely that two drives in the same array will exhibit a hard failure within the length of time it takes to rebuild the array. In my experience, it is more likely that the software controlling the array will run into a bug rebuilding the array. I've seen this with several consumer-grade RAID controllers.

    The true Black Swan is when a disk in the array catches fire, or does something equally as destructive to the entire array.

    To echo other people's points, RAID increases availability, but only an off-site backup solves the data retention problem.

  87. Drive firmware... by Chris+Snook · · Score: 1

    ...is rising to the level of complexity of a simple operating system. They do a lot of very smart things in there to catch problems before they become unrecoverable, and they even report them to you via SMART so you can tell when they're at an increased risk of failure.

    The catch is, there are several different ways to optimize firmware. Drives intended to be used in RAID arrays have different firmware from drives intended for desktop or laptop use. If you use desktop drives in a large RAID array, with a rather fault-intolerant parity RAID implementation, YOU ARE AN IDIOT AND DESERVE WHAT YOU GET.

    The hard disk vendors aren't idiots. They make nice, fat margins on drives that get used primarily in RAID 5 arrays, and they're not about to let that revenue stream dry up. You'll have to pay a little more for the RAID-optimized drives, but the price gap between bargain SATA and RAID SATA is much smaller than the gap between IDE and SCSI, so it's still worth it to an awful lot of people.

    --
    There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
  88. Smaller format HDDs more reliable? by failedlogic · · Score: 1

    I think anyone talking about HDDs here, whether SCSI, ATA, Fiber or SATA, is talking about the 3.5" form factor.

    If you're willing to negate performance, storage size and space, are the 2.5" or 1.8" Hard drives less *prone* to failure than the 3.5" ones? Say I want to be reckless, back up to a few DVDs and smaller format HDDs - different batches maybe different manufactureres. Unplug them onece copied.

    Personally, I have well under > 10 GB of REALLY essential work, photos, etc. I mention this at least becuase a drive of equal or grater capacity in 1.8" FF would be affordable. The rest I could care less about. So how do I best protect it?

    As the article mentions ZFS, would standardizing to one file format - Windows, Mac, Unix, Linux be a good idea with ZFS? I had a Mac and there's all kinds of crap software I had to install to get Windows to recognize the drives. Apple is considering ZFS. Linux its an option(?) and the BSDs (are working on it?).

  89. Backup is not an issue by Anonymous Coward · · Score: 0

    Just encrypt all your data, rename it to $porn_dvd.part_$n and throw it on any filesharing network you like.
    Your data will be safe for as long as the internet exists and, due to the increasing number of broadband users, sharing 12 TB of data is not a problem at all.

  90. Re:12TB at home!?!? by Anonymous Coward · · Score: 0

    who has 12tb of data at home? i can only imagine that you have a nice media server. in which case, all of your music and movies should already be backed-up for you on the original media. maybe you should consider your downloaded porn expendable in the event of a disk failure. 12tb of family photos? maybe you should get from behind that camera and actually interact with your family.

    if you really have 12tb of data at home that you cannot afford to lose, then you can afford to setup another box with 12tb of capacity and sync them.

  91. does google use tape backups? by Anonymous Coward · · Score: 0

    I have no idea. But I don't. I use multiple servers. Hard disks are cheap.

    I run linux software raid. If a drive fails, I'm notified, and then scripts stop samba, sendmail, ftp, etc. When they are stopped, the system checks two different secondary servers, which are also raid 5, and it will rsync the delta over to one secondary, and then the other secondary.

    Compare the number of bytes involved in an emergency rsync, the last regularly scheduled rsync being at most 8 hours old, to a full blown rebuild of a drive.

    Teensy weensy. Even with two complete duplicates. And one of those secondary systems contains rsync snapshots covering 12 months of history.

    Then the primary machine waits for my intervention, limping along in degraded mode. I can do a rebuild with a hot swap that is already installed. Or, if the drives are already fairly old, I replace the whole set.

    Never needed tape. Too slow. Too expensive. Not very flexible.

    My Primary system is 4 TB (1 TB worth of parity)

    Secondary One is 4 TB (1 TB worth of parity)

    Secondary Two is 8 TB (2 TB worth of parity)

    The Secondary Two has 60 days of snapshots, and a monthly snapshot for the last 12 months.

    Drive failures are lowered dramatically if you take a $5 case fan and put it into position to blow over the top of each drive.

  92. RAID-60 by c-reality · · Score: 0

    How about RAID-60? You can lose up to four drives (two per sub-array).

  93. almost certain???? by sportster · · Score: 1

    OK. I'll bite... " it is almost certain it will see an [unrecoverable read error]." What? Like most stats the 12 terabytes failure rate doesn't mean anything. If it were true, then the servers I have running 4 terabyte arrays would be failing all the time.

  94. DVD? by elzurawka · · Score: 1

    What's all this talk about DVD? Why not use Blu-ray? What's it got, 50 gigs? A little better... 240 discs for the example above where someone said 1600+ DVDs. Also, Blu-ray today has 2 layers. In the future I am sure that will increase, or the next optical medium will be even denser. That being said, I still believe that SSD is the way of the future.

    --
    -EL
  95. I noticed... by genw3st · · Score: 1

    ... or rather, haven't noticed, the fact that nobody has realized this article was written in mid-2007?

    Why has it taken this long to hit slashdot? Is there a time warp here...

    1. Re:I noticed... by Lord+Bitman · · Score: 1

      Don't worry, I'm sure it made it here then, too.

      --
      -- 'The' Lord and Master Bitman On High, Master Of All
  96. logical flaw in the premise of this story by magoo75 · · Score: 1

    I work for a storage company. And have been in the storage industry for about 8 years now. And here is the problem with this whole story. "...data from the failed drive, it is almost certain it will see an [unrecoverable read error]." Why is this almost certain? I haven't seen this happen...ever...to the extent that the RAID set wouldn't rebuild. That's, like, the job of RAID 5. Yes, it will take longer to rebuild 2TB drives. And people are building faster RAID controllers. You can always do RAID 1 (which isn't dead, but sure is damn expensive). And with the advent of SSDs...and RAID controllers with a shit load of write cache...you can do what TMS does and RAID 1 from your 128GB SSDs (however many you may have in your array) to a set of SATA disks (which can themselves be mirrored). This way you have multiple failure domains. And although you lose performance, you don't lose DATA. There are always smart people out there to solve such problems...but I see that it's quite sexy to just call something dead. I like the term (from Princess Bride) "Mostly Dead".

  97. RAID 5 sucks. by Anonymous Coward · · Score: 0

    Sometimes RAID 5 drives fail because of faulty controllers. I've had several drives "go bad" but they were perfectly fine. RAID 5 sucks. Takes too much time to rebuild. I'll only mirror or mirror+stripe from now on. Storage is inexpensive enough that avoiding down-time from rebuilding is worth spending more money on a few more disks.

  98. Lies! by dlucre · · Score: 1

    The article says "RAID 5 Stops working" blah blah blah. That's not the case. The purpose of RAID 5, as has been mentioned several times in the comments, is to give you MORE time to recover from a failed drive.
    No matter what size your array is, having more time to recover from any failure is invaluable. Therefore the availability of bigger drives does NOT invalidate the value that RAID5 has to an organisation.
    Having said that, RAID5 is NEVER considered the ultimate means of protecting your data. If you think it is, then think again. You must always have multiple copies of your data, in multiple locations, and preferably on magnetic media for long term storage if necessary.
    I have a collection of servers, all with various amounts and types of data. These servers have RAID arrays, some mirrored drives only, others with Mirrors and RAID5 for more important data.
    Each are incrementally backed up hourly to a "backup server" on to another RAID5 array.
    A daily backup is also taken, on to another path.
    Each night, this backup server's data is written to an LTO4 tape, and the next morning it is taken off-site.
    We also keep monthly tapes on-site, and off-site.
    At any one point, I can recover data to any server from up to an hour ago, as of last night from either yesterday's daily HDD backup, or last night's tape. Or at the end of every month for as long as we've had this backup strategy.
    This is the best backup strategy I could come up with on the budget I had, and I don't pretend that it's the best in the world.
    But simply relying on RAID 5 and nothing else, you're asking for serious trouble.

  99. Read scrubbing is the key by Terje+Mathisen · · Score: 2, Informative

    The only solution is to regularly read everything:

    The chance of avoiding double errors in the form of unreadable sectors during rebuild about doubles each time you halve the time between full reads of all sectors on a drive. (True to about weekly full reads.)

    This is because a full read will allow each drive in the array to discover sectors that are becoming iffy (soft/recoverable read errors) and then remap them.

    See lwn.net for a discussion and links to some good papers.

    Terje

    --
    "almost all programming can be viewed as an exercise in caching"
  100. Raid 5? by Migity · · Score: 1

    I've only seen Raid Ant Baits III. Where can I pick up some Raid 5?

  101. Re:Sounds.. well. Stupid by EdIII · · Score: 1

    Well played you Magnificent Bastard (TM) :)

    Nothing is perfect, but when you use the word perfect in a trademarked name it sounds really good.

  102. It stopped working already by rrohbeck · · Score: 1

    I've seen 3 cases of data loss this year in 6 drive RAID5 with 750GB or 1TB drives. One drive failed, and a second drive failed completely during the rebuild or there was a read error on another drive during rebuild. That is not a fluke given that rebuilding can take days if the system is under high load.
    Scrubbing costs too much performance if you want it to be useful. We're using RAID6 in newer products.

  103. Tapes are overrated. by TheLink · · Score: 1

    Tapes are overrated. As long as you are careful not to drop them, HDDs are pretty decent.

    AU$1000 buys you about five 1TB SATA drives.

    So AU$3K can buy you: 5 x daily 1TB drives, 5 weekly, and 5 monthly.
    AU$1K is enough to buy a "build it yourself from decent parts" server - decent power supply, 4 x SATA, 2 x 1Gbps NICs, a core 2 duo (so you can do gzip to two "backup media" drives at the same time ), 2x 1GB RAM (not important but hey it's cheap) and a UPS+power filter, and special removable caddies for those HDDs (make sure they won't overheat the HDDs).

    With tape drives, if there is new higher capacity tape technology, you will need to buy a new very _expensive_ tape drive to take advantage of it.

    With HDDs you get a whole drive mechanism along with your "media".

    If in 5 years' time there are cheap 10TB hard drives, you just buy them and use them.

    Whereas if in 5 years time there are cheap 10TB LTOx tapes, you will need an expensive LTOx drive.

    For the past decade or so that has been the trend.

    I bet the SATA interface will be around for a long time, so in 5 years you'd still be able to read _most_ of your backups. Even if the bearings seize up due to age and lack of use, the data is likely to still be on the platter.

    Now if you require _hundreds_ of tapes worth of storage, then using tapes as your "media" may become more viable than using HDDs.

    But I get the impression you're not facing that scenario, so HDDs should be fine.

    There are special caddies for HDDs, so that you can plug and unplug them.

    So you could buy a Backup Server with a gigabit NIC or two, install the caddies, insert the drives, and then do backups.

    Seems doable to me. After all AU$4000 is a lot of money in Malaysia where I am (we're the cheap labour, "low brains" country).

    So, how much are they paying you in Australia to solve problems like this? ;).

    --
  104. The article is OLD and WRONG by Rui+del-Negro · · Score: 1

    If a sector fails during an array rebuild in RAID-5 (after a complete drive failure), you lose one stripe's worth of data (e.g., 64 kB x (N-1), where N is the number of drives); you don't lose the entire array. Following the article author's logic, if you have a read error in a single drive, then all the data on the drive is lost.
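
    As a rough illustration of the scale involved (a sketch only: the 64 kB chunk size is just the example figure above, the 7-drive, 2 TB layout is the article's scenario, and `stripe_data_bytes` is a made-up helper name), the data at risk from one unreadable stripe can be compared with the size of the whole array:

```python
def stripe_data_bytes(chunk_kib: int, n_drives: int) -> int:
    """Data held by one RAID 5 stripe: one chunk per drive, minus one chunk of parity."""
    return chunk_kib * 1024 * (n_drives - 1)

# Article's scenario: 7 drives of 2 TB each, i.e. 6 drives' worth of data.
lost = stripe_data_bytes(64, 7)
array_data = 6 * 2 * 10**12

print(f"one stripe: {lost} bytes ({lost / array_data:.1e} of the array)")
```

    This only holds if the controller actually carries on past the bad stripe, which is an assumption about the implementation rather than a given.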

    It's amazing that ZDNet would pay someone this clueless to write an article about this subject, it's amazing they published it without any verification, it's amazing that the article is still online (and essentially uncorrected) after almost one year, and it's even more amazing that someone decided to post this on Slashdot.

    1. Re:The article is OLD and WRONG by dirtyhippie · · Score: 1

      Sorta true. The problem is many RAID controllers treat a single error on a drive as being equivalent to a dead drive. Granted the article is sensationalistic gibberish, but there is a grain of truth in it.

  105. RAID 5 is already dead... to me at least. by Anonymous Coward · · Score: 0

    I just had this very same scenario happen to me about two weeks ago, except I only had about 1 TB and happened to have most of it backed up to an external drive. I had to manually rebuild the RAID array and I am now awaiting the arrival of my 1.5 TB hard drive. I am just going to manually back up the data on this drive and use my three 500GB hard drives in a RAID 1 array instead of RAID 5. It's not worth it to me to have a redundant system that doesn't work. I don't want to lose my entire collection of movies and music due to another poorly manufactured hard drive.

  106. bleh by Anonymous Coward · · Score: 0

    RAID was developed when the biggest hard drive was 100MB and people needed easy ways to prevent data loss. RAID was also for enterprises; only recently was it opened up to home users/gamers using RAID 0, 1, 10, etc. This doesn't surprise me one bit, yet people still make a big deal of it. The bigger the array, the more prone you are to disk issues and data loss during a rebuild.

  107. Picture worth 1000 words! by Anonymous Coward · · Score: 0

    50 pin ribbon cable?! in the same mess as all of those GLORIOUS amphenol cables?!.... connected to half height SCSI drive?!?!

    At least 2 of the drives appear to be SCA-2 80 pin drives... but one of them appears to be of the 4.3 gig vintage..... (maybe a 9.1?)

    (And seriously... what SCSI cards are you using again?!)

  108. Missed point of TFA (and S) by AySz88 · · Score: 2, Insightful

    Goodness, even the summary says "didn't back up? bummer!". Yes, we all know RAID only hedges against hardware failure. The point of this whole exercise is that RAID 5 doesn't even adequately help with hardware failures once data per drive grows large enough.

  109. U in URE = "uncorrectable" by AySz88 · · Score: 1

    URE stands for uncorrectable read error, so corrections via ECC should already be factored into that spec.

    1. Re:U in URE = "uncorrectable" by Free+the+Cowards · · Score: 1

      I understand that, it's just that using ECC gives you very good control over the error rate, and I would have thought that manufacturers would want to push it much lower than 1/10^14.

      --
      If you mod me Overrated, you are admitting that you have no penis.
  110. Why ZFS? Real-time replication, not ZFS. by chrysalis · · Score: 1

    Why are you describing ZFS as the only option, are you working for Sun?

    Real-time remote replication and distributed storage are real alternatives to RAID 5 or 6.

    No need to use Solaris. There's a ton of very efficient tools to do that on Linux, like the excellent Zumastor project.

    --
    {{.sig}}
    1. Re:Why ZFS? Real-time replication, not ZFS. by asaul · · Score: 1

      Because so far ZFS is the only LVM+Filesystem I am aware of that can resilver only the data you use, rather than the entire disk.

      In the case of the array of seven 2 TB drives, if you only have 2 TB of data on the array, you *have* to read all of the other 6 disks in a failure to repair the RAID - so 12 TB of reads and 2 TB of writes to repair the array.

      With ZFS RAID-Z you would read ~2 TB of data and write roughly 300 GB. That means you have far less chance of encountering another error while the data is repairing.
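
      To put rough numbers on that difference (a sketch under assumptions: the 1-in-10^14-bits URE rate is the spec figure quoted elsewhere in this discussion, errors are treated as independent, and the function names are invented for illustration), compare how much has to be read in each case and how many read errors that would be expected to produce:

```python
def rebuild_read_bytes(n_drives: int, drive_bytes: float,
                       used_fraction: float, allocation_aware: bool) -> float:
    """Bytes read from the surviving drives to rebuild one failed drive."""
    survivors = n_drives - 1
    # A block-level rebuild reads whole disks; an allocation-aware one reads only used space.
    per_drive = drive_bytes * (used_fraction if allocation_aware else 1.0)
    return survivors * per_drive

def expected_ures(bytes_read: float, ber: float = 1e-14) -> float:
    """Expected number of unrecoverable read errors at `ber` errors per bit read."""
    return bytes_read * 8 * ber

TB = 10**12
used = 2 / 12  # 2 TB of data on an array with 12 TB usable
full = rebuild_read_bytes(7, 2 * TB, used, allocation_aware=False)
aware = rebuild_read_bytes(7, 2 * TB, used, allocation_aware=True)

print(f"whole-disk rebuild: {full / TB:.1f} TB read, {expected_ures(full):.2f} expected UREs")
print(f"used-data rebuild:  {aware / TB:.1f} TB read, {expected_ures(aware):.2f} expected UREs")
```

      The expected error count simply scales with the amount read, so a rebuild that touches a sixth of the data sees about a sixth of the exposure.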

      Of course, if your filesystem is full you have to do the lot, but for anything less than that ZFS wins.

      --
      "If everybody is thinking alike, somebody isn't thinking" - Gen. George S. Patton
    2. Re:Why ZFS? Real-time replication, not ZFS. by swordgeek · · Score: 1

      Ummm...ZFS has a lot more to offer than replication/snapshots.

      Someone else mentioned resilvering. That's a big bonus, in that it cuts down on resync time, sometimes drastically.

      However, disk scrubbing is even bigger--reading ALL of the data from disks and confirming the integrity of it, finding unreadable blocks and relocating them.

      And what happens when a block is readable but corrupted (i.e. bit-rot)? Hah--we have CRC checksums to detect that, and correct it!

      Now THAT'S resiliency!

      --

      "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  111. Thanks! What was spec? by AySz88 · · Score: 1

    Thanks for the real-world data! Out of curiosity, what was the claimed URE rate on those drives?

    The reason I ask is: It looks like your observed rate is better than 1 in 10^15 with 96% confidence, and probably nearer to 10^16. Getting an extra order of magnitude (i.e. from enterprise drives, which are 10^15) would be pretty impressive. But it would be quite astonishing if they claimed only 10^14 and actually gave 10^16 instead.

  112. (...or also "unrecoverable") by AySz88 · · Score: 1

    Whoops, technically it seems the "unrecoverable" expansion is more popular than "uncorrectable". But in either case, ECC should already be factored in.

  113. Wisdom follows, pay attention! by Anonymous Coward · · Score: 0

    I think the journalist should be fired for ethics violation. If not, the RAID manufacturers should bring the ZDNET news website to court over defamation charges. The journalist spread false information, fully knowing that it was false information!

    He was totally aware that true hardware based RAID-5 controllers regularly and automatically check the attached disks' surfaces to find sector errors and advise disk replacement preemptively, therefore a spot error on another drive during an array rebuild simply won't happen, period.

    This false and defamatory article by the journalist is similar to yelling Fire! in a crowded theatre when there is no fire actually. That is NOT protected speech under the First Amendment and can be prosecuted, because it presents great danger to the patrons (i.e. a lethal stampede).

    Similarly, sedition against RAID-5 exposes server operators to data loss (financial loss, loss of life-saving medical records, etc.) when they abandon RAID protection based on the journalist's false propaganda.

    In time of war people spreading panic via false information are summarily executed for good. I do not think journalists should be allowed to spread lies if they know it is a lie or even if they simply omitted checking the information's veracity.

    They should be held responsible, because with freedom of speech there comes responsibility for the content of their communication.

  114. I find the article misleading by nvatvani · · Score: 2, Interesting

    Firstly, the core determinants of HDD failures are:

    • Number of writes per second
    • Number of reads per second
    • Revolutions per minute
    • Environmental conditions, i.e. - temperature, humidity, etc...

    The studies by CMU and Google are not broken down at the application level, i.e. - what purpose were the HDDs serving. For example an HDD serving as an archive will perform differently from an HDD doing constant defragmentation, for the sake of example, or other read/write intensive functions as compared to archiving.

    Such a mashing is therefore "unfair". But ok, let's take the numbers produced by CMU and Google. Their rates of failure do seem to threaten RAID 5's (and other RAIDs') reliability with increasing disk sizes. This issue is immediately resolved by the RAID controller - but yes, it means an extra performance penalty for the RAID implementation.

    As such, RAID 5 will not die. It's the RAID controllers that need to be more intelligent, at the expense of performance.

  115. just like by Anonymous Coward · · Score: 0

    putting all your important data on one disk, is putting it all on one raid now also "eggs in one basket"?

    So - 2TB RAIDS for everyone - and everything is happy.

    I mean, who really needs to keep all their data in ONE place anyway?

  116. I do not want to state the obvious... by OhneWorte · · Score: 1

    ...but nobody *forces* you to wait with the replacement of a harddrive until it breaks. You may do it earlier and regularly.

    Ohne Worte

    1. Re:I do not want to state the obvious... by dirtyhippie · · Score: 1

      Doing that might actually increase the chance of a catastrophic failure. If you are replacing drives, say, twice as often, you have twice as much time that the array is in a degraded state and an unrecoverable error can happen and hose the whole thing.

  117. Price by rusl · · Score: 1

    Yeah the oldies were more reliable. But... I did some archaeology on an old 386 desktop computer and the HDD still had the price tag on it. $630 for a 120MB drive!!! Nowadays we expect to get a whole home computer at that price and with all the latest. Obviously the production has become more efficient in terms of price but we make computers more and more as disposable and so we get what we pay for.

    --
    Stupidity is its own reward.
  118. Sticking to RAID 0+1 since 1997 by Anonymous Coward · · Score: 0

    - Disks are a lot cheaper than Data
    - Disk IO is the major bottleneck

    That is why I have been sticking to RAID 0+1 since 1997.

    and regarding the comments on offsite backup, you don't need flying monkeys to transfer 12TB of data every backup. Haven't you heard of rsync and database replication???

  119. Ignore: Re:Don't panic! by Christian+Smith · · Score: 2, Informative

    Oops, selected wrong moderation option. This reply is to wipe that moderation.

    1. Re:Ignore: Re:Don't panic! by Anonymous Coward · · Score: 0

      + 4 Informative!

  120. wrong, wrong, wrong by Anonymous Coward · · Score: 0

    Well... 1 failed read per 10^14 bits PER PHYSICAL DRIVE. In the example you have 6 drives, EACH having STATISTICALLY 1 failed bit per 10^14 reads. Since the drives are smaller than 12TB, we're quite safe. Until we get an array of 12TB drives :)
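
    For anyone wanting to check this claim, here is a quick sketch, assuming the spec means independent errors at 1 per 10^14 bits read (the helper name `p_at_least_one_ure` is made up for illustration). If errors are independent, it should not matter whether the bits come from one 12 TB drive or six 2 TB drives; what counts is the total number of bits read.

```python
import math

def p_at_least_one_ure(bytes_read: float, ber: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error, assuming independent bit errors."""
    bits = bytes_read * 8
    # 1 - (1 - ber)**bits, computed in a numerically stable way for tiny ber
    return -math.expm1(bits * math.log1p(-ber))

TB = 10**12
per_drive = p_at_least_one_ure(2 * TB)       # one 2 TB drive read end to end
six_drives = 1 - (1 - per_drive) ** 6        # six such drives, each treated separately
one_big_read = p_at_least_one_ure(12 * TB)   # the same 12 TB treated as a single read

print(f"per drive: {per_drive:.3f}, six drives: {six_drives:.3f}, 12 TB: {one_big_read:.3f}")
```

    Under that assumption both routes give the same answer, somewhere around 0.6 for the full 12 TB. Whether real drives actually behave like the spec sheet is a separate question.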

  121. Oh it was expedited alright... to the graveyard by Immerial · · Score: 1

    I'll also tell you it's pretty serious when you see a Fedex plane with a cargo fire in the evening news and later find out that your shipment was on it. And oh... it's not covered because it was considered an 'Act of God'.

    1. Re:Oh it was expedited alright... to the graveyard by turgid · · Score: 1

      Why does god see fit to test the faith of Fedex when innocent people's goods are aboard? Can I get a witness? ImitationEnergy (993881), are you there?

  122. For cheap to-disk backups by marcosdumay · · Score: 1

    There is a very nice program called rdiff-backup, useful for cheap disk-to-disk backups. It incorporates incremental changes at your current backup, making it equal to the more recent version and keeping incremental deltas that you can apply to get the old versions (the reverse of incremental backups). Of course it isn't as reliable as proper disk-to-tape backups.

    Now, about your situation, I bet all those terabytes don't have the same importance for the company. Are you sure that you can't provide extra protection for some of the data?

  123. Mac OS X by Anonymous Coward · · Score: 0

    License-incompatible with anything worth running it on, other than Solaris itself... which is NOT worth running (see #1 above)

    What you mean to say is "Some Operating Systems whose merits can be debated are license incompatible with the license of ZFS." FreeBSD can implement ZFS. Why can't Linux? Because of its license, not that of ZFS.

    Mac OS X.

  124. Even so by anubis7733 · · Score: 2, Insightful

    Even if it was feasible to buy all these hard drives or a tape drive, the amount of time it would take to properly do all these back-ups on a useful time scale seems to be beyond the reach of the typical user. Even power users do other things in their lives than worry about their computers. I can't see somebody with enough free time to make CD or DVD or tape backups every so often. And if you are copying your whole 1+ TB drive then it would take forever. It may just be that because I'm a college student I have less time than most people with normal jobs, but I see my dad come home late from work almost every day, and then he's just too tired to want to do anything else. So maybe this whole discussion just becomes irrelevant because not too many people realistically have the time to be able to do all this backing up, and would rather just take the risk of running a RAID setup.

  125. Re:Carefully protected? (a very non-general reply) by Jeppe+Salvesen · · Score: 1

    Too expensive? Really?

    I pay about 13 USD/month for 90 gb of images at S3. I'm a hobby photographer, and those 90 gbs are primarily RAWs along with a few PSDs etc. In case you're wondering why I have 90 gbs of images..

    I don't bother protecting my music and my movies etc. I've got lots of legal DVDs and CDs.. And my mailbox is IMAP - so no problem there. My contact list is synced to my phone both at home and at work. So, for 13 USD/month I've insured myself against losing 5 years of my life.

    --

    Stop the brainwash

  126. Old and ripoff from even older paper by theskov · · Score: 1

    The article is a year old, and basically just rehashes part of this paper from 2004...

  127. Skip DROBO, go build an unRaid box by Overzeetop · · Score: 1

    Seriously...look it up. If you've got only 3 drives it's free, but go ahead and pay for the normal version. Mine works amazingly well; I have about 2.5GB of data including my DVD and CD collections. It runs great, is expandable, and fails more gracefully than RAID5. If one drive fails, you rebuild; if two drives fail you'll only lose one drive's worth of data. Bad, but not as bad as losing the whole array. I haven't kept up with the software (it works, therefore I don't mess with it), but there were plans a few months ago to implement a hot-spare option in case of drive failure.

    You'll need a separate PC box, but for just a few drives they can be had fairly cheaply.

    --
    Is it just my observation, or are there way too many stupid people in the world?
  128. Re:Carefully protected? (a very non-general reply) by Sobrique · · Score: 1

    In case you're wondering why I have 90 gbs of images..

    On Slashdot, you don't need to make excuses for the size of your porn archive. We understand.

  129. I think the URE is being misinterpreted.... by Lunch2000 · · Score: 1

    In the post the writer states that SATA drives have a URE rate of approx. 12 Terabytes. I would think that the URE rate applies to a single drive. Meaning that on any given drive you can expect for every 12 TB you will get an error. The key here is *for any given drive*. Therefore isn't it erroneous to apply that error rate across multiple drives? Every drive resets the chances back to 12 TB limit. Since drives are still limited to about 1-2 TB in size I would think this issue is years off. When drives are 12 TB then there will be issues.

    Can someone tell me if I am wrong?

    1. Re:I think the URE is being misinterpreted.... by Anonymous Coward · · Score: 0

      You are wrong.

  130. Re:Carefully protected? (a very non-general reply) by Lukey+Boy · · Score: 1

    When I last priced it out for the amount of data I have it was about 50 USD a month, and being in Canada that cost has been higher and higher recently. Plus, with my current setup of Bacula and DVD-RWs there's no monthly cost whatsoever, and I get much faster recovery times. Recently when my wife's laptop drive crashed she was back up and running a couple hours after I bought a replacement disk. With S3 I'd have to wait while my DSL connection downloaded over 40 gigabytes. Add in two more computers being backed up and that's a very large cost in time when recovering (or even backing up).

    Again though I'm talking about an ideal solution for me.

  131. Anonymous Coward by Anonymous Coward · · Score: 0

    Read error = another failed drive.

    As most failed drives don't completely self destruct, you can recover that data in the given scenario. A better solution (sans backup) is to initially mix the branding and hardware revision of the drives in a RAID 5 array, and have spares on hand, with a hot spare already in the group and an aggressive rebuild strategy.

  132. "Informative"?! by Anonymous Coward · · Score: 0

    Oops, selected wrong moderation option. This reply is to wipe that moderation.

    Fair enough, but the two people who modded this "informative" should have replied to you for the same reason ;)

  133. Mirroring is cost-effective by natoochtoniket · · Score: 1

    Disk mirroring (aka "RAID 1", or "RAID 10" when combined with striping) is easy to set up on most systems any more. Disks get cheaper every year. The original reason for RAID was to minimize hardware costs while providing some redundancy for failure-recovery, back when a 1 gig disk used to cost upwards of $10,000. As hardware costs have declined, the financial incentive to cut these corners has also declined. With disks well below $1000, there really isn't much reason not to keep online mirrors and offline full image backups.

    At home, now, I keep critical data on mirrored disks, and rotate several other full image backup disks offline. All the disks are identical make/model, and all are bootable. The offline disks are stored in a fireproof safe. I periodically send a disk to my brother just in case of fire or flood. When my critical data grows to fill one of those, I will get a half dozen new disks, each 3 or 4 times bigger than the old ones.

    If you have multiple terabytes of personal data, you might want to consider if all of it really needs to be backed up. The quicken file is important. The sixty-fifth five-hour video of the sleeping baby might not be as important.

  134. Re:Sounds.. well. Stupid by Sangui5 · · Score: 1

    You mean "scrubbing"; I don't know what Promise's patent covers, but basic scrubbing and self-diagnostic monitoring has been built in to commercial-grade arrays since forever.

  135. From what i can tell... by pjr.cc · · Score: 1

    The point of that article was how to present poor statistical analysis (being someone who hated statistics at uni, I can see my own half-arsed attempts in the article).

    Or perhaps it's fairer to say "poor statistical analysis accompanied by real-world lack of knowledge".

    Now, if you work with SANs and storage, you probably already see the faults that I really can't be bothered pointing out... we can just sit here and laugh together at yet another ill-conceived zdnet article, shall we?

  136. Scrub is no substitute for ZFS by toby · · Score: 1

    A successful RAID scrub depends on perfect error reporting. ZFS does not.

    --
    you had me at #!
  137. ZFS protects your data *better* by toby · · Score: 1

    ...than RAID can, by design. That's the OP's point here. Disk failures are far from the only failure mode; and many failures are neither detected nor reported.

    --
    you had me at #!
    1. Re:ZFS protects your data *better* by Matt+Perry · · Score: 1

      Yes, but why? How does it protect my data better? I was unable to find anything when I did a search on ZFS.

      --
      Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
  138. Talk about timing... by Anonymous Coward · · Score: 0

    Is Slashdot reporting getting slower or is it just me? The publication date on this article is July 2007. I'm sure that the marketdroids (and tech guys too) have had plenty of time to get ready for the Slashdotting...

  139. Re:Thanks! What was spec? by gweihir · · Score: 1

    Thanks for the real-world data! Out of curiosity, what was the claimed URE rate on those drives?

    The server is still up, if only as fallback, so I can look this up. My original numbers were only rough guesses.

    One is set of 4 Seagate ST3400832A (400GB, i.e. 1.6TB total), with the disks showing 31321h power-up time (that is 3.6 years. The second one is 8 Seagate ST3500641AS (500BG, 4TB total), with disk uptimes of 20548h, i.e. 2.3 years. I did reduce the scrubbing interval to 1 every 30 days about a year ago, so lets factor that in.

    Incidentally, there is an error in my original numbers: 2 years at once every 15 days is 50 complete reads, not 100, i.e. 200TB read. Sorry. Better numbers:

    1st Array:
    2.6 yrs @ 15 day interval, 1.6TB = 63 scans, 1.6TB = 101 TB read
    1 yr @ 30 day interval, 1.6TB = 12 scans, 1.6TB = 19 TB read

    2nd Array:
    1.3 yrs @ 15 day interval, 4TB = 18 scans, 4 TB = 72 TB read
    1 yr @ 30 day interval, 4TB = 12 Scans, 4 TB = 48 TB read

    That is 240 TB read in surface scans alone, without a single uncorrectable. There is one disk with 4 reallocated sectors and one with 5, the rest is at zero.
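
    For anyone who wants to re-add the total, a small Python sketch that takes the scan counts and array sizes exactly as listed above and sums them. The variable names are mine.

```python
# (scans, array size in TB) for each scrub period, as quoted above
scrub_periods = [
    (63, 1.6),  # 1st array, 15-day interval
    (12, 1.6),  # 1st array, 30-day interval
    (18, 4.0),  # 2nd array, 15-day interval
    (12, 4.0),  # 2nd array, 30-day interval
]

# Every scan reads the whole array once.
total_tb_read = sum(scans * size_tb for scans, size_tb in scrub_periods)
print(round(total_tb_read))
```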

    As to the uncorrectable rate, Seagate says 1 in 10^14 reads is uncorrectable, i.e. a 512 Byte sector read, if I interpret this correctly. That would mean one unreadable sector every 5.12 * 10^16 Bytes, i.e. one sector missing every 51 PB. That would fit the observation. Seems there is a major combinatoric goof in the original article. It is still true that one in 10^14 bits read independently (!) would be unrecoverable, but the independence is not there, as you either get a complete sector or nothing.
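
    The two readings of the spec differ by a factor of about 4000, which a few lines of Python make visible. This is only a sketch of the unit conversion; which reading the vendor actually intends is the open question, and the constant names are mine.

```python
SECTOR_BYTES = 512
READS_PER_ERROR = 10**14  # vendor figure: one unrecoverable error per 10^14 "reads"

# Reading 1: a "read" is one bit (what the article assumes)
bytes_per_error_if_bits = READS_PER_ERROR / 8
# Reading 2: a "read" is one 512-byte sector (the interpretation above)
bytes_per_error_if_sectors = READS_PER_ERROR * SECTOR_BYTES

print(bytes_per_error_if_bits / 1e12, "TB between errors")
print(bytes_per_error_if_sectors / 1e15, "PB between errors")
```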

    To spin this further, if it really is 1 in 10^14 sectors, an 8 disk RAID5 array with 1TB disks has about 1.36*10^10 sectors to read and hence a chance of one unrecoverable sector of roughly 1 in 7300 during RAID5 rebuild. As to the 3% failure rate per year number, at 10MB/s rebuild write speed (conservative), the rebuild takes 28 hours, i.e. adding another roughly 1 in 10000 chance of a disk dying during rebuild.
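
    The same back-of-the-envelope numbers as a Python sketch, assuming decimal terabytes, the per-sector reading of the URE spec, and that the 7 surviving disks are read in full. The disk-failure figure is per single disk, using a simple linear share of the annual rate.

```python
DISK_BYTES = 10**12              # 1 TB per disk (decimal)
SECTOR_BYTES = 512
SURVIVING_DISKS = 7              # 8-disk RAID 5 with one disk failed
SECTOR_READS_PER_ERROR = 10**14  # assumption: URE spec counted per sector

# Sectors that must be read to rebuild the failed disk
sectors_to_read = SURVIVING_DISKS * DISK_BYTES // SECTOR_BYTES
# "1 in N" odds of hitting one unrecoverable sector during the rebuild
one_in_n_ure = SECTOR_READS_PER_ERROR / sectors_to_read

REBUILD_MB_PER_S = 10
ANNUAL_FAILURE_RATE = 0.03
# Time to write one replacement disk at the rebuild speed
rebuild_hours = DISK_BYTES / (REBUILD_MB_PER_S * 10**6) / 3600
# Share of the annual failure rate that falls inside the rebuild window
p_disk_dies = ANNUAL_FAILURE_RATE * rebuild_hours / (365 * 24)

print(sectors_to_read, round(one_in_n_ure), round(rebuild_hours, 1), round(1 / p_disk_dies))
```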

    Interesting...

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  140. Mod parent up by Anonymous Coward · · Score: 0

    Mod parent up. So many idiots and FUD here it's not funny.

  141. RMS will never die! by Medievalist · · Score: 1

    Because root-mean-square is the ONLY way to get the REAL voltage.

  142. 7 disks is too big for RAID 5. by mr_mischief · · Score: 1

    RAID 5 is designed for 3 to 4 disk arrays. RAID 6 should be used up to 6 or 7 disks. Anything bigger, and you really need something like RAID 1 over RAID 5 or a clustering file system. AndrewFS, Lustre, or anything similar can make the chances of losing every copy of your data extremely unlikely.

    You should still have data somewhere off-site just in case. A Lustre system at your primary location and a smaller Lustre system for compressed backups at a secondary location should be plenty of redundancy.

  143. Re:Sounds.. well. Stupid by Anonymous Coward · · Score: 0

    Stupid Promise Vendor. Nice job marketing your Solution.

    The Problem that is being described will have little effect on the home user but is a looming problem for the Enterprise user, where data sets are getting out of hand even today. People may laugh at someone for having a 12TB or larger filesystem as not workable and definitely not easy to index or back up. How many LTO 4 tapes is that?? But the reality is that some Enterprises have filesystems that size and even larger. Some have files that are 1TB+ today. Think Oil and Gas researchers, who deal with huge datasets.

    Whether you have one controller or a dozen controllers makes no difference: since it is still one common filesystem, you are guaranteed to run into the problem that is described in the article. The only way to work around it is to leverage new technologies and techniques that are not all well defined yet. RAID 6 will become the standard to replace RAID 5, just to get past the next few years, as described in the article. New SCSI standards will also be developed (think SCSI-4), as well as techniques like DIF (Data Integrity Field) that add metadata to hash each block of data to allow for error correction from the base hardware all the way into the application stack.

    Make no mistake, we are about to turn a corner, and the next few years are going to change everything about how we store and retrieve data, simply because the old paradigms have run their useful course.

    Good luck and thanks for all the bits.

  144. This article = FUD by mizzouxc · · Score: 1

    First, ZFS is trash. It's not even complete. There are over 300 fixes in the current patch cluster from Sun. Not to mention there are no robust tools to fix it once it breaks. Did I mention software raid rocks? Secondly, hardware raid isn't going away. Finally, the article doesn't take into account increased throughput of new drives that will be developed.

  145. Be careful with rsync! by grahamsz · · Score: 1

    The one that almost bit us in the ass was a setup where we had a staging environment that had a few hundred gigs of files and would use rsync to push them out to a handful of webservers across the globe.

    The disk array that was on the staging machine failed to come up one day, so rsync dutifully synced up the empty /mnt/staging mount point to all the webservers.

    In some backup situations (certainly my personal stuff) it's usually acceptable to tell rsync to never delete stuff on the other end. Sure you can still corrupt one file and propagate the corrupt version, but you are less likely to wholesale blow away your backup.
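
    A minimal sketch of the kind of guard that would have saved us, in Python: refuse to run if the source is not actually a mount point or is empty, and only pass --delete when explicitly asked. The function name and the example paths are hypothetical; -a and --delete are the standard rsync options.

```python
import os


def build_rsync_command(source_dir, destination, allow_delete=False):
    """Return an rsync argv list, or None if the source looks unmounted or empty."""
    # Refuse to sync if the array did not mount or the directory is empty.
    if not os.path.ismount(source_dir) or not os.listdir(source_dir):
        return None
    command = ["rsync", "-a"]
    if allow_delete:
        command.append("--delete")
    # Trailing slash: copy the directory's contents, not the directory itself.
    command += [source_dir.rstrip("/") + "/", destination]
    return command
```

    Something like build_rsync_command("/mnt/staging", "web1:/srv/www/") would then return None on the day the array fails to come up, and the caller can bail out and page someone instead of pushing an empty tree.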

  146. Re:Sounds.. well. Stupid by pdh11 · · Score: 1

    The RAID 5 this guy is talking about is controlled by one STUPID controller. There are a lot of methods, and patented technology that prevent just the situation he is talking about. Here is just one example: [...]
    Bad Sector Mapping and Media Patrol: These features scan the system's drive media to ensure that even bad physical drives do not impact data availability

    I'm no RAID expert, but surely the 10^-14 bit error rate the guy is talking about is the rate of previously undiscovered errors? In which case, those errors won't have been found by a bad-sector sweep, nifty though that feature is for advance warning of other problems. His point is that one error in 10^14 bits, though it doesn't sound a lot, is actually one error every time you read 12TB of previously-working data. Which is what happens during RAID5 disk rebuild.
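
    To put a number on that, a quick Python sketch assuming decimal terabytes and that every bit fails independently at the quoted rate (that independence is itself an assumption). The expected count is just under one error, which works out to a bit better than even odds of hitting at least one.

```python
import math

BIT_ERROR_RATE = 1e-14       # one unrecoverable error per 10^14 bits read
bits_read = 12 * 10**12 * 8  # 12 TB (decimal) read during the rebuild

# Average number of errors over the whole read
expected_errors = BIT_ERROR_RATE * bits_read
# P(at least one error) = 1 - (1 - p)^n, computed stably for tiny p
p_at_least_one = -math.expm1(bits_read * math.log1p(-BIT_ERROR_RATE))

print(round(expected_errors, 2), round(p_at_least_one, 2))
```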

    Peter

  147. Beyond RAID by PythonRules · · Score: 1

    Drobo!

  148. How many times does this have to be said? by Tetsujin · · Score: 1

    How many times does this have to be said.

    RAID is not a backup.

    Yeah... How many times does it have to be said? I mean, seriously, it seems everybody else who commented here said the same thing...

    --
    Bow-ties are cool.
  149. hot spare? by Anonymous Coward · · Score: 0

    Most RAID controllers have the capability to maintain a hot spare. This spare is tested to check its integrity, and if you have a RAID disk go bad the hot spare comes online and reconstruction begins. Does this even protect you from URE? I guess not. But wasn't this always the case with RAID5? If you really care I guess you could do a RAID 1:5, two mirrored RAID 5's, or a cluster, or hell, I don't know. Just print the stuff out that you really want to keep.

  150. Re:Sounds.. well. Stupid by EdIII · · Score: 1

    Stupid Promise Vendor. Nice job marketing your Solution.

    I love how anytime somebody gives an example of a solution to a problem, an anonymous jerk on the Internet has to accuse them of working for the company's interests. That's a WONDERFUL way of trying to discredit me.

    I have purchased and set up a lot of RAID systems in my time and Promise was just ONE EXAMPLE that came to mind. Why? I just bought a Promise 4-bay NAS. That was in the marketing literature I had read before purchasing the unit.

    Promise is not the only one to have technology and methods like that. I would bet that Adaptec, 3Ware, etc. all have similar technology.

    So *fucking* excuse me. Next time I will take the time to find ALL the technologies from EVERY vendor and present them to /. That way jerks like you will have less opportunity to claim that I secretly work for Vendor A serving products from Manufacturer B.

    But... just for shits and giggles... why not try to respond to the technology I mentioned? All you have done is to parrot the article. The "problem described in the article" is less likely to happen with the technology that I mentioned. Plain and Simple. Respond to that. I'll wait.

  151. unRAID by Patrick_Meenan · · Score: 1

    It's still no replacement for offsite backups, but Lime Technology (http://lime-technology.com/) has a JBOD+parity solution that uses a parity drive to protect your data, and even in a 2-disk failure mode you only lose the data from those disks (or if the second was the parity drive you just lose the data from the one disk). And in the case of a single drive failure it'll limp along using the parity drive until you replace the bad disk and rebuild. The 3-drive version is free, but if you want to go bigger they charge a nominal cost (sub $100).
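
    For the curious, a toy Python sketch of how a single parity drive can rebuild one lost disk, assuming plain XOR parity across same-sized blocks. The disk contents and the helper name are invented for the example; this is not Lime Technology's code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three independent data disks (each would hold its own filesystem).
data_disks = [b"movies..", b"music...", b"photos.."]
parity = xor_blocks(data_disks)

# Lose disk 1: rebuild it from the parity drive and the surviving data disks.
rebuilt = xor_blocks([parity, data_disks[0], data_disks[2]])
```

    Because the data disks are not striped, losing a second disk only costs you what was on those two; the rest are still readable on their own.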

    It works as a great solution for a media server where you may not necessarily care about all the data but you'd prefer to not lose ALL of it at once should multiple drives fail.

  152. Nonsense by Anonymous Coward · · Score: 0

    The company can't afford to not have a proper backup system.

    You think of price per megabyte backed up. But you need to start thinking in price per megabyte "not restored".

  153. Everything You Know About Disks Is Wrong? Really?? by itsybitsy · · Score: 1

    They still spin, don't they? Oh drats, there are now disks (SSD) that don't spin since they are not disks but they otherwise look like disks... same form factors... sigh.. progress.

  154. Such as...? by Rui+del-Negro · · Score: 1

    Which RAID controllers do that, exactly? That's a bit like saying that SATA controllers will treat a drive with a single bad sector as being dead. I've never seen a controller do such a thing, and I doubt anyone making those controllers would stay in business for long.

    Some controllers will consider a drive as "missing" if it doesn't respond for more than 30 seconds or so, and some SATA drives had a bug (more of a design flaw when used in RAID) that made them spend up to 2 minutes trying to recover a bad sector before responding. The result was the controller assumed the drive had died and said the rebuild had failed. In other words, the problem was the delay, not the bad sector. Anyway, you could still restart the rebuild, but this could be painfully slow if you had a lot of bad sectors.

    This is not true for modern controllers, SCSI drives or "RAID edition" SATA drives, which never spend more than a couple of seconds trying to recover bad sectors (they simply give up and let the RAID controller handle it).

    I'm not sure if the SATA spec has been expanded to include an "I'm busy trying to remap a sector" drive state, which would obviously be the ideal solution, but "TLER" and longer wait times by modern controllers have made the problem essentially disappear, and I think the problem never existed with SCSI / SAS drives, which is what most people would use for "enterprise" RAID-5 arrays.

  155. Hot Spare by Marin3 · · Score: 0

    I guess you never heard of Hot Spare

  156. Re:Sounds.. well. Stupid by Abcd1234 · · Score: 1

    Personally, I believe the correct answer to ensuring data recoverability is RAID together with real-time replication.

    Like, say, RAID-1? :)

    Honestly, given the cost/GB ratios these days, the space advantages of RAID-5 seem pretty silly compared to the reliability and performance issues. Why not just go with something like RAID-10 and be done with it?

  157. dd conv=sync,noerror by Anonymous Coward · · Score: 0

    I've had this happen a few times, and in every case I managed to recover my data with some effort.

    What you need to do is to copy the bad-block disk to the new replacement disk via dd w/ the "conv=sync,noerror" option -- sync tells it to 0-pad bad blocks to length, and noerror tells it to keep reading in the face of errors.

    You then use the copy, which will have 0's in place of bad blocks, in place of the bad-block disk, and use the bad-block disk in place of the failed disk.
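
    If it helps to see what those two options do, here is a rough Python imitation of the copy loop: keep going past read errors and write zeros where a block could not be read. The reader/writer callables and the in-memory "disk" are stand-ins for illustration only; for a real recovery you would use dd itself.

```python
def copy_with_zero_fill(read_block, write_block, total_blocks, block_size=512):
    """Copy block by block; unreadable blocks become zeros. Returns bad block numbers."""
    bad_blocks = []
    for number in range(total_blocks):
        try:
            data = read_block(number)
        except OSError:
            data = b""  # like noerror: note the failure and keep going
            bad_blocks.append(number)
        # like sync: pad short or missing input with zeros to the full block size
        write_block(number, data.ljust(block_size, b"\0"))
    return bad_blocks


# Tiny demonstration with an in-memory "disk" whose block 2 is unreadable.
source = {0: b"A" * 4, 1: b"B" * 4, 2: None, 3: b"D" * 4}
copy = {}


def read_from_source(number):
    if source[number] is None:
        raise OSError("unrecoverable read error")
    return source[number]


bad = copy_with_zero_fill(read_from_source, copy.__setitem__, 4, block_size=4)
```

    The returned list of bad block numbers is worth keeping: it tells you which stripes to distrust after the rebuild.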

    Annoying and time-consuming, but better than losing 10T of data.
          kieran hervold

  158. this article is not realistic by RecycledElectrons · · Score: 1

    > With 12 TB of capacity in the remaining RAID 5
    > stripe and an URE rate of 10^14, you are highly
    > likely to encounter a URE. Almost certain, if
    > the drive vendors are right.

    So one sector (let's say 4kb) in 12 trillion bits is bad, so you're saying your RAID controller is so pathetic that it cannot continue?!?!?!

    WTF?!?!?!?

    > Oh, you didn't back it up to tape?

    TAPE?!?!?

    Try backing up to eSATA drives.

    Basically, this moron has no clue of how to actually use a PC.

    Andy

  159. CDDL vs. Linux by PhotoGuy · · Score: 1

    So a lot of people are saying ZFS is a great solution for a number of the issues brought up in the article.

    ZFS isn't available on Linux due to incompatibilities between the CDDL and Linux's GPL license.

    CDDL is (in my opinion) more "free" than GPL (which forces redistribution of code; but let's avoid that debate right now.)

    Okay, so if ZFS can't be bundled in a Linux distribution and redistributed while maintaining a GPL license, fair enough. But what is stopping anyone from doing the port, and providing ZFS as a freely distributed, do-what-you-want-with package that installs and runs fine on Linux? If a user chooses to take this freely licensed ZFS and compile/link/install it on their Linux system, that is none of Linux/GPL's business.

    Yes, redistribution is thwarted by the GPL, but why would install-it-yourself be problematic? I'd settle for that. Why isn't this available? Why hasn't anyone finished a port that can be used in this manner?

    Not trolling, honestly curious about it...

    --
    Love many, trust a few, do harm to none.
  160. Re:Sounds.. well. Stupid by cbreaker · · Score: 1

    Well, because it's still too expensive for a lot of businesses to go full RAID 10 on their main storage system.

    The disks you buy at NewEgg are cheap, but the disks you buy for your SAN are not as cheap. They might be the same disks, but that's just the way it is. And, the big costs come in the form of cost per slot, not necessarily the disk that plugs into it.

    RAID-5 doesn't suffer from any real performance issues. Not for the last 10 years, anyway. Read speed is as fast as a stripe set, and write performance hits are easily mitigated by on-board cache. I kinda thought this was common knowledge..

    Replication can be done to a much cheaper unit or DAS, and/or can be sent to an off-site location for better recoverability in the case of a real disaster.

    --
    - It's not the Macs I hate. It's Digg users. -