Compelling Alternatives to RAID Setups?
jabbadabbadoo asks: "Our software shop has about 30 Linux servers and 15 NT servers running enterprise applications for our customers. Since we have service level agreements with most of them, uptime is crucial. One of the things we've done is to use RAID setups extensively, using products from well-renowned disk and controller vendors. However, we have discovered the paradox that introducing RAID controllers actually reduces overall uptime! Not only does more 'steel' increase the probability of failure, but what fails first is usually the RAID controllers. What is your experience? Have we been having bad luck?"
"A related problem, especially on Linux, is that setting up RAIDs is actually quite a costly process. There seem to be endless problems with library versions, and upgrading existing servers simply takes too many hours. To keep the customers happy, we routinely have to create a 'shadow' server while upgrading, which in turn means we, at some point, have to synchronize data to the new server, which in turn means a bit of downtime. Ouch. Does anyone have a good solution to these problems? Of course, cost is a major issue, but so is uptime (which also means cost if we don't provide the uptime dictated in the SLA). What setup gives the best cost/uptime ratio? Thanks for any thoughts!"
I'll agree that setting it up is a nightmare. I'm currently helping test two 4TB arrays for use on a Linux box (16 SATA drives presented as a single SCSI device). Benchmarks under Linux are slower than under Windows, and it's a mess figuring out why. Meanwhile, vendors (who I will not name) ship crappy software and take months to act on bug reports.
As for transitioning servers, I've been there too. And yes, copying a terabyte of disk in a single pass is a very long process. It would have taken several days, which is of course unacceptable. This is where the magic of rsync comes in handy. Copy the data over several days in advance, sync it again just before the scheduled downtime, and you'll have a fairly short downtime.
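The two-pass approach can be sketched roughly as follows. This is only an illustration, not the exact procedure used above: the paths, the host name `newhost`, and the helper names `rsync_command` and `run_pass` are made up for the example, and the choice of flags (`-a`, `-H`, `--numeric-ids`, and `--delete` on the final pass) is an assumption about what a full server copy would want.

```python
import subprocess


def rsync_command(src, dest, final_pass=False):
    """Build the rsync command line for one pass of the migration.

    src and dest are hypothetical, e.g. "/srv/data/" and "newhost:/srv/data/".
    """
    # -a preserves permissions, owners and timestamps; -H preserves hard links.
    cmd = ["rsync", "-aH", "--numeric-ids"]
    if final_pass:
        # On the last pass, also remove files that were deleted on the
        # old server since the bulk copy.
        cmd.append("--delete")
    cmd += [src, dest]
    return cmd


def run_pass(src, dest, final_pass=False, dry_run=True):
    """Print the command (dry_run=True) or actually execute it."""
    cmd = rsync_command(src, dest, final_pass)
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Days before the move: slow bulk copy while the old server stays live.
    run_pass("/srv/data/", "newhost:/srv/data/")
    # During the scheduled downtime, with services stopped: only the files
    # changed since the bulk copy are transferred, so this pass is short.
    run_pass("/srv/data/", "newhost:/srv/data/", final_pass=True)
```

The bulk pass can be repeated as often as you like beforehand; each repeat shrinks what the final pass has to move.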
This is on a lower level than the RAID you are using, but we are having major problems with 10 Promise Technology TX2000 mirroring RAID controllers that we bought. The mirrors go critical for no detectable reason. Promise Technology's first-level technical support is unable to find the problem, and the company seems unwilling to follow up: the technicians escalate the issue, but second-level technical support never calls back.
Promise mirroring controllers on ECS (EliteGroup) L7VTA v 1.0 motherboards have the same problem. When we call ECS tech support, there is a recorded message saying they are busy and to call back later.
We've been supplying computers with Promise mirroring RAID controllers since the company began doing business, and we've had very few problems until now.
Possibly the problems are associated with newer, faster motherboards, or with AMD VIA chipset motherboards. We've never had problems with RAID controllers on Intel chipset motherboards.
Another possibility is that the RAID controllers are incompatible with DVD burner drivers that are installed with Roxio or Nero DVD burning software.
I spent the past week and a half trying to set up a 4x160 SATA RAID-5. It was a huge exercise in frustration because every time I tried to build a volume, my machine would promptly freeze after a few percent. I changed out IDE emulation for SCSI emulation in the kernel... same thing. I changed SATA controllers, same thing. I changed SATA cables, same thing. I changed power supplies, same thing. I added four 80 mm case fans, same thing. In the end, it turned out that the culprit was raidtools. Nobody had ever bothered to post that RAID-5 + raidtools + kernel 2.6 locks up a computer. I changed to mdadm, and I had a working array 50 minutes later.
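For anyone wanting to try the mdadm route, here is a rough sketch of how a four-disk RAID-5 create command might be put together. The array name `/dev/md0`, the partition names, and the helper `mdadm_create_command` are placeholders invented for the example, not the poster's actual setup; only the `--create`, `--level` and `--raid-devices` options are standard mdadm usage.

```python
def mdadm_create_command(array, level, devices):
    """Build an mdadm command line that creates a software RAID array.

    array and devices are hypothetical names such as "/dev/md0" and
    "/dev/sda1"; substitute whatever your system actually uses.
    """
    if not devices:
        raise ValueError("at least one member device is required")
    return [
        "mdadm", "--create", array,
        f"--level={level}",
        f"--raid-devices={len(devices)}",
        *devices,
    ]


if __name__ == "__main__":
    # Four SATA partitions as members (placeholder names).
    members = ["/dev/sda1", "/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]
    # Only print the command; running it requires root and destroys
    # whatever data is on the member devices.
    print(" ".join(mdadm_create_command("/dev/md0", 5, members)))
```

Once the array exists, `cat /proc/mdstat` shows the progress of the initial build.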