Experiences w/ Software RAID 5 Under Linux?
MagnusDredd asks: "I am trying to build a large home drive array on the cheap. I have 8 Maxtor 250G Hard Drives that I got at Fry's Electronics for $120 apiece. I have an old 500Mhz machine that I can re-purpose to sit in the corner and serve files. I plan on running Slackware on the machine, there will be no X11, or much other than SMB, NFS, etc. I have worked with hardware arrays, but have no experience with software RAIDs. Since I am about to trust a bunch of files to this array (not only mine but I'm storing files for friends as well), I am concerned with reliability. How stable is the current RAID 5 support in Linux? How hard is it to rebuild an array? How well does the hot spare work? Will it rebuild using the spare automatically if it detects a drive has failed?"
Do yourself a favour and buy some more or less cheap hardware RAID controllers. You won't regret it. Software RAID is nothing more than "showing it's possible".
Take it from me, stick with a hardware RAID 5. Reliability is through the roof, and cards are now around $300-500 for one with 128 MB of RAM. Since you spent 960 dollars on the hard drives, you might as well trust their organization to something of equal quality.
my 2 cents
Actually, a big disadvantage to hardware RAID is what happens if your controller fails.
Consider--your ATA RAID controller dies three years down the road. What if the manufacturer no longer makes it?
Suddenly, you've got nearly 2 TB of data that is completely unreadable by normal controllers, and you can't replace the broken one! Oops!
Software RAID under Linux provides a distinct advantage, because it will always work with regular off-the-shelf hardware. A dead ATA controller can be replaced with any other ATA controller, or the drives can be taken out entirely and put in ANY other computer.
Is there a good resource for hardware/software RAID support on linux? Tech support is always a challenge and we have a number of 3ware 8way and 12way powered by 250gb drives. We often have lots of mysterious drops on the array that require reboots or even rebuilding the array. Royal pain in the ass.
Laboratree - Scientific collaboration based on OpenSocial.
The guy could be looking for people's experiences in addition to any technical documentation, which is not only smart, but the hallmark of a responsible sysadmin (which knee-jerk comments typically aren't).
This is the worst possible type of advice. Do you have any reason for not using them? Maybe you've bought dozens and they've all blown up and burnt your house down, which would be a good reason to not buy 3Ware. Maybe you work for a competitor.
For all I know, you could have a very good reason. But if you tell someone to make sure to stay away from something, you should provide a reason. Especially if it's something that seems to have a really good reputation.
________________________________________________
suwain_2
The cheap raid controllers are almost always software raid and not worth it. If performance is critical some of the higher end SATA and SCSI raid stuff is worth it, but a lot of that sucks too so do benches, take recommendations and don't believe in brand names...
Um, perhaps my understanding is wrong, but isn't RAID5 intended solely for reliability (that is, for making the storage system tolerant of a single drive failure, and thus increasing its mean uptime)? If you want the data to stay safe then use a backup, not a RAID.
In general (not replying to your otherwise quite correct post, please don't feel browbeaten) I really wonder
a) why anyone would need the additional uptime in an in-home setting and
b) what the point of a generic IDE raid5 is anyway. When one drive dies, the system keeps running with the hotspare. On a commercial array (or using hot-pluggable storage like firewire) you can pull out the bad drive, put in a new one, and the system rebuilds that as the hotspare, all without any loss of service. But with regular ATA (and I guess SATA, although I'm not so sure) you can't hotswap, so you have to powerdown the array to swap in the new drive - at which point the reliability you got from RAID5 is gone. Hmm, well, I suppose it's less downtime than you'd have restoring from backups, but it's questionable if that's worth the ongoing performance hit the RAID5 (even a hardware one) would cause.
If all it does is serve files, it should do fine. The 500Mhz is not going to be a factor at all, in fact, the CPU will be idle most of the time. The real thing to optimize in a file server is the ATA bus speed and hard drive latency.
Don't hang a pair of drives off each controller. Get a truckload of PCI ATA cards or a card with multiple controllers. Don't slave a drive. (No, I do NOT know what the correct PC term is for this).
Also, give 'mdadm' a whirl - a little nicer to use than the legacy raidtools-1.x (Neil's stuff really rocks!)
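For anyone who hasn't seen mdadm before, here's a rough sketch of what creating an array like the one in the question might look like. It only builds and prints the command line; it doesn't run anything. The device names (/dev/sda1 and so on) and the seven-active-plus-one-spare split are assumptions for illustration, not the poster's actual layout. The options used (--create, --level, --raid-devices, --spare-devices) are standard mdadm ones.

```python
def mdadm_create_command(md_device, active, spares, level=5):
    """Build (but do not run) an mdadm command line for a new array."""
    argv = [
        "mdadm", "--create", md_device,
        f"--level={level}",
        f"--raid-devices={len(active)}",
    ]
    if spares:
        # Spares sit idle until a member fails, then md rebuilds onto one.
        argv.append(f"--spare-devices={len(spares)}")
    # mdadm takes the active members first, then the spares.
    argv.extend(active)
    argv.extend(spares)
    return argv


# Hypothetical partition names: seven active members plus one hot spare.
active = [f"/dev/sd{letter}1" for letter in "abcdefg"]
spares = ["/dev/sdh1"]
print(" ".join(mdadm_create_command("/dev/md0", active, spares)))
```

Check the printed command against the mdadm man page on your own box before running it as root, since creating an array over the wrong partitions destroys what is on them.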
Software RAID5 has been working extremely well for us, but it is NOT a replacement for a real backup strategy.
one better than mcleodeight
SATA is meant to be used internally, yes.
OP said he switched to FireWire for hot swapping reasons alone, that is why I mentioned SATA as an alternative.
If you're bent on having an external RAID 5, you're probably safest going with a DIY gigabit ethernet NAS.
The unofficial
That's why you do this thing called "HAVE A SPARE ON HAND." Sure, it costs more money, but you *are* going for the highest reliability aren't you?
Hi,
The scenario you've mentioned is probably OK for software RAID. I use it in a production environment without problems, under higher stress than your setup will probably have.
I'd suggest you to consider the following items
a) cooling system - those HD can generate a lot of heat. Buy a full tower case and add those HD coolers to make sure your HDs stay cool
b) Buy the HDs from different brands and stores - RAID5 (either hardware or software) can recover from one failed drive. If you buy all from the same brand/store chances are that you end up with 2+ drives with the same defective hardware
c) cpu - if you are going to use this number of drives the processor will be a major bottleneck. Do not forget that RAID5 XORs your data to calculate the parity.
d) partition scheme - use smaller partitions and group them together using LVM. This will help you recover from a smaller problem without taking a lot of time to rebuild the array
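To make the XOR point in (c) concrete, here is a minimal sketch of how parity lets a RAID5 stripe survive one lost block. It is a toy model under my own assumptions: three tiny data blocks and one parity block in a single stripe, with made-up contents. The real md driver works on much larger chunks and rotates the parity block across the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks (made-up contents) plus their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] died. XOR of everything that survived
# (the other data blocks and the parity) gives the missing block back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

The same trick fails as soon as two blocks in a stripe are missing, which is why a second drive failure before the rebuild finishes takes the whole array with it.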
Everyone has their own experience with a bad hard drive or a bad batch of drives from a vendor. I know people who swear Seagate is the worst manufacturer, and then some that say WD is awful, and Maxtor, and etc etc etc.
I've used them all, Seagate primarily though (SCSI servers), and have noticed a trend. They all suck the same!
The sooner we can move to cheap solid state storage the better.
I have a bit of a nit to pick with your subject. =P
While I have to agree that data can be lost because of user error, I built a 2tb RAID 5 out of Maxtor 300gb SATA drives and have thus far had one in five of the drives fail. And, of course, two drives failed within a day of each other so I lost the whole shebang. RAID 5 is fine for stuff like movies and music but I'm sticking to RAID 0+1 for the really important stuff (along with good rsync backups of course).
So, "RAID5 is for High Availability but not Security" might be a better way of expressing your sentiment.
James
Also, it's partition based, not disk based (under Linux, at least). This means that with just two drives you can create one two-disk RAID-1 array (for safety) and one two-disk RAID-0 array (for performance). Just create two partitions on each drive, pair the first partition on each drive in a RAID-0 config and the second partitions as RAID-1.
You can't do a single RAID-1/0 array with only two disks though. You could try, but you wouldn't gain anything (in fact, you'd lose).
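The two-drive split described above works out like this, as a back-of-the-envelope sketch. The function name and the 250 GB / 50 GB figures are just my example numbers, and it ignores the small amount of space md uses for its own metadata.

```python
def two_drive_split(drive_gb, raid1_gb):
    """Usable space when two equal drives are each cut into two partitions:
    one pair mirrored (RAID-1), the other pair striped (RAID-0)."""
    if not 0 <= raid1_gb <= drive_gb:
        raise ValueError("RAID-1 partition must fit on the drive")
    raid0_partition_gb = drive_gb - raid1_gb
    return {
        # A mirror stores everything twice, so you get one partition's worth.
        "raid1_usable_gb": raid1_gb,
        # A stripe has no redundancy, so you get both partitions' worth.
        "raid0_usable_gb": 2 * raid0_partition_gb,
    }


# Example numbers only: two 250 GB drives, 50 GB of each given to the mirror.
print(two_drive_split(250, 50))
```

With those numbers you end up with 50 GB that survives a dead drive and 400 GB that does not, so only put things on the striped part that you can afford to lose.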