Which RAID for a Personal Fileserver?
Dredd2Kad asks: "I'm tired of HD failures. I've suffered through a few of them. Even with backups, they are still a pain to recover from. I've got all fairly inexpensive but reliable hardware picked out, but I'm just not sure which RAID level to implement. My goals are to build a file server that can live through a drive failure with no loss of data, and will be easy to rebuild. Ideally, in the event of a failure, I'd just like to remove the bad hard drive and install a new one and be done with it. Is this possible? How many drives do I need to get this done: 2, 4, or 5? What size should they be? I know when you implement RAID, your usable drive space is N% of the total drive space depending on the RAID level."
For personal use, a two-drive RAID 1 is probably the easiest way to go, and involves the fewest drives, but loses the most space (half). RAID 5 is the standard, but the hardware is more expensive and it involves at least one additional drive.
For simplicity and low expense, even though you lose a full drive worth of capacity, go with RAID 1.
You might want to read The Tech Report's recent article mentioned on Slashdot if you haven't already.
That what was all this school was for... to teach us how to solve our own problems. -- janeowit
Um.. Have you done any research into RAID? Anyone with even the most basic understanding of RAID (such as someone who read the short guides that come with their RAID cards) would agree that if you have more than two hard drives, the way to go is RAID-5. Your storage space is N-1 drives' worth. If you have six 100 GB drives, your storage space will be 500 GB. If you have a problem, you remove the bad drive, replace it and reinitialize the RAID array.
No offense intended, but why didn't you just do a Google search rather than asking 1.5 million slashdotters? The words "raid type" would have produced a nice table from Adaptec and Ars Technica as the very first result that would have explained what you needed to know:
http://www.ebabble.net/html/types.html
RAID 0, you need a hero,
RAID 1, is equally fun,
but RAID 5 keeps you alive!
I would choose RAID-1, because RAID Level 1 provides redundancy by writing all data to two or more drives. The performance of a RAID-1 array tends to be faster on reads but slower on writes when compared to a single drive. However, if either drive fails, no data is lost. This is also a great entry-level starting point as you only need 2 drives. The downside is the cost per MB is high in comparison to the other levels. This level is often referred to as disk mirroring.
Hmmm.
Raid 1 needs only 2 drives but will only give you the capacity of 1 drive. i.e. 2 80 gigs will give you 80 gigs of space.
I just got x2 250gig drives and mirrored them.
Try RAID 5 or RAID 10 (not to be confused with RAID 0+1). This site has a nice overview of all the RAID options. And, of course, Wikipedia has some info.
Quick overview:
RAID 5 - Requires at least 3 HDs (many times implemented with 5 - can be used with up to 24 I believe). Data is not mirrored but can be reconstructed after drive failure using the remaining disks and the parity data (very similar to how PAR files can reconstruct damaged/missing RAR files for the Newsgroup pirates out there). % of total space available dependent on number of drives used.
RAID 10 - High performance, but expensive. You get ~50% of the total HD space as it is fully mirrored. So, 1 TB total disk space nets you 500 GB total storage space. Your data is mirrored so if one drive fails you do not lose everything. However, if you experience multiple drive failure you can be in big trouble.
Casual Games/Downloads
Whatever you do, never have more than one disk on an IDE channel. Only one disk per channel can be written to at the same time, so you will get absolutely horrible performance if you put more than one HD on a channel. If possible, get an IDE RAID card (if you can afford it) or a SATA card/mobo and drives, which don't have this problem.
95% of all computer errors occur between chair and keyboard (TM)
Wow inexpensive & reliable... Those are two words you don't see together too often.
I hear good things about this combination.
Your good options are raid 1, raid 0+1, or raid 5, depending on what you want..
Raid 1 is the safest.. just mirroring the drives, but it results in no speed increase..
Raid 0+1 does mirrored stripe sets -- you get the speed advantages of raid 0 with the full protection of raid 1.
Raid 5 is good middle ground. Raid 5 stores 1 drive's worth of parity. When you lose a drive, your system goes down (if you don't have a hot spare), but you throw another disk in and it'll come back up. You also get some speed increase over a normal drive setup. With RAID 5, you only lose a single drive's worth of capacity no matter how many drives are in your array, whereas with raid 1, you lose 50%.
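To make the capacity trade-off concrete, here is a minimal sketch that computes raw usable space for the levels discussed so far. The function name `usable_capacity_gb` and the level labels are made up for illustration, and the figures assume identical drives and ignore formatting overhead.

```python
def usable_capacity_gb(level: str, drives: int, size_gb: float) -> float:
    """Raw usable space for an array of identical drives (hypothetical helper)."""
    if level == "raid0":
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * size_gb              # striping only, no redundancy
    if level == "raid1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb                       # every drive holds a full copy
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb        # one drive's worth goes to parity
    if level in ("raid10", "raid0+1"):
        if drives < 4 or drives % 2:
            raise ValueError("mirrored stripes need an even number of drives, 4 or more")
        return (drives // 2) * size_gb       # half the drives are mirrors
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    print(usable_capacity_gb("raid1", 2, 80))    # two 80 GB drives mirrored
    print(usable_capacity_gb("raid5", 6, 100))   # six 100 GB drives
    print(usable_capacity_gb("raid0+1", 4, 250)) # four 250 GB drives
```

With two drives only RAID 1 offers redundancy; from three drives up, RAID 5 gives back more of the space you paid for.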
Couldn't all of this have been answered by just reading over each type of RAID?
the controller is a little expensive, but it doesn't have the same performance loss as raid 5 cards.
Can get a 3 drive or 5 drive card.
I've got 5 200 Gb drives in a raid xl array and it works great.
RAID 1 for if a drive fails.
RAID 1 + 0 - same as above but with speed benefits
RAID 5 - Costly, write performance takes a hit but reads are good.
IMO these are the only ones that should be considered. I'm sure somebody will elaborate on the theory behind them...
*ducks*
I went through this last year and here's what I came up with for the best benefit to cost ratio with the lowest hassle. In short, take an old PC and put a four channel raid controller card in it to do RAID 5. Add a big extra fan for safety and you're done.
Here's what I came up with: Total cost about $1200 (probably less by now).
0) Red Hat Linux, ext3 filesystem.
1) 3Ware Escalade 7506-4LP card (64 bit card, but fits in 32bit slot)
2) 4x 250Gb Western Digital drives
3) Big fan.
At RAID 5 this yields 750 gigs (715 GB after crappy GB conversion).
The 3Ware software has a nice web monitor interface and does daily or weekly integrity checks. It emails me if there is a problem - I did have one drive die already and replaced it easily.
Pat Niemeyer
Author of Learning Java, O'Reilly & Associates
I work for a company that uses all types of RAID. I've experience with 2 bay, 8 bay, and 16 bay RAIDs, as well as RAID cards. If you want the cheapest option, just get a two drive system (either with bays or just a card) and use RAID1. It's basically drive mirroring.
Bottom line, you need to figure out how much you're willing to spend on this and then go from there and see what your options are. RAID5 is the hotness, but it's very expensive (easily over $10K for large capacity devices).
Best way to go is RAID 5. Doing it in software with Linux isn't much of a headache unless you change the size of it. Usable space is (N-1) times the size of a member partition, where N is the number of members; you don't have to use an entire disk. For instance, my setup: 80GB HD, 120GB HD, 200GB HD. The RAID 5 members are each the size of the 80GB drive itself. The remaining space on the 120 houses the system and a few miscellaneous things; the remainder of the 200 is a 110GB file dump.
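A rough sketch of that layout arithmetic, assuming each member partition is sized to the smallest disk. The helper `raid5_from_mixed_disks` is hypothetical, and it works on raw sizes, so the leftover figures will not exactly match what a formatted filesystem reports.

```python
def raid5_from_mixed_disks(disk_sizes_gb):
    """Plan a software RAID 5 whose members are partitions of equal size.

    Returns (member_size, usable_space, leftover_per_disk), all in raw GB.
    """
    if len(disk_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least 3 members")
    member = min(disk_sizes_gb)                       # smallest disk sets the member size
    usable = (len(disk_sizes_gb) - 1) * member        # one member's worth goes to parity
    leftover = [size - member for size in disk_sizes_gb]  # free for non-RAID partitions
    return member, usable, leftover


if __name__ == "__main__":
    member, usable, leftover = raid5_from_mixed_disks([80, 120, 200])
    print(f"member partitions: {member} GB each")
    print(f"usable RAID 5 space: {usable} GB")
    print(f"left over per disk: {leftover} GB")
```

The leftover space is outside the array, so anything stored there has no redundancy.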
Logistical Chaos Officer http://www.slagg.org - LAN Gaming in Sarasota FL,USA
RAID 1. Buy one extra plain-jane ordinary hard drive and use software mirroring, like Linux's md system. I've used this extensively with no problems. I have known other people to have problems using software RAID for more complex setups like RAID 5, but if all you need is extra reliability for a basic desktop workstation, RAID 1 in software is generally fine.
include $sig;
1;
"I'm tired of HD failures. I've suffered through a few of them. Even with backups, they are still a pain to recover from."
If you just run Gentoo, you can type "emerge new_harddrive" and it takes care of everything by the end of the month!
or..
Your shit PEECEE WINTEL crap parts made in china are no match for real quality Mac hardware, which are fully integrated with the UNIX UNDERPINNINGS that have the Best GUI Ever(tm) on top.
Disclaimer: I love trolls.
do() || do_not();
Kills Bugs Dead. Stupid fuckin bugs.
Dear Slashdot,
which is better, SCSI or IDE?
Googleless in VA
If I could, I'd get 2x 250GB HDDs in a RAID1 (promise controllers are good for this), and a third 250GB for a cold backup of all my data that syncs weekly.
:)
Raid's great, but an rm -rf is still an rm -rf, thus the third drive
-- The unsig...
It's axiomatic that the more money you spend for reliability the more likely you are to have some kind of failure. Our fancypants Dell PowerVault RAID enclosures are constantly giving us trouble, yet the machines with just a single IDE drive keep on ticking for years and years.
Personal server? RAID 5, no doubt.
blah... only asshats call it that :-p
Redundant Array of Independent Disks
Casual Games/Downloads
If you're running a fileserver with a decent amount of writes, you're going to want RAID 5, as it has the least penalty. Hot-swap drives are easy enough with SCSI or FC, a bit more complicated with SATA, and rather complicated with IDE, but it can be done. For a simple setup, as few as 3 disks will do and you will get 2 disks' worth of space; performance setups will have more spindles. You didn't state what sort of load you're expecting, and that makes a huge difference. For the ultra cheap, I have picked up IDE RAID 5 cards supporting 4 drives with hot swap for under 30 bucks on eBay. They will only work with 120 GB drives max and are limited to Ultra 66, but that's a third of a TB usable for a few hundred bucks, and its performance is good enough for a 100BT file server.
No sir I dont like it.
If you don't need any fancy hotplug, just go with 2 or more IDE drives. If one fails you just have to shut down your pc, replace the drive and boot again. For Raid1 you'll lose half your capacity, for Raid5 you just lose the capacity of one of your drives.
If you need real hotplug you'll need to get some expensive scsi or ide (3ware) controller.
Seriously. Raid is all about risk. Figure out how much risk is acceptable to you. If you have a stack of 6 drives and you only believe 1 is ever going to fail at any one time, then go with raid 5.
If you have a stack of 6 drives and believe not a single one is ever going to fail, go for level 0.
If you are a government contractor and are required to handle simultaneous failures of 75% of your drives, either mirror them all or go with 5+1 or a raid 10 setup.
All in all, it's a poor question to ask Slashdot. You need to let us know what you consider an acceptable failure, and by the time you have that figured out, determining what RAID level you need is easy.
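One way to put rough numbers on that risk reasoning is sketched below. It assumes drives fail independently with the same probability `p` over some period and that nothing is replaced or rebuilt in the meantime, which is a big simplification; the 5% figure and the function names are invented for illustration.

```python
from math import comb


def p_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent drives fail."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def array_loss_probability(level: str, n: int, p: float) -> float:
    """Chance of losing the whole array under the assumptions above."""
    if level == "raid0":
        return p_at_least(1, n, p)   # any single failure destroys the stripe
    if level == "raid5":
        return p_at_least(2, n, p)   # survives one failure, not two
    if level == "raid1":
        return p_at_least(n, n, p)   # lost only if every mirror copy fails
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    p = 0.05  # assumed per-drive failure probability, purely illustrative
    for level, n in (("raid0", 6), ("raid5", 6), ("raid1", 2)):
        print(f"{level} with {n} drives: {array_loss_probability(level, n, p):.4f}")
```

The absolute numbers mean little; the point is how quickly the risk changes once you decide how many simultaneous failures you need to survive.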
Karma: SELECT `karma` FROM `users` WHERE `userid`=138474;
RAID 5 or 6 will stripe the data across all drives in the array. You will basically need about 8-10% of the total space set aside for data recovery. You can lose 2 hard drives (as long as they are not next to each other) and not lose any data. RAID 5 and 6 are only incredibly useful in applications with more than 4 hard drives and about 500 GB of storage. It's a little faster than the lower RAIDs because the redundancies are simple parity bit calculations, and are done twice for each single data change on disk. The lower RAIDs will have a set of disks that actually mirror the data intact (RAID 1) or perform more intensive Hamming distance calculations and store the results on another set of disks.
So, RAID 5 or 6 would be the best (RAID 6 is worth the extra bit of space for the 2nd calculation, and really helps when you can test the parity bits against another parity to create the lost data.)
There will be some slowdown associated with RAID, but it won't be as bad with 5 or 6 and generally, you can live through it with the thought of having relatively robust file servers.
while(1) { fork(); };
First, a couple things.
Most IDE raid controllers are really software raid. 3ware makes real IDE raid controllers and I would recommend them for home users/Linux.
I would suggest (if it is affordable) to do a Raid-5 setup with 8 IDE drives. This easily gives you 1 TB, and *reasonable* cost and with *reasonable* redundancy. RAID-5 costs 1 drive for redundancy -- in this case meaning 12.5%. If one drive dies you are fine; two drives dying concurrently means you lose everything.
For smaller requirements, I would suggest using mirroring with a cheap built in RAID controller (=software). Cost of redundancy is 50%, but that is reasonable when you are only buying 2 drives.
Idiot...
If you want the best % of drive utilization, go for RAID 5. It works by striping the data across 2 drives, then XORing the data onto the 3rd drive. But you need 3 drives. RAID 1 works with only 2 drives, but you only get half the space; basically, each drive has an exact copy of the data that the other drive has.
Put simply, if you don't have a lot of data to store but you want it safe, go for RAID 1 with small drives: you end up with the same data storage as one drive, but it takes 2 drives. If you have a lot of data to store, go for RAID 5: you get twice the data storage of one drive, but you use 3 drives.
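The XOR trick mentioned above can be shown in a few lines. This is only a toy sketch of the parity idea for a three-drive layout; `xor_blocks` is a made-up helper, and real RAID 5 implementations work on fixed-size stripes and rotate which drive holds the parity.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


if __name__ == "__main__":
    drive_a = b"RAID"                        # data block on the first drive
    drive_b = b"five"                        # data block on the second drive
    parity = xor_blocks(drive_a, drive_b)    # what the third drive stores

    # Pretend drive_b died: rebuild its block from the survivor and the parity.
    rebuilt_b = xor_blocks(drive_a, parity)
    print(rebuilt_b == drive_b)
```

Because XOR is its own inverse, any one missing block can be rebuilt from the other two, which is why the array survives exactly one drive failure.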
RAID 0 speed with RAID 5 reliability
http://www6.tomshardware.com/storage/20031128/index.html
www.syncraid.com
That's Independent, not Inexpensive.
This is not the greatest sig in the world, no. This is a tribute.
A simple, very safe server setup is RAID 5 w/ a hot spare. One drive fails, the array rebuilds on the hot spare, and you replace the failed drive whenever you have a chance.
In theory, some of this is possible in software, but a good RAID controller card is much, much better.
A quick note - if you re-initialize the RAID, it will erase everything you have. You should 'rebuild' the drive, unless you have a hot-swap, in which case you just take out the bad drive, pop in the good one, and you're good to go.
Have you thought about software RAID? Before everyone jumps down my throat, I realize that it's slower than hardware RAID...but, here is my rationale for using it:
1) You don't need drives that are the same size.
I've done hardware RAID, had a drive fail 2 years down the road and not been able to find an 18GB SCSI drive to re-insert to the array. That has the potential to jack your entire array. With software RAID, you buy a 36G drive, partition it so that 1 partition fits your array, and off you go
2) It's a personal file server, so speed is less important than cost (I'm guessing). With software RAID you can mix all sorts of wondrous things together: IDE drives from the basement, SCSI-320 drives you stole from work, and nearly everything in between. It's flexible, and has no associated controller cost.
3) It's easy as heck. You can configure it in Disk Druid/fdisk, and it works quite easily in any major distribution (I've done it in Slack, Debian, RH, Fedora and Mandrake).
The major downside is that you cannot (at least I don't know how to) hot-swap drives. But again, this is a personal file server. Spend your money on pizza and beer, screw the SCA hot-swap drives that are going to cost you an arm and a leg.
That's just my $0.02...flame away
Werd.
After building lots of RAID stuff on x86-based machines, I found that whatever RAID you pick for redundancy is probably going to be OK as long as it is more than RAID level 0, but recovery may vary. I would opt for an OS-independent solution, where adding new drives doesn't force you to back the whole thing up and re-initialize the array. Also, it stinks if there are a lot of annoying steps to go through to change a failed drive (reboots etc).
My sympathies to the author. I had a similar question myself, but simple Googling doesn't really answer the question.
Aren't there some failure modes where RAID 1 doesn't work well? What if a drive doesn't fail, but instead fails to return the correct data?
It seems to me that RAID 5 would determine which of the drives is returning bad data, and correctly mark the drive bad, in situations where RAID 1 might not be able to detect which drive is bad.
Could a RAID expert please address this?
What's best? Probably RAID 0+1 for you. It's striping and mirroring, you gain a good deal of performance, almost double your reliability, and lose 50% of the space.
But geez, what kind of question is this for the front page? Ask on a hardware board or do some reading on your own.
The noise alternative is probably software raid-1, since you probably don't need extra cooling for only two hard drives. I have this as the configuration of my second server, running Windows 2003.
I think that this would be a great "Ask Google" question. Which RAID for a Personal Fileserver?
That said; it was pretty easy in Mandrake 10.0 Official to set up a 3 disk Software RAID 5 with 200Gb disks. Supposedly if one goes bad I can just remove it and put in another one and boot up - regeneration is supposed to be automatic.
Humor from a Genetically Molested Mind
You sir, should have tried google first, as everyone else said.
That being said, if you've already purchased drives and you have more than two (implied), you'll want to do RAID 5. Unfortunately you failed to mention whether you'll be using Windows or Linux, so we'll have to cover both. In Windows, you'll probably want to buy a shitty Promise RAID card, preferably with 4 or 8 channels (depending on how many drives you have), and set up a software RAID with the Promise software.
Under linux you'll need a (cheaper) ata controller with as many channels as you have drives, and then you'll want to use google to find out how to install your distro of choice on a software raid you've set up FOLLOWING THE DIRECTIONS GOOGLE FINDS YOU.
I use RAID 1 on all of my machines. They don't have the one that I use any more, but something like this is only $250 for complete hardware RAID (the best kind). It's absolutely seamless.
http://www.raidexpert.com/RAID/DynaBacker.htm
Can someone do this calculation for me? This seems easier than getting out my graphing calculator. It's all the way inside my bag which is at my feet. 5 * 89 - 303333 + 307 % 4 =
Raid Ain't "Inexpensive", Dumbass.
for a minute there, i lost myself...
This isn't for personal use, but if I wanted a RAID at home, I would definitely consider the same setup as this:
I'm using the 3ware 7006-2 on two Linux boxes (Fedora Core 1) and I'm also using one on a Windows 2003 Server as well. All of them are configured with RAID 1 support and I haven't had any issues on any of the machines thus far (knock on wood). I also bought the Vantec EZ-SWAP MRK-102FD Mobile Rack Frame & Carrier for each drive I have in the RAID as well, these things are dirt cheap ($35.00) and are really nice looking with the LCD temperature readout on the front. This setup might be overkill for home use, but it's certainly not terribly expensive either.
--It's Pimptastic!--
A lot of people have already pointed out that if you have more than two drives, you should be doing RAID-5. Promise produces a couple of true hardware-level RAID cards in the ~$200-$300 range, depending on how many drives you want, and whether you want IDE or SATA.
3Ware's escalade seems to have much better linux support, but are generally more expensive kit (although many claim they're vastly superior). I have a simple file server that occasionally doubles as a dedicated game server at our monthly LANs, so it has plenty of horsepower, and thus I went with a soft-RAID solution. Highpoint makes a model in the $80 range that implements RAID-5 at the software level, but as long as you have some decent hardware that it's going in, that shouldn't be a huge problem.
Just a few things to remember: if you're doing RAID-5, try to have a separate boot drive so that you can keep swap off of the array disks; swap doesn't need fault tolerance and it will slow you down. If you're planning on using gigabit ethernet and heavily accessing this thing (not sure why in a home situation), look at getting one of the Intel 875 or nForce3 boards that have gigabit on the northbridge, as gigabit and heavy disk access together will saturate your PCI bus. If you only have two disks, ignore all of this, and just mirror them entirely in software.
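A back-of-the-envelope sketch of the PCI point, assuming a conventional 32-bit, 33 MHz PCI bus shared by the network card and the disk controller. The 50 MB/s disk figure is a placeholder assumption, not a measurement, and the theoretical bus peak is never reached in practice.

```python
PCI_CLOCK_HZ = 33_333_333   # conventional PCI clock (33 MHz)
PCI_WIDTH_BYTES = 4         # 32-bit bus


def pci_peak_mb_s() -> float:
    """Theoretical peak of a shared 32-bit/33 MHz PCI bus, in MB/s."""
    return PCI_CLOCK_HZ * PCI_WIDTH_BYTES / 1e6


def ethernet_mb_s(megabits_per_second: float) -> float:
    """Convert a network line rate in Mb/s to MB/s."""
    return megabits_per_second / 8


if __name__ == "__main__":
    assumed_disk_mb_s = 50.0  # hypothetical sustained array throughput
    demand = ethernet_mb_s(1000) + assumed_disk_mb_s
    print(f"PCI peak:        {pci_peak_mb_s():.0f} MB/s")
    print(f"gigabit + disks: {demand:.0f} MB/s")
    print("bus saturated" if demand > pci_peak_mb_s() else "bus has headroom")
```

Every byte served crosses the bus twice, once from the disk controller and once to the network card, which is why moving the gigabit port off the PCI bus helps.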
Raid 1 for simplicity. 2 drives in mirrored configuration. Cheapest and easiest to setup. Install Linux and use software raid. Works like a charm.
If you're up to a challenge, install Linux to boot from the RAID 1 config. It was a huge pain in the ass to figure out. When I configured Redhat 9, I had to use Lilo instead of Grub, as the boot loader wasn't being correctly written for both drives. Had to use "dd" to write the boot sector and Lilo to get it working properly.
The benefits of software RAID: you can swap drives with minimum downtime and rebuild the drive in the background. And you save money by not buying a hardware RAID card, which could serve as another possible point of failure. Then you can write scripts that email you the status of the RAID periodically with cron.
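As a rough illustration of such a status script, the sketch below scans the text of /proc/mdstat for arrays with missing members. The function name, regexes, and sample text are my own assumptions about the usual layout of that file, so check them against your kernel's actual output. Cron normally mails whatever a job prints, so printing only when something is wrong is enough to get an alert.

```python
import os
import re

# Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0]"
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# Status field on the following line, e.g. "[2/1] [U_]"
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      244195904 blocks [2/2] [UU]

md1 : active raid1 sdb2[1]
      1052160 blocks [2/1] [_U]

unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> list:
    """Return names of md arrays that report fewer active members than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    # On a machine with md arrays, check the live status; stay quiet if all is well.
    if os.path.exists("/proc/mdstat"):
        with open("/proc/mdstat") as handle:
            bad = degraded_arrays(handle.read())
        if bad:
            print("Degraded RAID arrays:", ", ".join(bad))
```

Run from a crontab entry, this stays silent while the mirror is healthy and produces mail as soon as a member drops out.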
Remember to test the config by unplugging each drive separately. Of course it will take awhile to sync each drive...
If you are feeling feisty and have more money to spend try this (a copy of a previous post of mine):
Here are some interesting numbers:
$250 per drive
400GB per drive
4 drives
1.2 TB in Raid 5
Total cost $1,000
or $0.83 per GB.
So there you have it. A terabyte file server for about $1000 will be a reality soon enough. Nice. Serial ata will lessen cable clutter, and only 4 drives will be doable in any spare decent case and power supply.
Hopefully it won't take too long for prices to drop to $250.
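The arithmetic behind those figures, as a small sketch. It divides the total drive cost by the usable RAID 5 space, so the result is a price per usable gigabyte; the function name is made up, and the controller, case, and power supply are left out.

```python
def raid5_cost_per_gb(drives: int, size_gb: float, price_per_drive: float):
    """Return (usable_gb, total_cost, cost_per_usable_gb) for a RAID 5 array."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    usable_gb = (drives - 1) * size_gb       # one drive's worth goes to parity
    total_cost = drives * price_per_drive    # drives only
    return usable_gb, total_cost, total_cost / usable_gb


if __name__ == "__main__":
    usable, total, per_gb = raid5_cost_per_gb(4, 400, 250)
    print(f"{usable} GB usable for ${total}: ${per_gb:.2f} per GB")
```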
Of course Raid of any level is no replacement for a full backup, but it's certainly better than nothing or relying on a single drive no matter how good the quality/warranty.
RAID does little for you when you have a power problem: either the power supply smokes the motherboard, or the drive goes and hoses the IDE interfaces of other drives.
I have had both happen.
The cheaper solution is to purchase drives with a good warranty, not the cheapest thing you can find, and a good power supply, not the cheapest one.
But the true answer depends on how much you value your time. Here, downtime at work is about $4000 per hour.
What is your time worth for your personal time and trouble?
Downtime at home costs what?
When will you be upgrading or replacing this? In 2 years, or 3 years?
What are your personal ROI needs?
The decision is like buying an insurance policy.
Here we use the 'double nickel' on all our file servers: RAID 5 spread across five disks. In six years we have never lost data due to a drive failure, and if one of the little buggers does die we just pull the drive out of its cage and slot in a new one. The server BIOS (IBM Netfinity) then takes care of rebuilding.
Ed Almos
Budapest, Hungary
The more corrupt the state, the more numerous the laws. - Tacitus, 56-120 A.D.
Generally RAID 5 or RAID 10 (not 0+1) is what you'd like to see, but 5 requires a minimum of 3 disks (4 or 5 is better), and 10 requires a minimum of 6 (again, more x2 is better) disks. For personal use, there's just not that much space in the chassis for that many disks. Not to mention the potential cost of acquiring that many disks.
Really, your only choice is RAID1. Two disks, maybe a card (if you don't use the Mobo RAID Controller that seems to be standard these days). For Hardcore (even personal) usage, you're probably better off going with SCSI. However for light use, IDE is fine.
I haven't seen a lot on SATA RAID yet, but it seems to be pretty popular. I would imagine it'd be fine for light use as well.
RAID 1 has another advantage that no one has listed. Make sure to mount your drives on removable trays. In the event of a fire or other disaster, you can just yank out one drive and be confident that you have saved all your data.
No need to burn to a crisp while trying to unplug all the cables going into the PC.
I'm currently running a 3 drive RAID 5. It was pretty easy to get going. It all depends on how much you want to spend and what you want to do with the server. RAID 5 gives you (N - 1)*(drive size) storage (3 x 200GB drives gives you 400 GB storage). The problem with RAID 5 is it requires at least 3 drives, the more the better. I really like my 3ware sata raid card, it gives you the option to have "hot spares" so should it detect a problem, it will automatically start rebuilding onto the spare. It's also a hardware-based card meaning (among other things) it takes very little from the server to rebuild the drive. 3ware's drivers were easy to get running (even in linux) and it included a monitoring system that can send alert emails should something go wrong. For more information on the 3ware cards, check their new line out here
/* Insert some overused slashdot quote here */
What I really want to know is what sort of performance you get from software raid solutions. After all, the concept of being able to get redundancy without forking money over for a raid card (even from ebay, they're expensive), is rather tempting.
You know when it's okay to shout fire in a crowded theatre? When it's on fire.
Question 2
What if I want to add another 250gb drive to my array (to bring the total up to 6)? Would that be possible?
:wq
I'm setting up redundant storage at home. After looking at all the RAID options, limitations and what I really needed, I decided that a nightly backup of my data onto another drive on another system will be very redundant, prevent human error (me deleting things), and be cost effective (since I already have another machine). Backing up a nightly image of your drive would restore things faster, but not help against human failures (which I think are more common than drive failures... but I guess it depends on the hardware and humans involved ;) ).
Google around and you will find all kinds of arguments for IDE not being a good RAID solution (mostly write cache related).
Really depends on the type of RAID you'd like to implement.
RAID 0 stripes the data across 2 or more drives and therefore offers no redundancy (in fact, in a two-disk stripe you multiply danger of data loss x4 compared to two individual drives -- because you not only double the possibility of failure with two disks as opposed to one, but stand to lose all of the data on both drives should one fail). In any event, no point in discussing it further since redundancy is the point.
RAID 1 offers redundancy by exactly duplicating the contents of a drive onto another drive, and needs exactly two drives. This is considered the most "fail-safe" method of RAID array although offers no performance benefits whatsoever.
RAID 10 (or 1+0 or 0+1) is a combination of RAID 0 and 1 and is nearly always done with four drives, although technically it can be done with six or eight (if your controller supports them). It offers both performance benefit and redundancy, although the cost of the "wasted" drive space is quite high.
RAID 3 involves using 3 or more drives, one of which contains parity information to rebuild the lost drive should any of the other drives fail. This is one of the least popular RAID formats and has more or less been totally replaced by RAID 5.
RAID 5 involves using 3 or more drives and writes parity information across all drives in the array, allowing one drive to fail with little to no performance loss. The failed drive can be replaced and the RAID rebuilt. Depending on your hardware/software, this can often be done hot without having to power down the system at all. It is one of the most commonly implemented RAID solutions because of the good mix between drive use (the price goes down the more drives you have in the array yet you can have as little as three), redundancy, and high availability.
There are others out there like RAID 50 but nothing worth mentioning, especially for a home user.
The only question left to you is whether the RAID will be run by hardware or software (software might be a good choice if you are already running Linux on the server, but you'll have to ask someone else about it because I don't know a thing about it). Personally I chose the hardware route years ago and bought an Adaptec 2400A, which is a four-channel hardware ATA-RAID card capable of RAID0, 1, 10, and 5 -- guess which I use. I use all four channels, each with a 200GB SATA hard drive. I've lived through a couple drive failures, a full drive upgrade (when I first bought the card it was 4x60GB drives) and even once where two drives RAID tables got zapped (I'll NEVER put my drives in removable cages again) and never lost a byte of data -- so the CAD$500 or so for the investment on the card was worth it.
600GB of storage means not having to worry about all those unlicenced-in-North-America-anime torrents running out of space any time soon.
Make your cold backup drive firewire if you want to simply plug it into another system to access your data.
Very critical data (in small quantities) gets quad redundancy- desktop, laptop, firewire drive, and usb keychain.
I tried to go a week without my data and facilitate public school classes at the same time... never again! I now have a 250gig firewire drive, confident that the chances of my laptop and desktop failing at the same time are relatively slim. The next step would be off site backup...
Spoon not. Fork, or fork not. There is no spoon.
RAID 5 requires time to rebuild. So you're down if a disk dies. RAID 1 is just mirroring. Is there any time needed for a rebuild? Or is it still available while you scramble to replace a hot-swappable disk? Obviously the new disk would need to be synced up; is that done in the background?
Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
This is a hard one to answer. I would say that if I had to go with any one type, it would be RAID Ant and Roach Killer with Germfighter. It kills bugs dead!
why don't you give me your job.. and i'll take it from there
RAID 1 (mirroring) cards are cheap (and sometimes built into the motherboard), but for every gigabyte you want for storage, you need to buy two gigabytes' worth of hard drives.
RAID 5 cards cost more money (sometimes a thousand dollars or more), but if you set up a 5 drive system, only one of the drives is "wasted" storing redundancy data (of course, you need to buy five hard drives). RAID 5 on three drives is possible, but the "wasted space" ratio goes from 20% to 33%, as a whole drive's worth of capacity is used to store parity info.
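The "wasted space" ratio is simply one drive divided by the number of drives in the array. A quick sketch, with a made-up function name and assuming identical drives:

```python
def raid5_overhead(drives: int) -> float:
    """Fraction of raw capacity that RAID 5 spends on parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return 1 / drives


if __name__ == "__main__":
    for count in (3, 4, 5, 8):
        print(f"{count} drives: {raid5_overhead(count):.1%} of raw space goes to parity")
```

For comparison, a two-drive mirror always spends 50%, so RAID 5 only pulls ahead on space once you are buying three or more drives.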
There are other RAID setups that use combinations of striping, mirroring, etc in an attempt to overcome performance bottlenecks.
Another interesting setup is NetCell's SyncRaid/Raid XL - Tom's Hardware had an article on it a while ago, but actually getting it is tough.
If you are planning on building a new system, Anandtech had an interesting article on RAID on the motherboard.
After a few years of experience with Promise RAID 0+1, Promise RAID 5, 3Ware RAID 5 and SCSI RAID 5, and recently 3Ware SATA RAID 5, I would say that the cheaper solutions often provide a false sense of security, especially if using IDE drives. We have a machine that has a Promise RAID-5 IDE setup that on reboot, seems to require a few restarts to get up and running, and when we lost a drive recently, it took quite a while for the array to rebuild (though this might not be an issue at home, where it's faster to rebuild an array than rebuild your whole computer on a fresh drive). I had a Promise RAID 0+1 card in a computer a few years ago that would corrupt any large file that I moved between hard drives on the computer.
If your data is important, get a good card from a trusted manufacturer (3ware is pretty good, and they have open source Linux drivers), and go SATA.
IANAAE (I am not an Adaptec Emp.), just wanna share.
I just had a very good experience with the Adaptec 2400 EIDE RAID card ($350). Configured it for RAID-5, after a month diags told me I lost a drive (NO data loss). By the time I got a replacement, the RAID had repaired the drive unit. It's been 6 weeks and no new errors, but I have the spare on hand.
(BTW, just for giggles, I boot with a SCSI drive.)
With the availability of cheap, huge drives, I'd do a RAID 1 of two big drives. This would be fine for a personal file server. For personal use I wouldn't worry too much about performance; I'd worry more about data integrity/security. Modern hard drive throughput can flood a 100Mb/s, or even a 1000Mb/s, network with no problems.
Here's an interesting review of chipset-based RAID solutions. RAID performance isn't always what you think....
http://tech-report.com/reviews/2004q2/chipset-raid/index.x?pg=1
My Other Computer Is A Data General Nova III.
Mod this guy up!
It's all well and good to have redundant hot data but a virus, bad fsck'ing kernel, wacky power or a good ole' 'rm -rf *' will kill you dead.
Better to get the warm/cold backup and be able to restore from a mistake AND have a clean system.
Which configuration gives away the least access to data if someone decides to recover a dead drive that has been removed from your system? On a dead drive, a "wipe" can't be done, but much data will still, in theory, be accessible!
I just set up 2 250Gig drives to do mirroring using mdadm on linux. This setup is called RAID level 1 and in the event that one of the disks fails, you'll have a complete copy of this disk. Although this approach only allows you to use half of your total disk capacity, I think the simplicity outweighs that considering how cheap hard drives are these days. See the HOWTO below:
http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html
True story...had a personal fileserver with a Promise RAID card. I got the Promise card because it was cheap and had a good rating on a couple of review sites.
What I didn't know at the time, but learned the hard way, is that Promise's RAID monitoring program "PAM" is a user-mode only application. That means that if you don't login, it doesn't run. Care to guess what happened to me?
At some point while I was gone for the weekend, I can only guess something crashed and rebooted Windows 2000. When it rebooted, I didn't have it set to automatically login (why would I? it's a server). So "PAM" wasn't running when one of the drives in the RAID 5 set failed. Maybe it even had something to do with the crash, I don't know.
Now, the point of PAM is that if a drive fails, an e-mail gets sent, in this case to my mobile phone's text page address. Since PAM wasn't running, however, nothing was sent. The drive failed and, I can only guess, put off so much heat that it cooked the drive above it (why do so many cases mount hard drives horizontally above each other anyway?) and next thing I know, I can't login to my server from where I'm staying. I call a family member with a key to come by and they are unable to restart the server. It wasn't until I came home and read the BIOS messages that I understood why. Everything gone.
I had a lot of stuff on CDR, but let me tell you, I was plenty outraged that Promise could design something so utterly stupid as a monitoring utility that doesn't know how to run as a service. Even to this day, PAM still will only run as a user-mode program, and even worse, you actually have to login to the program now to start it, which can't be scripted.
F Promise. Only a complete and utter fool would be stupid enough to buy any of their products. May they rot in that special place reserved for child molesters. (Yes, I'm still bitter about it)
- JoeShmoe
-- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
Actually, I just built a 1TB fileserver for my home last month (I do a lot of video editing and need a secure place to store it). I'm using Mandrake Linux 10, but most any flavor will do as long as you have the raidtools installed. Also be sure to install Samba so you can map drives on both Windows and Linux systems.
One great thing about using Linux on the fileserver is that you can use software RAID. As the name implies, this requires no special controller cards (which is nice, since RAID 5 controllers typically run $200+). You also have the option of setting spare drives, which allows the array to begin rebuilding immediately in the event that one drive fails - the spare takes its place. Setup is easy - create a RAID, select what type you want, and then add drives to it and format.
I'm using a RAID 5 setup with 5 x 250GB drives giving me 4 x 250GB = approx. 1TB of storage space. As has been mentioned, using RAID 5 allows you to recover if one drive fails. The odds of more than one drive failing before you have a chance to rebuild the array are essentially the odds of your box being destroyed (tornado, fire, etc.).
Also previously mentioned, never attach more than one drive per IDE bus (assuming you're using IDE like I am). Doing so is irresponsible from a bandwidth standpoint as well as from a reliability standpoint, since a drive crash typically brings down the bus, and all drives on the bus with it (and as we all know by now, losing >1 drive is not survivable). Buy some cheap PCI IDE controllers, keeping in mind to ensure that they're dual channel if you plan on connecting >1 drive per controller.
Take some time and read this - it will tell you everything you need to know.
... but I like to point out that RAID (at any level) is not a replacement for ordinary backups because you still have a (fair) chance of filesystem corruption
I use RAID on all critical servers but even RAID 5 or better isn't 100% protection for your data. Last week I had a power failure that corrupted critical system files on a box and lost all the data. I chose to use CD/DVD backup of critical data (only a gig or 2) for that box. I have had it happen on a Novell server with mirroring done in both hardware and software. The critical areas are directory tables and file allocation tables (can't remember what the modern equivalent is). DVD burners are cheap: make a script file, have it run at night, and change the DVD every day. I am glad I did.
It's for home use
No data loss if a drive dies
Easy to rebuild - remove dead drive, install new one
Budget... Ah. Why is it *every* "Ask Slashdot" never mentions the budget? On the cheap, you could do simple mirroring RAID1 - most mobos with on-board SATA RAID will do this for you. The overhead is that you pay twice as much per GB because you obviously need two drives, and the performance gains are negligible.
Personally, I'd take the more expensive route; get a proper hardware RAID controller with proper RAID management software. There are 4 port SATA RAID controllers (who *really* still needs SCSI for home use?) that cost a few hundred dollars and do full RAID5. You lose one drive for the parity info, but that could be as little as 25% of your total capacity if you get four drives instead of the minimum RAID5 requirement of three drives.
Also, with a proper hardware RAID controller, you should also get a performance boost from use of RAID and have minimal CPU overhead. Get four of Seagate's new 400GB drives and you'll have over a TB of disk space, which should give you some bragging rights for a month or two before it's old hat. :)
UNIX? They're not even circumcised! Savages!
3ware Escalade series products provide incredible amounts of storage for your needs. A single card can support up to 12 serial ATA hard drives. One can put 24 400GB hard drives into a 5U case and expect 8.8 TB at one's disposal. Since the question is regarding personal fileservers, I would expect an expectation of lowered cost. Having a single machine embody the entirety of one's needs is more cost effective than having multiple machines doing the job. That 8.8 TB assumes a RAID 5 setup, which means that one drive's worth of capacity, out of the 12 drives on each card, goes to parity. This setup will provide great storage capacity in a highly reliable package. The cost of such a system is less than that of other servers with the same storage capacity.
If you've got an old PC, then really the only cost will be the hard drives to put in it. (and possibly an additional PCI IDE controller, so that there's no more than one drive per channel)
Sure software RAID will cause increased CPU load, but if it's used for something like NAS, the network is going to be the bottleneck, not the RAID.
RAID5 will give you the most space available out of your 3 or more drives. (n-1). RAID1 will give you half of the space, with better write performance.
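To put rough numbers on the "network is the bottleneck" point, here is a minimal Python sketch that converts a link speed from megabits to megabytes per second and compares it to a drive's sustained throughput. The function names and the 50 MB/s drive figure are assumptions for illustration, not measurements from anyone's array, and protocol overhead is ignored.

```python
def link_mb_per_s(link_megabits: float) -> float:
    """Convert a nominal link speed in Mb/s to MB/s (8 bits per byte)."""
    return link_megabits / 8


def bottleneck(disk_mb_per_s: float, link_megabits: float) -> str:
    """Name the slower of the two: the disk or the network link."""
    if link_mb_per_s(link_megabits) < disk_mb_per_s:
        return "network"
    return "disk"


# Hypothetical drive that sustains 50 MB/s (an assumed figure).
ASSUMED_DISK_MB_PER_S = 50.0

for link in (100, 1000):
    print(link, "Mb/s link:", link_mb_per_s(link), "MB/s, bottleneck:",
          bottleneck(ASSUMED_DISK_MB_PER_S, link))
```

With those assumed numbers, a 100 Mb/s link tops out at 12.5 MB/s, so the network limits you long before the parity math does. On gigabit the picture can flip, so plug in your own drive figures.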
Tired of hard drive failure? Think that RAID is too expensive?!?!? Why not join everyone that is running RAID 5000!.!.! RAID 5000 is a whole new level of RAID, so much better than any other RAIDs that we had to take it up over two orders of magnitude. It uses all the best of RAID 0-50 with none of the cost usually associated with expen$ive hardware and extra drive$. It is a unique piece of software that keeps your hard drive from failure and guarantees 100% no-file loss.
Don't be the last in your clan to get this l33t software! Order today!
What I'd like to know is the best way to boot software RAID. At the moment, I have a 1GB root partition on disk 1, and then various other partitions for /var, /home, /usr, etc. Other than root, everything is RAID-1. But if the primary disk goes down, I will still lose my boot sector and /etc. So weekly, a cron job rsyncs /etc to the secondary disk. I'm wondering if anyone can think of a better way to do software RAID-1 with two disks (can you RAID your root partition using software RAID?).
You want a Promise UltraTrak SX8000. It's the easy idiotproof array. We're using several of these.
If a drive fails, it beeps at you til you replace it. You just yank it out, and put in a new drive, the same size or larger. It then rebuilds automatically. No shutdown or reboot required.
The Linux crowd will be happy to know the RM series runs linux. I don't know about the SX series, but I suppose it does too. Either one appears to the server to be a single SCSI drive. No drivers required, other than making the SCSI card of your choice work.
There's the Linux method of doing it too, which I like a lot. It saves you a *LOT* of money in extra hardware. You can go with 3 drives without adding any extra cards to your system, or you can put in IDE controllers to add as many drives as your system can support (PCI slots, power, and physical mounting points are the limitation). Read the "Software-RAID-HOWTO", which should come with your system. I've done many of these also, and they work quite nicely. You have to shut down the system to swap a drive, and then run `raidhotadd` with a couple of parameters (the md device and the new partition, if I remember right), and you can be running while it rebuilds.
You should have looked it up before you posted.
RAID 5 is the most common for a large redundant array. The array size is (N-1)*size . The more drives you use in a single array, the better off you are for size loss.
3 100Gb drives = 200Gb
5 100Gb drives = 400Gb
10 100Gb drives = 900Gb
10 200Gb drives = 1.8Tb
RAID 0 is striping. No redundancy, which you won't be happy with. (One failure means losing the array.)
RAID 1 is mirroring. With two drives, you still only have the size of one.
RAID 50 is nice where it does striping across redundant arrays. You lose size, but gain speed.
Most other RAID types aren't very popular for various reasons.
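If you'd rather check the size arithmetic above than trust a table, here is a minimal Python sketch of the (N-1)*size rule for RAID 5 alongside RAID 0 and RAID 1. The function names are made up for illustration, and it assumes all drives in the array are the same size.

```python
def usable_capacity(level: str, drives: int, size_gb: int) -> int:
    """Usable capacity in GB for an array of equal-sized drives."""
    if level == "raid0":
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * size_gb          # pure striping, no redundancy
    if level == "raid1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb                   # every drive holds the same data
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb    # one drive's worth goes to parity
    raise ValueError(f"unsupported level: {level}")


def overhead_fraction(level: str, drives: int) -> float:
    """Fraction of raw capacity spent on redundancy."""
    return 1 - usable_capacity(level, drives, 1) / drives


for n, size in ((3, 100), (5, 100), (10, 100), (10, 200)):
    print(n, "x", size, "GB in RAID 5 =", usable_capacity("raid5", n, size), "GB")
```

The overhead helper shows why bigger RAID 5 arrays waste less: a third of the raw space with three drives, a fifth with five, a tenth with ten.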
Watch out for going over 2Tb in size on a single block device. I'm having problems with that right now. I have two Promise VTrak 15100's with 15 250Gb SATA drives in each, and anything with a block size over 2Tb is giving me grief. There are legitimate reasons for this, most of which newer documentation claims to be fixing, but I'm still having problems with a current Linux release. Making logical drives under 2Tb works, but doesn't accomplish what I need.
I hope this helps.
Serious? Seriousness is well above my pay grade.
I just finished something like that - one server is a cheap ($150 from ebay, incl ship) PIII 800Mhz with a $20 AHA2940, a $30 scsi cable with 5 connectors and 4 36Gb IBM disks for $180 (all from ebay). Slap RedHat 9 on it, read the HOWTO, use the built in tools and blammo, 100Gb raid-5.
Next project was a Sun E250 from ebay for $300 (incl. ship!), that came with 6 9Gb disk drives. Also got a box-o-10 18Gb drives for $100, and 5 spud brackets for $60. Replace 5 of the 9G drives with 18's, install DiskSuite, follow the instructions and blammo, 50Gb raid w/ an automatic hot spare failover, tested and working (just yanked a drive out one night to 'simulate' failure). Not the biggest but relatively inexpensive and very reliable.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
I replaced 6 out of ~60 of our "high quality" name-brand HP and HPaq SCSI hard drives last year. All were less than 3 years old and all are installed in name-brand racked servers and disk array cabinets, in a clean, modern, and well air-conditioned data center. None get pounded too hard except during the weekly full backup, either.
I don't think buying "high quality" SCSI drives does all that much for you, frankly. If it makes you feel more 5up3r 3l1t3 and validates your purchase, great, but experience shows it's not a panacea.
However, what happens if your place has a fire, gets vandalized, or a burglar takes off with your server(s)?
People usually scoff at the IDE vs. SCSI question nowadays, but for running a server, the underlying issue of 'consumer level' versus 'server level' drives remains. Server drives are designed to be left spinning for extremely long durations without failure while most PC drives are optimized for fast spin-up on boot. What you want to investigate is mean failure times, guaranteed spin-up-use-spin-down cycles, -Heat and Sound Output-, seek times, and sustained throughput. Good SATA drives are reasonably cheap, 0/1/0+1 SATA RAID controllers are becoming relatively common on motherboards, and SATA RAID5 controllers are getting more common and cheaper. RAID5 is obviously the way to go for 3+ drives if you can spare the US$200-$300 for a card.
RAID-1 is NEVER faster than a single drive; with a good controller it's slightly slower than a single drive on both reads and writes. The rest of the information is correct (2 drives, redundant, only half the space usable).
At my last job, we needed a basic RAID device that was under $500. We found this: http://www.accusys.com.tw/7500.htm It was about $200, and is OS and system independent. You simply put in two IDE drives, and you magically have RAID-1. You can hot-swap the IDE drives if necessary. We had one drive go bad and it worked perfectly. I recommend it to anybody on a budget. It takes up 2 drive bays, so it's a pretty easy fit in any standard PC.
This is the type of question on Ask Slashdot where I sometimes suspect that someone somewhere said, "We need a RAID for our server but we don't know what to get. Let's throw the question to a bunch of nerds at /. and let them work for us."
First, you must decide which RAID level meets your needs/wants. To do this, you must educate yourself on the various RAID levels and the pros and cons associated with each so you can make an informed decision. I recommend reading "The Skinny on RAID" if you want to learn the various RAID levels available.
After reading that article, you should learn about hot spares and what they can and cannot do for you. A recent article has been written about setting realistic expectations on what hot spares can do for you. "The Mythical Hot-Spare - Tape/Disk/Optical Storage" will be informative on this subject matter.
Lastly, you should read "Kill SCSI II: NetCell's RAID 0 Performance + RAID 5 Security Equals SyncRAID" to look into an innovative IDE RAID card that can give you kick ass performance and reliability. Be sure to read the benchmarks in the review so you can make an informed decision.
si vis pacem, para bellum..."if you wish peace, prepare for war"
I have friends who are fans of using cat to copy one drive to the other, then using rsync to keep the second drive up to date. If all you want to do is be able to switch drives real quick in case of a failure, this is another option available to you. The downside of using cat initially is that the drives must have identical geometry, but if you're doing RAID this isn't anything new.
In my career I've had the best results with a 0+1 RAID, also known as a striped mirror. RAID 5 has some performance hitches due to its redundancy method: you have to have a lot of disks to really get good performance and redundancy, and if you lose a disk your performance drops like a bomb.
In 0+1 it's all just data, baby! Lose a disk, just break the mirror, and you'll still get good speed until you can fix the failed disk.
"I'm tired of HD failures."
Besides going to RAID, you might also want to figure out why you are having so many HD problems. What's the temp inside your cases? Is it humid in your house?
Reason I ask is I've had dozens of computers in the last 10-15 years and have never once had a hard drive fail (knock on wood!).
When writing to RAID5 in software, some CPU time is spent on computing the parity data, but that shouldn't be a problem if you're going to use a modern CPU.
If your file server will run Linux and you will be running RAID in software, you should take a look at mdadm which is a lot better for managing your RAID array than the old RAID Tools.
BTW, here's a hdparm test of my array:
I am going to get yelled at for this, but have you considered ghost programs? You know, make an image, and then when the machine dies you rebuild from the image. Seriously, it probably takes less time than rebuilding the data from the RAID array would, and you can use the extra space you saved for more fun stuff.
"Some days you just can't get rid of a bomb."
If you want the best % of drive utilization, go for RAID 5. In a three-drive array, each stripe puts data on two of the drives and the XOR (parity) of that data on the third; which drive holds the parity rotates from stripe to stripe.
:)
That makes a little more sense, but it still seems like black magic to me. In the case of a big array, how can that one parity drive recover the data from any one dead drive? Is it because most of the data is striped across the rest of the drives? How about the case of just a 3-drive RAID-5 with 250 GB drives... two data drives (500 GB total) plus one parity drive. If one drive dies, now you only have half of the stripe data (just 250 GB worth of the stripe... every other block I suppose). But the parity drive, which is also 250 GB, is built of parity data for BOTH drives. How that can work is complete black magic to me. It seems like data is being pulled out of thin air.
If you can explain this to me, please do so!
Related question: would RAID 5 still work if the data stored was purely random (say from a bingo cage) numbers?
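Both questions can be answered with a few lines of XOR. What follows is a minimal Python sketch of the parity idea only, not how any real controller lays out stripes. The names `xor_blocks`, `drive_a`, `drive_b` and the 16-byte block size are assumptions for illustration. The data is deliberately random, bingo-cage style, to show that the contents don't matter.

```python
import os


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Two "data drives" full of random bytes, plus the parity computed from them.
drive_a = os.urandom(16)
drive_b = os.urandom(16)
parity = xor_blocks(drive_a, drive_b)

# Drive A dies: XOR the survivor with the parity to get A back.
rebuilt_a = xor_blocks(drive_b, parity)
print("drive A recovered:", rebuilt_a == drive_a)

# Same trick with more drives: parity is the XOR of all data blocks.
drive_c = os.urandom(16)
wide_parity = xor_blocks(drive_a, drive_b, drive_c)
rebuilt_b = xor_blocks(drive_a, drive_c, wide_parity)
print("drive B recovered:", rebuilt_b == drive_b)
```

Nothing comes out of thin air. The parity block is A XOR B, so XORing it with B cancels B and leaves A. You always need every surviving block plus the parity, which is exactly why a second failure before the rebuild finishes is fatal.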
What's the best stand-alone RAID system? A RAID 5 setup that is self-contained and can plug into a SCSI port on a PC? Are there IDE systems that can perform well in web/mail environments?
"Redundant Array of Independent Disks" Gotta love commercialization, eh? It used to be "Redundant Array of Inexpensive Disks", but I guess there wasn't much money to be made with that acronym. (Yes, I know that this move was made in recognition of the fact that RAID arrays are implemented for performance as much as redundancy nowadays, so 15k SCSI drives are pretty much the standard.) Anyway, back to the topic at hand. For a home server, I'd cast my vote for RAID-1. Even though I run RAID-5 on my home server, I'm just nerdy like that and I think that's overkill for the average Joe. One thing to consider that I haven't read (but might be posted already) is rebuild time. If you have a RAID-1 array and a drive fails, you can keep working. (At your own risk, mind you, since there will be no redundancy anymore.) With RAID-5, you lose a drive and things come to a screeching halt until you replace the drive. (Which is also why I have a hot spare in my system.) Someone did mention it, but I'd have to second it - RAID arrays are not a substitute for backups. If your data is important, back it up. RAID only protects from hardware failures, not user failures, file system corruptions, viruses, hackers, etc.
Dear slashdot,
I am really too lazy to use google. Instead of doing decent research on the subject by reading several HOWTOs, or taking a pencil and paper and doing some basic math I have decided to post an article to slashdot.
My question is "Why? For the love of goddess, WHY?"
Thank you for posting this fine question, and many others to add to my ignorance of google.
Kind regards,
Ann "on any moose" Coward
Don't go with Raid 1 or Raid 5
Go with Raid 3 XL
I have elected NOT to use RAID as the most
common failure mode for me is NOT hard disk
failure, rather, power supply failure. As
such, backups are more useful than RAID
In my home file server I don't use RAID at all. I have my drive with exported filesystems mounted under /export and my backup drive (an identical drive) mounted under /backup (but not all the time), and I run this every night from cron:
#!/bin/bash
# Mount the backup drive, clone /export onto it, then unmount it again.
mount /dev/yourdevicehere /backup
cd /export
find . -mount -depth | cpio -pdumv /backup
umount /backup
That way you have a perfect mirror copy of all of your data/home directories/whatever cloned on a second drive nightly, and if the first drive dies you just unplug it and remount your backup drive as /export and away you go.
Check out 3ware's latest options. Striped RAID-5's. Increased speed and redundancy. (You of course lose two drives' worth of space.) www.3ware.com
I've found the linux kernel's built-in RAID capabilities more than adequate for most of my fault tolerance needs. The best part is I can move the drives to pretty much any system - a new motherboard, whatever - without having to worry about kernel support or finding that IDE driver. If a drive fails I can boot its mirror up in any system and be in great shape. I also use the utility mdadm to email me if one of the drives fails. For some linux firewall systems I've built, I use old crappy 6GB drives, but mirror them so there's no risk if one of them goes out. Looking at my basement firewall now and...
everything is cool!
There's no place like 127.0.0.1
In addition to the plethora of external USB drives, you can now get external RAID drives that act as either a FW800 or USB2 external drive, where the RAID-1 (or RAID-0) is handled transparently by the hardware device itself. Miglia.com do one that provides RAID-1 with transparent mirroring, and you can hot-swap the drives without the OS even noticing.
The main reason I looked at one of these is because configuring a RAID system is a pain in the arse under Linux (yes, it's do-able, but if something goes wrong, you're screwed). I've had friends who have toasted entire drives because they were set up with a RAID set of partitions, and it wasn't clear how to re-create them after a disk crash.
With an external device (and FW800 gives faster throughput than the hard drive can manage on its own), if there's a problem with the host, you can just yank out the plug and stick it into a new machine. If there's a problem with the external drive (and you're using mirroring), you can just remove one of the drives and mount it into another (single) external drive until your replacement comes through.
Given the cost of Raid-5 capable drivers, you'll probably get more storage for your dollar with raid-1 than with a raid-5 solution, if you're a home user wanting to use IDE drives.
autopr0n is like, down and stuff.
Don't get too fancy with yourself on this one...
You definitely don't need any type of RAID solution because it doesn't offer you what you really need. You say you want RAID, but what you really want is backup.
All RAID solutions deal with disaster recovery, but they don't deal with the situation where you accidentally rm -rf a directory that you wanted. If you mirror or RAID 5 your drives, you're still hosed because both drives will delete the files. In the end, a backup is more important and much more convenient.
Instead, go with a better approach which is copy or tar your files every night (or every week) to a backup drive, preferably over the network on a completely different machine. This will prevent the problem of a power surge or accidental shutoff from corrupting both drives at the same time.
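For the "tar your files every night" idea, here is a minimal Python sketch using only the standard library. The function name `make_backup`, the `keep` count of 7 and the example paths are all assumptions for illustration. It writes to a local or already-mounted directory, so getting the archive onto a different machine (NFS mount, rsync, whatever) is left to you.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def make_backup(source: Path, dest_dir: Path, keep: int = 7) -> Path:
    """Write a timestamped .tar.gz of `source` into `dest_dir`.

    Only the newest `keep` archives of that source are retained.
    `source` should be a named directory such as /home, not ".".
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    archive = dest_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)
    # Timestamps sort lexicographically, so the oldest archives come first.
    archives = sorted(dest_dir.glob(f"{source.name}-*.tar.gz"))
    for old in archives[:-keep]:
        old.unlink()
    return archive


# Example call (hypothetical paths), e.g. from a script run nightly by cron:
#   make_backup(Path("/home"), Path("/mnt/backupbox/nightly"), keep=7)
```

Keeping several dated archives rather than one rolling copy is the point. If you rm -rf something on Tuesday and only notice on Friday, a mirror or a single overwritten copy has already faithfully deleted it too.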
Finally someone who knows what they're talking about!
I have two 80GB drives, with /boot (100MB) and /home (40GB) mirrored, but the rest is / on one drive and /data on the other.
Basically, I'm more worried about keeping what's in /home than I am about full failover redundancy in the case of single-disk failure. Rebuilding the OS is a reasonably painless process but some of my data is irreplaceable (and backup CDs/DVDs are too easy to lose/break/corrupt/tempting to re-use). /data holds information I don't care about so much or that I can get back (like my ripped-from-CDs-I-own music).
If zero-downtime is a critical factor for you, you probably want to RAID-1 the whole disk (just remember to copy the MBR, too!)
How appropriate. You fight like a cow.
Hotspares are way cheaper than RAID-10 and are as reliable, barring simultaneous, multiple disk failures. Most controllers will also allow you to have a single spare usable for multiple logical drives, further lowering the cost.
I see lots of people here suggesting proprietary RAID controllers and I most strongly disagree. The biggest problem is hardware obsolescence. Unless you can get an exact replacement for a failed hardware RAID controller, there is no way to recover your data when the controller fails.
To my mind, RAID 1 (mirroring) is the best solution for reliability. As mentioned elsewhere here, RAID 5 will give you more storage efficiency, but at the cost of some downtime when a drive fails.
Our NT server here runs RAID 1 (software) and we lost a drive a couple of months ago. No one in the company even knew! I RMAed the drive, had a replacement in one day. During that day, everyone in the company worked on the still-functioning drive. When the new drive came in, I had to kick everyone off for about 20 minutes while I did the drive swap, brought the system back up and let NT mirror the new drive in the background while everyone continued to work. Virtually 0 downtime. If we had hot-swap capability, it would have been 0.
And hard drives are cheap nowadays; who cares if you are only getting 50% utilization? How much is your data worth?
Over the past 20 years I've seen any number of cases where a particular drive model had a manufacturing or design defect which caused many/most of the drives to fail when the drive reached a particular age. (Anyone remember the RA81 glue problem? At a site with ~60 of them we were losing one or more a week for a while.)
You can reduce your vulnerability to this problem by mirroring between drives of different design made by different manufacturers.
Hey, dickcheese! It's Inexpensive, not Independent, you completely moronic ignoramus.
"For all you Quiz Bowl participants, let's get the fun acronymical knowledge out on the table first: RAID stands for Redundant Array of Inexpensive Disks.
'Seriously. No, wait, I'm not joking. That's really what it means. (You wouldn't believe how many people e-mail in saying that it's some combination of random, really, real-time, redundant, array, assembly, interconnected, independent, inter-relation, devices, etc).'
Despite what Techweb and other sites say, the original name as proposed by the original researchers is just that (just consult Patterson, Gibson, and Katz). There's a darn good explanation for the name, and for the genesis of RAID per se. Since the inception of the hard drive, disk I/O performance has been a persistent performance bottleneck. Here's the abstract from the above mentioned source (Patterson, et al):
'Increasing performance of CPUs and memories will be squandered if not matched by a similar performance increase in I/O. While the capacity of Single Large Expensive Disks (SLED) has grown rapidly, the performance improvement of SLED has been modest. Redundant Arrays of Inexpensive Disks (RAID), based on the magnetic disk technology developed for personal computers, offers an attractive alternative to SLED, promising improvements of an order of magnitude in performance, reliability, power consumption, and scalability. This paper introduces five levels of RAIDs, giving their relative cost/performance, and compares RAID to an IBM 3380 and a Fujitsu Super Eagle.'"
- http://arstechnica.com/paedia/r/raid-1.html -
Read it and weep, bitch!
si vis pacem, para bellum..."if you wish peace, prepare for war"
For a few years I used an inexpensive (about $50) Highpoint ATA-100 RAID card on my home development server. It was perfectly adequate. I ran two drives in a RAID-1 config. Read/Write speed was very comparable to a single-drive solution, since it was a dual-channel card and I only had one drive on each channel.
I didn't have any hardware failures, so I didn't get a chance to give it a "real" disaster recovery test, but when I disassembled that PC and plugged the drives into another machine, each one was perfectly readable. So I'm assuming that, if one of the drives had failed, I would have been able to access the data on the other drive with zero complications and problems.
I don't think you *want* a more complicated solution than this for home or small business use. For home use, two hard drives are plenty noise- and heat-producing enough. The only reason you'd want a higher RAID level would be if you needed hot-swappability, or ATA-100 performance was a real bottleneck, neither of which would be a factor for home/small office use. A higher RAID level would give you less "wasted" disk space, but it's still going to be more expensive overall, since any kind of RAID-5 card will be far more expensive than a $50 (or built-in) RAID-1 solution.
One thing I didn't like: the drives never seemed to spin down, no matter how I played with Win2K Server's power settings. I don't know if that was the driver or OS's fault, but I didn't like having those guys spin 24/7.
OtakuBooty.com: Smart, funny, sexy nerds.
I like macs, but a friend of mine suffered with one of the drives in his graphite G4 powermac for months
it went sort of like this:
beep-beep(from the firmware)...freeze
Look, I know what is the best way to implement a RAID setup. However, you have to ask yourself another question: what is the best bang for a dollar? I assume that you're just a geek(ette) that needs a safe form of backup without too much concern about performance issues. If so, go RAID 1.
RAID 1 is cheap and easy to implement. That is all you need if you are afraid of losing your MP3 collection and other crap that you have managed to gather. You can also have a small backup server where your workstations rsync important data on a nightly basis. That is exactly what I do.
Tar, gzip, encrypt, rsync. A simple Perl (fuck perl, shell!) script can do it for you and then you run it from crontab. I find that having an independent backup solution is the best way to store all my important data that is nicely spread around several workstations and two laptops. Currently, I have only one drive, but I rotate my images and that is why nothing fills up. I will get a RAID 1 setup as soon as I get extra loot.
I prefer software RAID 1. When I first set it up, I could only afford 2 120 gig disks. I kept one drive per controller (cable) and split the two disks between the built-on controller and a cheap (~$35) Promise IDE controller. I recently ran out of space and bought 2 200 gig disks, and another promise controller. The controllers play nicely together, and again I split the two drives between the two physical cards. That way, any one physical card can die and I still have my data.
I like RAID 1 because you can mount either drive alone, on any system, and get at your data. No dependence on a particular RAID controller, and perfect redundancy (RAID5 only allows 1 disk to die at a time). I ran into some stupid setup problems with the second set of disks and used the ability to mount a disk alone, without using the RAID driver in Linux (md).
Granted, you don't get as much total space, but the redundancy and ease of adding/removing 2-disk sets is key. You also get a performance hit on writes, but since my usage is primarily reads, this wasn't a huge factor for me. At 5 hard drives now, I had to upgrade the power supply (well, I didn't try the crappy one), and added a fan (I may add another).
"The universe seems neither benign nor hostile, merely indifferent." --Carl Sagan
Might as well piggyback on a good RAID discussion. Is it possible to have multiple arrays within one system? I'm building a computer and would like to have a system drive and a storage drive. I want two 74GB Raptors in RAID 1 for my system drive and two 250GB SATA drives in RAID 0 for non-critical storage. I'll be running XP Pro, and my motherboard is a DFI NFII Ultra Infinity. My motherboard has 4 SATA ports, and supports both RAID 0 and RAID 1, but I don't know if it would support both separately. Will my mobo be able to handle this, or do I need a separate controller, if there are even controllers that can do this?
"Sic Semper Tyrannosaurus Rex."
I dropped big money a few years ago on a big SCSI server drive. It broke. I lost tons of stuff. I got a replacement. It broke. I lost stuff although not so much. Screw that. If your expensive SCSI disk breaks, you're fucked. If your cheap IDE disk breaks, you're fucked. If a cheap IDE disk in a RAID array fails, no big whoop. If two or more disks in a RAID array break simultaneously, you're fucked.
Now, here's a little lesson in probability. If drive failures were independent (or the great majority of them were), the probabilities would MULTIPLY. MULTIPLYING a number < 1 by another number < 1 yields a number that is even smaller than either number. Because I understand multiplication, I suggest RAID. Get it?
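To put numbers on that, here is a minimal sketch, assuming each drive independently has the same chance of dying within some window. The 5% figure and the function name are made up for illustration, not measured failure rates.

```python
def all_fail_probability(p_single, drives=2):
    """Chance that every drive in a mirror dies in the same window,
    assuming the failures are independent of each other."""
    return p_single ** drives


p = 0.05  # assumed chance that one drive dies in the window (illustrative only)
print(all_fail_probability(p))     # two-way mirror: far smaller than p
print(all_fail_probability(p, 3))  # three-way mirror: smaller still
```

The independence assumption is the weak spot: a dead power supply, fan, or controller can take out several drives together, which is why the multiplication only helps against individual drive failures.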
Seeing as how you want data redundancy, there are three RAID levels for you to pick from:
RAID 1 - Drive mirroring.
Pros:
-Excellent read performance, no loss of performance if one drive crashes.
Cons:
-The amount of space you can have on this array is limited to the largest drive you can find. Then you have to buy a second one to mirror the data, which means you are paying double the cost per unit storage on your array.
-Write performance is slower than other RAID levels.
RAID 5 - Striped array with parity. You can stack as many drives as you want on this array (within limits of the controller of course) and lose only one for redundancy.
Pros:
-You can build a very large data array out of as many drives as you want, losing only one for the purpose of data reconstruction should a drive in the array fail.
Cons:
-Array performance dies in the event of a failure, as lost data is reconstructed on the fly from parity information stored across the remaining drives. Of course, performance is restored when the bad disk is replaced and the array reconstructed.
-You need at least 3 drives to build a RAID 5 array.
RAID 10 - Drive mirroring with striping. Essentially combines RAID 0 and RAID 1, hence RAID 10.
Pros:
-Redundant and fast. Array can survive multiple drive failures.
Cons:
-Expensive. You need at least 4 drives to get started with RAID 10, and go by 2's as you expand on the array. As with RAID 1, your price per unit storage is doubled.
-The array can survive multiple failures, but that depends on which drives die... If you lose two drives out of the same mirror set, then the array is gone.
Which RAID level you pick depends on your application. If you are interested in having something like a 1 TB data dump, you'll probably want to go RAID 5. If you only want 200GB or less in your array, then RAID 1 is probably the way to go. If you are interested in lots of space, lots of redundancy, and have lots of money, then RAID 10 is probably what you want.
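The space trade-offs above boil down to simple arithmetic. Here is a small sketch, assuming all drives in the array are the same size; the function name and the error messages are my own.

```python
def usable_capacity(level, drives, size_gb):
    """Usable space in GB for equal-size drives at the given RAID level."""
    if level == 1:
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb  # every drive holds the same copy
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb  # one drive's worth goes to parity
    if level == 10:
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return (drives // 2) * size_gb  # half the drives are mirrors
    raise ValueError("only RAID 1, 5 and 10 are covered here")


print(usable_capacity(1, 2, 200))   # 200
print(usable_capacity(5, 5, 250))   # 1000
print(usable_capacity(10, 4, 250))  # 500
```

So five 250GB drives in RAID 5 give the 1 TB data dump, while the same money spent on RAID 10 buys less space but more speed.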
-R
My setup is a RAID-1 setup, implemented through a hardware RAID controller. One disk is mounted permanently in the computer, the other is in a removable drive carriage. There is a third disk, also in a removable drive carriage.
What this gains is offsite storage without effort. Just pull out the removable drive carriage, and travel to your offsite storage (i.e. your desk at work). Grab the third disk (which was in your desk already) and take it home, plug it into the now empty drive carriage slot. Break the mirror set and re-establish it with the new drive (using the permanently-mounted drive as the source of the mirror of course).
Now if your house burns down, or the feds raid you, you still have your data. Offsite backup with almost no effort, no tape drives, no schedules. The downside is of course, you effectively use only 1/3 of your storage space. But what is more important, loss of data or being cheap?
www.grc.com
;)
now does NTFS. Linux and just about any other filesystem you care to mention!
I'm a big fan of RAID 10. With the cheap price of space these days, you can basically have the best of both worlds: faster performance and better dependability than a single drive.
Though RAID 5 is the most popular, I've found the write penalty to be a problem. And rebuilds are slow. Anyways, check out this page about different RAID types. I found it most helpful.
Cheers.
The whole point of RAID (Redundant Array of Inexpensive Disks) is that failure is a when, not an if proposition. MTBF (Mean Time Between Failures) ratings are averages across all units in a model line. For every HD like yours that beats the MTBF, there's one that dies early, or two that die a little less early.
Right now, the cheapest HDs per GB are 30GB@$3 = $0.10/GB. Cheapest RAID controller card is $15 for 4 drives; a PCI PIII/1GHz server stacked with 24 drives gives 720GB (down to about 600GB with RAID redundancy) for about $400. Large capacity drives (~160GB) are at $0.50/GB, so your $400 server gets you about the same storage, but no RAID. Add the faster seek times by switching over more IDE buses rather than moving fewer drive heads, and the RAID promise delivers.
--
make install -not war
try "man google"
.... Cliff - why did you even consider this as an article?
And stop asking stupid questions
Just click on the link. http://www.iomega.com/europe/support/english/documents/11243e.html
It's what I use, combined with (4) 18GB Atlas 10k III's (The drives have a seven year warranty) in a RAID 0 array. It costs 3 times as much as most motherboards and better than most motherboards.
I'm not using RAID 5 (though the 2100s supports it) because I need the space. I just run Rapidbackup to back up select data to my old IDE drive. My system is now 3 years old and it still performs fine in all but the newest games. I can run 5-6 copies of Diablo II Expansion simultaneously with almost no noticeable slowdown.
Last I checked the 2100s was the cheapest hardware raid system out there and it's probably worth it.
I only wish I could afford 36GB drives instead of my 18's, if I could I would use RAID 5.
Question everything
I personally have a Raid 5 setup. Never had a failure nor any real problems once I made sure the bios/driver versions matched. When building a system for home I think it is important to consider the available equipment. Promise makes a line of cost effective cards in their SX series. I would look at a 4 channel card that will handle 4 drives. The 6 channel cards are so long that they may not fit in your case! I added 4 x 120GB drives and created 360GB of usable space. The card was only $150 and drives are so cheap these days. For fun I have removed a drive and watched it be rebuilt when I turned it back on. At that price point I don't think you can go wrong.
I'm surprised the subject of software or hardware-based RAID hasn't come up. If one is REALLY on the cheap, you could always just utilize software RAID in Linux, *BSD, or Windows (NT-based versions, level 5 only). Just as secure, but with a definite performance trade-off.
:)
Then again, you ARE asking Slashdot, so you're already getting a performance trade-off on your advice.
Are you kidding? Who gives a shit if you're only doing RAID 0? You're dropping reliability through the floor and asking to lose all your data anyway.
It sounds to me like this guy just needs a quality HDD and good tape backup. Do not put your faith in RAID, put it in a good off-site backup. I've seen RAID solutions fail too many times. I've seen RAID solutions fail twice recently. The first one was a company with a slick server and nice hot-swappable SCSI drives but their controller card went out. It was replaced by the manufacturer but the techs were unable to recover the data. The next one happened when a machine's case fan went out and the mirrored HDDs cooked themselves to death. The moral of the story: NEVER TRUST RAID and as always keep a backup.
--Gentoo Baby!
RAID5 is a much bigger performance drain in most setups, it's also pushing boxes up to 4+ disks (realistically with raid5 you need a hot spare) and that pushes it out of 1U/2U and mini cases.
IDE is so cheap you might as well just buy two big sata drives for most usage. Do make sure you buy two drives from two different vendors - it's really embarrassing when you use two identical drives with near serial numbers and they fail the same day.
Also keep external backups. One place I worked we lost an entire array and the hot spare to a PSU failure. No backups... thankfully it was the usenet spool.
It should be called inexpensive because they are not independent. Think about it. If you take a hard drive out of a server which is part of a RAID array, like RAID 5, and put it in a different computer can you still get information off of it? No. That data is spread across multiple hard drives. Hence the reason they are not independent. They DEPEND on each other to run in the RAID array.
Raid 0 (stripe) - Fast, Cheap - the more drives the faster - no redundancy
Raid 1 (mirror) - Pretty Fast, Cheap - 2 drives - redundant - most hardware raid will rebuild failed drives on the fly, performance will be affected though
Raid 5 - Redundancy is more important than speed - min 3 drives - total data space is *about* the total of all disks minus one drive (3x36G ~ 72G R5), rebuild failed drive on the fly, performance not as impacted as R1 rebuild.
Raid 10 - Fast, Expensive, Redundant - 4 drives min. Striped Mirror Set. IMHO the way to go if you have the dinero.
Pretty sure this will get modded as redundant :)
-- kortex "Not everything that counts can be counted, and not everything that can be counted counts"
Has anyone had any experience with Dell's RAID standalone products?
We've been using Compaq's for storage and they work well but getting parts is expensive. The standard Seagate drives are modified for Compaq so you have to get their special OEM versions at much higher cost. I'd like to find a cheap, reliable, pro level raid array and have been looking at some of Dell's products. Anyone have experience?
I like raid 10, 0+1, whatever your vendor may call it. It requires four hard drives, offers a significant speed boost, and protects you from any single drive failure. Additionally, with four drives arranged as two striped mirror pairs, it will survive a 2 disk failure in 2/3 of cases (whenever the two dead disks are in different pairs).
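The two-disk survival odds can be checked by brute force. This sketch assumes the four-drive layout is two mirror pairs striped together (RAID 1+0); a mirror of two stripes (0+1) dies in more of the cases. The disk labels and the function name are invented for the example.

```python
from fractions import Fraction
from itertools import combinations


def survival_fraction(mirror_pairs, failures=2):
    """Fraction of simultaneous-failure combinations that a stripe of
    mirror pairs survives. The array is lost only when both disks of
    the same pair are among the failed ones."""
    disks = [disk for pair in mirror_pairs for disk in pair]
    combos = list(combinations(disks, failures))
    survived = sum(
        1
        for failed in combos
        if not any(set(pair) <= set(failed) for pair in mirror_pairs)
    )
    return Fraction(survived, len(combos))


pairs = [("a1", "a2"), ("b1", "b2")]
print(survival_fraction(pairs))  # 2/3
```

Of the six ways to lose two of four disks, only the two that hit both halves of one mirror are fatal, so four out of six survive.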
Rebuilding a raid array is definitely easier than restoring from backup, but it can take some significant time which may be spent offline. Hot-swappable drives and online array rebuilding are definite features you may want depending on the importance of uptime.
I've been doing RAID 1 with 2 disks and a raid card for a few months and I like the margin of storage safety it gives me. However, I have been having some problems with my card (or setup) and my raid array has failed occasionally lately.
;-)
One problem revealed by this is something I've not seen discussed here, and that is that I have to manually reset my raid and reboot my system to get back online (usually have to break the array and re-duplicate one of the disks to re-create the array).
What I'd like is a RAID that keeps on working as long as there is at least one good image of the disk, and lets me fix it whenever I get around to it.
For instance, if I am running a webserver and one of my mirror disks crashes in the night I would like the webserver to keep on chugging like nothing happened because there is still one good disk there. Does anyone know of any non-stop RAID cards or software or systems or research?
Thanks,
Tom
(-; Does sig advertising work? Just did! Email me for rates!
Here's my set up....
:-)
1 x highpoint HPT374 (4 channel IDE)
2 x 20GB IDE - OS installed here
2 x 120GB IDE - soon to be upgraded
2 x 300GB IDE - used to be 120s
2 x 300GB IDE - used to be 120s
atapci0: port 0xb000-0xb0ff,0xac00-0xac03,0xa800-0xa807,0xa400-0xa403,0xa000-0xa007 irq 11 at device 7.0 on pci0
ata2: at 0xa000 on atapci0
ata3: at 0xa800 on atapci0
atapci1: port 0xc400-0xc4ff,0xc000-0xc003,0xbc00-0xbc07,0xb800-0xb803,0xb400-0xb407 irq 11 at device 7.1 on pci0
ata4: at 0xb400 on atapci1
ata5: at 0xbc00 on atapci1
.
.
ar0: 19541MB [2491/255/63] status: READY subdisks:
0 READY ad4: 19541MB [39703/16/63] at ata2-master UDMA100
1 READY ad6: 19541MB [39703/16/63] at ata3-master UDMA100
ar1: 117800MB [15017/255/63] status: READY subdisks:
0 READY ad5: 117800MB [239340/16/63] at ata2-slave UDMA100
1 READY ad7: 117800MB [239340/16/63] at ata3-slave UDMA100
ar2: 286103MB [36473/255/63] status: READY subdisks:
0 READY ad8: 286103MB [581290/16/63] at ata4-master UDMA133
1 READY ad10: 286103MB [581290/16/63] at ata5-master UDMA133
ar3: 286103MB [36473/255/63] status: READY subdisks:
0 READY ad9: 286103MB [581290/16/63] at ata4-slave UDMA133
1 READY ad11: 286103MB [581290/16/63] at ata5-slave UDMA133
I haven't installed any of the Highpoint software, but I do run "atacontrol status " from cron once a day.
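A daily cron check like that is easy to automate a bit further. The sketch below scans status text shaped like the listing above and reports any array whose status is not READY. The function name and the DEGRADED status in the sample are assumptions for illustration; check what your own controller actually prints.

```python
import re

# Matches array header lines such as "ar1: 117800MB [...] status: READY subdisks:"
ARRAY_LINE = re.compile(r"^(ar\d+):.*\bstatus:\s*(\w+)")


def degraded_arrays(status_text):
    """Return (array, status) pairs for every array not reporting READY."""
    problems = []
    for line in status_text.splitlines():
        match = ARRAY_LINE.match(line.strip())
        if match and match.group(2) != "READY":
            problems.append((match.group(1), match.group(2)))
    return problems


# Sample text modeled on the listing above; the DEGRADED line is invented.
sample = """\
ar0: 19541MB [2491/255/63] status: READY subdisks:
 0 READY ad4: 19541MB [39703/16/63] at ata2-master UDMA100
ar1: 117800MB [15017/255/63] status: DEGRADED subdisks:
"""
print(degraded_arrays(sample))  # [('ar1', 'DEGRADED')]
```

Feeding it the text captured by the cron job and mailing yourself a non-empty result would save reading the output by eye every day.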
When I upgraded the disks I had to break the mirror, install new drive, copy files across, re-mirror (which included syncing 300GB, although I'm not sure if it syncs non-used parts of the disk). I used the firmware to do this.... it took hours and hours! Lots of downtime, but I just left it overnight.
Perhaps the software that comes with it would sync the disks "in the background", but I couldn't get the firmware to let me re-mirror without syncing (possibly me being thick, but it's not something you have to do very often).
Anyhow.... I also e-mailed Highpoint support and thanked them for supporting FreeBSD
Auto-check your UK lottery lines
I'm not one for spelling flames, but why is it so hard for slashdot geeks to spell the word "FUCK"??? The S and U keys are nowhere near each other.
ok, just a few things.
You didn't indicate sw raid or hw raid, so I assume you're talking about sw raid.
In a nutshell, use the easiest configuration you can get away with. You don't want to spend time troubleshooting RAID stuff. Buy identical drives and consider making the whole drive as a partition.
I had major problems with maintaining software raid when I tried to do software raid on a 3 hard drive system. Actually I used raid 1 (I think) for the /boot and raid 5 for /.
Putting /boot on RAID was a majorly bad idea. First, because if the array fails, you're basically in sad shape. (Two other problems I had were that the third hd was on a separate controller card, and the BIOS didn't allow booting from a controller card.)
Second, on my gentoo, grub was doing really funny things. I spent way too much time trying to get grub or lilo to boot without the use of my array. Looking back, it seems like a big bother.
Instead, I put root on an unraided drive, and did RAID-1 for the two remaining drives, using one partition for the entire hard drive.
/dev/hda1 /boot 100mb
/dev/hda2 1.5 gigs, swap
/dev/hda3 / the rest
/dev/hdg1 = /dev/md1 = /home RAID1
/dev/hde1 (the whole 60 gig drive)
My critical need is having protection of my /home data (I put my mysql db on it).
I have less of a need to keep my system up 24x7, but I'm probably going to use a 2nd system (as my personal working machine for everyday use) which has an rsync copy of my /home data and configured similarly (web server, etc). That way, if the system itself goes down, I could just use the 2nd system as my primary system and rely on my rsync of /home to restore my data. (Also, btw, I backup my most critical data on my unraided / directory on my primary machine.)
Like I said, I'm more worried about my /home user data, not so much worried about configurations. Come to think of it, maybe I should rsync/backup the /etc directory too.
Robert Nagle, Idiotprogrammer, Houston
Robert Nagle, Idiotprogrammer, Houston
How come no one has mentioned RAID6?
Like RAID5, but "two" drives with redundancy, so any two drives in the system can crash without losing data (or the tempo). I thought I heard something about that there was Linux support for it now?
and I think only the guy who made the asshats comment realized it.
The original name was ( repeat after me kids! ) Redundant Array of Inexpensive Disks.
Don't believe me? Check out the Wikipedia entry for RAID.
The new name is more applicable, since RAID arrays sure as shit ain't inexpensive any more, but learn some history people...
PC moderators can suck my White pierced, tattooed dick. If you think pride == hate, s/dick/Aryan meat mallet/g.
I have the Epox EP-4PCA3+ motherboard running 4 Western Digital WD800JB drives in RAID 5. I have been very happy with the setup. HTH, Bod
"I say we take off, nuke the site from orbit. It's the only way to be sure."
Two drive RAID 1 mirroring is good. We've had a lot of trouble recently getting tech support from Promise Technology, so we have switched to HighPoint RocketRAID 133 adapter cards.
These RAID cards use the main CPU, they don't have on-board microprocessors. This causes some problems in Windows XP when you have a script that runs at startup. Some commands in the script will sometimes cause the mirror to break, apparently. Apparently Microsoft has not integrated some of the CLI commands into Windows XP yet. This was such a big problem that I wrote a paper on it for Microsoft technical support: Windows XP problems: Port Re-direction.
If you are willing to spend a little more, a lot of people suggest 3Ware products: 7006-2 adapter cards, for example. We have no experience with them. They have a drawback, compared to HighPoint cards: They won't boot with just one drive, according to 3Ware technical support. After the drives are used in a mirror, they will not boot from the IDE adapter on the motherboard. This could be a big drawback if your 3Ware card is not working for some reason. Possibly 3Ware cards available in the future will not be compatible, leaving you no way to get your data from the drives. If the card fails, you will at least have to buy another one to be able to see your data.
The advantage with 3Ware cards is that there is a CPU on the adapter, leaving no way for MS bugs to cause the mirror to break. That system is also faster, of course.
I wrote a Slashdot article about RAID 1: Mirroring Controllers - What have been Your Experiences?. Note that the Slashdot software has a bug that will not let you see all the comments in nested mode. That bug is years old.
Slashdot has run a number of articles from people who wrestle with the data reliability problem.
Acronis makes backup software that has been generally good for us. It is possible to do a full hard disk backup of a Windows XP hard drive while Windows XP is running. (This uses XP's Shadow Copy mode.)
Slashdot also published a story I wrote about drive imaging software: Experiences w/ Drive Imaging Software?. Best sentence: "Microsoft Windows 2000 and Windows XP have crippled file systems. The file system cannot copy some of the files that are necessary to the operating system. If you don't have experience with Microsoft operating systems, you may find this amazing..."
Windows XP keeps most of its settings in files collectively called the registry. So, no backup is complete unless you back up everything on the boot drive. MS tech support has told me many times that there is no way to do this with Microsoft tools. They recommend a "third party" method. We've tried the third party methods, and had a lot of grief with everything except Acronis. Symantec has given us poor and unfriendly technical support, in my opinion. Symantec bought its competitor PowerQuest; I view that as a bad sign.
It is really, really miserable for me that Microsoft treats me, and every customer, as a criminal by building in copy protection that mixes all the programs and settings together; the copy protection causes me a lot of grief, and significantly damages the entire design of the OS. Linux is a very strong competitor in that area. Everyone is a friend of Linux, users are not criminals, and the OS design is not degra
Before I continue I'd probably better own up that I work for an IBM reseller so I'm using their standard terminology for RAID levels but the other major vendors (HPaq/Dell/Adaptec/etc) have similar things available but may call it something different (aren't standards great!)
There are numerous RAID levels but the most common are RAID-0,1,5 and, more recently, RAID-1E, RAID-50 and RAID-5E. There are also several RAID-x0 options (RAID-00, 10, 1E0 and 50)
RAID-0 also known as data striping. It is well-suited for program libraries requiring rapid loading of large tables, or more generally, applications requiring fast access to read-only data, or fast writing. RAID 0 is only designed to increase performance; there is no redundancy, so any disk failures require reloading from backups. Select RAID Level 0 for applications that would benefit from the increased performance capabilities of this RAID Level. Never use this level for critical applications that require high availability.
RAID-1 RAID 1 is also known as disk mirroring. It is most suited to applications that require high data availability, good read response times, and where cost is a secondary issue. The response time for writes can be somewhat slower than for a single disk, depending on the write policy; the writes can either be executed in parallel for speed or serially for safety. Select RAID Level 1 for applications with a high percentage of read operations and where the cost is not the major concern.
RAID-2 and RAID-3 - RAID 3 and RAID 2 are parallel process array mechanisms, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks (a fixed amount of data), and each chunk is written out to the same physical position on separate disks (in parallel). More advanced versions of RAID 2 and 3 synchronize the disk spindles so that the reads and writes can truly occur simultaneously (minimizing rotational latency buildups between disks). This architecture requires parity information to be written for each stripe of data; the difference between RAID 2 and RAID 3 is that RAID 2 can utilize multiple disk drives for parity, while RAID 3 can use only one. The LVM does not support Raid 3; therefore, a RAID 3 array must be used as a raw device from the host system.
Performance is very good for large amounts of data but poor for small requests because every drive is always involved, and there can be no overlapped or independent operation. It is well-suited for large data objects such as CAD/CAM or image files, or applications requiring sequential access to large data files. Select RAID 3 for applications that process large blocks of data. RAID 3 provides redundancy without the high overhead incurred by mirroring in RAID 1.
RAID-4 RAID 4 addresses some of the disadvantages of RAID 3 by using larger chunks of data and striping the data across all of the drives except the one reserved for parity. Write requests require a read/modify/update cycle that creates a bottleneck at the single parity drive. Therefore, RAID 4 is not used as often as RAID 5, which implements the same process, but without the parity volume bottleneck.
RAID-5 RAID 5, as has been mentioned, is very similar to RAID 4. The difference is that the parity information is distributed across the same disks used for the data, thereby eliminating the bottleneck. Parity data is never stored on the same drive as the chunks that it protects. This means that concurrent read and write operations can now be performed, and there are performance increases due to the availability of an extra disk (the disk previously used for parity). There are other enhancements possible to further increase data transfer rates, such as caching simultaneous reads from the disks and transferring that inform
I'm doing the same thing at home. I have three identical drives. One, the primary, is sitting in the server, the secondary is unmounted in a removable tray in the server, and the third is also in a tray but at a distant location.
Initially I've dd'ed the primary to the other two disks.
Every morning the primary is 'cp -fpRu'ed to the second one. No files are deleted on the secondary, unless I'm running out of diskspace there, at which time I do an 'rsync -aH --delete' after some verifications.
Every few weeks I bring in the third, take down the server, swap it with the secondary, and take the swapped-out disk back to the distant location.
I feel pragmatically protected. In the case of a crash I won't lose more than a day of work. In the case of burglary, fire or Gotterdammerung, a few weeks.
Next time I rebuild the file server I'll make the 2nd and 3rd external FireWire or USB 2 Hi-Speed drives.
Fingers crossed.
Flourescent (adj): smelling like ground wheat.
None of my crap is worth backing up, I can always get more pron, and I dont think anyone wants to see my flash movies of hamsters with top hats. jojo
I had an excellent experience using the level 5 RAID capability built into Linux.
Some years ago I set up a 100G Level 5 partition using 5 hard drives around 30G each. I used it for my main file archive and also to serve as a backing store for other workstations (most of whose drives at the time were several Gig each).
After several years of faithful operation, one of the hard drives failed and literally several months went by without my even noticing. Linux automatically recovered from the error and continued serving files but without the redundancy. (Greater diligence on my part was called for but the story still has a happy ending.) The problem didn't come to my attention until a second drive failed and the entire raid partition failed to come up. Fortunately, Linux had some reconstruction utilities that allowed a complete recovery.
The bottom line is that I retained all my data despite TWO hard drive failures. And if I had been paying closer attention and attended to the problem as soon as the first drive failed there would have been no need for the reconstruction utilities.
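Running degraded for months without noticing is avoidable with a small check. Linux software RAID reports member counts in /proc/mdstat as a pair like [5/4] followed by a map like [_UUUU]; the sketch below flags arrays with fewer active members than expected. The function name and the sample text (device names, block count) are made up for illustration.

```python
import re

# "[5/4] [_UUUU]" means 5 members expected, 4 active, first one missing.
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_md_arrays(mdstat_text):
    """Return {array: (expected, active)} for arrays missing members."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        found = STATUS.search(line)
        if current and found:
            expected, active = int(found.group(1)), int(found.group(2))
            if active < expected:
                degraded[current] = (expected, active)
    return degraded


# Invented sample in the style of /proc/mdstat for a 5-disk RAID 5 with one disk gone.
sample = """\
Personalities : [raid5]
md0 : active raid5 hde1[4] hdd1[3] hdc1[2] hdb1[1]
      120000000 blocks level 5, 64k chunk, algorithm 2 [5/4] [_UUUU]
unused devices: <none>
"""
print(degraded_md_arrays(sample))  # {'md0': (5, 4)}
```

On a real box you would pass it the contents of /proc/mdstat from a cron job and send yourself mail whenever the result is not empty.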
I'm now in the process of putting together a Terabyte RAID level 5 array (5 250G drives).
I heartily endorse the built-in Linux RAID features over any hardware alternatives. If you're talking Windows, then you may be stuck with using hardware. Before you do, I would look very carefully at the recovery tools. In a lot of cases they're not very useful or robust. If you can afford it, I'd suggest setting up a Linux file server separate from your workstation. You need a minimum of 3 drives for level 5 raid. Drives are cheap and you don't need a fast cpu/motherboard to run the server.
HARDWARE RAID 5. Time-tested and nerd-approved. Forget about what people tell you about mirroring and duplexing or software RAID or this serial ata crap. Use SCSI RAID, period.
Get an *Adaptec* RAID adapter and 3 SCSI drives. Label each drive with the SCSI address that you decide to give them.
Build your array. You lose the space of one drive for parity, big deal. You can configure software to alert you to a drive failure. Or the card may already be set to spew a high pitched alarm that will let you know in a big hurry.
So your drive failed. What do you do? Well, your server keeps right on trucking. The software tells you the address of the failed drive. If you didn't label them like I told you, you can have the software "blink" (flash the LED of the drive in question) so there is no doubt which one is fubar. Replace the bad drive with a good one, tell the controller to rebuild the array and you're done.
If you truly want near hassle-free, redundant operation, use SCSI Hardware RAID 5.
I provide a high-uptime guarantee for my hosted clients. I don't actually host all that much information, just a few gigabytes of heavily databased information. As such, I get few hits, but each hit is very, very important.
1) Primary server is configured with IDE RAID level 1, two drives mirroring each other in realtime, 80 GB each.
2) Hot failover server on a different network, different city, with the same size drives as the primary system. (Backup server is not raid, tho) This is for failover in case of severe emergency.
3) Network backups performed at a 3rd offsite location using rsync over ssh. The scripting I use (in PHP) is available at effortlessis.com/backupbuddy . (though I need to update the current release) This is a cheap, low-end dedicated system with big, cheap IDE drives (~ 400 GB) that just backs them up.
It's incremented going back about 1.5 months. (At any point, I can roll back the system to any point as far back as 1.5 months)
With this setup, any two systems can fail completely and I'll still have virtually no data loss.
I have no problem with your religion until you decide it's reason to deprive others of the truth.
It seems that RAID 5 or 10 is the way to go. Here are some instructions on building a fairly fault tolerant server. The page deals with setting up a server with RAID 50 but can be switched to 51 for superior redundancy.
Overall it's about how much money you want to spend. If you have ooodles of cash go for a real server with RAID 60 and daily tape backups.
This doesn't seem like overkill.
I'm setting up something similar for my father-in-law's digital photo collection. Once I get my own system better set up, we'll each have a system such that:
Files get saved onto disk1
Nightly copy onto disk2
Nightly rsync moves disk2 on my system to disk2 on his and vice versa.
We have a lot of data to start with, but rarely add more than a few 10's of megabytes at a time, so the rsync shouldn't consume too much of our broadband.
If that gets too heavy going then I'll get a pair of firewire external disks and always leave one at my office.
To make a long story short, use RAID 5.
The minimum is 3 disks. RAID5 will give you decent performance (unlike RAID 1) and one dead disk won't lose any data.
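Why one dead disk loses nothing: RAID 5 keeps an XOR parity block for each stripe, and XOR-ing the surviving blocks with the parity gives back the missing one. Here is a toy sketch of that idea with three data blocks; the block contents and function name are invented, and a real array also rotates the parity block across the disks from stripe to stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity computed from them.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] died: rebuild it from the others plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

Lose a second disk before the rebuild finishes and there is no longer enough information to solve for the missing blocks, which is why RAID 5 covers exactly one failure.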
Why not read a few FAQ entries at StorageReview?
In short, I would probably recommend RAID5 if you have 3+ drives.
RAID5 gives you the most available space while still being redundant. It allows for exactly one hard drive failure.
RAID5's write speed is usually terrible, especially with a small number of drives, but write speed isn't a big deal on my home file server. (Only you know about your needs).
RAID1+0 (NOT RAID 0+1, which is inferior) is great for performance. With 4 drives, you have potentially twice the STR of one drive (writing) and 4 times the STR of one drive reading. Of course, since STR is not important for most IO, this doesn't really affect your end performance much unless you are dealing with linearly reading/writing very large files.
Writing performance will almost certainly be higher than with RAID5.
You do lose quite a lot of space (especially when you use a large number of drives). If you used a 4-drive 1+0 array, you would have the space in two of those individual drives.
RAID1 is nice, and is very reliable, but is impractical with more than two drives unless you are incredibly paranoid. RAID1 simply makes all drives copies of the others; thus, you always have as much space as one drive would have, even if you have ten. Of course, you could also handle 9 drive failures and not lose data. RAID1 is fine for 2-drive arrays though.
DO NOT FORGET that RAID is no substitute for regular backups. RAID will not help if your data loss is caused by FS corruption, a cracker, accidentally typing "rm -rf /", etc.
For lowest cost, I would use software RAID, such as Linux's LVM, FreeBSD's Vinum, or whatever Windows has. (RAID5 requires Windows server). (I would not use Windows as the file server myself).
For slightly higher cost, try a Promise controller.
I would avoid Highpoint and Silicon Image controllers. Highpoint, especially, is crap. (but it is very cheap, at least).
If you possibly can, I would recommend a nice 3Ware Escalade controller. Escalades are true hardware RAID cards, unlike Highpoint/SI and most of Promise's cards, and are OS independent and very stable (with certain exceptions for some unlikely configurations).
If you have any questions, you might try the StorageReview forums. There are a number of extremely knowledgeable people there, including engineers and executive-level researchers at hard drive companies. They can give far better advice than I can, I am sure.
By the way, all my comments assume that all drives are the same size. If not, treat all drives as if they are the same size as the smallest drive on the array (unless you are using JBOD, which is not redundant)
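The space figures above can be sketched in a few lines of Python. This is only a rough illustration of the rules of thumb in this comment (RAID1 gives one drive, RAID5 gives N-1, RAID1+0 gives half, and every member counts as the smallest drive); the function name and the level strings are made up for the example.

```python
def usable_capacity(level, drive_sizes_gb):
    """Rough usable capacity in GB for RAID 1, 5 and 1+0.

    Every member is treated as if it were the size of the smallest
    drive, as described in the comment above.
    """
    n = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if level == "1":
        if n < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return smallest                 # all drives are copies of one another
    if level == "5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n - 1) * smallest       # one drive's worth goes to parity
    if level == "10":
        if n < 4 or n % 2:
            raise ValueError("RAID 1+0 needs an even number of drives, 4 or more")
        return (n // 2) * smallest      # half the drives hold mirror copies
    raise ValueError("unsupported level: %r" % level)


print(usable_capacity("5", [200] * 5))        # 800
print(usable_capacity("10", [120] * 4))       # 240
print(usable_capacity("1", [100] * 10))       # 100
print(usable_capacity("5", [120, 120, 80]))   # 160, limited by the 80GB drive
```

These are raw sizes before formatting, so expect a bit less in practice.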
Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
I see nothing about variable implementations. Crappy RAID controllers aren't worth much. A good RAID controller has its own BIOS and its own processor. That means it actually controls all aspects of the drive interface. Most cheap RAID 1 implementations do not implement this.
I have seen a lot of people with a boot drive and a RAID implementation as their primary data partition. This is nicer, I suppose, than nothing. This does nothing for you if you have a total loss of the boot drive. To me, RAID implementation is about saving my own time, not having to do system recoveries at all. I don't like working on the same thing repeatedly - call me lazy.
Here is my rig:
Proliant 5500 server (4x400mhz P2 Xeon, 512mb RAM running Gentoo) w/redundant power supplies and dual redundant fans.
6x 18.2gb UW 10k SCSI hotswap drives configured up in a 70GB RAID-5 array with 1 hot spare.
Smart 2/P array controller w/4MB cache
1500va UPS
4-tape DAT autoloader for backup
Cost: about $300 on Ebay (UPS was $125 new in addition), plus about $150 shipping.
My rig is more reliable than anything anyone has mentioned here, and probably cheaper. True, it sounds like a wind tunnel, but you can put it in a cabinet if you really want to. If you want your data to be _safe_, this is the right way to go.
Imagine, all of this for my collection of mp3's and pr0n.
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
Since you mention the magic keyword "inexpensive" I'll mention that you should factor in the cost of running a PC 24/7. I've seen estimates that put this at around $5/mo for an average PC.
You keep that running for a year, and suddenly commercial solutions like the Snap Server start making a lot of sense. They draw much less power as they are designed from the ground up to be NAS devices. Rsync the snap every few minutes and you get redundancy to boot.
By the same reasoning, many home users who roll their own firewall/router solutions with old PCs running linux completely forget the power costs and don't realize that the little $50 linksys router would pay for itself in just a few months of electricity charges for the old PC.
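If you want to check the numbers for your own box, here is a small Python sketch. The wattages and the price per kWh are assumptions for illustration, not measurements; only the $5/mo estimate and the $50 router price come from the comment above.

```python
def monthly_power_cost(watts, dollars_per_kwh, hours_per_day=24.0, days=30):
    """Cost of running a device drawing `watts` for one month."""
    kwh = watts / 1000.0 * hours_per_day * days
    return kwh * dollars_per_kwh


def months_to_break_even(device_price, old_watts, new_watts, dollars_per_kwh):
    """Months until a lower-power replacement pays for itself.

    Returns None if the replacement does not actually save power.
    """
    saving = (monthly_power_cost(old_watts, dollars_per_kwh)
              - monthly_power_cost(new_watts, dollars_per_kwh))
    if saving <= 0:
        return None
    return device_price / saving


# Assumed: a 100 W PC and $0.07 per kWh, which lands near the $5/mo estimate.
print(round(monthly_power_cost(100, 0.07), 2))   # 5.04
```

The break-even time depends heavily on what the old PC really draws and what you pay per kWh, so plug in your own figures before deciding.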
slashsearch.org - slashdot search. powered by google.
but what is everyone backing up that they would need a RAID array? Is this just a nerd thing? I don't get it....
Any good backup system that really needs to protect data needs backup on more than one system. Although power supplies may have some internal protection, a sudden surge from a suddenly failing supply could take out every hard drive in your raid system. Backing up to a completely separate system might be safer than a straight raid system.
Is hardware failure really all you would like to address? I think it more likely that inadvertently deleting or modifying something occurs much more frequently. RAID doesn't help with this, the deletion is written to the array.
I address this by giving my drive 2 mirrors. One of the mirrors is there for drive failure protection, the other mirror I use as a backup device, which is split off at a regular interval with a script. The stale mirror serves as a point in time backup. I can mount it and get something off of it if I need to.
To make the next backup, the scripts adds the second mirror back in, lets it sync up (freshens up the backup) and then splits it off again. I do this daily, but you could set any reasonable interval.
HTH
Greg
I've been told I'm a bit obsessive about home backups, but...
1. Four disk RAID 5 on Fileserver A (hardware controller)
2. Monthly full backup to disk on server B. Skip non-critical data that can be replaced (mp3s, movies, etc.).
a. Immediately encrypt a copy of the critical data to DVD media and store it in a fireproof safe.
b. Immediately encrypt a copy and send it to another system offsite via portable HDD.
3. Weekly backups of everything that has changed since the last full or weekly backup to disk on server B. Save to encrypted DVD and put in safe whenever a 4.7 gig chunk is created.
4. Daily incremental backup of all changed data since previous backup of any level to disk on server B. Save to encrypted DVD and put in safe whenever a 4.7 gig chunk is created.
5. Weekly mirror (via rsync) of entire data filesystem to encrypted fs on USB hard drive. Dump root/boot/var/other system partitions via dumpe2fs to the same encrypted fs.
6. Daily copy of super-critical data to an encrypted USB keychain that I carry with me when I leave the house.
7. Encrypted copy of data from step 6 to a second offsite location.
8. Encrypted hourly database backups to both offsite locations.
9. Files in one specific folder are checked every ten minutes for changes. Any file changed in the last ten minutes is copied, with a timestamp in its name, to a separate folder. Anything in that folder older than 48 hours is erased. (This is because MS Word keeps eating one really big document and corrupting the on-disk version.)
10. Desktops backed up to server B as above and stored as above.
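Step 9 is the easiest one to reproduce. A minimal Python sketch of the idea follows; it is not the poster's actual script (that one appears to be ksh), and the function name, the folder arguments and the timestamp format are all assumptions. Run it from cron every ten minutes.

```python
import os
import shutil
import time


def snapshot_recent(src_dir, dst_dir, window_s=600, keep_s=48 * 3600, now=None):
    """Copy recently changed files to dst_dir and prune old snapshots.

    Files modified within the last `window_s` seconds are copied with a
    timestamp suffix; snapshots older than `keep_s` seconds are removed.
    Returns (copied, removed) lists of paths.
    """
    now = time.time() if now is None else now
    os.makedirs(dst_dir, exist_ok=True)

    copied = []
    for name in os.listdir(src_dir):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue
        mtime = os.path.getmtime(path)
        if now - mtime <= window_s:
            stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(mtime))
            target = os.path.join(dst_dir, "%s.%s" % (name, stamp))
            if not os.path.exists(target):      # this version is already saved
                shutil.copy2(path, target)      # copy2 keeps the original mtime
                copied.append(target)

    removed = []
    for name in os.listdir(dst_dir):
        path = os.path.join(dst_dir, name)
        # Age is judged by mtime, i.e. when that version was last changed.
        if os.path.isfile(path) and now - os.path.getmtime(path) > keep_s:
            os.remove(path)
            removed.append(path)
    return copied, removed
```

It only looks at the top level of the folder, which is enough for one troublesome document.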
Hardware
Fileserver
Dell PowerEdge 600SC 2.4 GHz P4
2 GIG RAM
USB 2.0 250 GB hard drive
DRU-500A DVD Burner
4 x 6Y200M0 Maxtor 200 gig HDDs for data (RAID 5)
1 x WD 40 gig root disk
1 x spare 6Y200M0 for cold spare.
LSI Logic SATA150-4 RAID Controller
Backup Server
Home-grown 1 GHZ PIII (x2) Abit VP6
2 gig RAM
4x 80 Gig WD HDD (striped)
Offsite Server 1
Dell PowerEdge 600 SC 2.4 GHz P4
1 Gig RAM
1x 40 gig root HDD
USB 2.0 250 GB hard drive
Offsite server 2
One gig of disk space on a friend's machine.
Off-brand 256 MB USB keychain.
Software
Amanda
RedHat 9 (Offsite 1)
Fedora Core 1 (Fileserver)
Fedora Core 2 (recent upgrade for Backup Server)
ksh
rsync
dumpe2fs
loop-aes
gpg
cdr
I've never actually written that out before... damn. I'm about to hit the limits of DVD, though, and will probably soon add a tape drive. DDS4, I'm thinking.
-- Minds are like parachutes... they work best when open.
On a similar subject, we've also had real bad luck with linux data integrity in general when handling large files (>500MB). I've seen this problem on both low-end IDE drives as well as high end (5 year warranty) hot-swap SCSI drives. I've mailed the IDE driver author on this (when I thought it was an IDE-only problem), but received no comment. I suppose I should post it to the kernel list.
In any case, I submitted a test program, "writetest", to the Linux Test Project for anyone interested. Just give it a really large block size (a 1GB file is generally large enough) and repeat about 10 times. I wouldn't mind hearing from people on what they see on various mb/chipset combos. Some controllers/motherboards have problems, others don't.
As to cost, you would do well to look at the RaidPort 1820A card from HighPoint, as it is available on the street for about $200.00. In setting up a RAID array, also consider that with a hot-spare, a failed drive will trigger automatic repairs. Without a hot spare, you must initiate a rebuild through the BIOS.
Another choice is software RAID, as for example, with Windows Server 2003, but the hardware path is cheaper now.
For hot-swap drive cages, look at www.cremax.com, as one possibility.
--- Bill
If you have a low budget and still want great fault tolerance and decent performance, I recommend RAID1 with two huge cheap ATA drives. There'll be no checksumming, and even on a machine with only two ATA channels, you won't have any serious bottlenecks. And you won't spend much money.
Really, the only downside to this, is that your capacity will be equal to the size of one drive (50% efficiency). I can't tell if that'll be a problem for you or not. Personal servers probably need more capacity than business, since (totally guessing here) you'll be storing multimedia. Maybe you're going to use it as the backing store for your MythTV, or keep your DVD/porn collection on it, I just don't know. But even then, 200 Gigabytes ain't bad.
Get into striping if you are willing to spend more money on more drives or need more capacity, and/or want faster reads. But if you get into more drives, think carefully about channels, especially if you're doing ATA.
One other thought: when doing RAID0 or RAID1, I recommend software RAID rather than using any fancy hardware or dedicated coprocessors. It's simple enough that it won't burden your CPU, and then if you handle your mirroring at the partition level rather than the disk level, you can do performance optimizations. For example, if you just care about not losing your porn but don't really mind if the machine temporarily crashes in the event of a drive failure, then you don't need to RAID1 your swap or /tmp, instead you might want to stripe those things for speed, and just RAID1 the "important" data. I don't know whether today's hardware raid solutions give you that much flexibility or not (maybe they do, like I said, I don't know). Just something to think about.
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
Since 1981 (that's 23 years, right? Old Timers Disease...) I have had more than 12 home PCs. I have had -Zero- hard drive failures.
My file server uses RAID-5 in a 5x200GB array, for 800GB total storage.
And I've lost it all, twice. Once through no fault of my own (dual drive failure), and once through my own incompetence combined with what I consider a fault in the array controller: I turned off the machine while it was rebuilding from a drive failure.
I agree with others in that RAID-1 is the safest, but I just can't justify the loss in storage space. But you have to be careful; even a RAID array won't protect you if something happens outside of hard drive failure (corruption, deleting), and it especially sucks when you lose your whole array -- it's basically impossible to recover.
--------------------- -me, Crusher of those who are Foolish (don't be foolish)
Firewire!
3Ware seem to have the best Linux drivers around. It's hard to go past that. When I was looking into other chipsets, I remember seeing comments about how their drivers were 'a complete mess' and 'no-one is interested in porting them to the 2.6 kernel'. Scary stuff...
As for the type of RAID, RAID 1 ( mirroring ) is what you want for simplicity.
Hate to draw the fire, but
I mean really what kind of story is this. The Guy is basically asking which RAID level he should use. If He knows how to implement all the types he has answered his own question.
I think the real question is do you want Faster access or just redundancy.
Oops - one correction to the above:
Any HW RAID controller with battery backed memory will lose big-time to SW RAID
That of course should read "without battery backed memory"
Next up is drives. Not all drives are alike, as I'm sure you already know. Do you want a SCSI or an IDE array? I won't go into this lengthy topic further; I'll assume that you will build an IDE array. Some drives do not work well in RAID setups. The controller companies are more likely to tell you this than the drive manufacturers. I own 6 Western Digital WD12000JB drives (7200 RPM, 8MB cache, 120GB capacity). By all accounts one would expect those drives to work quite well in a RAID setup. They have excellent read/write times individually and a massive amount of cache. Well, one would think that, and one would be wrong. 3ware, Highpoint, and Asus tech support (on an OEM Promise chipset in the A7V333) all recommend against using Western Digital drives. 3ware did, however, say that WD will give you firmware that works significantly better in RAID setups if you ask for it. Personally I'm a fan of Maxtor, both the drives and the company. I've had very few failures with Maxtor drives. Whenever I did, they were always extremely helpful with getting me a replacement fast. I've been very impressed by their service. I have 2 Maxtor 7Y250P0 and 2 6Y200P0 drives in the server sitting next to me. The second is a very high quality drive from Maxtor's DiamondMax Plus 9 line. It too has 8MB cache and 200GB to spare and runs at 7200 RPM. Nice drive. The first pair are from Maxtor's MaXLine Plus II line. They have a high MTTF, 8MB cache, 250GB space, and run at 7200 RPM. They are also a little bit faster than the 6Y200P0. They are excellent drives. My next drives will also be Maxtors, but this time I'll be buying the SATA siblings of the MaXLine Plus II product line.
That brings me to my next point: PATA or SATA. Does your case have an abundance of room? I mean a massive amount of room to route long 80-conductor ribbon cables? Do you have at least 1 if not 2 PCI slots to waste below your RAID controller with the room needed to route the ribbon cables and make connections? If not, then you need to go with Serial ATA drives. Don't even think twice about it. Go with SATA. The drives cost almost the same nowadays and you'll find what little price difference there is ($5?) is worth it in the end. SATA drives are so much easier to wire. I have a case full of round cables. The case I have is an extremely large Codegen case and even I am having trouble with the cable mess. SATA is a wonderful thing. Along the same lines are hot-swap cages. There are a dozen brands to choose from. You should probably use them, even if you don't need hot-swap capabilities. I need them to create 3.5" drive slots from 5.25" bays. If you do want to do hot-swapping, make sure your drive cage and controller support it.
Finally we get to RAID levels. You don't want to increase your risk of losing data so level 0 is out. 1 is extremely redundant and with the right controller can actually speed up reads. It's also costly at twice the cost per GB. Unless the data you're storing is absolutely critical you won't want to use 1 (in most cases). Forget about level 2. For starters th
I picked up a LaCie disk pack not too long ago, and it is perfect for this kind of thing. External, Firewire 400/800 attached diskpack, each with 1TB. Two of those in a mirror would give you a decent amount of storage, and also provide for easy portability if you want to move data from one location to another. Just hook one of them back up later, and the mirror will rebuild itself.
is more common than HW failures. Therefore, instead of RAID1, I use 2 drives with rsync every night. This gives redundancy with a time lag. If you accidentally screw things up, then you can go and dig on the second drive.
I have just finished doing this exact thing.
I basically built a box to do nothing other than fileserv. I put together a nice simple old PC (550mhz with 256 meg of ram) and mounted it in an old rack mount case I had lying round.
It's running debian with 2.4.26.
I'm running software raid and installed 2 x 2 interface IDE cards.
I threw in 6 Seagate 120 gig drives (the ones with the 8 meg cache) and ran RAID 5 across 5 of them, with the sixth as a hot spare to rebuild the array should a drive fail. Each drive has its own IDE channel to prevent a channel failure from screwing my RAID.
I'm using ext3 as the filesystem and wrote my own little raid mon script that SMS's me should a drive fail and alarms locally.
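For anyone wanting to write a similar monitor, here is a minimal Python sketch of the detection half. It is not the poster's script; it assumes Linux md software RAID, where /proc/mdstat shows a status string such as [UUUU_] with an underscore for each missing member. The function name is made up, and how you alert (SMS, mail, beep) is left to you.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status string shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)   # start of a new array's entry
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


def check(path="/proc/mdstat"):
    """Read the live status file and return any degraded arrays."""
    with open(path) as f:
        return degraded_arrays(f.read())
```

Run check() from cron every few minutes and send your alert whenever the returned list is not empty.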
This setup has been rock steady and gives me 460 (ish) gig of usable space after formatting.
For added peace of mind the machine is plugged into a UPS that is connected to the machine via serial. If the UPS kicks in, it shuts the machine down properly after sending an alarm SMS (the DSL and switch are also on the UPS). (Yes, I'm a paranoid freak.)
This makes a perfectly good media and file server and I've had no problem with it in the few months I've had it.
I also recommend setting the spin-down time on the drives manually with hdparm. It was getting awfully warm in the box till I turned that on for the Seagates. Modern drives run rather hot.
I have the whole thing mounted via SMB on my other boxes around the house and it's fast (gig ethernet), reliable and easy.
Though do remember that no amount of RAIDing will save you if you lose 2 drives through some horrible freak of badness, and no RAID level is going to protect you from a house fire. Hence mine also rsyncs all my absolutely vital files (scanned family photos and docs) offsite to a file storage site every night at 2am so as not to chew my bandwidth during usable times. Don't forget the only truly secure data is that which is backed up... and offsite... twice.
this might not be the best answer, but i have to mention it here since it's kind of unique.
want a two drive mirrored raid that requires no software drivers (works with any os) and fits in a 3.5" bay?
check out the microraid
kinda pricey, but would be neat for those shoebox size cpu cases. and with 100 gb laptop drives coming out, it would give a useable amount of space.
I deleted /usr/man/man1/* by mistake one day, no big deal, I got out the backup tape.
... they didn't make them any more, and no one else did either. And a usenet call for help found no responses, no one else used it either.
...
It jammed.
I finally got the tape out but destroyed the drive.
It was a real nice TEAC drive, cassette form factor but with very solid, robust tapes, and it had been reliable for 5 years.
Well, only man1, who really cares
Then the disk drive wouldn't spin back up.
I was about ready to drop the whole thing into the lake but I found a loose fan connection and the drive spun back up. I ended up still without man1, but everything else was ok.
My lesson was that proprietary sucks, and backup is good as long as it's to nice generic hardware that you can get replacements anywhere in a hurry.
Infuriate left and right
Find a friend in the same situation and set up rsync to do an offsite backup every night.
When it came to my final year university project I had everything (including my TeX writeup) in my own CVS server. Every night I ran an rsync to back that up to university, and the following morning they'd back that up to tape.
Given that I had the files all checked out on 3 or 4 systems, I probably had about 8 copies of my work floating around.
You don't need to offsite your mp3 collection, but there's bound to be some data that you wouldn't want to lose.
I had a RAID 5 with 4 60GB Maxtor drives. I've had very poor luck with Maxtor drives (about 50% failure rate). Once when one went bad, I "corrected" the problem with Maxtor's low-level format utility. While it was rebuilding the array, *another* one got an unrecoverable error and got kicked out of the array. Two drives gone and I was screwed. However, I was able to get the "bad" drive going well enough to dd its contents off to another drive. Then I did the low-level format on it. Then I did a "mkraid" using the exact same blocksize, etc. With a little more trickery I was actually able to get the array back up and running with no lost data! I switched to 4 120 GB Western Digital drives and have had no problems with them.
BTW, I put the old Maxtor drives in a second fileserver. I use that raid for backups of stuff. I discovered that the drives were getting pretty hot, so I installed a window air-conditioning unit in the computer room (it got hot in there with 4 or 5 computers running all the time). That seemed to stop my drive failure problem. I still don't trust the Maxtors all that much, though. The WDs have run fine even when the room got really hot.
Oh, the software RAID-5 with ReiserFS seems to work well for me. I keep several VMWare virtual machine disk images on the RAID and can run them all at once with surprisingly good performance. The machine is an AMD Athlon 1.4 GHz with 512 MB of RAM. I'm about to upgrade it to 1GB of RAM while I can still buy PC-133 SDRAM...
Assuming you have systems you want to backup, why not just put together a dedicated backup system?
This is the solution I've gone for, and it works well for my modest network at home. I have one system dedicated to being the "dump box" that rsync's to my other systems on a regular basis (cygwin on Windows systems helps me out here).
This way, I'd need hardware in both systems to fail simultaneously to lose data.
I suppose one could call it "Network RAID1." 8)
But, it is cheap and simple.
Diplomacy is the art of saying, "Nice doggie!" until you can find a rock.
This is going to be software RAID, using Linux.
There is some stuff that I wouldn't need redundancy for, for example /tmp, so I figured I'd stick that in a partition on the part of the drive which isn't mirrored, if that is possible.
Is it?
A part of the point with RAID is that the system won't go down if one drive fails, so a part of the question is what would happen if the larger drive fails, can I have some kind of fallback for the stuff that writes to /tmp?
Any advice on this would be appreciated!
Employee of Inrupt, Project Release Manager and Community Manager for Solid
With mirroring, reads come from either disk and writes go to both disks. Windows Server includes mirroring of disk partitions and it works very well. Win XP allows striping of disks but not mirroring, which is very frustrating. I have to buy motherboards with built-in RAID 1 to get redundancy. The extra cost is very small, nowhere near the price of a server licence. I don't understand why Microsoft doesn't allow mirroring on XP; I would understand if they did not allow striping.
I doubt this will get read, but
Promise has a new 4-port SATA RAID card that offers RAID 0, 1, and 5, plus with SATA the drives are hot-swappable. The price for the card was under $200. And you get the added bonus of better airflow for the case (less heat :). Throw in a couple of HDD enclosures for 12 bucks and you have RAID 5 of up to about 750GB with the current high-end SATA drives at 7200 RPM.
Rock on
What you seem to fail to grasp is that your 5 year SCSI guarantee does not guarantee you that the disk will not fail within 5 years.. It merely means that the disk is unlikely to fail in that time and they will give you a free replacement if it does.
Therefore, if your data is important you won't just trust that an unlikely event won't happen - you'll assume that it will happen and make sure that it won't affect the integrity of your data.
Therefore you'll be using RAID and preferably regular backups whatever you do. This is what ensures your data integrity, not the reliability or otherwise of your drive.
After that, it's a case of weighing performance, the cost (in money, manpower and downtime) of replacing a broken drive, and the cost of setup against each other, and this is where it starts to make sense to use IDE drives for RAID:
For instance, say you've got a 5-drive IDE RAID array. Over the space of, say, five years you end up having to replace three of the drives. That's eight IDE drives you've had to buy.
You also do the same thing with SCSI drives, and luckily none of them break - that's 5 SCSI drives all in all.
Now, say the IDE drives cost $100 each compared to $500 for the SCSI drives. You've spent $800 in the IDE case compared with $2500 in the SCSI case. There was no difference in the safety of your data but the SCSI one cost three times as much.
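Spelled out as arithmetic in Python, using only the example figures above (the function name is just for illustration):

```python
def array_cost(initial_drives, replacements, price_per_drive):
    """Total spent on drives: the original set plus every replacement."""
    return (initial_drives + replacements) * price_per_drive


ide_total = array_cost(5, 3, 100)    # 8 drives at $100
scsi_total = array_cost(5, 0, 500)   # 5 drives at $500, none replaced

print(ide_total)                 # 800
print(scsi_total)                # 2500
print(scsi_total / ide_total)    # 3.125
```

Change the replacement count or the prices to see how many IDE failures it would take before SCSI comes out ahead.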
Therefore to choose SCSI, you'd *really* want to get that extra little bit of speed, which to be honest is more likely to be limited by the network to your server anyway...
So, to recap - assuming your data is valuable to you, the choice between SCSI and IDE has nothing to do with the disk reliability because you'll be relying on some other systems (RAID and backups) for your reliability anyway.
Additionally, you will need sleds for most of your drives that are compatible with your RAID card. This isn't a difficult requirement, but be sure to consult the RAID card vendor for solutions that they support or know work. You can mount your spare drives internally w/o sleds; it's the "broken" drive you're interested in swapping out, not your spare(s).
My recommendations: 3ware Escalade 9000 series SATA cards. These are hardware RAID PCI cards that work in both Linux and Windows environments. If you go with the four disk controller, do RAID 10. If you go with the 8 disk controller, you can manage it into multiple arrays and have your choice of RAID's based on your application needs.
Now, if you're pinched for cash and running Linux, use a simple software RAID level 1 with three disks (one spare). Set up your disks pin-out to "Cable Select" so that when you pull out your "broken" disk, you'll still be able to boot (some BIOS's are really picky about that). The spare isn't necessary, and you won't have hot swap, but it'll be cheap and relatively reliable (with respect to hardware RAID 10).
assert(expired(knowledge));
I would suggest you don't buy a RAID system. Here's what I do: I have 3 hard drives, one small one with a tiny Linux installation on it and 2 hard drives of the same size for data. Every night Drive 1 is rsynced to Drive 2 and unmounted. Now Drive 2 is mounted in place of Drive 1. The next night Drive 2 is rsynced to Drive 1, and so on. The great advantage: if you accidentally delete a file, you have until midnight to restore it without any hassle.
Spelling mistakes: My is english spoken not tongue of mother.
How do I know? 'Cause I submitted this EXACT SAME story a month ago and was rejected.
Sigh.
The cheapest OS-independent internal RAID 1 (mirror) is the Duplidisk3 by ARCOIDE.com
You also get a ton of implementations: stand-alone, PCI card (for power only), 3 1/2" bay, and 5 1/4" bay. The ones that install in bays are so the user can see the status lights.
If you want an external RAID 5 the cheapest I have found is this - http://www.coolgear.com/productdetails1.cfm?sku=RAID-5-3BAY&cats=&catid=314,312 It is a 3 bay RAID 5 for $800.
If you want 5 disk RAID 5 those are @ $1200. http://www.cooldrives.com/fii13toatade.html
If you want external RAID 0 or 1 relatively cheap then go with one of these - http://www.cooldrives.com/dubayusb20an1.html
You can find a ton of these devices on the web since they all use the same drive controllers and bays. The nice thing about these is that sometimes you can talk the store into selling you the RAID system without the external case. These things simply require you to plug in an IDE cable and power, and can be installed in any PC case that has 2 5 1/4" bays open. If you buy just the 2 bay controller they are @ $230 or so. I have one and I am really happy with it.
Everything I listed above uses IDE drives and is OS independent.
It comes with the drive enclosures, trays and the RAID controller. They're hot-swappable and the rebuild time is relatively fast (4 hours for a 250GB mirror set). $170 from MWAVE.com
I even went so far as to buy a third tray for offsite storage. I replace the offsite tray with one of the production trays once a week. Promise also has a monitoring utility to put on your admin console so you can get status alerts.
Best of luck!
The greatest hindrance to success is a well-rationalized excuse
add 2 - 36GB SATA WD Raptor / 10,000 RPM SATA / 5.2ms / 8MB Buffer / 5 Yr Warranty
132 year life span
I don't see a problem in my future
Make sure your drives are kept cool.
Back up data regularly.
After the first 50 or so posts, it seems no one answered this question.. most just listed the raid definitions...
"I'd just like to remove the bad hard drive and install a new one and be done with it."
Wouldn't one need an expensive hardware-based RAID controller to seamlessly hot-swap and rebuild a volume?
My personal file server has run for many years with
only a single failure. I replaced the hard drive,
restored from the backup and restarted. It
seems like that's less work than a raid array.
Cheaper too.
-- Programming with boost is like building a house with lego. It's a cool but I wouldn't want to live in it
This story is off-topic as it can be. Read a RAID-FAQ. Every idiot knows that there are hot-swappable drives. And that is common in the RAID area.
(Psssssst ... shut up, dude!)
Breakfast served all day!
Striped, mirrored. If you can play in Solaris land, DiskSuite makes things way easy to manage, and way easy to recover from. Having hot spares is VERY handy. So is the ability to remove a disk from the array on the fly, fsck/format/whatever it, and add it back on the fly, and have it auto-synced to where it should be.
Those sorts of things I think are a wee bit more handy than just knowing 'what raid level to use'.
Assuming OS=UNIX|Linux. The problem with raid is that it doesn't protect you against a "rm -rf". I would rather suggest to buy two disks, mount them as separate filesystems, and use a daily rsync to mirror content from disk 1 to disk 2. Avoid "--delete" and you're safe.
Of course, having the second disk on another PC is better, the best is having this 2nd PC in another location (eg against fire). But this is my running paranoia.
Side benefit : you avoid RAID cost and complexity (try booting from a soft-raid1 setup when disk 1 has gone south).
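A minimal Python sketch of that nightly job, assuming rsync is installed and the two disks are mounted at the hypothetical paths /mnt/disk1 and /mnt/disk2. The helper names are made up; -a and --delete are standard rsync options.

```python
import subprocess


def rsync_command(src, dst, delete=False):
    """Build an rsync command line that mirrors src into dst.

    With delete=False, files removed from src stay on dst, which is
    what protects you from an accidental "rm -rf".
    """
    cmd = ["rsync", "-a"]
    if delete:
        cmd.append("--delete")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd += [src.rstrip("/") + "/", dst.rstrip("/") + "/"]
    return cmd


def nightly_mirror(src="/mnt/disk1", dst="/mnt/disk2"):
    """Run the mirror; raises CalledProcessError if rsync fails."""
    subprocess.run(rsync_command(src, dst), check=True)


print(rsync_command("/mnt/disk1", "/mnt/disk2"))
```

Note that without --delete the second disk slowly fills up with files you removed on purpose, so you will want to clean it by hand now and then.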
Chuck
Because RAID 5 stores parity info in stripes on successive disks, (n-1)/n (2/3 if you have a 3-drive RAID 5) of your read requests will now require the controller (or worse, the driver, if you are using software RAID) to do a parity calculation to get your data. This is slow.
Speaking from experience, on a low-usage array it doesn't matter at all, but on a busy array, it's a huge problem. RAID 10 is the only way to go for a busy/important array.
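For anyone wondering what that parity calculation is: RAID 5 parity is a byte-wise XOR across the blocks of a stripe, so when a member is missing, a block that lived on it has to be recomputed from all the surviving blocks. A toy Python sketch for a 3-drive stripe (the block contents are made up, and real arrays rotate which drive holds parity from stripe to stripe):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 3-drive array: two data blocks plus one parity block.
data = [b"AAAA", b"BBBB"]
parity = xor_blocks(data)

# Pretend the drive holding data[0] died: rebuild it from what is left.
rebuilt = xor_blocks([data[1], parity])
print(rebuilt == data[0])   # True
```

Every such read touches all the remaining drives, which is the extra work a busy degraded array has to absorb.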
Linux SW RAID is a breeze, just google for "linux sw raid howto"
Bryn
I build these little Shuttle boxes for software delivery at work very cheaply, with 400GB of usable redundant RAID 5 space, for a mere $687. You can add that to an existing PC: $300 for the controller and 3x $129 for the drives. If you shop around and don't use the first Froogle links like I did, you may be able to save some cash. We used to use a different and cheaper card (like 200 bucks plus a RDR stick of RAM), but it sucked. If you wanna spend more money, buy another drive and make it 600GB RAID 5.
I work for a small consulting business and since I'm the "linux guy" I was told to build a ^sigh^ cheapest, but reliable web/mail server.
I got two 60GB Seagate IDE drives and used software RAID to set it up. That was about two years ago. One of the hard drives failed last year, but I replaced it with an 80GB drive I got in the mall and just rebuilt the server in about 30 minutes (counting the googling and the server stop to replace the drive).
I definitely recommend it, as I use this setup at home too.
You said you already had hardware picked out, so I don't know if this will be any help, but here's my setup that's been running constantly now for around 3 years.
-3Ware Escalade 6410 4-port true RAID controller
-IBM GXP75 80GB drives (x3)
-Redhat 7.1 with modified/upgraded kernel
This system is running as a file server on a small network (4-10 computers depending on the day) and the only problem I've had was when one of the IBMs died (a month after I got it). Replacement of the drive was incredibly simple: pull the old one out, drop the new one in, turn the system on and presto: automatic rebuild! Total down time: 8 minutes to swap the drive and reboot. (Okay, I cheat, they are in removable carts.)
At the time, the whole setup only cost me about US$500, and I'm sure you can do exactly the same setup now for around $300
Offtopic: I'd stay away from the GXP75. Not a good drive, unfortunately I didn't know that before I bought them, I just got them for their quietness.
The longest running drive is a Samsung something-or-other, at about 7 years continuous (or very near) up time.
I use RAID 1 setups for my gaming and development boxes. 2x 36-gig UltraSCSI drives seem to work well. I am using Adaptec 2100S's with the most cache I could fit. I've just added additional 250-gig IDE drives for some non-critical items. I've had this setup for about 2 years and just had one of the drives fail due to faulty 5 1/4" tray fans. I never saw the fans fail and I guess the drive was running hot for a while until it finally went kaput!! Just pulled the old one out, put the new one in and after about 15 minutes of rebuilding I was back in business. Needless to say I now have enough air-flow through the box (additional fans) that I have to worry about things blowing around.
*--- Sometimes a majority only means that all the fools are on the same side. ---*
RAID 5 is only really appropriate if you are building a large array. The money you will spend on the controller will make the cost/megabyte higher than RAID 1 unless you are looking for a very big array (more than you can get with a mirrored pair.) I have a RAID 5 array I built about 2 years ago with 4 160GB drives on a 3ware 6000 series RAID controller. It has worked great and I'm planning on using RAID 5 again for my next array. I've only had one drive failure so far but it recovered from it beautifully.
If you are willing to fork out about $1100 for storage you can create a really nice array. I'd recommend a 3Ware 4 port 9000 series controller like the 9500S4LP (around $330) or a RaidCore card reviewed recently over at tomshardware. Add in 4 $180 250GB SATA drives and you have a nice 750 GB array for around $1100. The Promise FastTrack SX6000 is quite economical and supports more drives if you don't mind its bad performance and crappy Linux support. 8 port cards are also pretty economical but it's hard to put that many drives in most cases. You have to design a system carefully in order to create arrays much bigger than 4 drives.
Once you have your array, it's a good idea to use Linux or something with a reliable journaling filesystem on top of it. Once you have a RAID array your filesystem becomes a much more important point of failure. Using a reliable one will do a lot towards reducing your likelihood of data loss.
I also use a separate drive with a separate filesystem for backup. I have a script that manages it for me (ignoring certain directories) which runs every night. A RAID array is pretty reliable and a big step up from single drives so it's a good half way point but I wasn't comfortable with it so I went further. How far you go is up to you.
set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
I bought a server off of Ebay then bought the drives somewhere else. I got a Dell Poweredge 4300 (supports Redhat) and I loaded it with 6 50GB SCSI hard drives for $50 a piece and did RAID 5.
I don't know if this is the best option but it gave me 250GB of RAID 5 space for under $1000 including shipping. The server came with 3 9GB drives; I bought a holder for them up top and mirrored them, with the OS on them, to make sure the computer doesn't go down that way. With this solution you also get the advantage of good cooling and multiple power supplies. Of course you get the hotswap bays up front too.
This is just a thought; I'm still not sure if this is better than going the SATA route, since bigger SCSI drives are so expensive!
This is a FAQ. It's the very first thing covered in every single RAID FAQ you find in Google. Read the FAQ.
And if you're losing that many drives, you should probably read the FAQs about overheating too. I've been responsible for hundreds of drives in the past decade and I've lost only two. Sounds like you have a problem.
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
I've been using software RAID 5 for years, and while I've suffered my share of IDE drive failures, and even a controller failure, I've never lost important data because I also back up my server. Critical data gets backed up twice, once on the server, and also on the workstation. Software RAID is more configurable than hardware, and cheaper, too. I use standard Maxtor/Promise ATA-133 controllers, instead of Megaraid or Escalade or similar. I run eight identical IDE retail-grade drives on two controllers, and a pair of large monolithic backup drives on another, plus a pair of mirrored U160 SCSI drives for the OS. You CAN put two drives on each IDE channel, the secret is round-robin disk access timings. My RAID array consistently reads 80MB/s in hdparm -t, while my SCSI mirror only reads in the 50s. The relationship between block size and EXT3 stride length is also very important. Check out the software raid how-to at http://www.tldp.org.
RAID only protects against _hard drive_ failure. At the last place I worked, the RAID controller went nuts three times, losing all data we didn't have backed up. The drives were physically fine, but all the data was lost because the controller went bad.
An old but still good "blackpaper" article on Ars that describes RAID options and requirements. Probably nothing that hasn't been discussed here, but at least it's all in one place, and not spread over N hundred posts.
http://arstechnica.com/paedia/r/raid-1.html
http://www.baarf.com/ - battle against any raid f
For a home system keep it simple. If it's simple but works, then you can fully understand it, which means that if it does go wrong you are unlikely to panic and make matters worse.
Keep your data on mirrored (RAID1) disks and use a journaled filesystem. Use software RAID, and once you have set it up make sure that you can mount it using a different system than the one that will run it normally, e.g. a Knoppix CD, or do as I do and have the OS separate on a caddied drive, so you can try a spare disk with a distribution on it.
Now that will only protect you against the possibility of a hardware problem with the system.
But this doesn't help you if your house burns down (incidentally taking your backup CDs/DVDs with it). I personally get round this by transferring the things I do not want to lose, e.g. personal documents, family photos etc., via sftp onto another server, same hardware setup, at my parents' several hundred miles away. This amount of data, if you are being honest, should be small, as it should only be really irreplaceable files, i.e. NOT porn, MP3s or movies which you can rerip. This is done automatically but uses a 7-day cycle so that I have that long to notice a cockup and just be able to copy back.
This, IMHO, means that my data is reasonably secure from being unretrievably destroyed.
I have had disk failures with the data disks.
Whatever you decide to do, if you make backups please make sure you can restore from them before you are in a situation when you HAVE to be able to restore from them!
Regards
Tim
Since having my system disk fail I always mirror my system drive using RAID 1. If one goes you just have to pull it out and you're back in business. Also, some OS don't like to be installed on a RAID 5 array and will be trouble on disk failure, and impossible if you're using the OS for the RAID management. I keep my data in RAID 5 because it is more space efficient. I use five drive striping and a hot swap spare. I backup critical data from both drives to tape every night and store the tape offsite because I had to manually rebuild one of those file systems once. And most importantly, regardless of what RAID system you use, upgrade to a new controller whenever your old one is no longer supported by the manufacturer, else keep a spare around if you plan on keeping the thing going longer than the company that made it. Chances are you're not going to be able to make a simple swap of controllers on drives that are already loaded. I can tell you from experience that it's less costly to pay for the regular hardware upgrades than to have a RAID controller repaired by a company that no longer exists.
- 3ware 9500S-8 8 port SATA RAID controller $485
- 5 250GB Maxtor Maxline Plus-II drives $195 each
- Supermicro 742T 7-bay SATA hotswap server case $330
The drives are in a 4 drive array with one drive as a hot spare. About $1800 total, which includes the server case--pretty steep for ~700GB usable space, but I now have:
- expandability to at least 7 hot swap drives
- a hot spare
- a dual xeon capable case with a 550W supply
- plenty of airflow
- online capacity expansion (3ware says available this summer)
Yes, it is still a personal server, but we keep a lot of video on it as part of my DVArchive setup to support my ReplayTVs. I installed Fedora Core 2 on it right after Core 2 was released. Now, when I need to store a few hundred more hours of video, I can just throw 2 more Maxline Plus-II drives at it to get up to ~1.2TB--leaving final cost at under $2/GB, including the computer case, power supply and hotswap bays.
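For anyone who wants to check the cost-per-GB figures, here is a rough Python sketch using the prices quoted above. It assumes the two extra drives both join the RAID 5 array (the hot spare stays a spare) and cost the same $195 each; the function name is made up for the example.

```python
def raid5_usable_gb(drives_in_array, drive_gb):
    """RAID 5 gives up one drive's worth of space to parity (hot spares not counted)."""
    if drives_in_array < 3:
        raise ValueError("RAID 5 normally needs at least 3 drives")
    return (drives_in_array - 1) * drive_gb

CONTROLLER, DRIVE, CASE = 485, 195, 330   # USD, prices quoted in the post
DRIVE_GB = 250

# As built: 5 drives bought, 4 in the array, 1 hot spare.
cost_now = CONTROLLER + 5 * DRIVE + CASE
usable_now = raid5_usable_gb(4, DRIVE_GB)

# Assumed expansion: 2 more drives added to the array (6 in the array, 1 spare).
cost_later = cost_now + 2 * DRIVE
usable_later = raid5_usable_gb(6, DRIVE_GB)

print(cost_now, usable_now, round(cost_now / usable_now, 2))
print(cost_later, usable_later, round(cost_later / usable_later, 2))
```

Those are decimal gigabytes as printed on the drive label, which is why 750GB shows up as roughly 700GB once the OS reports it in binary units.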
provantage.com has the 4 port 3ware 9000 card for about $320, I think. -se
If you've got a lot of data and losing it would be a hassle, don't cheap out on some iffy homebrew solution-- get an external FireWire RAID-5 unit. They cost a little more, but they are worth it. Micronet sells a 600GB (5 x 120GB, 480GB usable) FireWire 800 RAID-5 for $2200, IIRC. If I had need of that much storage space on my LAN at home, I'd have one myself.
They only sell one size smaller than that, a 480GB (4 x 120) FireWire 400 unit, but it actually costs more for some reason.
I sell them to clients all the time, and they work great. I've only seen a single drive failure, and the users had no idea it had even happened. I slid in the replacement drive and the unit rebuilt the blank drive in about two hours, with no perceptible performance hit to the end users.
For a home file server, the most reliable solution is a mirror server. You can get an old junk computer and put massive disk drives in it. This will allow you to recover from the accidental file deletion, bad hard drive, bad memory stick, bad power supply, bad drive controller, bad network card, etc. And if you can actually house the computer at a different location, you get disaster recovery too!
This should at least leave enough time for a hot spare to rebuild before another drive goes, which can be a problem for RAID 5 (as noted here).
Is this being used anywhere?
I have a removable drive, I ghost every month and do incremental backups every week to a DVD-RW (separate DVD for each incremental)... Worst case I lose a week and a couple of save games.
The nice thing about ghosting is that I get to use a cheap 120GB 5400 rpm drive and save many compressed images from the drive I'm backing up... I'm backing up a 36GB(?) 10,000 rpm WD Raptor. It only takes 15 minutes to restart, boot from my Ghost CD, save the image, and reboot into Windows XP.
I didn't like the RAID solutions I was looking at, this not only works just as well for my needs, but I get to keep the removable HDD in a safe(fireproof of course) at the other end of the house just in case...
Just a thought...
Everyone else has pretty much covered it already, but here's a brief summary anyway.
RAID 1 with 2 drives or RAID 5 with 3 drives is the best way to go for your typical desktop depending on your storage needs.
What's important (as others have also mentioned) is that you also backup the data regularly. There's a great Open Source project called BackupPC which will let you backup just about any type of machine with network access (Unix and Windows machines) automatically. You can configure it to make as many incrementals as you want/need and it also uses compression and hard-links to save storage space. All you need is a Unix server to run BackupPC on with a decent amount of storage space. Restoring files with BackupPC is also a piece of cake, there's an easy to use web interface, or you can use the command line as well.
Best MB/$
As a "personal fileserver", downtime to replace a disk is not critical.
Most IOs will be reads. Even with RAID5's slow writes it won't matter a great deal, since the machine is probably on the other end of a 100Mb (or maybe home-user-level Gb) network. The bottleneck is almost certainly going to be the network, not the disk subsystem.
For the same reason, the CPU overhead of software RAID is largely irrelevant. Any remotely modern CPU can easily handle the computational overhead of RAID5 with heaps of spare grunt to actually do fileserving.
Disaster recovery costs are lower. You aren't tied to a specific hardware RAID controller (which may be impossible to buy at all in a few years). The disks can be trivially moved to a replacement machine and accessed from there.
Also remember that RAID is for protection from *hardware failure*. It is _not_ a backup solution. It won't protect you from accidental deletions, random filesystem corruption, or malicious activities like worms and crackers.
My personal recommendation, based on my setup, is a Linux machine using software RAID5, LVM, 4-channel disk controllers and drive cages like these. (There are a few different types around; you may want to go Serial ATA, for example.) Buy 3 disks at a time and RAID5 them, then glue the space together logically using LVM. You lose somewhat more space than just a straight RAID5 across as many disks as you can lay your hands on, but you gain a bit in reliability and ease of use (adding more disk space, removing old sets of drives as they become obsolete and migrating data is easier).
I started off with a set of 2x(3x20G) drives years ago and have successfully and easily moved through sets of 40G, 80G, 120G and soon 200G drives by following this formula. "Replacing" old disks is a matter of adding the new ones, setting up the RAID, adding the space to an LVM volume, migrating the data off the "old" RAID set, removing the "old" RAID set from LVM and then removing the "old" drives.
But I've been using software RAID 5 + linux for years and have never lost any data as a result of drives going tits up on any of my production servers. These are primarily busy web and mail servers. Back in the day, I also used the md driver for a largish SCSI array that was used at the time for a top 50 news server spool. Nary a problem (at least on the RAID front) and that was 4-5 years ago.
With RAID drives being so cheap and most current motherboards shipping with 4+ IDE ports, a usable RAID 5 setup is cheap. I added 1/2 terabyte of software RAID 5 storage to an existing Linux box by purchasing 3 Maxtor 250gig drives, using spare IDE ports on the motherboard. I think the disks were $139/each on sale at CompUSA.
Software RAID bothers you? Get a cheap 3Ware hardware IDE RAID card. They work out of the box for all but the most old/obscure kernels.
Cheers,
- Firewire 800 and/or USB 2.0 interfaces;
- RAID Levels 0, 1+0, 5, 5+ Hot Spare (varies by manufacturer/model);
- quick removable drive bays;
- drive capacities up to 320GB/drive; and
- auto-formatting after drive swaps.
I've seen prices ranging from $200 to $1300 depending on features, with some units being more portable than others. Simply Google "firewire RAID 800" as a search. I'm considering a RAID enclosure as a master archive for my mp3 and media collection (simple mirroring), with an additional portable single drive Firewire 800 hard drive enclosure as a backup for safety and portability. In my case, I have a file server currently but would rather replace it with a smaller and quieter RAID box.
Suppose one wanted a home machine with Windows XP Pro and some Linux/BSD variant. Are there any software/hardware solutions for that configuration? Does a RAID array have to be dedicated to a single OS?
I am supposed to have raid supported on my motherboard (Highpoint, I believe) but my version is not supported by any Linux version.
Basically, your options are RAID-1 and RAID-5... as hundreds of people here have already pointed out. RAID-1 is just straight mirroring (where all drives in the array contain the same information). Usually, this just involves two drives, but there's no reason why you couldn't have, say, three or four drives all mirrored... and you could lose all but one of them and still be up and running.
RAID-5 is a very cool beast. You basically have an array of drives with some portion of them set aside for redundancy. Most of the posts I've seen here only describe a scenario where you have three drives with one of those drives for redundancy. This only scratches the surface, however.
For example, you could have an array of, say, 5 10GB drives, with 2 drives' worth of redundancy (strictly speaking, two drives' worth of parity is what's called RAID-6; plain RAID-5 always carries exactly one). With this, your RAID implementation would make available to you, what seemed to be, a single 30GB drive (since 20GB of the total 50GB is used for redundancy). This way, you could have any two drives go bad and you're still okay.
Another example, I guess, is that you could have a two-drive RAID-5 with one drive's worth of redundancy. In this case, you'd have the functional equivalent of a RAID-1 mirroring setup. Not very sexy... but you could do it in some implementations, I'm sure.
I'm trying to use the phrase "X drives' worth of redundancy" instead of "X drives set aside for redundancy" because it's important to point out that, in RAID, all of the drives are considered equal. If you have 5 drives with 2-drive redundancy, it's not like you set 3 of them as the "main" drives and 2 as the "backup" ones. There's no preferential treatment like that. All the drives are equivalent and you could lose any of them and the others all move to cover for the one that was lost.
Now, personally, I like RAID-5 because it offers the ability to use more than 50% of the space you paid for. With RAID-1 mirroring, you always only get to use 50% of the space that really exists. This would be necessary if, when you suffered a storage failure, you always lost half of it. But that's not how it happens. Usually, you lose a single drive. So, it would be nice to maximize your space available, while having some insurance against a single drive failure.
This is where RAID-5 really shines, because each successive drive you add, you get all of that space for your usage. You could have, say, four drives, 1 drive of redundancy, and you get 3 drives' worth of space.
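The "X drives' worth of redundancy" arithmetic is easy to sketch in Python. This is only the space calculation, assuming identical drives; the function name and the example arrays are illustrative, and it says nothing about which RAID level a given controller will actually let you configure.

```python
def usable_capacity_gb(drives, redundancy, drive_gb):
    """Usable space when `redundancy` drives' worth of capacity is given up."""
    if not 0 <= redundancy < drives:
        raise ValueError("redundancy must be at least 0 and less than the drive count")
    return (drives - redundancy) * drive_gb

examples = [
    ("2-drive mirror", 2, 1, 10),
    ("4 drives, 1 drive of redundancy", 4, 1, 10),
    ("5 drives, 2 drives of redundancy", 5, 2, 10),
]
for label, drives, redundancy, drive_gb in examples:
    usable = usable_capacity_gb(drives, redundancy, drive_gb)
    raw = drives * drive_gb
    print(f"{label}: {usable} GB usable of {raw} GB ({usable / raw:.0%})")
```

With one drive of redundancy the usable share climbs with every drive you add (50%, 67%, 75%, 80%, ...), which is the whole appeal over a two-drive mirror.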
Now, there are a few pros and cons for both RAID-1 and RAID-5 regarding recovering/moving data and changing the size of your array, and I'll list them here.
Today decided to check the hard disks, just in case.. one is dying :(
WTF?
Since you say you want a fileserver, I am assuming (as in my case) that speed is not an issue.
I set up linux on a machine with 4 drives configured as raid 5 using software raid in linux. No expensive hardware. Drives were 5400rpm models, so a bit slow, but I still got 15MB/sec throughput from the array. Since this is a fileserver, the 100Mbps ethernet is the limiting factor, and not the speed of the array. I use my array for weekly backups and storage of files that don't fit anywhere else.
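A quick back-of-the-envelope check of that claim, as a Python sketch. It ignores protocol overhead, so real-world network throughput will be somewhat lower than the raw figure; the 15MB/sec array number is the one quoted above and the helper name is made up.

```python
def link_mb_per_s(megabits_per_s):
    """Convert a raw link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_s / 8

array_mb_s = 15  # throughput quoted for the 4-drive software RAID 5

for name, megabits in [("100 Mbps Ethernet", 100), ("Gigabit Ethernet", 1000)]:
    link = link_mb_per_s(megabits)
    bottleneck = "network" if link < array_mb_s else "array"
    print(f"{name}: link {link} MB/s vs array {array_mb_s} MB/s -> bottleneck: {bottleneck}")
```

So on 100Mbps the wire tops out at 12.5 MB/s before the array does, but on a gigabit network the same slow array would become the limit.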
The most important question is what is your data worth. Once you've determined that you can budget accordingly.
For high end consider a good hardware raid controller, raid 5 + hotspare, and as many spindles as you can afford. The more spindles on raid 5, the better your performance will be assuming you don't max out the raid adapter.
Regardless of whether you use raid or not, backups and rotating backup media are a must. Raid is not a substitute for backups.
Regardless of the $ you think this will cost, your personal time will cost more. There are no easy answers.
Two RAID 5 arrays striped. FAAAST!
Purchase a 3Ware IDE Raid Controller of your favorite flavor.
Run a RAID level that provides some protection. Mirror at least. Mirror stripe sets . . . or stripe a set of mirrors. Stripe with parity if you must. Read up on the pros and cons. Use your head.
No matter what, back up your junk!!!
We have several four-year-old 3Ware Escalade 6000s that have saved our butts half a dozen times or more.
3Ware will take care of you.
BTW - Don't use IBM hard drives. . . and Seagate is your friend.
IDE raid is definitely today's choice for home fileserver. if you dedicate a whole box just for that purpose (i.e. no need for those cpu cycles) you should do soft-raid5 (use that system cpu);
if you can.. make sure your NIC and disk controller are on different pci buses.
if you play with hotswap bays make sure they have some quality to them - otherwise you'll be reseating/replacing drives like crazy and that'll bring your raid5 to crawling.
another prob i've seen - most sata cables tend to become loose; i'm still looking for decent snug-fit cables out there..
lsi(megaraid), adaptec make decent sata-hw-raid cards (~300$); 3ware is another popular choice;
might wanna check out these "software raid cards": a slightly advanced ide controller, might do hotswap but still uses system cpu which could be ok for you since system cpu is idle anyway..
I use four computers 18+ hours a day without RAID (just normal backups) and yet in 5-odd years, I've never had a full-on disk failure of any type. All these computers are running 24/7. There must be a reason some folks have these repeated failures yet others don't. Any insights? I'd love to have a realistic and engineering valid rationale for when I really need to go to RAID.
My company has a small business server at a server farm that runs RAID on linux, and it's down far more often than all my home machines combined (Mac OSX, Suse 9 Linux, Windows XP and FreeBSD). The failure is almost always a disk issue.
The only things I can think of that might be different with me vs. the original poster: 1) I don't go out of my way to find the absolute cheapest parts - many /.ers seem to do otherwise, 2) my computer room/office is always kept below 80 (and more usually below 75) (air con) - most failure mechanisms are heat-activated, and 3) I don't overclock or have a hellacious video card so I'm not generally pushing any component limits in general or generating a lot of heat in the cases - hey I suck at games and I just want to use my computer to get things done. Also redundancy doesn't necessarily buy you anything if the mechanisms to create it add more error rate individually or in overhead.
Would love comments and insights...
I installed a Linux OS on a new computer that I put together, and I was excited that SGI's XFS was available on Linux. I had been using UFS with Softupdates on a FreeBSD machine, and having gone through various power outages and a minor tornado-like event with no problems, I wanted to have some sort of similar thing on the new Linux machine, and I wasn't necessarily convinced that ext2 was up to the task (compared to UFS with Softupdates).
But when I heard that xfs was available on Linux, I jumped at the chance, and am still using that machine today. I like XFS, but you have to be careful, because you will tend to lose data when the power cuts out. Whatever files you were last working on, whatever apps were open, that is what is most vulnerable. You might very well end up with an empty file, and that's not a whole lot of fun. I guess that I was just misinformed, or was reading stuff that had misinformed me about the data loss situation. Perhaps ext3 would have been better, but I really wanted to be "cool", as well, so I had to do xfs, of course.
So I later realized that the purpose of xfs (and journaling filesystems in general) was not so much to protect from filesystem corruption and data loss, but to be able to get your filesystem back up and running easier without having to do extensive fsck procedures when you fire your machine back up. So that's a lesson learned - I still like xfs, and I love the speed with which it performs, even if there are risks when things aren't backed up. For now, I will continue using it, but it's probably not EXACTLY what I was looking for as a replacement for ufs with softupdates.
But anyway, back to RAID. There are advantages to RAID - if you have a mirrored hard drive, you don't need to go hunt down your backup, you just switch over to the other drive. You don't constantly need to take backups (every 5 minutes, I mean) to have that mirrored drive be current. You can also get the speed.
What I am wondering is this: Isn't using RAID as a way to mitigate simple hard drive failure in a residential setting something that the concept of RAID perhaps wasn't exactly designed for? Journaling filesystems, for instance - the idea is to recover QUICKLY, not just recover. It seems to me that RAID is, similarly, designed so that you can recover QUICKLY - from a hard drive failure, not just recover, and that's just a hard drive failure - there could be other failures as well, right?
So while RAID is better than nothing to help you recover from a failure of a hard drive specifically, a better insurance policy would be to have a tape drive of some sort and to use that to do backups and incremental backups. It would seem to me that striping might be a hell of a lot more fun anyway.
So I think that until we really look at the purposes that RAID can effectively be used for in a bulletproof sense, we might not understand why RAID might be an imperfect solution to the problem of surviving a simple hard drive failure in a residential setting, where uptime is not as critical as it might be in a more non-residential type facility.
I think two computers is a good idea. Keep one unplugged (from everything) when you are not using it, and keep your personal files synced up as best you can (or backed up with CD-Rs, DVD-Rs). Then you can have fun with striped RAID setups if you want and not worry. In the long run, setting yourself up so that you can do the risky stuff is probably not only going to give you the confidence to have fun, but might also serve as a safety net that you didn't even know you had if that ever becomes necessary.
I've a bit of a radical opinion about raid. Raid sucks.
In twenty years I've had one hard drive crash. One. If for those twenty years I had linux half the time (240 on/off cycles or 2 reboots per month), and dos/windows/os2 the other half the time (3650 on/off cycles or 1 reboot per day) then I rebooted my system almost 4,000 times. Raid takes about 3 times longer to boot up over non raid devices so let's say raid alone cost me roughly 8,000 extra minutes of reboot time. That's five and a half days of down time.
Now let's figure out how much money twenty years of raid will have cost me. And then let's figure out how much extra time I'm down because of the added complexity of the raid technology. Er, well, let's not.
Raid is for overbudgeted IT managers who want to cover their asses when something does go wrong.
raid sucks.
I don't have any real world experience with raid 2 raid 3 or raid 4. BUT what is the real world performance and acceptance like. All i know is an easy way to remember for tests 234 bits bytes blocks :)
oh yea i vote raid 1
Look at pricewatch. There are some 1GB dimms that go for $90. Sometimes they're regular, other times they are using chips that require a chipset that can handle the arrangement. Whatever, get a board that can handle these UNBUFFERED dimms that has four dimm slots & can run with 4 gigs. There's some hunting to do here but they're there, $60-120. Get a UPS that'll last long enough to be able to dump 4 gigs to disk, which equates to just about any UPS so you could double the required UPS size if you like & it still not be expensive, $60-120. Do a raid setup like you were intending, either 2/3 drive setup is fine. For a big pipe, use two gigabit NICs with jumbo packets and bonding (both std features on even cheapy $15 ones). For configuration you can use the memory as disk cache and let the kernel manage the usage for you (it'll still do a lot of writes constantly) or set up a ram drive & cron job to periodically save to disk. I prefer the latter & rely on the ups but the former would be slightly safer while causing more drive use. Wash, rinse, repeat the hardware along with a striping distributed network file system to scale.
For more reliability you could also add redundant ups's, power supplies, and hot swap trays for the drives(linux lets you power down drives & decent trays have ground stay connected long enough after connector unplugged to not cause a short).
That way you could swap any of the trio (ups, ps, disk) and your data would survive and stay up for a single hardware node. & then there's configuring for redundant servers for handling whole system crashes or servicing.
Why is there an ongoing hoopla over ribbon cables being big, ugly, and hard to manage?
Properly installed, they're neater and less restrictive of airflow than either rounded or SATA cables. And, you can make them yourself with tools you probably already own.
The trick is simple: Folding. Fold them against flat things, out of the way. Fold them at neat, 90-degree angles to turn corners and avoid adding crosstalk. Do this right, and they're visually stunning, easy to work with, and nearly transparent to airflow.
It just takes a bit of forethought and planning.
I recently installed a large-ish rackmount PC, with a three-bus hardware IDE RAID 5 config, along with the usual floppy and CD-ROM cables.
Abundance of room? Ha. The box was tight, with 17 full-length PCI cards installed. And cooling was at a premium, as the majority of those cards required additional power from 4-pin molex connectors. So the ribbon cables had to be dealt with accordingly.
The RAID cables snuck behind their Adaptec controller with neat bends, making them invisible and out of airflow. From there, they tucked flat against the backplane's mounting plate, keeping them almost invisible. They continued this trend until they met the 3-drive hotswap 5.25" enclosure mounted at the front.
Same stuff with the floppy, and the CD-ROM. All routed flat, out of the way. I'm not sure it could have been done at all with rounded or prefab SATA cables, without looking like a bowl of spaghetti and/or fucking up my goal of laminar airflow.
Remember the bit about rolling your own? It's easy, -and- you don't end up with slack that needs bundled up somewhere...
Kid-proof tablet..
Ok, since this is your personal fileserver, I assume you are talking about storing music, movies, files, whatever. I also take the word "personal" to mean that if you are in the middle of watching 2001 and an error on your drive causes an interruption in the movie you are not going to be overly upset. You may be sad about having wasted certain 2001 enhancing substances, but anyway.
/filemanifest
I would say rsync and scrub the disks.
Every night, simply rsync one disk onto the other one. You have two copies, you have no hardware or software raid configurations to deal with. It's cake to recover and to resync once you have a chance to run down to the store to pick up a drive.
Weekly, or while you are at work, run something like:
find . -type f | xargs md5sum >
You'll have a history of your disk and can track bitrot or whatever. You'll also be touching every single bit on the disk. No better way to detect errors, might want to run badblocks.
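If you'd rather keep the history somewhere you can diff it, the same idea can be sketched in Python. This is just one way to do it, not a drop-in for the command above: the function names are made up, it assumes every file under the root is readable, and it keeps the manifest in memory instead of writing it to a file.

```python
import hashlib
import os

def build_manifest(root):
    """Return {relative_path: md5 hex digest} for every file under root."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.md5()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks so big files don't have to fit in RAM.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(path, root)] = digest.hexdigest()
    return manifest

def changed_files(old, new):
    """Paths present in both manifests whose checksum differs."""
    return sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
```

Compare this week's manifest against last week's: a changed checksum on a file you know you never touched is the bitrot you're hunting for, while files you edited will of course show up too.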
You are adding complexity and other people's software with other people's bugs trying to use weirdo IDE hardware RAID, etc. Linux sw raid1 is great because the disks aren't special if you split them, they are just plain ext2, or whatever you put on there.
Cost is increased cpu usage. Savings is reduced disk usage.
Would like to hear from more people _actually_using_ RAID, particularly SATA based. I use a 3ware 7506-4LP on FreeBSD and I have had no problems yet. Uses 4 WD 1200JB PATA drives in RAID 5. I want my next RAID to be SATA but I am cautious. Here are some caveats: http://www.ata-atapi.com/sata.htm "DO NOT tie wrap SATA cables together. DO NOT put sharp bends in SATA cables. DO NOT route SATA cables near PATA cables. Avoid placing SATA devices close to each other such that the SATA cable connectors are close to each other." etc. Check the link for more cautionary info about SATA. Have we even heard from _one_ SATA RAID user? Lotta data at stake here.
Here, I'll take the extra step and link the Google search for you.
Duplidisk makes these neat $10 (from ebay) IDE hardware raid 1 cards. I don't think there is a cheaper solution.
http://www.duplidisk.com/
We use them in all our firewalls and they've been working great.
...if you are going to the data recovery company to get your data back.
You should always have a backup that can be restored without relying on some other company's schedule.
Stop the world; I need to get off.
we use a lot of 3ware cards, they seem to be the best to handle either raid 1 or 5.. depending on your needs. the best choice really depends on the amount of data you want to keep raided. if you have more than the biggest hd out can hold (i think 300+ GB right now) you might want to look at raid 5. but if you have 300 GB and under just pick up two of the biggest hds and run them as a mirror.. works great.. 3ware has a really cool web interface where you can check the status of your drives locally or remotely.
http://www.3ware.com/
No matter what, remember to cool your RAID array adequately. Stacking a lot of hot harddisks on top of each other in a cramped minitower can lead to frequent drive losses.
The cheapest and most simple RAID setup for a personal fileserver is to have the data and the OS on separate disks, where only the data disks are mirrored, using software RAID 1.
The reason for this setup is that booting from a RAID array can become tricky if one of the bootdisks misbehaves, unless we are talking serious (not cheap) RAID controllers.
This setup is not the best with regard to server uptime, since the OS disk isn't mirrored, and the OS therefore needs to be reinstalled if the disk dies. OTOH, it is fast and reliable when it comes to protecting the data, while still being cheap and easy to maintain.
And I assume that data protection is more important than server uptime, since you say:
My goals are to build a file server that can live through a drive failure with no loss of data, and will be easy to rebuild.
You probably want to use an IDE channel per disk, since having two drives on the same channel/cable (using slaves) kills performance. But a simple PCI IDE controller is cheap.
Unless you can and want to pay even closer attention to your drive status than you do with a single drive, only use RAID 1 (mirroring) solutions. With (hardware) RAID 1 each drive has a FULL and COMPLETE copy of your data. If worst comes to worst you have a good probability of recovering a vast majority of the data even if both drives fail from, say, a fire or flood. There are several companies that, for a hefty fee, have the expertise and clean room facilities to decode the flux reversals back into bits onto a healthy drive; the last time I shopped a recovery, a couple of years ago, it was about $1200.
RAID 5 is recommended ONLY for an enterprise environment where a determined effort is made to raise and escalate alarms of drive failure to an onsite 24/7 staff. The very same thing that they tout as an advantage is the danger of the configuration. Your data is ripped apart and the bits spread over multiple drives. The survival of your data is dependent on not losing more than one drive. You must act quickly if you lose a drive: replace it and rebuild back to the optimal number of drives. The danger is that while you wait for that replacement drive to ship or the bad drive to come back from warranty repair you could lose another drive. What was wrong with the 1st to die might also be wrong with the remaining drives. You probably bought all the same drives at the same time/place and stand a good chance of getting them all from the same manufacturing batch unless you go out of your way to ensure they come from different lots. When you lose that second drive you are HOSED. The chance of finding a recovery company that has expertise in recovering RAID 5 data striping as imposed by your particular controller (chances are it is obsolete by a year by the time you get bit by this) is low, and if you do find one they are going to charge you 5X or more what a straight mirror recovery might cost, $3500-$5000 AT THE LEAST.
Yes you can add warm spares (and you can lose those too, especially if you are not "lot conscious") and take other ameliorative steps, but generally home and small business users do not go that deep into protecting data integrity. I've seen the loss of data on RAID 5 arrays happen to customers, TWICE. For a home user or a small business that runs "lights out" evenings and weekends, KISS applies: get a Promise or 3ware IDE controller and mirror two drives. Given the cost of an 80GB drive at $80, the "inefficient space utilization" is rendered a pretty moot point. You can make your life easier by installing the drives in sleds ($30-120) so you don't have to mess with opening the case and screws and cables. Don't trust "hot swap" unless you are absolutely required to keep the machine running, usually not the case for a home/small business. Shut it down, pull the sled, and replace the drive in the sled (buy a 3rd sled if you need to make the swap in minimum time). Leave the RAID 5 stuff with the people and the bucks to nursemaid them; they are just a disaster waiting to happen for those that have neither the time nor the inclination to monitor drives on an ongoing basis. If you are using more recent Linux or Windows servers you can use software mirroring, but I do think the $100 controller cost is worth it to avoid the hassle of managing the mirror from the OS and of the Windows boot track not being included in the mirroring process (solve this by doing trivial installs on the mirror drive before mirroring the partitions with the OS); the more frugal might find the time/money trade-off makes software mirroring worthwhile. HTH.
There is no right to feel safe thru security vaudeville at the expense of everyone's freedom, privacy and tax money.
So I had the same goals in mind. I wanted to store a fair amount of data, be protected from a drive failure. In addition, I didn't just want to rebuild the server, I want to be able to upgrade it as needed without having to go through a lot of work.
:(
This is my solution...
Intel SE7210 motherboard
2.4 Ghz P4
2 Gigs PC3200 ECC RAM
1 60 Gig system drive
Promise Fasttrak SX4000 RAID-5 controller
4 80 Gig drives
Enhance Technology QuadraPack Q34 enclosure
VMWare Workstation
1 160 Gig drive in a USB 2.0 external case
The 60 Gig system drive houses an installation of Windows Server 2003, and the install of VMWare.
The 4 80 Gig drives are configured with 3 in a RAID-5 and 1 as a hot-spare. They are in the Quadrapack, which actually allows hot-swap. Onto this 160 Gig volume I have the images for five virtual servers. (Web, SQL, Exchange, File/Print, Build/Source Repository)
I have the 160 Gig external drive mounted within VMWare as a VMWare Shared Folder, each virtual install has its own directory. Then I run some backup scripts within the virtuals to backup the critical data files there. Just in case... it was cheap insurance anyway, and it gives me plenty of additional temp storage when I reconfig my workstation or something.
Doing the VMWare thing is nice, because I have all this custom configuration done to those environments, and I don't have to worry about it if I want to say reconfigure the server in some way. This machine has actually been through three motherboard upgrades, a few harddrive upgrades and such since I first started doing this. The RAID-5 is fairly new, it used to just be a single 80 gig drive. Downside is since I have VMWare workstation, rather than their GSX/ESX server, I have to logon to the box to start up the virtual sessions.
Anyway, it works well. Were I doing it today I would use SATA drives. I have a RAID-1 SATA set on my main workstation. The SE7210 server board supports same, and I considered replacing the single 60 gig drive with a mirrored set of two cheap Seagate 80 gig drives or something.
The Promise RAID controller has been pretty good, I don't have any complaints. The PAM controller software kind of sucks, though not as bad as the problems I've had with the Intel server software that came with the SE7210 motherboard.
Oh yeah, I learned the SE7210 uses a special ATX power supply called ATX-12V... Didn't figure that out until I was trying to install it into the Antec SX1040BX case I already had.
The really important stuff, like my Microsoft money file... I have copies stored on the file server, my desktop, and a 32 Meg compact flash card.... It's not going anywhere unless the house burns down.
In which case I guess I really should have offsite backup, and I really should handle that by ftping up an encrypted copy of some of these important files to my website which is hosted at an ISP somewhere far away. I'm going to work on that this weekend now that I think of it.
3ware raid 5
4, 8 or 12 port ata or sata card
very nice hardware
I am not affiliated with them, but I recently bought one, and it works great!!! For 150 bucks you can turn any computer into a NAS (network attached storage) and even gain remote access from the internet. And in the next couple of months, you will be able to sync two reBytes remotely thru the internet (now THAT is backup!) PLUS it comes with its own backup software to backup local computers on the network with no client software needed. It is based on Linux and took less than 5 minutes to set up. I bought a brand new computer with 5 drives in RAID 5 configuration for nearly a terabyte of storage all for less than 1200 bucks.
RTFM! Sorry to be a troll but in this case you really showed that you had no initiative to check available resources first. Here is a starting point and yes, RAID-1 would be a good start since I doubt you are going to want to spend the money on a large RAID-5 array.
I had similar frustrations, with HDs failing within a year of purchase. I finally decided that I didn't want to worry about it ever again and chose a RAID-5 setup with 1 spare disk. That way a single disk failure would be handled effortlessly. This setup is still susceptible to the failure of a second disk while the first is being reconstructed, but nightly backups handle that case.
:)
I ended up throwing a chunk of money at the problem:
$200 for a new dual channel U160 SCSI controller
$600 for 4x 75Gb 10krpm SCSI disks (1 spare)
$200 for cabling and an external case.
$250 for an ultra-SCSI cd burner and a buttload of blanks.
Total cost was $1250 for ~150Gb of RAID-5 storage.
Roughly half the space was lost for the parity information and the spare disk.
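The "roughly half" figure falls straight out of the arithmetic: one of the four disks sits idle as a spare, and one disk's worth of the remaining three goes to parity. A small Python sketch of that calculation (the function name raid5_usable_gb is just for illustration, and it ignores formatting overhead):

```python
def raid5_usable_gb(total_disks, disk_gb, hot_spares=0):
    """Usable capacity of a RAID-5 set: one active disk's worth goes to parity."""
    active = total_disks - hot_spares
    if active < 3:
        raise ValueError("RAID 5 needs at least three active disks")
    return (active - 1) * disk_gb


raw_gb = 4 * 75
usable_gb = raid5_usable_gb(total_disks=4, disk_gb=75, hot_spares=1)
print(raw_gb, usable_gb, usable_gb / raw_gb)  # 300 150 0.5
```

The same function reproduces the N-1 rule quoted earlier in the thread: six 100GB drives with no spare give 500GB.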
The multi-disk setup also turned out to be faster than using a single disk. Access rates jumped from 20MB/s on a single disk to 60MB/s on the RAID array. That turned out to be quite nice for those NFS-mounted home directories.
150Gb might not seem like much, but it's only used for important user data (home directories, mp3 library). The local machines have plenty of scratch space for unimportant data (mozilla builds, games). I don't fear the loss of the system disk as much as the loss of user data. My wife would be very upset if she found out that her email was gone forever, but wouldn't mind too much waiting a day to have the server rebuilt. No disk failures in a year of running, but that might just be because I didn't buy the cheapest drives available at Fry's.
Of course, it's nothing compared to the 2Tb disk servers at work that can write at 350MB/s. Maybe next year...
I've been looking at one of these for when I build my next server:
Adaptec 2410SA + Enclosure Kit
For $600 MSRP you get an Adaptec 2410SA 4-channel SATA RAID controller (does RAID levels 0, 1, 5, 10, and JBOD), a 4-drive enclosure with hot-swap sleds, cables, and your choice of beige or black. The only thing you don't get are the drives themselves. The 4-drive enclosure takes up 3 full-height drive bays. Note: make sure you have enough 12v current from your power supply -- I'm not sure if the enclosure staggers the spinup.
Chip H.
A few years ago I nearly lost a bunch of irreplaceable original midi files and Cakewalk projects composed and recorded by yours truly and a couple of friends, one of whom passed away from cancer not long after he recorded the songs on my keyboards and computer. I swore I'd never again run solitary drives on an important music composition/production computer and promptly ran out and bought myself a Promise FastTrak card and a pair of 20GB drives (the biggest around at the time). I ran those as RAID-1 until one of the 20GB drives started crapping out and then bought a new FastTrak TX4 100 card and a pair of 80GB IBM DeathStars before I knew about them earning that nickname. Used good old Norton Ghost to copy the known good drive from the old mirrored pair over to the new pair and was up and running on the new drives within a couple hours. Ran those for 2 years as RAID-1 without a hint of trouble despite the IBM drives, and recently replaced them with a new pair of Seagate 120GB 8MB buffer drives. I've never had any disk I/O thruput problems with the Promise card either, it's plenty fast and I also use this particular rig for multitrack audio too, and can easily support recording more simultaneous tracks than my audio hardware can even provide. I used one of the old 80GB Deathstars in a Linux box where it's still happily running, and use the other old 80GB drive to hold a recent Ghost copy snapshot of my music rig and keep that drive stashed away across town at a friend's house for offsite backup in case some major disaster hits my house.
Dear Slashdot:
/. really needs to cache information. Seeing as I've been shot down many times, I've decided to start asking inanely simple questions for the sole purpose of collecting all the obvious answers in one place. I hope you don't mind that my stories will take the place of those which could actually spark intellectual debate.
I'm having this problem where I can't access Gooogle. I just can't get the number of "O"s right. I either put in Gogle, or Gooogle, or Goole... I'm just not having any luck. Can you answer my question for me?
Thanks,
Bob_at_AOL
--------------------
Dear Slashdot;
I've decided that
Sincerely;
I.P. Freely
---------------
Dear Slashdot;
I've finally figured out a way to get off. I'm gonna put stories in with really, really easy answers, and hope we get featured on news.google.com. It really gets my goatse going to see my creation up there... it just makes me want to google myself. And I've got my next story to post!
Coming soon: "Windows versus Linux for Al Quada operatives having sex while George Bush and John Kerry engage in a Peruvian DeathMatch in Iraq during which Martha Stewart makes Bukakke Chicken Pot Pie with extra MILF and a dash of Dirty Sanchez"
Googly yours;
CmdrTaco
If I knew the wedgies I gave you back in 6th grade would have resulted in this . . . I might have taken a moment's pause.
Actually, it's still 3 years on the sites I buy from. I actually checked before posting. :-)
Karma: It's all a bunch of tree-huggin' hippy crap!
is the only choice for reliability, performance and cost effectiveness. yeah, you can build one for much cheaper using ATA/serial ATA, but for longevity and reliability, SCSI is still the best bet.
look for mylex raid controllers, or if you're looking on ebay, consider CMD (5000 series or greater) as a fall back option. i don't recommend chapprel.
if you're gonna do it on the cheap consider the firewire raid solutions from wiebetech or granite digital.
three can keep a secret, if two are dead - benjamin franklin
This forum could not have come at a better time.
I am going to move my office file server from a single drive to a mirrored raid drive. After some simple google searches for a linux compatible controller the promise name came up a number of times. I was going to go with that.
Why don't you like promise?
Promise SX-4000 with four Maxtor 120GB drives... nice 360GB array... only problem is closed source drivers, but they work reliably.... not nearly as expensive as 3ware stuff...
may the Promise/3Ware flame war commence..
I have never suffered a HDD failure. The key is sticking to the most reliable brand. It used to be Maxtor. As of about 10 years ago, it has been Seagate (despite what anyone says). I have used several Seagate Barracuda ATA drives, and they are flawless, *fast*, and silent.
The next big "reliable brand" could have been IBM, but then there was the IBM DeskStar... Maybe they should have called it "Death Star" instead...
[Reliability aside, IBM has done more to push HDD technology forward than almost any other company (they have created a good number of the technologies that let us pack so much into such a small space -- GMR heads, pixie dust, etc.)]
I would do raid 5 with some sort of hotplugging solution, if I had the money. Although you might have to take it offline to add a disk to the array, you could replace a bad one with no data loss and no downtime. Of course, you could do that with raid 1 as well.
This would require at least 3 drives, but the more the better (even smaller capacity) because then you waste less space. Of course, this makes it more likely for two drives to fail simultaneously.
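To put rough numbers on that trade-off, here is a small Python sketch. It assumes every drive fails independently with the same probability p over whatever window you care about, which is optimistic if the drives all come from one batch, and the 5% figure is purely illustrative, not a measured failure rate.

```python
def parity_overhead(disks):
    """Fraction of raw RAID-5 capacity spent on parity."""
    return 1 / disks


def prob_two_or_more_fail(disks, p):
    """Chance that at least two of `disks` drives fail, each with probability p."""
    none = (1 - p) ** disks
    exactly_one = disks * p * (1 - p) ** (disks - 1)
    return 1 - none - exactly_one


# p = 0.05 is a made-up per-drive failure probability for illustration.
for n in (3, 5, 8):
    print(n, round(parity_overhead(n), 3), round(prob_two_or_more_fail(n, 0.05), 4))
```

The overhead column shrinks as you add drives while the double-failure column grows, which is exactly the tension described above.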
Don't thank God, thank a doctor!
I read a lot of these answers, but not all of them, so sorry if I'm redundant. I have about 20TB of RAID 5 storage, most of which I've managed for nearly two years. For at least a couple of years before that, I managed storage in the multi-TB range. Hardware RAID 5 is the best way to go. If you're really concerned with safety, add as many hot/cold spares as you like. Performance is far superior to software RAID, no matter what anyone here may post. Good luck.
The world of achievement has always belonged to the optimist. -- J. Harold Wilkins
The question "which RAID level do I use?" raises other questions: which drive interface technology? what is your budget? how much storage do you need? do you need redundancy for swap? for boot partitions?
But regardless of how you answer those questions and what RAID level you finally go for, I would strongly recommend layering LVM (logical volume management) on top of RAID. Sounds bizarre and cumbersome to have two virtual layers between your filesystem and your physical devices, but in most cases it's worth it.
(Now here I'm assuming you're using Linux, but similar solutions are available for other OSs).
If you're not familiar with LVM, it virtualizes partitions. You group together one or more physical volumes (PVs) that provide a pool of physical extents (PEs). From this pool, you create logical volumes (LVs) filled with logical extents (LEs).
Thus, you could have four partitions on three drives serving as PVs, and from that pool (Volume Group or VG) you could create, say, two partitions. From there you have many options:
- You can resize the partitions.
- You can add another drive and add the space on that drive to the VG, then increase the size of the partitions.
- You can migrate data off one of the partitions, then remove that partition from the VG.
- You can migrate to another drive by adding that drive, migrating data away from the previous PVs, then removing the old PVs from the VG. This can be, by far, the easiest way to
To combine LVM with RAID, just use the md device as a PV.
And here is the top reason to use LVM:
- You can create snapshot backups.
A snapshot backup is a virtual partition, read-only, which contains the same data as another partition, frozen at a certain point in time. Something similar to copy-on-write is used so that the snapshot partition takes only the amount of disk space necessary to store the changes between the time the snapshot was frozen and the current state of the 'snapped' filesystem.
If you 'rm -rf *', you can just cp the files from your latest snapshot. (BTW, this can save a ton of work for sysadmins with forgetful users ... users can restore the file they just deleted by navigating (graphically even!) to, say, /backup/2004-06-16/home/jason and copying a file to the desired location).
So RAID can protect you from hardware error, and LVM with snapshots can help protect you from user error.
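If the copy-on-write idea behind snapshots sounds like magic, here is a toy Python model of it. This is not how LVM is implemented internally; the Volume and Snapshot classes are invented for illustration, and "blocks" are just list entries. It only shows why a snapshot costs space in proportion to what changed afterwards.

```python
class Volume:
    """Toy block device with copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, save the old block in every snapshot
        # that has not already preserved its own copy of it.
        for snap in self.snapshots:
            snap.saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    """Read-only view of a Volume as it was when the snapshot was taken."""

    def __init__(self, origin):
        self.origin = origin
        self.saved = {}  # block index -> original contents

    def read(self, index):
        # Changed blocks come from the saved copies, unchanged ones from the origin.
        return self.saved.get(index, self.origin.blocks[index])


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")               # the live volume changes...
assert vol.blocks[1] == "B"
assert snap.read(1) == "b"      # ...but the snapshot still shows the old data
assert len(snap.saved) == 1     # and stores only the one block that changed
```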
Alright, flame me because I'm not about the command line 100%, but in cases like this seeing directory trees makes a big difference. When I started out in linux, I was all gui because windows is all gui. That was years ago and I definitely see how the cli can be a lot faster and a lot more efficient, but in cases where I have to delete files (which isn't too often for desktop users) I would prefer a gui. You may gawk at this, but a directory tree in konqueror shows how we perceive the files to be laid out in our mind better than #pwd; ls -l.
If you are going to be heavy write, forget software RAID-5. You will be very disappointed (i.e. performance sucks). This is due to having to calculate the parity. Hardware RAID-5 will be much better (e.g. 3ware), since parity calculations are offloaded from your CPU. If you need screaming write performance you will need to do RAID-1+0 or RAID-0+1 (software RAID is perfectly acceptable for this). If you are predominantly read, software RAID-5 would be tolerable.
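For anyone wondering what the parity work actually is: RAID 5 parity is an XOR across the data blocks of a stripe. A minimal Python sketch follows; the helper name xor_blocks and the four-byte "blocks" are made up for illustration, and a real array uses much larger chunks and rotates the parity block across drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"disk", b"one!", b"two?"]   # one stripe's data blocks
parity = xor_blocks(data)            # the stripe's parity block

# Lose the second drive, then rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# A small write must also refresh parity: new = old_parity ^ old_data ^ new_data.
new_block = b"ONE!"
new_parity = xor_blocks([parity, data[1], new_block])
assert new_parity == xor_blocks([data[0], new_block, data[2]])
```

The last step hints at why small writes hurt: updating one block means reading the old data and old parity before writing both back, on top of the XOR itself.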
Maybe I had a cursed controller, but I had the SX6000 and had a drive failure. FYI, I was doing RAID5 with four 80GB WD drives. It beeps like you said, only the SX6000 doesn't tell you which drive is bad (I was running linux, no fancy windows utils -- and their BIOS didn't have this information -- very unprofessional). So, I figured out the bad disk by running WD's diag utility on them individually. Anyway, put in a new drive. Guess what? Wouldn't rebuild. So I booted into the degraded array (oh, and the kernel would panic if you left the "new" drive attached, so I had to pull that back off) and moved my data elsewhere on the network and rebuilt from scratch.
Well, later a drive started making noise, I identified it and shutdown the system nice and neat before it failed and replaced it with a good drive. Hey, wouldn't rebuild -- again! Moved all of my data off and bought a 3ware 7500-4 controller. I've had one disk failure with the 3ware controller, and it actually worked! Rebuilt while I was using the system. Comes with decent linux utils (unlike promise -- at the time anyway). Oh, and the best part is they actually label and document their product and the BIOS will actually tell you which drive failed -- what a concept! 3ware controllers are about the same price as a promise controller, why even bother going with promise?
And no, I don't work for 3ware. Simply put, promise had their chance with me and they failed miserably. YMMV. I'm sure the SX8000, or whatever, is vastly improved and all that crap, but it's too little, too late for this consumer.
Stupider like a fox! - H.S.
Standard disclaimers: no vested interest, no relationship with developers, etc. Just a satisfied user.
If you want a RAID solution that is not absolutely awful without spending loads of money on SCSI hardware, the only way to go is 3ware. They are the only ATA hardware RAID solution (which means dedicated hardware on the controller is responsible for RAID operations, not some stupid kernel-space driver) and offer controllers with either parallel or serial ATA interfaces. I myself am using two Escalade 7006-2 cards (32-bit PCI interface, parallel ATA, two disks per controller) under FreeBSD and they are excellent. Performance is good, disaster recovery is flawless (they support background mirror rebuilding), and compatibility is perfect.
They are also quite cheap. If you go to Monarch Computer, you can find the model I have for around $110 with free shipping.
I know it's hard to trust advice from Slashdot, but this is the best way to go. As many others have pointed out, RAID-1 is the obvious choice for personal uses. And always stay away from software RAID. Whether it's the Linux kernel RAID subsystem or Promise, Highpoint, etc. (these "RAID cards" are in fact software), software RAID sucks and will increase the chances of data loss, not reduce them.
I have a blog entry that talks a little bit more about this.
Join Tor today!
Yeah, okay, you could buy the $2K program, or you could use the accurate, fast, open-source utility gpart that has been around for YEARS. It has saved my ass more than once.
That way, your filesystem won't be massively corrupted if the power goes out.
I forgot to mention that in my previous message. The internal RAID card absolutely SUCKS ASS.
:) If I'm putting 8 to 15 drives on a machine, I'd really rather not power them from the machine's power supply. You'll be going through hell with power splitters, and overloading the power supply, even if you don't think you are.
I'm a huge fan of the external arrays. We've used several over the years, and I have nothing but glowing reviews of the external arrays.
The SX6000 is an internal card. The SX8000 is an external box.
We have one machine with an SX6000 card. It didn't work to start with. The intended machine (A dual AMD 2000+) simply wouldn't boot with it attached. After weeks of going back and forth with the Promise support line, who insisted that it worked on *THEIR* test platform, using the same motherboard and processors, we set it aside, to work on later. A week later, they released a BIOS update for the card, which fixed the problem. The problem was that their BIOS wouldn't allow *ANY* system with the particular chipset to boot. Nice. It wasn't an obscure chipset, so I'm sure there were plenty of people with the same problem.
A friend of mine was using the same card, and at the time he loved it, but at the first failure, his opinion became much like yours (absolutely sucks).
We've used other external arrays by other companies, but those companies seem to come and go too frequently. One company I worked for had an absolutely BEAUTIFUL external array, which came as layers, so you could pick and choose how many drives you wanted it to be capable of. Unfortunately, that company went out of business in the mid 90's.
I like any external array that lets the OS see it as a single SCSI drive, and doesn't take any special drivers, like the Promise SX8000. Hell, any OS sees a single SCSI drive, what more could I ask?
Serious? Seriousness is well above my pay grade.
I've tried both LVM and RAID5, and my conclusion was that LVM sucked performancewise, couldn't fill a 100Mbit pipe (which one drive without LVM does easily). And since no one could explain why this was (even talked to a guy with hardware scsi raid running LVM and he barely got 18MB/s :-/ ) I ended up running RAID5 with 5 drives. Each drive is running as a master on its own IDE channel. The additional IDE channels are provided with 2 Promise FastTrack/133 (or whatever they are called).
The performance on this system is outstanding, writing is done at a sustained 50-60MB/s (yes, megabytes) and reading maxes out the PCI bus completely (tops out at about 80-100MB/s depending on other activity on the PCI bus)
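For scale, a quick back-of-the-envelope conversion in Python. It assumes the box has a plain 32-bit, 33MHz PCI bus, which the post does not actually state; the function names are only for illustration.

```python
def mbit_to_mbyte_per_s(mbit):
    """Convert a network rate in megabits/s to megabytes/s."""
    return mbit / 8


def pci_peak_mbyte_per_s(bus_bits=32, clock_mhz=33.33):
    """Theoretical peak of a PCI bus: bytes per transfer times clock rate."""
    return bus_bits / 8 * clock_mhz


print(mbit_to_mbyte_per_s(100))        # 12.5
print(round(pci_peak_mbyte_per_s()))   # 133
```

So a 100Mbit link only needs about 12.5MB/s from the disks, and 80-100MB/s of reads is already a large share of what such a bus could move in theory.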
The system is powered by a 2.4Ghz Celeron with 512MB memory.
The only drawback is that it will be a pain to add an additional drive to the system, but that's not really a big issue for me anyway.
Btw, the filesystem on this raidset is Ext3. I've had a disk failure (old drive that should have been left to its own) since I got it up and running but as long as no more than one drive fails at once, all is well. Just replace it with a new one, add it to the set and one hour later (or thereabout) all data has been restored to it and the raidset is running at full performance again.
A tip for the hardware that will be running the fileserver. Make sure to cool your drives, this is of utmost importance. No, you don't need screaming 7000rpm fans (I use three 12dB Papst) just make sure that outside air is pulled over the harddrives and expelled out the back of the case. Avoid cases with ventilation holes on the sides. Thermaltake makes (made?) a great case which had air intakes on the front and 6 internal 3.5" bays right behind the intakes (which is the one I use).
Also, you should get a good powersupply. I had some really odd problems before I upgraded from 300W to 450W.
Good luck
What RAID should you use? Well it just depends. Frankly, if the little fuckers are getting into your personal fileserver, and just generally screwing things up, then you might want to go with your plain-old RAID I. However, if they're using your motherboard as the local hotspot for insect orgies, then you might need RAID II. And let us not forget RAID III if long-term maintenance is an issue.
"You and your third dimension."
I haven't read through all 800 posts, so this kind of info has probably already been posted, but I found it very link-worthy...
WikiPedia: Redundant array of independent disks - great detailed article summarising RAID with explanation of all the levels.
Anyone can even jump in and improve the article.
It doesn't do RAID, but it does have built in UPS. Oh, and it plays music, nice side effect.
Start your calculation with the number of Gb of space you "need". Say that this is 160Gb. Then you have the option of:
L0: no raid: 1x 160Gb or 2x80
L1: raid1: 2x 160Gb.
L1: raid5 (3 disks): 3x80Gb.
L1: raid5 (4 disks): 4x60Gb
L1: raid5 (5 disks): 5x40Gb
and you could even go for a hot spare:
L2: raid5HS (4 disks): 4x80Gb.
L2: raid5HS (5 disks): 5x60Gb
L2: raid5HS (6 disks): 6x40Gb
Now, from 4 disks you probably need an extra IDE controller in your computer. Factor that into your costs, and you can choose the protection level (L0 means you can tolerate 0 lost disks. L2 means 2 disks, but in fact not at exactly the same time, but you can tolerate two bad disks).
Then simply choose the cheapest solution.
I'd probably go for 3x80 myself.
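The table above can be regenerated for any capacity target with a few lines of Python. This is only a sketch: the function name smallest_disk_gb is hypothetical, the list of disk sizes is limited to the ones mentioned in this post, and prices are left out because none were given.

```python
def smallest_disk_gb(needed_gb, disks, level, hot_spares=0, sizes=(40, 60, 80, 160)):
    """Smallest listed disk size that yields at least needed_gb of usable space."""
    active = disks - hot_spares
    if level == "none":
        data_disks = active
    elif level == "raid1":
        data_disks = 1              # every disk holds a full copy
    elif level == "raid5":
        if active < 3:
            raise ValueError("RAID 5 needs at least three active disks")
        data_disks = active - 1     # one disk's worth of parity
    else:
        raise ValueError(level)
    for size in sorted(sizes):
        if data_disks * size >= needed_gb:
            return size
    return None                     # no listed size is big enough


options = [("none", 1, 0), ("none", 2, 0), ("raid1", 2, 0),
           ("raid5", 3, 0), ("raid5", 4, 0), ("raid5", 5, 0),
           ("raid5", 4, 1), ("raid5", 5, 1), ("raid5", 6, 1)]
for level, disks, spares in options:
    size = smallest_disk_gb(160, disks, level, spares)
    print(f"{level:5} disks={disks} spares={spares}: {disks}x{size}Gb")
```

Multiply each row by whatever the drives and the extra controller cost you locally to pick the cheapest one.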
I use the 2410SA on my system at home with 4 hitachi drives. It is great. The Adaptec card eliminates many of the raid limitations and makes it very simple to manage. It will handle different sized drives and still utilize the full amount of space on each drive. It is plug and play in the sense it deals with all potential hassles. When a drive dies, you just put a new drive on, and it rebuilds the array silently in the background - while you are still using your machine. You can also migrate from one raid configuration to another with a click of a mouse. There is another hidden benefit to going with a RAID controller separate from the motherboard. You can replace the controller if it dies. If it's integrated, you many times have to find an exact duplicate of your old motherboard to get your data off the drives. Exact means exact - same firmwares, hardware revisions. This can be hard, expensive, and time consuming. This is another reason to go with (expensive) Adaptec - there will always be a replacement available in case of failure. I use RAID 10, but would suggest RAID 5 for maximum safety and final array size.
so how can you make generalisations such as "The problem w/ Software RAID is it depends on the OS, if you OS fails you can loose your data" ? Have you run software RAID under any other OSes ?
The Internet's nature is peer to peer - 20050301_cs_profs.pdf
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
Kernels: debian and mandrake support RFS fresh out of the box, and have done so for at least 2 yrs (first time I used RFS).
As for the other tools, I've never encountered the need to reroll a tool or patch it to get RFS working. Again, the hardest I hammered on it was after unrelated hardware failure, and it *just worked*, recovering all data repeatedly and transparently from unexpected shutdowns, so I didn't have to learn anything.
Were you serious, or were you trolling? I honestly don't know a lot about ReiserFS beyond using it at the suggestion/behest of a more-savvy user, and from mention of a Reiser-v-world controversy last year due to something quirky in his license. But http://www.linuxworld.com/story/32868.htm has Nick Petreley's in-depth comparison. Petreley gives props to ext3 in a few spots, but likes the designed extensibility and other aspects of ReiserFS.
Those are valid points. RAID 1 [or 10 or 01] might make an even better choice for these areas which need fast write speeds.
In case nobody's yet posted the link, Storage Review has a wonderfully detailed section on RAID performance.
Dunno if anyone has posted about this yet or not, but be watching for a new type of RAID on Intel's Grantsdale chipset. It will mix RAID0 for performance, combined with RAID1 for redundancy (otherwise RAID0 is more like AID0, and that stupid Sony dog was a waste of money). The best part is that it only requires 2 drives, whereas before RAID0+1 took 4 freakin drives (a large investment).
See here for more info. http://www.dvhardware.net/article2193.html
~Ess
dunno why they had to call it The Matrix -- seems kinda played to me
LOL. If you can't tell if I was trolling or serious then I must not have done a good job writing my last message. :-) For the record I was serious. I haven't used ReiserFS but I have researched it a little bit. I also took note of the Kernel make menuconfig details on Reiser. That's where I first found the note about patching quota and NFS packages. It makes sense, assuming that they don't already support Reiser. I wonder if the more current versions of nfs-utils and quotatools automatically support it. I looked into Reiser a year or two ago when I built a 480GB array. I went with ext3 since it was so easy to get going. I also felt that Reiser wasn't fully supported by everything at the time. I haven't taken the time to look into it since. I'm sure it works well though. Like all things it probably has its quirks, but is easy to get going for someone that knows those quirks already. Maybe I'll add it to my collection of skills someday.
Good reply, good points. Now, not to change the subject, but my biggest question is: Where the hell did the subject line change?! I don't recall even *using* that phrase or concept recently. Even my browser history/cache shows an un-revised subject line.
I've railed against /. editorial laxness in the past, but have never seen firsthand an outright bug like the above morphin' Subject field. Weird...
*Opens mouth* *Closes mouth* Well hell. That's a damned good question. I hadn't even noticed that! I never really noticed a bug in Slashcode before (never had time to run it locally). I've got features I'd like to add though. That's weird as hell. I'm not even sure what discussion the batteries stuff came from. Weird! LOL. Nice catch.
A two drive RAID 0 does not increase the "risk of data loss" by 4x. It increases the risk by the amount associated with one additional disk drive (which is incremental). The risk of data loss on the other drive and from the controller already exists.
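To put rough numbers on that, here is a minimal sketch. It assumes each drive fails independently with the same probability over some period, and it ignores the controller. The 3% figure and the function name are made up for illustration, not measured data.

```python
def raid0_loss_probability(p_drive: float, n_drives: int) -> float:
    """Chance that at least one of n striped drives fails.

    In RAID 0 any single drive failure loses the whole array.
    Assumes independent, identical drives (a simplification).
    """
    return 1.0 - (1.0 - p_drive) ** n_drives


p = 0.03  # hypothetical per-drive failure probability over the period
single = raid0_loss_probability(p, 1)
striped = raid0_loss_probability(p, 2)
print(f"one drive: {single:.4f}, two-drive RAID 0: {striped:.4f}, "
      f"ratio: {striped / single:.2f}x")
```

Under those assumptions the two-drive stripe comes out at just under twice the single-drive risk, nowhere near 4x.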
RAID 1 includes data striping so it only requires an even number of drives. RAID 0+1, 1+0, and 10 are marketing terms invented to make products sound more feature-rich and have come into popular use. Read the original RAID paper and you'll see that RAID 1 is not just simple mirroring but includes "drive arrays".
The wasted space of arrays of mirrored drives is the same as that of a single mirrored drive. The overall extra cost has to consider the whole system. In smaller systems, say 4 drives instead of 3, the added cost of a more expensive RAID 5 controller or extra cache to help accelerate writes will overwhelm the cost of one additional drive. RAID 1 systems are not automatically the most expensive.
RAID 3 is differentiated from RAID 5 by its dedicated parity and small (originally word-sized) interleave. It has never been replaced by RAID 5 but rather by RAID 4 with single sector interleaves. Parity schemes are now described by distributed or dedicated parity and by large or small interleave factors. RAID 3 was most well suited to hardware implementations as was RAID 2 which you omitted entirely.
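For anyone who wants to see the dedicated vs. distributed distinction concretely, here is a rough sketch. The helper names are hypothetical, the rotation order for distributed parity is an assumption (real controllers use several different layouts), and the interleave size is not modeled at all.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe: int, n_disks: int, distributed: bool) -> int:
    """Index of the disk holding parity for a given stripe.

    Dedicated parity (RAID 3/4 style) always uses the last disk.
    Distributed parity (RAID 5 style) rotates it; this particular
    rotation order is assumed for illustration.
    """
    if distributed:
        return n_disks - 1 - (stripe % n_disks)
    return n_disks - 1


# Three data blocks in one stripe, plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] died: XOR the survivors with parity.
rebuilt = xor_blocks([data[0], data[2], parity])

dedicated_layout = [parity_disk(s, 4, distributed=False) for s in range(4)]
distributed_layout = [parity_disk(s, 4, distributed=True) for s in range(4)]
print(rebuilt == data[1], dedicated_layout, distributed_layout)
```

The reconstruction step is the same either way; only where the parity block lives changes, which is why a dedicated parity disk takes a write on every stripe update while rotated parity spreads those writes around.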
The typical person who buys a cheap, 4 port IDE RAID card wants the most capacity he can get out of 4 drives so RAID 0 or 5 is inevitable. That doesn't make those RAID levels always the best choices. Since that user is typically not performance-sensitive (especially for writes), any RAID configuration will be satisfactory. Home users typically can't appreciate the performance differences between levels, so they often believe the differences don't exist.
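The capacity side of that trade-off is easy to work out. A minimal sketch, assuming four equal-sized drives (the 250 GB size and the function name are hypothetical), with "10" standing for striped mirror pairs:

```python
def usable_gb(level: str, n_drives: int, drive_gb: float) -> float:
    """Usable capacity for n equal-sized drives at a given RAID level."""
    if level == "0":  # striping only, no redundancy
        return n_drives * drive_gb
    if level == "5":  # one drive's worth of space goes to parity
        if n_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n_drives - 1) * drive_gb
    if level == "10":  # striped mirror pairs, half the space is copies
        if n_drives < 2 or n_drives % 2:
            raise ValueError("mirrored pairs need an even number of drives")
        return n_drives / 2 * drive_gb
    raise ValueError(f"unhandled level: {level}")


DRIVE_GB = 250  # hypothetical drive size
capacities = {level: usable_gb(level, 4, DRIVE_GB) for level in ("0", "5", "10")}
print(capacities)
```

With those numbers, four drives give 1000 GB at RAID 0, 750 GB at RAID 5 and 500 GB as mirrored pairs, which is exactly why the capacity-minded buyer ends up at 0 or 5.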