Basics of RAID
Doggie Fizzle writes "RAID has been common in business environments for ages, and is now becoming more viable and popular for personal computers. This article focuses on the basics of RAID, and spells things out for beginners or tech veterans. From the article: 'The benefits of RAID over a single drive system far outweigh the extra consideration required during installation. Losing data once due to hard drive failure may be all that is required to convince anyone that RAID is right for them, but why wait until that happens.'"
That's an awful lot of ads for a re-hash of well-known info. Are the editors sure this is frontpage worthy? It looks like a blatant attempt to get page views to me.
http://raid.com/
http://www.killsbugsdead.com/
There's an excellent guide to RAID levels (with pretty diagrams and such) at http://www.acnc.com/raid.html
A source of information with far better content, that isn't simply an excuse to sell ads.
Wikipedia
Depending on your budget, here in the UK you can get an 80GB HDD for around £35, so spread over some time you should be able to afford two (or an extra one if you already have one). This is a good enough reason for anyone to try RAID.
I myself currently have it set up to mirror my data across two 80GB drives... Four months ago one of the hard disks died (funny buzzing sound, no access) but the manufacturer's three-year warranty was still valid, so I returned the drive to them for a free replacement. I received the replacement drive and shoved it in, mirrored the data back onto this new second drive and continued as before. If I hadn't had this setup that data could have been permanently lost. It also saves me from writing ten DVDs to store that much.
http://en.wikipedia.org/wiki/Redundant_array_of_independent_disks
Seriously, SATA hotswappable RAID 5: put an onboard controller on next-gen motherboards, I don't care if it's crappy compared to an expansion card, and you will have my money. Yeah, we have RAID 0, 1, 0+1, but no onboard commercial RAID 5 solution in mainstream motherboards. I know it's more expensive, but it's also more efficient, and with every failed HD common users encounter the market gets bigger.
There is truth in humor.
RAID0 will increase your chance of failure since you will lose all your data if a single drive fails. RAID0 isn't really redundant.
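To put a rough number on that, here is a minimal Python sketch. It assumes drives fail independently and with the same probability over whatever period you care about; the 5% figure is a made-up illustration, not a measured failure rate.

```python
def stripe_failure_probability(p_drive: float, n_drives: int) -> float:
    """Chance that a RAID 0 set loses data: at least one of its drives fails.

    Assumes independent failures with identical per-drive probability p_drive.
    """
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be between 0 and 1")
    if n_drives < 1:
        raise ValueError("need at least one drive")
    # The array survives only if every single drive survives.
    return 1.0 - (1.0 - p_drive) ** n_drives


if __name__ == "__main__":
    # Hypothetical 5% per-drive failure chance, for illustration only.
    for n in (1, 2, 4):
        print(n, "drives:", round(stripe_failure_probability(0.05, n), 4))
```

Under those assumptions the risk for a two-drive stripe is a little under double that of a single drive, and it keeps climbing with every drive you add.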
Here is a link that explains the basics of computer hardware; I think that it's a good companion piece to the RAID article: http://www.angelfire.com/rings/judy_patch/
Okay I guess it appeals to geeks and fancy computer modders and all. But really, when it comes down to it, a decent main hard-disk, a tray in the second bay for backup hard-disks, and a reasonable backup regimen that people keep up is all a "personal" computer user needs.
Personally, I have 3 backup hard-disks, one that keeps a "clean" base system that I update every 6 months or so, and 2 that I do full differential backups on every 3 days. The "clean" hard-disk is kept off-site, and a script tells me when to do the backups on the other 2. And for very very important files, I just write them on a CD on the spot.
With that, I've yet to lose a single file since I started using Linux in 93 or 94. My solution is cheap and doesn't involve fancy raiding. And I'm quite sure I overdo it, most people could do just fine with one main hard-disk, one backup hard-disk and a little discipline.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
With RAID, you still have a single point of failure. Instead of it being your hard drive, it is now your RAID controller. So what is the advantage?
Since a RAID controller doesn't have moving parts, is it less likely than a hard drive to fail?
Don't forget that Friday is Hawaiian shirt day.
Losing data once due to hard drive failure may be all that is required to convince anyone that RAID is right for them, but why wait until that happens.
:)
Because otherwise, you can tell them all about the wonders of RAID and all they'll do is just pretend to be interested while secretly thinking that you are some mad geek.
Tell them about the wonders of RAID after they've been kicked in the nuts by a drive failure, and you sure as hell would be getting their whole undivided attention.
Making the most of your effort man.. that's what it is
Online backup with Mozy, sounds like Ozzie, but more!
Cuz the boss won't cough up the money until it happens.
As a (poor) student, I find that I simply can't afford an extra hard drive! I got a 2nd hand DVD burner from a friend for £15 and back up all my really important stuff (code for university, photos, etc) every week. All my MP3s go on another DVD along with the hard disk, and they're "backed up" on my MP3 player anyway.
As of yet I've never had a single hard disk failure... but I've not really got anything I'm bothered about losing, so RAID isn't worth it for me.
There is nothing more practical than a good abstract theory.
IDE HDD Talking to IDE Controller:
HDD: I'm gonna need more time for that write
Contr: Yeah OK, go ahead good buddy
Contr: What's up?
Contr: What's up?
Contr: Error: Drive controller timeout error
SCSI HDD Talking to SCSI Controller:
HDD: I'm gonna need more time for that write because I found a bad block
Contr: Yeah OK, go ahead and remap that bad boy
Contr: What's Up?
HDD: Need more time to map that bad block
Contr: Yeah OK, go ahead
HDD: All done, grabbing the next command in the queue
There are two types of people: Those that have lost data, and those that will.
Don't forget, though kids - RAID won't protect you from deleting your own data, or a malformed script trashing stuff.
Get your own free personal location tracker
... but how often do personal backups actually happen? I'm one of those guys that has been taking home backups seriously for a long time, and has a collection of obsolete tape units to prove it. And backups still do not happen often enough if it requires me handling tape.
Let's face it, discipline is a drag, that is why at work IT people are paid to schlepp around stacks of locked cases full of back up tapes to be shipped off site.
So... for my home file server, I went to RAID mirroring, with a 3rd drive in a drawer. A mount-copy-umount cron job copies to the drawer-drive. Drawer-drive gets swapped and taken off site "when I think of it". Because... RAID only protects you from hard drives falling over. It does not protect you from:
1) Ooops, I wish I hadn't deleted that.
2) Gack! My house just burned down! And took 10 years of tax data with it!
3) Power supply goes wonky, causing both drives to scribble random scorfulentness everywhere.
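For anyone wondering what a mount-copy-umount job like that might look like, here is a rough Python sketch that cron could call. The device, mount point and source directory are hypothetical, and using rsync for the copy step is my assumption; the comment above doesn't say which copy tool is used.

```python
import subprocess


def backup_commands(device, mount_point, source_dir):
    """Build the three commands of a mount-copy-umount cycle (nothing is run here)."""
    src = source_dir.rstrip("/") + "/"
    dst = mount_point.rstrip("/") + "/backup/"
    return [
        ["mount", device, mount_point],
        # Assumption: rsync in archive mode does the copy. No --delete, so
        # files removed from the source stay on the drawer drive.
        ["rsync", "-a", src, dst],
        ["umount", mount_point],
    ]


def run_backup(device, mount_point, source_dir, runner=subprocess.run):
    """Run the cycle; always try to unmount, even if the copy step fails."""
    mount_cmd, copy_cmd, umount_cmd = backup_commands(device, mount_point, source_dir)
    runner(mount_cmd, check=True)
    try:
        runner(copy_cmd, check=True)
    finally:
        runner(umount_cmd, check=True)


if __name__ == "__main__":
    # Hypothetical names; print the plan instead of touching any disks.
    for cmd in backup_commands("/dev/sdc1", "/mnt/drawer", "/srv/data"):
        print(" ".join(cmd))
```

Leaving the drive unmounted between runs is the point: a stray rm or a wonky script can't scribble on a filesystem that isn't mounted.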
A home RAID system does not need to be expensive. Who needs hot swap? Use cheapo PATA drives. A few hours of down time for the wife and kids is OK. It doesn't take a big, bad CPU, and software RAID works great.
For those who have run out of internal space in their boxes, and who don't have external SATA or expensive hardware boxes, you can run RAID over Firewire.
The problem, however, is that out of the box Windows refuses to "promote" an external disk to dynamic, which is required on all post-NT4 rigs for RAID.
The solution is to add a semi-documented Registry flag, EnableDynamicConversionFor1394
HOW TO: Convert an IEEE 1394 Disk Drive to a Dynamic Disk Drive in Windows XP
Couple that with a cheap 4-bay firewire JBOD box and any spare old enclosures and you are set!
I run 2TB in various RAID configs on my Windows server (main and near-line storage). Have done so since 2002. No problems with the external boxes. The support for external firewire RAID is a little gnarly in Windows 2000 - volume must be mounted as a named virtual directory and cannot be mounted as a letter drive. Later Windows give you both options.
Da Blog
With hardware cost falling steeply, when will it become viable for home users to start having RAID-based PCs?
All said and done, many of us do keep fairly important data on our home PCs. How many of us make an effort to back it up?
Warm regards,
Sharad Agarwal
AlcoHaul: We lift spirits!
When you're setting up a RAID set using both striping and mirroring, do you want to set up two stripes and then mirror between the stripes (0+1), or do you want to set up mirrored pairs and then stripe those mirrored pairs (1+0)?
This is a quiz, and your data will grade you.
What you want, by far, is RAID 10 (1+0).
When you set up two stripes and then mirror across them, if you lose two disks, any disk in the first stripe and any disk in the second stripe, you lose all the data.
If you stripe across mirrored pairs, then the only way to lose data is to lose both drives in one of the mirrored pairs. After a first failure you can lose any disk other than the failed drive's mirror partner, or even several more disks, as long as no two of them are in the same mirrored pair.
The gap is smaller with 4 drives. At 6 drives and up, use 10. Your data and users will thank you for it.
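If you want to check the quiz answer yourself, here is a small Python sketch that counts which two-drive failures are fatal under each layout. The drive numbering is just my convention for the sketch, and it assumes the 0+1 setup behaves as described above: one dead drive takes its whole stripe out of service.

```python
from itertools import combinations
from math import comb


def fatal_pairs_raid10(n_drives):
    """Two-drive failures that kill RAID 1+0; drives 2k and 2k+1 mirror each other."""
    return sum(1 for a, b in combinations(range(n_drives), 2) if a // 2 == b // 2)


def fatal_pairs_raid01(n_drives):
    """Two-drive failures that kill RAID 0+1; first half is stripe A, second half stripe B."""
    half = n_drives // 2
    # Data is lost when the two failures land in different stripes.
    return sum(1 for a, b in combinations(range(n_drives), 2) if (a < half) != (b < half))


if __name__ == "__main__":
    for n in (6, 8):
        print(n, "drives:",
              fatal_pairs_raid10(n), "fatal pairs for 1+0,",
              fatal_pairs_raid01(n), "for 0+1, out of", comb(n, 2))
```

With 6 drives, 3 of the 15 possible double failures are fatal for 1+0 versus 9 of 15 for 0+1, and the gap keeps growing as you add drives.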
I am running RAID 5 on my desktop server right here. It has a 3-year-old P4 Gigabyte motherboard. It's not hotswappable because it's not enterprise level (and I don't plan on having to hotswap all of the time, only when shit happens) but it gives me the RAID 5 that I like to use as a backup, using software-based RAID on Ubuntu Linux. After the install, it would be just as easy for Grandma to use as if it were not RAIDed, and I am certain any /.er could figure out the install for most any Linux distro.
Can I have your money now?
I had my system hard drive fail fatally on me, emails and so forth, only some random backups elsewhere. Right then and there I decided that no more will a hdd failure steal my stuff from me and bought 4x120GB drives (size/price ratio at the time was optimum) and a Promise controller. Now I've got a ca. 240GB RAID 0+1 setup; mirroring gives redundancy and striping keeps the array at least as fast as those drives used separately.
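The 4x120GB to roughly 240GB arithmetic generalises. A quick Python sketch of usable capacity per RAID level, assuming identical drives and ignoring formatting overhead (the level names are just the strings this sketch happens to accept):

```python
def usable_capacity_gb(level, n_drives, drive_gb):
    """Usable space for a few common RAID levels, assuming identical drives."""
    if level == "0":
        if n_drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return n_drives * drive_gb          # pure striping, no redundancy
    if level == "1":
        if n_drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_gb                     # every drive holds the same data
    if level in ("0+1", "1+0"):
        if n_drives < 4 or n_drives % 2:
            raise ValueError("needs an even number of drives, 4 or more")
        return (n_drives // 2) * drive_gb   # half the drives are mirror copies
    if level == "5":
        if n_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n_drives - 1) * drive_gb    # one drive's worth goes to parity
    raise ValueError("unknown level: " + level)


if __name__ == "__main__":
    for level in ("0", "0+1", "5"):
        print("RAID", level, "on 4x120GB:", usable_capacity_gb(level, 4, 120), "GB")
```

So the same four drives would give 360GB under RAID 5, at the cost of the parity work on every write.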
One hdd did fail on that array, and I just replaced it with warranty replacement hdd. No hassle, just carefree usage.
The peace of mind is worth a LOT more than those extra drives. I DO NOT like the menial job of building the OS from zero to working state, just because of a hardware failure, WHEN I can just as well avoid it.
A probability of failure greater than zero (0) is not zero. And I like it to be zero.
-Is the meaning of life vanity, or is vanity the meaning of life?
It's amazing how common RAID is now, especially (S)ATA RAID.
In video editing, RAID is everything. External SATA RAID is the big thing now, and it works pretty well, even when it's OS based. What I haven't seen yet are (relatively) cheap SATA RAID 5 enclosures. That would be the Holy Grail of fast media storage.
Adblock must be doing its job.
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
I worked for years in development of RAID solutions for a major manufacturer. One of the problems with selling RAID solutions is the lack of understanding, or the prejudice and bias of the people who were supposed to be specifying and buying the hardware.
The 'tutorial' of the parent article talks in kindergarten terms, oversimplifications and obsolete terms, and overlooks some of the issues with using RAID. It's a good example of the true lack of understanding about the subject. By now, there are so many types of solutions that the term RAID hardly applies. But, even 10 years ago companies like Compaq had innovative redundant storage solutions that were enterprise ready.
Best regards.
As far as desktops are concerned - well, RAID and cheap just don't mix. For instance, if you just want reliability, RAID 1 is enough (2 drives). If you want reliability + fast writes, you need RAID 1+0, which means 4 drives (RAID 5 only gives faster reads). Furthermore, a good controller is crucial (from my experience, these generally cost upwards of 100$).
Finally, RAID does not subsume in any way a good backup system. I've seen cases where a damaged controller broke both harddrives in a RAID 1. However, for (most) desktop PCs, a good backup system does subsume RAID, since it's generally easy to just use a different computer, and get all the files from the backup.
For me, the excellent piece of software backuppc running on a cheap box (~300$) has worked like a charm. This might not look cheaper than RAID, but considering that I'm using just one box to back up 10 other machines, it's pretty good.
The Raven
um ... do raid in software and be done with it. That way the point of failure is either the kernel [roll back, quick fix] or the device [rush off and buy a new one].
RAID-1 is a simple way to get a "reliable" store.
Note that copying data to RAID [any of them] is *NOT* a backup solution. It's a "temp fix" for storing data.
I use my RAID-1 [two 200GB disks I bought for 130$ each] as a simple "place to dump nightlies" which I then backup to CDR weekly. I do rely on the redundancy of RAID-1 in case I trash my CVS or home dir but my longterm backup strategy includes the weekly CDRs as well.
Some tips for good "short term backuping"
1. Use RAID-1 not RAID-0
2. Use a good file system like reiserFS that's fairly immune to the side effects of being turned off in a hurry [ntfs and ext2 are not]
3. Don't use the raid drive as a home or other frequently accessed directory. Do your backups from a well tested cronjob. The less you play with your raid drive the less likely you are to delete/mess with the files on it.
4. Do backup the files from the raid to a medium like tape or CD/DVD on a regular basis.
5. If one drive should fail, do a backup before you restore the raid. That way if you mess up restoring the raid you don't lose data [particularly beginners do this]
Tom
Someday, I'll have a real sig.
No, it's not "Real-Time" but it suits our needs in our home office situation.
I use "Smart Synch" software to incrementally copy the desired directories from the working computers to a "Backup server", an older Celeron machine on the network. Separate partitions are set up for each computer that is being backed up. At Midnight the incremental backups are made.
Then at 2:00 a.m., Smart Synch running on the backup server makes another backup to a USB hard drive plugged into it. That USB HD is on a regular plug-in timer so that it only runs during the time of night when a backup to it is being done. The idea there is that the running time is limited and drive life is extended. Weekly, a backup DVD is burned and stored off site. Am I being anal? Maybe.
"Do the Right Thing. It will gratify some people and astound the rest." - Mark Twain
Ultimately, what it comes down to is that mirroring merely makes the hardware more reliable, it is not a backup technique.
It can be part of this nutritious breakfast^W^W backup technique:
0) shut down the box
1) swap a fresh/new/wiped drive for one of the mirrored drives
2) rebuild the RAID
3) store the just-pulled drive appropriately (e.g. off-site) along with a second identical RAID controller
Now if the machine goes completely belly-up (as in a fire) the user can install the secondary RAID controller and the data-laden drive in a fresh machine, add another fresh/new/wiped drive, and rebuild the RAID in the new machine. This may not be terribly convenient nor perfect for everyone but it will be effective.
Remember, kids: just because a particular technique doesn't perform a task all by itself (in this case RAID 1 != backup) that doesn't mean it can't be part of a larger picture.
I want to drag this out as long as possible. Bring me my protractor.
My father has trusted his data (against my advice) to fakeraid chipsets on his various motherboards twice. He just got done *losing* all of his data for the second time.
Best we can tell, he had one drive go without his RAID controller warning him; then had a second drive go, killing the array. He spent weeks with a dead PC playing with all kinds of special Windows bootloaders and disk recovery tools trying to get his files back.
Fakeraid sucks because it's just a line item on the sale of a modern motherboard. The inclusion of the "RAID" functionality is borderline fraudulent. REAL RAID controllers, of course, have a coprocessor and often battery backup and leave all of the storage details to themselves rather than some fly-by-night driver in the operating system.
It's no surprise that they come with virtually nothing in terms of recovery software.
The happy medium, I've discovered, is Linux md. md supports many RAID levels, and according to some benchmarks will certainly outrun fakeraid in performance (which doesn't particularly surprise me). The administration tools let you simulate drive failure, monitor array health, create degraded arrays, and the documentation tells you what to do when something goes wrong.
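As an example of how little it takes to watch array health with md, here is a Python sketch that scans /proc/mdstat-style text for arrays with a missing member (an underscore in the [UU] status field). The sample text is an illustration typed up for this sketch, not output captured from a real box.

```python
import re

# Illustrative sample in the style of /proc/mdstat; md1 is missing a member.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid5]
md0 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid5 hdg1[2] hde1[0]
      160086272 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member status field contains '_'."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a live system you would feed it the contents of /proc/mdstat from a cron job and send mail when the list is not empty, though mdadm's own monitor mode already does that job for you.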
If that RAID array is the only place you are storing all those home movies, I highly recommend you take a backup to some other kind of media. As other people have said, the RAID controller is a single point of failure. If you lose that, you lose the lot. And there's no guarantee that another controller will be able to rebuild it. Sad but true.
Serving Suggestion: Defrost
Losing data once due to hard drive failure may be all that is required to convince anyone that RAID is right for them, but why wait until that happens.'"
Isn't it human nature (or at least that's what it seems) to wait until something "bad" happens?
That goes for obese people, smokers and yes even computer geeks.
Why eat all the fat? I'll just burn em all!
- Wait til you're 40-50 and check that cholesterol strike...
So many people smoke and get away with it so I will too, right?
- Yeah wait til you get some health problem that will make you say "OH NOES!"
Why I need firefox? ActiveX hasn't screwed me.
- A week later "omfg whats all this junk, I want Firefox!"
Why Do I need RAID or even a burner? I got 3 hard drives that contains all my data!
- 8 months later, 1 hdd crashes "AH F*K ALL MY Pr0n!" and then he thinks of having a simple RAID 1 setup...
We always wait because we are lazy and cheap.
1. The average user does not need RAID.
2. The enthusiast does not need RAID.
3. RAID is not a replacement for backing up.
RAID is only good for two things: To protect against a hard drive failure when an uptime as close as possible to 100% is required; and to increase performance in the form of data throughput.
Alright, let's cover them one at a time. The average user (anyone who just buys a Dell to surf the web) doesn't need RAID. Hard drive failures are fairly uncommon. And to the 'average' user, 100% uptime isn't anywhere near a necessity. Also, data throughput isn't that important, either. (Besides, for most people, buying a faster hard drive is both technically better and more cost effective.)
Most enthusiasts (gamers, hardware tweakers, modders, etc,) don't need RAID. Again, hard drive failures are fairly uncommon, and your average enthusiast doesn't exactly store cures to cancer or rocket science on their drives. (No, the old MS Space Simulator doesn't count as rocket science.) Besides, most enthusiasts use RAID-0, which isn't really redundant. Which brings us to point B. The only thing RAID-0 measurably improves is sequential STR speed (Spindle-to-RAM, the speed of data going from the platter to the drive's internal cache.) And there are very few enthusiast tasks that do better from a raw higher throughput. (Even capturing video doesn't matter, as the slowest hard drive today can easily keep up with uncompressed HD!) Oooh, so your Doom3 level loads 1/2 a second faster. In single-player mode, it doesn't matter at all, and in multiplayer, the server waits for everyone to load anyway. RAID doesn't help for random seeks. A faster spindle helps for that. A better drive caching algorithm and a larger drive cache helps for that. If you have a 40GB, 5400RPM, 2MB cache drive, you'll see a tremendous improvement by going to a Raptor, or to a 15k RPM SCSI drive. You will see very negligible benefit by getting a second 40GB, 5400RPM, 2MB cache drive and putting them in a RAID-0.
Finally, the assertion that RAID prevents data loss. The only data loss it prevents is loss due to a failed hard drive. It doesn't protect against user error, viruses, or physical damage that destroys the whole computer. If your data is truly vital, you need to be backing up, even if you do use a RAID. Yes, businesses with vital information who need (as close as possible to) 100% uptime need RAID. That's it. Even then they still need backups.
Another non-functioning site was "uncertainty.microsoft.com."
The purpose of that site was not known.
One glaring error:
RAID can be run on any modern operating system provided that the appropriate drivers are available from the RAID controller's manufacturer. A computer with the operating system and all of the software already installed on one drive can easily be cloned to another single drive by using software like Norton Ghost. But it is not as easy when going to RAID, as a user who wants to have their existing system with a single bootable hard drive upgraded to RAID must start from the beginning. This implies that the operating system and all software needs to be re-installed from scratch, and all key data must be backed up to be restored on the new RAID array.
Again, wrong, wrong, wrong. There are hardware RAID 1 controllers that require no drivers and you don't have to do squat - just power down the server, install the RAID 1 on your IDE interface, plug in the new drive, hit the power, and away you go. The controller is smart enough to automatically sync up the two drives in the background.
-- Ed Carp, N7EKG erc@pobox.com PGP KeyID: 0x0BD32C9B What I'm up to: http://intuitives.mine.nu
I've been using software-RAID with ATA drives on Linux for quite some time, so I can comment on the behaviour of an array containing a faulty drive.
First off, let me emphasize how important it is to set up proper email notification (or pager etc.) for such cases! If you don't know about the failure, you're certain to get nice phonecalls from affected users.
If you've set up the notification system (smartd and mdadm come to mind), you'll eventually get an email saying something like "Device: /dev/hdc, ATA error count increased from 0 to 1" and that it would be a good idea to check up on the host's syslog.
Checking up on the system, you'll find that the average system load has increased substantially, which is due to the system trying to persuade the disk to write to a faulty sector and the software RAID having to compensate, queuing the errors.
Depending on how often the defective sector is tried to be written to, the load can increase to values of 10 and above, rendering the system unusable. This is a good time to halt it and replace the defective drive, partition it and "mdadm /dev/md0 -a /dev/hdc1" the new one into the array, starting to resync it right away.
This may sound really horrible, but in practise it's usually less than 60 minutes (counting from receipt of the first notification email) until normal operation can resume with such a system. This is assuming you have all spare parts stored somewhere on site.
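For reference, the mdadm side of that replacement can be sketched as a short command plan. This Python snippet only builds and prints the commands, it doesn't run anything. The device names are the ones from the example above; marking the old partition failed and removing it before the swap is my assumption about the full sequence, and -a is the short form of --add.

```python
import shlex


def replace_member_commands(array, failed_part, new_part):
    """Build mdadm manage-mode commands for swapping one array member."""
    return [
        # Before halting: take the bad partition out of the array.
        ["mdadm", array, "--fail", failed_part],
        ["mdadm", array, "--remove", failed_part],
        # After swapping and partitioning the new disk: add it, resync starts.
        ["mdadm", array, "--add", new_part],
    ]


if __name__ == "__main__":
    for cmd in replace_member_commands("/dev/md0", "/dev/hdc1", "/dev/hdc1"):
        print(shlex.join(cmd))
```

Since the new ATA disk goes on the same cable, it comes up under the same device name as the old one, which is why /dev/hdc1 appears on both sides.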
In general, I've found software RAID1 and software RAID5 on Linux to be exceptionally stable. I'm also very happy about the performance, given that all I'm using is a bunch of el-cheapo ATA disks. As for reliability, I'm convinced it can't be beat in the consumers' price range, since I've seen too many consumer grade hardware RAID controllers go down in a swirl when putting more than a light load on them.
In the enterprise, I've seen companies move to software RAID on their Linux systems, because they found out that their only 5-year-old enterprise hardware won't be getting any new spare parts anymore, which includes motherboards, CPUs and IO controllers. Moving to software RAID on enterprise grade SCSI stuff allows them to move the entire system to another piece of hardware simply by moving the hard disks to it.
Consider:
RAID 10 disadvantage: "All drives must move in parallel to proper track lowering sustained performance". In fact each drive can seek independently for reads and only pairs must seek together for writes.
RAID 1 advantage: "Transfer rate per block is equal to that of a single disk"
RAID 5 disadvantage: "Individual block data transfer rate same as single disk"
Would be nice if it was consistent about whether that's good or bad.
RAID5: "Highest Read data transaction rate" except for RAID 10, of course, where you've less chance of being bottlenecked because there are two sources for each stripe.
RAID5: "Medium Write data transaction rate", only the lowest of all except 50, because of the parity calculating and writing to a second drive.
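To make the parity point concrete, here is a toy Python sketch of RAID 5 style XOR parity on three tiny data blocks. The block contents are made up, and a real array works on much larger chunks and rotates the parity across drives.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# One stripe: three data blocks plus the parity block computed from them.
d0, d1, d2 = b"RAID", b"five", b"demo"
parity = xor_blocks(d0, d1, d2)

# Drive holding d1 dies: rebuild it from the survivors and the parity.
rebuilt_d1 = xor_blocks(d0, d2, parity)

# Small write to d1: read old data and old parity, then write new data
# and new parity. That extra parity write is the cost mentioned above.
new_d1 = b"FIVE"
new_parity = xor_blocks(parity, d1, new_d1)

if __name__ == "__main__":
    print(rebuilt_d1 == d1)
    print(new_parity == xor_blocks(d0, new_d1, d2))
```

Both checks come out true: any one lost block can be rebuilt from the rest, and the parity can be updated without reading the untouched data blocks.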