Raid 0: Blessing or hype?
Yoeri Lauwers writes "Tweakers.net investigates matters a bit more clearly and decides that AnandTech and Storagereview should think twice before they shout that "RAID 0 is useless on the desktop". Tweakers.net's tests illustrate the contrary"
... for simplicity. It is nice to have one "large" drive (in Windows) instead of spreading all of my files across smaller drives. Useless, it is not! Is it really very practical? I don't think so. I haven't had a disk fail yet, but when one does I will be glad I have backups!
"Initial success, or total failure!"
remin8.com
I'm sure that even here on Slashdot there are some people who aren't running huge multi-threaded database applications on their desktop machines, and for them, RAID-0 probably isn't going to help much.
But for the majority of us normal people who are running huge multi-threaded database applications on their desktop machines, RAID-0 is much nicer than having to manually allocate all of your database extents across your disks. Of course, RAID-10 would be better, but that would involve spending money...
I don't care what tests people have done or what benchmarks they're spouting off, RAID 0 works.
I used to have a system which used relatively cheap 5400 RPM drives in a RAID 0 array. There was a quite noticeable difference when not using RAID 0. When using 2 or 4 drives the system was damn fast even though the drives were individually slow.
I don't even read these articles. I know it makes a difference.
Actually, it's bye bye da or bye bye ta.
My computer is over three years old (P4 1.7 GHz upgraded to 384 MB of RAM from 128) and I've found that the slowest technological advancement seems to be hard drive throughput. This definitely shows, because games like Doom 3, Far Cry, and Painkiller are all perfectly playable on my computer, but the latter two games take an unbearably long time to load. When I build my next computer, RAID 0 is one of the things I will be looking at, because I absolutely hate waiting more than 5 seconds for a game to load.
(Yes, I'm aware that only 384 MB of RAM is slowing load times via virtual memory swapping as well)
It would be cool if it didn't suck.
A common misconception is that striping beyond 2 drives is "worthless." That simply isn't true: remember that the inside of the drives, close to the spindles, has a transfer rate that is nearly half what it is on the outside cylinders. By striping 4 drives together, about half the bandwidth is wasted near the FRONT of the drive, but near the tail, it's almost all being used. The effect is that the drive feels uniformly quick no matter what part of the drive you are reading from!
I personally jumped from a single drive to a 4-drive SATA raid-0 system, composed of 120GB drives from two different manufacturers.
The system screams.
I can't tell you how nice it is to have my computer boot in half the time... how your system feels like you always wished it would feel. You can add all the memory you want, all the processing power you want, but if you can't feed the computer, it's all pointless.
The only thing I wish now was that my system had a faster and/or wider bus that would allow me to take advantage of all the currently unused bandwidth available from the four drives.
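To put rough numbers on the inner/outer track argument and the bus limit mentioned above, here is a minimal sketch. The per-drive rates and the bus cap are hypothetical round figures chosen for illustration, not measurements of any drive or controller.

```python
def striped_throughput(per_drive_mb_s, n_drives, bus_cap_mb_s):
    """Aggregate sequential rate of an n-drive stripe, capped by the bus."""
    return min(per_drive_mb_s * n_drives, bus_cap_mb_s)

# Hypothetical figures: outer tracks twice as fast as inner tracks,
# and a shared bus that tops out well below four drives' outer-track rate.
OUTER_MB_S = 60.0
INNER_MB_S = 30.0
BUS_CAP_MB_S = 133.0

for n in (1, 2, 4):
    outer = striped_throughput(OUTER_MB_S, n, BUS_CAP_MB_S)
    inner = striped_throughput(INNER_MB_S, n, BUS_CAP_MB_S)
    print(f"{n} drive(s): outer {outer:.0f} MB/s, inner {inner:.0f} MB/s")
```

With these assumed numbers, the four-drive stripe is bus-limited on the outer tracks but nearly saturates the bus on the inner tracks, so the gap between the fast and slow ends of the array shrinks.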
A common theme, revisited several times, in the article is that the other conclusions were wrong because they used low-load testing.
"A safe conclusion would be that a Business Winstone 2004-benchmark alone is not a good starting point when testing RAID 0 performance. On the contrary: to have some reliable tests, we will need to put heavy loads on the array."
In essence, if my understanding is correct, they're saying that the value of a RAID 0 setup is under constant extreme loads, not the loads created by business applications or games. Isn't this entirely the point of the articles in question - that given the sporadic, generally light load of even power users, RAID 0 is not really that beneficial (as random access plays even more of a part than gross throughput)?
Even under perceived heavy I/O loads, the reality is often that the hard disk is under-used - I occasionally compress videos from miniDV to DVD, and my CPU would need a four or five fold increase in speed to even begin to put pressure on the single 7200 RPM hard disk.
Of course they do. After all, they've spent extra money and time pimping out their rigs.
What if you have one large disk?! loss of single disk = bye bye data... RAID-0 (or AID-0 since it has no Redundancy ;-) ) is simply for performance and for a single large virtual drive. And the article goes to prove just that.
Usually desktop users don't have much critical information on their computers (nothing that can't be saved on an ever-cheaper DVD) and don't mind reinstalling stuff every 3 (or more) years. They probably switch computers before one of the disks blows...
Just the thought of using RAID-0 makes me shiver. The only people who should use this are people who keep good backups, and like using them. The speed gains are of little use for individuals, and for the professionals or corporations that might actually want the speed-up, the chances of data-loss are too high.
That's not to say there isn't a purpose for RAID-0 - it teaches people how useful backups are. The hard way.
Just because you're paranoid doesn't mean there isn't an invisible demon about to eat your face
Actually closer to: 0:bebedt 1:y y aa
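The layout in the line above is what you get by striping "bye bye data" one character at a time across two disks. A small sketch of that idea follows; real arrays stripe in much larger blocks, and the function names here are made up for illustration.

```python
def stripe(data, n_disks):
    """Deal the characters of data round-robin onto n_disks 'disks'."""
    return [data[i::n_disks] for i in range(n_disks)]

def unstripe(chunks):
    """Interleave the per-disk chunks back into the original string."""
    out = []
    longest = max(len(c) for c in chunks)
    for i in range(longest):
        for chunk in chunks:
            if i < len(chunk):
                out.append(chunk[i])
    return "".join(out)

disks = stripe("bye bye data", 2)
for number, contents in enumerate(disks):
    print(f"{number}:{contents}")
print(unstripe(disks))
```

Lose either list and what remains is every other character, which is why a single failed member takes the whole set with it.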
Huh, writes slower on raid0? why on earth would that be? writes are just as fast as on a single drive on raid1, and writes are a bit slower on raid4 and raid5 due to parity updates, but that's it.. writes are not slower on raid0.
Just post the relevant Wiki information about RAID 0; we don't need RAID's life history ;).
RAID 0
A RAID 0 Array (also known as a stripe set) splits data evenly across two or more disks with no parity information for redundancy. RAID-0 is normally used to increase performance, although it is also a useful way to create a small number of large virtual disks out of a large number of small ones. Although RAID-0 was not specified in the original RAID paper, an idealized implementation of RAID-0 would split I/O operations into equal-sized blocks and spread them evenly across two disks. RAID-0 implementations with more than two disks are also possible; however, the reliability of a given RAID-0 set is equal to the average reliability of each disk divided by the number of disks in the set. That is, reliability (MTBF) decreases linearly with the number of members - so a set of two disks is half as reliable as a single disk. The reason for this is that the file system is distributed across all disks. When a drive fails the file system cannot cope with such a large loss of data and coherency since the data is "striped" across all drives. Data can be recovered using special tools; however, it will be incomplete and most likely corrupt.
RAID-0 is useful for setups such as large read-only NFS servers where mounting many disks is time-consuming or impossible and redundancy is irrelevant. Another use is where the number of disks is limited by the operating system. In Windows, the number of drive letters is limited to 24, so RAID-0 is a popular way to use more than this many disks. However, since there is no redundancy, yet data is shared between drives, hard drives cannot be swapped out as all disks are interdependent upon each other.
RAID 0 was not one of the original RAID levels.
Unlike other RAID-levels, RAID 0 does not offer protection against drive failure in any way, so it's not considered 'true' RAID by some (the 'R' in RAID stands for 'redundant', which does not apply to RAID-0).
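The rule of thumb quoted above (set MTBF equals per-disk MTBF divided by the number of disks) can be written out as a one-liner. This is a sketch of that approximation only; it assumes identical disks with independent, constant failure rates, and the 500,000-hour figure is a hypothetical placeholder rather than a spec for any real drive.

```python
def raid0_mtbf(disk_mtbf_hours, n_disks):
    """Approximate MTBF of a RAID-0 set of identical, independent disks."""
    return disk_mtbf_hours / n_disks

DISK_MTBF_HOURS = 500_000  # hypothetical per-disk figure

for n in (1, 2, 4):
    print(f"{n} disk(s): about {raid0_mtbf(DISK_MTBF_HOURS, n):,.0f} hours")
```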
When you have multiple hard drives, it's more likely that one will fail than if you just have one. For the obvious statistical reasons. Plus because of heat problems in many systems.
In a non-RAID setup with multiple hard drives, when one fails, you lose whatever was on that drive.
With RAID-n (for non-zero n), you lose nothing. You say "oh well", put in a spare drive, and send the old one back for replacement. (In the other order if you're cheap.) The array rebuilds itself. Without even shutting down the machine, if you have the hot-swappable drive cages.
With RAID-0, you lose everything on all of your hard drives.
RAID-0 is considerably less reliable than a single hard drive.
Since 2002, I have been using the SIIG Raid 0 http://www.siig.com/product.asp?pid=424 card on a 1999 Sawtooth G4 with 0.48TB of internal storage. Hardware-wise, this is an OEM Acard card; also available from Sonnet and Miglia.
No disk failures to date. I back up weekly with Apple's Backup 2.0.
Here are some benchmarks that compare software RAID 0 performance (included free with OS X) vs. hardware RAID 0: http://www.xlr8yourmac.com/OSX/OSX_RAIDvsIDE_Card_RAID.html
The next pasture is always greener
The same argument goes for mirrored as well.
Have you ever had a "sick" drive in a mirrored array? While that drive is still working, it gives out bad data that is then written to both drives during the update/write back. Then you have corruption on two drives instead of one.
The "safe" setup is RAID-5, but if you lose 2 drives you lose everything...
A service tech lost his balance while replacing a down drive in a HOT RAID-5. He fell backward while squatting to push in the new drive. He grabbed another drive in the same array to stop his fall, and pulled it out... A very bad shutdown for a production system, and a very long recovery.
Now service techs are required to sit in a chair when changing a drive below chest height.
It's not the interface that makes IDE drives less reliable, it's just that manufacturers want to keep server/workstation drives out of desktop machines for good reason - the 10/15kRPM drives need to be cooled, and as soon as people start to put them in desktop machines, they're gonna get a lot of warranty returns. Thereby lowering their profits further, and removing any advantage that they had.
There are two possible choices:
#1 has been done by WD with their Raptor drives, but they are still expensive, and have a low capacity to reduce heat.
#2 is unlikely to work unless all the manufacturers do it at once, which isn't going to happen. And, they can't separate the pro and consumer drives as easily as when the consumer drives were IDE and pro were on SCSI.
There just isn't anything in it for the drive makers.
Just follow this link. Same article. Standard colors. It's all in the "it.slashdot.org". Also try it with Apple color scheme.
'RAID 0 (Score:0, Redundant)' LOL!
The fact that you reply anon says it all. Tweakers.net has a fine reputation among the Dutch, which is shown by the huge amount of traffic on their site (even when not being slashdotted) and their member database on both the forums and the site.
The quality of their forums and their articles is very high, mostly concerning hardware.
The fact that this article was translated means they want to be a serious contender in this discussion against major English sites.
Writing an article in Dutch which shows the contrary of something said in English wouldn't be fair to those concerned, would it?
(:
Nope, reliability goes down.
Let's suppose that both the 80GB and the 160GB drives have a 10% probability of failing in a given month.
Now, with the 1x160GB you have a 10% chance of a failure this month, obviously. What's the probability for the two 80GB drives?
Well, since each one won't fail 90% of the time, the probabilities of both not failing is 81% (0.9*0.9). The rest, 19% is the possibility that one or both fail, therefore, instead of a 10% failure rate, you get 19%... nearly twice!
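The arithmetic above generalizes to any number of drives. A minimal sketch, assuming each drive fails independently with the same probability (the 10% per month is the figure used in this comment, not real-world data):

```python
def p_any_failure(p_single, n_drives):
    """Probability that at least one of n independent drives fails."""
    return 1.0 - (1.0 - p_single) ** n_drives

P_SINGLE = 0.10  # the 10% monthly figure supposed above

for n in (1, 2, 4):
    print(f"{n} drive(s): {p_any_failure(P_SINGLE, n):.1%} chance of losing the set")
```

For two drives this reproduces the 19% worked out above.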
So long as the operating system can take advantage of it, every spindle you add to your system will add performance. Windows does make it harder to take full advantage of multiple spindles, because you can't easily distribute disks to different parts of your file system to cut down on seeking, but using RAID 0 will help some.
Ideally, you should bring hot spots on the disk closer together, which is what filesystem optimization tools do, and have one disk for each "hot spot" on your system. %systemroot%, the swap partition, your system temporary files directory, your applications, and your profile could each be given a separate disk so that the disk head that's sitting there writing your cached files doesn't get hauled off to the other end of the disk to read a plugin from %systemroot% or write an old dirty block to the swapfile. Old timers will remember dedicated swap disks and swap partitions on every drive, fast dedicated /tmp disks, and other system tuning you could do on even medium-sized "big iron". I've done similar things on my FreeBSD home desktop and been quite pleased with the results, though IDE's limitations make it a lot harder to get a big win out of it than SCSI did.
With enough drives and an OS that's aware of the physical layout, you should be able to get the same kind of performance improvement from RAID 0 on Windows. Hardware RAID, of course, won't help much with the seeking problem because the OS doesn't know it's got two heads to do seek optimization on. Software RAID, if Windows is smart about seek optimization, should give you a superlinear speedup for many workloads.
A friend who works with NAS/SAN systems jokingly told me that a hard drive exists in only 2 states:
1) Failed
2) About to fail
Tb.
An OPEN mind is a beautiful thing...
The author produces a lot of words but shows remarkably poor insight. Examples: his failure to distinguish sequential access arrays from parallel IO arrays in the introduction, the poor showing of the RAID 5 tests (conveniently avoiding writes in those tests), the difference between RAID techniques and caching, and the association of PCI as the performance limiter in the Promise controller.
The fact is that the article readily admits that desktop workloads show poor average IOps (under 1.5) and modest average IO size (23K). Those numbers prove that there is little opportunity to accelerate performance either with parallel access or random access designs. The first tests show clearly that the IO sizes in question leave little opportunity for large transfer gains while the lack of decent command queue depths rules out good load balance with larger stripe sizes. Interestingly, the author didn't provide the stripe size for that test. It's easy to deduce from the chart but it demonstrates his limited grasp of the subject matter.
Regarding the tests dispelling the myth of poor RAID 5 performance, hardly! Poor RAID 5 performance is no myth. First off, the RAID 5 configuration was trounced by lesser RAID 0 IDE drives. Second, the benchmarks consistently avoided writes, notably small writes, where RAID 5 massively fails, and used a large writeback cache to further hide write performance and to make the configuration shine in small read tests. If you are going to sing the praises of RAID 5 for data protection you should probably mention the data integrity disaster that writeback caches introduce. If I were offering the RAID 5 config myself I would feel like I just got my ass kicked.
Ultimately this article is nothing other than a rant by someone who disagrees with others' contention that RAID 0 is of limited benefit. He justifies his position by saying that performance matters when "performance matters", that is specifically when you create disk-intensive loads you can see a benefit. Well, no shit. When you create large command queue depths through multiple disk-intensive processes then you will benefit. Again, no shit. Boot times can get shaved a little. Big deal. Beyond that he doesn't know what he is talking about. There's a big difference between RAID 0 being theoretically capable of superior performance and it being a performance value to a desktop user. This is a subjective matter and he fails to make his case. Just how often does he or any other "power user" actually benefit from these unusual workloads and is that often enough to justify the costs?
Tweakers.net is kinda like /. except it focuses on tech, and where you get some real rocket scientists posting on /., tweakers.net seems to have more kids. Maybe a lack of moderation?
So just as some people hate /. some hate tweakers.net or some other tech site. It all depends on whether the site agrees or disagrees with their point of view.
Nothing upsets some people more than reading that their latest purchase is a piece of shit.
I agree about the language. While speaking Dutch is all nice and local, it stops it being useful to roughly 99% of the world population. It is not like Dutch people can't read English well enough for even the techiest of articles.
So I partly agree with the parent and disagree with the grandparent. Tweakers.net is just another tech site with its share of bullshit and crap. No better or worse than any other site.
As to my opinion on raid 0 (I use several raids myself, including raid 0 for a while), it is definitely faster. Doesn't matter that much for me since only games require the regular loading of stuff from disk in a speedy fashion and the improvement can be lived without. So a level loads a few seconds faster. Yippie. Then again, raid 0 is pretty cheap and if you want those couple of seconds then it makes perfect sense.
The people who are against raid 0 are the same who are against dual processor or large amounts of ram etc etc. They can't afford it and therefore it must suck. Ignore or pity such people but never take their advice. They are truly the ones who said 640k should be enough for anyone (unlike bill gates who apparently never said it).
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
1. Anandtech and StorageReview benchmarked RAID 0 and found that, for desktop applications, RAID 0 is slightly slower than a single drive, because the things that RAID 0 is good at are not the things that desktops need.
2. So we changed the benchmarks to really need the things that RAID 0 is good at.
3. And now, RAID 0 improves things!
4. Therefore, the benchmarks in #1 were wrong.
Summary of the summary:
I'm looking for my keys under this lamppost because the light is better here.
Well I got a very simple solution to that, one that the overclockers I care about all use. It is called a small server with real Raid to store all the "real work" they got.
The game machine is the game machine and it doesn't need to have a long life as it won't be around longer than a year anyway.
Raid 0 fits in the "getting 1% extra fps" scene. It does not fit in the office scene.
AnandTech and a whole lot of /.ers just don't seem to get that to some people every bit of extra speed is worth it. You would review a Ferrari as a lesser car than a Ford Focus since a Ferrari costs more and who needs the speed.
Does speed matter? Oh yeah, does reliability? Hell no, this ain't a server. Only thing I could lose is a few hours reinstalling Windows and my games. I do that often enough anyway whenever a new piece of hardware arrives.
You people with all your performance in mind seem to forget the time it takes to restore a lost drive. And sometimes the information may be impossible to recover. Hey look, I can save 1 minute transferring a gigabyte of information. The next month... Man, I spent 3 days putting all my stuff back onto my drive after it crashed. Using raid 0 is useless even with any speed increase. If you are doing anything important you may want to use the higher RAID levels so you get the performance and the redundancy. Yeah, the drives will cost more, but it is worth the investment.
On a different note, I really wish that laptops and desktops came with dual hard drives standard, with hardware RAID 1 installed. Especially laptops.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
RTFA.
Most of the time games spend loading is not disk bound, but CPU bound. Decompressing pack files, initializing BSP trees, etc.
Every modern disk can load 50 MB/s. The largest Quake 3 level has 27 MB.
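Taking the two figures above at face value (50 MB/s and a 27 MB level), the pure disk transfer time works out as follows. This is a back-of-the-envelope sketch that ignores seeks, fragmentation and decompression.

```python
def transfer_seconds(size_mb, rate_mb_s):
    """Time to read size_mb sequentially at rate_mb_s."""
    return size_mb / rate_mb_s

LEVEL_MB = 27.0      # figure quoted above
DISK_MB_S = 50.0     # figure quoted above

single = transfer_seconds(LEVEL_MB, DISK_MB_S)
striped = transfer_seconds(LEVEL_MB, 2 * DISK_MB_S)  # ideal 2-disk stripe
print(f"single disk: {single:.2f} s, ideal 2-disk stripe: {striped:.2f} s")
```

Even an ideal two-disk stripe only saves about a quarter of a second here, which is the point being made: the rest of a long load screen is spent elsewhere.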
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
IF you have a decent RAID controller, RAID1 is faster than RAID0 for reads (not writes), this is because with RAID1 the data isn't striped - the same data is on all the drives, so the system can read from the most convenient drive (lower latency), and then do read interleaving after that. Whereas with RAID0, the system has to wait for the drive holding the stripe with the desired data.
So RAID 0 is OK if you are sequentially reading/writing large blocks (large relative to the stripe size). But it's not so good for small random reads or writes - which could be the case in some desktop situations.
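The "large relative to the stripe size" condition can be made concrete by working out which disks a given request touches. The sketch below assumes simple round-robin block striping; the 64 KiB stripe size is a hypothetical example value, not one taken from the article.

```python
def disks_touched(offset, length, stripe_size, n_disks):
    """Set of disk indices covered by a request of `length` bytes at `offset`."""
    if length <= 0:
        return set()
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    # A run of n_disks or more consecutive stripes covers every disk.
    if last - first + 1 >= n_disks:
        return set(range(n_disks))
    return {stripe % n_disks for stripe in range(first, last + 1)}

STRIPE = 64 * 1024  # hypothetical stripe size

print(disks_touched(0, 4 * 1024, STRIPE, 2))      # small read
print(disks_touched(0, 1024 * 1024, STRIPE, 2))   # large sequential read
```

A small read lands on a single disk and gains nothing from the second spindle unless other requests are queued at the same time, while a large read keeps both disks busy.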
For decent performance and reliability go RAID1+0, instead of RAID5 (which seems popular amongst many of the obviously ignorant here). RAID5 sucks for writes. RAID5 is only if you want _lots_ more capacity with some redundancy and write performance isn't important.
As far as I see, disk speed is a bigger issue than disk capacity. Capacity has increased faster than drive speeds have.
The same argument goes for mirrored as well.
Exactly. RAID should never be used in place of backups. How often could you have potentially lost data to human error in software usage/config? And how often to hardware failure?
If you suffer from human error with the software while relying on RAID, you lose your data and get a rude lesson in RAID and backups addressing mutually exclusive problems.
Real RAID buys production systems time to keep going while a (hopefully) hot spare rebuilds or a replacement disk gets delivered for rebuilding.
RAID saves productivity during what should be only a brief period of vulnerability. Backups prevent complete loss. People who use RAID as a backup don't understand the limits of RAID or the value of real backups.
So many times, I have had to order tapes from a data bank because some user deleted an "important" file from RAID protected storage.
Hell, I am a sole trader, who legally must keep business records for tax purposes, etc. I rsync my records, email, web site, server configs, site documentation, etc across 5 different machines, spanning 3 different architectures and 5 different OSes. On top of that I keep rotating weekly (CDRW) and permanent monthly (CDR) backups.
This might seem like paranoia, but I do it because I easily can and CDR's are cheap. I can also move to any of those machines and resume my emailing, invoicing, doco, etc. Take one of them on the road (Thinkpad or iBook) if business or disaster dictates and not worry that a worm is going to prevent me from earning my living or answering to the tax man.
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
There are a number of desktop applications that can benefit from RAID-0, but you have to be smart about it.
RAID-0 is perfect for booting an OS and loading large applications. It's also excellent for swap space, and initializing your JVM.
It's less well suited, however, for small documents and anything important (like documents).
Thus, my strategy would be to use a RAID-0 array for my OS, JVM, applications, and swap space, and a non-RAID drive for application data. A good way to achieve this on Linux would be to format the single non-RAID drive and mount it as /home, and install everything else onto the RAID array.
Seems to be a good strategy for a desktop system to me. Add in some backup for the single disk mounted as /home, and if anything goes wrong with the RAID, your important data is completely protected.
Yaz.
This is incorrect. RAID 1 would be faster than RAID 0 for read workloads where there was (1) sufficient command queue depth, and (2) a catastrophic imbalance in the workload that prevented the RAID 0 array from utilizing its disks. Since the second case never happens (except in improper configurations), RAID 0 will outperform RAID 1 with identical numbers of disks. RAID 1 can have more than two disks (requires an even number) although some foolishly believe that striping in RAID 1 makes it RAID 10 or 1+0 or 0+1. Please read Patterson.
Assuming a two drive RAID 1 versus RAID 0 in a small random read environment with sufficient queue depth, the RAID 0 array provides twice the working capacity of the RAID 1 and therefore its relative seek distances are smaller. Remember that the data set and IO sizes don't change simply because the array is larger. The RAID 0 array will still see full utilization of both spindles due to the random access nature and sufficient queue depth. The array is faster for the same reason that a 200GB 2 platter drive is faster than its 100GB 1 platter stablemate. Fewer cylinder switches and shorter seeks.
There seems to be this myth that RAID is only for accelerating large sequential transfers. Nothing is further from the truth. Random IO workloads constitute the bulk of all RAID applications and RAID 0 is king of performance with identical drive counts. When RAID 1 is characterized as faster than RAID 0 it is referring to identical "data drive" counts.
Everyone keeps mentioning the lack of fault tolerance in Raid 0. Personally I do not know anyone who runs a Raid 0 configuration on drives that contain data that would be considered important.
Personally I'm a hardcore gamer, and I run Raid 0 on two WD SATA 36G Raptors. These drives are used for my system drive and where I install my apps. Anything that is important is shoved off to a set of big, slow IDE drives that are running in a Raid 1 configuration.
So MTBF really doesn't matter to me, as when one of the drives fails it takes me a grand total of 18 minutes to reinstall Windows XP (timed it), add in another hour for driver configuration and updates, and I'm back to where I was before the drive failed.
Raid 0 can work out just fine, as long as you realize its limitations and store your data accordingly.
Gailin
I wish there was a fscking blue pill
Well, RAID is not always redundant. The RAID talked about in this article isn't redundant. In fact, it's "less" redundant than a drive with 0 redundancy, since each drive is now sensitive to failures in the others. It's negative redundancy (in fact, RAID 0 is often called "not true RAID", since the "R" in RAID stands for "redundant").
And, yes, a good RAID controller is expensive and often not available for a PC chassis & mobo.
All's true that is mistrusted
Makes you wonder why Linux and other Unices have everything under one "/"... the convenience factor is amazing :)
With NFS, cdroms, USB cards and harddisks in /mnt/, life is a lot easier.
Imagine this
bash$ ln -sf /dev/sda1 /dev/camera
bash$ mount /dev/camera /mnt/camera
One "/" to root them all , eh ?
Quidquid latine dictum sit, altum videtur
Everyone is concerned about redundancy and increased probability of one drive failing the more drives you have. Yes. If you are comparing it to RAID 1, RAID 1+0 or RAID 5. But we are talking desktop systems. You, know, the ones with single drives. So you already have no redundancy to begin with and now you are adding speed. And a somewhat increased risk of drive failure. However, I already have 3 separate drives on my system, so I am only going to get the speed benefits.
-- "You can lead a yak to water, but you can't teach an old dog to make a silk purse out of a pig in a poke" - Opus
If you read Anandtech's article, you'll see that his test only covered the fastest drive available, the 10K rpm WD Raptor. The price per GB (in Canadian dollars) for this drive where I live is $4.61/GB, compared to $0.86/GB for a WD Caviar drive.
What I was looking for was a 0+1 array, striped and mirrored, using inexpensive drives. I'm one of those old fashioned people that didn't switch to using "independent".
So Anand shows that if you take the fastest drive available, you don't get much by striping it. But what about the average 7200rpm drive, is there a performance increase? Does it get close to a single raptor?
What would you think about using a Raptor as your main drive where applications would reside, and a mirrored array of inexpensive 200GB drives to store your various collections of files? Would that be a better choice?
> but you've spent four times as much, and, more importantly in my mind, your probability of failure has increased from P to P^4.
... I've been putting some stuff on Sync'd non-swappable Ram Disks - makes a hell of a difference for proper apps who mmap the file instead of reading it into the core.
The probability actually went from P to 1 - (1 - P)^4
p*p*p*p is LESS THAN p for probability terms (0 < p < 1.0)
You calculated the chances of ALL 4 failing together. But Raid-0 has a problem with even one failing, which has probability 1 - (1 - P)^4 (assuming independent failures), and that is obviously higher than P.
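A quick numeric check of the two quantities being mixed up in this exchange, assuming the four drives fail independently with the same probability P: P^4 is the chance that all four fail, while the chance that at least one fails (which is what kills a RAID-0 set) is 1 - (1 - P)^4. The value P = 0.1 below is only an illustrative choice.

```python
def p_all_fail(p, n):
    """Probability that every one of n independent drives fails."""
    return p ** n

def p_at_least_one_fails(p, n):
    """Probability that one or more of n independent drives fails."""
    return 1.0 - (1.0 - p) ** n

P = 0.1  # illustrative per-drive failure probability
N = 4

print(f"all {N} fail:      {p_all_fail(P, N):.4f}")
print(f"single drive:    {P:.4f}")
print(f"at least 1 of {N}: {p_at_least_one_fails(P, N):.4f}")
```

The first number is far below P and the last is well above it, so the direction of the correction stands: striping across four drives makes data loss more likely, not less.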
Anyway, Raid-0 makes sense if you're doing stuff like Video Editing for the Desktop
As the probability of failure becomes smaller and smaller, the probability of a failure in at least one of two drives gets closer and closer to exactly double. Even if your failure probability was 0.01% for one drive, then the failure probability of two drives would be 0.019999%.
The fans and other components make it more complicated, but still make RAID-0 often a lot messier. Suppose a cooling fan does die, it might not instantly kill the drive, but will shoot the probability of failure way up. So now you are back to a 10% failure probability or something, and you still end up with a 19% probability of failure in at least one drive. This does assume that both drives are cooled by the same fan, but if they are cooled by different fans, you now have a larger probability of a fan failure and we are still back to the same problem.
That said, I don't think there is a problem with RAID-0 if you think the gain is worth the cost and you don't mind the decreased reliability.