Raid 0: Blessing or hype?
Yoeri Lauwers writes "Tweakers.net investigates matters a bit more clearly and decides that AnandTech and Storagereview should think twice before they shout that "RAID 0 is useless on the desktop". Tweakers.net's tests illustrate the contrary"
... for simplicity. It is nice to have one "large" drive (in Windows) instead of spreading all of my files across smaller drives. Useless, it is not! Is it really very practical? I don't think so. I haven't had a disk fail yet, but when one does I will be glad I have backups!
"Initial success, or total failure!"
remin8.com
Huh, writes slower on raid0? why on earth would that be? writes are just as fast as on a single drive on raid1, and writes are a bit slower on raid4 and raid5 due to parity updates, but that's it.. writes are not slower on raid0.
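To illustrate the parity point, here is a minimal sketch of a RAID 5 style small write, assuming plain XOR parity over equal-sized blocks. The function names and block contents are made up for the example and do not come from any real controller or driver.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after overwriting one data block: P' = P xor D_old xor D_new."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


# Three data blocks plus one parity block (hypothetical contents).
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xaa\xbb"
parity = xor_blocks(xor_blocks(d0, d1), d2)

# Overwriting d1 means reading the old d1 and the old parity first,
# then writing the new d1 and the new parity: two reads and two writes
# for a single logical write. RAID 0 just writes the one block.
new_d1 = b"\xff\x00"
parity = updated_parity(parity, d1, new_d1)
d1 = new_d1

# The updated parity can still rebuild any single lost block, e.g. d2.
recovered_d2 = xor_blocks(xor_blocks(parity, d0), d1)
```

The extra read-modify-write traffic is the "parity update" cost; RAID 0 and RAID 1 have no equivalent step.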
Just post the relevant Wiki information about RAID 0; we don't need RAID's life history ;).
RAID 0
A RAID 0 array (also known as a stripe set) splits data evenly across two or more disks with no parity information for redundancy. RAID-0 is normally used to increase performance, although it is also a useful way to create a small number of large virtual disks out of a large number of small ones. Although RAID-0 was not specified in the original RAID paper, an idealized implementation of RAID-0 would split I/O operations into equal-sized blocks and spread them evenly across two disks. RAID-0 implementations with more than two disks are also possible; however, the reliability of a given RAID-0 set is equal to the average reliability of each disk divided by the number of disks in the set. That is, reliability (MTBF) decreases linearly with the number of members, so a set of two disks is half as reliable as a single disk. The reason for this is that the file system is distributed across all disks. When a drive fails, the file system cannot cope with such a large loss of data and coherency, since the data is "striped" across all drives. Data can be recovered using special tools, but it will be incomplete and most likely corrupt.
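As a rough sketch of the striping described in that excerpt, assuming simple round-robin placement of fixed-size stripe units (real controllers may lay data out differently), the mapping from a logical offset to a disk looks something like this. The names `locate` and `array_mtbf` are invented for the example, and the MTBF function only encodes the excerpt's "divide by the number of disks" rule of thumb.

```python
from typing import Tuple


def locate(logical_offset: int, num_disks: int, stripe_size: int) -> Tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    stripe_index = logical_offset // stripe_size   # which stripe unit overall
    disk = stripe_index % num_disks                # round-robin across the disks
    row = stripe_index // num_disks                # stripe row on each disk
    return disk, row * stripe_size + logical_offset % stripe_size


def array_mtbf(disk_mtbf_hours: float, num_disks: int) -> float:
    """Rule of thumb from the excerpt: array MTBF = disk MTBF / number of disks."""
    return disk_mtbf_hours / num_disks


STRIPE = 64 * 1024  # 64K stripe size, chosen only for illustration

first_unit = locate(0, num_disks=2, stripe_size=STRIPE)                 # on disk 0
second_unit = locate(STRIPE, num_disks=2, stripe_size=STRIPE)           # on disk 1
third_unit = locate(2 * STRIPE + 10, num_disks=2, stripe_size=STRIPE)   # back on disk 0
two_disk_mtbf = array_mtbf(1_000_000, 2)
```

Because consecutive stripe units alternate between disks, losing either disk leaves holes through every large file, which is why recovery tools return incomplete data.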
RAID-0 is useful for setups such as large read-only NFS servers where mounting many disks is time-consuming or impossible and redundancy is irrelevant. Another use is where the number of disks is limited by the operating system. In Windows, the number of drive letters is limited to 24, so RAID-0 is a popular way to use more than this many disks. However, since there is no redundancy, yet data is shared between drives, hard drives cannot be swapped out, as all disks are interdependent.
RAID 0 was not one of the original RAID levels.
Unlike other RAID-levels, RAID 0 does not offer protection against drive failure in any way, so it's not considered 'true' RAID by some (the 'R' in RAID stands for 'redundant', which does not apply to RAID-0).
When you have multiple hard drives, it's more likely that one will fail than if you just have one. For the obvious statistical reasons. Plus because of heat problems in many systems.
In a non-RAID setup with multiple hard drives, when one fails, you lose whatever was on that drive.
With RAID-n (for non-zero n), you lose nothing. You say "oh well", put in a spare drive, and send the old one back for replacement. (In the other order if you're cheap.) The array rebuilds itself. Without even shutting down the machine, if you have the hot-swappable drive cages.
With RAID-0, you lose everything on all of your hard drives.
RAID-0 is considerably less reliable than a single hard drive.
Since 2002, I have been using the SIIG Raid 0 http://www.siig.com/product.asp?pid=424 card on a 1999 Sawtooth G4 with 0.48TB of internal storage. Hardware-wise, this is an OEM Acard card; also available from Sonnet and Miglia.
_ RAID.html
No disk failures to date. I back up weekly with Apple's Backup 2.0.
Here are some benchmarks that compare software RAID 0 performance (included free with OS X) vs. hardware RAID 0: http://www.xlr8yourmac.com/OSX/OSX_RAIDvsIDE_Card
The next pasture is always greener
Here's the whole thing - I *have* tried it. If your workload involves lots of long, sequential reads, it's a great thing. I've personally got 2 machines running drives in RAID 0 as they get used for working with files in the 1.5-2GB range. It makes a difference here.
The whole point of SR and AT's articles, however, is that for most desktop systems, RAID 0 is pretty much a bad idea. You'll see marginal improvement on more random data sets, but you've spent four times as much, and, more importantly in my mind, your probability of failure has increased from P to P^4.
So really, I can see some applications where RAID 0 can be useful - I fit one of them. But for most desktop systems, it's not worth the cost. For systems with more than 2 drives anyway, it seems like a patently Bad Idea(TM). You really should've gone with RAID 5 - you'd still have striping, but you don't risk losing everything to a single faulty drive.
Just follow this link. Same article. Standard colors. It's all in the "it.slashdot.org". Also try it with Apple color scheme.
The fact that you reply anon says it all. Tweakers.net has a fine reputation among the Dutch, which is shown by the huge traffic on their site (even when not being slashdotted) and their member database on both the forums and the site.
The quality of their forums and their articles is very high, mostly concerning hardware.
The fact that this article was translated means they want to be a serious contestant in this discussion against major English sites.
Writing an article in Dutch which shows the contrary of something said in English wouldn't be fair to those concerned, would it?
(:
Shouldn't that be 1-(1-p)^4?
p^4 would give you a decreased failure probability.
So say there is a 1% chance of failure over 3 years for a given drive. Using the first formula, using 4 drives in RAID 0 would increase the chance of at least one drive failing (and consequently all) to 3.94%.
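A quick way to check those numbers, as a sketch that assumes drive failures are independent and that every drive has the same failure probability p over the period (the function name is made up for the example):

```python
def array_failure_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent drives fails: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n


p = 0.01  # 1% chance of a single drive failing over the period
four_drive_raid0 = array_failure_probability(p, 4)
all_four_fail = p ** 4  # what p^4 actually measures: every drive failing

print(f"at least one of 4 fails: {four_drive_raid0:.2%}")
print(f"all 4 fail:              {all_four_fail:.6%}")
```

The first figure is the one that matters for RAID 0, since a single failed member takes the whole array with it.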
IF you have a decent RAID controller, RAID1 is faster than RAID0 for reads (not writes), this is because with RAID1 the data isn't striped - the same data is on all the drives, so the system can read from the most convenient drive (lower latency), and then do read interleaving after that. Whereas with RAID0, the system has to wait for the drive holding the stripe with the desired data.
So RAID 0 is OK if you are sequentially reading/writing large blocks (large relative to the stripe size). But it's not so good for small random reads or writes - which could be the case in some desktop situations.
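To make "large relative to the stripe size" concrete, here is a small sketch that counts how many member disks a single request touches, assuming round-robin striping with a fixed stripe size. The function name and the example sizes are illustrative only.

```python
def disks_touched(offset: int, length: int, num_disks: int, stripe_size: int) -> int:
    """Number of distinct disks a request of `length` bytes at `offset` spans."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_unit = offset // stripe_size
    last_unit = (offset + length - 1) // stripe_size
    # Consecutive stripe units sit on consecutive disks, so the count is the
    # number of units spanned, capped at the number of disks in the array.
    return min(last_unit - first_unit + 1, num_disks)


STRIPE = 64 * 1024  # 64K stripe, illustrative

small_read = disks_touched(0, 4 * 1024, num_disks=2, stripe_size=STRIPE)      # 4K read
large_read = disks_touched(0, 1024 * 1024, num_disks=2, stripe_size=STRIPE)   # 1M read
straddling = disks_touched(STRIPE - 2048, 4 * 1024, num_disks=2, stripe_size=STRIPE)
```

A single small request usually lands on one disk, so it gains nothing by itself; any speedup for small I/O has to come from several requests being in flight at once and landing on different disks.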
For decent performance and reliability go RAID1+0, instead of RAID5 (which seems popular amongst many of the obviously ignorant here). RAID5 sucks for writes. RAID5 is only if you want _lots_ more capacity with some redundancy and write performance isn't important.
As far as I see, disk speed is a bigger issue than disk capacity. Capacity has increased faster than drive speeds have.
This is incorrect. RAID 1 would be faster than RAID 0 for read workloads where there was (1) sufficient command queue depth, and (2) a catastrophic imbalance in the workload that prevented the RAID 0 array from utilizing its disks. Since the second case never happens (except in improper configurations), RAID 0 will outperform RAID 1 with identical numbers of disks. RAID 1 can have more than two disks (requires an even number), although some foolishly believe that striping in RAID 1 makes it RAID 10 or 1+0 or 0+1. Please read Patterson.
Assuming a two drive RAID 1 versus RAID 0 in a small random read environment with sufficient queue depth, the RAID 0 array provides twice the working capacity of the RAID 1 and therefore its relative seek distances are smaller. Remember that the data set and IO sizes don't change simply because the array is larger. The RAID 0 array will still see full utilization of both spindles due to the random access nature and sufficient queue depth. The array is faster for the same reason that a 200GB 2 platter drive is faster than its 100GB 1 platter stablemate. Less cylinder switches and shorter seeks.
There seems to be this myth that RAID is only for accelerating large sequential transfers. Nothing is further from the truth. Random IO workloads constitute the bulk of all RAID applications and RAID 0 is king of performance with identical drive counts. When RAID 1 is characterized as faster than RAID 0 it is referring to identical "data drive" counts.
Now, that is not true. If d is the chance of failure in a given time interval for a single disk then the chance of failure in the same time interval for a two-disk RAID-0 is 2d - d^2. For small d, this is roughly equal to 2d (or, more generally, nd for n drives). Thus, the chance of failure goes up (at most) linearly.
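For anyone who wants the algebra spelled out, here is a short derivation, assuming the drives fail independently and each with the same probability d in the interval:

```latex
\begin{align*}
P_{\text{fail}}(2) &= 1 - (1 - d)^2 = 2d - d^2 \\
P_{\text{fail}}(n) &= 1 - (1 - d)^n
                    = nd - \binom{n}{2} d^2 + \dots \\
P_{\text{fail}}(n) &\le nd
  \quad \text{since } (1 - d)^n \ge 1 - nd \text{ for } 0 \le d \le 1
\end{align*}
```

The array survives only if every drive survives, which has probability (1 - d)^n; the failure chance is the complement, and for small d the higher-order terms are negligible, leaving roughly nd.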
> but you've spent four times as much, and, more importantly in my mind, your probability of failure has increased from P to P^4.
... I've been putting some stuff on Sync'd non-swappable Ram Disks - makes a hell of a difference for proper apps who mmap the file instead of reading it into the core.
The probability actually went from P to P ^ 0.25
p*p*p*p is LESS THAN p for probability terms (0 < p < 1.0)
You calculated the chances of ALL 4 failing together. But Raid-0 has a problem with even one failing which is the 4th root of P , which is obviously higher.
Anyway, Raid-0 makes sense if you're doing stuff like Video Editing for the Desktop
Quidquid latine dictum sit, altum videtur
As the probability of failure becomes smaller and smaller, the probability of a failure in at least one of two drives gets closer and closer to being doubled. Even if your failure probability was 0.01% for one drive, the failure probability of two drives would be 0.019999%.
The fans and other components make it more complicated, but still make RAID-0 often a lot messier. Suppose a cooling fan does die, it might not instantly kill the drive, but will shoot the probability of failure way up. So now you are back to a 10% failure probability or something, and you still end up with a 19% probability of failure in at least one drive. This does assume that both drives are cooled by the same fan, but if they are cooled by different fans, you now have a larger probability of a fan failure and we are still back to the same problem.
That said, I don't think there is a problem with RAID-0 if you think the gain is worth the cost and you don't mind the decreased reliability.
You're mixing statistics from different tests. The average queue depth was 1.34 I/Os in the Storage Review Office DriveMark 2002 trace and 3.22 I/Os in the Business Winstone 2004 trace from Tweakers.net. The low queue depth in SR's trace is one of the probable reasons for the disappointing performance scaling in the RAID 0 benchmarks from Storage Review.
The results of the RankDisk tests show a respectable performance gain of 36 percent in office workloads on a low-end Promise FastTrak S150 TX2plus with two Raptor WD740GD drives. So even though the average queue size was pretty low and most (75%) of the transfer sizes were lower than the stripe size (64K), the Promise controller managed to show a significant performance increase.
Regarding the tests dispelling the myth of poor RAID 5 performance, hardly! Poor RAID 5 performance is no myth.
We have many benchmark results of RAID 5 configurations available in our Benchmark Database. It is true that most SATA RAID adapters have limited scalability and performance improvement in RAID 5. Many controllers won't scale beyond a 30-40 percent performance improvement over single drive configurations. Still, a 30 to 40 percent performance improvement cannot be considered 'poor' performance.
The MegaRAID SCSI 320-2X / Intel SRCU42X is a different beast. Thanks to a fast I/O processor, fast DRAM interface, large cache and high-performance RAID software stack, this adapter shows truly remarkable performance in RAID 5 configurations. The 320-2X provides some insight into the performance of future (SATA) RAID 5 adapters.
First off, the RAID 5 configuration was trounced by lesser RAID 0 IDE drives
So you expect a 4-drive SCSI RAID 5 configuration to be faster than a 4-drive RAID 0 configuration using faster SATA drives? Get real.
Second, the benchmarks consistently avoided writes, notably small writes, where RAID 5 massively fails, and used a large writeback cache to further hide write performance and to cause the configuration to shine in small read tests.
Our desktop tests take more than two hours to run on a fast configuration and consist of almost half a million I/O operations with varying characteristics. In many of the tests, read I/O is outweighed by write I/O and the average I/O size is below 50K. Modern RAID implementations have no problems with these types of access patterns.
If you are going to sing the praises of RAID 5 for data protection you should probably mention the data integrity disaster that writeback caches introduce.
That's why you can put a battery backup unit on any decent RAID 5 controller.
[quote] There's a big difference between RAID 0 being theoretically capable of superior performance and it being a performance value to a desktop user. This is a subjective matter and he fails to make his case. [/quote]
This case is clearly being made. For many power users, RAID 0 will improve performance just like moving to a faster hard drive will improve performance. The differences will be subtle improvements in responsiveness, not visible to everyone, better performance in multi-tasking scenarios with heavy I/O and better performance in applications limited by disk I/O. On many systems RAID is already embedded on the mainboard, so the costs of RAID 0 are minimal.
Note that the purpose of the article is not to advocate RAID 0 or encourage people to use it. Personally I would not recommend RAID 0 to any user unless he or she has no data to care about.