Hardware RAID 5 Performance Configurations?
gandy909 asks: "I am facing the need to replace a major server in the next few months due to both EOL status and disk I/O bottleneck issues with the array containing the data. The server is configured with a 2 channel array controller. It is a RAID 5 array and has 4 drives, 2 per channel (2 data, 1 parity, and 1 hot spare). Obvious performance benefits in replacing the server are quadrupling CPU speed and doubling the memory. Other side benefits I will gain with the new drives, that I think should help my performance, are moving from u160 to u320, and going from 7200 RPM to either 10k or 15k RPM." How would you configure a larger array to best increase its performance?
"Having googled around, the consensus is that increasing the number of drives is the preferred way to attack the I/O bottleneck. What I don't find much help on is determining the configuration of the larger array. Assuming I am going to be using a 12 drive array I have come up with the following possible configurations:
1 2 channel controller, 6 drives per channel
1 4 channel controller, 3 drives per channel
2 2 channel controllers, 3 drives per channel
3 2 channel controllers, 2 drives per channel
Would any one of those configurations provide better performance than the others, or would they all even out considering other factors?"
You lost me at "I am. . ."
But generally, I don't recommend RAID 5 for performance critical situations. It's great for a data warehouse, but if you lose one drive there goes your performance. Also, realize that, often, the place where you can really boost performance is in the database, not the hardware. How's your query optimization? Do you have appropriate indexes? Is the code accessing the database efficient?
"He who would learn astronomy, and other recondite arts, let him go elsewhere. " -- John Calvin, commenting on Genesis 1
I'd put 2GB fibre HBA's in the server(s) and attach an OpenSAN RAID box from Winchester Systems with dual-homed fibre to the array. That'll take care of your bottleneck and help your future scaling requirements.
... the key to bottlenecks, as i've understood, are: 1) how many physical paths are there to disk, 2) how big are the buffers.
#1 can be fixed by adding more controllers /and/ more disks, #2 can be fixed by buying the right controllers. not /too/ too useful, but a start.
if you're /really/ concerned about performance (enough to spend cash on it), give someone like storagetek a call --- they've got this down to a fine art. quite probably waaay overkill for what you want to do, but it's a start.
First, 3 drives in RAID-5 is not very useful. You get a lot of the disadvantages with few of the benefits. Having more drives really helps throughput. So go for more smaller drives over fewer faster drives for RAID-5.
Second, RAID-5 is great for read speeds, but less great for write speeds. A good caching controller will help hide this, but a small write operation requires a read from each disk in the set before the write can be completed (in order to recompute parity for the stripe). If this is mostly reading, or if most writes are large (not small and random), RAID-5 will work fine (data storage, data mining, etc). If writes are frequent (transactions), RAID-5 is painful. RAID-10 might be better.
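To make the write penalty concrete, here is a minimal Python sketch of RAID-5 parity on one stripe. The 4-disk layout and the block contents are made up for illustration. It shows the read-modify-write shortcut that controllers commonly use for small writes (read old data and old parity, then write new data and new parity) and checks that it matches recomputing parity from every data block in the stripe. Either way, reads have to happen before the write can complete, which is what hurts on small random writes.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-disk RAID-5 set: three data blocks plus parity.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
parity = xor_blocks(*data)

# Small write to block 1 as read-modify-write:
# 2 reads (old data, old parity) + 2 writes (new data, new parity).
new_block = b"\xaa\xbb"
new_parity = xor_blocks(parity, data[1], new_block)
data[1] = new_block

# Alternative: read every data block in the stripe and recompute parity.
full_recompute = xor_blocks(*data)

# Degraded mode: the block on a failed disk is rebuilt from the survivors.
rebuilt = xor_blocks(data[0], data[2], new_parity)

print(new_parity == full_recompute, rebuilt == new_block)
```

The last line also shows why reads get slower after a drive failure: every read of the missing block turns into reads from all the surviving disks.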
Time flies like an arrow. Fruit flies like a banana.
If this is an EOL system and it's using U160 drives... chances are those drives are 36 gig or less... I'd even bet they might be 18's... But let's say they are 36's... four of them in a raid 5 is giving you ~120GB? Why not just get a pair of 147GB drives and run in a raid1? I mean, like others have said, without knowing what you are doing with it, it is hard to say where you are going to get the most benefit, but a lot of times, Raid5 is chosen just due to the increase in space you can get... There are more complex options out there that can get you better performance than raid5 (ie, raid 1+0, raid 5+0 and similar) if your controller and wallet support it...
What are you using it for? RAID5 is slow for writes. Plus AFAIK rebuilding a RAID5 is a LOT slower and hurts performance more than rebuilding stripes of mirrored drives.
What are the applications? Would you be needing more drive space or adding drives in the near future?
If it's a single application, you know you are not going to need to expand soon or add/change apps, the drives are BIG compared to what the app needs, AND performance is _THE_ issue, then I'd suggest optimizing around the application.
For example if you have an OLTP postgresql database app, you should put the transaction logs on a separate array optimized for writing speed, then put the tables, indexes on another array, and maybe the temporary sort directories on yet another array.
Even though striping etc helps, if you understand how the application works, it'll often perform better with each type of access on separate arrays. Of course if the next version of the application changes the way it does things drastically then all bets are off.
You should probably put the O/S stuff on a mirror/array by itself, so the O/S can go do whatever it wants and mostly not affect your main application throughput.
Typically there are diminishing returns as you optimize more exactly to what the application needs, and you usually lose flexibility the more you tailor the configuration to a specific scenario.
So go for big wins. Forget about the last 5-10% performance gains if you are going to lose flexibility or add significant administration cost (complexity).
The more cards, the more buffers, the better the performance (marginally).
I'll just assume it's an ERP, 'cause I like ERPs. They do a ton of reading and infrequent sustained transfers. If it is an ERP, your top priority should be more main system memory. Then use more smaller, fast drives. The more cards the merrier because you double or quadruple the buffers. At worst a buffer offers no advantage. Hopefully it's a dynamic optimization supporting what the database is already optimizing in main memory. Little bonus for reliability: this might allow you to move drives to a new channel or card in case of failure.
I've had great luck with Promise's Ultratraks. This is an array of IDE drives that connects to a SCSI interface. I'm using them in RAID5, and yes, it does take some time to rebuild the set when you lose a disk, but that's fine if you can handle a little down time. Currently using the 8 disk tower, populated with Western Digital 120GB drives. Works great, but when a drive fails, it takes about 6 hours to rebuild the set.
Consider using your CPU as the RAID engine. The hardware engines that vendors use can easily end up underpowered when dealing with modern disks. Get enough SATA or SCSI ports and some hot swap bays and you can build quite a powerful software array. If you dump the money you would have spent on a hardware RAID card into a faster CPU you can easily end up with a faster overall solution.
If you are planning to go the hardware route make sure you get a controller that can keep up. There's nothing worse than having a fast expensive CPU stalled waiting forever for a read because the array's tiny cheap processor is backed up with writes.
Make sure if you go the software route, use an OS that supports it well. I know Linux's support is good but I've heard bad things about Windows. Make sure you can protect the OS as well as other data and can monitor array status and alert someone when things are about to fail or have failed.
set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
This sounds exactly like a question I had in Computer Architecture class.
It looks to me that if you're upgrading from a computer that was 1/4 the speed of today's servers, then if you get a modern server machine with raid SATA you'll be fine.
He who knows not and knows he knows not is a wise man. He who knows not and knows not he knows not is a fool.
it is better to stick a controller on each bus. This reduces contention across the controllers. That is, it is better to have two 2 channel controllers than one 4 channel controller, provided they aren't on the same bus.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Two quick comments about RAID5. First, the posts dismissing it as an option due to performance concerns are silly since your requirements are unknown. Second, another poster already brought up the fact that a 3 disk RAID5 is a bit odd - adding little additional storage while introducing a write penalty. But hey, if it meets your requirements go for it.
This is a *nix box running a Court application from a vendor that used a Synergy DE ISAM type 'database' backend. We have about 100 users, a general mix of add/change/query operations but not a lot of deletes. It is keeping Court data, and that stuff never goes away. The reads are mostly displaying screenfuls of data or small reports to the screen all the time, or larger reports to the printer several times during the day. The writes are usually writing manually entered information, a screen or paragraph at a time, so there tend to be fewer of them, and at a slower pace, but all day long.
:)
The app is a legacy non-SQL type db that is not, nor ever will be, anywhere near normalized by any stretch of the term. The largest of the data files is just over 1 gig at this point. The OS file size limit is 2gb. Due to this, and the other reasons we will likely be moving to a completely different system in the 5 year range.
Hardwarewise, the box as I inherited it is a Dell 6400 rackmount server with 4 700MHz P3 Xeons (only 1 activated...don't ask), 1GB of memory, a PERC2 (AMI MegaRAID) dual channel controller, and a split (4+4) backplane. It holds eight 9 gig drives in 2 arrays. Even with these small drives there is over 50% and 70% free space on the arrays.
My budget limit is $10k to replace it. One of the options I was looking at was a Dell 2650 with a PERC3-QC controller and one of the Storcase 10 bay Infostations they offer on the Dell site to hold the rest of the drives. The way the app is so 'interconvoluted' together I don't think I gained anything by separating the data into 2 arrays and will likely just use a single array on the upgrade.
I hope this helps...
(Stolen sig) Remember: it's a "Microsoft virus", not an "email virus", a "Microsoft worm", not a "computer worm"
If you're looking for performance, ditch RAID5 and go with RAID0+1 (mirror pairs of disks, then stripe the mirrors). It costs more in terms of dollars per gigabyte, but the performance and reliability increases are substantial over RAID5. The only justification for RAID5, really, is "we can't afford to buy enough disks to do something else" (which is a valid argument for many people). The other things you touched on still apply in a stripe/mirror world as far as splitting things over controllers and whatnot.
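The dollars-per-gigabyte versus write-speed trade-off can be roughed out in a few lines of Python. This is a back-of-envelope sketch, not a benchmark: the 73GB drive size and 100 IOPS per drive are hypothetical inputs, and the write penalties (4 back-end I/Os per small write for RAID 5, 2 for mirrored stripes) are the usual rule of thumb, ignoring controller cache and hot spares.

```python
# Rule-of-thumb back-end I/Os per small random write (assumption, cache ignored):
# RAID 5 reads data + parity, then writes both; RAID 10 writes both mirror halves.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def usable_drives(level: str, n: int) -> int:
    """Number of drives' worth of usable space in an n-drive array."""
    if level == "raid5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return n - 1  # one drive's worth of parity
    if level == "raid10":
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return n // 2  # half the drives hold mirror copies
    raise ValueError(f"unknown level: {level}")


def random_write_iops(level: str, n: int, per_drive_iops: float) -> float:
    """Rough ceiling on small random writes per second for the whole array."""
    return n * per_drive_iops / WRITE_PENALTY[level]


DRIVES = 12        # array size from the question
DRIVE_GB = 73      # hypothetical drive size
DRIVE_IOPS = 100   # hypothetical per-drive random IOPS

summary = {
    level: (usable_drives(level, DRIVES) * DRIVE_GB,
            random_write_iops(level, DRIVES, DRIVE_IOPS))
    for level in WRITE_PENALTY
}

for level, (gigabytes, iops) in summary.items():
    print(f"{level}: {gigabytes} GB usable, ~{iops:.0f} random write IOPS")
```

With these assumed inputs the mirrored layout gives up roughly half the usable space for about twice the small-write ceiling. Real numbers depend heavily on the controller cache and the workload.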
11*43+456^2
you need at least 5 disks for reasonable performance on a RAID5 set, even with hardware support. (do some research). The more disks in the set the better the performance.
also in RAID 5 the parity info is spread across all the disks, otherwise it's a RAID 3 set, so loss of a single disk doesn't mean major performance problems.
Things to consider: 1) Hard drive sustained transfer speed. IBM made some bad ass 10k Ultra 160 drives, but they could only push 40 MB/s, so a 4 disk RAID 5 would saturate the channel on the controller. 2) The card slot. A 32 bit slot on a 33 MHz bus is 1056 Mb/s, or 132 MB/s, in theory, so you need a faster bus like the new chips have, or a 64 bit bus. Most of the last server hardware I worked on only went to 66 MHz bus speed with 64 bit busses, so your mileage will vary. So basically, get an Ultra 320 controller (which will be a 64 bit card). Then you will have to worry about getting the data off the server to users, so go gigabit ethernet, otherwise all the work you put into getting the hard drives to work fast is a waste. Also, personal preference: I like RAID cards from LSI Logic, and hate what Adaptec made. Just make sure that the computer can handle the bandwidth all the way around. For 12 drives you would want 1-2 hot spares, an Ultra 320 card, 10k rpm drives, 2 gigabit NICs, and a router that would handle that kind of traffic. Just my thoughts on doing it. There is a lot to consider. Good luck.
Jon
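The "make sure the bandwidth holds all the way around" check is simple arithmetic, sketched below in Python for the four 12-drive layouts from the question. The 40 MB/s sustained rate is the figure quoted above for older drives and is only a stand-in (newer drives are faster), and the layout labels are made up for this example. Theoretical PCI bandwidth is taken as bus width times clock, so a 32-bit, 33 MHz slot works out to 132 MB/s. These are peak sequential numbers; a random I/O workload will sit far below them.

```python
DRIVE_MB_S = 40  # assumed sustained rate per drive, MB/s
CHANNEL_MB_S = {"U160": 160, "U320": 320}  # SCSI channel limits, MB/s


def bus_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical PCI bandwidth in MB/s: bits per transfer x MHz / 8."""
    return width_bits * clock_mhz / 8


# (controllers, channels per controller, drives per channel)
LAYOUTS = {
    "1 x 2-channel, 6 drives/channel": (1, 2, 6),
    "1 x 4-channel, 3 drives/channel": (1, 4, 3),
    "2 x 2-channel, 3 drives/channel": (2, 2, 3),
    "3 x 2-channel, 2 drives/channel": (3, 2, 2),
}


def demand(layout):
    """Peak MB/s asked of one channel and of one controller's bus slot."""
    _controllers, channels, drives_per_channel = layout
    per_channel = drives_per_channel * DRIVE_MB_S
    per_controller = channels * per_channel
    return per_channel, per_controller


report = {name: demand(layout) for name, layout in LAYOUTS.items()}
slot_64_66 = bus_mb_per_s(64, 66)

for name, (per_channel, per_controller) in report.items():
    channel_ok = per_channel <= CHANNEL_MB_S["U320"]
    slot_ok = per_controller <= slot_64_66
    print(f"{name}: {per_channel} MB/s per channel "
          f"(fits U320: {channel_ok}), {per_controller} MB/s per controller "
          f"(fits 64-bit/66 MHz slot: {slot_ok})")
```

Under these assumptions every layout fits within a U320 channel and a 64-bit/66 MHz slot, but the single-controller layouts push 480 MB/s through one slot, while splitting across controllers only helps if the controllers sit on separate buses.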
Would give you the best write performance.
*BUT* YMMV depending on how your application actually uses the I/O
I know of a widely used RDBMS that really sucks on RAID5 due to its underlying data storage.
You'll need to test with all the combinations and also for RAID 10 (striped mirror) and RAID 01 (mirrored stripe) to see which gives best performance for your application
Second, what OS is this running? Unfortunately, I doubt you can defeat the 2GB limitation without access to the source.
With what you're describing as the read/write pattern, I would suggest you look at a RAID1 configuration, perhaps even "multiplexing" - that is, have three or more disks mirroring the same information, in toto. I would also strongly suggest that you load up on cache memory on the RAID controller - 1GB would not be unreasonable - which will give you a very high cache hit rate, reducing disk I/O.
In this application, I doubt fibre channel would buy you much - it's limited to 150MB/s (although many don't know this), and is in theory no faster than SCSI3 drive for drive. Its real advantage over SCSI-3 lies in its ability to be networked for SAN applications.
"He who would learn astronomy, and other recondite arts, let him go elsewhere. " -- John Calvin, commenting on Genesis 1
At work we are replacing a bunch of file servers, and I just had to chime in with my totally unscientific benchmarks
All systems dell sc1600's 2 x 2.4 xeon's 1 gb ram, perc/3 dual channel raid card
for the highest usage server we did 5 x 36 raid 5 array 2 on one channel 3 on the other for ~ 100gig space + hot spare.
for the others we did 2 73's mirrored one on each channel.
All drives were same family from maxtor mark IV's i believe 10k's
In my rudimentary testing using hdparm -tT the mirrored drives performed at 75% of the level of the raid 5 for both reads and writes.
i'm not quite sure what to read into that, but i think given an equal number of disk drives the mirror would probably outperform the raid 5 but i could be way off base.
YMMV
Wang33
PAGERANK++ Robsell.com
What seemed to be of value was to put WAL onto a separate array simply out of the knowledge that those disks would get burned through faster, and we'd know they would have a higher replacement incidence.
What I'd like to do to fix that is to have a few GB of SSD, where heavy writes wouldn't hurt any disks.
But once you have a smart enough controller with battery backed cache, how the disks are physically configured gets less and less important.
If you're not part of the solution, you're part of the precipitate.
And learn what reads and writes are happening, and what is getting queued up now to create i/o bottlenecks. I haven't seen anyone comment on it, but understand that your primary LUN is just one part of the equation. At a minimum, you should be looking to separate your system files, your swap files, and your database partition onto separate arrays. For the first two, RAID1 is frequently the best choice.
Let the OS do the work. Add RAM. Have an Intel x86 box? Push it to 4G. Need more, go Opteron and push it to 8G. Trust in the filesystem buffering algorithms, and the speed of your system CPU(s), instead of the limp CPU and dinky cache on your RAID controller. And get a UPS. If your data set is large, dump the RAID5 and go RAID 0+1.
I want to delete my account but Slashdot doesn't allow it.
Since you are already using 4 drives just use raid 1+0. Optimally, split the 1+0 across 2 controllers
and you will get much better performance. Even just 1+0 is faster than raid 5.
My lowly MS SQL server has 4 raid 1+0 channels.
1 for the OS and swap, and 3 for databases.
--Tim
TKrabec Pahh
I disagree on the rounded cables statement. Round cables (esp. in >4-drive arrays) allow more airflow, which keeps the disks cool. For 10K and 15K rpm disks, this is critical.
Now, if you'd said "don't try to make round cables yourself," I might agree -- even though I've done exactly that in the past AND once managed to cut one of the wires in the process -- but there is nothing inherently wrong with round IDE cables. You might claim crosstalk is an issue, but that's largely what the extra ground lines are for between each and every EIDE signal wire (80-conductor vs. 40-conductor cables).
Of course, SATA cables make this argument entirely moot... and if you're buying a new IDE RAID solution, and not buying a SATA RAID adapter, you're fodder for the /dev/ignore set anyway.
I am going to recommend getting a PCI Express RAID controller. I have seen some vast improvements in performance with these controllers, e.g. the PERC4e (LSI Logic). Also, if possible with a larger number of drives, set up a RAID 10 configuration. For instance, for 1 channel 6 drives or 2 channels 6 drives (3 per backplane if split) you would set up 3 mirrors that are spanned. On the 2 channel config the mirrors would be split across the backplane. This provides the best performance and data is redundant. 73GB 15K RPM HDDs are the route to go for performance.
If your budget is $10k and it's upgrading a legacy system without a large database, you might just go with a Dell 2650 model or whatever is the current model. This can hold 5 drives. I use 2 mirrored drives for the operating system and 3 RAID-5 drives using the hardware based perc3/dc controller. My drives are 36GB but you can probably get 73GB now. The advantage is that you don't need an external chassis and you might be able to get 2 servers for close to $10k. The second server can be used to test patches and other configuration changes before going on the production system, and if it's EOL you still have one server for parts. And if you get really fancy you might be able to get some type of replication working.
if he's about to replace the controller, move to SATA and be done with those wide cables..