Slashdot Mirror


Increasing the Transfer Rate?

Nintendork asks: "I recently started a new job as a resident computer geek and am analyzing the performance of our SQL server. I did quite a bit of research and would like an opinion from the Slashdot community on my proposed solution for increasing the STR (Sustained Transfer Rate) from the server to the workstations. The server (Compaq ProLiant ML530) has 16 10,000 RPM drives with an average STR of ~43MB/sec. per drive. 14 are used for two RAID 5 logical drives (7 physical drives per logical). The remaining 2 drives are backup drives in case one fails. Currently, they're all connected to a Compaq fibre RA4000 adapter. It runs at 100MB/sec. from what I could find in a jungle of fibre information. Reasoning tells me I have a huge bottleneck at the fibre adapter and the 100baseT NIC. I should also mention that the server has 2 PCI buses. One runs at 64-bit and 66MHz and has 2 PCI slots. My proposed setup would be to back up all the data and create a new array with a few hardware modifications. Take out the fibre adapter and use two dual-channel 64-bit 66MHz Ultra160 adapters on the two 64-bit 66MHz PCI slots (4 drives per channel). Take out the 100baseT NIC and start a gigabit backbone." Would this significantly increase performance? Read on, if you want to check out the numbers on the new setup.

"From what I've learned thus far, the proposed setup would be a blazingly fast file server approaching ludicrous speed. Let me break it down. Data can be read from the drives at a STR of ~602MB/sec. (~43MB/sec. * 14 drives). Each Ultra160 channel has a STR of 132MB/sec. This provides a bearable bottleneck that reduces the overall STR to ~528MB/sec. (132MB/sec. * 4 channels). The 64-bit 66MHz PCI bus has a STR of 528MB/sec., which is an exact match for the 4 Ultra160 channels! From there, I assume the data goes out the NIC, which is on a gigabit backbone. This would provide a STR of ~528MB/sec. to the workstations. Unless I'm missing something such as a possible bottleneck between the PCI bus and the NIC, my reasoning makes gosh darned perfect sense!

Thanks in advance for any insight you all can provide on this issue."
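
The chain of figures in the question can be checked with a short calculation. The sketch below takes the poster's own per-stage numbers at face value (none of them are verified here), converts the network link from bits to bytes, and reports the slowest stage. The names `mbit_to_mbyte`, `bottleneck` and `proposed` are illustrative only.

```python
# Sustained throughput of a serial pipeline is capped by its slowest stage.
# Stage figures are the poster's own estimates in MB/sec, not measurements.

def mbit_to_mbyte(mbit_per_sec):
    """Convert a link speed in megabits/sec to megabytes/sec (8 bits per byte)."""
    return mbit_per_sec / 8

def bottleneck(stages):
    """Return (name, MB/sec) of the slowest stage in the pipeline."""
    name = min(stages, key=stages.get)
    return name, stages[name]

proposed = {
    "drives (14 x 43)": 14 * 43,          # 602
    "SCSI channels (4 x 132)": 4 * 132,   # 528
    "PCI 64-bit/66MHz": 528,
    "gigabit NIC": mbit_to_mbyte(1000),   # 125.0, ignoring protocol overhead
}
current_nic = mbit_to_mbyte(100)          # 12.5 for the existing 100baseT NIC

print(bottleneck(proposed))
print(current_nic)
```

Under these assumptions the network link, not the disk subsystem, sets the ceiling: about 125MB/sec. on gigabit rather than 528MB/sec.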

5 of 34 comments

  1. Ditch Raid5. by AntipodesTroll · · Score: 4, Informative

    Your first priority is to ditch Raid5.
    By the looks of your post, you have money to spend, so invest in more disks and go RAID 0+1. You'll notice a speed increase right there. If you're worried about PCI latency, get a system with 2 or 3 PCI buses.

    In fact, whoever set up this configuration needs a slap, if they were going for performance. Raid 5 has more and more overhead penalty the more disks you add to a stripe set. Even knocking back the sets to 4 or 3 disks would help; I would never use more than 4 disks in a raid5 set.

    That said, if your app is mostly db reads, 0+1 on good controllers should give you 4* the average disk throughput, thanks to striping and round-robin mirror reads. Also, make sure your OS and filesystem are well tuned. Often you see people spending money on hardware because they haven't bothered to optimise the software. (Kernel variables, filesystem tuning, etc.)

    --
    Anyone who considers arithmetical methods of producing random numbers is, of course, in a state of sin.-John von Neumann
    1. Re:Ditch Raid5. by Insightfill · · Score: 3, Informative
      Agreed. RAID5 does fairly well in the "reads" dept., but starts taking serious hits in the write department, because every little 20K MS Word document needs to have its parity written off to the side. Not as big a deal when the files are big enough, but can be a pain.
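
To make the write penalty concrete, here is a minimal sketch of the usual RAID 5 small-write path, assuming plain XOR parity over a toy stripe. The function names and block sizes are made up for illustration and do not model any particular controller.

```python
# RAID 5 parity is the XOR of the data blocks in a stripe. Changing one block
# means: read old data, read old parity, write new data, write new parity.

def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def stripe_parity(blocks):
    """Parity for a full stripe: XOR of every data block."""
    parity = bytes(len(blocks[0]))  # all-zero block of the same length
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity

def small_write(old_data, old_parity, new_data):
    """New parity after overwriting one block, without reading the rest of the stripe."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)

# Toy stripe: 6 data blocks plus 1 parity block, as in a 7-drive set.
stripe = [bytes([i] * 4) for i in range(1, 7)]
parity = stripe_parity(stripe)

new_block = b"\xaa\xbb\xcc\xdd"
new_parity = small_write(stripe[2], parity, new_block)
stripe[2] = new_block

# The shortcut agrees with a full recomputation, but a single logical write
# still cost two reads and two writes.
print(new_parity == stripe_parity(stripe))
```

A mirrored layout needs only the two data writes for the same update, which is where the gap on small files comes from.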

      Check out:
      http://www.cs.washington.edu/homes/savage/
      for the work of Stefan Savage, a member of the UCSD faculty who's done some good work in the field. Scroll down to the near-bottom for the paper on "AFRAID" and see what I mean. The idea is that if we delay the parity-write long enough to queue them up, we start approaching the speed of RAID0. There's a possible data loss here, but his premise is also that the MTBF rate for most drives makes this less an issue than power supply failure, controller failure, etc. (RAID was developed back when MTBF was closer to 20K hours.)

      He also has some great work on DoS attacks, too. See the write-up at Ars:
      http://www.arstechnica.com/reviews/2q00/networking/networking-1.html

  2. Tune, Tune, Tune some more by fooguy · · Score: 3, Informative

    Like the other posters said, start with ditching RAID 5 for RAID 10/0+1 (depending on your preference). RAID 10 (Mirror + Stripe) is my preference because of the higher redundancy - if one disk dies the whole stripe doesn't drop out. RAID 0+1 is faster but slightly less redundant. Either way, the parity overhead generated by RAID 5 is the death of a database.
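
The redundancy difference can be counted directly. The sketch below assumes the common definitions (RAID 10 as a stripe over mirrored pairs, RAID 0+1 as a mirror of two stripe sets where one dead disk takes its whole stripe set offline) and an 8-disk array; both the model and the function names are illustrative simplifications.

```python
from itertools import combinations

def raid10_survives(failed, n_disks):
    """Stripe over mirrored pairs (0,1), (2,3), ...: lost only if both disks of a pair fail."""
    return not any({a, a + 1} <= failed for a in range(0, n_disks, 2))

def raid01_survives(failed, n_disks):
    """Mirror of two stripe sets: a set drops out when any member fails; lost when both are out."""
    half = n_disks // 2
    left_ok = not any(d < half for d in failed)
    right_ok = not any(d >= half for d in failed)
    return left_ok or right_ok

def survival_fraction(survives, n_disks, n_failures=2):
    """Fraction of all n_failures-disk failure combinations the array survives."""
    cases = [set(c) for c in combinations(range(n_disks), n_failures)]
    return sum(survives(c, n_disks) for c in cases) / len(cases)

print(survival_fraction(raid10_survives, 8))  # 24 of 28 two-disk failures survived
print(survival_fraction(raid01_survives, 8))  # 12 of 28 two-disk failures survived
```

Both layouts survive any single failure in this model; the gap only shows up once a second disk dies before the first is replaced.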

    Your controllers are pretty fast; it's more likely your software config or your network. Are you running MS SQL Server, or something else? MS SQL Server requires some pretty specific tuning to get good performance (like telling it *not* to use all the RAM). How well are your objects tuned?

    How about your OS? What filesystem are the datafiles living on? Oracle supports RAW partitions, which allows you to eliminate the OS overhead from the database. We're testing the performance now of Oracle on different Solaris filesystems. You'd be surprised at the differences between UFS, UFS with Logging, and Veritas File System.

    While I won't deny that gigabit ethernet is fast, it's kind of expensive (especially if your network infrastructure isn't equipped to handle it). If cost is a concern, and you're in a switched environment (no hubs), you can add more 100Mb Ethernet NICs and trunk them for more bandwidth. In reality, I've never choked a 100Mb connection unless something was wrong (like end users writing nasty Crystal Reports).

    --
    "All I ever wanted was to see Larry Wall give Bill Gates a Perl necklace."
    http://www.eisenschmidt.org/jweisen
  3. You should ditch RAID-5 by duffbeer703 · · Score: 4, Informative

    RAID-5 and relational databases are a dangerous mixture. Not only does RAID 5 give you a 50% performance hit, but there are cases where data will be lost or corrupted without you ever hearing about it.

    In the event of partial media failure over time with one or more disks in a RAID set, errors can be introduced into your data that will not be detected by parity checks. Once the drive runs out of sectors to remap you'll eventually have data that cannot be reconstructed by the ECC code on the drive.

    Also, in the event of total drive failure, the rebuilding process performed automatically by the controllers can reduce overall performance by up to 85%.

    RAID 10 is the way to go. Not only do you get the highest possible level of performance and redundancy, but you suffer no performance hits during a single failure.

    Don't read this post and scoff "I've never had drives break like that". I've worked in some large data facilities (i.e. 400-500 TERAbytes of storage) and have had entire defective batches of 200 brand-new disks. Although hardware failures happen much less than they did in the past, they can and do happen every day.

    So my advice to you:

    1. Keep your current Fibre-channel configuration.
    2. Buy more drives than you need, max out your array.
    3. Backup the data, ditch RAID-5 and build RAID 10 volumes.
    4. Reload the data, carefully plan where your busy tables and transaction logs are to avoid hot disks.
    5. Conduct a thorough analysis of how your data is accessed and rearrange the volumes accordingly. Re-analyze everything every quarter.

    You have reached the point of maximal return looking at your performance issues from the POV of a system administrator. You need to get a very smart DBA or start reading at this point. Designing your physical database design around your queries is the only way to pull significant performance increases out of your system. (except for getting rid of RAID-5)

    Also, your performance expectations are too high for x86 equipment. You are never going to push out 100MB/sec from a database, even with trivial queries and optimized tables.

    --
    Conformity is the jailer of freedom and enemy of growth. -JFK
  4. Re:Sounds like your NIC is the bottleneck by Bryan+Andersen · · Score: 3, Informative
    I forgot to divide by 8! Change of plan: Wait for the 10 gigabit NICs to be released to the public. I guess I could place it on the 64-bit 33MHz bus.

    For a 10 gigabit NIC you would need a much faster local bus. Better at that point to change out the underlying motherboard and processors while you're at it. People just don't understand numbers and their importance.
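
The bus arithmetic behind that remark can be sketched as follows, using nominal clock rates and theoretical peaks only (real buses are shared with other devices and deliver less). The helper names are hypothetical.

```python
def pci_peak_mb_s(width_bits, clock_mhz):
    """Theoretical PCI peak: bytes per transfer times million transfers per second."""
    return width_bits // 8 * clock_mhz

def nic_mb_s(gigabits):
    """Ethernet line rate in MB/sec (8 bits per byte, protocol overhead ignored)."""
    return gigabits * 1000 / 8

buses = {
    "64-bit/66MHz": pci_peak_mb_s(64, 66),  # 528
    "64-bit/33MHz": pci_peak_mb_s(64, 33),  # 264
}

for label, peak in buses.items():
    for gbit in (1, 10):
        verdict = "fits" if nic_mb_s(gbit) <= peak else "bus-limited"
        print(f"{gbit} Gbit NIC on {label} PCI ({peak} MB/sec peak): {verdict}")
```

A 10 gigabit link works out to 1250MB/sec., more than twice the peak of the fastest bus in this server, while a single gigabit link (125MB/sec.) fits on either 64-bit bus.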

    This guy needs to really figure out where his true bottleneck is before changing things. Looking at the setup, the two optimizations I see are a gigabit NIC or multiple 100Mbit NICs, and switching to RAID 0+1. That's it. Without really looking at actual performance numbers and profiling, I wouldn't touch it.

    Performance tuning for the sake of performance tuning is just asking for trouble. Do it when you know you need the extra performance or know you will need it soon. My bet is he could get all his DB/disk speedup via tuning the DB. He could get the communication speedup by going with a 64-bit/66MHz gigabit NIC for talking to the users.

    From the sounds of the system layout it looks like the original DB wasn't set up properly. Only the data segments of the DB should be on RAID 5, and then only if you don't have the hardware to go RAID 0+1. The rest should be on RAID 0+1 or just striped. I also bet there aren't separate RAIDs for data, index, before and after image, and log files. Separate off the before image, after image, and log files first, then separate out the indexes, each to their own RAID. The indexes should just be on the fastest RAID 0+1 array you can afford. The before and after image RAIDs should be in the same speed class as the main data RAID. The striping on them should be short as they are only accessed sequentially. The data and index segments need fast seek and low latency to perform best.
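
One way to write that separation down and sanity-check it is sketched below. The array labels and the specific RAID level picked for each file class are assumptions chosen to fit the advice above, not a recommendation for any particular database product.

```python
# Hypothetical layout: one array per file class. Array names "A".."E" are made up.
layout = {
    "data":         {"array": "A", "raid": "0+1"},  # RAID 5 only as a fallback
    "index":        {"array": "B", "raid": "0+1"},  # fastest array available
    "before_image": {"array": "C", "raid": "0+1"},
    "after_image":  {"array": "D", "raid": "0+1"},
    "log":          {"array": "E", "raid": "0+1"},
}

def shared_arrays(layout):
    """Return arrays hosting more than one file class, which the advice says to avoid."""
    by_array = {}
    for file_class, spec in layout.items():
        by_array.setdefault(spec["array"], []).append(file_class)
    return {a: classes for a, classes in by_array.items() if len(classes) > 1}

def raid5_misuse(layout):
    """Return file classes other than data that were placed on RAID 5."""
    return [fc for fc, spec in layout.items() if spec["raid"] == "5" and fc != "data"]

print(shared_arrays(layout))   # empty when every file class has its own array
print(raid5_misuse(layout))    # empty when RAID 5 is confined to data segments
```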

    The first thing is to find out where the real bottleneck is. Don't just swap and hope you solve it. You don't want to swap something out only to find out the replacement isn't stable.