Slashdot Mirror


SCSI vs. SATA In a File Server?

turboflux asks: "I'm currently in the process of replacing an aging file server with something more robust. Company-wide, there will be about 100 people who could be using this server, but I don't imagine there being more than 50 concurrent users. Right now, I'm torn between spending a lot on SCSI hardware, much like our other servers, or spending less, but getting more space, with SATA II drives. Whatever I decide, the server will be set up with a RAID 1+0 array for the numerous benefits it offers. Does Slashdot have opinions or suggestions on performance, reliability, and stability?"

29 of 303 comments

  1. Re:What's this SCSI you speak of? by Anonymous Coward · · Score: 1, Informative
    What's this SCSI you speak of?

    You know, those are the drives offering lots of models running at 15,000 RPM. The ones with controllers which can often take large amounts of RAM. You know, drives for servers.

  2. Hmm by PrvtBurrito · · Score: 4, Informative

    We have both SCSI RAID (two 1TB arrays with 10k RPM SCSI drives on a Dell PowerVault) and several arrays with 3ware cards (an 8-way and a 12-way, both with 200 or 250GB drives). We run Red Hat WS. We find that the 3ware cards are excellent for large data storage but have latency issues compared to the SCSI RAID array. We are happy with both systems, but the price break on the 3ware shows, and I wouldn't recommend it for really heavy use.

    --
    Laboratree - Scientific collaboration based on OpenSocial.
  3. SATA? I don't know.... by toofast · · Score: 5, Informative

    I use SATA on our smaller, non-mission-critical servers. For our data backend, I wouldn't touch it with a 10-foot pole.

    Here are some scenarios where I wouldn't hesitate to use SATA:

    - You have redundant servers. Using LVS and/or Heartbeat and your favorite tools, you can get full server redundancy using less expensive hardware. The overall solution can be quite elegant, with hot failover. Why just cover the drives?

    - Front-end cluster nodes. You have a powerful, expensive backend server (with a cheaper failover) and you use inexpensive front-end servers for serving client requests. Sounds like overkill for what you want, but with the right server load balancing technology, it can give you a scalable, fault-tolerant and damn fast solution.

    - You can live with downtime. Install a server with a couple of SATA disks in a RAID configuration and hope for the best.

  4. The real info by sabreofsd · · Score: 5, Informative

    There might be some benefits to you in sticking with SCSI over SATA; it really depends on your preference. Both SCSI and SATA offload the main processor from the duties associated with reads and writes. SATA also now has optimized reading patterns just like SCSI. The only real advantages SCSI has right now are the speeds (SATA 150 (there is a newer, faster one coming) vs. SCSI 320). Also, most SCSI drives are designed for 24/7 use, whereas most SATA drives are designed for desktop use. Just make sure the SATA drives you buy are made for enterprise-level operation. So it really comes down to compatibility/speed vs. cheap/larger. Hope this helps!

    --
    Sabre
  5. Re:BACKUP! by Spazmania · · Score: 3, Informative

    ATA and SATA drives are a great choice for online backup. It's pretty easy to put several terabytes' worth in a box these days, software RAID-5 them with Linux and then use tar and gzip. The price is not exceptionally higher than tapes either, and the reliability (i.e. your success rate restoring data) is superior.
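
    A minimal sketch of the tar-and-gzip step, using Python's standard tarfile module rather than the command-line tools. The function names, the "shares" directory and the archive name are all hypothetical, and the re-read at the end only checks that the archive opens, not that every byte restores.

```python
import tarfile
import tempfile
from pathlib import Path


def backup_tree(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tar archive.

    Returns the number of members found when the archive is re-read,
    as a cheap sanity check that it can be opened again.
    """
    source = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the tree root.
        tar.add(source, arcname=source.name)
    with tarfile.open(archive_path, "r:gz") as tar:
        return len(tar.getnames())


def demo():
    """Back up a throwaway two-file tree (hypothetical data) and count members."""
    with tempfile.TemporaryDirectory() as work:
        source = Path(work) / "shares"
        source.mkdir()
        (source / "a.txt").write_text("alpha")
        (source / "b.txt").write_text("beta")
        # The archive lives outside the source tree so it is not packed into itself.
        archive = Path(work) / "shares.tar.gz"
        return backup_tree(source, archive)
```

    Counting members is only a first check; an actual restore into a scratch directory is the real test of a backup.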

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  6. SATA II is not your father's SATA by xusr · · Score: 5, Informative

    The SATA II spec is quite a bit different from the original SATA. SATA II adds port multiplication, hot plugging, native command queuing, external enclosures, and port selection. Also, with a theoretical peak of 3Gbps, it's twice as fast as the old SATA. Here is a decent article with more explanation.

  7. These are different media for diff jobs by postbigbang · · Score: 2, Informative

    SCSI is very fast, and usually more expensive. You can get really fast, highly cached drives in SCSI with high-RPM spindles, and cool controllers. But they're $$$$. Do you need the speed?

    If not, SATA is still pretty fast, much less expensive, less clever controllers, but still very reasonable for things like archiving, steady low-concurrency-demand streaming, and so on.

    SATA also has the advantage of not needing loads of austere cables with distance limitations imposed on them; it's a serial rather than a parallel bus, hence the S in SATA. Use SATA when you don't need the absolute fastest you can get, and you won't have to spend the most on the controller (which is hopefully a SCSI PCI-X controller or other fast clocker), the drives, the pricey cables, and so on. But if you need the speed, there is nothing faster than SCSI except for flash drives, which are still hideously expensive.... and not writeable as much as we'd like them to be.

    --
    ---- Teach Peace. It's Cheaper Than War.
  8. SATA and Linux will be much faster... Soon. by MarcQuadra · · Score: 3, Informative

    I recently did all this research myself. SATA on Linux is going to get MUCH faster, probably as fast as SCSI, but you'll have to wait for the libATA improvements to take hold. Right now NCQ isn't implemented, and neither are 'multiple sector transfers'. I bought hardware that WILL support those features because I know that NCQ will dramatically improve speed and latency (under high-use conditions) when it is finally fully-baked.

    The site to track progress on the library and driver status is here: http://linux.yyz.us/sata/

    The project has been moving along quite well. I think their goal is to completely modularize, simplify, optimize, and consolidate the ATA, ATAPI, and SATA kernel pieces into one overarching (underlying?) library. I like this kind of work. I can't see why ALL disk-like I/O isn't under one big modular kernel library; it seems like it would make adding new transport types and drivers a lot simpler and reduce maintenance all around.

    --
    "Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
  9. Re:SCSI by Anonymous Coward · · Score: 1, Informative

    I personally disagree with the view that SCSI is the way to go in every scenario. You pay more for SCSI and while a fair argument can be made that you tend to get what you pay for, there's also the scenario where you end up swatting flies with a howitzer, so to speak.

    If there won't be more than 50 concurrent users and the file access is relatively lightweight (as opposed to a heavyweight Oracle database, for example), then some money can be saved and good results realized with a SATA solution.

    I personally have had very good results with a moderately used Filemaker server sitting on SATA drives with about 40 concurrent users at peak, and about 70 users total hitting it throughout the day. The application is for clinical trial information tracking, if that helps anyone.

    But I definitely remain firmly in the SCSI camp for heavyweight applications.

  10. Fibre Channel by MoFoQ · · Score: 3, Informative

    I'd say Fibre Channel.

    One benefit that SATA does have over SCSI is the cabling....it's smaller and blocks less airflow (and easier to do the cabling).

    SCSI on the other hand has other benefits....like it's used in enterprise servers now. Faster, daisy-chained, more RAID options, etc.

    Of course, Fibre Channel is basically SCSI on steroids and has the cabling benefits that SATA has.

    With more room thanks to less data cabling, you can add watercooling to reduce the heat generated by the 15k+ rpm drives.

  11. Stick with SCSI by dFaust · · Score: 3, Informative
    SATA drives have definitely improved, and for file servers NCQ definitely helps out a lot.... but for the absolute best performance in a (true) multi-user environment, 15k SCSI drives still offer gobs of performance over even the new 150gig 10k Raptor SATA drive. Ultimately it will come down to how important price vs. size is to you... but speaking purely on performance, 15k SCSIs are the way to go.

    One way to curb some of the cost, I might add, would be to switch to something like RAID 5... you won't have as high throughput, but you'll still see performance gains and end up with more usable drive space. The throughput likely won't be your problem, anyways... typically it would be the drive's ability to handle multiple simultaneous requests, which heavily relies on low access times (which is why SCSI dominates in this type of environment).

    Here's a quick reference of some IOMeter benchmarks using a file server test pattern. You'll see what I mean. Wealth of info on drives on that site.

  12. Re:SATA is fine ... for some things by abcess · · Score: 5, Informative

    As a matter of fact, you may not be flying at all. It all depends what you're using it for. The problem with SATA is latency, and there's not much the controller is going to do about it. If you've got a server that is performing latency-sensitive tasks, then SATA can cause performance problems.

    In my experience, if you've got a lot of random I/O, SATA is not a viable solution. That said, even if your I/O is mostly random, if there's not a heavy load on the disk, then you're probably ok. If you've got 200 people hitting a database or email server, you're probably going to have some performance problems. Swap it out with SCSI drives, or a quality disk array, and you'll be doing much better. If you've got a web server, or a database server that is exclusively reading, you can probably get away with SATA. Again, it all depends on how much and how random the disk I/O for your application is.

  13. SATA by Andy Dodd · · Score: 5, Informative

    SATA's peak raw transfer rate (150 MB/sec) is half that of the peak raw transfer rate of SCSI (320 MB/sec), but you're going to be limited by the individual hard drive's transfer rate anyway. Keep in mind that a proper SATA implementation will be 150MB/sec PER DRIVE, since each drive is on its own channel. SCSI is 320 MB/sec per channel, but you're in for a cabling nightmare if you want only one drive per channel. Note that there is a 300 MB/sec SATA standard, although few drives and controllers seem to support it.

    If you buy the right model, you can get SATA drives that have gone through the rigorous quality control testing that has historically been reserved for SCSI drives. Many of the higher end server-grade SATA models are warrantied for 24/7 operation. SCSI has lost its advantage there.

    SATA has Native Command Queueing, formerly a SCSI-only performance feature. Note that it's optional for SATA drives though, so make sure you get a controller and drives that support NCQ. Again, one of SCSI's few advantages has disappeared.

    Last, but most definitely not least, SATA cabling is far simpler and more robust than SCSI cabling. SCSI cabling is a finicky nightmare where even high-end cables can cause data corruption if you're not careful, whereas even the cheapest SATA cables I've seen worked reliably. I've had hardware-related data loss on hard drives twice in my life. One case was an IBM Deathstar, the other was a SCSI cable that started flaking out and corrupted data on three drives at once. I haven't touched SCSI with a ten-foot pole since that incident.
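
    The per-drive versus per-channel point from the first paragraph can be put as a rough back-of-the-envelope sketch. The 150 and 320 MB/sec figures are the ones quoted above; the 70 MB/sec sustained rate per drive is purely an assumption for illustration, and the model ignores protocol overhead and bus arbitration.

```python
def per_drive_ceiling(bus_mb_s, drives_sharing_bus, drive_sustained_mb_s):
    """Best-case transfer rate per drive when every drive on the bus is busy.

    The bus bandwidth is split evenly between the drives sharing it, and a
    drive can never exceed its own sustained media rate.
    """
    share = bus_mb_s / drives_sharing_bus
    return min(share, drive_sustained_mb_s)


ASSUMED_DRIVE_RATE = 70  # MB/sec, hypothetical sustained rate of one drive

# SATA: each drive has its own 150 MB/sec link.
print("SATA, 1 drive per link:   ", per_drive_ceiling(150, 1, ASSUMED_DRIVE_RATE))
# SCSI: all drives on a channel share 320 MB/sec.
for drives in (2, 4, 8):
    print(f"SCSI, {drives} drives per channel:",
          per_drive_ceiling(320, drives, ASSUMED_DRIVE_RATE))
```

    Under these assumptions the shared channel only becomes the limit once enough drives sit on it that an even share falls below what one drive can stream.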

    --
    retrorocket.o not found, launch anyway?
  14. Re:SCSI?? by Anonymous Coward · · Score: 3, Informative

    You people are really showing your (inadvanced) age! Back in the good old days, many external peripherals (such as scanners) were connected to your machine via a SCSI bus. Don't forget what SCSI stands for - Small Computer (Serial|Standard) Interface. I believe our friend here was referring to peripherals, and in that case he's right - SCSI was replaced by USB. As the rest of you seem to have been born in the 80s, you probably thought he was referring to SCSI hard disks - the most common use of SCSI these days. For this purpose, SCSI (especially its modern incarnation) is vastly superior to USB. So the answer is yes and no. :-)

    - Jeremy
    madscience AT mac DOT com

  15. Controller matters much more than the drives by Pygmy Marmoset · · Score: 2, Informative

    All hardware (and software) sucks, and it breaks, it's a fact of life. No matter if you go with SCSI or SATA, the important thing is that you can find out when a drive dies so that it can get replaced.

    Many low to mid range SCSI RAID cards (most? all?) either don't have any sort of interface to find the RAID status when the server is up (they just beep at you and expect that somehow that's going to be heard over the AC and server noises when you're walking by the machine), or the tools for checking the RAID status are so poor that they'll lock up the shared memory segment after checking a certain number of times (ADAPTEC, I'M TALKING TO YOU). Since being certain about your RAID status means checking it via something like nagios, that means that it gets checked many times, and will thus eventually lock up.

    While SATA is nowhere near the performance of scsi (despite what SATA fanboys will tell you), 3ware cards are actually really good at:
    a) letting you know when a drive has failed
    b) letting you check with their tools as many times as you want without locking it up

    And since the SATA stuff is so much cheaper, you can buy multiple servers, so even if the card fails, you have a hot backup.

    If you absolutely have to have the fastest, go for a raid 10 of 15krpm drives.

    If you don't, and want peace of mind, get at least 2 SATA setups with 3ware cards.

  16. Re:SATA is fine by schnurble · · Score: 4, Informative

    Look at some of the stranger RAID options. If you just use RAID5, you'll be selling yourself short. RAID3 is worth a look. I'd actually suggest you put two controllers in a machine. Run RAID0 on 4 drives on a single controller. Run RAID0 on 4 drives on the other controller. Then use Windows or Linux software RAID to run RAID1 between the two RAID0 drives. Very fast performance and fully fault tolerant.

    Uhh. Yes. Then you can lose one disk in each side, and you have lost all your data.

    This would perhaps be slightly less than fully fault tolerant.

    Perhaps you meant to set up 4 mirror pairs, 2 on each controller, and use software to RAID0 them together.

    I have successfully done this with a 24 disk 5U chassis, and it is an IO steamroller (our database server, right now).

    --
    "To err is human, to forgive is simply not my policy." --root
  17. Re:BACKUP! by kahanamoku · · Score: 5, Informative

    I've seen more dead HDDs than backup tapes, and have seen 60 times as many backup tapes as HDDs...

    and last time I checked, an Ultrium 3 tape was half the price of a 400GB Drive.

    I wouldn't use disks for backup, unless they're to be used as live backups, and then I'd still archive to tape (provided it was affordable).

    --
    ----- Concentrate on promoting more than demoting.
  18. Re:SCSI by NutscrapeSucks · · Score: 2, Informative

    Even back in MFM's heyday, SCSI was the standard for workstations and servers.

    --
    Whenever I hear the word 'Innovation', I reach for my pistol.
  19. Re:SATA is fine by Bios_Hakr · · Score: 4, Informative

    The chances of losing two disks at once are slim. RAID 0+1 will provide great performance and good fault tolerance if you react to problems as they happen.

    But I guess it depends on what your users need. If they need raw throughput, RAID 0+1 is better. If they need low latency, then RAID 10 may be the answer. Or maybe both systems would fall within the margin of error of each other.

    In any event, once you get into what-if situations, no RAID will be good enough. What if you lose a disk? What about two? Five? Well, what if lightning hits the chassis or the janitor unplugs it to buff the floor?

    The best you can do is roll the dice and play the odds. You'll see that I told him to use RAID 0+1. I also told him to use good monitoring setups to mitigate problems. I also suggested a tape backup. Actually, maybe I didn't, but I did tell him to verify his backups work and that he is able to restore from them, so that's kind of the same thing.

    When it gets down to it, opinions are like assholes; everyone has one. And most people only care about their own and don't really want to look at their coworkers'. I guess I'm the same in that respect.

    --
    I'd rather you do it wrong, than for me to have to do it at all.
  20. Re:SATA II is not your father's SATA by Blackforge · · Score: 2, Informative
    I've never understood. . . why the hell would you want hotplugging for internal components? Isn't it always a smart idea to turn your PC off before you reach your hands inside the case?


    It's not necessarily for "internal components". It's also for entry-level servers and RAID arrays. You can get hotplug bays that fit in 5 1/4" slots on a machine. This provides an easy way to swap out the drives in case of failure. If you're running a server, neither you nor your users want downtime. If you're running RAID 1, RAID 5 or RAID 10, you want to be able to rebuild to the replacement drive before you lose another drive and lose the whole array. It's a lot faster to rebuild a single drive from the existing data on the other drives than to restore from tape or other backup media.

    Also motherboard manufacturers are now starting to include external SATAII ports to "hotplug" external SATA drives.
  21. Re:The very definition of RAID... by _generica · · Score: 2, Informative

    > The very definition of RAID is "Redundant Array of INEXPENSIVE Disks".

    Actually, the definition has been back-formed to "Redundant Array of Independent Disks", since you won't necessarily be using inexpensive drives any more.

    Just because you put 500GB drives in a RAID array doesn't suddenly make them inexpensive, but they are each independent.

  22. SCSI. Still. by aussersterne · · Score: 4, Informative

    SCSI still tears the alternatives to shreds for price/performance at the heavy end of the load curve, no doubt about it.

    If you doubt it, try both.

    For going on twenty years it's been the same: those who haven't tried SCSI claim that there's no or little difference. Those who have used both SCSI and [MFM,RLL,IDE,ATA,SATA] in high-load environments hate to try to make do with anything but SCSI.

    For performance and reliability reasons both, you want SCSI if you're dealing with high-random-access-load or high-throughput situations. ATA/SATA is fine if you're just offering up noncritical bulk network storage but for the rest you want the real deal, and you will notice the obvious difference if you try both in a stressed environment.

    --
    STOP . AMERICA . NOW
  23. Re:SATA is fine by Anonymous Coward · · Score: 1, Informative

    Monitor the damn thing! My last job someone let the server die. It had RAID5 over 5 drives. One drive had failed and no one noticed. When the second failed, that was the end of it.

    That's nothing! I work for a Fortune 500 company, and we had a centralized database containing data from sites nationwide. This database was on a RAID 5 array with well over a dozen drives, including two hot spares, situated in a data center which is staffed 24x7. We lost the database.

    How? Simple -- nobody was monitoring the array. One drive fails -- no problem, rebuild it on a hot spare. Another drive fails -- rebuild it on the other hot spare. Yet another drive fails -- running in degraded mode but no data loss yet. Finally, a fourth drive fails. The entire RAID 5 volume is lost, unrecoverable. Of course, that was when the problem was first discovered! We had to rebuild the database from scratch.

    There was nothing wrong with the design of the hardware platform, but nobody had been tasked with monitoring the array and replacing failed drives!

    RAID can't protect you forever. Monitor those disk arrays!
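
    As a minimal sketch of what "monitoring" could look like, assuming Linux software RAID (md), which reports array health in /proc/mdstat (a hardware controller would need its vendor's tool instead). The function name is hypothetical and the sample text is made up for illustration; in the status field each "U" is a healthy member and each "_" a missing or failed one.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status field shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status field looks like [UUU_]; it contains only 'U' and '_'.
        status = re.search(r"\[([U_]+)\]", line)
        if status and current and "_" in status.group(1):
            degraded.append(current)
    return degraded


# Hypothetical sample: md0 has lost one of four members, md1 is healthy.
SAMPLE = """\
Personalities : [raid1] [raid5]
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0](F)
      2929890816 blocks level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
md1 : active raid1 sdf1[1] sde1[0]
      976630336 blocks [2/2] [UU]
unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

    Run from cron or a nagios check against the real /proc/mdstat, a non-empty result is the signal that someone needs to replace a drive before the next one goes.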

  24. Re:SATA is fine by aiken_d · · Score: 2, Informative

    The chances of losing two disks at once are slim

    Not in my experience. I've worked with many, many systems over the years, and I'd say that about half of the time a drive in an array failed, at least one other one went with it either simultaneously or shortly thereafter.

    Sometimes the failed drive has literally melted, putting great load on the power supply and taking one or more other drives out at the same time. In arrays that stripe error-recovery information across multiple disks (RAID 5, etc), I've had the additional load be the final straw for a second drive. I've had hot spares that died the moment they were asked to actually do something.

    It may be that the chances of losing two out of three disks at once are slim. But I can tell you that the odds of losing two out of twenty disks at once are not slim at all. Either that or God just hates me. In either case (he may hate you, too), it's best to plan for multiple drive failures and at least one power supply and one SCSI bus failure happening at once.

    Cheers
    -b

    --
    If I wanted a sig I would have filled in that stupid box.
  25. Re:SATA is fine by Malor · · Score: 2, Informative

    Out of 8 disks, the chance of losing two at once is higher than you'd think, especially if he's using the cheaper SATA drives. I lost 2 drives out of a RAID5 in short succession just recently, and *just barely* managed to save the most recent data before the second drive died too.

    RAID 0+1 is much inferior to RAID10. 0+1 is what the GP poster said... stripe 4 disks in RAID-0, and mirror those. You're no more fault tolerant than a RAID5 array.. if ANY two drives fail, you're hosed. You lose 50% of the space to boot. (in RAID5, on 8 disks, you'd lose only 12.5%, though of course it's slower.)

    RAID10, on the other hand, is setting up four mirrors, and striping the mirrors. You still lose 50% of your space. However, you lose the whole array only if both 'sibling' drives in a given mirror fail. That means you have a pretty good chance of surviving a multi-drive failure. And it's very fast....just as fast as 0+1, but it's a lot more robust. Of course, both are very inefficient in terms of space lost, but drives are so cheap these days that it doesn't matter too much.

    Any good controller will do RAID10 nowadays... only the very cheapest/crappiest controllers are limited to the inferior 0+1.

  26. MTBF usually better on SCSI by abdulwahid · · Score: 2, Informative

    If you want reliability for the disk you had better check what the manufacturer claims for the MTBF (mean time between failures).

    Many SATA drives have an MTBF of around 0.6 to 1 million hours, whereas SCSI drives have between 1 and 2 million hours. Your SCSI disk therefore has about twice the life expectancy. If you couple this with the speed of SCSI, I guess for the moment, if your budget allows for it, then go for SCSI.

    If your budget doesn't allow for it...just make sure you have good redundancy in your RAID with at least 2 redundant disks
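
    To get a feel for what those MTBF figures mean per year, here is a rough sketch. It assumes the figures are in millions of hours and that drives fail at a constant rate (an exponential model), which is a simplification: real drives have infant mortality and wear-out, and vendor MTBF claims are often optimistic. The function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # drive powered on 24/7


def annual_failure_rate(mtbf_hours):
    """Probability that one drive fails within a year, constant-rate model."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def any_failure_in_array(mtbf_hours, drives):
    """Probability that at least one of `drives` independent drives fails in a year."""
    afr = annual_failure_rate(mtbf_hours)
    return 1.0 - (1.0 - afr) ** drives


for label, mtbf in [("0.6M hours", 600_000), ("1M hours", 1_000_000), ("2M hours", 2_000_000)]:
    print(f"MTBF {label}: per-drive AFR {annual_failure_rate(mtbf):.2%}, "
          f"8-drive array {any_failure_in_array(mtbf, 8):.2%}")
```

    Under this model, doubling the MTBF roughly halves the yearly failure probability, and the more drives in the array, the more likely it is that at least one of them fails.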

    --
    perl -e 'print $i=pack(c5, (41*2), sqrt(7056), (unpack(c,H)-2), oct(115), 10);'
  27. Re:SATA is fine ... for some things by Anonymous Coward · · Score: 5, Informative

    Assuming equal storage sizes, SCSI drives would have way better throughput and latency than a SATA drive because you can get 15K SCSIs. However, the sizes are NOT equal. Fact is that for the price of a 147GB 15K SCSI drive, you can get about 2TB of 7200RPM SATA space.

    What you end up with is the following throughput when disks are empty:

        1x147GB 15K SCSI -- 150MB/s
        8x250GB 7200 SATA -- 275MB/s to 550MB/s depending on exact RAID configuration

    Now fill up both configurations with 140GB of data and the throughput of the 15K SCSI has dropped in half to 75MB/s because the heads are now positioned at the "slower" inner portion of the disk. Meanwhile, the 2TB SATA config is 7%-15% slower depending on the RAID config.

    Latency also benefits from many disks for the same reason. Fill up a disk and you possibly have to traverse the entire disk. So while a 15K drive has a seek time of 2-3 times faster, you end up having to move 10X-15X farther than in a mega array where the heads pretty much just hover over the 2X faster outer portion.

    The big advantage for SCSI is the better TCQ algorithms for multi-user access. This can be mostly negated if you use a SATA RAID controller with enough onboard RAM to reorder IO at the controller level versus depending on the drive's NCQ.

    This is the route we've taken: we went from an LSI MegaRAID 320-1 + 4-drive SCSI RAID config to an Areca 1170 + 1GB RAM + 24-drive SATA RAID. Every aspect of performance is up by big amounts: throughput, latency, multi-user access. The drive array is actually TOO fast for our 2x244 Opteron server to drive. We ended up breaking the array into 3 8-drive volumes and mirroring 2 volumes against each other for more redundancy. One of these days, we'll upgrade to faster CPUs and retest a 16-drive volume.
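
    The fill-level arithmetic behind the 140GB example above can be sketched as follows. It only computes how much of the usable space the data occupies; how that translates into lost throughput or longer seeks depends on the drive, and the function name and the striped-versus-mirrored split are assumptions for illustration.

```python
def fill_fraction(data_gb, drives, drive_gb, mirrored):
    """Fraction of the array's usable capacity occupied by data_gb of data.

    mirrored=True models a RAID 10 style layout (half the raw space usable);
    mirrored=False models plain striping (all raw space usable).
    """
    raw_gb = drives * drive_gb
    usable_gb = raw_gb / 2 if mirrored else raw_gb
    return data_gb / usable_gb


print(f"1 x 147GB SCSI:          {fill_fraction(140, 1, 147, mirrored=False):.0%} full")
print(f"8 x 250GB SATA, striped: {fill_fraction(140, 8, 250, mirrored=False):.0%} full")
print(f"8 x 250GB SATA, RAID 10: {fill_fraction(140, 8, 250, mirrored=True):.0%} full")
```

    The single SCSI drive is nearly full while the SATA array has barely left its outer tracks, which is the effect the throughput and latency argument relies on.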

  28. Re:SCSI?? by Kymermosst · · Score: 2, Informative

    scsi absolutely is not serial, duh

    While he did screw up the second 'S' in SCSI, you cannot seriously expect anyone who knows anything about the evolution of SCSI to take you seriously after you stated the above.

    I will prove your statement false with a single counterexample: Serial Attached SCSI (PDF). Note the date of the document.

    Remember that with SCSI-3, the standard became more modularized in order to do things like separate the SCSI command set and the SCSI physical interface.

    Here's the SAS FAQ from the SCSI trade association.

    --
    "Alcohol, Tobacco, Firearms, and Explosives" should be a convenience store, not a government agency.
  29. Re:SATA is fine by Qzukk · · Score: 3, Informative

    Eh, his description is funky. I think he meant 4 sets of 2 mirrored drives, that are then striped. You could do it the other way, I guess, but that IS a lot of wasted space.

    As for "extra redundancy": the difference between RAID 10 and RAID 01 is in the failure mode, not strictly in the redundancy.

    In RAID 01, the data is stored like this:
    [ABCD] - four drives striped
    [ABCD] - four drives striped ... and then mirrored. If a drive C fails, that entire mirror becomes useless since you can't mirror ABCD to ABD, making the state:
    [ABCD] - four drives striped
    [XXXX] - four drives offline ... and the next drive failure kills it, assuming that offline drives don't count. Some hardware raid systems will continue to mirror ABD, essentially converting it to RAID 10 on the fly.

    In RAID 10, the data is stored like this:
    [AA] - two drives mirrored
    [BB] - two drives mirrored
    [CC] - two drives mirrored
    [DD] - two drives mirrored ... and then striped. If a drive fails only that drive is offline...
    [AA]
    [BB]
    [CX] - one drive offline
    [DD] ... if the remaining drive C fails, then the array is lost. However, any other drive could fail without destroying the array (in fact, up to three more, if you're lucky).
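
    The two failure modes above can be checked by brute force. This is a minimal sketch for 8 drives, assuming the strict RAID 01 behaviour described above (one failed drive takes its whole stripe set offline) and hypothetical drive numbering: drives 0-3 and 4-7 form the two stripe sets in RAID 01, and drives (0,1), (2,3), (4,5), (6,7) form the mirror pairs in RAID 10.

```python
from itertools import combinations

DRIVES = range(8)


def raid01_survives(failed):
    """RAID 01: alive as long as at least one 4-drive stripe set is untouched."""
    stripe_sets_hit = {drive // 4 for drive in failed}
    return len(stripe_sets_hit) < 2


def raid10_survives(failed):
    """RAID 10: alive as long as no mirror pair has lost both of its drives."""
    pairs_hit = [drive // 2 for drive in failed]
    return len(pairs_hit) == len(set(pairs_hit))


def survival_count(survives, n_failures):
    """Return (survivable combinations, total combinations) for n_failures dead drives."""
    combos = list(combinations(DRIVES, n_failures))
    return sum(1 for combo in combos if survives(combo)), len(combos)


for n in (2, 3, 4):
    ok01, total = survival_count(raid01_survives, n)
    ok10, _ = survival_count(raid10_survives, n)
    print(f"{n} failed drives: RAID 01 survives {ok01}/{total}, RAID 10 survives {ok10}/{total}")
```

    With two dead drives, RAID 01 survives only when both happen to be in the same stripe set, while RAID 10 dies only when both are in the same mirror pair.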

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.