iSCSI vs. Fibre Channel vs. Direct Attached Disks?
mrscott asks: "Does anyone have any good, simple benchmarks about iSCSI performance in a SAN as it relates to fibre channel and direct attached storage? There's a lot of information out there about iSCSI TCP offload adapters that improve performance, but it's still hard to get a handle on even those stats without the original numbers for comparison. We're considering an iSCSI solution from LeftHand Networks, but finding independent (and reasonably simple) numbers has proven somewhat difficult, even with the awesome power of Google behind me."
You want manageability. You want the ability to take "some disk" and add it to a server, anywhere, at any time. You want the ability to grow/shrink the filesystems on those servers. You want redundancy, and you want top notch vendor support. Direct disk might be faster, with local processing handling the FS buffering in local RAM, but what happens when ServerA needs 20G of the 100G disk you installed in ServerB?
Umm not exactly..
1. Fibre-channel:
Speed = 2 Gigabits/sec = 2000 Megabits/sec =~ 250 Megabytes/sec raw (about 200 Megabytes/sec usable after encoding overhead)
2. Ultra320 SCSI (direct attached storage)
Speed = 320Megabytes/sec
3. iSCSI (assuming gigabit network link)
Speed = 1 Gigabit/sec = 125 Megabytes/sec raw =~ 100 Megabytes/sec usable after protocol overhead
Once 10Gb NICs become common, then iSCSI will have better link speed (doesn't mean it will be faster).
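For anyone who wants to redo the link arithmetic, here is a rough Python sketch. It assumes decimal prefixes (1 Gbit = 1000 Mbit), which is how link rates are normally quoted, and the function name and the `efficiency` knob are made up for illustration. The 0.8 efficiency for iSCSI is only a guess chosen to land on the ~100 Megabytes/sec figure above, not a measured value.

```python
def link_mbytes_per_sec(gbits_per_sec, efficiency=1.0):
    """Convert a nominal link rate in Gbit/s to Mbytes/s (decimal units).

    efficiency is a hypothetical fudge factor for protocol/encoding
    overhead; 1.0 means the raw line rate.
    """
    megabits_per_sec = gbits_per_sec * 1000
    return megabits_per_sec / 8 * efficiency


# Nominal rates from the list above; Ultra320 is already quoted in Mbytes/s.
links = {
    "2Gb Fibre Channel": 2,
    "Gigabit Ethernet (iSCSI)": 1,
    "10Gb Ethernet (iSCSI)": 10,
}

for name, gbits in links.items():
    print(f"{name}: {link_mbytes_per_sec(gbits):.0f} Mbytes/s raw")

# Assumed 80% efficiency for iSCSI over gigabit Ethernet (a guess).
print(f"iSCSI usable: {link_mbytes_per_sec(1, efficiency=0.8):.0f} Mbytes/s")
```

Link speed is only an upper bound, which is the point of the next list.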
Many things affect the speed of storage systems.
1. raw disk speed
2. raw disk access time
3. interface (iSCSI or Fibre-Channel or UltraSCSI) latency:
iSCSI latency > FC latency > SCSI latency
4. protocol overhead
iSCSI overhead > FC overhead > SCSI overhead
and on and on...
A good yardstick:
If you have five or more 15K drives in a storage system, the link speed will be the bottleneck.
Reasoning: by five or six drives, the combined burst throughput of the drives exceeds the link speed.
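The yardstick can be checked with a few lines of Python. This is only a sketch: the 60 Mbytes/sec burst rate per 15K drive is an assumed, hypothetical figure (plug in your own drive's number), and the link figures are the rough ones from the list further up.

```python
def drives_to_saturate(link_mb_per_sec, drive_burst_mb_per_sec):
    """Smallest number of drives whose combined burst rate exceeds the link."""
    # Integer floor division, then one more drive to tip over the limit.
    return link_mb_per_sec // drive_burst_mb_per_sec + 1


DRIVE_BURST = 60  # Mbytes/s per drive: an assumption, not a measured value

# Approximate link speeds in Mbytes/s, taken from the list above.
links = {
    "2Gb Fibre Channel": 250,
    "Ultra320 SCSI": 320,
    "iSCSI over gigabit": 100,
}

for name, speed in links.items():
    count = drives_to_saturate(speed, DRIVE_BURST)
    print(f"{name}: {count} drives saturate the link")
```

With that assumed burst rate, the drive count comes out at five for 2Gb fibre and six for Ultra320, which is in line with the rule of thumb. A gigabit iSCSI link fills up with just two drives bursting.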
apples to apples..
1. Locally attached storage will generally be faster than fibre-channel or iSCSI, as long as the fibre-channel or iSCSI storage system doesn't have some really high-end RAID/cache system.
2. If you have many small hosts, throughput generally should not be an issue, except for a few hosts where a high-end internal, multi-SCSI-channel RAID controller and SCSI storage systems will be the fastest.
My 0.02cents (taxes extra)
Both iSCSI and FC are networked versions of SCSI, and all three technologies are much faster than the disks behind them, so the interface is not the bottleneck. Beyond Ultra160, the standard PCI bus is saturated, and 64-bit PCI such as PCI-X is needed for Ultra320. Even so, disks usually can't saturate the available bandwidth, even in burst mode (from cache): say, six 15K RPM disks in RAID5 in read mode.
FC and iSCSI are much more expensive than Ultra320 SCSI, which is commodity hardware now. FC just sends the data over optics to outside the system, where larger data warehouses can be managed instead of getting bigger and bigger Unisys boxen.
So if you need terabytes of data all in one place (I mean at least 10 terabytes), consider iSCSI and FC and putting the disks outside the system for better management. We are getting a NAS solution to replace our backup tapes, requirement was 1.2TB. We will get 4x 300GB Maxtor Maxline II SATA disks... the slow cheap ones, and put them in an IBM xSeries 206 which are going at $500 CDN, with an Adaptec RAID card.
Up to 16 SATA 400GB disks can be managed by a simple Adaptec RAID card; beyond that, think FC arrays.
Seriously, it's not so much a speed issue as an issue of how you manage your environment. You can get enough performance out of all three solutions, and there are other, more important things to look at.
As you see, I would recommend worrying less about the performance and more about what you really need and what your environment looks like.
If you just want performance for cheap, then local disks are unbeatable. Instead of spending money on expensive fibre or iSCSI offload controllers, buy tons of cheaper SCSI cards; instead of running fibre and buying a fibre switch, buy tons of disk drives.
Most people make the mistake of worrying about capacity. In reality, it's the number of spindles you need to look at, and the capacity of your storage is a result of that. Figure each disk attached gives you about 100 I/O ops per second. If you need to do 5000 ops, you need 50 disks, no matter how they are connected. Figure a disk can get you 20 MB/sec: if you need 200MB/sec throughput, that's 10 disks.
In that example, I need 50 disks to satisfy both requirements. Next, take the max throughput of each link and divide it by 3: that's 66MB/sec for fibre, 100MB/sec for SCSI and 33MB/sec for iSCSI over gigabit. So to run my 200MB/sec I'd need only 2 SCSI channels, 3 fibre or 6 iSCSI connections.
Next, of course, I can't have 50 drives on 2 channels: more than 3 disks per channel drops the transfer rate dramatically... Since fibre and iSCSI mask the physical spindles they don't care, but I need about 16 SCSI controllers to really run the disks.
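The sizing steps above can be written down as a small Python sketch. The per-disk figures (100 ops/sec, 20 MB/sec), the divide-by-3 rule and the 3-disks-per-SCSI-channel limit come straight from the post; the function names are made up, and the link maxima (320 for SCSI, 200 for fibre, 100 for iSCSI over gigabit) are assumptions picked to match the per-link numbers quoted.

```python
def ceil_div(a, b):
    """Integer division rounded up, for positive integers."""
    return -(-a // b)


def disks_needed(iops, mb_per_sec, disk_iops=100, disk_mb_per_sec=20):
    """Spindles needed to meet both the I/O rate and the throughput target."""
    return max(ceil_div(iops, disk_iops), ceil_div(mb_per_sec, disk_mb_per_sec))


def links_needed(mb_per_sec, link_max_mb_per_sec, derate=3):
    """Links needed when each link is trusted for only 1/derate of its max."""
    # Multiply before dividing so the arithmetic stays in integers.
    return ceil_div(mb_per_sec * derate, link_max_mb_per_sec)


disks = disks_needed(iops=5000, mb_per_sec=200)
print(f"disks needed: {disks}")

# Assumed link maxima in MB/s (see the note above the code).
for name, link_max in {"SCSI": 320, "fibre": 200, "iSCSI": 100}.items():
    print(f"{name}: {links_needed(200, link_max)} links for 200 MB/s")

# SCSI only: at most 3 disks per channel before transfer rates drop.
print(f"SCSI channels for {disks} disks: {ceil_div(disks, 3)}")
```

Rounding up strictly gives 17 SCSI channels for 50 disks, close to the 16 quoted above. The useful part is that the spindle count, not the link count, ends up driving the design.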
Peter.
FC-AL is the "gold standard" for performance and reliability, but has limitations when you want to expand clustering (FC switches still cost gobs of $$$). Fibre Channel cabling also has distance limitations that go away with iSCSI.
... provided the network will drive it. 10Gbit Ethernet and more will definitely fuel this migration, sooner than you think!!
The acid test -- for me anyway -- is seeing LARGE customers (banks, airlines, government agencies [pick a 3-letter acronym], pharmaceuticals, major industries such as oil and energy, entertainment companies and movie studios, etc.) implementing iSCSI on an equally LARGE scale, and quite successfully.
With few exceptions, if the underlying Ethernet network is functioning properly, iSCSI performs remarkably fast. No, you won't get the 2 Gbit FCAL rates -- *yet* -- but we have customers running dozens of large (>TB) databases off a single appliance over one or two GigE ports and iSCSI.
Generally, it's recommended to segment off the iSCSI traffic so it's not routed or mixed with public traffic anyway, but even those (small) customers that pipe all of their storage appliance data through a single 10/100 Ethernet interface only have problems if they put too many users on there as well. (A direct crossover cable is *ideal* for iSCSI.)
In addition, the Microsoft iSCSI initiator has finally outgrown its initial bugs/problems (with our help in some cases), and is darn solid with plenty of different targets.
I'd love to drop a bunch of example company names, but I'm sure those companies consider that information to be competitive, and it's not my place to divulge it. Any large company you can think of already has an investment in FC-AL, and all but a very small percentage have iSCSI infrastructures as well. Medium-size and small (50 employees) companies are also seeing HUGE benefits from their own iSCSI implementations.
The 0-day acid test (which works amazingly well in our labs with the right HBAs) is SAN booting over iSCSI. Imagine having an nnn-Terabyte set of storage, from which ALL your servers boot EVERYTHING. Not a single magnetic disk is required in the servers themselves. Makes server clustering and blade/grid computing so very attainable.
(As an engineer for a major storage vendor (FC/iSCSI/near-line IDE storage/archiving), I work with all of this stuff on a daily basis. Not saying I'm an expert, just that I kinda know what I'm spouting here...)
No avoidance is necessary, and unless you're familiar with iSCSI, your "evidence" is anecdotal at best.
iSCSI works well, and it's as fast as the underlying network (ideally, a direct crossover connection to the target storage, so there's no other traffic to contend with). iSCSI is not as fast as FCAL (2 Gbit/sec max), but only because 10Gbit Ethernet isn't the main course -- yet.
Usually, your storage pool will be Fibre Channel, and some servers connect via FC LUNs (via an FC switch), others will connect via iSCSI LUNs. In either case, the storage array itself is not usually JBOD -- it needs to be something like RAID4 or RAID5 (or at least RAID1 / RAID0+1) to preserve the data integrity in the case of a simple disk failure.
In any case, FC-AL has one fatal flaw -- an open loop. (Think of FCAL as "token ring," because that's essentially what it is.) If a disk fails "open" (I see this occasionally as a storage engineer for a major SAN/NAS vendor), or a cable or GBIC on the loop fails, the entire loop is taken offline.
Almost...
iSCSI LUNs can be pretty friggin' huge. More than once, I've seen a customer nuke the holy bejeebers out of a 950GB iSCSI LUN all at once, with a single click (the equivalent of "LUN DESTROY"). I actually had a WebEx session to a customer once and watched him delete a 1.3TB LUN before I could stop him.
If you're on plain JBOD, you have to get out your restore tapes and pray to the tape gods. If you're on a SAN/NAS appliance (like NetApp), you can usually restore that data in a matter of seconds (literally) from snapshot. That's saved many an admin from getting insta-pinkslipped...
The original question was regarding real world performance of iSCSI in particular, and since few of the posts seem to touch on that, I may as well tell what I've learned from hard experience regarding the other technologies: SCSI and FCAL.
My experience is with very high transaction volume OLTP databases (Oracle) backing a financial website. I've found that neither SCSI nor FCAL adapters limit performance significantly. This was with QLogic qla3200 adapters, or with high-end Adaptec Ultra320, on Solaris 9 and the last few versions of Red Hat Enterprise. Only the older versions of Red Hat had some kind of problem with the QLogic driver, plus bounce buffer IO, which drove down performance. But then, to be nitpicky, that was the driver, not the HBA. Solaris was always fine, and now Red Hat is too.
The main performance challenge was *always* tuning the database and spreading out on lots of spindles. The HBAs at over 200M/sec each never posed a problem on larger Sun boxes (8 or more procs) with 7 or 8 way parallel sequential reads going. On smaller hosts or smaller disk arrays, the problem was always on the host itself or the disk seek times respectively, not the HBAs themselves.
A 10k rpm drive will do about 70 Mbytes/sec off the outer edge of the platter (near block 0), and as a rule of thumb, a 2Gbit FCAL adapter will do 200 Mbytes a second (at least on Solaris or newer Red Hat EL). So my dual QLogics would do 400 Mbytes a second under absolutely optimal disk access, but typically it's not that perfect 8 way parallel *serial* scan off the outer edge; it's usually fairly random.
So in the high-end database applications (data warehouse or OLTP) at least, the usual tuning challenge (and $$ for that matter) is getting a fat spread across a lot of spindles, and making sure the application is either caching well (OLTP) or doing orderly, sequential scans (data warehouse).
"Hi. I'm thinking about buying a car. Which is the best one?"
Ummm What are you doing with it? What's your budget, and what are your expectations? Some people think a TB is big, some consider 20+ TB a building block standard (I do).
1. Look on the net and in mags. - read read read
Is this a big implementation, or a small one? I'd bet a small one, or you probably wouldn't be asking here, and that probably means you don't want iSCSI. Direct FCAL on multi-port RAIDs might be your speed. If you aren't big enough for that, some low-end storage from Dell or Apple might be your thing.
2. Bring in some vendors, tell them what you want - ask them how to do it. - talk to a few.
3. Ask about install costs to implement what you want to do. (A lot of people don't realize the work involved for large scale implementations)
If you don't have the staff to know if iSCSI is for you, you may not have the staff to implement and troubleshoot it. Direct attached fibre, or even fibre via a switch, will be far easier to deal with.
If you can afford it, stay away from SATA. The problem is that it's sooo much cheaper vs FC disks on larger systems, but the seeks are slower, the mechanicals are poorer, and the command set is less complete. It's $ vs reliability.
Cheers,
a storage guy.