Slashdot Mirror


iSCSI vs. Fibre Channel vs. Direct Attached Disks?

mrscott asks: "Does anyone have any good, simple benchmarks about iSCSI performance in a SAN as it relates to fibre channel and direct attached storage? There's a lot of information out there about iSCSI TCP offload adapters that improve performance, but it's still hard to get a handle on even those stats without the original numbers for comparison. We're considering an iSCSI solution from Lefthand networks, but finding independent (and reasonably simple) numbers has proven somewhat difficult, even with the awesome power of Google behind me."

46 comments

  1. As of right now... by virid · · Score: 1, Flamebait

    in terms of speed, iSCSI can't touch 2Gb Fibre-Channel...

    --
    "The world only exists in your eyes. You can make it as big or as small as you want." - F Scott Fitzgerald
    1. Re:As of right now... by QuantumRiff · · Score: 1

      Especially the speed at which fiber channel blows the department budgets!
      Seriously, I would think with good equipment, it should come reasonably close in bandwidth, but iSCSI would probably have a bit more time in response with the protocol overhead. Of course there are benefits to just using 1 network and 1 stack of switches for everything.. (and dangers!)

      --

      What are we going to do tonight Brain?
    2. Re:As of right now... by Xross_Ied · · Score: 5, Informative

      Umm not exactly..
      1. Fibre-channel:
      Speed = 2Gigabits/sec = 2048 Megabits/sec =~ 256 Megabytes/sec

      2. Ultra320 SCSI (direct attached storage)
      Speed = 320Megabytes/sec

      3. iSCSI (assuming gigabit network link)
      Speed = 1Gigabits/sec =~ 100Megabytes/sec

      Once 10Gb NICs become common, then iSCSI will have better link speed (doesn't mean it will be faster).

      Many things affect the speed of storage systems.
      1. raw disk speed
      2. raw disk access time
      3. interface (iSCSI or Fibre-Channel or UltraSCSI) latency:
      iSCSI latency > FC latency > SCSI latency
      4. protocol overhead:
      iSCSI overhead > FC overhead > SCSI overhead
      and on and on...

      Good yard stick..
      If you have five or more 15K drives in a storage system the link speed will be the bottleneck.
      Reasoning: six * burst throughput of single drive > link speed.

      apples to apples..
      1. Local attached storage will generally be faster than fibre-channel or iSCSI as long as the fibre-channel or iSCSI storage system doesn't have some really highend RAID/cache system.

      If you have many small hosts, generally throughput should not be an issue except for some hosts, where a highend internal, multi-SCSI-channel RAID controller and SCSI storage systems will be the fastest.

      My 0.02cents (taxes extra)

      --
      This sig space tolet, reasonable rate.
    3. Re:As of right now... by psyconaut · · Score: 1

      Well, "back in the day" companies like NetApp demonstrated that networked storage was very viable (NAS)....so there's good hope for iSCSI.

      -psy

    4. Re:As of right now... by Anonymous Coward · · Score: 0

      Back in the day? We still are!

      NTAP stock price didn't go from $15 to $36/share all by itself... iSCSI is becoming a large and significant part of what we sell and support every day. And, since our iSCSI license is FREE ..... :)

    5. Re:As of right now... by photon317 · · Score: 1


      Don't forget that 10G Fiberchannel will probably come at some point as well. And in the much more realistic and cool world, there's also 2.5Gb and 10Gb Infiniband. If the Infiniband guys will just re-engineer the god-awful physical connectors they designed, it could really take hold fast. Plus it was designed as a generic high speed low-latency data transport - you can do storage, IP networking, direct MPI library stuff, etc all through one Infiniband connection. Some of the Infiniband switch vendors are also making products that adapt legacy networks onto Infiniband - you stick GigE and FC cards in the switch itself, plug them into the legacy IP and Storage networks, and they become shared virtual adapters for all the Infiniband hosts on your Infiniband network.

      --
      11*43+456^2
    6. Re:As of right now... by onemorechip · · Score: 1

      Actually, the FC rate is in multiples of 1.0625 Gb/s, so it is 2.125 Gb/s for 2GFC, but you need to divide by 10 instead of 8 to get the byte rate (8b/10b encoding puts 10 bits on the wire for every data byte), so 2GFC delivers 212.5 MB/s. Unlike parallel SCSI, though, Fibre Channel is full duplex, so with a good mix of reads and writes FC will move around 400 MB/s.
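
      A minimal sketch of that 8b/10b arithmetic, for anyone who wants to plug in other link speeds. The function and constant names are made up for illustration, and it only covers the line encoding, not framing or protocol overhead:

```python
# Sketch of the Fibre Channel line-rate arithmetic from the comment above.
# Assumption: names here are illustrative; only 8b/10b encoding is modelled,
# frame headers and protocol overhead are ignored.

FC_BASE_MBAUD = 1062.5          # 1GFC line rate in megabaud (1.0625 Gb/s)
LINE_BITS_PER_DATA_BYTE = 10    # 8b/10b: each data byte travels as 10 line bits


def fc_payload_mb_per_s(speed_multiple: int) -> float:
    """Per-direction data rate in MB/s for an N-times-1GFC link."""
    line_rate_mbaud = FC_BASE_MBAUD * speed_multiple
    return line_rate_mbaud / LINE_BITS_PER_DATA_BYTE


one_way = fc_payload_mb_per_s(2)   # 2GFC, one direction
both_ways = 2 * one_way            # full duplex, reads and writes both saturated

print(f"2GFC one direction: {one_way} MB/s")
print(f"2GFC full duplex:   {both_ways} MB/s")
```

      The full-duplex figure is a ceiling that assumes both directions are kept busy at once, which is why "around 400" is the optimistic case.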

      Parallel SCSI has higher overhead than FC. Arbitration, selection, and messages (which still use asynchronous transfers) are bandwidth killers. And if pSCSI has lower latency, it is because you are comparing different topologies -- loop or a switched SAN vs. a 3 meter bus (or is it 1.5 meters for 320? I forget). This is not to suggest that FC should be used for direct attachment of disks -- it would be blazingly fast, but it wouldn't be cost-effective.

      --
      But, I wanted socialized health insurance!
    7. Re:As of right now... by Xross_Ied · · Score: 1
      Unlike parallel SCSI, though, Fibre Channel is full duplex, so with a good mix of reads and writes FC will move around 400 MB/s.


      Try more like 10% on average. Why?
      a) the link may be full-duplexed but the spindles on the other end are not.
      b) Very few applications have sustained bursts of reads and writes. Most have periods of sustained bursts of reads or writes.

      Only when one is talking about multiple terabytes does the effect of a) disappear (if your SAN distributes data across all available spindles).
      For b) if you put ALL your applications on the SAN then you will see the effect of parallelism come into play. But for a particular host the full-duplex isn't that big a deal.

      Many people understand the issue of latency and how bad it is for transactions. All layers of software are geared to hide/minimize latency by parallelizing transactions.

      I wonder for each application/host, does one really need the blazing speed of a SAN? Is the higher latency of each transaction worth it?

      As google has shown, you don't need a SAN to store a cache of the web. Many commodity PCs with simple IDE disks (lower latency) can do the job in parallel faster.

      --
      This sig space tolet, reasonable rate.
    8. Re:As of right now... by onemorechip · · Score: 1

      a) the link may be full-duplexed but the spindles on the other end are not.

      Of course not, the spindles don't transfer data. Maybe you meant the drive heads are not full duplex, or maybe you mean that the spindles aren't synchronized. The first isn't an issue for the host adapter, which can always maintain a full duplex transfer on its link if it needs it, and for a disk-to-switch link it won't matter anyway because it is a point-to-point connection that won't be saturated by the data rates of a single disk. On the second, I think the effects of rotational latency are lost in a large, busy SAN due to queuing of packets at every stage of the network.

      b) Very few applications have sustained bursts of reads and writes. Most have periods of sustained bursts of reads or writes.

      A valid point. I was referring to the peak demand that will load the link down and force packets to wait, and with the exception of backbones you won't see it that much. Still, there are situations in which FC will use that full-duplex bandwidth opportunistically to keep packets flowing. If a write is in progress and another disk has read data or status to send, it can do so right away. If a read is in progress and the host adapter wants to send data or another command (read *or* write), it can do so right away. Not so on the SCSI bus.

      As google has shown, you don't need a SAN to store a cache of the web. Many commodity PCs with simple IDE disks (lower latency) can do the job in parallel faster.

      True, but the Google approach isn't for every enterprise. Google doesn't need to back up all that data, and the cache doesn't require SAN-level storage management capabilities. If they lose a disk, no problem, they just get the data back on the next crawl. I doubt that Google uses the same sort of network for their corporate administration and engineering development.

      --
      But, I wanted socialized health insurance!
  2. Performance isn't everything by Gothmolly · · Score: 2, Informative

    You want manageability. You want the ability to take "some disk" and add it to a server, anywhere, at any time. You want the ability to grow/shrink the filesystems on those servers. You want redundancy, and you want top notch vendor support. Direct disk might be faster, with local processing handling the FS buffering in local RAM, but what happens when ServerA needs 20G of the 100G disk you installed in ServerB?

    --
    I want to delete my account but Slashdot doesn't allow it.
    1. Re:Performance isn't everything by Xross_Ied · · Score: 2, Insightful

      With disks so cheap just add another disk to ServerA.

      There are many external SCSI storage systems with integrated RAID and management functions (everything from audible alarms to SNMP/email support). e.g. http://www.promise.com/product/externalstorage.htm

      The cost of disks has fallen so much that the idea of a giga-SAN ($$$$) to master all storage is just plain silly. Local attached external RAID storage with management is all one really needs. Only when talking about multi-Terabytes of data should one consider a SAN.

      --
      This sig space tolet, reasonable rate.
    2. Re:Performance isn't everything by Gothmolly · · Score: 2, Insightful

      And when you've filled the rack that the server is in, where do you stick your disk array? Or do you only populate your racks 1/3 full, to allow for additional capacity, just in case? When the server in the 1U case needs more disk, where do you add it? How do you add space w/o taking the server down? A TB is 1000 GB, which is 100 10GB servers. Any decent sized shop will easily suck up a TB. Any large shop will devour lots more.

      --
      I want to delete my account but Slashdot doesn't allow it.
    3. Re:Performance isn't everything by Xross_Ied · · Score: 3, Informative

      1U server:
      3U external RAID storage system.
      Holds 14 to 15 disks, fill as you need.
      RAID will allow expansion on the fly.
      SCSI DISKS: 14 * 146GB = 2.0TB
      IDE DISKS: 14 * 400GB = 5.6TB

      If you really need to go to the next rack, fibre channel for the link (external fibrechannel RAID storage, no SAN/fc-switch) is still an option.

      The only reason I don't like it is there are very few server platforms (apple XServe being the exception) that boot from fibre-channel storage systems. If I need two internal disks (have to be RAIDed and managed) to boot the OS to load the fibre-channel driver to access the external storage why bother?

      Most server platforms suck for internal storage and RAID functionality..
      1. Sun Sparc:
      No HW RAID for internal disks, sw mirroring only. Most models don't have support for booting from external disks.

      2. Apple XServe:
      Good SW RAID for internal disks but if a disk fails..
      a) backup
      b) recreate RAID
      c) restore

      3. Dell PowerEdge:
      Internal HW RAID controllers allow on the fly expansion but controller driver doesn't expose RAID event alerts to OS. For that you need Dell's OpenManage suite (not supported on all OSes).

      I prefer external RAID storage, where the storage system provides the management interface OS agnostic.

      --
      This sig space tolet, reasonable rate.
    4. Re:Performance isn't everything by larien · · Score: 1
      Sun v440s have hardware mirroring on internal disks, although it's only supported on 2 of the 4 (i.e. you can mirror any 2 disks using the hardware RAID, after that you need to use software RAID). In any event, I've not seen any issues in using software mirroring on any server (Sun, HP or IBM). The extra CPU load is minimal.

      Additionally, having the boot disks internally (or on a small enclosure) simplifies the bootup of the system which is crucial for problem resolution; this is especially important in SANs; if your system won't boot is it because of a problem in the server hardware, HBA, fibre cable, SAN switch or the SAN itself? With SCSI disks (internal or otherwise) there are far fewer things that can be wrong and they are far easier to debug.

      I'm not aware of any limitation in booting from external disks on any Sun systems (it's mandatory on 4800 systems and higher, up to 25k); all it really needs is a disk path the PROM can understand.

    5. Re:Performance isn't everything by Xross_Ied · · Score: 1

      Perhaps my opinion is coloured by memories of Solaris7 (on a 420R), where one had to have veritas filesystem to get decent journaling and mirroring.

      Things have improved with Solaris9 (on V240) and I haven't played with 10 yet, but still software mirroring is *okay* but really, for the price one pays for a Sun server can't they include hardware RAID (not just mirroring)??

      e.g. v240 can house 4 internal disks, why can't there be an integrated hardware RAID controller to do RAID5 on the internal disks?

      --
      This sig space tolet, reasonable rate.
  3. The only thing I know about iSCSI is ... by Pegasus · · Score: 1

    ... avoid it at all costs. Everyone I've talked with about iSCSI, from driver writers to end users does not like it. ATA over ethernet shows more promise, as it's much more simple than iSCSI. We'll see what future will bring ... but for now I'd stick with fibrechannel.

    1. Re:The only thing I know about iSCSI is ... by Anonymous Coward · · Score: 1, Insightful

      What do you mean by 'more promise'? I agree that iSCSI is complicated, but there is a reason it was done at the IP level. Another thing that differentiates iSCSI from ATAoE or even HyperSCSI (SCSI over Eth) is its inbuilt catering for redundancy, scalability and error management. In fact, if you look at the iSCSI spec, approx 40% caters to error management, and IMHO it is done well.

      Regarding complexity, speed is not an issue as protocol processing is already being offloaded into hardware. The big advantage of going over IP is just that. An IP packet is ubiquitous.

    2. Re:The only thing I know about iSCSI is ... by keesh · · Score: 0

      The problem with iSCSI is the Ethernet component. It's too slow, craps all over all other Ethernet traffic and nowhere near as reliable as fibrechannel.

    3. Re:The only thing I know about iSCSI is ... by aminorex · · Score: 3, Insightful

      but, but... iSCSI has nothing to do with Ethernet.
      iSCSI is an IP protocol, and it could be running
      over anything that sends datagrams. FDDI, HIPPI,
      Myrinet, la la la...

      If you dislike iSCSI over Ethernet (and frankly,
      it's only interesting in low-performance cases where
      IP routing is important for WAN access to NAS,
      so I can understand your aversion), don't use it.
      But keep iSCSI in your toolkit. The interoperability
      and the option to route is extremely valuable.

      --
      -I like my women like I like my tea: green-
    4. Re:The only thing I know about iSCSI is ... by hlygrail · · Score: 2, Informative

      No avoidance is necessary, and unless you're familiar with iSCSI, your "evidence" is anecdotal at best.

      iSCSI works well, and it's as fast as the underlying network (ideally, a direct crossover connection to the target storage, so there's no other traffic to contend with). iSCSI is not as fast as FCAL (2 Gbit/sec max), but only because 10Gbit Ethernet isn't the main course -- yet.

      Usually, your storage pool will be Fibre Channel, and some servers connect via FC LUNs (via an FC switch), others will connect via iSCSI LUNs. In either case, the storage array itself is not usually JBOD -- it needs to be something like RAID4 or RAID5 (or at least RAID1 / RAID0+1) to preserve the data integrity in the case of a simple disk failure.

      In any case, FC-AL has one fatal flaw -- an open loop. (Think of FCAL as "token ring," because that's essentially what it is.) If a disk fails "open" (I see this occasionally as a storage engineer for a major SAN/NAS vendor), or a cable or GBIC on the loop fails, the entire loop is taken offline.

    5. Re:The only thing I know about iSCSI is ... by Anonymous Coward · · Score: 0

      I've never heard of ATA over ethernet. How/why do you think it's better than iSCSI?

      Also, I haven't found iSCSI too challenging to configure (though I've no experience with MPIO- maybe that's hard). Certainly it's no harder than FC ...

  4. Differences shouldnt matter by mnmn · · Score: 3, Informative

    Both iSCSI and FC are networked versions of SCSI, and all 3 technologies are much faster than their respective disks, thereby not being the bottleneck at all. After Ultra160, the standard PCI bus is saturated, and 64-bit PCI like PCI-X is needed for Ultra320, all the while usually even in the burst mode (from cache) disks can't saturate this available bandwidth, say 6x RAID5 15K RPM disks in read mode.

    FC and iSCSI are much more expensive than SCSI Ultra320, which is commodity hardware now. FC just sends the data in optic to outside the system, where larger datawarehouses can be managed instead of getting bigger and bigger Unisys boxen.

    So if you need terabytes of data all in one place (I mean at least 10 terabytes), consider iSCSI and FC and putting the disks outside the system for better management. We are getting a NAS solution to replace our backup tapes, requirement was 1.2TB. We will get 4x 300GB Maxtor Maxline II SATA disks... the slow cheap ones, and put them in an IBM xSeries 206 which are going at $500 CDN, with an Adaptec RAID card.

    Up to 16 SATA 400GB disks can be managed by a simple Adaptec RAID card, beyond that, think FC arrays.

    --
    "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
    1. Re:Differences shouldnt matter by Xross_Ied · · Score: 1

      See...
      http://www.seagate.com/cda/products/discsales/enterprise/tech/0,1084,656,00.html

      "Formatted Int Transfer Rate (min) 85 MBytes/sec
      Formatted Int Transfer Rate (max) 142 MBytes/sec"

      5 * 85MB/sec = 425MB/sec

      This provides a rough estimate of how much sustained throughput 15K drives have.
      Throw in SCSI command overhead and 5 15K drives can saturate an Ultra320 channel.
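
      The same back-of-the-envelope check, as a rough sketch. The 85 and 142 MB/sec figures are the ones quoted from the spec sheet above and 320 MB/sec is the nominal Ultra320 rate; the function names are hypothetical, and the sketch assumes every drive streams at once and ignores command overhead:

```python
import math

# Figures taken from the comment above; everything else is illustrative.
ULTRA320_MB_PER_S = 320     # nominal Ultra320 bus bandwidth
DRIVE_MIN_MB_PER_S = 85     # "Formatted Int Transfer Rate (min)"
DRIVE_MAX_MB_PER_S = 142    # "Formatted Int Transfer Rate (max)"


def aggregate_mb_per_s(drives: int, per_drive_mb_per_s: int) -> int:
    """Combined streaming rate if every drive transfers at the same time."""
    return drives * per_drive_mb_per_s


def drives_to_saturate(link_mb_per_s: int, per_drive_mb_per_s: int) -> int:
    """Smallest drive count whose combined rate reaches the link rate."""
    return math.ceil(link_mb_per_s / per_drive_mb_per_s)


print(aggregate_mb_per_s(5, DRIVE_MIN_MB_PER_S))                  # five drives at the minimum rate
print(drives_to_saturate(ULTRA320_MB_PER_S, DRIVE_MIN_MB_PER_S))  # drives needed at the minimum rate
print(drives_to_saturate(ULTRA320_MB_PER_S, DRIVE_MAX_MB_PER_S))  # drives needed at the maximum rate
```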

      --
      This sig space tolet, reasonable rate.
    2. Re:Differences shouldnt matter by floodo1 · · Score: 0

      those figures are nowhere near correct. top end 15k drives don't eclipse 100MB/sec yet sustained. those figures are only for cache bursting.

      still tho approx 5x 15k drives could saturate 320, but only when they're all at max speed.

      --
      I KUT J00 M4NG!!!
  5. As always, the answer is "depends..." by loony · · Score: 5, Informative
    Unfortunately you omitted to say what size of installation you're talking about... I work at a large company and lay out and maintain a development shop for like 600 developers... I prefer direct attached disks for a simple reason - if something fails, I have fewer people complaining about it :-)

    Seriously, it's not so much a speed issue but an issue of how you manage your environment. You can get enough performance out of all three solutions and there are other, more important things to look at.
    • Price: direct attached storage being the cheapest usually, then iscsi being a little more and fibre being by far the most expensive.
    • Redundancy: the higher the price, the more redundancy you usually get - it's not a technology issue but a cost issue again... dasd is usually at the same level as iscsi since most manufacturers just give you a preconfigured computer with dasd and then sell it as iscsi... Fibre gives you the highest level of redundancy - because if you already go spend 5 million on Symmetrix frames it's easy to justify the additional $400K for one more smaller Connectrix. If you only talk about $4000 disk trays it's hard to justify having an electrician coming in to give you redundant power for another $3000...
    • Clustering is another reason to go with fibre or iscsi - direct attached storage is usually a bad idea there...
    • Backups are another consideration.. Backup through ethernet is much slower than fibre tape units through a san...
    • Fibre and iscsi are not dedicated. So if I have a server with direct attached disks that happily does 2000 io/sec today, it will do so tomorrow unless that server has issues. If I have fibre or iscsi and let's say I run year end reports on another box, that heavy load there can drop my io rate to 20% or even lower.
    • Fibre is a bitch to configure the first time. direct attached disks are easy and iscsi is usually manageable - but getting fibre right the first time is much more difficult. You'll break a fibre cable here cause it's so delicate, have a bad gbic there and then can't get the san wwn masks right - and you just had the most frustrating 3 weeks of your life...


    As you see, I would recommend worrying less about the performance and more about what you really need and what your environment looks like.
    If you just want performance for cheap, then local disks are unbeatable. Instead of spending money on expensive fibre or iscsi offload controllers buy tons of cheaper scsi cards, instead of running fibre and buying a fibre switch buy tons of disk drives.
    Most people make the mistake of worrying about capacity - in reality, it's the number of spindles you need to look at and the capacity of your storage is a result of that. Figure, each disk attached gives you about 100 i/o ops per second. If you need to do 5000 ops, you need 50 disks - no matter how they are connected. Figure a disk can get you 20 MB/sec - if you need 200MB/sec throughput, that's 10 disks.
    In that example, I need 50 disks then to satisfy my requirements. Next, take the max throughput and divide it by 3 - that's 66MB/sec for fibre then, 100MB for scsi and 33MB for iscsi over gigabit. So to run my 200MB/sec I'd need only 2 scsi channels, 3 fibre or 6 iscsi connections.
    Next of course can't have 50 drives on 2 channels - more than 3 disks drop the transfer rate dramatically... Since fibre and iscsi mask the physical spindles they don't care but I need to have 16 scsi controllers to really run the disks.
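
    The rules of thumb above can be sketched roughly like this. The per-disk figures (100 ops/sec, 20 MB/sec) and the divide-by-3 derating come from the post; the nominal link rates (320, 200 and 100 MB/sec) are assumptions chosen to match the derated numbers quoted, and the function names are made up for illustration:

```python
# Rough sizing sketch based on the rules of thumb in the post above.
# Assumption: nominal link rates are inferred, not stated by the poster.

IOPS_PER_DISK = 100        # "each disk attached gives you about 100 i/o ops per second"
MB_PER_S_PER_DISK = 20     # "a disk can get you 20 MB/sec"
DERATE = 3                 # "take the max throughput and divide it by 3"

NOMINAL_LINK_MB_PER_S = {"scsi": 320, "fibre": 200, "iscsi": 100}


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def disks_needed(iops: int, mb_per_s: int) -> int:
    """Spindle count is driven by whichever requirement needs more disks."""
    return max(ceil_div(iops, IOPS_PER_DISK), ceil_div(mb_per_s, MB_PER_S_PER_DISK))


def links_needed(mb_per_s: int, link: str) -> int:
    """Connections needed when each link is only trusted for 1/3 of nominal."""
    return ceil_div(mb_per_s * DERATE, NOMINAL_LINK_MB_PER_S[link])


print(disks_needed(5000, 200))
for kind in ("scsi", "fibre", "iscsi"):
    print(kind, links_needed(200, kind))
```

    Capacity never appears as an input here, which is the point of the post: it falls out of the spindle count rather than driving it.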

    Peter.
    1. Re:As always, the answer is "depends..." by mrscott · · Score: 2, Informative

      Peter, I did fail to include some important information. We're looking at an initial implementation in the 4TB range. The goals are:

      * Provide clustering ability
      * Snapshots for data protection
      * Easier/more efficient storage allocation
      * More reasonable backups

      On the clustering side, we're rolling out Exchange 2003 and Citrix in the next few months, and I'd like them to be highly available - hence clustering. We're also just simply wasting DAS space on each of about 35 total servers. Eventually, I see us at around 30 servers (through consolidation) all connected to an iSCSI SAN.

      We're looking at both LeftHand and EqualLogic right now. As I add physical units, I also add more network connections and the EqualLogic solution automatically stripes all data across new units as they're brought online. In theory, that means I get additional I/O to the box. Further, EqualLogic is releasing a multipath I/O driver for Windows in early January. So, (again, in theory and what I've been reading), since each server will have four 1Gb NICs, I can use one for client communication and two for communication with the iSCSI array, providing both load balancing and fault tolerance. If I go down this road, I will have separate switches between the server farm and the storage array. I don't want to mix client traffic and storage traffic.

      I do want decently performing storage for cheap, but also want some other things like the ability to cluster servers and provide a higher level of data protection, neither of which I can do at present. The way I understand things: if I need more performance out of my iSCSI array, I just add more processors and network connections. Of course, this won't necessarily provide an individual server with better performance, but the system as a whole should improve. Does that make sense?

      Scott

    2. Re:As always, the answer is "depends..." by Xross_Ied · · Score: 1
      We're also just simply wasting DAS space on each of about 35 total servers.


      Have you estimated a $$$ cost for this wasted disk space?

      Even if you factor whatever extra cost you want to associate managing DASD vs a SAN, do these extra costs justify the cost of a SAN?

      Ok iSCSI *might* be cheaper but does it improve the way you manage storage or does it just abstract the problem away from your host OS(es)?
      --
      This sig space tolet, reasonable rate.
    3. Re:As always, the answer is "depends..." by hlygrail · · Score: 1

      [shameless plug]

      It sounds like your environment and requirements might be a good candidate for a Network Appliance filer... NetApp pioneered iSCSI, and you'd also get the benefit of instant snapshot backups, the fact that they already have (working) MPIO drivers for Windows, the performance of a RAID4 Fibre Channel storage system, as well as enterprise-level clustering/backup/recovery and a host of other bullet items you can find on your own.

      At risk of sounding like a dreaded sales person, expand your search a bit and look beyond LeftHand and EqualLogic -- they're only providing part of the picture that NetApp (and others) already have for the taking.

      And, yes, I own NetApp stock ... as do several of my mutual funds. :)

      [/shameless plug]

    4. Re:As always, the answer is "depends..." by ostiguy · · Score: 1

      are lefthand and equallogic on the windows hardware compatibility list? i know another follow up shamelessly plugs netapp, but i am fairly certain they are on the list. if the hw is not on the list, MSFT could tell you to pound sand if you have a support issue. (imagine you buy 3 million worth of HW, but for some reason exchange runs like a dog)

      ostiguy

  6. The acid test: by hlygrail · · Score: 5, Informative

    FC-AL is the "gold standard" for performance and reliability, but has limitations when you want to expand clustering (FC switches still cost gobs of $$$). Fibre Channel cabling also has distance limitations that go away with iSCSI.

    The acid test -- for me anyway -- is seeing LARGE customers (banks, airlines, government agencies [pick a 3-letter acronym], pharmaceuticals, major industries such as oil and energy, entertainment companies and movie studios, etc.) implementing iSCSI on an equally LARGE scale, and quite successfully.

    With few exceptions, if the underlying Ethernet network is functioning properly, iSCSI performs remarkably fast. No, you won't get the 2 Gbit FCAL rates -- *yet* -- but we have customers running dozens of large (>TB) databases off a single appliance over one or two GigE ports and iSCSI.

    Generally, it's recommended to segment off the iSCSI traffic so it's not routed or mixed with public traffic anyway, but even those (small) customers that pipe all of their storage appliance data through a single 10/100 Ethernet interface only have problems if they put too many users on there as well. (A direct crossover cable is *ideal* for iSCSI.)

    In addition, the Microsoft iSCSI initiator has finally outgrown its initial bugs/problems (with our help in some cases), and is darn solid with plenty of different targets.

    I'd love to drop a bunch of example company names, but I'm sure those companies consider that information to be competitive, and it's not my place to divulge it. Any large company you can think of already has an investment in FC-AL, and all but a very small percentage have iSCSI infrastructures as well. Medium-size and small (50 employees) companies are also seeing HUGE benefits from their own iSCSI implementations.

    The 0-day acid test (which works amazingly well in our labs with the right HBAs) is SAN booting over iSCSI. Imagine having an nnn-Terabyte set of storage, from which ALL your servers boot EVERYTHING. Not a single magnetic disk is required in the servers themselves. Makes server clustering and blade/grid computing so very attainable ... provided the network will drive it. 10Gbit Ethernet and more will definitely fuel this migration, sooner than you think!!

    (As an engineer for a major storage vendor (FC/iSCSI/near-line IDE storage/archiving), I work with all of this stuff on a daily basis. Not saying I'm an expert, just that I kinda know what I'm spouting here...)

    1. Re:The acid test: by Nutria · · Score: 1, Flamebait

      Imagine having an nnn-Terabyte set of storage, from which ALL your servers boot EVERYTHING. Not a single magnetic disk is required in the servers themselves.

      That's called, "having all your eggs in 1 basket", and we all know what a bad idea that is...

      --
      "I don't know, therefore Aliens" Wafflebox1
    2. Re:The acid test: by hlygrail · · Score: 1

      Not necessarily.

      Having them all in one place makes mirroring the entire data set in real time more feasible. Heck, you can even mirror across a WAN link and have an offsite DR location that's always in sync with the original nnn TB data set.

      Having all the disks in one place can also allow for more security. Admins may have access to the servers, but not the media itself, so there's diminished risk in an entire hard disk going "missing," a la NASA JPL and/or Los Alamos. If you guard that, plus don't allow any other removable media (USB keys, floppies, optical media) and only specific, verified/authenticated network access, you've covered a large swath of access points to that data.

      In cases of medical/privacy data, guarding a single physical location is certainly preferable to having to guard multiple...

      I can think of many reasons TO have all the disks in one place, and only a few reasons to leave them in the servers as JBOD or a server-based RAIDn array.

    3. Re:The acid test: by a11 · · Score: 1

      Let me guess, you're an MBA; you sir, should not comment on shit you know nothing about, even though that's likely your job. would you also say that having all your client-server communication done over the internet is having all your eggs in 1 basket? Yes, it's one set of storage, but the SAN has multiple fabrics, each logical disk having not a single point of failure. A sample path would be a protected raid device on the back, presented down two FAs, each connected to a separate fabric, each fabric connected to a separate HBA.
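
      One way to sanity-check a layout like the sample path above is to look for any component that every path shares. A toy sketch, with invented component names (hba0, fabricA, fa0 and so on are placeholders, not anything from a real SAN tool), and treating the protected RAID device itself as out of scope:

```python
# Toy check for single points of failure between a host and a LUN.
# Assumption: component names and the data layout are purely illustrative.

def single_points_of_failure(paths):
    """Return the components that appear on every path (empty set if none)."""
    if not paths:
        return set()
    common = set(paths[0])
    for path in paths[1:]:
        common &= set(path)
    return common


# Two HBAs, two fabrics, two front-end adapters (FAs): nothing is shared.
dual_fabric = [
    ["hba0", "fabricA", "fa0"],
    ["hba1", "fabricB", "fa1"],
]

# Both HBAs cabled into the same fabric: that fabric is shared by every path.
shared_fabric = [
    ["hba0", "fabricA", "fa0"],
    ["hba1", "fabricA", "fa1"],
]

print(single_points_of_failure(dual_fabric))
print(single_points_of_failure(shared_fabric))
```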

    4. Re:The acid test: by hbackert · · Score: 2, Insightful

      That's called, "having all your eggs in 1 basket", and we all know what a bad idea that is...

      About the booting part: at work we boot from local disks because we Unix SAs don't have control of the EMCs, thus if a machine does not boot, we cannot do much beside calling someone who has no idea about Sparc machines. If we Unix SAs were able to control the EMCs and everything related like the FC switches, then we would boot directly from SAN. If I were able to control the iSCSI storage box, the routers and switches for the iSCSI SAN, then I see no problem of not booting off the local disks. After all, if a machine does boot but all the data and apps is not reachable, the machine is not very useful. A not booting machine is not much better.

    5. Re:The acid test: by Nutria · · Score: 1

      you're an MBA

      Don't pull shit out your ass. I'm a DBA, with a BSci in Comp. Sci.

      We have a Hitachi SAN, and one day, guess what? The SAN crashed!! All systems using it were down for a day: AS/400s, HP PA-RISC boxen, large Alpha VMS servers, and mainframes.

      So fuck you and the horse you rode in on.

      --
      "I don't know, therefore Aliens" Wafflebox1
  7. best speed to clusterf*ck... by mkcmkc · · Score: 3, Funny
    An alternative benchmark to consider: With old-fashioned direct attached SCSI, you can pretty much only screw up the disks one host at a time. With FC (and iSCSI I suppose), you can completely fuck up your hundred-million-dollar site with one mouse click. Now that's power!

    Mike

    --
    "Not an actor, but he plays one on TV."
    1. Re:best speed to clusterf*ck... by hlygrail · · Score: 2, Informative

      Almost...

      iSCSI LUNs can be pretty friggin' huge. More than once, I've seen a customer nuke the holy bejeebers out of a 950GB iSCSI LUN all at once, with a single click (the equivalent of "LUN DESTROY"). I actually had a WebEx session to a customer once and watched him delete a 1.3TB LUN before I could stop him.

      If you're on plain JBOD, you have to get out your restore tapes and pray to the tape gods. If you're on a SAN/NAS appliance (like NetApp), you can usually restore that data in a matter of seconds (literally) from snapshot. That's saved many an admin from getting insta-pinkslipped...

  8. Spindle transfer rates by grantsucceeded · · Score: 4, Informative

    The original question was regarding real world performance of iSCSI in particular, and since few of the posts seem to touch on that I may as well tell what I've learned from hard experience regarding the other technologies: SCSI and FCAL.

    My experience is with very high transaction volume OLTP databases (Oracle) backing a financial website. I've found that neither SCSI nor FCAL adapters limit performance significantly. This was with qlogic qla3200 adapters, or with high-end adaptec Ultra320, on Solaris 9 and the last few versions of RedHat enterprise. Only the older versions of redhat had some kind of problem with the qlogic driver, plus bounce buffer IO, which drove down performance. But then to be nit picky, that was the driver not the HBA. Solaris was always fine, and now redhat is too.

    The main performance challenge was *always* tuning the database and spreading out on lots of spindles. The HBAs at over 200M/sec each never posed a problem on larger sun boxes (8 or more procs) with 7 or 8 way parallel sequential reads going. On smaller hosts or smaller disk arrays, the problem was always on the host itself or the disk seek times respectively, not the HBAs themselves.

    A 10k rpm drive will do about 70 mbytes/sec off the outer platter (near block 0) and as a rule of thumb, a 2Gbit fcal adapter will do 200mbytes a second (at least on solaris or newer redhat EL). So my dual qlogics would do 400Mbytes a second under absolute optimal disk access, but typically it's not that perfect 8 way parallel *serial* scan off the outer platter, it's usually fairly random.

    So in the high end database applications (data warehouse or OLTP) at least, the usual tuning challenge (and $$ for that matter) is getting a fat spread across a lot of spindles, and making sure the application is either caching well (OLTP) or doing orderly, sequential scans (data warehouse).
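
    The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 70 and 200 MB/s figures are the rules of thumb quoted in this post, while the function name and the 5 MB/s random-access rate are hypothetical, picked just to show how quickly the spindle count grows once access stops being sequential.

```python
import math

DRIVE_SEQ_MB_S = 70   # 10k rpm drive, outer platter, sequential (figure from the post)
HBA_MB_S = 200        # one 2 Gbit FC-AL adapter (rule of thumb from the post)

def drives_to_saturate(num_hbas, drive_mb_s=DRIVE_SEQ_MB_S, hba_mb_s=HBA_MB_S):
    """Smallest number of drives whose combined rate reaches the adapters' combined rate."""
    total_hba_mb_s = num_hbas * hba_mb_s
    return math.ceil(total_hba_mb_s / drive_mb_s)

if __name__ == "__main__":
    print(drives_to_saturate(1))                 # 3 drives fill one adapter
    print(drives_to_saturate(2))                 # 6 drives fill the dual-adapter setup
    # Hypothetical 5 MB/s per drive under random access (an assumption, not a measurement)
    print(drives_to_saturate(2, drive_mb_s=5))   # 80 drives
```

    Under these assumptions the adapters only become the limit with a handful of drives streaming sequentially; with random access the spindles run out long before the HBAs do.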

  9. Bad ?. How about what do you want to do with it?? by draziw · · Score: 1, Informative

    "Hi. I'm thinking about buying a car. Which is the best one?"

    Ummm What are you doing with it? What's your budget, and what are your expectations? Some people think a TB is big, some consider 20+ TB a building block standard (I do).

    1. Look on the net and in mags. - read read read
    Is this a big implementation, or a small one? I'd bet a small one, or you probably wouldn't be asking here - that probably means you don't want iscsi. Direct FCAL on multi-port RAIDs might be your speed - if you aren't big enough for that, maybe some low-end storage from Dell or Apple might be your thing.
    2. Bring in some vendors, tell them what you want - ask them how to do it. - talk to a few.
    3. Ask about install costs to implement what you want to do. (A lot of people don't realize the work involved for large scale implementations)

    If you don't have the staff to know if iscsi is for you, you may not have the staff to implement and troubleshoot it. Direct attached fibre - or even via a switch will be far easier to deal with.

    If you can afford it, stay away from SATA. The problem is that it's sooo much cheaper vs FC disks on larger systems - but the seeks are slower, the mechanicals are poorer, and the command set less complete. - It's $ vs reliability.

    Cheers,
    a storage guy.

  10. Choose your metric... by Anonymous Coward · · Score: 0

    Throughput is limited by the number of spindles and the network speed (eg, 120MB/s on GigE). It doesn't matter if it's iSCSI or FC or DA. Caveat: I've found some of the iSCSI HBAs can't quite reach linespeed, but you'll be fine with a software initiator.

    Latency is probably going to be "a little" higher on iSCSI (over ethernet) than FC or DA.

    Host CPU cost is higher on iSCSI if you don't use a HBA.

    If you're serious about storage, manageability is shit on DA.

    Summary:
    Expensive, fast, manageable: FC
    Cheap, possibly a little slower, manageable: iSCSI
    Cheap, fast, impractical: DA
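
    The throughput point can be written as a simple "slowest stage wins" model. This is a rough sketch, not a benchmark: the 120 MB/s GigE figure is from this post, the 200 MB/s FC and 70 MB/s per-spindle figures are rules of thumb quoted elsewhere in the thread, 320 MB/s is the nominal Ultra320 bus rate, and the function and dictionary names are made up for illustration.

```python
# Approximate usable link rates in MB/s (thread figures and nominal values, not measurements)
LINK_MB_S = {
    "iSCSI over GigE": 120,
    "2 Gbit FC": 200,
    "Ultra320 DA": 320,
}

def deliverable_mb_s(spindles, per_spindle_mb_s, link_mb_s):
    """Throughput is capped by whichever is slower: the disks combined or the link."""
    return min(spindles * per_spindle_mb_s, link_mb_s)

def limiting_factor(spindles, per_spindle_mb_s, link_mb_s):
    """Name the stage that sets the cap in this simple model."""
    return "spindles" if spindles * per_spindle_mb_s <= link_mb_s else "link"

if __name__ == "__main__":
    for spindles in (1, 4):
        for name, link in LINK_MB_S.items():
            rate = deliverable_mb_s(spindles, 70, link)
            print(spindles, name, rate, limiting_factor(spindles, 70, link))
```

    With a single spindle the transport makes no difference in this model; the link only starts to matter once enough disks are streaming at the same time.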

  11. iSCSI vs. Fibre Channel... by Anonymous Coward · · Score: 0

    Is really apples-to-oranges.

    It would be theoretically possible to run iSCSI over IP-over-FC, for example.

    Now, if you want low-latency speed, get yourself an InfiniBand SAN, or 10GigE for roughly the same speed at somewhat higher latency (but open standards-based... yay)

  12. What you talkin about, Willis? by FreeLinux · · Score: 1

    The only reason I don't like it is there are very few server platforms (apple XServe being the exception) that boot from fibre-channel storage systems.

    I have an HP MSA 1000 fibre channel SAN that runs Linux, Netware, Windows 2000 and Windows 2003. The servers connect to the SAN via a fibre channel switch and HP(Emulex) HBAs in the servers. All systems boot from the SAN! There are no disks in any of the servers. This makes replacing failed server (hardware) a 3 minute plug-and-play operation. The hardest part is lifting the server in and out of the rack.

    I set it up about a year and a half ago and it wasn't easy at the time. Initially, I couldn't get it to work for anything but the first server. After a very long time fighting with it, HP came up with a firmware upgrade for the fibre channel HBA. After that it worked like a charm and was stupidly easy. Any OS should be able to do it as the "boot from SAN functionality" is actually in the HBA not the server hardware or OS. The HBA simply presents the 'physical' LUN, whatever it is, as a virtual LUN 0 to the server's BIOS. Presto, it boots!

  13. SATA by Hydraq · · Score: 1

    SATA scares me a little, simply because Native Command Queuing doesn't allow the OS to impose any kind of ordering constraints on the commands, the way that Tagged Command Queuing does.

    So, as I see it, in a power-failure, all that hard work done on ensuring that a modern file-system is always consistent just goes out of the window: meta-data updates happening before the data itself is updated. In fact, that's likely to happen quite a bit with many writes to separate files, since the metadata's slightly more likely to be grouped together.

    However, I don't know how this works out in practice, as opposed to theory from reading what the features provide. Do any OS/FS combinations actually support enabling even TCQ in a safe manner, instead of just a potentially unstable performance boost? If so, recommendations on an FS choice gratefully received (please, specific to this issue, not the usual /. FS holy wars).

    Disable NCQ and get a major performance hit? Or pick a platform (SCSI, ATA4) which supports TCQ and an OS/FS which then uses this intelligently for meta-data handling?

    Sure, there's UPS power and decent SAN storage arrays take battery-backup to decent quality levels, but with each network link being a point of failure the next time some monkey is let loose on "unrelated" cabling issues, and the general flakiness of data-center UPS kit not quite managing to perform as advertised, having SATA with NCQ in a reliable environment leaves me apprehensive and I really don't want to be responsible for a Supported System relying upon SATA/NCQ in a separate box from the box which has the file-system drivers in it.

    Or am I smoking crack?
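
    For what it's worth, the ordering worry above has an application-level analogue that can be sketched in Python. This is only an illustration of "data first, metadata second" using standard `os` calls; the helper name is hypothetical, and whether `fsync` really reaches the platters still depends on the drive's write cache and the OS/FS in use, which is exactly the open question here.

```python
import os
import tempfile

def durable_write(path, data):
    """Flush file contents before the name (metadata) is switched to point at them."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()               # push Python's buffer to the OS
        os.fsync(f.fileno())    # ask the OS to push the data to the device
    os.replace(tmp_path, path)  # rename only after the data flush has returned
    # Flush the directory entry as well (POSIX only; skipped where unsupported).
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)),
                         os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "example.dat")
        durable_write(target, b"payload")
        with open(target, "rb") as f:
            print(f.read())
```

    The sketch only expresses the intended ordering from the application's side; it cannot prove that a drive reordering queued commands will honour it.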

  14. the baskets aren't equal though... by jamesh · · Score: 1

    The "all your eggs in one basket" metaphor only works if all the baskets are equal. And if you are comparing iSCSI/FC to internal storage, you're talking about completely different 'baskets'.

    I could have a lot of fun with that metaphor but i'm sure you get the point without it.

  15. Really, the transport method isn't that important by Anonymous Coward · · Score: 0

    After doing consulting for an ISP installing a mail cluster for 70,000 users, we found that the interconnect speed has little to do with performance. In this instance, the filesystem was the biggest bottleneck and it normally wouldn't have been if it wasn't for file locking race conditions. There is no way to tell if a given system will be better or worse for your application, and it is true... you can't find accurate benchmarks. I would suspect that OSDL would be in a good position to do reviews because they have lots of "big iron" hardware.

    My gut feeling is that iSCSI is kind of in its infancy and that 2Gb fiber channel is pretty much standardized. As the previous posters said, the interface speed is less important than the speed of the storage array controller and drives.

    For a different project, we used a 2Gb fiber channel disk array of 16 250GB hard drives. These gave a single machine r/w rate of about 75 megabytes a second (on par with SCSI). There are LOTS of people who resell Intel controller based RAID boxes with either fiber channel or SCSI 3 interconnect.

  16. Reverse iSCSI? by bill_mcgonigle · · Score: 1

    Has anyone seen a reverse-iSCSI adapter? Sorry, this isn't the best possible name, I'm sure.

    I've seen plenty of adapters that you can hook to a SCSI device and call it an iSCSI device but that assumes you have a proper iSCSI HBA.

    What I need is something I can attach to a SCSI HBA that has an ethernet connector on the other side, so I can connect it to a network and do iSCSI over it.

    This particular instance is for an old machine that no one's ever going to make an iSCSI HBA for, to connect it to a tape drive on the other end of a Cat 6 cable run a hundred feet or so away.

    It seems obvious but I haven't been able to dig one up on Google or Processor.com. I've even written a SCSI/iSCSI bridge manufacturer who was intrigued but unable to help.

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)