Slashdot Mirror


"iSCSI killer" Native in Linux

jar writes "First came Fibre Channel, then iSCSI. Now, for the increasingly popular idea of using a network to connect storage to servers, there's a third option called ATA over Ethernet (AoE). Upstart Linux developer and kernel contributor Coraid could use AoE to shake up networked storage with a significantly less expensive way to do storage -- under $1 per Gigabyte. Linux Journal also has a full description of how AoE works." Note that the LJ article is from last year; the news story is more recent.

39 of 235 comments

  1. AOE? by laffer1 · · Score: 5, Funny

    I didn't know Age of Empires can do network storage! WTG Microsoft!

    1. Re:AOE? by Gattman01 · · Score: 2, Funny
      a fireball is a single target attack!


      I'm sure that'll go over great with your party fighting enemies in a narrow hallway.
      I'm sure the DM and your party members will be VERY forgiving when they have to create new characters.


      I forget, the AoE of Fireball is either 5 feet or 5 meters. Either way, using it in a small room is not a good idea when you're in the room, unless you don't like your "friends."
    2. Re:AOE? by Smelecat · · Score: 2, Funny

      Use AoE with caution. In a crowded data center, AoE will agro nearby equipment.

  2. Will it catch on? by andrewman327 · · Score: 4, Insightful
    From TFA:
    Some significant caveats mean that not everyone is so keen on the technology. For a start, it's a specification from Coraid, not an industry standard. Its networking abilities are limited. And its detractors include storage heavyweights such as Hewlett-Packard and Network Appliance.


    So will this ever develop into a real standard or will it remain the sole domain of one company? I do not know if I want to invest time and money into it if the latter is true. From a comp sci point of view this is a great approach to networked storage. It uses what people already have to make storage relatively cheap. I am going to wait to see where this technology goes. Maybe it will blossom and become a serious contender.

    --
    Information wants a fueled airplane waiting at the hangar and no one gets hurt.
    1. Re:Will it catch on? by SpecTheIntro · · Score: 3, Informative
      For a start, it's a specification from Coraid, not an industry standard.

      I don't know that this is true, because the LinuxJournal article directly contradicts it. (Unless I'm misreading it.) Here's what the LJ says:

      ATA over Ethernet is a network protocol registered with the IEEE as Ethernet protocol 0x88a2.

      So, it looks like the protocol has been officially registered and was granted approval by the IEEE--so that makes it an industry standard. It may not be adopted yet, but it's certainly not something like 802.11 pre-n or anything; there's an official and approved protocol.
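      To make the "Ethernet protocol 0x88a2" point concrete, here is a minimal sketch of what an AoE frame header could look like on the wire. Only the EtherType value comes from the LJ quote; the 10-byte field layout (version/flags, error, shelf, slot, command, tag), the function name and the example MAC addresses are assumptions based on a reading of the published AoE specification, so treat this as an illustration rather than a reference implementation.

```python
import struct

AOE_ETHERTYPE = 0x88A2  # EtherType quoted from the Linux Journal article


def build_aoe_header(dst_mac: bytes, src_mac: bytes, shelf: int, slot: int,
                     command: int, tag: int) -> bytes:
    """Pack an Ethernet header followed by a 10-byte AoE common header.

    Assumed layout (not confirmed by the thread): version/flags, error,
    major (shelf), minor (slot), command, tag.
    """
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    ver_flags = 1 << 4  # version 1 in the high nibble, no flags set
    ethernet = struct.pack("!6s6sH", dst_mac, src_mac, AOE_ETHERTYPE)
    aoe = struct.pack("!BBHBBI", ver_flags, 0, shelf, slot, command, tag)
    return ethernet + aoe


# Hypothetical addresses; command=1 is assumed to mean "query config".
frame = build_aoe_header(b"\xff" * 6, bytes.fromhex("020000000001"),
                         shelf=0, slot=0, command=1, tag=0x1234)
print(len(frame), frame[12:14].hex())
```

      Note there is no IP or TCP header anywhere in the frame, which is exactly why the protocol is cheap to implement and also why it cannot cross a router.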

    2. Re:Will it catch on? by hpa · · Score: 4, Informative
      So, it looks like the protocol has been officially registered and was granted approval by the IEEE--so that makes it an industry standard. It may not be adopted yet, but it's certainly not something like 802.11 pre-n or anything; there's an official and approved protocol.

      Anyone can register a protocol number with IEEE by paying a $1000 fee. It doesn't mean it's a protocol endorsed by IEEE in any shape, way or form.

    3. Re:Will it catch on? by Harik · · Score: 2, Informative

      The non-routability is a killer. Protocol-level bridging, no off-site redundancy, strict dependencies on port location. No thanks, it's a toy protocol that may get some use in the home NAS market, but it was hell to implement a reliable setup in our lab under controlled conditions. I'd hate to have to deploy it 'for real'.

      The only way to really do it is to purchase a dedicated Block Controller (spare ethernet card) and a dedicated Block Data Cable (Cat 5) and hook it up to a dedicated Block Device Multiplexer (switch). If you want a replacement for FibreChannel and are willing to live with the limits of direct local physical connections, it's useful.

      Just have fun getting those frames into a xen/vmware virtual host from an external machine...

  3. Cheaper? by DSW-128 · · Score: 4, Interesting

    I guess I don't really see how it's cheaper than iSCSI? Sure, there's less overhead from the lack of TCP/IP, so you may not need as massive a network to drive it equally. But I've been under the understanding that iSCSI doesn't require SCSI drives, so you could build an iSCSI target out of the same machine/drives as an AoE host, correct? For some applications, I think the lack of TCP/IP might be a benefit - less opportunity to hack. (Then again, I'd expect anybody deploying something like this or iSCSI would drop the few extra $$$ to build a parallel network that transports just storage.)

    --
    This .sig is printed on 100% recycled electrons, but is best viewed using 100% fresh photons.
    1. Re:Cheaper? by hpa · · Score: 2, Informative
      The main advantage of AoE is that it's simple enough that you could build it in hardwired silicon if you wanted to, or use small microcontrollers way smaller than what you'd need to run a fullblown TCP stack (this is what Coraid does, I believe.)


      The main disadvantage with AoE is that it's hideously sensitive to network latency, due to the limited payload size.
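      A rough sketch of why small payloads make latency hurt, under two loudly stated assumptions: a standard-MTU frame carries 1024 bytes of data (two 512-byte sectors), and only one request is in flight at a time. Real initiators keep several requests outstanding, so this is a pessimistic bound, not a measurement.

```python
def max_throughput_mb_s(payload_bytes: int, round_trip_s: float) -> float:
    """Upper bound with one request outstanding: one payload per round trip."""
    return payload_bytes / round_trip_s / 1e6


PAYLOAD = 1024  # assumed: two 512-byte sectors per standard Ethernet frame

for rtt_ms in (0.1, 1.0, 10.0):
    rate = max_throughput_mb_s(PAYLOAD, rtt_ms / 1000)
    print(f"RTT {rtt_ms:5.1f} ms -> at most {rate:6.2f} MB/s per outstanding request")
```

      Every tenfold increase in round-trip time cuts the per-request ceiling tenfold, and with so little data per frame there is not much to amortize the wait over.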

    2. Re:Cheaper? by NSIM · · Score: 2, Informative

      You are quite correct, there is no requirement for SCSI drives in an iSCSI implementation. iSCSI refers to the protocol, not the drive interface, i.e. it's the SCSI command protocol implemented over TCP/IP. So yes, you can build an iSCSI system out of commodity parts and many people are doing so. If you want to get an idea of the options out there for doing this, take a look at: http://www.byteandswitch.com/document.asp?doc_id=96342&WT.svl=spipemag2_1

    3. Re:Cheaper? by tbuskey · · Score: 2, Informative

      I hacked together an iSCSI setup from some old hardware.

      2 P II 400MHz systems running FC4
      One system had software raid 0 on 2 IDE drives.
      The target has a spare 10GB IDE drive.

      Added 2 10/100T cards with a crossover cable.

      Did a quick dd if=/dev/zero count=some large number of=the raid mirror or iSCSI target.

      The iSCSI target was 30% slower.
      Way cool.
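      For anyone who wants to repeat that kind of comparison without dd, a minimal sketch of the same idea: write a run of zero-filled blocks to a target and time it. The function name, block size and count are placeholders chosen for illustration, and the example writes to a temporary file rather than a real RAID or iSCSI device.

```python
import os
import tempfile
import time


def timed_zero_write(path: str, block_size: int, count: int):
    """Write `count` zero-filled blocks to `path`; return (bytes, MB/s)."""
    block = bytes(block_size)
    start = time.perf_counter()
    with open(path, "wb") as target:
        for _ in range(count):
            target.write(block)
        target.flush()
        os.fsync(target.fileno())  # include the time to reach the device
    elapsed = max(time.perf_counter() - start, 1e-9)
    total = block_size * count
    return total, total / elapsed / 1e6


with tempfile.TemporaryDirectory() as scratch:
    written, rate = timed_zero_write(os.path.join(scratch, "zeros.bin"), 64 * 1024, 64)
    print(f"wrote {written} bytes at {rate:.1f} MB/s")
```

      Run it once against each target and compare the two rates; a small test like this mostly measures cache, so use a size well beyond RAM for anything meaningful.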

    4. Re:Cheaper? by Zephiris · · Score: 3, Insightful

      The really silly thing about this is that they claim it's "lower overhead" than TCP/IP because people are having to buy "expensive TCP offloading engines" for iSCSI, when a few seconds of research, namely on Wikipedia (http://en.wikipedia.org/wiki/ISCSI), shows that plain NICs can outperform the offloading ones. Sure, it's obviously going to be lighter than TCP/IP. However, ATA over Ethernet only has basic authentication (MAC addresses, which can be forged cheerily), can't be routed, and isn't very available. It's -only- usable for a Storage Area Network, not really for general remote drive (or even part of a drive) access. Currently, only Linux support is available. iSCSI is supported by Windows, Linux, and Solaris, among others. Even FreeBSD is working on a native implementation. Windows Vista will even include fully built-in/native support for iSCSI.

      I can't imagine why they complain that iSCSI is 'more expensive' to implement, when their primary product for ATA over Ethernet is a 'special drive enclosure' (according to their documentation, you can't even use AoE with standard networking hardware, interfaces, routers, etc) with special networking hardware which can house up to 15 ATA drives. The enclosure itself (with nothing else) costs about $4000. You could build ten high-end machines dedicated to serving iSCSI requests to multiple drives each for that (five if they use actual SCSI), and still use standard networking hardware, and still have it accessible from a network across the world, with things like actual user authentication.

      The whole ATA over Ethernet thing seems like trying to blow smoke up the arses of some very rich and silly people. At the same time, the technologies are rather different, too. If you just want to build a SAN? Sure, go for HyperSCSI or AoE, maybe, but if you actually want remote drive access? Why would you want any of this? They shouldn't be trying to utterly replace iSCSI. It's absurd. As far as I see it, iSCSI is more of a general and free/open replacement for things such as the old 'X drive' remote service, and network filesharing like SMB/NFS. Websites can (and are starting to) offer iSCSI targets to offer remote drives for backup. It can also be used for cheap SAN, or more-or-less replacing SMB/NFS over a network. It does all of this rather well.

      It seems to me that the company behind ATA over Ethernet is becoming rather desperate to resort to such claims.

      --

      "A Goddess rarely smiles for she is forced by others to be an island unto herself." - Zephiris
  4. Yes! by mihalis · · Score: 3, Interesting

    I like the look of this technology. The great thing it has going for it is that most of the non-hard-disk infrastructure (switches and cabling) leverages the tremendous investment in ethernet. That is great.

    The thing that needs work, in my view, is that the bit that links the disks and the rest isn't cheap enough. In fact what would be awesome here is if, say, Seagate provided disks with native ATAoE connectors built-in. They might have to buy Coraid for that to happen.

    In case anyone thinks I'm out of my mind here, don't forget that disks can already be had with ATA interface, SCSI interface, FCAL interface, SATA, SAS - that's five and there are probably more. Yes they might be a bit more expensive, but if they come in under the combined price of "regular ATA disk" + Coraid ATAoE disk adapter then you'd come out ahead. Someone like Seagate would, I think, have the industry-wide clout and respect to succeed in making this an open standard. Something that will be a challenge for Coraid for a long time (I have nothing against them, btw, they are friendly and their mailing list didn't spam me when I signed up).

    When I was on the OpenSolaris pilot project I tried to get people interested in using this with Solaris. I think it might be great for ZFS, for example. At that point the real storage wizards were more interested in iSCSI, but I respectfully disagree, OpenSolaris + ZFS + cheap storage = awesome file server. Emphasis on the cheap. As Sun people will admit, their previous attempts at RAID were more like RAVED (Redundant Array of Very Expensive Disk). Coraid does have a Solaris driver, so this is definitely feasible.

    1. Re:Yes! by Lisandro · · Score: 2, Funny

      I like the look of this technology.

          It's the eyeliner. It doesn't look half as good in the morning.

  5. iSCSI killer? by apharov · · Score: 4, Interesting

    In the context of using this in low-cost environments with Linux I can hardly see how this could kill iSCSI. Last week I implemented an iSCSI setup for about 500 euros (target serves out 500GB disk space for non-critical backup) using standard components, FC5, iSCSI Enterprise Target and Microsoft iSCSI Initiator.

    Works great and is a lot (>10x) faster than the similarly priced NAS device that was used for the same task before.

  6. Re:Reliability by dfghjk · · Score: 2, Insightful

    how is that relevant to the discussion of protocols?

    reliability of SCSI versus ATA is largely imagined and the rest is intentional. drive manufacturers want you to believe their enterprise drives are more reliable and right now those drives are largely SCSI.

  7. iSCSI can talk to ATA drives by Anonymous Coward · · Score: 2, Insightful

    iSCSI is a protocol. ATA disks are a physical medium. They work together, and you get the benefits of SCSI commands with the price of ATA disks. Just because iSCSI is the protocol does NOT mean that you need to use SCSI disks. You might even be talking to a RAID of ATA disks and not know it.

    So, why would you need AoE? It's already cheap, and been for sale for some time.

  8. Re:How does it lower costs? by Unknown+Relic · · Score: 2, Informative

    Oops, only the linux journal article is down, the cnet article has answered my question: it isn't any cheaper than iSCSI + SATA solutions. $4,000 without any drives, compared to a starting price of $5,000 for a StoreVault (new from NetApp) with 1TB of storage. Other options such as Adaptec's Snap Server start just as cheap.

  9. Re:Reliability by SpecTheIntro · · Score: 3, Informative
    People often forget there is a considerable difference in the reliability of ATA drives versus SCSI. If you are going to use some sort of ATA based SAN be prepared for disk failures much sooner than if they were SCSI.

    This is not necessarily true. It all depends on how your network storage is being used. SCSI drives are built and firmware'd for the sole purpose of running a server, and they consistently beat any ATA drive (be it IDE or Serial) when it comes to server performance and reliability. ATA drives just aren't built to handle the sort of usage a server requires--note that this isn't a reflection of quality, but of purpose. But a file server (which is the only thing the SAN would be used for) requires much less robust firmware than a server housing MySQL, PHP, maybe a CRM suite, e-mail server, etc.--and so ATA drives shouldn't immediately be ruled as less reliable. The maturity of the technology plays a more important role than the interface.

  10. Re:Another "Killer" by wasabii · · Score: 4, Informative

    AoE is a networked block device technology. NFS and Samba are network file system. One is about making block level access to a device available over the network, the other is about making file operations available.

    In the case of AoE, a single remote block device can be shared between multiple systems. Each client could issue its own reads/writes. In combination with a distributed file system, each node could mount the same FS.

    It's the same as NBD, iSCSI, Shared SCSI, and Fiber Channel.
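    The block-versus-file distinction can be sketched in a few lines. This toy class is purely illustrative (the names and the in-memory backing store are assumptions, not any real driver's API): a block device only knows numbered, fixed-size sectors, and everything about files and directories is the job of whatever filesystem the client puts on top.

```python
import io

SECTOR_SIZE = 512


class BlockDevice:
    """Toy block device: fixed-size sectors addressed by number, no filenames."""

    def __init__(self, backing: io.BytesIO):
        self.backing = backing

    def read_sector(self, lba: int) -> bytes:
        self.backing.seek(lba * SECTOR_SIZE)
        return self.backing.read(SECTOR_SIZE)

    def write_sector(self, lba: int, data: bytes) -> None:
        if len(data) != SECTOR_SIZE:
            raise ValueError("must write exactly one sector")
        self.backing.seek(lba * SECTOR_SIZE)
        self.backing.write(data)


# An 8-sector "disk" held in memory stands in for the remote device.
disk = BlockDevice(io.BytesIO(bytes(SECTOR_SIZE * 8)))
disk.write_sector(3, b"A" * SECTOR_SIZE)
print(disk.read_sector(3)[:4], disk.read_sector(2)[:4])
```

    A protocol like NFS or SMB would instead expose operations such as open, read and rename on named files, with the server owning the on-disk layout.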

  11. Bootable? by Anonymous Coward · · Score: 2, Interesting

    Is it possible to boot WindowsXP via AoE or iSCSI? I want a diskless WindowsXP box.

    1. Re:Bootable? by KingMotley · · Score: 2, Informative

      Yes, you can. Just look for an iSCSI PCIe card. It's basically an ethernet card that looks like a standard ethernet card and disk controller (Most are SCSI controllers, although there is no reason they couldn't make it look like an ATA controller, but you'd lose a lot of features).

    2. Re:Bootable? by wild_berry · · Score: 2, Insightful

      My inexpert guess would involve getting a Tyan Thunder/Tiger motherboard with LinuxBIOS and compiling and configuring your own ATAoE support. Windows would need to think it's a local disk; LinuxBIOS could pretend that it was.

  12. Not an iSCSI killer, here are the reasons why not by cblack · · Score: 4, Insightful

    1) Complexity for RAID and volume management is not centralized and is pushed to individual hosts. One of the main benefits of SAN technology is that you can just carve out storage from a single interface and assign it to a server and the server simply sees it as a block device. With AoE each drive is addressed separately by the server, which means it is up to the server (and server admin) to figure out how to handle distributing over multiple drives, handle drive failures, and expanding volumes. This is huge.
    2) It is not a standard and is only really supported by one vendor. This may change in the future but it is significant right now. It is registered with the IEEE but that hardly makes it a peer-reviewed standard with input/improvements from many experts.
    3) No boot from SAN. Until someone makes some sort of mini bootstrap system on a CD or a hardware card implementation of AoE that can be addressed as a block device admins will be unable to host the root filesystem and/or C: drive on an AoE SAN
    4) No multipath (that I can see). Perhaps I misunderstand this, but it seems like there is no way to do multipath IO with this system. That is, all the drives are single-connected to a network. If that network switch goes down, all drives on that network are inaccessible.
    So AoE looks like a neat technology for pushing drives out of the box and potentially sharing them among hosts, but there is no intelligence there. It is just dumb block addressable storage with no added availability or management, and therefore is far from being an iSCSI or FC killer.
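    To give a feel for the work point 1 pushes onto each host, here is a minimal sketch of RAID-5-style XOR parity: computing a parity block across drives and rebuilding a lost one. The block contents and helper name are made up for illustration, and real software RAID also handles striping, rotation of parity and rebuild scheduling.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of the same stripe on three data drives.
data = [b"disk0 data", b"disk1 data", b"disk2 data"]
parity = xor_blocks(data)

# Simulate losing drive 1: XOR of the survivors and parity recovers it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

    With a conventional SAN array this happens inside the storage box; with bare AoE drives, every server (or a dedicated head node) has to do it and handle the failure cases.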

  13. Re:Another "Killer" by slimjim8094 · · Score: 2, Insightful

    So was MP3 (at least implementations) and it was around longer and more widely supported by programs/devices.

    --
    I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
  14. Re:More for business? by jimicus · · Score: 2, Informative

    Maybe cheapie little IDE hard disks are under $1/GB. If you want hot-swap, availability of half-decent RAID cards and disks which actually get to see some testing before they leave the factory, then you'll have to spend quite a bit more.

  15. Not so much cheaper by err666 · · Score: 2, Insightful

    Ok, so the Coraid people are selling their ATA over Ethernet 15 slot version for $3,995.00. That's apparently around EUR 3133. I can get something proven iSCSI based from Promise here in Germany for 4.499,- (a Promise M500i). Ok, that is about 44 percent more expensive, but the iSCSI solution is supposed to work under all operating systems (Linux, *BSD, Windows, etc.) more or less out of the box, while for AoE you will have to buy drivers for Windows, and it generally has worse support for other operating systems.

    Now, suppose you will really use this baby and you want to have *lots* of storage.

    So you buy 15 SATA drives, like say Seagate ST3750640NS for EUR 444 each. Now the difference between AoE and iSCSI becomes less:

    AoE solution: EUR 9793
    iSCSI solution: EUR 11159

    Now the iSCSI solution is only 14 % more expensive.

    Now it would be clear for me to go for the "safe" path of something proven and widely supported like iSCSI instead of AoE. The infrastructure you need will be the same anyway (Gigabit Ethernet, Gigabit ethernet switch).
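    The arithmetic above can be checked with a short worked example. All prices are the ones quoted in this comment; the variable names are just labels for illustration.

```python
DRIVE_EUR = 444           # Seagate ST3750640NS, price quoted above
DRIVES = 15
AOE_CHASSIS_EUR = 3133    # Coraid 15-slot unit, converted from $3,995
ISCSI_CHASSIS_EUR = 4499  # Promise M500i


def total(chassis_eur: int) -> int:
    """Chassis plus a full set of drives."""
    return chassis_eur + DRIVES * DRIVE_EUR


def premium_percent(expensive: float, cheap: float) -> float:
    return (expensive / cheap - 1) * 100


aoe_total = total(AOE_CHASSIS_EUR)
iscsi_total = total(ISCSI_CHASSIS_EUR)

print(f"AoE: EUR {aoe_total}, iSCSI: EUR {iscsi_total}")
print(f"empty chassis premium: {premium_percent(ISCSI_CHASSIS_EUR, AOE_CHASSIS_EUR):.1f} %")
print(f"fully populated premium: {premium_percent(iscsi_total, aoe_total):.1f} %")
```

    The drives dominate the bill, so the more capacity you install, the less the chassis price difference matters.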

    --
    reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))
  16. Re:More for business? by riley · · Score: 2, Interesting

    Storage Area Network solutions are not under $1/GB. Running a network filesystem (NFS, SMB, Coda, etc.) and running a local filesystem over networked storage are two different things, fulfilling two different needs.

    iSCSI and AoE don't necessarily directly benefit the small/home server market, but for the things that SANs are traditionally used for (data replication across geographically separated sites without any changes to the application software) there could end up being a big win in cost.

  17. Re:Not an iSCSI killer, here are the reasons why n by Cyberax · · Score: 3, Interesting

    You can use Ethernet-based multipath IO, a lot of switches can be stacked to provide redundancy (and load-balancing).

    AoE is a COOL thing exactly because it's a 'dumb' technology. You can buy a switch, a bunch of disk drives and AoE adapters, a small Linux PC - and your storage system is ready. There are a lot of existing RAID manipulation and monitoring tools for Linux, so RAID configuration is not a problem.

    You also can boot from SAN, it's not a problem. Just add required modules and configs to initrd and place it on a USB drive.

  18. ATAoE is a crock, it's no better than iSCSI by NekoXP · · Score: 3, Informative

    So. Coraid has not, in a whole year, explained why iSCSI is somehow more expensive (disks + Linux kernel + network.. all the same) than their ATAoE implementation.

    They'll give excuses about the cost of iSCSI hardware offload.. but you don't need that. ATAoE is all software anyway; it's just a protocol over ethernet, rather than layered on top of TCP/IP.

    What is wrong with using TCP/IP - which is already standard and reliable? Nothing. We know TCP/IP provides certain things for us.. resilience (through retransmits), and routing, are a good couple, and what about QoS?

    ATAoE needs everything to be on the same network, close together; they've reimplemented the resilience; and you can't use the common built-in TCP checksum, segmentation and other offloads in major ethernet chipsets because ATAoE sits a layer too low for them.

    No point in it. Just trying to gain a niche. They could have implemented products around iSCSI, gotten the same performance with the same features, for the same price. Bunkum!

  19. Re:Reliability by Ahtha · · Score: 2, Informative

    I agree there are reliability problems with ATA. We expect ATA disk failures within the first year for all of our ATA RAID systems and have yet to be disappointed. ATA drives just don't seem to be able to handle the pounding they get in a RAID configuration. We still use them, however, mirroring the ATA RAID with another server/disk installation as a backup. Of course, that doubles the cost of the ATA solution, but, it's still cheaper than a SCSI solution.

  20. I can't tell if this is clever or stupid... by YesIAmAScript · · Score: 2, Interesting

    A wise man once told me there is a fine line between them.

    ATA is a crappy protocol, even when local. It's only good for squeezing that last $0.03 out of the controller cost. Once you are using ethernet cables ($1) and links and PHYs on each end ($4 each), it makes a lot more sense to put some brains back in. Use SCSI. Heck, even optical drives (like the one in your computer) use ATAPI, which is SCSI commands carried in packetized ATA transfers.

    Also, I'm a bit nervous about the packet CRC validation being done in the ethernet controller/layer itself. The problem is that if an ethernet switch between you and the storage device stores packets and forwards them (as all smart switches do), it may also choose to regenerate the CRC on the way. If it corrupts the packet internally and generates a new, valid CRC for the new, corrupt packet, you have undetected corruption. I'd be a bit nervous about that for my hard drive.
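    The failure mode described here is easy to demonstrate with a simplified model. This sketch uses zlib.crc32 as a stand-in for the Ethernet frame check sequence (the real FCS differs in byte ordering details), and the "faulty switch" is a hypothetical device invented for illustration.

```python
import zlib


def frame_with_crc(payload: bytes) -> bytes:
    """Append a CRC-32 trailer, as a link layer would."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


original = b"sector data: 0123456789"
end_to_end = zlib.crc32(original)  # checksum kept by the sender, outside the frame

# Hypothetical faulty switch: corrupts the payload in its buffer,
# then regenerates a CRC that matches the corrupted bytes.
corrupted = original.replace(b"0", b"X")
forwarded = frame_with_crc(corrupted)

print("link-level CRC passes:", crc_ok(forwarded))
print("end-to-end checksum matches:", zlib.crc32(forwarded[:-4]) == end_to_end)
```

    The per-hop check passes because it only vouches for the last hop. Only a checksum computed at the source and verified at the destination, as TCP or a checksumming filesystem provides, notices the damage.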

    I do think using GigE is a smart way to attach hard drives to servers. I look at the back of an Apple XServe and see two GigE ports and a fibre channel card. Why can't one GigE port be used to attach to the network and one to the XServe RAID? Why do I need to get a multi hundred dollar card to attach to the XServe RAID when that GigE port is fast enough? It'd sure save a lot of cost, and hopefully reduce the price to the end user.

    Anyway. I'm pro GigE attachment, not sure I'm for this AoE.

    --
    http://lkml.org/lkml/2005/8/20/95
  21. Re:More for business? by rf0 · · Score: 2, Interesting

    iSCSI is slightly different as rather than presenting a file system, it presents a hardware device. So you show it a 1TB device over the network (e.g. /dev/sdb) then the client machine can partition that disk up as if it was local. That's the advantage over just a shared network filesystem.

  22. I just deployed an AoE SAN by Tracy+Reed · · Score: 4, Informative

    AoE rocks. It is very easy to set up, way simpler than iSCSI or fibrechannel or any other SAN technology I have used. And it enabled us to have many more options for high availability or clustered filesystems (which we are not yet using but I have been following the progress of GFS and Lustre, leaning towards Lustre). We did not buy the Coraid stuff but instead used vblade on our own disk machines. A disk node in our cluster has 4 300G SATA disks which we RAID 5, 512M RAM, and the cheapest CPU Intel currently makes. We have dual core Opterons with 4G of RAM each with no internal disk. They PXE boot and then mount root straight off the AoE. Then we run Xen on the Opteron boxes. This is the killer setup. We can migrate xen domains avoiding downtime for hardware maintenance and if a machine dies we can instantly restart it on another machine because it all runs off the AoE SAN.

    So far I am very pleased. Just make sure you get hardware that can do jumbo frames as this will increase your performance by 50%.
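    A back-of-the-envelope sketch of why jumbo frames matter: how many 512-byte sectors fit in one frame, and how many frames a 1 MiB transfer takes. The 10-byte and 12-byte header sizes are assumptions taken from a reading of the AoE specification, and frame count is not the same thing as throughput, so this illustrates the mechanism rather than predicting any particular speedup.

```python
SECTOR = 512
AOE_HEADER = 10  # assumed size of the AoE common header
ATA_HEADER = 12  # assumed size of the AoE ATA command header


def sectors_per_frame(mtu: int) -> int:
    """Whole sectors that fit in one frame after the assumed headers."""
    return (mtu - AOE_HEADER - ATA_HEADER) // SECTOR


def frames_needed(total_bytes: int, mtu: int) -> int:
    sectors = -(-total_bytes // SECTOR)            # ceiling division
    return -(-sectors // sectors_per_frame(mtu))   # ceiling division


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {sectors_per_frame(mtu)} sectors/frame, "
          f"{frames_needed(1 << 20, mtu)} frames per MiB")
```

    Fewer frames means fewer interrupts and fewer request round trips for the same data, which is where the gain comes from.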

  23. put it back in the oven by jhackworth · · Score: 2, Insightful

    perhaps an interesting idea, but just because I can build a computer out of old, recycled clock parts doesn't mean it is going to become my server. Also, iSCSI adoption has increased something like 40% this year. Windows support for iSCSI will improve dramatically with the next revision, and iSCSI costs are only going to decrease.

    Also, consider management of one of these AoE boxes. What sort of tools are out there to simplify provisioning, deployment, snapshots and backup, etc.? In order for this to go anyplace but the basement of 'the IT guy at work' a whole lot more stuff will be required. Oh yeah, and that probably isn't going to happen with 1 vendor controlling the market.

    AoE is not fully baked yet. Put it back in the oven and let me know when the timer goes off.

  24. AoE works, and it is cheaper by MagicMerlin · · Score: 2, Informative

    we bought coraid devices, and AoE is much simpler (read: cheaper) than iSCSI. when using jumbo frame switches/cards, we were able to get transfer rates very near theoretical limits on gigabit links, something I have never seen on iSCSI or fc for that matter.

    the only thing that bothers me about AoE is there is only a single vendor supporting it at the moment. other than that, it is great stuff. while it is not routable in the sense ip is routable, you can do creative things with ethernet switches and vlan basically giving san like functionality at a fraction of the cost. no longer do you have to keep dual fc/cat6 infrastructure in your server farm.

    it's cheap, and if/when it supports bonding lines, it will beat fc in performance (comparing two gigabit fc vs. bonded gigabit ethernet).

    merlin

  25. Re:How does it lower costs? by Tracy+Reed · · Score: 3, Informative

    I think you are probably looking at the cost to buy Coraid's gear. You do not have to buy their stuff, although I am sure that they prefer that you do. I built my own AoE SAN using regular PC's. Way cheaper. I take the google approach: Use a larger amount of commodity hardware and design the system in an intelligent way to achieve the same performance and reliability at a better price/performance. Coraid hardware is basically just a Linux box with disks exporting AoE volumes. The nice thing about it is that you get their support. But AoE is so simple that you generally don't need support beyond perhaps the mailing list.

  26. Re:Reliability by afidel · · Score: 2, Informative

    the odds of 3 drives failing at once are astronomical.

    No, they aren't. Just have an array running for a year or two and bring it down for maintenance, your chances of multiple drive failures are VERY good. Of course that happens even with SCSI drives, but it even more underscores the need for a premium part. Btw I just lived through a scare this weekend. We lost one drive after powering up one of our main DB servers, then lost a second about 10 minutes later, luckily the 16 drive array was set up as RAID6 instead of RAID5, the first good decision we have found from the previous staff =)
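    A minimal sketch of the odds, using a plain binomial model. The 5% per-drive failure chance at power-up is an invented number for illustration, and the model assumes drives fail independently, which is exactly what tends not to hold for same-age drives restarted together, so real-world odds are likely worse than this.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n drives fail, each independently with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_DRIVES = 16
P_FAIL = 0.05  # assumed per-drive failure chance during one power cycle

# RAID5 survives one failure, RAID6 survives two.
print(f"RAID5 data loss (>= 2 failures): {prob_at_least(2, N_DRIVES, P_FAIL):.4f}")
print(f"RAID6 data loss (>= 3 failures): {prob_at_least(3, N_DRIVES, P_FAIL):.4f}")
```

    Even under the optimistic independence assumption, the second parity drive buys a large reduction in the chance of losing the array.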

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  27. Re:Another "Killer" by die444die · · Score: 2, Informative

    My point was that something being opensource does not really help it in the end. In fact, this seems to rarely boost public appreciation of any product.

    --
    die444die