HyperSCSI Examined
An anonymous reader writes "Eugenie Larson of byteandswitch.com has published a brief article that reviews the HyperSCSI protocol, which like iSCSI allows for an IP based SAN. The twist of HyperSCSI is that it's open source, and runs over raw ethernet, avoiding the overhead of TCP/IP. The article has some comments from early adopters of HyperSCSI, as well as some comments from top vendors in the iSCSI industry."
I read somewhere that it's like 5 times faster than SCSI over TCP/IP. Is it true? And how great is the sacrifice of not using TCP/IP? I mean, what doesn't support Ethernet these days?
The summary says "IP based" then "without the overhead of TCP/IP" and then "raw ethernet".
Which one?
Can't be both IP based and raw ethernet at the same time.
You don't expect us to RTFA do you?
I like the idea. Ethernet hardware is dirt cheap and fast. What it needs is a cheap IDE bridge board. That would let you put some IDE drives in an external enclosure and plug them into the local LAN.
My hovercraft is full of eels
Why is there an article on this? I mean, Linux won't support it for another 2 years lol
obviously, it's so when linux does support it, legions of slashbots can complain about the duplicate stories!
Do you even lift?
These aren't the 'roids you're looking for.
If you go to the MCSA site and look at the HyperSCSI FAQ, it does implement reliability and flow control, just not in the same manner as TCP.
The only technical downside I can see (at this time) is that because the implementation isn't over IP, you can't traverse a router. This usually isn't a problem but could cause some inflexibility in larger deployments.
And this is a technology breakthrough? I wouldn't want my data travelling down a wire with no error recovery no matter how small the error rate.
From excellent karma to terrible karma with a single +5 funny post...
and
obviously, it's so when linux does support it, legions of slashbots can complain about the duplicate stories!
From the article:
Read, L
Two big disadvantages:
First, Ethernet can't be routed, so hyperSCSI isn't going to be nearly as flexible as iSCSI. This is the reason that just about everything that might want to be routed is usually carried over IP (and TCP and UDP and other stuff on top of IP). Straight ethernet is for stuff like ARP that really doesn't want to leave a network segment.
Of course, one could reasonably do something hyperSCSI-like across IP, and still save the TCP overhead. (Consider that in a low-loss short-hop environment, NFS over UDP generally outperforms NFS over TCP). The problem here is that SCSI was never ever intended to run well over a lossy transport, and it doesn't. That seems a serious objection to running SCSI over both non-routable and non-reliable ethernet and routable but still non-reliable IP.
C'mon, there's a reason why people use TCP....
And why iSCSI chose TCP as the transport....
Use loose and fast HyperSCSI on your local segment where it's possible, and use a concentrator that translates into iSCSI over IPSEC for secure WAN connectivity.
That way you only need to buy one TOE card per WAN edge. Those can get expensive!
Fuck Beta. Fuck Dice
SCSI is dead
Many people here disagree with you. I wish I had SCSI hardware... you troll... now you've hurt my feelings...
tuned for SCSI commands and data transfers. This is the particularly interesting part of the protocol. It assumes you're going to be doing bulk transfers, and lets both ends negotiate windows for performance (as opposed to using a sliding scheme).
As I see it, the real problems:
- SMP "experimentally" supported
- client and server can't coexist on same box
- client model is not decoupled enough from the server (a server going down can mean the client could crash)
It appears the driver software needs some work properly implementing what seems to be a nifty protocol. And they want to port it to Solaris. I think they should get the locking and stability down first.
BTW, Andre Hedrick is one of the main IDE developers for Linux.
I certainly appreciate his IDE efforts, but of course he is going to criticise the technology - his company is an iSCSI company!
What, do they think he is going to say, "Gee, and all this time, I've been thinking that iSCSI is the right thing to work on. I'm going to abandon iSCSI right now, and start playing with this HyperSCSI thing."?
The Internet's nature is peer to peer - 20050301_cs_profs.pdf
If it is truly raw ethernet, this means that you cannot route it. On most current networks, you probably wouldn't want to because of latency issues, but, on networks with 1 gig or 10 gig links, latency should not be a problem. So, if it does use raw ethernet, this is actually a limitation of it, not a benefit.
Need Free Juniper/NetScreen Support? JuniperForum
More to the point, we have USB 2, serial ATA, and firewire. Really all you should need is USB 1.1 and firewire in ever-increasing speeds, besides your display output. There's literally no need for anything else. Firewire can be operated in a synchronous fashion, after all. It supports lots of devices per adapter (63 or 127 depending on implementation) and comes in powered and unpowered as well as peer to peer or host-based forms, but nearly all devices will fall back to the lowest, slowest mode, which is capable of 400Mbps, or 50 megabytes per second, clearly enough for all but the fastest hard drives, and nearly any other use to which you might put it. 800Mbps firewire is available today, and 1.6Gbps is uh, just over the horizon or something. They claim they'll do 3.2Gbps over fiber in the near future as well, but I'll just stick to developments just over the horizon, not way over it, here.
Frankly what I want to see more than anything is native 1394 storage devices. 1394 is somewhat similar to SCSI (it would kind of have to be, wouldn't it?) and is at least well documented. I want hard drives that have only power and 6 pin 1394 connections on them, pass through style. You could treat them basically like an external SCSI chain not requiring termination, but inside your PC. Firewire to IDE bridges are not the answer; they just make things more complicated, and firewire is supposed to make things less complicated, though I suppose they are a sort of interim solution.
For the average user, if you could get native (and thus inexpensive) firewire solutions for everything, it would fulfill your needs much better and would make computer use much simpler. The only thing you need besides firewire and USB for every common peripheral is video output and possibly high speed networking, perhaps provided from a MII or similar connection, just as AUI was the standard at one time. Just, without the annoying box and cables. And let us not forget, a memory slot. I assume most people would also like to have wireless ethernet with an external antenna jack. For the rest of us, we would continue to run our however-shaped boxes with internal expansion, PCI-Express or what have you, and our cabling would be much simplified. Once again, 800Mbps is 100MBps, yes? Assuming you can only really sustain 50 or 60% of that with multiple devices competing for the bus, that's still fast enough for basically any two hard drives anyone is likely to have at home. So 800Mbps firewire is perfectly adequate to the task of connecting hard drives today. It's not that SATA isn't, it's that having fewer types of interface simplifies a computer. If you're only going to have two interfaces, you will get more mileage out of USB2 and IEEE1394 than you will out of USB2 and SATA, even one of the bastardized forms of it like External SATA, if people will just make IEEE1394 devices.
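For anyone who wants to check the back-of-the-envelope numbers above, here is a minimal Python sketch. The function names are made up for illustration, and the 50-60% efficiency range is just the poster's guess about bus contention, not a measurement.

```python
def mbps_to_mbytes_per_s(mbps: float) -> float:
    """Convert a raw line rate in megabits/s to megabytes/s (8 bits per byte)."""
    return mbps / 8


def sustained_mbytes_per_s(mbps: float, efficiency: float) -> float:
    """Estimate usable throughput given an assumed bus efficiency (0.0 to 1.0)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return mbps_to_mbytes_per_s(mbps) * efficiency


if __name__ == "__main__":
    # Firewire rates mentioned in the thread: 400, 800, 1600 and 3200 Mbps.
    for rate in (400, 800, 1600, 3200):
        raw = mbps_to_mbytes_per_s(rate)
        low = sustained_mbytes_per_s(rate, 0.5)
        print(f"{rate} Mbps: {raw:.0f} MB/s raw, about {low:.0f} MB/s at 50% efficiency")
```

These are raw signalling rates only; protocol overhead on a real bus would push the usable figure lower still.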
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Fiber Channel SANs aren't based on IP either, yet people manage to do off site replication with them.
I don't know how far away you want to put your off-site backup, but Cisco have been selling a GBIC (Gigabit Interface Converter? Too many FLAs for my head these days), which they've been calling 1000BaseZX, which will send a GigE signal around 90 kilometers over single mode fibre.
Even Full Duplex Fast Ethernet over multi-mode fibre will go 2 Kilometers.
You can build some really big ethernet networks these days. I don't think the non-IP thing is all that much of an issue.
Although the idea of using cheap commodity equipment like ethernet is to rationalise multiple networks down to a single IP network, there are also good reasons to use commodity ethernet to build a separate network for your storage, security being the main one. It probably wouldn't be too good to have CodeRed or other worms of its ilk infecting your storage network.
The article does an abysmal job of covering this, but the homepage for HyperSCSI has a nice PDF presentation that covers just this topic. In short, it goes something like this:
- The SCSI protocols already provide error checking
- The HyperSCSI layer adds flow control and retransmits
- Ethernet provides certain other checks
So, in total, you have the same reliability as iSCSI and FibreChannel with less overhead (i.e. without the significant overlap between protocol layers in terms of error detection/correction).
Oh, was that my outside voice?
USB maximum cable length: 5m
Serial ATA maximum cable length: 1m
Fiber Channel maximum cable length: 10,000 m
The lessons of NFS are being ignored, and I'd expect HyperSCSI to die when it hits the same limitations.
NFS started out UDP-based, and moved toward TCP with NFSv3. Why? Because having all that error correction done at the network layer made for a better product; TCP does all the work to ensure packets aren't lost or out-of-order. UDP doesn't, and the NFS application layer had to handle it, making it slower, more painful, and a duplication of effort better spent elsewhere.
The industry guys are almost right on this one. It isn't a beer can with a motor; it's a beer can with an M-80. Fun to watch when it works right, damn painful if you screw it up.
SATA is killing SCSI. Yeah. That's right, I just replaced my 30 drive dual channel ultra320 array with... what, 3 12 channel SATA-150 controllers?
Come back and troll when SATA has doubled in speed, and when I can plug at least 15 drives into a card.
Or hell, just stay out of the game, since Fibre Channel has 2Gbit/s (250MB/s, still faster than SATA, slower than ultra320) and 255 devices, with multiple host access over a SAN, which can be set up redundantly. And that ignores the point of this article, SCSI over normal networking equipment.
So, to reiterate:
SATA - Lower speed. Lower capacity (# drives). Single host access. (Lower drive warranties too.) Cheap.
SCSI - Higher speed. Higher capacity (# drives). Multiple computer access via FC,iSCSI,HyperSCSI. (Longer drive warranties.) Expensive.
So, for the home user, cheap is good.
For the average financial institution which I'll estimate has roughly 1TB of information that needs to be available to everyone all of the time, well, they'll get what they pay for.
Fiber Channel maximum cable length: 10,000 m
Add the appropriate routers and switches and you can easily go 90 km on dark fiber. Add some appropriate routers onto a fast network (T3, ATM, what have you) and you can go 500 km. With fast FC connected storage at each end. Of course, this sort of solution is used by data centers, not home users. But Fiber is the obvious solution to data storage problems. And there is enough mass in the server storage market now that prices are starting to come down. Of course, if you need fast, redundant, capable storage you won't blink at the cost.
In my universe I'm perfectly normal, it's not my fault you don't live in my universe.
Never underestimate the bandwidth of a pair of sneakers carrying a hotswappable hard drive jogging down the hallway.
KFG
iSCSI has drivers for every OS you can imagine, written by Cisco, IBM, Microsoft, and released under the GPL. This is from the iSCSI SourceForge page.
This implies for instance that one could boot one's diskless workstation from a colocated NetApp on another continent, protected by an IPsec tunnel. While I could do something similar with the ethernet layer tunneled over IP, it leads to many complications and difficult debugging. I have personal experience with this, as this is how our company runs its ethernet layer phone system.
Because the IP checksum and TCP checksum occasionally disagree about the packets' validity in real-world routers and operating systems - they are both needed to provide redundancy and robustness. Stevens' TCP/IP Illustrated cites [Mogul 1992] providing counts of checksum errors on a busy NFS server:
Layer     Total packets   Checksum errors
Ethernet  170,000,000     446
IP        170,000,000     14
UDP       140,000,000     5
TCP       30,000,000      350
Basically, when absolute accuracy is required the more error checking the better.
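To make the "more layers of checking" point concrete, here is a rough Python sketch of the 16-bit ones' complement checksum used by IP, UDP and TCP (as described in RFC 1071), next to a CRC-32 from `zlib` standing in for the Ethernet frame check. The function name and the sample bytes are illustrative; the real Ethernet FCS is computed by the NIC over the whole frame, so treat the CRC call only as an approximation of that idea.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


if __name__ == "__main__":
    original = b"\x00\x01\xf2\x03"
    swapped = b"\xf2\x03\x00\x01"  # same 16-bit words, reordered in transit

    # The sum is order-independent, so the reordering goes unnoticed here...
    print(internet_checksum(original) == internet_checksum(swapped))
    # ...while a CRC over the same bytes does change.
    print(zlib.crc32(original) == zlib.crc32(swapped))
```

The weaker checksum is cheap but blind to some corruption patterns, which is one reason stacking it with a link-layer CRC catches errors that either check alone would miss.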
We recently built a 1.6 TB SATA file server for our (ahem) institution. Used a 3ware 8500-12 controller (which looks to the O/S like a single scsi device), 12 disks (10 active, 1 parity, 1 hot spare, 160 GB each). Redundant everything. The speed limiters turn out to be the filesystem (ext3, probably not the best choice for small files; writing directly to the device is about 60 MB/s, to the filesystem typically 10-20 MB/s) and the network connections. Our users haven't noticed a speed difference between it and the NetApp it replaced. But for about 1/10 the price, they sure noticed the 15x extra capacity!
I'm rather partial to the "HighHeelAndHighHemlineNet" network myself, but the protocol is harder to configure and maintain.
Not to mention the potential cost overruns.
KFG
The GPL is not specific to Linux. You can have an application built upon Win32 and released under the GPL, and it will not run on Linux. The point of the GPL is not to help any particular operating system, but to give software a conditional freedom and authority backed by copyright law, in recognition that if it were truly public domain then it could be "hi-jacked" into other software.
The GPL establishes penalties for people or artificial entities that don't provide the source code of the software, yet the GPL states it is voluntary to accept the GPL, and if it is not accepted then copyright law is the premise for not accepting the GPL's distribution rules. And in copyright law, whoever uses the copyrighted software needs the permission of the copyright owner for use of the alleged "software." The GPL is a harmless fuzzball; fear the copyright owner.
If you want true freedom, then software would be released anonymously in the Public Domain and the risk is someone could steal the software and claim they own it. An example of a Public Domain work is the Authorized Version Bible aka King James Version Bible of 1611 A.D. All the recent alleged "Bibles" such as "New International Version", "American Standard", "New American Standard", "New King James Version" (false King James AV), and "Century New King James" (false King James AV), and many more are all actually copyrighted! They are known as false Bibles because in their preface or inserts there is a text from a corporation that establishes conditions of their usage, unlike Public Domain bibles such as the King James Authorized Version Bible and also Gutenberg Bible. Further example, the NIV aka "New International Version" manifests conditions upon the reader: "The NIV text may be quoted and/or reprinted up to and inclusive of one thousand (1,000) verses without express written permission of the publisher, providing the verses quoted do not amount to more than 50% of a complete book of the Bible nor do the verses quoted comprise more than 50% of the total work in which they are quoted." In short, the NIV changes the many books of the Bible in such little ways to change the outlook and image of God's message and issues a copyrighted patent proclaiming they own and say how much you can quote without express permission! How long before an alleged "Bible" is released that says you can only read it on Sunday and only upon the permission and interpretation of an alleged "father of the Catholic Church" and you must accept the Catholic Church without condition as the divine authority of all scripture:
"For the Roman pontiff (pope), by reason of his office as VICAR OF CHRIST, and as pastor of the entire Church has full, supreme, and universal POWER over the whole Church, a power which he can always exercise UNHINDERED."
--CATECHISM OF THE CATHOLIC CHURCH, 1994, P. 254 #882
"[W]e hold upon this earth the place of God Almighty."
--POPE LEO XIII
"...We declare, state and define that it is absolutely necessary for the salvation of all human beings that they submit to the Roman Pontiff [pope]."
--POPE BONIFACE VIII, BULL UNAM SANCTAM, 1302
"No person shall preach without the permission of his Superior. All preachers shall explain the Gospel according to the Fathers. They shall not explain futurity or the times of Antichrist!"
--Pope Leo X, 1516
Although I may seem offtopic, my point I try to emphasize is copyrights are evil to the extent they can regulate and infringe upon others by use of truth and law that is vested in the mere essence of man if not written on paper. The GPL uses copyrights for the good of mankind, as if to turn copyrights into a double-edged sword, yet even the GPL can be used for evi
Secured Party, Without Prejudice, UCC 1-207: Creditor
Will you be able to boot from these devices? I don't see any mention of it in the article; I'd imagine you'd need support both in the BIOS and possibly the network card..?
In short, it doesn't. That has to be done by the filesystem layer or application layer itself. A Fibre Channel-based SAN setup such as is common in enterprise deployments doesn't normally provide for this either (Fibre Channel zoning is an exception, but that's a feature not normally required by most setups). If you really want to do concurrent filesystem access by multiple machines, you need a distributed lock manager of some kind, similar to what you have in Oracle RAC's cluster filesystem, Sistina's Global Filesystem, or OpenGFS.
Give me six lines written by the hand of the most honest man, and I will find in them something to hang him.
I think if you read again, and this time don't assume every poster above you doesn't already know this, you'll see the point they were trying to make.
If this uses its own Layer 3 protocol (presumably, and for the sake of argument, called HyperSCSI), then it's NOT IP BASED... and the article summary indicated it was IP based, then contradicted itself.
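For what it's worth, the "IP based or raw ethernet" question comes down to the EtherType field in the frame header. Below is a minimal Python sketch of that check. The helper names are invented for illustration, VLAN tags are ignored, and the 0x889A value for HyperSCSI is an assumption on my part that should be checked against the IEEE EtherType registry before anyone relies on it.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
ETHERTYPE_IPV6 = 0x86DD
ETHERTYPE_HYPERSCSI = 0x889A  # assumed value, verify against the IEEE registry


def build_frame(dst: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
    """Build an untagged Ethernet II frame (no FCS): dst MAC, src MAC, EtherType, payload."""
    return struct.pack("!6s6sH", dst, src, ethertype) + payload


def carries_ip(frame: bytes) -> bool:
    """Return True if the frame's payload is an IP packet, i.e. something a router can forward."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ethertype in (ETHERTYPE_IPV4, ETHERTYPE_IPV6)


if __name__ == "__main__":
    dst = bytes.fromhex("ffffffffffff")
    src = bytes.fromhex("020000000001")
    print(carries_ip(build_frame(dst, src, ETHERTYPE_IPV4, b"ip packet here")))
    print(carries_ip(build_frame(dst, src, ETHERTYPE_HYPERSCSI, b"scsi command here")))
```

A frame with its own EtherType never enters the IP stack at all, so an IP router has nothing to forward; a switch or bridge on the same segment will still pass it along.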