Storage Area Networks for Linux?
angelo asks:
"Many of us have heard the buzz (on the radio of all places) about Storage Area Networks, or SANs. They are a method of accessing a common drive system, backing up information over a second network, and collaborating a server farm. My question is this: Can you connect a Linux box to a SAN network? If so, which SAN products support Linux connection and administration? Do all channel cards have drivers, some of them, or is it "in the works?"
Almost all SANs will do NFS, most likely FTP, and some SMB... Some will also have Fibre Channel...
This is a slightly clueless post... Off the top of my head: Nexsan (www.nexsan.com, I think), Western Scientific (wsm.com), HP (may have? SureStore?), Artecon (LynxArray, LynxNSS). Those were some I looked at for our backup system here, anyway. I've got a complete list of Linux-compatible arrays with drivers somewhere; it's not that hard to find, and I searched around three months back. Here are also a few links I have sitting around on my drive currently (from Sun(?), I think):
----
Veritas Software Storage Foundation
http://www.veritas.com/product-info/foundation.ht
Legato Systems Backup Software Products
http://www.legato.com/Products/index.html
IBM Corporation/ADSM
http://www.storage.ibm.com/software/adsm/index.ht
Datalink Corporation specializes in the integration of information storage, high-availability, and
disaster-recovery solutions
http://www.datalink.com/frames/fr_prdcts.html
MTI Technology provides high-performance, cross-platform data storage management solutions
http://www.mti.com/products/index.htm
"Fibre Channel vs. SCSI: Which is more advantageous for your storage area network?" by Ron Levine
(SunWorld, March 1999)
http://www.sunworld.com/swol-03-1999/swol-03-fibr
For more storage-related stories, see SunWorld's Site Index
http://www.sunworld.com/common/swol-siteindex.htm
There are differences between what a storage area network (SAN) does and what a network-attached storage (NAS) device does *today*.
A NAS box is essentially a ready-to-go enclosure that you can load up with storage, and runs an embedded OS designed to optimize the path between network services (NFS, SMB, FTP, HTTP, as you mentioned), and the disk drives. Typically they implement very nifty disk-level filesystems that support journalling for fast reboot, and snapshot capability for backup and point-in-time applications. Getting a NAS running is an issue of configuring and attaching to the network, ideally.
SAN is different. The idea is to network the storage devices sitting behind a group of servers just as the servers are networked via the LAN to client machines. The promise of the SAN is the ability to deploy enterprise-class storage solutions using readily available storage devices, servers, adapters, and infrastructure h/w. Since storage is networked behind the servers thru a hub or switch, a failed server doesn't mean that the storage becomes unavailable. Instead, another server can work as a failover server until the failed box is brought back up.
You can achieve this with multiport parallel SCSI RAIDs, but your scalability is limited and you're locked into a proprietary RAID architecture instead of buying off-the-shelf disks.
This is not to say that NAS boxes can't use Fibre Channel inside, as you mentioned, and Fibre Channel is the network infrastructure du jour for SANs. And in the future, the file-level services of NAS and the I/O-channel, block-level services of SAN might run together a bit.
Hope this helps!
Linux supports Fibre Channel today.
SAN support is another issue.
Fibre Channel is the physical, link, and protocol layer used to implement most SANs today (there is also SSA). SCSI runs over FC just as it runs over honking big parallel cables that are getting to be the size of firehoses.
Linux includes support for one FC HBA, the QLogic QLA2100. The driver works pretty well, actually, should you pop it into a PCI slot, buy a Cheetah 9LP, and cable them together point-to-point, or thru a hub or switch.
The first problem with Linux, it seems to me, is that the SCSI subsystem isn't really designed to have devices popping in and out at arbitrary times. One of the advantages of FC is that the size of the device space (126 for arbitrated loop topology, approx 2^24 for switched fabric) is such that you can readily envision devices coming and going quite a bit. Correct me if I'm wrong (and I'd love to be wrong here), but the current architecture doesn't readily allow for very many disks popping in and out. Nor is there a consistent scheme for disks that do pop in and out to show up with consistent device names.
The latter problem could be eliminated by running a logical volume manager over the physical devices and letting it sort out who's who by reading identifiers off of a private area on the drive or partition. Another way would be for the driver to map disks to consistent major numbers using the FC world-wide name on the disk device.
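To make the world-wide-name idea concrete, here is a minimal sketch in Python. It is not how any Linux driver does it; the function name, the `fcdisk` naming scheme, and the WWN values are all made up for illustration. The point is only that a table keyed by WWN hands back the same name for a disk no matter what order the disks are discovered in.

```python
def assign_stable_names(discovered_wwns, table):
    """Return {wwn: name} for the disks seen in this scan.

    `table` is a persistent WWN -> name map (hypothetical; a real system
    would have to store it somewhere that survives reboots). Known WWNs
    keep their old name; unseen WWNs get the next unused name.
    """
    for wwn in discovered_wwns:
        if wwn not in table:
            # The table only ever grows, so a name is never handed out twice.
            table[wwn] = "fcdisk%d" % len(table)
    return {wwn: table[wwn] for wwn in discovered_wwns}


# Made-up WWNs, purely for illustration.
DISK_A = "20:00:00:20:37:aa:aa:aa"
DISK_B = "20:00:00:20:37:bb:bb:bb"
DISK_C = "20:00:00:20:37:cc:cc:cc"

table = {}
first_scan = assign_stable_names([DISK_A, DISK_B], table)
# Later scan: disk A is gone, and a new disk C shows up ahead of disk B.
second_scan = assign_stable_names([DISK_C, DISK_B], table)
```

Disk B keeps its name across both scans even though its position in the discovery order changed, which is exactly what sequential naming can't promise.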
The former problem probably requires some patches to or an overhaul of the SCSI subsystem in order to support dynamic reconfigs a bit better.
Anyhoo, that's what you'd probably want to do in the OS in order to get Linux a bit SAN-friendly.
You can already hook a Linux box up and access storage, and an FC switch is capable of doing what's called zoning so that you won't have to see and possibly stomp disks that are being used by other nodes on the SAN. That leaves you with the application layer - what do you want to do with the SAN, anyway? If you're only using it to get volumes, then you're probably good to go at this point. If you're using it to implement LAN-free backup of your storage, you'll need a SAN-aware backup client that talks to your server and runs on Linux. If you want server failover for your volumes, you're either rolling your own or waiting a while. Likewise if you want to do any fancy clustering (via IP or VI) over the FC network.
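For anyone who hasn't run into zoning, it boils down to set membership: two ports can talk only if they share at least one zone. Here's a rough Python model of that rule. The zone and port names are invented, and a real switch enforces this in its own firmware and configuration, not in anything like this code.

```python
# Hypothetical zone layout: each zone is a set of port names.
zones = {
    "linux_zone": {"linux_hba", "disk_a", "disk_b"},
    "backup_zone": {"solaris_hba", "disk_c", "tape_1"},
}


def can_see(port_a, port_b, zones):
    """True if the two ports are members of at least one common zone."""
    return any(port_a in members and port_b in members
               for members in zones.values())


def visible_from(port, zones):
    """Every other port that `port` shares a zone with."""
    visible = set()
    for members in zones.values():
        if port in members:
            visible |= members
    visible.discard(port)
    return visible


linux_view = visible_from("linux_hba", zones)
```

With this layout the Linux HBA sees only `disk_a` and `disk_b`, so it can't stomp on `disk_c` or the tape drive that belong to the Solaris box.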
These problems are not just with Linux. Many of these problems are only partially solved no matter which OS you decide to hook up.
There are two problems that I see:
First, as the earlier poster mentioned, last time I checked Linux assigns /dev/sda to the first SCSI device it finds, /dev/sdb to the second, etc. I've never seen what Linux does when, for instance, SCSI ID 5 = /dev/sda and you add a device with ID 3 and rescan the bus. Personally, I'd love to know what happens. It would be nice if there could be a switch from the way things are to a system more like the one Solaris uses (as I recall), which is not sequential but definitive, i.e. incorporating into the device name the system bus/adapter/adapter bus/SCSI ID/LUN. Of course that would be a pain for all of us that are used to the old way.
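Here's a small Python sketch of the difference between the two naming schemes. Big assumption, labeled loudly: it models a fresh scan from scratch (say, at boot) that walks the bus in ascending SCSI ID order. It says nothing about what a live rescan actually does, which stays an open question. The function names are made up, and the `cXtYdZ` form is a simplified take on Solaris-style names (controller, target, LUN).

```python
import string


def sequential_names(scsi_ids):
    """Name disks sda, sdb, ... in scan order.

    ASSUMPTION: scan order is ascending SCSI ID on a fresh scan.
    """
    ordered = sorted(scsi_ids)
    return {sid: "sd" + string.ascii_lowercase[i]
            for i, sid in enumerate(ordered)}


def path_names(scsi_ids, controller=0, lun=0):
    """Solaris-style cXtYdZ names built from controller, target ID and LUN."""
    return {sid: "c%dt%dd%d" % (controller, sid, lun) for sid in scsi_ids}


seq_before = sequential_names([5])      # only ID 5 on the bus
seq_after = sequential_names([5, 3])    # ID 3 added, then a fresh scan
path_before = path_names([5])
path_after = path_names([5, 3])
```

Under that assumption the disk at ID 5 goes from `sda` to `sdb` once ID 3 is added, while its path-based name stays `c0t5d0` both times.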
The second problem with the Vicoms is that you can only configure them from an NT box. In general this isn't too much of a problem, since you only set it up once, but if you really hate NT, it may be a problem.
They've got some interesting stuff. They are here.
Pound! Bang! Bin! Bash! is this a shell script or a Batman comic?
We are currently running several Solaris systems with a 3rd-party FC-AL SAN-in-a-box from XIOtech. We have a second one serving our NetWare servers. If you're serious about getting the benefits promised by SANs today, I'd recommend checking them out. They allow you to create virtual disks in various RAID configurations, move the virtual drives between different virtual clusters, and copy and swap virtual drives on the fly.
Lans Carstensen - lans@carstensen.cx
Links
XIOTech press release on LinuxToday
The Global Filesystem Group
The High Availability Linux Project
RSi's RSF-1 high availability software, available for trial download