Best Server Storage Setup?
new-black-hand asks: "We are in the process of setting up a very large storage array and we are working toward having the most cost-effective setup. Until now, we have tried a variety of different architectures, using 1U servers or 6U servers packed with drives. Our main aims are to get the best price per GB of storage that we can, while having a reliable and scalable setup at the same time. The storage array will eventually become very large (in the PB range) so saving just a few dollars on each server means a lot. What do people out there find is the most effective hardware setup? Which drives and of what size? Which motherboards, etc? I am familiar with the Petabox solution which is what the Internet Archive uses — they have made good use of Open Source software. So what are some of the architectures out there that, together with Open Source, can give us a storage array that is much better than the $3 per GB plus that the commercial vendors ask for?"
If $/GB is a dominant factor, I would suggest Coraid's products. They have a pretty nifty technology which is dead simple and extensively leverages OSS. From my personal experience as a customer, I think they are a bunch of good folks as well. They also seem to constantly be wringing more and more performance out of their systems. Anyway, something to explore if I were you.
So having thought about this a lot, here's what I would do:
1) Run a FC SAN as the backend. This allows you to connect anything you want without wondering what future technology will allow for - ATAoE, iSCSI, ???
2) Love thy Apple. XServe RAIDs are 3U, 7TB (raw) and $13,000 - get a bunch. Each controller sees 7 disks, so set each half up as a RAID 0 and uplink the thing to a FC switch.
3) Use DNFStorage.com's SANGear 4002 / 6002 devices to RAID 5 across the XServe RAID 0 LUNs. Your data security then can tolerate half of an XServe RAID going offline. RAID 6 allows for an entire unit (both halves) to become DOA. Make sure to have an online spare or two.
4) Repeat - but remember, just because you can create it, doesn't mean you can reasonably back it up.
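To put rough numbers on the layout above, here is a minimal Python sketch of the arithmetic. It assumes each XServe RAID exposes two RAID 0 LUNs of 3.5TB each (7TB raw split across its two controllers) and it ignores the cost of the FC switch and the SANGear device, so treat the result as a lower bound. The function names and defaults are my own, purely for illustration.

```python
def nested_raid_usable_tb(units, tb_per_unit=7.0, luns_per_unit=2,
                          parity_luns=1, spare_luns=0):
    """Usable TB when RAID 5/6 is striped across RAID 0 LUNs.

    parity_luns=1 models RAID 5 (survives losing one LUN, i.e. half a unit);
    parity_luns=2 models RAID 6 (survives losing two LUNs, i.e. a whole unit).
    """
    total_luns = units * luns_per_unit
    data_luns = total_luns - parity_luns - spare_luns
    if data_luns < 1:
        raise ValueError("not enough LUNs for this parity/spare layout")
    lun_tb = tb_per_unit / luns_per_unit
    return data_luns * lun_tb


def cost_per_usable_gb(units, unit_price=13000.0, **layout):
    """Dollars per usable GB, counting only the XServe RAID units themselves."""
    usable_tb = nested_raid_usable_tb(units, **layout)
    return (units * unit_price) / (usable_tb * 1000.0)


if __name__ == "__main__":
    for parity in (1, 2):
        tb = nested_raid_usable_tb(4, parity_luns=parity)
        usd = cost_per_usable_gb(4, parity_luns=parity)
        print(f"4 units, {parity} parity LUN(s): {tb:.1f} TB usable, ${usd:.2f}/GB")
```

The more units you stripe across, the smaller the parity overhead gets, but every online spare LUN you add eats another 3.5TB.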
Now the stupid question - what are you trying to do that would require this much space when you don't have the budget to get a "tested, supported, enterprise" solution? Building things is fun, but at some point you need to back up and say, "Am I willing to risk my company on my solution?" EMC, HDS, IBM, HP and other big vendors are willing to step up and make sure your solution works, runs and will not fail (see that video with the SAN array getting shot?)
If you require high levels of performance (=comparable to local direct-attached-disk) or reliability (=must be online "all the time") then stop right now and go out talking to commercial vendors. You will not save enough money doing it yourself to make up for the stress, people-power overheads and losses the first time the whole house of cards falls down.
However, if your performance or reliability requirements are not so high (ie: it's basically being used to archive data and you can tolerate it going down occasionally and unexpectedly) then doing it yourself may be a worthwhile task. I get the impression this is the kind of solution you're after, so you'll be looking at standard 7200rpm SATA drives.
Firstly, decide on a decent motherboard and disk controller combo. CPU speed is basically irrelevant; however, you should pack each node with a good 2G+ of RAM. Make sure your motherboards have at least two 64bit/100MHz PCI-X buses. I recommend (and use) Intel's single-CPU P4 "server" motherboards and 3ware disk controllers. I believe the Areca controllers are also quite good. You will have trouble on the AMD64 side finding decent "low end" motherboards to use (ie: single CPU boards with lots of I/O bandwidth). Do not skimp on the motherboards and controllers, as they are the single most important building blocks of your arrays.
Secondly, pick some disks. Price out the various available drives and compare their $/GB rates. There will be a sweet spot where you get the best ratio, probably around the 400G or 500G size these days (it's been several months since I last put one of these together).
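Finding the sweet spot is just a sort by price divided by capacity. A small Python sketch follows; the drive list and prices in it are made-up placeholders, not real quotes, so plug in whatever your supplier gives you.

```python
def rank_by_price_per_gb(drives):
    """Sort (label, capacity_gb, price_usd) tuples from cheapest to dearest $/GB."""
    return sorted(drives, key=lambda d: d[2] / d[1])


if __name__ == "__main__":
    # Hypothetical prices for illustration only.
    quotes = [
        ("250G", 250, 90.0),
        ("400G", 400, 120.0),
        ("500G", 500, 160.0),
        ("750G", 750, 330.0),
    ]
    for label, gb, usd in rank_by_price_per_gb(quotes):
        print(f"{label}: ${usd / gb:.3f}/GB")
```

With these placeholder numbers the 400G drive comes out on top, with the 500G close behind.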
Thirdly, find a suitable case. I personally don't like to go over 16 drives per chassis, but there are certainly rackmount cases out there with 24 (and probably more) hotswap SATA trays.
Now, put it all together and run some benchmarks. In particular, benchmark hardware RAID vs Linux software RAID and see which is faster for you (it will probably be software RAID, assuming your motherboard is any good). Bear in mind that some hardware RAID controllers do not support RAID6, but only RAID5. Prefer a RAID6 array to a RAID5 + hotspare array.
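On the RAID6 versus RAID5 + hotspare point: both give you the same usable space from the same number of disks, but only RAID6 survives a second failure while the first is still being rebuilt. A rough Python sketch of that comparison (the function name and return format are mine, for illustration only):

```python
def raid_layout(n_disks, disk_gb, level, hot_spares=0):
    """Return usable capacity and how many simultaneous disk failures are survivable.

    level is "raid5" (one parity disk) or "raid6" (two parity disks).
    Hot spares sit idle until a failure, so they add no capacity and do not
    raise the number of failures the array can take at the same moment.
    """
    parity = {"raid5": 1, "raid6": 2}[level]
    active = n_disks - hot_spares
    if active < parity + 2:
        raise ValueError("too few active disks for " + level)
    return {
        "usable_gb": (active - parity) * disk_gb,
        "simultaneous_failures_ok": parity,
    }


if __name__ == "__main__":
    print("RAID6:          ", raid_layout(16, 500, "raid6"))
    print("RAID5 + 1 spare:", raid_layout(16, 500, "raid5", hot_spares=1))
```

For a 16-bay chassis of 500G drives, both layouts give 14 disks' worth of space; the difference is entirely in how they behave during a rebuild.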
You now have the first component of your Honkin' Big Array. Install a suitable Linux distribution onto it (either use a dedicated OS hard disk, get some sort of solid-state device or roll a suitable CD-ROM based distro for your needs). Set up iSCSI Enterprise Target.
Finally, you need a front-end device to make it all usable. Get yourself a 1U machine with gobs of network and/or bus bandwidth. I recommend one of Sun's x4100 servers (4xGigE onboard + 2 PCI-X). Throw some version of Linux on it with an iSCSI initiator. Connect to your back-end array node and set it up as an LVM PV, in an LVM VG. Allocate space from this VG to different purposes as you require.
When you start to run out of disk, build another array node, connect to it from the front-end machine and then expand the LVM VG. As you expand, investigate bonding multiple NICs together and additional dual- or quad-port NICs to supplement the onboard ones. I also recommend keeping at least one spare disk in the cupboard at all times for each of your storage nodes, and also a spare motherboard+CPU+RAM combo, to rebuild most of a machine quickly if required. Ideally you'd keep a spare disk controller on hand as well, but these tend to be expensive, and if you're using software RAID, any controller with enough ports will be a suitable replacement for any failures.
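The expansion step on the front-end box boils down to a few standard LVM commands once the new node's iSCSI disk shows up as a block device. Here is a small Python sketch that only builds the command lines (it runs nothing) so you can eyeball them first. The device path, VG name and LV name are hypothetical; substitute your own.

```python
def lvm_expand_commands(new_device, vg_name, lv_name=None, grow_by=None):
    """Build the LVM commands to add a new block device to an existing VG.

    new_device: the block device the new array node appears as (assumed path).
    lv_name / grow_by: optionally grow one logical volume, e.g. grow_by="500G".
    Returns a list of argument lists; nothing is executed here.
    """
    cmds = [
        ["pvcreate", new_device],           # label the device as an LVM PV
        ["vgextend", vg_name, new_device],  # add the PV to the existing VG
    ]
    if lv_name and grow_by:
        lv_path = f"/dev/{vg_name}/{lv_name}"
        cmds.append(["lvextend", "-L", f"+{grow_by}", lv_path])
    return cmds


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    for cmd in lvm_expand_commands("/dev/sdc", "bigvg", "archive", "500G"):
        print(" ".join(cmd))
```

Note that lvextend only grows the logical volume; the filesystem sitting on it still has to be grown afterwards with that filesystem's own resize tool.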
We do this where I work and have taken our "array" from a single 1.6T node (12x200G drives) to 10T split amongst 3 nodes. We are planning to add another ~6T node before the end of the year. *If* this is the kind of solution that would meet your needs, I can offer a lot of help, advice and experience to you.
However, our "array" has neither high perfo