Terabyte Storage Solutions?
DeMechman asks: "As many on Slashdot may know, storage is one thing which you can never have enough of. Given the current situation with CD/DVD rot (Personally I can attest to a 10% attrition rate) hard drives in a RAID configuration seem to be a better and more economical solution. If you own more than fifty CD/DVDs, it can be a daunting task to find a file. I am wondering if anyone has found a hardware solution that can inexpensively be set up to handle 10 or more 250GB HDDs in a RAID configuration. Primarily, has any case manufacturer tackled this niche market yet?"
We just use old server cases and fill them with drives, a couple power supplies (some of those drives suck up POWER lemme tell ya) and then throw a NIC in to get to them all. Mind you, there's no RAID that way... we recently started messing with RAID in small ways and I like it, we will eventually start putting RAID controllers into the boxes and mirroring our setups.
We just added a couple of these at the office. We used a SATA RAID card from LSI Logic (formerly AMI MegaRAID) and on top of the 6-port device added six 200GB Western Digital drives. From that page, a 200GB Maxtor can be had for around $85.00. Add in a 2U case, which is probably the most expensive part at around $300.00, and you have yourself the most expensive components of what you need, subtract the motherboard, processor, and all that jazz (which can be had for another $300.00 or so). Running Linux LVM with Samba-3 and Winbind for full Active Directory integration and authentication on top of an ACL-enabled ext3 filesystem, of course! ;)
I bought a case from http://www.servercase.com/, a 3Ware RAID Controller and 8 200GB IDE drives. I've got 1400GB of usable space in RAID5. It runs Linux with Samba and NFS. I also use it for a MythTV Backend.
Unfortunately, once you have all this space, you WILL find a way to use it all and need more. I put this system together about 10 months ago, and it's at 85% capacity now. I'm preparing to build a new server with 12 250GB drives, to have just over 4TB between the 2 systems.
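For anyone checking the numbers above, here is a minimal sketch of the RAID 5 capacity arithmetic. It assumes the planned 12-drive box would also be a single RAID 5 array, which the post does not actually say; the function name is made up for illustration.

```python
def raid5_usable_gb(drive_count: int, drive_gb: int) -> int:
    """Usable space in a RAID 5 array: one drive's worth of capacity holds parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_gb


current = raid5_usable_gb(8, 200)    # existing server: 8 x 200GB
planned = raid5_usable_gb(12, 250)   # planned server, assumed to be one RAID 5 array
print(current, planned, current + planned)  # 1400 2750 4150
```

That lines up with the 1400GB figure and with "just over 4TB" across the two systems.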
We use Linux LVM to take snapshots and then do a hot backup of that data to an archive box. That archive box contains removable hard drives (tape drives are just crap), and we then take the physical drives to an off-site location to provide security and all the goodness that comes with off-site storage. We also use rsync to synchronize our production NAS devices with a parallel NAS device, to which we can hot-cut and have a current copy of all our data to within a 15 minute window. Because rsync (with ext3 ACL support, mind you) only copies what has changed on the filesystem, it goes relatively quickly. You can find my rsync packages at ftp://bagel.express.org/ (as well as patched Samba-3 packages that really work with Winbind and some updated kernel packages for LVM+snapshot support).
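A rough sketch of what a snapshot-then-rsync run like this could look like, written as a Python helper that only builds and prints the commands rather than running them. The volume group, volume name, snapshot size, mount point and destination are all hypothetical, and the poster's real scripts may look nothing like this. Note that `rsync -A` (preserve ACLs) needs an rsync built with ACL support.

```python
import shlex


def snapshot_backup_commands(vg, lv, snap_size, mount_point, dest):
    """Return the shell commands for one snapshot-and-copy backup run."""
    snap = f"{lv}-snap"
    return [
        # Freeze a point-in-time view of the live volume.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, f"/dev/{vg}/{lv}"],
        # Mount the snapshot read-only so the copy sees consistent data.
        ["mount", "-o", "ro", f"/dev/{vg}/{snap}", mount_point],
        # Copy only what changed, keeping permissions and ACLs.
        ["rsync", "-a", "-A", "--delete", f"{mount_point}/", dest],
        ["umount", mount_point],
        # Drop the snapshot so it stops consuming space.
        ["lvremove", "-f", f"/dev/{vg}/{snap}"],
    ]


# Hypothetical names, for illustration only.
for cmd in snapshot_backup_commands("vg0", "data", "5G", "/mnt/snap", "archive:/backup/data/"):
    print(shlex.join(cmd))
```

Printing instead of executing keeps the sketch safe to try on a box without LVM; feeding each list to `subprocess.run(cmd, check=True)` would be the obvious next step.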
Not the most optimal arrangement because of the bus having two drives on each channel, but it doesn't seem to affect performance too much since it is striping the data across all of the drives. I'm assuming it stripes in order, so you'd want to stagger the drives such that 1 & 2, 3 & 4 are not on the same controller.
Have you worked with a 3ware card? Believe me when I say that this solution's performance will suck compared to using a real RAID solution such as a 3ware Escalade 9500S. Even on software RAID, the 3ware card will kick its butt (hmm, I'm not even sure 3ware's hardware RAID is as fast as Linux software RAID on a fast system).
1) First, you are using 2 drives per channel, thus it only writes to one drive at a time on each channel. An 8-port 3ware card can write to all 8 at once.
2) The Promise card is only an ATA/133 card, not RAID, and doesn't support command queuing.
3) You are using multiple cards, which requires more IRQs, which in turn slows down overall system performance.
4) Promise support in Linux sucks. It's better now than it has been in recent years with libata, but it's still crappy Promise hardware.
That's one thing that I always appreciate product builders keeping an eye on - for example, the 3.5tb XServe RAID, while more expensive (and providing more features), specifies maximum heat output of 1365 btu for the disk array and 990 for the server (assuming all 17 disks running full tilt with both G5s pegged). Not bad for a 4.25tb system. Under lower load, heat output from both drops substantially.
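To put those heat figures into more familiar units, here is a quick conversion sketch. It assumes the quoted "btu" numbers are BTU per hour, which is how heat output is normally specified, and uses the standard factor of roughly 3.412 BTU/h per watt.

```python
BTU_PER_HOUR_PER_WATT = 3.412  # 1 watt is about 3.412 BTU/h


def btu_per_hour_to_watts(btu_per_hour: float) -> float:
    """Convert a heat output in BTU/h to watts."""
    return btu_per_hour / BTU_PER_HOUR_PER_WATT


array_btu, server_btu = 1365, 990  # maximum figures quoted in the post
total_watts = btu_per_hour_to_watts(array_btu + server_btu)
print(f"array {btu_per_hour_to_watts(array_btu):.0f} W, "
      f"server {btu_per_hour_to_watts(server_btu):.0f} W, "
      f"total {total_watts:.0f} W")
```

So the worst case for the pair works out to a bit under 700 W of heat to get rid of.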
You're special forces then? That's great! I just love your olympics!
I'm sure you were expecting this, but their disks are not technically raid as they are not redundant. lose one disk, lose the lot. this means that your data is less safe than it would be just spread across 2 or 4 separate discs, as if you lose a drive then, you only lose some data, not all.
;)
also, I believe they don't even qualify as the badly named raid0, as I'm under the impression that the disks are concatted together, not striped.
what I'd love to see is an Xraid mini as it were. something with much of the manageability of the full size xraid, but not as much redundancy. so perhaps a nice desktop case (to match the g5 *of course*:) that could take 4 or 5 sata disks in hot swap caddies (maybe the same caddies as in the xraid) with a hardware raid controller on board for striping, mirroring and raid 5. a single gig ethernet on the back and then fw400 and 800 ports.
if it had the same cross platform compatibility as the big xraid, same type of management tools etc, then it could be a big hit, and be an official filling for the big hole that is g5 storage.
sure, the xraid is great and cheap, but its price of entry is still high when all you want is a terabyte or so of fast storage for one or two machines at home, ie no rack to place it in, no need for redundant psu's and fibre channel connectivity, that kinda thing.
HD video editors esp need something, as for the data speeds they need for uncompressed hd (180MBps) that's 4 striped disks, which you can't place in a g5 without using third party solutions.
just a thought, come on apple. and when you make one, I just ask for a fully loaded one for myself
dave
10% attrition? In my case, I can't remember a CD-R I recorded that ever failed (I don't use CD-RW, maybe these are something else...). All my source backups are on CD-R, and I make quite a lot of them. I also burn a *lot* of music compilations for my car. Some of the CD-Rs there stopped working after (quite) a while, but that is only because a car is a harsh environment for a CD-R.
The CD-R brand must have something to do with it. I only use Sony's CD-R. Not for a particular reason. Only that none of them ever failed me.
Thus, my opinion is that CD-Rs are one of the best (if not the best) solutions for non-industrial backups. By industrial, I mean freaking mission-critical multi-GB multi-million-dollar backups.
Note: I never tried DVD-R. You must code a *lot* of lines to make your projects' sources weigh more than 700 MB. (Hum, quick cvs tree check: 75 MB... ok, I might be wrong here. However, these 75 MB were a *lot* of work for me...)
perception is reality
Because, from experience, putting a 3ware controller in a 64-bit/66MHz slot is more than 4 times faster than putting the same controller in a 32-bit/33MHz slot. If you're paying for the controller and the array, don't skimp and cheat yourself out of 80% of the performance because you won't pay for a decent motherboard.
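For reference, the theoretical peak numbers behind that claim can be sketched as below. This is only the textbook bus width times clock calculation; the nominal 33.33 and 66.66 MHz clocks are my assumption, and real throughput is lower because of protocol overhead and other devices sharing the bus.

```python
def pci_peak_mb_per_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak PCI throughput: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * clock_mhz


slow = pci_peak_mb_per_s(32, 33.33)   # plain desktop PCI slot
fast = pci_peak_mb_per_s(64, 66.66)   # 64-bit/66MHz server slot
print(f"{slow:.0f} MB/s vs {fast:.0f} MB/s, ratio {fast / slow:.1f}x")
```

On paper the gap is 4x (about 133 MB/s against 533 MB/s). Seeing more than that in practice may be down to the 32-bit bus also carrying the NIC and everything else in the machine.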
i've got well over a hundred cds on the desk next to me, bout 6 or 7 hundred filed away, and we haven't even begun to discuss DVDs. i think i could max out a TB pretty quick. but anyway, my case can hold 10 drives, actually it can hold 11, and that's only if you're not creative enough to find new ways to mount them. just pick up a full tower, had mine for years, it's a real 'babe magnet' as well! (unfortunately it seems to have the same polarity as women)
I have two systems, with about 1.3 and 2.5 TB respectively for archiving DVD quality video and MP3s. I looked at RAID but found it was not necessary. I prefer to manage the disks (some are removable) and do not need high performance even when streaming the video over my in home LAN.
I use DVArchive with DVD or satellite to ReplayTV for video capture and playback. DVA is great for managing multiple volumes and dynamically discovers videos if I want to move them to another drive. It also supports copy/move between the two systems (I use a 1Gb switch between systems). CPU performance is not key for playback, though it is critical for transcoding (I use a dual processor system for transcoding and it smokes my single CPU system).
I have a LARGE MP3 collection (forgive me for not publicly admitting to its size) and I find the same systems/drives are ample for supporting a digital audio library. I switched to iTunes for managing music (MusicMatch melts down when the number of files gets large) and stream it with SlimServer to Squeezebox devices for high quality playback on home theater and other receivers.
My recommendation is to go with generic disk drives: brand names, 7200 RPM with 1-3 year warranties. I get them locally on sale for under $150, sometimes $130/250GB; that's 52 cents per GB, a little more per GB than a DVD-R disk but more reliable and infinitely more flexible. I can recreate a DVD off of the disk image if needed.
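The cents-per-GB figure is easy to recompute for whatever sale price turns up. A tiny sketch, using only the two prices mentioned above:

```python
def cents_per_gb(price_dollars: float, capacity_gb: float) -> float:
    """Cost of a drive in cents per gigabyte."""
    return price_dollars * 100 / capacity_gb


for price in (130, 150):
    print(f"${price} for 250GB -> {cents_per_gb(price, 250):.0f} cents/GB")
```

At $130 that gives the 52 cents quoted, and 60 cents at the $150 end.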
I am more concerned with heat and power consumption (it adds up) than disk performance, someone will need to explain to me why I'd need to mess with RAID for this...
The cheapest way I know is still a large PC tower case, a 3ware SATA RAID controller, a big PSU and a couple of large fans with attitude. On the PCI side you don't need much unless you want to do gigabit (or of course just shove your server in the same case and don't do the I/O networked).
I don't understand why people automatically assume that DVD+-R/RWs are prone to the same problem. The recordable layer in these disks is buried in plastic. It's NOT on the surface. There's no oxygen coming to it, so in theory DVD+-R/RWs should be a heck of a lot less prone to "rot".
I was just researching this, as I am building a 1 TB storage server myself. I have come up with the following solution: 1 Broadcom BC4852 Serial ATA RAID controller. Now this thing works wonders, and only costs (approx) $362. It can change RAID levels without bringing the array down (i.e., start with 1 HD, pop another in once you buy it, since SATA is hot-swap, and move to RAID 1 or RAID 0 automagically, then pop another in and go RAID 5). It supports 8 drives PER controller, and you can use 4 controllers (AND THEY ALL ACT AS ONE). This means you can have up to 8 TERABYTES using 250 GB drives, or 7.75 TB in RAID 5. The 160 GB drives are about $169 apiece, so add a basic motherboard and chassis and you've got a full system. Don't forget a bunch of drive trays if you want to hot swap.
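A quick sanity check of the maximum-capacity figures, as a sketch. It assumes all four controllers really are treated as a single RAID 5 set, so only one drive's worth of space goes to parity; the variable names are illustrative.

```python
controllers = 4
drives_per_controller = 8
drive_gb = 250

total_drives = controllers * drives_per_controller   # 32 drives
raw_tb = total_drives * drive_gb / 1000               # everything striped, no redundancy
raid5_tb = (total_drives - 1) * drive_gb / 1000       # one drive's worth lost to parity
print(total_drives, raw_tb, raid5_tb)  # 32 8.0 7.75
```

So 8 TB raw and 7.75 TB in RAID 5 both check out for a fully loaded setup.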
Yeah--oddly though I'm getting better Bonnie results by using Linux RAID 5 than their hardware RAID 5. But it is possible (just) to stream full-size, full-framerate PAL video to 'em over NFS! (sustained 40MB/s). Anyway in software you can now do RAID 6 :)
The reason you're getting better RAID 5 results from software RAID vs. hardware RAID is the parity calculations involved with writing to a RAID 5 volume. On a hardware RAID setup, these are calculated on the RAID card itself, which probably has a 200 or 400 MHz chip that does these calculations. Back when CPUs were only 400 MHz, this was great, because there was no load put on the CPU, and the RAID controller worked just as fast or faster than a software RAID setup. Now that CPUs are 3 GHz+, there's no way a dedicated hardware RAID card can keep up, and unless you're running a huge load on the server, you've probably got 1 GHz or so of free CPU bandwidth to burn for software RAID...
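For anyone wondering what those parity calculations actually are: RAID 5 parity is a plain XOR across the data blocks in a stripe. Here is a toy sketch with three data blocks and one parity block; the block contents are made up, and real arrays also rotate the parity block between drives, which this skips.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]   # one stripe spread over three data drives
parity = xor_blocks(data)             # what the parity drive would hold

# Simulate losing the second drive and rebuilding it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

Every write has to update that parity, which is the work that lands either on the card's own chip or on the host CPU.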
Want to see the performance really increase? Give up RAID 5 and go with a real RAID solution like RAID 1 or RAID 1+0.
"When the president does it, that means it's not illegal." - Richard M. Nixon
I'm in the middle of doing this myself... I'm saving the money to buy the drives I need. I've already got the server and array below.
;-)
2x Sun Ultra 10 desktop machines (360mhz / 512mb / 2x 18gb drive (hot-pluggable drives)) @ 150.00/ea (EBay)
2x 3' HVD cables @ 28.00/ea
2x X6541A Sun Dual Differential Ultra/Wide SCSI @ 100.00/ea (EBay)
1x Sun StorEdge A1000 storage array @ 120.00 (EBay)
10x Sun Ultra2 SCSI Drive Sleds @ 58.00 (EBay)
7x Seagate - ST1181677LCV 188gb Ultra2 SCSI drives @ 550.00/ea (PriceWatch)
Total for 1,128gb of Raid-5 storage: $4584.00
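The total only adds up if the drive sleds are $58.00 for the lot of ten rather than each (that line has no "/ea"), so the sketch below makes that assumption explicit. Descriptions are shortened from the list above.

```python
parts = [
    # (description, quantity, unit price in dollars)
    ("Sun Ultra 10 desktop", 2, 150.00),
    ("3' HVD cable", 2, 28.00),
    ("X6541A SCSI card", 2, 100.00),
    ("StorEdge A1000 array", 1, 120.00),
    ("Ultra2 drive sleds, lot of 10", 1, 58.00),   # assumption: one price for all ten
    ("Seagate 188GB SCSI drive", 7, 550.00),
]

total = sum(quantity * price for _, quantity, price in parts)
usable_gb = (7 - 1) * 188   # RAID 5 across 7 drives gives up one drive to parity
print(f"${total:.2f} for {usable_gb}GB, about ${total / usable_gb:.2f}/GB")
```

That reproduces both the $4584.00 total and the 1,128GB of RAID 5 space.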
The trick is, with this setup you will have two machines with redundant access to the drives and data in the array. The Ultra10 is enough to handle any home use I can think of, and paired with Solaris 9 or even Linux will be blazingly fast. I just think that it's more expensive than any comparable SATA setup... great for us Sparc lovers tho!
This solution looks very interesting to me.
http://www.areca.us/IDERAID.htm
It takes up 3 external 5.25" bays and allows you to connect 5 3.5" drives. It provides expandable RAID 5, all internally with its own hardware, and simply looks like an ATA or SATA device to the computer.
Has anyone here actually used one?
kiwi
--
System Architecture
Toshiba TMPR4927ATB 200MHz 64-bit RISC processor
64MB on-board cache memory with ECC protection
Areca 5 channels IDE controller (ARC600-66) with enhanced H/W XOR engine
NVRAM for RAID configuration & transaction log
Write-through or write-back cache support
Firmware in Flash ROM for easy upgrades
RAID Features
RAID level 0, 1 (0+1), 3, 5 and JBOD
Multiple RAID selection
Array roaming
Online RAID level/ stripe size migration
Online RAID capacity expansion and RAID level migration simultaneously
Automatically and transparently rebuilds hot spare drives
Hot swap new drives without taking the system down
Instant availability and background initialization
Automatic drive insertion / removal detection and rebuilding
Disk Bus Interface
Ultra ATA/133 compatible
5 channels, operating in parallel
5 hot-swap drive trays
48-bit LBA support allows disk exceeding 137GB
Staggering the Spin-Up of Individual Disk to Solve the Power-on Surge
Host Bus Interface
ARC-5010
Dual ATA interface-Ultra ATA/133 & Serial ATA 1.0
Ultra ATA/133 compatible Transfer rate up to 133MB/sec
Serial ATA 1.0 - 1.5Gbps(150 MB/sec)
ARC-6010
Ultra 160-Wide LVD SCSI; Transfer rate up to 160MB/sec
Tagged Command Queuing
Concurrent I/O commands
"In RAID, IDE has the disadvantage..."
IDE RAID hit mainstream over five years ago, when Adaptec released an IDE RAID card. This card happened to have four separate IDE controller chips on it, and four cable connectors. I installed a solution using this card with four 73GB IDE drives from IBM (as big as they came in 1999, I think) in an 0+1 configuration. Mirrored striped sets, total usable capacity of 130GB, I think. (Not bad considering I had replaced mirrored 9GB SCSI drives.)
Wouldn't you know it, but one drive failed after three months. No problem, it was taken out and replaced with another (FedEx overnight from Dirt Cheap Drives) at a cost of 30 minutes after-hours downtime. And it was done by a technician who'd never seen the configuration before. I was overseas when it happened.
AFAIK, this machine (SuperMicro dual PPro 200, 384MB RAM) is still chugging along, running Windows NT Server 4.0, doing its thing as a file server for an engineering department who still haven't filled it up.
(What was it you were saying about IDE RAID?)
So for that 50 TiB total, you need 50/1.4 = 36 systems. 36 systems * 8 drives = 288 SATA drives spinning. How often do you have to replace one? I'm just wondering as I have had 4 x 200 GB drives in RAID 5 in my personal system for just under a year now and I've already had to replace one. Didn't lose anything, and it was under warranty, but in a month, I'll be out of WD's crappy one-year warranty and I'll have to start buying drives as they fail to keep my data.
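The system and drive counts can be checked with a couple of lines. This sketch just rounds up to whole systems and uses the 1.4 TB usable and 8 drives per system from the earlier post; it says nothing about failure rates, which is the open question.

```python
import math

target_tb = 50
usable_tb_per_system = 1.4
drives_per_system = 8

systems = math.ceil(target_tb / usable_tb_per_system)   # 35.7 rounds up to 36
drives = systems * drives_per_system
print(systems, drives)  # 36 288
```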
Karma: pi (Mostly due to circular reasoning in posts).
The reason they gave is that even a fraction of modern CPU performance still far outclasses the chips on hardware RAID cards. Also, data cached on the card still has to go over the PCI bus, but data cached in RAM... well, it's already available.
A RedHat employee who was there confirmed that RedHat has seen the same thing in their own testing. For performance go with software RAID. With anything over about an 800MHz CPU, you would be hard pressed to notice the CPU use.
In fact, unless you are doing something that is virtually entirely computational like SETI@Home, you are going to be generating a fair amount of output. Enough that the faster disk IO actually increases your speed more than what would be gained by moving the RAID load to separate hardware. It also lets you spread disks over a couple of SATA controllers and potentially multiple PCI buses (if your MB supports it).
Could you tell me where do you find good 250g SATA drives for $100 each, please?
"The number of Unix installations has grown to ten, with more expected." (Unix Programmer's Manual, 2nd ed.; june 1972)
How on Earth is RAID 5 "less real" than RAID 1 or 10?