Terabyte Storage Solutions?
DeMechman asks: "As many on Slashdot may know, storage is one thing which you can never have enough of. Given the current situation with CD/DVD rot (personally, I can attest to a 10% attrition rate), hard drives in a RAID configuration seem to be a better and more economical solution. If you own more than fifty CD/DVDs, it can be a daunting task to find a file. I am wondering if anyone has found a hardware solution that can inexpensively be set up to handle 10 or more 250GB HDDs in a RAID configuration. Primarily, has any case manufacturer tackled this niche market yet?"
I'd say that $2.82/GB, for a well-built, well-designed 14-drive 3U RAID (0, 1, 3, 5, 0+1, 10, 30, 50) hardware cabinet with dual 2Gb/s Fibre Channel connectivity, dual 100Mbit Ethernet and serial for monitoring and management, excellent Java setup, management, and monitoring software, redundant hot-swappable power supplies and fans, and that works and is qualified for use with Windows, Linux, and Mac OS X, qualifies as "inexpensively". But that's just me.
http://www.apple.com/xserve/raid/
Academic prices for:
1.00TB - $5399
1.75TB - $6749
3.50TB - $9899
It's not RAID, but you could buy a 1-terabyte drive from LaCie.
In soviet russia, You ask not what country do for you, but what you do for country!
Oh wait...
Apple is one of the cheapest, at $6000 (with drives).
See page here.
I have a TB here, and rather than raid, I decided to do a nightly "rsync" mirror to a "yesterday" partition.
The two advantages of the nightly rsync over RAID are
- It protects against user error too. If I make a bad edit, I can always 'diff' against '/yesterday/home/me/...'
- It makes upgrades of both hardware and software easy. Since my live backups are exactly that (live, and tested every day), one machine can be fully upgraded while the other acts as the primary one for a while.
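A nightly mirror along these lines might look something like the sketch below. The paths, the function name, and the cron schedule are placeholders of mine, not the poster's actual setup.

```shell
#!/bin/sh
# Mirror a live tree into a "yesterday" copy, the way a nightly cron job might.
# Both paths are illustrative placeholders.
mirror_to_yesterday() {
    src=$1     # e.g. /home
    dest=$2    # e.g. /yesterday/home
    # -a preserves permissions, timestamps and symlinks; --delete removes
    # files that no longer exist in the source, so the mirror matches the
    # state of the live tree at the time of the run.
    rsync -a --delete "$src"/ "$dest"/
}

# Hypothetical crontab entry running a script that calls the function at 03:00:
# 0 3 * * * /usr/local/sbin/mirror_to_yesterday.sh
```

After a bad edit, diffing the live file against its copy under the mirror shows what changed since the last run. Note that --delete means a file deleted today is also gone from the mirror after tonight's run, so this gives you exactly one day of history.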
Important data also gets backed up to another large HD in my car and DVDs in a safe occasionally, to protect against a fire or burglars.
You can "cheaply" buy 3U rack mount cases that hold 15 drives in hot-swappable SATA or SCSI cages up front. Combined with a 3ware 9500-12, and leaving 3 cages empty (or holding spare drives that are just not cabled up), this will give you 2.75 TB of RAID 5 storage in each unit with 250GB drives. If you were really hard up for space, you could use a pair of 9500-8's and this would give you 3.25 TB per unit. Some 4U units hold 16 drives, which gives you the full 3.5TB in 2 x RAID 5 arrays.
I have 8 x 160GB Maxtor drives in a RAID5 array. It's fast and relatively inexpensive [Fry's Electronics was recently selling the 160s for $69/ea].
The 160GB drives used to come with a Maxtor [Promise] ATA-133 card. Two of those will support eight drives. Not the most optimal arrangement because of the bus having two drives on each channel, but it doesn't seem to affect performance too much since it is striping the data across all of the drives. I'm assuming it stripes in order, so you'd want to stagger the drives such that 1 & 2, 3 & 4 are not on the same controller.
Output of df -h:
/dev/md2 1.0T 521G 522G 50% /ext
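The staggering idea above (alternating array members between the two controllers) could be sketched like this. The device names are made up for illustration, and whether the array really stripes in the listed order is the poster's assumption, not something this verifies.

```shell
# Interleave two space-separated device lists so that consecutive array
# members alternate between controller A and controller B.
interleave() {
    list_a=$1
    list_b=$2
    out=""
    for a in $list_a; do
        out="$out $a"
        if [ -n "$list_b" ]; then
            out="$out ${list_b%% *}"          # first remaining item of list B
            case $list_b in
                *" "*) list_b=${list_b#* } ;; # drop that item
                *)     list_b="" ;;
            esac
        fi
    done
    # Append anything left over if list B was the longer one.
    if [ -n "$list_b" ]; then
        out="$out $list_b"
    fi
    echo "${out# }"
}

# Hypothetical device names: four drives on each of the two cards.
interleave "hde hdf hdg hdh" "hdi hdj hdk hdl"
```

The resulting order is what you would hand to something like mdadm --create /dev/md2 --level=5 --raid-devices=8 followed by the device list, if you were building the array with Linux software RAID.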
The cost to assemble something like this?
~ $600.00
8 x $70 for the 160GB drives
2 x $20 ATA-133 controllers
The biggest issue is that there is no easy way to back up the array. You could use RAID 6 and have two drives' worth of parity info, but it still leaves you vulnerable to a catastrophic hardware (or building) failure.
Anyone have any ideas on how to back up 1TB in a home environment? i.e., not $3000 tape drives & $200 tapes
I bought a case from http://www.servercase.com/, a 3Ware RAID Controller and 8 200GB IDE drives. I've got 1400GB of usable space in RAID5. It runs Linux with Samba and NFS. I also use it for a MythTV Backend.
Unfortunately, once you have all this space, you WILL find a way to use it all and need more. I put this system together about 10 months ago, and it's at 85% capacity now. I'm preparing to build a new server with 12 250GB drives, to have just over 4TB between the 2 systems.
I use a Hard Drive Enclosure for backing up files. With IDE HDDs getting less and less expensive, picking one of these versatile enclosures up for less than $50 is a good value. I own a DVD burner but rarely use it for data storage since the enclosure is way more convenient. Now as far as 10 250GB drives in a RAID configuration, how redundant do you need your data to be? Or is it that you're just overly cautious after having your backup DVDs fail? Just curious.
I can answer your question, as I've just built one as a giant backup solution for our hosting company.
:)
I went with Serial ATA for a couple reasons:
1) It's cheaper and has more capacity than SCSI;
2) Cabling is not a mess as it is with regular IDE (if you've never seen serial ATA cables, the first thing you will notice is that they are small!);
3) It can hotswap, unlike regular IDE;
4) It's not that much more expensive than regular IDE.
I custom-built a 3U server from InterProMicro. They are a small (local if you are in the Bay Area) SuperMicro reseller that does great work. (If you need something, call and ask for Andy. Tell him Erica from Simpli sent you!)
The machine I specced out was as follows:
* 3U case with 8 hot-swap SATA drive bays;
* 8-port 3Ware 8506-8 SATA RAID controller;
* 5x250GB SATA drives in a RAID-5 array;
* Dual Xeon processors.
The 5 drives give you 1TB of storage, and expanding up to 8 gives you 1.75TB. I would also recommend a separate mirrored SATA 10KRPM array for the OS if you want really fast speeds.
This whole solution (Xeons; 5 drives; 3U case) cost just over $3000... which is pretty reasonable for 1TB of network-accessible storage. Interpro has solutions that go up to 24 SATA drives, which at 250GB each gives you an ungodly amount of space (5.75TB, if my calculations are correct.)
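The capacity figures here follow from RAID 5 giving up one drive's worth of space to parity. A quick sanity check, assuming a single array and decimal gigabytes (the function name is mine):

```shell
# Usable capacity of a single RAID 5 array in GB:
# (number of drives - 1) * size of each drive.
raid5_usable_gb() {
    drives=$1
    size_gb=$2
    echo $(( (drives - 1) * size_gb ))
}

raid5_usable_gb 5 250     # the 5-drive build
raid5_usable_gb 8 250     # all 8 bays filled
raid5_usable_gb 24 250    # the 24-drive chassis
```

These print 1000, 1750 and 5750, matching the 1TB, 1.75TB and 5.75TB quoted above. Splitting the drives into several arrays, or keeping a hot spare, costs one more drive each time.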
My suggestion is to go with a niche server builder like InterproMicro over Dell or Compaq or any of those guys. You can get the same high quality from a custom manufacturer without paying the steep brand name price from a larger manufacturer. As for the drives, any time the goal is "as much space as possible", SATA should be your first choice.
Good luck!
Simpli - Your source for San Jose dedicated servers and colocation!
-What RAID level you want (5 usually requires better hardware)
-Whether you want hardware RAID (I strongly recommend this) or soft RAID
-How much redundancy you need (Battery backup cache? Redundant controllers? Hardware environmental controls?)
If you are looking for good PCI cards, I would strongly suggest a card from 3ware, and drives from a place such as Seagate. Getting a super-duper cheap card when terabytes of data are on the line is just fundamentally stupid. You can save some bucks now, but be ready with your next Ask Slashdot: "How do I recover data from my dead RAID?" Seagate now has a nice 5 year warranty, which matches well with good quality and reasonably cheap drives. Look at some of the SATA drives like the Barracuda. However, any decent quality drive maker can work. If you have even more money, you can look at some of the things offered by places like StorCase. A larger initial investment can become cheaper as you scale up the cheap hard drive count, and it can be a good thing in the long run. Obviously, the more time you are willing to invest doing things yourself, the cheaper you can get, to some extent, vs premade items. However, you get no support either.
Do read up on some of the fundamentals of RAID: Everything you need to know (and lots you don't) is probably at least mentioned in the PC Guide on RAID. Look through that. Things like hot swap and hot spares are important to understand. Finally, you should remember to check compatibility. Unfortunately, I for instance have not been able to find much of anything in the way of controller cards that is compatible with OS X (except the obvious, the XServe RAID). So I have something set up on a BSD box in my server closet that I then link to, more like a storage appliance. Happily, the 3ware cards and many others are now compatible with a wide variety of *nix and BSD flavors along with Windows, but do check to make sure.
Last but not least, remember this!: RAID is *not* a backup solution, but a highly redundant on-site storage system. Have another form of backups, even if it is just a RAID 1 off site, or DVD-Rs, or something. If a disaster happens (thieves, fire, nuclear destruction, John Ashcroft), on-site storage won't save you.
- Find any tall beige-box case. ($150)
- Find 9 good 250GB Serial ATA drives. ($100 each = $900)
- Get an 8-port serial ATA hardware RAID controller like these ($300)
- Get a good 400-500W power supply ($200)
- Any motherboard and CPU will do ($200)
- Spend a few extra bucks on gigabit ethernet ($50)
Put 8 of the hard drives into a RAID-5 array (the ninth is for your OS/system use). That makes about 1.4 TB for only $1800 total. The 3Ware IDE RAID thing works great with FreeBSD, which is what we use for everything.
Rip all your CDs as FLAC so that (1) you never have to rip them again (it's lossless), and (2) it's half the size of saving WAV files.
At least that's what we've done with our 68,000 CDs we have here.
I have two systems, with about 1.3 and 2.5 TB respectively for archiving DVD quality video and MP3s. I looked at RAID but found it was not necessary. I prefer to manage the disks (some are removable) and do not need high performance even when streaming the video over my in home LAN.
I use DVArchive with DVD or satellite to ReplayTV for video capture and playback. DVA is great for managing multiple volumes and dynamically discovers videos if I want to move them to another drive. It also supports copy/move between the two systems (I use a 1Gb switch between systems). CPU performance is not key for playback, though it is critical for transcoding (I use a dual processor system for transcoding and it smokes my single CPU system).
I have a LARGE MP3 collection (forgive me for not publicly admitting to its size) and I find the same systems/drives are ample for supporting a digital audio library. I switched to iTunes for managing music (MusicMatch melts down when the number of files gets large) and stream it with SlimServer to Squeezebox devices for high quality playback on home theater and other receivers.
My recommendation is to go with generic disk drives: brand names, 7200 RPM with 1-3 year warranties. I get them locally on sale for under $150, sometimes $130 for 250GB; that's 52 cents per GB, a little more per GB than a DVD-R disk but more reliable and infinitely more flexible. I can recreate a DVD off of the disk image if needed.
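For anyone comparing sale prices, the cents-per-GB figure is simple integer arithmetic. A small sketch (the function name is mine, and the result is truncated to whole cents):

```shell
# Price per gigabyte in whole cents: dollars * 100 / capacity in GB.
# Shell integer division truncates any fraction of a cent.
cents_per_gb() {
    price_dollars=$1
    capacity_gb=$2
    echo $(( price_dollars * 100 / capacity_gb ))
}

cents_per_gb 130 250    # the sale price
cents_per_gb 150 250    # the usual "under $150" ceiling
```

That prints 52 and 60, so the usual price works out to roughly 52 to 60 cents per GB.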
I am more concerned with heat and power consumption (it adds up) than disk performance, someone will need to explain to me why I'd need to mess with RAID for this...
The cheapest way I know is still a large PC tower case, a 3ware SATA RAID controller, a big PSU and a couple of large fans with attitude. On the PCI side you don't need much unless you want to do gigabit (or of course just shove your server in the same case and don't do the I/O networked).
I don't understand why people automatically assume that DVD+-R/RWs are prone to the same problem. The recordable layer in these disks is buried in plastic. It's NOT on the surface. There's no oxygen coming to it, so in theory DVD+-R/RWs should be a heck of a lot less prone to "rot".
And I've installed quite a few of these:
* SuperMicro motherboard (any of the newer ones, depending on your choice of architecture). Be sure to get one with PCI 133/64 and gigabit onboard.
* 3Ware RAID board(s).
* Chenbro rackmount cases (they have a very nice one with 16 SATA hotplug slots with backplane and all)
* Don't go cheap on the power supply. You'll need at least 600W. I always go for redundant ones.
* 16 SATA disks of your choice (250, 120 or 80GB)
* Linux!!! (Be careful with Fedora Core 2, it doesn't natively support the 3Ware cards - you'll need to compile your own)
Of course you could save about $1000 by using a cheap motherboard, chassis and PS. But it really pays off using the good brands on those.
By the way, you should always get an extra hard drive (or two). They will fail (sooner or later) and you don't want to be left hanging.
I can attest to this:
Our 48-node Beowulf has a /home volume on a 3ware-controlled array. Sometimes we get those users that decide they need to write out their incremental data sets across the NFS mount... from 48 nodes. Sure, a parallel file system would be great, but from what we've seen, only GFS was close to production quality (and they just recently GPL'd it).
Anyway, that kind of load brought that head node (dual proc 1700+ MP) to its knees until we decided to rebuild it. Moving from the hardware-controlled RAID to Linux's software RAID completely resolved that problem.
If you want to spend the extra money and have a warranty and fancier case, look at Nexsan, or EMC's AX100. Scary that EMC is selling something cheaper than the competition, but they are. Sorta disturbs the natural order of the universe. Still, either will set you back several thousand. The AX100 looks pretty impressive on paper. Options for dual controllers, and up to 3 TB in a 2U space. Haven't tried one myself yet.
Disclaimer: I work for a storage integrator, both are brands we sell.
Ignorance is the root of all evil.
Yeah--oddly though I'm getting better Bonnie results by using Linux RAID 5 than their hardware RAID 5. But it is possible (just) to stream full-size, full-framerate PAL video to 'em over NFS! (sustained 40MB/s). Anyway in software you can now do RAID 6 :)
The reason you're getting better RAID 5 results from software RAID than from hardware RAID is the parity calculation involved in writing to a RAID 5 volume. On a hardware RAID setup, parity is calculated on the RAID card itself, which probably has a 200 or 400 MHz chip that does these calculations. Back when CPUs were only 400 MHz, this was great, because there was no load put on the CPU, and the RAID controller worked just as fast as or faster than a software RAID setup. Now that CPUs are 3 GHz+, there's no way a dedicated hardware RAID card can keep up, and unless you're running a huge load on the server, you've probably got 1 GHz or so of free CPU capacity to burn for software RAID...
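For anyone wondering what those parity calculations actually are: RAID 5 parity is the bitwise XOR of the data blocks in a stripe. The toy sketch below uses one small integer per "block" and made-up values; real controllers and the Linux md driver do the same thing over whole chunks of data.

```shell
# XOR all arguments together. With data blocks as input this yields the
# parity block; with the surviving blocks plus parity as input it yields
# the missing block.
parity() {
    p=0
    for block in "$@"; do
        p=$(( p ^ block ))
    done
    echo "$p"
}

d1=172 d2=53 d3=201                 # three data blocks on three drives
p=$(parity "$d1" "$d2" "$d3")       # parity block stored on a fourth drive

# Pretend the drive holding d2 died: rebuild it from the survivors and parity.
rebuilt=$(parity "$d1" "$d3" "$p")
echo "parity=$p rebuilt_d2=$rebuilt"
```

Every write to a stripe means recomputing that XOR, which is the work being argued over here: either the chip on the card does it or the host CPU does.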
Want to see the performance really increase? Give up RAID 5 and go with a real RAID solution like RAID 1 or RAID 1+0.
"When the president does it, that means it's not illegal." - Richard M. Nixon
Or, if you want really durable read-only storage (i.e. lasting a few hundred years without maintenance), you could use the little 1x1 LEGO blocks as bits.
Therefore, a mere eight-by-eight city block area could store a full 1 terabyte of LEGO-ROM, with no worrying about DVD rot or head crashes (although access speeds would leave something to be desired).
>;k
Um... ever consider the mind-bogglingly simple solution of:
ls -R > ~/dvd.index/<disc_label>    (run once for each DVD)
grep "<whatever_youre_looking_for>" ~/dvd.index/*
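Wrapped up as two small functions, the idea might look like the sketch below. The function names, the mount point argument, and the optional index directory argument are my additions; the default index location follows the ~/dvd.index path used above.

```shell
# Record a recursive listing of one mounted disc under its label.
index_disc() {
    mount_point=$1                      # where the disc is mounted
    label=$2                            # name to file the listing under
    index_dir=${3:-$HOME/dvd.index}
    mkdir -p "$index_dir"
    ( cd "$mount_point" && ls -R ) > "$index_dir/$label"
}

# Search every saved listing, case-insensitively. The extra /dev/null
# argument makes grep print the index file name even if only one exists.
find_file() {
    pattern=$1
    index_dir=${2:-$HOME/dvd.index}
    grep -i -- "$pattern" "$index_dir"/* /dev/null
}
```

Each hit is printed as the index file path, a colon, and the matching line, so the disc label tells you which DVD to pull off the shelf. Since ls -R prints bare file names under per-directory headers, a hit shows the file name but not its directory on the disc.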