Ideas for a Home Grown Network Attached Storage?
Ken asks: "It seems that consumer level 1TB+ NAS boxes are all the rage right now. Being a digital packrat, with several computers/entertainment devices on my home network, I am becoming more interested in getting one of these for my home. Unwilling to dish out 1K or more up front, and possessing a little of the DIY spirit, I would like to build my own NAS and am interested in hardware/software ideas. While the small form factor PC cases are attractive, my NAS will dwell in the basement so I am thinking of a cheap/roomy ATX case with lots of power. I think that integrated gigabit Ethernet capabilities and PCI-Express on the motherboard are a must, as well as Serial ATA HDDs, but what processor/RAM? How strong does a computer really need to be to serve files? What about the OS? Win2K3 server edition? WinXP Pro? Linux?"
"I have been using Red Hat and then Fedora Core since it came out but only in a workstation role, and I have little experience with other flavors. What file system should I use for maximum compatibility? I will need it to work with Windows, Linux and several UPnP devices. I am planning on starting out with two or three HDDs in a RAID 5 config. and I would like to be able to add more HDDs as space is needed without any major changes. Thanks for any ideas."
If you're not worried about having it all in one big partition, do what I did. Get a big case that can hold lots of drives, and just keep adding in SATA or IDE expansion cards and drives. It's worked well so far.
If you do want it all on one big raid5 partition, good luck finding a way to add additional disks into it without rebuilding.
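For anyone weighing the one-big-array route, here is a rough sketch of the space arithmetic, assuming the textbook RAID 5 layout (one disk's worth of parity, every member truncated to the smallest disk) and the usual three-disk minimum. The 250 GB drive size and the function name are made up for illustration.

```python
def raid5_usable_gb(disk_sizes_gb):
    """Usable space of a RAID 5 set: one disk's worth of capacity holds parity."""
    if len(disk_sizes_gb) < 3:
        raise ValueError("RAID 5 is normally built from at least three disks")
    # Every member only contributes as much as the smallest disk in the set.
    return (len(disk_sizes_gb) - 1) * min(disk_sizes_gb)


# Hypothetical 250 GB drives, chosen only for illustration.
for count in (3, 4, 5):
    print(count, "disks:", raid5_usable_gb([250] * count), "GB usable")
```

Each added disk buys one more disk of usable space, but only if the array can actually be reshaped in place, which is the hard part mentioned above.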
Samba.
Common Linux file systems (ext, reiser, etc.) contain critical data-loss bugs on file systems bigger than 2TB; XFS is the exception. This was found to be the case in even the most recent 2.6 kernels.
Tony Battersby posted a patch to the LBD mailing list recently to address the ones he could find, but lacking a full audit, you probably shouldn't use any filesystem other than XFS.
Considering the gravity of these bugs, you might consider using XFS for everything. If the developers left these critical bugs in for so long, it makes you wonder about the general quality of the filesystems.
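Why 2TB in particular? A plausible reading, and it is only an assumption since the post doesn't spell out the cause, is that this is where a 32-bit count of 512-byte sectors runs out. A quick sketch of that arithmetic (all names below are illustrative):

```python
SECTOR_BYTES = 512            # traditional disk sector size
SECTOR_COUNT_32BIT = 2 ** 32  # distinct sector numbers a 32-bit integer can hold

# Largest device addressable with 32-bit sector numbers.
LIMIT_BYTES = SECTOR_BYTES * SECTOR_COUNT_32BIT
LIMIT_TIB = LIMIT_BYTES / 2 ** 40


def truncated_sector(sector):
    """What a sector number becomes if it is squeezed into 32 bits."""
    return sector % SECTOR_COUNT_32BIT


print("limit in bytes:", LIMIT_BYTES)
print("limit in TiB:", LIMIT_TIB)
# The first sector past the limit would alias back to sector 0.
print("sector 2**32 stored in 32 bits:", truncated_sector(2 ** 32))
```

If that assumption holds, a write just past the boundary could land at the start of the volume instead, which would be exactly the kind of silent data loss described.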
I feel strange advocating a MS-originated protocol -- but the truth is, serving files via Samba on Linux is going to be the best-performing[1], most-compatible remote file system available.
As for hardware, for small servers I like Linux software RAID, but for a big multidisk farm, you can't beat 3Ware cards. They take nice cheap IDE drives and turn them into a SCSI RAID. Moderately expensive, but beautifully functional. Finally, I've been having good luck with Seagate and WD drives, and bad luck with Maxtors. Your mileage may vary.
[1] Samba beats the MS implementations of SMB/CIFS. No guarantees about Samba vs NFS, GFS, Coda, whatever.
I was thinking of using a mini and a single firewire disk for a somewhat similar project.
But, OS X has RAID capability, so you could use something like this:
Some of us have the DIY spirit...
Seriously, why buy something when you could 1) build it (probably cheaper) yourself and 2) learn more from building it? Most DIY projects have a habit of benefiting you at some point in the future in ways that you can't predict when you start them.
Either you - not you personally, the rhetorical you - 1) don't have the time, which is acceptable, or 2) you don't have the knowledge, which you should be trying to gain, or 3) you are lazy, which is really quite sad.
There's more to life than just spending money on a problem. There's actually figuring out the solution to the problem.
$.02
I would also like to build such a thing, but a box full of disks spinning 24/7 is likely to use a lot of power and give off a lot of heat. Are there any power saving solutions to this? It would be nice if there were some intelligent software that, when you try to play a movie off the disk, spins up only the disk that has the file, reads a large chunk of it into memory, and spins the disk back down.
Is this doable?
When you're dealing with that much storage, you really need to categorize your files into what needs to be backed up and what doesn't. In this type of application (if it was me), most of the storage is likely to be filled with dvd rips & mythtv recordings, or backups from your main system(s). So you would want to back up a list of what you have, but you can always recover from original media (in the case of dvd rips, or off of re-runs for tv shows). Also, on a storage server you're more likely to have data loss from physical drive failure (hence the raid 1 or 5). Since you won't be playing with the system that much once it's set up, you remove a lot of risk factors that you'd have on a desktop system (accidental file deletion, filesystem corruption, ...)
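The "back up a list of what you have" part could be as small as the sketch below. It is one possible approach rather than a recommendation of a particular tool; the function names and the tab-separated manifest format are invented for this example.

```python
import os


def build_manifest(root):
    """Return sorted (relative_path, size_in_bytes) pairs for every file under root."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                continue  # file vanished or is unreadable; leave it out
            entries.append((os.path.relpath(full, root), size))
    return sorted(entries)


def write_manifest(root, out_path):
    """Write one 'size<TAB>path' line per file so the list itself can be backed up."""
    entries = build_manifest(root)  # scan first, so the manifest never lists itself
    with open(out_path, "w", encoding="utf-8") as out:
        for rel, size in entries:
            out.write(f"{size}\t{rel}\n")
```

Something like `write_manifest("/srv/media", "/srv/backup/media-manifest.txt")` (both paths hypothetical) run from cron would keep a small text file that is cheap to copy elsewhere, even when the rips themselves are not backed up.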
Imagine this with a high-performance SATA raid controller [1] [2], in an enclosure barely bigger than the 4 hard drives alone.
Does anyone know where to buy this motherboard? What about practical experience with this sort of configuration?
When we developed the PetaBox at The Archive, the idea was to use off-the-shelf PC hardware and maximize GB/buck, while keeping cooling and power costs low. It's worked out pretty well. See also my unofficial PetaBox web page.
It turns out that you really don't need much of a PC to serve files. We underclocked the cheap little Via C3 processors to 800MHz to reduce power and heat, and they still troop along nicely. SATA is not necessary, since you're going to be bottlenecked on the network connection anyway. We used 512MB of RAM per node, but only because our system runs a gaggle of perl scripts to provide a variety of services (file searches, XML-based metadata updates, etc). If you're just going to be running NFS or Samba, 256MB is probably plenty (unless you choose to run Gigabit over a mere 32-bit PCI bus, in which case 512MB or 1GB would be better, so that you're reading more from filesystem cache and pounding the hard drives over your overloaded bus less). Gigabit ethernet is a must (we used 100bT for the PetaBox, which is annoying at times, but the cheaper 100bT 48-port switches were instrumental in keeping the overall price of the system low). We stuck four hard drives in each case, mostly from previous bad experiences trying to work with eight-disk machines. I can't say too much about the disk failure rate statistics which incited us to switch to Hitachi Deskstars, but I will say that I'm glad our PetaBox is using Deskstars and I will only use Deskstars in my workstation at home.
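To put rough numbers on the network and bus bottlenecks mentioned above, here is a back-of-the-envelope sketch. It uses theoretical peaks only (real throughput will be lower), and it assumes the disk controller and the NIC share one 32-bit/33 MHz PCI bus, so served data crosses that bus twice. That topology and all the names below are assumptions for illustration, not a description of the PetaBox.

```python
PCI_CLOCK_MHZ = 33.33  # classic PCI clock
PCI_WIDTH_BYTES = 4    # 32-bit bus moves 4 bytes per clock


def link_mb_per_s(megabits_per_s):
    """Theoretical Ethernet payload ceiling: 8 bits per byte, overhead ignored."""
    return megabits_per_s / 8


def pci_peak_mb_per_s():
    """Theoretical peak of a 32-bit/33 MHz PCI bus, shared by every card on it."""
    return PCI_CLOCK_MHZ * PCI_WIDTH_BYTES


def served_ceiling_mb_per_s(link_megabits, bus_crossings=2):
    """Upper bound on file-serving rate: the slower of the link and the shared bus.

    bus_crossings=2 assumes data goes disk controller -> RAM -> NIC over one bus.
    """
    return min(link_mb_per_s(link_megabits), pci_peak_mb_per_s() / bus_crossings)


for label, mbit in (("100 Mbit Ethernet", 100), ("Gigabit Ethernet", 1000)):
    print(label, "link:", link_mb_per_s(mbit), "MB/s,",
          "serving ceiling:", round(served_ceiling_mb_per_s(mbit), 1), "MB/s")
```

Under these assumptions the 100 Mbit link is the limit by a wide margin, while gigabit moves the limit to the shared PCI bus, which is why more RAM for filesystem cache helps in that case.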
If you really, really want to keep the gigabit pipe full while pounding on your disks, then a newer bus like PCI-Express is necessary. Otherwise, I'd be tempted to go with an older, cheaper (and imo, more reliable) Pentium-II or -III based PC. You can get solid, reliable, well-cooled and well-dustfiltered early model VA Linux servers with 500MHz Pentium-III's for $200 or less. I must stress the importance of buying a really solid, rigid case. Over time, normal computer cases get all bendy-wendy, turning every part into a moving part, including parts you don't want to have moving at all. Fans will start sticking, motherboard traces will start breaking, etc. Most of the rack-mountable cases are made of good thick solid steel panels, which makes them heavy as f**kall, but IMO that's a small price to pay for a system that will run forever.
For operating system, the most important thing is to get something you know how to run and maintain, or can get help running and maintaining. If you have geek friends who are willing to provide technical assistance, find out what they know best and use that. A well-known operating system will probably be of more use to you than a technically better, but less well understood, operating system.
Having said that, my personal preference is Slackware Linux, because I appreciate its philosophy of keeping things simple, and its preference for packages which are the most stable, as opposed to newest versions or lots of features. My second choice would be FreeBSD. Third would be the OS we decided to use at The Archive for the PetaBox nodes, Debian Linux. But if all you know is Windows, then go ahead and use Windows.
Regarding RAID, it's been my experience working at The Archive that RAID is often more trouble than it's worth, especially when it comes to data recovery. In theory, recovery is easy: you just replace a bad disk and it will rebuild the missing data, and you're good to go. In practice, though, you will often not notice that one of your disks is borked until two disks are borked (or however many it takes for your RAID system to stop working), and then you have a major pain in the ass on your hands. At least with one filesystem per disk, you can attempt to save the filesystem by dd'ing the entire raw partition contents onto a different physical drive of same make + model, skipping bad sectors, and then running fsck on the good drive. But if you have