Ideas for a Home Grown Network Attached Storage?
Ken asks: "It seems that consumer level
1TB+ NAS boxes
are all the rage
right now. Being a digital packrat, with several computers/entertainment devices on my home network, I am becoming more interested in getting one of these for my home. Unwilling to dish out 1K or more up front, and possessing a little of the DIY spirit, I would like to build my own NAS and am interested in hardware/software ideas. While the small form factor PC cases are attractive, my NAS will dwell in the basement so I am thinking of a cheap/roomy ATX case with lots of power. I think that integrated gigabit Ethernet capabilities and PCI-Express on the motherboard are a must, as well as Serial ATA HDDs, but what processor/RAM? How strong does a computer really need to be to
serve files? What about the OS? Win2K3 server edition? WinXP Pro? Linux?"
"I have been using Red Hat and then Fedora Core since it came out but only in a workstation role, and I have little experience with other flavors. What file system should I use for maximum compatibility? I will need it to work with Windows, Linux and several UPnP devices. I am planning on starting out with two or three HDDs in a RAID 5 config. and I would like to be able to add more HDDs as space is needed without any major changes. Thanks for any ideas."
Common linux file systems (ext, reiser, etc) contains critical data-losing type bugs on file systems bigger than 2TB, except XFS. This was found to be the case in even the most recent 2.6 kernels.
Tony Battersby posted a patch to the LBD mailing list recently to address the ones he could find, but lacking a full audit, you probably shouldn't use any filesystem other than XFS.
Considering the gravity of these bugs, you might consider using XFS for everything, if the developers left these critical bugs in for so long, it makes you wonder about the general quality of the filesystems.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
I was thinking of using a mini and a single firewire disk for a somewhat similar project.
But, OS X has RAID capability, so you could use something like this:
Some of us have the DIY spirit...
Seriously, why buy something when you could 1) build it (probably cheaper) yourself and 2) learn more from building it? Most DIY projects have a habit of benefiting you at some point in the future in ways that you can't predict when you start them.
Either you - not you personally, the rhetorical you - 1) don't have the time, which is acceptable, or 2) you don't have the knowledge, which you should be trying to gain, or 3) you are lazy, which is really quite sad.
There's more to life than just spending money on a problem. There's actually figuring out the solution to the problem.
$.02
You don't. Take a bunch of disks, turn them into RAID5 array. Make a logical volume (LVM on Linux) and add the RAID-array to it. Create a growable device on the LVM and format with a standard gowable FS.
When you get new disks simply create a new RAID5 array and add that to the logical volume and add to your current and grow the FS on it.
You don't want everything on one big RAID0, I lost 200G of data that way. I can say I'll never do that mistake again.
A few comments about this
-Get the best-value processor that you can find. You won't need the fastest thing out there, but it's better to have a little more "oomph" than you need. If you end up using an encrypted filesystem at some point, you'll want enough power to decript and keep the network "fed"
-Have a plan for adding a second network interface. Maybe you don't need it now, but once the DIY bug bites, you may find yourself wanting to use the machine as your NAT box or as a wireless access point or something like that.
-Think about noise and power use. Yeah, those WD Raptors are fast, but they're really loud, too, particularly if you buy a pile of them. You might want to think about acoustic material for the inside of the case -- your local car customizing shop can hook you up. You'll also want an "overkill" power supply for the case so that you don't have problems when you add more drives later.
-Think about heat and airflow. At this time of the year, it's easy to ignore (Dear Australia: yes, I know it's summer there now), but during the summer, stuffing the fileserver into the closet might not be such a good idea.
-Consider underclocking. If you do buy a better processor than you need, bump the speed down for now. Less power, less heat, less noise.
-Get a BIOS or hardware-level RAID mirror for your "root" disk. You can use software RAID for the data disks, but you want to be absolutely certain that you can recover the disk with information about the software RAID. The RAID does no good if you don't know how to access it.
-If you use Linux, LVM will become your new best friend.
-Consider buying hard drives that are carried by your nearest Best Buy/CompUSA/other computer store. You don't actually have to buy the initial batch from there, but if a drive in the RAID set goes bad, you'll want to replace it ASAP. It's nice if you can do that tonight rather than "in a few days".
--
When we developed the PetaBox at The Archive, the idea was to use off-the-shelf PC hardware and maximize GB/buck, while keeping cooling and power costs low. It's worked out pretty well. See also my unofficial PetaBox web page.
It turns out that you really don't need much of a PC to serve files. We underclocked the cheap little Via C3 processors to 800MHz to reduce power and heat, and they still troop along nicely. SATA is not necessary, since you're going to be bottlenecked on the network connection anyway. We used 512MB of RAM per node, but only because our system runs a gaggle of perl scripts to provide a variety of services (file searches, XML-based metadata updates, etc). If you're just going to be running NFS or Samba, 256MB is probably plenty (unless you choose to run Gigabit over a mere 32-bit PCI bus, in which case 512MB or 1GB would be better, so that you're reading more from filesystem cache and pounding the hard drives over your overloaded bus less). Gigabit ethernet is a must (we used 100bT for the PetaBox, which is annoying at times, but the cheaper 100bT 48-port switches were instrumental in keeping the overall price of the system low). We stuck four hard drives in each case, mostly from previous bad experiences trying to work with eight-disk machines. I can't say too much about the disk failure rate statistics which incited us to switch to Hitachi Deskstars, but I will say that I'm glad our PetaBox is using Deskstars and I will only use Deskstars in my workstation at home.
If you really, really want to keep the gigabit pipe full while pounding on your disks, then a newer bus like PCI-Express is necessary. Otherwise, I'd be tempted to go with an older, cheaper (and imo, more reliable) Pentium-II or -III based PC. You can get solid, reliable, well-cooled and well-dustfiltered early model VA Linux servers with 500MHz Pentium-III's for $200 or less. I must stress the importance of buying a really solid, rigid case. Over time, normal computer cases get all bendy-wendy, turning every part into a moving part, including parts you don't want to have moving at all. Fans will start sticking, motherboard traces will start breaking, etc. Most of the rack-mountable cases are made of good thick solid steel panels, which makes them heavy as f**kall, but IMO that's a small price to pay for a system that will run forever.
For operating system, the most important thing is to get something you know how to run and maintain, or can get help running and maintaining. If you have geek friends who are willing to provide technical assistance, find out what they know best and use that. A well-known operating system will probably be of more use to you than a technically better, but less well understood, operating system.
Having said that, my personal preference is Slackware Linux, because I appreciate its philosophy of keeping things simple, and preferences for packages which are the most stable, as opposed to newest versions or lots of features. My second choice would be FreeBSD. Third would be the OS we decided to use at The Archive for the PetaBox nodes, Debian Linux. But if all you know is Windows, then go ahead and use Windows.
Regarding RAID, it's been my experience working at The Archive that RAID is often more trouble than it's worth, especially when it comes to data recovery. In theory, recovery is easy, you just replace a bad disk and it will rebuild the missing data, and you're good to go. In practice, though, you will often not notice that one of your disks are borked until two disks or borked (or however many it takes for your RAID system to stop working), and then you have a major pain in the ass on your hands. At least with one filesystem per disk, you can attempt to save the filesystem by dd'ing the entire raw partition contents onto a different physical drive of same make + model, skipping bad sectors, and then running fsck on the good drive. But if you have