Slashdot Mirror


SoHo NAS With Good Network Throughput?

An anonymous reader writes "I work at a small business where we need to move around large datasets regularly (move onto test machine, test, move onto NAS for storage, move back to test machine, lather-rinse-repeat). The network is mostly OS X and Linux with one Windows machine (for compatibility testing). Our datasets are typically multiple GB in size, so network speed is as important as storage size. I'm looking for a preferably off-the-shelf solution that can handle a significant portion of a GigE link; maxing out at 6MB/s is useless. I've been looking at SoHo NASes that support RAID such as Drobo, NetGear (formerly Infrant), and BuffaloTech (who unfortunately doesn't even list whether they support OS X). They all claim they come with a GigE interface, but what sort of network throughput can they really sustain? Most of the numbers I can find on the websites only talk about drive throughput, not network, so I'm hoping some of you with real-world experience can shed some light here."

6 of 517 comments

  1. Cmon people... by Creepy+Crawler · · Score: 5, Informative

    You might as well build it yourself.

    Go get a lowbie Core2, mobo, good amount of ram, and 4 1TB disks. Install Ubuntu on them with LVM and encryption. Run the hardening packages, install Samba, install NFS, and install Webmin.

    You now have a 100% controlled NAS that you built. You can also duplicate it and use DRBD, which I can guarantee NO SOHO hardware comes near. You can also put WINE on there and Xming on your Windows machines for remote Windows programs... The ideas are endless.

    --
    1. Re:Cmon people... by swillden · · Score: 5, Informative

      Your network connection is the limiting factor here. On large sequential reads, modern SATA drives with a mobo's onboard controller can easily maintain the 100MB/s or so it takes to max out your gigE connection.

      I second this.

      A good way to test your network connection is with netcat and pv. Both are packaged by all major Linux distros.

      On one machine run "nc -ulp 5000 > /dev/null". This sets up a UDP listener on port 5000 and directs anything that is sent to it to /dev/null. Use UDP for this to avoid the overhead of TCP.

      On the other machine, run "pv < /dev/zero | nc -u listenerhost 5000", where "listenerhost" is the hostname or IP address of the listening machine. That will fire an unending stream of zero-filled packets across the network to the listener, and pv will print out an ongoing report on the speed at which the zeros are flowing.

      Let it run for a while and watch the performance. If the numbers you're getting aren't over 100 MB/s -- and they often won't be, on a typical Gig-E network -- then don't worry about disk performance until you get that issue fixed. The theoretical limit on a Gig-E network is around 119 MBps.

      Do the same thing without the "-u" options to test TCP performance. It'll be lower, but should still be knocking on 100 MBps. To get it closer to the UDP performance, you may want to look into turning on jumbo frames.
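
      For a rough idea of where that ~119 figure comes from and what jumbo frames buy you, here is a back-of-the-envelope sketch in Python. It assumes standard Ethernet framing and IPv4/TCP/UDP headers with no options, and it ignores retransmits and idle gaps, so treat the results as ceilings rather than predictions. The function and constant names are made up for illustration.

```python
# Best-case payload rate on gigabit Ethernet.
# Assumptions (not from the thread): standard Ethernet framing, IPv4 and
# TCP/UDP headers without options, no retransmits, no idle time on the wire.

LINK_BYTES_PER_SEC = 1_000_000_000 / 8  # 1 Gbit/s = 125,000,000 bytes/s

# Bytes sent per frame that carry no payload:
# preamble + start delimiter (8), Ethernet header (14), FCS (4),
# inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12
IPV4_HEADER = 20
TCP_HEADER = 20
UDP_HEADER = 8


def max_payload_rate(mtu, l4_header):
    """Return the best-case payload rate in bytes/s for a given MTU."""
    payload = mtu - IPV4_HEADER - l4_header
    wire_bytes = mtu + ETHERNET_OVERHEAD
    return LINK_BYTES_PER_SEC * payload / wire_bytes


if __name__ == "__main__":
    cases = [
        ("UDP, MTU 1500", 1500, UDP_HEADER),
        ("TCP, MTU 1500", 1500, TCP_HEADER),
        ("TCP, MTU 9000 (jumbo)", 9000, TCP_HEADER),
    ]
    for label, mtu, header in cases:
        rate = max_payload_rate(mtu, header)
        print(f"{label}: {rate / 1e6:.1f} MB/s")
```

      Under those assumptions the standard-MTU numbers land just under 120 MB/s, and jumbo frames recover most of the remaining header overhead.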

      pv is also highly useful for testing disk performance, if you're building your own NAS (highly recommended -- a Linux box with 3-4 10K RPM SATA drives configured as a software RAID0 array will generally kick the ass of anything other than very high end stuff. It's nearly always better than hardware RAID0, too).
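
      If pv isn't handy, the same kind of sequential read test can be approximated in a few lines of Python. This is only a sketch: the function name is hypothetical, and the OS page cache will inflate the result if the file was read recently, so use a file much larger than RAM or a freshly mounted filesystem.

```python
import sys
import time


def sequential_read_rate(path, block_size=1024 * 1024):
    """Read a file front to back; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


if __name__ == "__main__" and len(sys.argv) > 1:
    nbytes, seconds = sequential_read_rate(sys.argv[1])
    if seconds > 0:
        print(f"{nbytes / seconds / 1e6:.1f} MB/s over {nbytes} bytes")
```

      Run it with the path of a large file on the array as the only argument.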

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  2. Re:You could roll your own. by nhtshot · · Score: 5, Informative

    My situation is similar to yours. I bought and tested several off the shelf solutions and was continuously disappointed.

    My solution was an off-the-shelf AMD PC filled with HDDs and Linux software RAID.

    It's MUCH faster (90MB/sec) than any of the NAS solutions I tested.

    With Christmas specials abounding right now, HDDs are cheap. Use independent controllers for each port and a reasonable CPU. Also make sure that the GigE interface is PCI-E.

  3. NAS disk architecture by anegg · · Score: 5, Interesting

    If you use a single disk NAS solution and you are doing sequential reads through your files and file system, your throughput can't be greater than the read/write speed of a single disk, which is nowhere near GigE (1000 Mbps is about 125 MB/second ignoring network protocol overhead). So you will need RAID (multiple disks) in your NAS, and you will want to use striped RAID (RAID 0) for performance. This means that you will not have any redundancy, unless you go with the very expensive striped mirror or mirrored stripes (1+0/0+1). RAID 5 gives you redundancy, and isn't bad for reads, but will not be that great for writes.
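
    To put rough numbers on that tradeoff, here is a small Python sketch. The per-disk rate in the example is a made-up placeholder (measure your own drives), the function names are hypothetical, and it optimistically assumes striped throughput scales linearly with the number of disks.

```python
import math

GIGE_BYTES_PER_SEC = 125_000_000  # 1000 Mbps, ignoring protocol overhead


def disks_to_saturate(per_disk_bytes_per_sec, target=GIGE_BYTES_PER_SEC):
    """Striped disks needed to reach the target sequential rate,
    assuming throughput scales linearly with stripe width (optimistic)."""
    return math.ceil(target / per_disk_bytes_per_sec)


def usable_disks(level, n):
    """Disks' worth of usable capacity for a few common RAID levels."""
    if level == "0":
        return n  # pure striping, no redundancy
    if level == "10":
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return n // 2  # every block is stored twice
    if level == "5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return n - 1  # one disk's worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    per_disk = 70_000_000  # hypothetical 70 MB/s sequential rate per disk
    print("disks to fill GigE:", disks_to_saturate(per_disk))
    for level in ("0", "10", "5"):
        print(f"RAID {level} with 4 disks: {usable_disks(level, 4)} disks usable")
```

    With four disks, the capacity cost of redundancy is two disks for RAID 10 and one for RAID 5, which is where the "very expensive" comes from.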

    As you compare/contrast NAS device performance, be sure that you understand the disk architecture in each case and seek oranges-to-oranges comparisons (i.e., how does each one compare with the RAID architecture that you are interested in using - NAS devices that support RAID typically offer several RAID architectures). Also be sure that the numbers that you see are based on the kind of disk activity you will be using. It doesn't do much good to get a solution that is great at random small file reads (due to heavy use of cache and read-ahead) but ends up running out of steam when faced with steady sequential reads through the entire file system where cache is drained and read-ahead can't stay ahead.

    Once you get past the NAS device's disk architecture, you should consider the file sharing protocol. Supposedly (I have no authoritative testing results) CIFS/SMB (Windows file sharing) has a 10% to 15% performance penalty compared to NFS (Unix file sharing). I have no idea how Apple's native file sharing protocol (AFP) compares, but (I think) OS X can do all three, so you have some freedom to select the best one for the devices that you are using. Of course, since there are multiple implementations of each file sharing protocol and the underlying TCP stacks, there are no hard and fast conclusions that you can draw about which specific implementation is better without testing. One vendor's NFS may suck, and hence another vendor's good CIFS/SMB may beat its pants off, even if the NFS protocol is theoretically faster than the CIFS/SMB protocol.

    Whichever file sharing protocol you choose, it's very possible it will default to operation over TCP rather than UDP. If so, you should pay attention to how you tune your file sharing protocol READ/WRITE transaction sizes (if you can), and how you tune your TCP stack (window sizes) to get the best performance possible. If you use an implementation over UDP, you still have to pay attention to how you set your READ/WRITE buffer sizes and how your system deals with IP fragmentation if the UDP PDU size exceeds what fits in a single IP packet due to the READ/WRITE sizes you set.
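
    Two quick calculations help when picking those sizes; a Python sketch follows. The round-trip time and READ sizes in the example are assumptions for illustration, the function names are hypothetical, and the fragment count ignores RPC/NFS header bytes, so real datagrams are slightly larger.

```python
import math


def bandwidth_delay_product(bits_per_sec, rtt_seconds):
    """Bytes that must be in flight to keep the link full; a TCP window
    smaller than this caps throughput below line rate."""
    return bits_per_sec / 8 * rtt_seconds


def ipv4_fragments(udp_payload_bytes, mtu=1500):
    """IP fragments needed for one UDP datagram (IPv4, no IP options)."""
    datagram = udp_payload_bytes + 8  # add the UDP header
    # Every fragment carries a 20-byte IP header, and the data in all but
    # the last fragment must be a multiple of 8 bytes.
    per_fragment = (mtu - 20) // 8 * 8
    return math.ceil(datagram / per_fragment)


if __name__ == "__main__":
    rtt = 0.0005  # assumed 0.5 ms LAN round trip
    bdp = bandwidth_delay_product(1_000_000_000, rtt)
    print(f"window needed at GigE with {rtt * 1000} ms RTT: {bdp:.0f} bytes")
    for size in (8192, 32768):  # example READ/WRITE sizes
        print(f"{size}-byte READ over UDP: "
              f"{ipv4_fragments(size)} fragments at MTU 1500, "
              f"{ipv4_fragments(size, mtu=9000)} at MTU 9000")
```

    Losing any one fragment means the whole datagram is resent, which is why large READ/WRITE sizes over UDP can hurt on a lossy network.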

    Finally, make sure that your network infrastructure is capable of supporting the data transfer rates you envision. Not all gigabit switches have full wire-speed non-blocking performance on all ports simultaneously, and the ones that do are very expensive. You don't necessarily need full non-blocking backplanes based on your scenario, but make sure that whatever switch you do use has enough backplane capacity to handle your file transfers and any other simultaneous activity you will have going through the same switch.
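
    As a rough sanity check on a switch's spec sheet, the capacity needed for every port to run at line rate in both directions at once is simply ports x speed x 2. A minimal Python sketch, with hypothetical function names and an invented 10 Gbps spec figure as the example:

```python
def nonblocking_capacity_gbps(ports, port_speed_gbps=1.0):
    """Switching capacity needed for all ports to run at line rate,
    full duplex, at the same time."""
    return ports * port_speed_gbps * 2


def oversubscription_ratio(ports, backplane_gbps, port_speed_gbps=1.0):
    """Worst-case demand divided by backplane capacity; 1.0 or less
    means the switch is non-blocking."""
    return nonblocking_capacity_gbps(ports, port_speed_gbps) / backplane_gbps


if __name__ == "__main__":
    ports = 8
    print("non-blocking capacity:", nonblocking_capacity_gbps(ports), "Gbps")
    # Hypothetical spec-sheet value, for illustration only.
    print("oversubscription at 10 Gbps:", oversubscription_ratio(ports, 10.0))
```

    A ratio above 1.0 only matters if enough ports are actually busy at once, which is the point about not necessarily needing a full non-blocking backplane.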

  4. Network won't be your bottleneck. by m0e · · Score: 5, Informative

    Disk will always be. Since disk is your slowest spot you will always be disk I/O bound. So in effect there's no real reason to worry about network throughput from the NIC. NICs are efficient enough these days to just about never get bogged down. What you would want to look at for the network side would be your physical topology -- make sure you have a nice switch with nice backplane throughput.

    About disks:

    Your average fibre channel drive will top out at 300 IO/s because few people sell drives that can write any faster to the spindle (cost prohibitive for several reasons). Cache helps this out greatly. SATA is slightly slower at between 240-270 IO/s depending on manufacturer and type.

    Your throughput will depend totally upon what type of IO is hitting your NAS and how you have it all configured (RAID type, cache size, etc). If you have a lot of random IO, your total throughput will be low once you've saturated your cache. Reads will always be worse than writes even though prefetching helps.

    If you're working with multi-gigabyte datasets, you'll want to increase the number of spindles (i.e., number of disks) to as high as you can go within your budget and make sure you have gobs of cache. If you decide to RAID it, which type you use will depend on how much integrity you need (we use a lot of RAID 10 with lots of spindles for many of our databases). That will speed you up significantly more than worrying about the NIC's throughput. Don't worry about that until you start topping a significant portion of your bandwidth -- for example, say 60MB/sec sustained over the wire.
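
    To see why spindle count matters, multiply IO/s by the IO size. The Python sketch below uses the 270 IO/s SATA figure and the 60MB/sec target from above; the IO sizes are assumptions, the function names are hypothetical, and the model ignores cache entirely, so it describes cache-saturated random IO, not large sequential transfers.

```python
import math


def throughput_bytes_per_sec(iops, io_size_bytes):
    """Throughput of one disk when every IO costs one operation."""
    return iops * io_size_bytes


def spindles_needed(target_bytes_per_sec, iops_per_disk, io_size_bytes):
    """Disks needed to reach the target, assuming IO spreads evenly."""
    per_disk = throughput_bytes_per_sec(iops_per_disk, io_size_bytes)
    return math.ceil(target_bytes_per_sec / per_disk)


if __name__ == "__main__":
    target = 60_000_000  # 60MB/sec sustained
    for io_size in (4096, 65536):  # assumed 4 KiB and 64 KiB IOs
        per_disk = throughput_bytes_per_sec(270, io_size)
        disks = spindles_needed(target, 270, io_size)
        print(f"{io_size}-byte IO: {per_disk / 1e6:.1f} MB/s per disk, "
              f"{disks} spindles for 60MB/sec")
```

    Small random IO needs dozens of spindles to reach that target, while larger IOs need only a handful.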

    This doesn't get fun until you start having to architect petabytes worth of disk. ;)

  5. Already been extensively discussed... by kwabbles · · Score: 5, Informative

    For example:

    Best home network NAS?
    http://ask.slashdot.org/article.pl?sid=07/11/21/141244&from=rss

    What NAS to buy?
    http://ask.slashdot.org/article.pl?sid=08/06/30/1411229

    Building a Fully Encrypted NAS On OpenBSD
    http://hardware.slashdot.org/article.pl?sid=07/07/16/002203

    Does ZFS Obsolete Expensive NAS/SANs?
    http://ask.slashdot.org/article.pl?sid=07/05/30/0135218

    What the hell? Is this the new quarterly NAS discussion?

    --
    Just disrupt the deflector shield with a tachyon burst.