Finding the Bottleneck in a Gigabit Ethernet LAN?
guroove asks: "I have a small gigabit ethernet network at home, and I spent a lot of money getting gigabit NICs for all my computers and even bought cat 6 cabling. I only have 3 computers on the gigabit network (a Mac, a Windoze machine, and a Linux box) so instead of getting a switch, I triple NIC'd the Linux box, which I use as a gateway and a file server. After the network was complete, I wasn't satisfied with transfer rates, so I started a transfer of a very large file and found that the transfer rate was topping off at just over 145 Mbps (which is a far cry from 1000 Mbps). I'm wondering now where my bottleneck is. Is it the NICs? Are all gigabit NICs really giving us 1000 megabits per second? Is it the driver? Is it Samba? Could it be that the hard drives aren't fast enough? Does anyone have experience with gigabit home networking enough to know where the bottlenecks are? Does the current PCI technology even allow for bandwidth that high?"
I found that unless both machines were of a recent vintage, Samba seems to hit a limit. For example, my current computer (an AMD 2400XP running Linux 2.4.24) to an AMD K6-2 500 running Linux 2.4.20 tops off at about 1 MB/sec on a 100 Mb/sec network. Contrast that with my current computer (the 2400 one) to a friend's 2600XP running Win2K, where I was seeing about 6-7 MB/sec (and 25% CPU usage...).
I have found that FTP seems to use the bandwidth up better if you want to test it. Computer to Xbox, I can get 7-9 MB/sec on a 100 Mb/sec connection.
You might also look into some network bandwidth tools that just go to and from memory and are designed for testing network speeds.
It might be good to start by measuring your network's performance, without hard drives or application software in the loop. I'd suggest using IPerf to accomplish this. If you measure less than expected performance with IPerf, your problem is with your NICs, switch, or drivers. If IPerf reports OK numbers, start looking at Samba and your hard drives. The bus shouldn't be a problem, because even a lowly 32 bit 33 MHz PCI bus has a theoretical 1.056 Gb/s data rate.
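For anyone who wants to see what a memory-to-memory test looks like before installing IPerf, here is a minimal sketch in Python. It is not IPerf and not a substitute for it; the function name `measure_tcp_throughput` and its parameters are made up for illustration. It pushes a buffer of zeros over a TCP socket and times it, so no disk or file-sharing protocol is involved. As written it runs over loopback, which only exercises the local network stack.

```python
import socket
import threading
import time


def measure_tcp_throughput(total_bytes=64 * 1024 * 1024, chunk_size=64 * 1024,
                           host="127.0.0.1"):
    """Send total_bytes from memory over TCP; return (bytes_received, Mbit/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = [0]                  # list so the receiver thread can update it

    def receiver():
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(chunk_size)
                if not data:        # sender closed the connection
                    break
                received[0] += len(data)

    thread = threading.Thread(target=receiver, daemon=True)
    thread.start()

    payload = b"\0" * chunk_size    # data comes from memory, never from disk
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        sent = 0
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    thread.join()                   # wait until the receiver has drained everything
    elapsed = time.perf_counter() - start
    server.close()

    return received[0], received[0] * 8 / elapsed / 1e6
```

Calling `measure_tcp_throughput()` returns the byte count and the rate in Mbit/s. To test a real link you would split the sender and receiver halves across two machines, which is exactly what IPerf already does for you.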
I agree. "copying a large file only moves at ~18MB/s.. why aren't I getting 80MB/s?!!!1" is kind of a stupid question. If you want to run a speed test, repeatedly copy a smaller file that your linux box can cache in memory, over and over again so you _KNOW_ it's cached. Make sure the file can fit inside whatever linux says your average 'cached' memory load is. Then get and re-get it and see how fast it gets. I'll bet that done right, you can get probably 45MB/s sustained.
...don't confuse "theoretical maximum" with "real-world use"
The other thing is that you have to look at your PCI bus. If you're using 32-bit PCI66, I think 200MB/s (or thereabout) is your max speed. And that's sustained write as bus master. Now with your IDE controller and 3(!) Gigabit NICs, that's going to be cut by, say, 1/2 to (say) 1/4 depending.
You're kind of complaining "My speedometer goes to 140, but my car only gets up to 98 before the tires fly off!"
Looks like you hit it on the head. I am using an ATA/133 hard drive. I'm actually in the process of setting up a hardware RAID for the hard drives to see if that speeds things up at all.
Someone stole my old sig.
A couple likely bottlenecks:
1. samba. The microsoft SMB or CIFS protocol is a big inefficient hog. Try transferring with FTP. The data is piped down a TCP stream, end of story.
2. hard drives. most hard drives can't push a gigabit/second from the platters (let alone write). Check out their sustained transfer speed (not burst cache). Also check out your bus medium. ATA-66 won't push a gigabit.
3. pci bus. Transferring data down the PCI bus from the disk controller and then back out the PCI bus to the network card means you need a 2x effective bandwidth. PCI can't hit 2 gigabit here. You might get better results with PCI Express.
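The reasoning in the list above boils down to "the transfer runs no faster than its slowest stage". A rough sketch of that arithmetic follows. The stage figures are assumptions picked for illustration (a drive sustaining about 40 MB/s, a 32-bit/33 MHz PCI bus crossed twice), not measurements from the poster's machines, and `slowest_stage` is a hypothetical helper name.

```python
def slowest_stage(stages):
    """Return (name, rate) of the slowest stage; the pipeline runs no faster than it."""
    return min(stages.items(), key=lambda item: item[1])


# Illustrative figures only (assumed, not measured), all in Mbit/s.
stages = {
    "gigabit link": 1000,
    "disk sustained read": 40 * 8,                     # assumes ~40 MB/s from the platters
    "32-bit/33 MHz PCI, crossed twice": 32 * 33 / 2,   # 1056 Mbit/s split over two crossings
}

name, rate = slowest_stage(stages)
print(f"bottleneck: {name} at {rate:.0f} Mbit/s")
```

Plug in your own measured numbers; protocol overhead such as SMB chattiness is not modelled here and would lower the result further.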
Good luck.
-molo
Using your sig line to advertise for friends is lame.
Why not use iperf, which is meant for this usage?
Agreed.
While there are a number of Linux based routers out there, none that I know of are used in the Gigabit realm. Even if they are, they at the -very- least have recompiled the kernel to switch on a number of router/gateway optimizations, and quite possibly contain proprietary network / NIC kernel modules to gain further improvements.
Unless you have a VERY modern bus architecture (a lot of people using Linux routers do so on old gear), preferably an AMD with hyperthreading (since I doubt you have a non-x86 system or you'd have mentioned it), you will never get close to maximizing not one but -3- Gb NICs.
Take a look at some of the servers that are out there in the x86 realm. They usually require you to use a 100MHz or 133MHz PCI card to get best results from a Gb ethernet NIC. And if you look at the first generation of x86 servers (say, from 2 years ago) that came with Gb ports by default, looking deep into the benchmarks you often find that they never reached their Gb potential with the built-in ports either. The advantage was that it was still better than 100 Megabit.
With a hyperthreaded high-speed bus and some kernel tweaks, I would be quite happy if I could get all 3 NICs to stress-test simultaneously at 300-500Mb/each. Heck, I'd probably be happy around the 250Mb range.
BTW, even a Gb switch, on the home CPE level, is probably never going to send multi-Gb of data (ie, by trying to switch data amongst multiple Gb ports). Often times you are limited to a max of 1Gb total throughput because of the switched backplane. Heck, even then you may max around 900Mb due to network overhead.
Moral is simply to realize that with all networking products, the real speed is usually significantly less than the rated speed.
It is more productive to voice thoughtful opinions (reply) than to judge (moderate) others.
Intel do Hyperthreading, not AMD. The buzzword there is Hypertransport, which significantly ups the speed of memory and device access; a lot of motherboards with Gigabit onboard now attach them directly to the 800/1000MHz Hypertransport bus, which can easily keep up.
With the increased bandwidth of Gigabit Ethernet, software routing on generic hardware is severely non-optimal these days.
- 32-bit/33MHz PCI (which is the nice short PCI slots in all NON-server motherboards) is limited to 132MBytes/s transfer. Since a full 1Gbit/s is about 125MBytes/s, you can only get a theoretical 66MBytes/s per GBit port, since ALL traffic has to go back and forth from main memory, and thus has to cross the PCI bus twice.
- For the high-end 64-bit/66Mhz PCI slots available on server motherboards, you get a theoretical 528MBytes/s performance, which should be enough to run 2 simultaneous connections (even with some of the PCI bus collisions).
- The above holds even for dual-port NICs, since the traffic has to go back and forth to RAM, and can't just stay on the NIC.
- The size of the NIC's on-board buffer has a serious impact on performance, as this acts as temporary storage while the CPU deals with the network packet interrupt. If you have a small buffer, then you're going to force a lot of retransmits, as stuff comes in and overwrites the existing data while waiting for the CPU.
- Remember that for every packet incoming, there is an interrupt request sent to the CPU to deal with the incoming data. A rule of thumb from the Sun Solaris side of the house: You dedicate 1 full 400MHz UltraSPARC II CPU to just servicing the interrupt requests from a single GBit ethernet card. Translated to the x86 world, that generally means that you'll run at least 25% CPU load on a 2GHz CPU while trying to service 1 GBit ethernet's full of network interrupts.
- If you have NICs which can use Jumbo Frames, these improve performance considerably, as they reduce the total number of packets (and thus, overhead) by roughly a factor of 6 (9000-byte frames instead of the standard 1500).
- Linux's network stack is not fully optimized for GBit performance. The BSDs are better, but neither has had the obscene tuning that dedicated router/switch stacks have (such as Cisco's IOS).
- As mentioned above, the non-Server versions of Windows have similar limitations in their network stacks, which seem to limit network throughput to about 200Mbits/s, regardless of hardware. The various Server versions don't have this problem.
- Remember that you are doing ROUTING, when all you really want to do is SWITCHING. Routing is significantly more work for the CPU, since it involves packet inspection, and not just a MAC address table lookup and reforward.
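The PCI figures in the list above come from simple arithmetic: bus width in bytes times clock rate, then divided by the number of times each byte has to cross the bus. A small sketch, with hypothetical function names, that reproduces those theoretical peaks (real buses deliver less because of arbitration and protocol overhead):

```python
def pci_bandwidth_mbytes(width_bits, clock_mhz):
    """Theoretical peak PCI bandwidth in MB/s: bus width in bytes times clock in MHz."""
    return width_bits / 8 * clock_mhz


def per_stream_budget_mbytes(width_bits, clock_mhz, crossings=2):
    """Bandwidth left per forwarded stream when every byte crosses the bus `crossings` times."""
    return pci_bandwidth_mbytes(width_bits, clock_mhz) / crossings


GIGABIT_MBYTES = 1000 / 8   # 125 MB/s of raw gigabit line rate

for width, clock in [(32, 33), (64, 66)]:
    peak = pci_bandwidth_mbytes(width, clock)
    budget = per_stream_budget_mbytes(width, clock)
    verdict = "enough" if budget >= GIGABIT_MBYTES else "not enough"
    print(f"{width}-bit/{clock}MHz: peak {peak:.0f} MB/s, "
          f"{budget:.0f} MB/s per stream, {verdict} for one gigabit port")
```

The default of two crossings models the NIC-to-RAM-to-NIC path of a software router; a file server reading from a disk controller on the same bus is in the same position.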
You really need to use dedicated switches (as there are hardware ASICs that do this all at near-wire speeds, and eliminate all the potential problems above).
Most PCs nowadays have the standard 32-bit 66MHz PCI slots and 64-bit PCI. Of course, the 66MHz PCI slot will top out at transfer speeds of ~266MBytes/s. 64-bit slots will of course be a bit faster.
Worse yet, if you are running these gigabit nics in a 33Mhz PCI slot, you will get less than 133Mbyte/s transfer rates across the bus.
So my advice to you is that you investigate what kind of speeds and slots your cards use. Are they on their optimal slot type? Are they actually using bus mastering?
You should easily be able to get near top speed if you place the cards in their optimal slots and actually configure your drivers right. Also, increasing the MTU to a higher value will probably significantly improve your large data transfer speeds.
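To put the MTU advice in numbers, here is a rough sketch of how many frames per second a saturated gigabit link carries at a given MTU. It assumes 38 bytes of per-frame Ethernet overhead on the wire and a 9000-byte jumbo MTU, which is the common convention but depends on what your NICs and drivers actually support. Fewer frames means fewer interrupts for the CPU to service.

```python
ETHERNET_OVERHEAD_BYTES = 38  # preamble 8 + header 14 + FCS 4 + inter-frame gap 12


def frames_per_second(link_mbps, mtu_bytes):
    """Frames per second needed to fill the link with full-size frames."""
    bits_per_frame = (mtu_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_mbps * 1e6 / bits_per_frame


standard = frames_per_second(1000, 1500)   # roughly 81,000 frames/s
jumbo = frames_per_second(1000, 9000)      # roughly 14,000 frames/s
print(f"standard: {standard:.0f}/s, jumbo: {jumbo:.0f}/s, ratio: {standard / jumbo:.1f}")
```

Every device on the path has to agree on the larger MTU, otherwise you trade interrupt load for fragmentation or dropped frames.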
Yep. Never underestimate the ability of limited harddrive speeds to throw a wrench in file transfer speeds. I first ran across this while developing the Linux network driver for LSI's 1Gb & 2Gb Fibre Channel adapters. Spent a little while pulling hair before the whole "IDE drives on either end of a 2 Gigabit link might be an issue" point hit me. I found this to be an issue even while reading from and writing to 10K RPM FC drives. Had to use an FC RAID on both ends before I could saturate the network capacity with file transfers between only two systems.
If you really want to see what your network is capable of handling in raw bandwidth, try running large packet flood pings between each host. As a side note, this will also hammer the heck out of the corresponding network stacks.
1. What you are seeing is average rate.
TCP goes into congestion avoidance and fast retransmit and recovery (for example TCP-Reno). The connection might be touching maximum rate but you are not seeing it!
2. If your file transfer is over a large round trip time then the TCP rate gets diluted to roughly File-Size / (N * RTT),
where RTT is the round trip time and N is the number of round trips required to transport the file.
3. If you are downloading the file from "somewhere out there", then the bottleneck might be "somewhere out there" and not in your setup. Please recall, the bottleneck will cause TCP to decelerate whenever it sees a packet loss.
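The File-Size / (N * RTT) estimate from point 2 can be sketched as below. This is a deliberately simplified model: it assumes a fixed window with exactly one window delivered per round trip, and ignores slow start, loss, and congestion avoidance. The function name and the 64 KiB window in the example are assumptions for illustration.

```python
import math


def transfer_rate_mbps(file_bytes, window_bytes, rtt_seconds):
    """Average rate in Mbit/s when one window of data is delivered per round trip."""
    round_trips = math.ceil(file_bytes / window_bytes)   # N in File-Size / (N * RTT)
    return file_bytes * 8 / (round_trips * rtt_seconds) / 1e6


file_size = 100 * 1024 * 1024      # 100 MiB file
window = 64 * 1024                 # assumed fixed 64 KiB window

lan = transfer_rate_mbps(file_size, window, 0.0002)   # 0.2 ms round trip
wan = transfer_rate_mbps(file_size, window, 0.05)     # 50 ms round trip
print(f"LAN ceiling: {lan:.0f} Mbit/s, WAN ceiling: {wan:.1f} Mbit/s")
```

With a LAN-sized round trip the window is not the limit on a gigabit link, which is why the RTT effect matters mostly for transfers that leave the house.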
2/100 dollars.
Voltaire: God is dead.
God: Voltaire is dead!
just get a gigabit switch.
i'm not trying to be a dick here, but you're a fucking moron if you think you can use elmers glue and duct tape to build a high speed network! gigabit needs gigabit cards and gigabit switches period, not having these is effectively taking the giga out of the bit.
secondly, if you're saying that it was cheaper for you to get 5 gigabit cards than it was to get 3 gigabit cards and a 4 port gigabit switch, then a lot of your problem is probably weak gigabit cards. you didn't buy the 12$ ones on the internet did you? those should be labeled 1/3gigabit, their processors aren't capable of enough transactions, and some actually offload onto the CPU like some sick "winethermodem"
i run a gigabit network at my home, i have 4 desktop machines on it, 2 of them with INTEL gigabit built into the motherboard, and the other 2 with intel PCI cards. i can easily transfer using nfs at 700mbps, which sounds fair to me after TCP/IP overhead. my samba results are a bit less and around 600-650mbps.
also, every one of these machines is an XP1600+ or faster, except for my notebook, which is a celeron2.4 and is using a PCMCIA gigabit card from 3Com. The laptop is slower on the network with about 400mbps with samba, which is most likely a limitation of the 3com card combined with the PCMCIA bus.
--
i apologize for cursing, but please read that paragraph again. you need to build things within spec (or above) to get the stated performance, gigabit is not made to be strung nic->nic->nic and routed with standard routing software. Your PCI bus, your nics, your memory and CPU, and your un-tuned routing are probably ALL adding up to your weak transfer rates.
First, range has nothing to do with it. You can get a decently low amount of latency (and high amount of bandwidth) all the way across the world with fiber. It just depends on the media. Satellite shots add a lot of latency because of the time for the signal to travel through the atmosphere and back down, plus the resending of packets due to errors.
Secondly, low latency is what you want. TCP doesn't handle HIGH latency very well. Remember, TCP needs to get ACKs back for every packet it sends. High latency means TCP has to wait a while for a response.
Just remember they are pushing Gigabits per second down fiber that's lying on the bottom of the ocean.