TCP Equipped Ethernet Card
Josh Baugher writes
"
A 100 megabit ethernet card
with a TCP/IP stack built in. They claim
to be able to do 9 megabytes/second with only 2% CPU load (compared to
4.5 megabytes/second at 98% receiving CPU load using Windows NT TCP/IP).
(Read about this on the "geeks" mailing list.) "
I'm using Samba between Windows 98 and Linux 2.2.6, and my speeds are significantly lower than
that (~ 100 kb/s). I have a decent 3Com card in
the Linux box and a cheapo Linksys card in the
Windows box, along with a Linksys hub. Is there
anything I can tweak to make it go faster?
I never thought CPU load was an issue. If 100 Mbit network cards have that sort of CPU load, then what's the go with gigabit ethernet?
Maybe they have some weirdo network cards that don't use DMA.
I think I'll stick with my 10mbit NIC for now
There are at least a half dozen companies developing/offering routers on silicon. Most have highly parallel and exotic (e.g. hypercube) architectures with hardwired advanced pattern matching algorithms. The future of TCP/IP is in dedicated hardware circuits.
Banging bits out an ISA bus isn't exactly a fast process.
Ok, Ive got two computers with SMC etherpower II 10/100 cards running under Linux 2.2.4, connected by a crossover cable in full-duplex mode, with little other traffic. ie: ideal conditions.
host 1: nc -l -p 5050 > /dev/null
host 2: dd if=/dev/zero of=/proc/self/fd/1 bs=4096 count=102400 | time nc utrk 5050 -w1
output from host 2:
102400+0 records in
102400+0 records out
0.70user 7.70system 0:46.10elapsed 18%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (140major+22minor)pagefaults 0swaps
since netcat had a 1 sec delay, we get a total of 45 seconds for the transfer of 419430400 bytes = 9320675 bytes/sec = about what they are quoting. CPU load was higher than 2%, obviously (actually, about 25%), but the computer was still usable (it is the disk IO that would kill it, anyway)
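For anyone who wants to redo the arithmetic, here is a minimal sketch in Python. The numbers are copied from the dd and time output above; treating netcat's -w1 timeout as one second of pure waiting is the same assumption the post makes, and the variable names are only illustrative.

```python
# Figures copied from the dd/time output in the post above.
block_size = 4096        # dd bs=
block_count = 102400     # dd count=
user_s = 0.70            # user CPU seconds reported by time
system_s = 7.70          # system CPU seconds reported by time
elapsed_s = 46.10        # wall-clock seconds reported by time

# Assumption (same as the post): about 1 s of the elapsed time is
# netcat's -w1 timeout, leaving roughly 45 s of actual transfer.
transfer_s = 45

total_bytes = block_size * block_count
bytes_per_s = total_bytes / transfer_s

# CPU share as time(1) computes it: CPU seconds over wall-clock seconds.
cpu_percent = (user_s + system_s) / elapsed_s * 100

print(total_bytes, int(bytes_per_s), round(cpu_percent))
```

Note that the 18% is what time(1) attributes to the nc process only; it says nothing about work done in interrupt context.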
Yeah, this is in fact one of the "other uses" that toshiba is planning for the chip that is going to be used in the "Playstation II"
Um, just wanted to point out that the samba protocol specifies that, for most machines out there, you can only send 65,535 bytes in one go. And then you have to make sure that the other side heard you. So, ftp will always be faster than samba.
Remember to keep in mind the difference between bits and bytes. While it's probably true that you don't _notice_ a performance hit at 900K/s, what you really mean is 900 kilobits/s. That's 80 times less than 9 megabytes/s. So if the kernel spends 1% of CPU time on 900kb, then it's easy to see why it might spend a whole lot more time on 80 times more data.
My two cents.
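The bits-versus-bytes conversion above can be checked with a few lines of Python. This is only a sketch; it assumes decimal prefixes (1 megabyte = 1000 kilobytes), and the function name is made up for illustration.

```python
BITS_PER_BYTE = 8

def mbytes_to_kbits(mbytes_per_s):
    """Convert megabytes/s to kilobits/s, assuming decimal prefixes."""
    return mbytes_per_s * 1000 * BITS_PER_BYTE

claimed_kbits = mbytes_to_kbits(9)   # the card's claimed 9 megabytes/s
ratio = claimed_kbits / 900          # compared with 900 kilobits/s

print(claimed_kbits, ratio)
```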
oh yeah, interested people should take a look at
D. Clark, V. Jacobson, J. Romkey, and H. Salwen,
An analysis of TCP processing overhead, IEEE Communications, Vol. 27, No. 6, pp. 23--29,
June 1989.
Classic paper! (well, kinda)
You must be *very careful* assessing the load on a Linux system, as the measurement of the load is far from optimal or complete!
The "load average" you see displayed in the output of the "w" or "uptime" commands is quite useless in this case: it shows the average number of user processes that are "ready to run", i.e. they would take the processor if it were available. This number in no way reflects the load by interrupt handlers. It is also badly computed, as it includes processes that are waiting for the disk.
(this error was introduced into the kernel long, long ago and I have mailed Linus about it, but he did not want to change it as this figure better reflected the general feel of load on the system, in his opinion. In my opinion, it makes the whole "load average" useless, so I always patch this away when I compile a kernel)
The numbers in /proc/stat are also useless for the kind of tests required in this case, because the load imposed by interrupt handling is added to the process that happens to be running when the interrupt comes in. When the interrupt comes when the idle process is active, the interrupt handling time is counted as idle time.
This means that when a test is run that does not use much user-space processing and heavily relies on interrupt handling (like a networking test where a test process sends useless data over a TCP socket! same when it tries to use a serial port), the load results obtained from Linux will be far off the realistic load on the processor.
Rob
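As a rough illustration of measuring this from counters rather than from the load average, here is a sketch in Python that compares two samples of the aggregate "cpu" line in /proc/stat. It assumes the field layout of later kernels (user, nice, system, idle, iowait, irq, softirq, steal), which account interrupt time in separate columns; the kernels discussed in this thread do not necessarily expose those columns. The sample lines are invented, not real measurements.

```python
def parse_cpu_line(line):
    """Return the first eight counters of an aggregate 'cpu' line."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Guest columns (if present) are already included in user/nice,
    # so only the first eight counters are used.
    return [int(x) for x in fields[1:9]]

def cpu_shares(before, after):
    """Return (busy, interrupt) fractions between two samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before),
                                    parse_cpu_line(after))]
    deltas += [0] * (8 - len(deltas))      # pad older, shorter layouts
    total = sum(deltas)
    if total == 0:
        return 0.0, 0.0
    idle = deltas[3] + deltas[4]           # idle + iowait
    interrupts = deltas[5] + deltas[6]     # irq + softirq
    return (total - idle) / total, interrupts / total

# Hypothetical samples taken before and after a network test.
sample_before = "cpu  100 0 50 800 10 5 5 0"
sample_after = "cpu  150 0 120 1000 20 30 30 0"
busy, irq = cpu_shares(sample_before, sample_after)
print(round(busy, 3), round(irq, 3))
```

On a live system the two samples would come from reading the first line of /proc/stat twice, a few seconds apart.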
Although writing a driver would be fairly easy, this would break all sorts of features that have come into existence due to open source protocol stacks (not being able to do IP chains stuff to an external IP stack is a definite step backwards).
They would either need to open source the stack and make it downloadable from an OSS driver (like some of the SCSI cards out there) or the card will never get within 10' of my boxen.
Someone already pointed out the security implications. Personally, I don't want to be yanking a card in and out of production just because someone built the next teardrop and the vendor is slow to fix it.
As for NT, well, this is obviously the tack that MS is pursuing to gain equal performance with other OSS operating systems, but it has certain implications that will keep NT in the second fiddle chair. The card will probably weaken NT security initially by breaking fixes that are covered by current hotfixes and service patches. Additionally, I can think of no better way to fill up someone's disks than by having improved transfer rates across the net, operating system and disks. Worm designers should rejoice as well. Now, you can design worms that can consume more resources without being noticed. If you're really tricky-trick you could design a worm that existed only within the context of the TCP/IP stack and if the board has NVRAM... well, a box could stay compromised for years. How long before a Microsoft Weapons System sees daylight?
-- "Most decently written TCP/IP stack applications have NO buffer overrun problems" - an anonymous programmer at Fort Meade
-- "ALL TCP/IP stack applications are a long way from being mathematically correct." -- A mathematician's retort.
-- "Our job is to find the differences between one and the other and keep this information from the public as long as possible." -- A manager who successfully defused the situation..
Heh, the site's been slashdotted, they should use their own cards or something... :)
Moving the stack into hardware is an interesting idea, though. Unfortunately it has some negative (and admittedly positive) implications for those concerned about security.
First, it will be impossible to tell what operating system a computer is running by using TCP fingerprinting. This is both good and bad in that it will thwart script kiddies to some extent by not revealing the platform, thus making it more difficult to take advantage of well known exploits. On the other hand things like Netcraft and the Internet OS counter will also not be able to take surveys properly.
Second, and entirely negative, is the possibility that their hardware implementation of TCP/IP may be sub-standard. It may have scads of DoS loopholes and other weaknesses. Unless they make the thing software upgradeable as holes are found, and make the software Open Source, I don't see it gaining much market share against the cheap and plentiful cards we have now.
I submitted the news item, and thought it to be a bit overwhelming to cite more than one source. What is the appropriate limit as to how many sources to cite?
I think going 1 level down is fair -- I got the item from 1 source, the geeks list.
What if an item popped up on the geeks list that came from site1 that came from site2 that came from site3...?
Should geeks, site1, site2, site3 ALL be cited?
Opinions?
- A.P.
--
"One World, One Web, One Program" - Microsoft Promotional Ad
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
I've seen ftp reporting 1.1e+3 kbytes/s on my 10Mbit Ethernet. Of course, there wasn't really any other traffic on the network at the time.
Posted by max kreed:
http://mainz-online.de/internet/news/news130598.html
Notice the date: 13th of May 1998. If it's a hoax then they've been doing it for quite a while I guess.
First of all, many threads here have been talking about writing drivers for it for Linux, etc, and second, many threads have talked about making it software upgradable to fix security holes.
Consider that BECAUSE it was posted to Slashdot the makers of this card could be slashdotted with email and offers of help for making drivers for alternative OSs. Plus, if slashdotters show enough interest they'll make OS drivers because they can see that it will increase their sales.
I think this kind of thing is EXACTLY what should be posted to Slashdot because we'd be able to make a difference.
I worked with an expensive Intel NIC 9 years ago that had an i960(I think) and an OSI protocol stack on board. Never did any benchmarks, but I'm guessing the complex OSI protocol stack plus wimpy ISA '386 boxes made putting intelligence on the NIC a good idea at the time.
I figure there must be a good reason these things haven't gone mainstream in almost a decade. The proliferation of simple TCP/IP plus faster CPUs might be one reason.
It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. - Abraham Maslow
Not all of your bandwidth is going to go to data. Remember that you have protocol overhead to deal with -- ethernet packets have headers, TCP packets have headers, and so forth. All of those headers take up some of your precious bandwidth, so you don't really have 1.25MB/sec to play with.
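To put a rough number on the header overhead, here is a sketch in Python. It assumes full-size frames at the standard 1500-byte Ethernet MTU, IPv4 and TCP headers without options, and no ACK traffic, collisions or retransmissions, so it gives an upper bound rather than what any real transfer would see.

```python
# Per-frame sizes in bytes. Standard values, assuming no IP or TCP options.
MTU = 1500             # largest IP packet in one Ethernet frame
IP_HEADER = 20
TCP_HEADER = 20
ETH_HEADER = 14        # destination MAC, source MAC, EtherType
ETH_FCS = 4            # frame check sequence
PREAMBLE = 8           # preamble plus start-of-frame delimiter
INTERFRAME_GAP = 12    # mandatory idle time between frames

def tcp_goodput(link_bits_per_s):
    """Upper bound on TCP payload bytes/s for back-to-back full frames."""
    payload = MTU - IP_HEADER - TCP_HEADER
    on_wire = MTU + ETH_HEADER + ETH_FCS + PREAMBLE + INTERFRAME_GAP
    return link_bits_per_s / 8 * payload / on_wire

best_case = tcp_goodput(10_000_000)   # 10 Mbit/s Ethernet
print(round(best_case))
```

Under these assumptions the ceiling on a 10Mbit link comes out a little under 1.19 megabytes/s instead of 1.25.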
The PIO is killing your performance. Switch to a PCI bus mastering card, and your load will go WAY down.
(a full TCP stack is a pretty large thing to be putting in hardware)
Unless they count firmware as hardware. I have considered similar stunts with the Acenic gigabit card. At gigabit speeds, the idea looks a lot more useful.
Please, correct me if I'm wrong, but couldn't you
use an FPGA or something similar, thereby providing
reconfigurable hardware TCP?
Now that would almost certainly put the cost up and there may be security issues; I'm not too hot on FPGAs.
Anybody know if this has been tried using FPGAs somewhere else?
Iggy
I can't wait until memory management is moved to hardware only. Maybe Linux MM would stop crashing. Looks like web servers are going to be implemented in hardware real soon. Would you believe 3D graphics were once done in software?
Winmodem people need to learn something from this. Things perform better in hardware than software. Of course, this depends on the openness of the drivers. We may be stuck with a good card and no documentation. Only problem I see is that tcp and other pieces of this layer are intended for software. So maybe this isn't a good idea (I'm tending to agree with the "what if there is a hardware bug" comment). Winmodem people seem to have taken the opposite approach and I'm not sure who is worse, but the winmodem people can definitely learn from these guys.
If whoever did this is getting 900 kilobits/s, he's got SERIOUS problems :). Even 900 kilobytes/s is pretty slow for a 10Mbit (although maybe SMB is just a slow protocol?). Still, my ftpd will eat up to 5% of my CPU running at ~1100 kbytes/s, so I can see where this TCP on-a-card sort of thing might come in handy for higher bandwidth stuff.
Thanks, I never saw that before. Still there is more to consider.
When you start offloading stuff from the CPU to a processor on an add-on card, it's going to be a pretty single-purpose processor. This really limits the lifespan of it, though. What if there's some new networking protocol that obsoletes TCP? If the processor on the netcard isn't general purpose enough, then it'll go to waste. What if people start using voxels instead of polygons (hypothetical), are we going to find a bunch of Voodoo cards in the trash?
Of course general purpose processors on cards would be even worse. You'd be much better off with another CPU. But it would be nice to have some means of keeping abreast with changing protocols and APIs and the such.
Remote loopback is a term used by some Ethernet card diagnostics for a network test in which one system sends out an ethernet packet and the receiving system immediately takes it and sends it back.. sort of a link-level ping test.
With a remote loopback, you get hardware testing, but it's probably not a great test for anything having to do with tcp/ip since it's not involved.
- jon
Ganymede, a GPL'ed metadirectory for UNIX
It's a hardware story... how hard would it be to implement this on a linux box?
Just about any interesting new computer toy should be reported here. If linux (and other) users don't keep up on what's new, they're going to end up in the silicon ghetto, just like Microsoft would want.
Putting that functionality in hardware ties you to a particular implementation. Systems have been designed that would handle page faults purely in hardware. They were very complicated and rather limiting. Experience has shown this to be an area best handled by software, which is flexible and can be easily changed or fixed.
Yes, I believe that 3D graphics were once done in software. They still are, in many applications. 3D game rendering is a rather specific task, and very computationally expensive. Thus, it was worthwhile to develop specialized hardware for this purpose. Compared to modern workstation hardware, TCP stacks are fairly inexpensive, if written correctly. I question how useful silicon TCP would be. I expect that the added cost and reduced flexibility would more than outweigh the saved CPU-load for most cases.
Then again, perhaps NT's implementation of TCP is poor enough that this is a concern.
Oh, and I never have problems with Linux MM.
--Lenny
//"You can't prove anything about a program written in C or FORTRAN.
It's really just Peek and Poke with some syntactic sugar."
Ah, I didn't think of that. Well said. I fell into the "this shouldn't be on Slashdot!" trap there when what I really wanted to do was inject a note of caution. Proprietary hardware is one of the few things that can kill a free operating system, and a nerd who values her freedom has cause to be wary.
fish and pipes
i get about a 50% cpu load on a p5-120/3Com 509/10BT when moving at full speed, messes with my mp3 playing...
Of course, I tried to post this to /. a few times and it didn't make it. Oh well. Maybe people will find it appropriate to this thread.
I've often wondered about this. Maybe it's time for another Ask: how can the load avg be out of sync with the cpu utilisation by such a large margin?
When 3COM first started making Ethernet boards, they tried putting the protocol processing software on a smart Ethernet board. It never worked too well, unless you had a slow system with a brain dead operating system or small address space. The processors on the "smart" boards were cheap and slow. The on-board software tended to be buggy, limited and out-of-date. The cards were expensive. You still had to have some sort of lightweight protocol for communication between the operating system and the card, adding another layer of software. With well written software, a 25 MHz 68020 host could run TCP/IP at wire speed on a "dumb" 10 MBPS Ethernet card. The "smart" cards quickly disappeared.
Putting the protocol processing in Silicon will burn you when you need new features and algorithms in your networking stack. What happens if you need large windows, SACK, IPV6, IPSEC, QOS?
The right solution is to use an operating system that doesn't suffer from MBD and use decently designed network cards on a fast bus. 100 MBPS shouldn't be a problem for a decent system. 1 GBPS is where current hardware and operating systems fall down and need improvement.
Mea navis aericumbens anguillis abundat
If I remember well, an Ethernet above a 10% usage level is considered very close to being quite dead.
-- Fast, Cheap, Well. Pick two.
Let's get those attributions right!
Peter
I've been wondering about how well a Linux PC would actually be doing on a 100Mbit Ethernet.. I'm a bit worried that it wouldn't be too good, as there seems to be real work for the CPU to do. I became aware of this when I found that an R5000 SGI O2 couldn't do more than 5 MB/sec, memory-to-memory TCP, no disk involved! And this used more than 60% of the CPU; the system was completely kneeling and with all the other work going on the TCP became a real bottleneck :-(
IRIX 6.3 isn't the worst operating system in the world so this got me thinking.
That on-board TCP stack seems interesting, but only if it supports something else than NT of course.
But then there's the problem of embedded TCP stacks; I've yet to see one without strange bugs here and there. TCP stacks are notoriously difficult to get right; in practice it's only a real, open-source, preferably Unix box that can be trusted to (eventually) get it Right.
TA
Who's talking out of his ass here? If you haven't got better comments than that please shut up.
Here's the .sig Dave Miller used last year:
Yow! 11.26 MB/s remote host TCP bandwidth &
199 usec remote TCP latency over 100Mb/s
ethernet. Beat that!
> I always patch this away when I compile a kernel)
Would you mind posting a patch for us lazy people? :-)
Thanks,
TA
If you think that 900KB/sec is slow for 10bT, what do you think the average is? Ethernet is not known for its ability to work well at high levels of utilization, and 900KB/sec is 7.2Mb/sec. At this level of utilization and up, the contention between different hosts trying to talk at once drives up the collision rate, which keeps the transfer rates down.
Such a thing has already become commonplace on the Amiga. Most of the soundcards have DSPs, and most of those are capable of doing MP3 decoding to take load off CPU. It's cool, but I would expect it to only be of interest to people who aren't able to get faster CPUs (e.g. 680x0 users). With today's PPCs and x86 chips, and clock rates approaching a gigahertz in the next year or so, it seems like the processing requirements of decoding MP3s are kinda trivial.
Don't get me wrong -- I'm always interested in custom hardware to take stuff over from the CPU. Heck, that was the whole point of the original Amiga hardware. But general-purpose CPUs are just getting so damned fast... who cares about CPU load anymore? Just get a faster processor. They're already "infinitely" fast for most people's purposes.
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
hmm...does this mean that some of those poor, defenseless servers are finally going to have a /. defense mechanism?
Juiced? Or Not?
First, a couple of years is a long time in the networking industry. More impressive first products have been engineered in that time frame.
Second, don't be misled by claims that something is done in hardware. As often as not, this does not mean that the entire implementation is burned into silicon. Often it only means that processing that may have been done on the main CPU has been shifted to a dedicated processor. Oftentimes this dedicated processor may be particularly well suited to the task at hand because it implements special instructions. It may also be on the same die as other discrete functional units dedicated to the task (like an ethernet controller).
What you end up with can be quite fast, but it still retains a bit of flexibility, so it can be reprogrammed to fix bugs, or meet new standards. (or something totally unrelated. Apparently the engineers at Alteon programmed the MIPS CPUs they use in their gigabit ethernet switches (two per port) to crack RC5 keys for a laugh.)
Well, have you?
- There are many hardware patents out there that don't affect us that much; the question is, is this one of them?
- Is the "Silicon TCP" indeed a step forward for NICs, and if so,
- is the patent based on "prior art", and therefore unenforceable?
I would hate to see something that would work well for us regular (e.g., don't have unlimited funds to buy the next new great hardware) folks trapped within a "can only get it from one company" patent....Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
What do you think DVD and VideoCD hardware decoders do? Video CD is MPEG-I, and DVD movies are stored using MPEG-II.
Of course, it's not as though MP3 decoding should be rough on a reasonably fast machine. My 333 Celeron typically has loads less than 1% while playing using mpg123 or x11amp. By comparison, running WinAmp over on my windows partition typically uses ~10% CPU.
The problem is that adding a specialized MP3 decoder eats up yet another slot in the case (unless you make it a USB/FireWire thingy). I'm down to one PCI/ISA slot on my ATX (w/ AGP) mainboard, and once I buy a modem next month, it's full.
A potentially bigger problem is that you've got to redo all the MP3 players out there to support the decoders, and you've got to settle on a standard hardware decoding method, so we don't have to have 5 different versions of x11amp.
> Maybe we should try to port netperf
> (www.netperf.org) to Windows and add raw
> Ethernet to it.
netperf already runs on Win32, I've tested it on both 95 and NT. The netperf ftp site has binaries for Intel and Alpha.
Neither 95 nor NT has any problem saturating a 10Mbps ethernet, but at higher speeds NT kicks the snot out of 95 on the same h/w. I've seen NT get 90+Mbps on a 100Mbps ethernet, but don't have anything faster to test with.
The loopback tests show that Linux and FreeBSD have comparable maximum speeds, much higher than NT (on the same h/w). One assumes that this has to do with how often stuff is moved around in memory, etc. etc.
BTW, if you ever want to convince yourself that ISA sucks, just run some tests on a PCI NIC and then an ISA NIC, and watch the 60-80% increase in CPU use for the ISA card.
The systems were PII-333s with 64MB RAM running NT4SP4, and DLINK 8029 (or maybe 8019) PCI NICs.
I believe that the NT TCP/IP stack has been improved greatly since the 3.5x days, which was what I suppose you would have been running on a P90.
(Sorry for delay getting back to you, has been busy at work)
Surely there is more at stake here than the CPU usage?
If the soundcard could do the decoding then you'd only be sending 3 or 4 megs over the PCI bus rather than 30 or 40 megs. Would this not be a good thing?
Sigh. That old myth. Look at the first paper on this page. One of the designers of Ethernet tested it. Drove it at almost 100% of 10Mbps. A bunch of workstations on the net.
They make it sound like they're using fancy servers.
They're not, they're using two proprietary IBM Aptivas, low end machines. (I should know, I have one of them, the same model they used). Firstly, the E56 has a 266 MHz K6, unless IBM lied to me too :) Secondly, 48 megs of RAM seems a bit low for NT. Thirdly, the hard drive systems in there are probably also low quality to say the least.
Couldn't they have borrowed Mindcraft's server or something? At least THEY could tune an NT box :), although not a Linux box... and it was a server. I don't call a home system a server that will be pumping 90 megabits/sec.
Isn't this what I2O is supposed to accomplish? This is very cool, but it would be nice for the TCP/IP stack to be configurable and upgradable, e.g. to IPv6 etc. This should be great for homebrew routers and such also. I hope linux drivers appear soon.
I just checked the throughput between a pair of my computers. I did this with a test program that used read and write calls to transfer the data and did nothing else with it.
The data transfer ran at 10Mbytes/second with a 12% cpu load on a PII-266. Configuration info at end.
Extrapolating to a modern 500MHz system, that would be something like a 6% cpu load. I suppose there are niches where the difference might matter, but with three 10Mbyte/sec transfers the PCI bus will choke anyway and we are looking at the difference between 18% cpu for network and 6% cpu for network.
I guess if you are stuck with a server OS where the TCP stack is a pig and you can't fix it then maybe this is a reasonable optimization, but I sure wouldn't bother with it for linux.
Footnote: It turns out that sending data to /dev/null is more expensive than the ethernet transfer. ;-)
Configuration of Receiver: DellXPS PII-266, Kingston 10/100mbit card(dec21140), Linux 2.2.6, tulip driver.
Configuration of Sender: Gateway PII-300, Kingston 10/100mbit card(dec21140), Linux 2.2.1, tulip driver.
I'm not sure if Kingston still makes these, we buy equivalent cards for $13 now.
Configuration of Network: Half duplex 100mbit ethernet with about 50 machines hanging on it. Mostly idle during test.
Procedural Note: I did run the test many times and exclude the slow outliers, both of these machines have operational services that I did not wish to disable, so many test runs were ruined by other loads on the machines.
Open Benchmark In order to preserve my credibility I will offer to allow anyone to run my tests exactly as I have in order to verify my results. No changes which may improve the results will be permitted.
Thad
The Bolachek Journals
The web page describing this NIC is really unimpressive. What I'd really like to know is:
How many simultaneous connections can this thing support, and is it slower when multiple connections are used? How's the performance when one of these cards is hooked up to a machine with a standard software TCP/IP?
Putting TCP/IP in hardware is nice, I guess, but then nasty real-world issues crop up. What happens if there's a bug in the implementation? There's also the nasty challenge of writing a driver for a card like this, but I won't claim that's a defect of this NIC design.
the big push for gig ether is campus trunks between buildings. goes up to 10km (single mode) and Cisco's Gigabit EtherChannel can aggregate up to 16 links. Really great if your org grew strangely, and you have departments in two buildings. It also helps in linking switches in an internet server fanout. Either way, it's more of a backbone technology than anything else. In fact, PCI32's theoretical peak (132MB/s) doesn't match the throughput of full duplex 1000Base-SX. And NT's networking core prevents running the network over 400Mbit peak.
Interestingly enough, on most tested unices, i think they're getting around 800-900Mbit. It'd be interesting to see how fast ftp.cdrom.com would be with gig ether to the backbone... Maybe then i'd see more than 10KBps...
>My 333 Celeron typically has loads less than 1% >while playing using mpg123 or x11amp.
no, that is not right, linux has got a problem with this; also the CPU usage of NAD under Windows was only 0%...
winamp consumes 5.4% of the cpu power on my PII/300.
tested with the rc5-client, with and without winamp: this 5.4% is the difference between the key-rates.
try it under linux with this method
(i haven't had the time yet)
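The key-rate method can be written down in a couple of lines. This Python sketch assumes the rc5 client soaks up all otherwise idle CPU time, so any drop in its key rate equals the share taken by the other program; the two rates below are invented for illustration and are not measurements from this thread.

```python
def cpu_share_from_rates(baseline_rate, loaded_rate):
    """Fraction of CPU taken by the other program, from the key-rate drop."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return (baseline_rate - loaded_rate) / baseline_rate

# Hypothetical keys/s with the player stopped and with it running.
share = cpu_share_from_rates(1_000_000, 946_000)
print(f"{share:.1%}")
```

Unlike the load average, this picks up time spent in interrupt handlers too, since that time is also taken away from the key cracker.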
First off, to all those engaged in the "I get more bandwidth than you" pissing contest: try beating 560Mb/s sustained application-to-application bandwidth between two P2/450GX systems (running NT, BTW, but not NT's TCP/IP). So you think you can beat that, eh? Try beating 2us application-to-application latency for zero-length messages. Can't do it, even with buzzwords like VIA and I2O, can you? OK, next topic.
Regarding TCP/IP on the card: as another poster pointed out, this is not a new thing but a very old thing. I once worked on putting DECnet on old 3Com "smart" cards...ick. There are all sorts of problems with doing this sort of thing on the card. First is upgradability of the network stack. You immediately become dependent on the manufacturer for upgrades - don't expect open-source firmware any time soon, even if you had the tools to compile and load it. FPGAs aren't really a good choice here because they increase the component and design costs too much. You'd be much better off using a commodity embedded microcontroller with "firmware" stored in flash memory, although this may still increase the cost unacceptably and for _real_ speed you just plain have to chuck all this stuff out the window and go ASIC. As it turns out, most systems in most uses have more CPU power to burn than any other type of resource. Some guys at HP several years ago took this observation and ran with it; they designed a card that was even more stripped down than the typical Ethernet card, doing even more of the work in software, and they actually got excellent results.
Lastly, the conversation about NT's networking code reminds me of an exchange I had with an engineer at MS a couple of years ago. He was saying that they had to sacrifice a little on TCP/IP features and error checking (e.g. not crashing if sent a source-routed frame, or something like that) to get speed. My response was that (a) not checking unusual conditions in incoming network packets is just unacceptable, and (b) NT's TCP/IP performance is piss poor, indicating that they have bigger issues to worry about than shaving a few instructions by not checking packet headers. In the time since then I have found no reason to change either observation.
Slashdot - News for Herds. Stuff that Splatters.
Oh come on, what do you expect? ;o)
Can we please be serious about this. 1MB/s is nothing. If win95 wouldn't be able to even sustain that...
Let us know when you get >10MB/s with a 100Mb/s board.
btw. please get your capitalization correct: MB = Mega Byte, Mb = Mega bit, mb = I dunno
Breace.
Using another system doing the remote loopback, or simply wiring TX to RX on the system that's being tested itself.
Sorry about the confusion.
Breace.
No you don't understand.
With Linux you don't need multiple processors to sustain decent Ethernet throughput.
Breace.
> I've seen NT get 90+Mbps on a 100Mbps ethernet
I'm sure you are right, but can you please describe the system (CPU, RAM etc?), just for us to get an idea. Because I've never been able to see such a thing. Then again, I think I gave up on NT when we still only had P90's... :(
> BTW, if you ever want to convince yourself that ISA sucks, just run some tests on a PCI NIC and then an ISA NIC, and watch the 60-80% increase in CPU use for the ISA card
Right. Although we had a bad experience with the OPTi 802/832 PCI chipset. It doesn't implement DMA burst modes from RAM to PCI devices, and thus limits the max datarate in that direction to about 5MB/s. That's worse than ISA! Anyways, that's another story altogether...
Thankx for letting us know about netperf. I wasn't able to connect to their ftp server yesterday.
Breace.
> Video CD is MPEG-I, and DVD movies are stored using MPEG-II.
Video CD uses MPEG-1, layer I or II, not layer III which is MP3. Most MPEG-1 hardware decoders that I know of don't implement layer III decoding.
Most DVD movies use Dolby AC-3 for audio, not MPEG although I believe in Europe they do (but again layer I or II as I understand).
I also know of no DVD decoder that integrates MP3 decoding.
Too bad really...
Breace.
Hmm, you must be confused about the D-Link. I don't think they have a 8029. They do have a Realtek 8029 clone though, but that's a 10Mb/s only chip. It's also a NE2*00 clone, which means that it's very unlikely to actually get anywhere close to 100Mb/s because of the silly I/O scheme. I imagine it was probably a DEC based board.
Anyways, that's a pretty fast system; we don't have one in the office here. I'll certainly redo our tests though as soon as I can, because we do have NT4 now and I think you are right about us using 3.51. (Although most of our tests are raw Ethernet related, not TCP/IP)
Thankx for letting us know.
Breace.
Exactly, this card is of interest because the Microsoft network stack is terribly slow, and costs an awful lot of CPU time.
The main reason for this is: Poor design. (What did you expect?)
The M$ network stack is (as with most M$ device driver architectures) way more complex to deal with than, for example, Linux, and as a result many of the network drivers are not well written, i.e. they are not optimized for performance. I'm sure the writers are happy if the damn thing works at all.
The network stack itself also has much more involvement with each packet going out or coming in than with Linux. In some cases packets are actually copied in RAM. This is what causes the higher CPU overhead.
We have run several tests and the results were depressing. I think we probably lost the source code, but if I can find it I will post it somewhere.
Here are some of the figures I remember (all tests were raw network tests, RAM to RAM, no hard drives involved):
Two P90 systems using 100Mb/s Full Duplex (DEC 21140) cards. We were unable to sustain more than 35Mb/s using UDP in one direction only.
Linux on this configuration: close to 100Mb/s.
Using a raw packet driver on a 200MMX notebook with an SMC 100Mb/s Full Duplex card and using remote loopback, we are unable to get much more than 15Mb/s sustained. (That's 15Mb/s going out and 15Mb/s coming in.)
This was using raw Ethernet packets, no TCP/IP or other protocol.
This same configuration in a normal network is unable to receive more than around 4Mb/s UDP multicast packets, or packets will be dropped (thus not received)
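For anyone who wants to reproduce drop figures like these, here is a rough sketch of counting lost UDP datagrams by sequence number. It is not the test code mentioned above: it uses plain unicast over loopback rather than multicast, and the function names, packet size and packet count are all assumptions picked for illustration.

```python
import socket
import struct


def lost_packets(seen, sent):
    """Return how many of the sequence numbers 0..sent-1 never arrived."""
    return sent - len(set(seen) & set(range(sent)))


def udp_loopback_run(count=1000, size=1024, host="127.0.0.1"):
    """Blast `count` datagrams at a local socket, then see which ones survived."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind((host, 0))            # port 0: let the OS pick a free port
    rx.settimeout(0.2)            # stop reading once the queue stays empty
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    addr = rx.getsockname()

    padding = bytes(size - 4)     # zero fill after the 4-byte sequence number
    for seq in range(count):
        tx.sendto(struct.pack("!I", seq) + padding, addr)

    seen = []
    try:
        while len(seen) < count:
            data, _ = rx.recvfrom(size)
            seen.append(struct.unpack("!I", data[:4])[0])
    except socket.timeout:
        pass                      # whatever is missing by now was dropped
    finally:
        rx.close()
        tx.close()
    return lost_packets(seen, count)


if __name__ == "__main__":
    total = 1000
    print(f"lost {udp_loopback_run(count=total)} of {total} datagrams")
```

Because the sender finishes before the receiver starts reading, anything that does not fit in the socket receive buffer is dropped, which is the same kind of loss a receiver that cannot keep up would see on a real network.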
I'm surprised that people have put up with the poor raw network performance of our Redmondians. It's a disgrace, and I will NEVER use an M$ product if serious networking is to be done. Even if the Ethernet adapter handles the protocols.
Breace.
One thing is becoming obvious.
We need some serious network benchmark tools that are cross-platform usable between Linux and Windows and maybe more.
Strangely enough, not many seem to exist. Nor does anyone seem to have hard data on network performance. It is entirely useless to say 'I get 1MB/s using FTP'.
A good network benchmark tool will be able to test raw Ethernet performance as well as performance through protocol layers.
Of course it would only test the network and not use the file system or hard drive. It should be very clear what sort of configuration is supposed to be used: two systems running full-duplex, one system using remote loopback, or whatever. It could also be interesting to have a test with more than two systems to show what collisions do.
And most important as far as I'm concerned: Open Source. I don't believe in closed source benchmarks.
The difference between raw Ethernet and TCP/IP protocol would show us how badly we need hardware assistance on what platform.
Maybe we should try to port netperf (www.netperf.org) to Windows and add raw Ethernet to it.
Well, maybe I'll be a bit more serious about this if there's an interest.
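To give an idea of the protocol-layer half of such a tool, here is a minimal sketch of a memory-to-memory TCP throughput test that should run unchanged on Linux and Windows. It is only an illustration, not netperf: the names `measure_tcp` and `_sink`, the chunk size and the transfer size are all assumptions, and it runs over loopback, so it exercises the TCP/IP stack but not the NIC or its driver. For a two-system test the receiving half would run on the other machine.

```python
import socket
import threading
import time

CHUNK = 64 * 1024          # bytes per send call (arbitrary choice)
TOTAL = 16 * 1024 * 1024   # bytes to transfer in one run (arbitrary choice)


def _sink(server, result):
    """Accept one connection and count bytes until the sender closes."""
    conn, _ = server.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
    result["received"] = received


def measure_tcp(host="127.0.0.1", total=TOTAL):
    """Send `total` bytes from RAM to RAM over TCP; return (bytes, seconds)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))   # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = bytes(CHUNK)   # zero-filled buffer, never touches the disk
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sender:
        sent = 0
        while sent < total:
            n = min(CHUNK, total - sent)
            sender.sendall(payload[:n])
            sent += n
    receiver.join()          # wait until every byte has been received
    elapsed = time.perf_counter() - start
    server.close()
    return result["received"], elapsed


if __name__ == "__main__":
    nbytes, secs = measure_tcp()
    print(f"{nbytes} bytes in {secs:.2f} s = {nbytes * 8 / secs / 1e6:.1f} Mb/s")
```

The clock stops only after the receiver has seen the sender close the connection, so the figure covers delivery and not just how quickly data was handed to the local socket buffer.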
Breace.
Hmmm...
When I read the slogan under the Slashdot logo at the top of this page it says "News for Nerds. Stuff that Matters". I don't see "LINUX ONLY" stamped anywhere up there. I don't see anything wrong with the posters giving us articles concerning platforms or OSes other than Linux and the hardware it runs on. In fact, even though I'm quite a Linux supporter, I'm sick of the people who think this must be a Linux-only site. My understanding is that Linux is on the minds of the nerds at this point in history, so we get a lot of stories about Linux (if I'm wrong here, well, sorry... somebody should make that clear). That's cool, but why are the posters slammed when they post an article about something other than Linux? That's not cool, or uncool, or whatever.
bah
Unless your file is very large, writing is easier than reading, because buffering can cover the seeking latency. The dirty blocks sit in main memory for a while, then when the bus has a chance they sit in the controller cache, then in the disk's cache, and then when the head is in the right spot they finally get committed to media.
Also, with a heavily used disk running a good filesystem, large writes are more likely to be contiguous on disk than reads.
When I'm testing TCP networking performance, I give the web server a file that is just a hole (length huge, size just a few indirect blocks) and use a command line http client. Works nicely:
# httpget -v http://.../~ambrose/hole -o /dev/null
67108864/67108864 bytes transferred in 8.26 seconds (7930.47 kB/sec).
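In case it helps, a small sketch of one way to make such a hole file. The function name `make_hole` and the path are assumptions; the default length matches the 64 MB transfer above. Whether the result really is a hole depends on the filesystem: most Unix filesystems leave the range unallocated, others may fill it with zeros on disk.

```python
import os


def make_hole(path, length=64 * 1024 * 1024):
    """Create a file of `length` bytes without writing any data to it."""
    with open(path, "wb") as f:
        f.truncate(length)   # extends the file; no data blocks are written
    return os.stat(path)


if __name__ == "__main__":
    st = make_hole("hole")
    # st_blocks (512-byte units actually allocated) only exists on Unix
    blocks = getattr(st, "st_blocks", None)
    print(f"length: {st.st_size} bytes")
    if blocks is not None:
        print(f"allocated on disk: {blocks * 512} bytes")
```

Reads from the hole come back as zeros without any disk seeks, so the transfer rate reflects the network path and not the drive.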
Java: the COBOL of the new millennium.