Intel Develops Hardware To Enhance TCP/IP Stacks

Posted by timothy on Sunday February 20, 2005 @08:01PM from the sure-that's-not-a-little-overspecific? dept.

RyuuzakiTetsuya writes "The Register is reporting that Intel is developing I/OAT, or I/O Acceleration Technology, which allows the CPU, the mobo chipset and the ethernet controller to help deal with TCP/IP overhead."

17 of 271 comments

Re:White elephant? by Uhlek · 2005-02-20 20:13 · Score: 2, Informative

That all depends on how it's done. Simply offloading the processing won't work, but replacing the TCP/IP drivers with simple hooks into a hardware-based I/O system can.
Qlogic TOE cards by jsimon12 · 2005-02-20 20:16 · Score: 5, Informative

Uh, this isn't new. Qlogic has been doing it for some time now in their TOE (TCP Offload Engine) cards. The cards are smoking, especially on Solaris, because Sun's TCP stack is crappy.
1. Re:Qlogic TOE cards by incubuz1980 · 2005-02-20 21:16 · Score: 2, Informative
  
  The Solaris TCP/IP stack has been greatly improved in Solaris 10. There really is a BIG difference compared to older versions of Solaris.
Re:White elephant? by Toby+The+Economist · 2005-02-20 20:23 · Score: 5, Informative

You must be implying that the hardware implementation will be faster than the main CPU, which it almost certainly won't be: if you've just spent 300 USD on your P4 CPU, why would you spend the same amount again, or more, just on your network subsystem?

Also remember that a well-implemented TCP/IP stack runs at about 90% of the speed of a memcpy() (Tanenbaum's book again).

For hardware TCP/IP processing to be useful, you need to be, say, 2x the speed of the CPU's memcpy() function!

Given that the main performance bottleneck is memory access, since you're basically copying buffers around and so caching isn't going to help you, I don't see how any sort of super-duper hardware is going to give you anything like a 2x speed up, let alone at an economic price.

--
Toby
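
The 90%-of-memcpy() figure above can be turned into a rough upper bound on what offload could buy. This is only a back-of-the-envelope sketch: it assumes (my assumption, not the poster's) that the stack's per-byte cost is one memory copy plus protocol work, and that offload removes the protocol work but still leaves one copy. The function name is hypothetical.

```python
def max_speedup_if_copy_remains(stack_fraction_of_memcpy: float) -> float:
    """Upper bound on speedup when protocol work drops to zero but one copy stays.

    stack_fraction_of_memcpy is the stack's throughput relative to memcpy(),
    e.g. 0.9 for the "about 90%" figure quoted in the comment.
    """
    if not 0.0 < stack_fraction_of_memcpy <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Normalise time per byte: memcpy() costs 1.0, the full stack costs 1/fraction.
    copy_time = 1.0
    stack_time = 1.0 / stack_fraction_of_memcpy
    # Best case: all protocol time disappears, the copy time does not.
    return stack_time / copy_time


if __name__ == "__main__":
    print(f"best-case speedup: {max_speedup_if_copy_remains(0.9):.2f}x")
```

Under those assumptions the ceiling is roughly 1.1x, which is why the argument hinges on memory copies rather than on protocol processing.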
Re:White elephant? by Toby+The+Economist · 2005-02-20 20:27 · Score: 2, Informative

Any given thread which needs network I/O cannot continue until that I/O is complete. The fact the CPU can switch elsewhere makes no difference to the thread which requires the network packet to be processed before it has the information it requires to continue, and if that processing is offloaded to a slower network processor, the performance of that thread is degraded.

--
Toby
Re:Good stuff! by RatRagout · 2005-02-20 20:33 · Score: 5, Informative

Yes. Checksum was one of the problems. The other problem is the memory-to-memory copying of data due to the semantics of the tcp/udp send() call. These semantics require that the data present in the memory location at the time send() is called is the data that gets sent. If the application changes the data directly after the send() call, this should not affect what is sent. This means that the OS has to copy the data into kernel memory, and then at some later time copy it onto the NIC. This memory-to-memory copying becomes a severe problem when traffic and bandwidth increase.
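
The send() semantics described above can be observed from user space. The sketch below is an illustration only: it uses Python's socket.socketpair() (a local stream socket pair, not a real TCP connection over a NIC) and a hypothetical helper name, and it keeps the payload small so the send cannot block.

```python
import socket


def send_then_overwrite(payload: bytes) -> bytes:
    """Send a buffer, scribble over it immediately, and return what the peer receives."""
    sender, receiver = socket.socketpair()
    try:
        buf = bytearray(payload)
        sender.sendall(buf)          # the kernel takes its own copy here
        for i in range(len(buf)):    # overwrite the application buffer right away
            buf[i] = 0
        sender.shutdown(socket.SHUT_WR)

        received = bytearray()
        while True:
            chunk = receiver.recv(4096)
            if not chunk:            # peer closed its write side
                break
            received += chunk
        return bytes(received)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(send_then_overwrite(b"data as it was when send() was called"))
```

The receiver still sees the original bytes, which is exactly the guarantee that forces the extra copy into kernel memory.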
Lots of people agree, including AC and DM by Anonymous Coward · 2005-02-20 20:37 · Score: 4, Informative

AC being Alan Cox, DM being Dave Miller.

Read Alan's opinion here.

Read Dave's opinion here.

There has been discussion of this specific Intel announcement here.
Re:Good stuff! by kernelistic · 2005-02-20 20:39 · Score: 5, Informative

There have been multiple fixes to address the inefficiencies of the original design of the BSD TCP/IP stack.

FreeBSD, for example, has a kernel option called ZERO_COPY_SOCKETS, which dramatically increases network throughput of syscalls such as sendfile(2). With this option enabled, as the name implies, data is no longer copied from userland to kernel space and then passed on to the network card's ring buffers. It is copied in one fell swoop!
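
For readers who have not used sendfile-style I/O, here is a minimal sketch of the idea from user space. It is not the FreeBSD kernel option itself: it uses Python's socket.sendfile(), which calls os.sendfile() where the platform supports it and otherwise falls back to ordinary send(). The helper name, the temporary file, and the local socket pair are all assumptions made for illustration.

```python
import socket
import tempfile


def send_file_over_socket(payload: bytes):
    """Write payload to a temp file, push it through a socket, return (bytes_sent, data_received)."""
    sender, receiver = socket.socketpair()
    try:
        with tempfile.TemporaryFile() as f:
            f.write(payload)
            f.flush()
            f.seek(0)
            # The file contents go to the socket without passing through
            # a user-space buffer when the OS-level sendfile path is used.
            sent = sender.sendfile(f)
        sender.shutdown(socket.SHUT_WR)

        received = bytearray()
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, bytes(received)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    sent, data = send_file_over_socket(b"x" * 1024)
    print(sent, len(data))
```

The payload is kept small here so the sender never blocks waiting for the receiver to drain the socket buffer.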
Re:Ethernet controllers by Anonymous Coward · 2005-02-20 20:41 · Score: 2, Informative

You got the PCI bandwidth correct, but your gigabit bandwidth is a hair off. Depending on how you define "giga" (base 10 or base 2), you get the following numbers:

a) Gigabit/sec = 1000 Mbit/sec = 125MByte/sec
b) Gigabit/sec = 1024 Mbit/sec = 128MByte/sec

True, even these speeds don't completely saturate the PCI bus. Because of how the PCI bus is shared (each device gets a few clock cycles to do its thing before passing control to the next device), no single device could anyway unless it's the ONLY thing on the PCI bus. It certainly will saturate the bus (or come dang close to it) during its moment of control, though.
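
The arithmetic behind those figures is easy to check. In the sketch below, the gigabit numbers come straight from the comment; the PCI figures (a 32-bit bus clocked at about 33.33 MHz) are my assumption about which bus the parent post meant, and the result is a theoretical peak, not achievable throughput. Function names are hypothetical.

```python
def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert a line rate in Mbit/s to MByte/s (8 bits per byte)."""
    return mbit_per_s / 8


def parallel_bus_peak_mbyte_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus moving one word per clock."""
    return width_bits / 8 * clock_mhz


if __name__ == "__main__":
    print("base-10 gigabit:", mbit_to_mbyte_per_s(1000), "MByte/s")
    print("base-2 gigabit: ", mbit_to_mbyte_per_s(1024), "MByte/s")
    # Assumed bus: classic 32-bit PCI at ~33.33 MHz.
    print("32-bit/33 MHz PCI peak:",
          round(parallel_bus_peak_mbyte_per_s(32, 33.33), 1), "MByte/s")
```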
Old news by obeythefist · 2005-02-20 20:56 · Score: 4, Informative

Intel has been wanting to do this for years! I remember reading old articles on The Register about it, and how they were pulling back because Microsoft didn't like the idea of Intel taking away things that Microsoft were running with their software, including things like managing networking instead of having the OS do it.

Of course it couldn't last. With nVidia doing firewalls and NICs and all sorts of other things, Intel is a big company and knows when it needs to compete. MS has also lost a bit of its clout when it comes to pressuring the bigger companies (Intel, HP, Dell).

--
I am government man, come from the government. The government has sent me. -- G.I.R.
Re:White elephant? by Toby+The+Economist · 2005-02-20 21:04 · Score: 4, Informative

You can accelerate graphics to a very large degree because the problem is very subject to parallelism.

You cannot accelerate networking very much because the problem is highly serial.

It is improper to compare the two because they are fundamentally different problems.

You can throw tons of hardware at 3D graphics and get good results, because just by having more and more pipelines, you go faster and faster.

Processing a network packet is quite different; the data goes through a series of serial steps and eventually reaches the application layer. The only way you can really make it go faster is to up the clock rate, and you find it's uneconomic to try to beat the main CPU, which, remember, has *already* been paid for. You have all that CPU for free; spending the kind of money you'd need to outpace the CPU makes no sense, let alone the money you'd need to outpace it by a decent margin.

--
Toby
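
One common way to put numbers on the serial-versus-parallel argument above is Amdahl's law. The comment does not name it, so treat this as a swapped-in framing rather than the poster's own model; the parallel fractions used below are made-up illustrative values, not measurements of any graphics or network workload.

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Speedup from spreading the parallel part of a job across `units` pipelines."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


if __name__ == "__main__":
    # Hypothetical fractions, chosen only to show the shape of the curve.
    for label, fraction in (("mostly parallel", 0.95), ("mostly serial", 0.10)):
        print(label, [round(amdahl_speedup(fraction, n), 2) for n in (1, 4, 16, 64)])
```

With a mostly serial workload, adding more pipelines barely moves the result, which is the point being made about per-packet processing.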
Re:White elephant? by Uhlek · 2005-02-20 21:05 · Score: 5, Informative

Hardware implementation will most definitely be leaps and bounds faster than the general CPU. Can a Linux router route 720Gbps of traffic through hundreds of interfaces at once? No. But a Cisco 6500 can, because of hardware designed especially for the task.

Simply put, software on general-purpose processors sucks for doing heavy computational work. Hardware tuned especially for a task has always been, and always will be, where it's at. However, the costs involved in creating ICs specific to a task usually mean that ASICs are only created where there is a need. Modern graphics cards are a great example. The on-board graphics processors are designed especially to create graphics, something that, if offloaded onto the GP CPU, would crush even the highest of the high end.

Also, offloading the TCP/IP stack on a normal workstation probably isn't going to be a huge performance boost. Where this will be useful is in situations where there is a need for high-throughput, low-latency network I/O processing.
Re:Security updates by TheRagingTowel · 2005-02-20 21:13 · Score: 2, Informative

Flash memory. It's been done all the time.

--
4Z5TX
Re:Ethernet controllers by jpc · 2005-02-20 23:25 · Score: 2, Informative

gigabit is full duplex - double your figures.

But new motherboards are already starting to come with gigabit attached to PCI Express. For the last few years any decent board has had them on fast PCI-X, at least 64 bit 66 MHz.
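
A quick check of the "double your figures" point against the bus mentioned above. This is a sketch under stated assumptions: the line rate is taken as 1000 Mbit/s in each direction, the bus is modelled as moving one 64-bit word per clock at a nominal 66 MHz, and the result is a theoretical peak that ignores bus sharing and protocol overhead. Function names are hypothetical.

```python
def full_duplex_mbyte_per_s(line_rate_mbit: float) -> float:
    """Combined send plus receive rate of a full-duplex link, in MByte/s."""
    return 2 * line_rate_mbit / 8


def bus_peak_mbyte_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus moving one word per clock."""
    return width_bits / 8 * clock_mhz


def bus_headroom(line_rate_mbit: float, width_bits: int, clock_mhz: float) -> float:
    """How many times over the bus peak covers the full-duplex link rate."""
    return bus_peak_mbyte_per_s(width_bits, clock_mhz) / full_duplex_mbyte_per_s(line_rate_mbit)


if __name__ == "__main__":
    print("full-duplex gigabit:", full_duplex_mbyte_per_s(1000), "MByte/s")
    print("64-bit/66 MHz bus peak:", bus_peak_mbyte_per_s(64, 66), "MByte/s")
    print("headroom:", round(bus_headroom(1000, 64, 66), 2), "x")
```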
Re:nvidia by Glock27 · 2005-02-21 02:08 · Score: 3, Informative

Isn't Nvidia doing the same with its new nForce series motherboards? Lowering CPU usage by adding network management code and an SPI firewall inside the chipset?
Yes. The nForce4 chipsets offload most TCP/IP processing and firewall from the main CPU.
If you go with an Athlon64 Socket 939 nForce4 board, you get PCI Express, lower power consumption, a ton of great features, good Linux support, and plug-compatible dual core upgrades down the road. Intel's offerings just seem anemic by comparison.
(Personally, I'd also do an NVIDIA graphics board for the excellent Linux driver support. And no, I don't work for NVIDIA, I'm just a satisfied customer.)

--
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait
Re:White elephant? by sconeu · 2005-02-21 04:03 · Score: 3, Informative

Bullshit.

I used to work at a company that did Fibre Channel.
One of the things we had was an ASIC that did network processing in hardware, allowing us to do all sorts of interesting stuff at wire speed (2Gbps). If we had to load into memory we would have been at least an order of magnitude slower.

--
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
Re:side effects? by Fweeky · 2005-02-21 08:41 · Score: 3, Informative

From FreeBSD's zero_copy(9) manpage:
For sending data, there are no special requirements or capabilities that the sending NIC must have. The data written to the socket, though, must be at least a page in size and page aligned in order to be mapped into the kernel. If it does not meet the page size and alignment constraints, it will be copied into the kernel, as is normally the case with socket I/O.
The user should be careful not to overwrite buffers that have been written to the socket before the data has been freed by the kernel, and the copy-on-write mapping cleared. If a buffer is overwritten before it has been given up by the kernel, the data will be copied, and no savings in CPU utilization and memory bandwidth utilization will be realized.
It also mentions some issues with regard to zero-copy receive, which requires help from the NIC to ensure received packet payloads are also page-aligned and >= page size. Such support is predictably very rare.
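
The send-side constraints quoted above (at least a page in size, and page aligned) can be checked from user space. The sketch below is an illustration, not part of any FreeBSD API: the helper names are hypothetical, and it relies on an anonymous mmap being page aligned, which is how such mappings are normally handed out.

```python
import ctypes
import mmap

PAGE_SIZE = mmap.PAGESIZE


def meets_page_constraints(address: int, length: int, page_size: int = PAGE_SIZE) -> bool:
    """True if a buffer is at least one page long and starts on a page boundary."""
    return length >= page_size and address % page_size == 0


def buffer_address(buf) -> int:
    """Address of the first byte of a writable buffer (mmap, bytearray, ...)."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def demo() -> bool:
    """Allocate two pages with an anonymous mmap and test them against the constraints."""
    region = mmap.mmap(-1, 2 * PAGE_SIZE)
    try:
        return meets_page_constraints(buffer_address(region), len(region))
    finally:
        region.close()


if __name__ == "__main__":
    print("anonymous mmap qualifies:", demo())
    # A buffer that is too short fails regardless of alignment.
    print("100-byte buffer qualifies:", meets_page_constraints(PAGE_SIZE, 100))
```

A buffer from an ordinary allocator will usually fail the alignment test, in which case the manpage says the data is simply copied as with normal socket I/O.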