Slashdot Mirror


Intel Develops Hardware To Enhance TCP/IP Stacks

RyuuzakiTetsuya writes "The Register is reporting that Intel is developing I/OAT, or I/O Acceleration Technology, which allows the CPU, the mobo chipset and the ethernet controller to help deal with TCP/IP overhead."

16 of 271 comments

  1. Good stuff! by kernelistic · · Score: 5, Interesting

    First checksum offloading, now this... It is nice to see that hardware vendors are realizing that 10Gbit/s+ speeds aren't currently realistic without extra forms of computation support from the underlying network interface hardware.

    This is Good News.
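
    For readers wondering what work checksum offloading actually removes from the CPU, here is a minimal sketch of the 16-bit ones'-complement Internet checksum described in RFC 1071. The function name is made up for illustration, and a real stack would use a much more optimized routine (or, with offload, the NIC).

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum used by IP, TCP and UDP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF


# Example words taken from RFC 1071: 0001 f203 f4f5 f6f7
sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))
```

    Every payload byte has to pass through this loop (or its equivalent) on each send and receive, which is why the per-byte cost matters at high line rates.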

    1. Re:Good stuff! by RatRagout · · Score: 5, Informative

      Yes. Checksum was one of the problems. The other problem is the memory-to-memory copying of data due to the semantics of the TCP/UDP send() call. These semantics require that the data existing in the memory location at the time send() is called is the data to be sent. If the application changes the data directly after the send() call, this should not affect what is sent. This means that the OS has to copy the data into kernel memory, and then at some later time copy it onto the NIC. This memory-to-memory copying becomes a severe problem when the traffic and bandwidth increase.
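
      The send() semantics described above can be observed from user space. This is only a sketch using Python's socket.socketpair(); the helper name is hypothetical, and the payload has to be small enough to fit in the socket buffer because sender and receiver run in the same thread.

```python
import socket


def send_then_mutate(payload: bytes) -> bytes:
    """Send a buffer, overwrite it right after send returns, and return what arrives."""
    sender, receiver = socket.socketpair()
    try:
        buf = bytearray(payload)
        sender.sendall(buf)              # the kernel takes its own copy of buf here
        buf[:] = b"X" * len(buf)         # the application scribbles over its buffer
        sender.shutdown(socket.SHUT_WR)  # signal end of stream to the receiver
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()


print(send_then_mutate(b"original data"))
```

      The receiver sees the original bytes rather than the overwritten ones, which is exactly the guarantee that forces the extra copy into kernel memory.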

    2. Re:Good stuff! by kernelistic · · Score: 5, Informative

      There have been multiple fixes to address the inefficiencies of the original design of the BSD TCP/IP stack.

      FreeBSD, for example, has a kernel option called ZERO_COPY_SOCKETS, which dramatically increases network throughput of syscalls such as sendfile(2). With this option enabled, as the name implies, data is no longer copied from userland to kernel space and then passed on to the network card's ring buffers. It is copied in one fell swoop!
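
      As a rough illustration of the sendfile(2) style of interface (not of the FreeBSD kernel option itself), Python exposes socket.sendfile(), which uses os.sendfile() where the platform provides it and falls back to plain send() otherwise. The helper name and the temporary file are assumptions made for this sketch.

```python
import socket
import tempfile


def send_file_over_socket(data: bytes) -> bytes:
    """Write data to a temp file, push it through a socket with sendfile, return what arrives."""
    sender, receiver = socket.socketpair()
    try:
        with tempfile.TemporaryFile() as f:  # opened in binary mode by default
            f.write(data)
            f.flush()
            f.seek(0)
            # The file contents go to the socket without the application
            # reading them into a userland buffer first.
            sender.sendfile(f)
        sender.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()


print(len(send_file_over_socket(b"x" * 1024)))
```

      The point of the interface is that the application never touches the payload; how many copies the kernel still makes underneath depends on the OS and the driver.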

  2. finally... by N5 · · Score: 5, Funny

    intel is working on something worthwhile: a cure for the common slashdot-ing

    and they say the drug companies are miracle workers ;)

    --
    John 3:16 - The easiest way to a BETTER YOU.
  3. White elephant? by Toby+The+Economist · · Score: 5, Interesting

    I think in Tanenbaum's book there's a reference which states that offloading network processing normally isn't useful, because the CPU that work is offloaded to is always less powerful than the main CPU and the main CPU is normally blocked in its task until the network processing has completed.

    --
    Toby

    1. Re:White elephant? by Toby+The+Economist · · Score: 5, Informative

      You must be implying that the hardware implementation will be faster than the main CPU, which it almost certainly won't be, because if you've just spent 300 USD on your P4 CPU, what are you doing spending the same amount again - or more - just on your network subsystem?

      Also remember that a well implemented TCP/IP stack runs at about 90% of the speed of a memcpy() (Tanenbaum's book again).

      For hardware TCP/IP processing to be useful, you need to be, say, 2x the speed of the CPU's memcpy() function!

      Given that the main performance bottleneck is memory access, since you're basically copying buffers around and so caching isn't going to help you, I don't see how any sort of super-duper hardware is going to give you anything like a 2x speed up, let alone at an economic price.

      --
      Toby

    2. Re:White elephant? by Jeff+DeMaagd · · Score: 5, Interesting

      Graphics and networking are two very different things. Networking isn't compute intensive, it is I/O intensive. I don't think the Intel hardware network offload is for much more than basic computation.

      Besides, GPUs are more powerful than CPUs at the task of rendering polygons.

      Very often ASICs are better at a task than general purpose CPUs, just that considerations must be made as to whether the performance gain is worth the cost difference.

    3. Re:White elephant? by Uhlek · · Score: 5, Informative

      Hardware implementation will most definitely be leaps and bounds faster than the general CPU. Can a Linux router route 720Gbps of traffic through hundreds of interfaces at once? No. But a Cisco 6500 can, because of hardware designed especially for the task.

      Simply put, software on general purpose processors sucks for doing heavy computational work. Hardware tuned especially for a task is, and always will be, where it's at. However, the costs involved in creating ICs specific to a task usually mean that ASICs are only created where there is a need. Modern graphics cards are a great example. The on-board graphics processors are designed especially to create graphics, something that, if offloaded onto the GP CPU, would crush even the highest of the high end.

      Also, offloading the TCP/IP stack on a normal workstation probably isn't going to be a huge performance boost. Where this will be useful is in situations where there is a need for high-throughput, low-latency network I/O processing.

  4. nvidia by Ecio · · Score: 5, Interesting

    Isn't Nvidia doing the same with its new nForce series motherboards? Lowering CPU usage by adding network management code and an SPI firewall inside the chipset?

  5. Qlogic TOE cards by jsimon12 · · Score: 5, Informative

    Uh, this isn't new, Qlogic has been doing it for some time now, in their TOE cards (TCP Offload Engine). The cards are smoking, especially on Solaris, 'cause Sun's TCP stack is crappy.

  6. Re:Ethernet controllers by afidel · · Score: 5, Insightful

    No, a gigabit adapter can't saturate a PCI bus by itself: 32bit 33MHz PCI is 133MB/s, and gigabit is roughly 100MB/s of payload (125MB/s raw) in one direction. Then there is 32bit 66MHz PCI, and if you want you could run a 32bit card at 133MHz, as the standard supports it (though I've never heard of such a card; if you need 133MHz you generally also need 64bit, but I assume an ADC could use the faster speed without needing the wider word size). The fastest current implementation of the slot local bus is 16-lane PCI-Express, which could handle 4 10-gigabit adapters. The problem would be coming up with enough data to keep those pipes full: no disk subsystem is fast enough, and any meaningful SQL transactions are going to be CPU limited on even the biggest of servers, so why would you need a bus with more bandwidth than that? Add to this the fact that servers which actually need more throughput have long had the faster PCI slots, and you realize that it's not a problem in the real world.
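
    The bus figures quoted above follow from width times clock. A small sketch of that arithmetic, with hypothetical names, counting only theoretical peak rates and ignoring arbitration and protocol overhead; gigabit Ethernet's raw line rate works out to 125MB/s per direction, and the ~100MB/s figure is a rounded payload estimate:

```python
def parallel_bus_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak transfer rate of a parallel bus moving one word per clock, in MB/s."""
    return width_bits / 8 * clock_mhz


GIGABIT_RAW_MB_PER_S = 1000 / 8  # 1000 Mbit/s line rate, one direction

pci_32_33 = parallel_bus_mb_per_s(32, 33.33)      # about 133 MB/s
pci_32_66 = parallel_bus_mb_per_s(32, 66.66)      # about 267 MB/s
pcix_64_133 = parallel_bus_mb_per_s(64, 133.33)   # about 1067 MB/s

for name, rate in [("32bit/33MHz", pci_32_33),
                   ("32bit/66MHz", pci_32_66),
                   ("64bit/133MHz", pcix_64_133)]:
    ratio = rate / GIGABIT_RAW_MB_PER_S
    print(f"{name}: {rate:.0f} MB/s, {ratio:.2f}x one-way gigabit")
```

    On these peak numbers a single one-way gigabit stream fits inside plain 32bit/33MHz PCI, but with very little headroom for anything else sharing the bus.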

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  7. yeah great by Anonymous Coward · · Score: 5, Funny

    soon it will be dedicated processor and RAM to deal with tcp, then a dedicated processor for the keyboard input, then a dedicated processor for the fans and a special dedicated processor on 12" PCI-X card for the extremely computationally intensive MOUSE, actually this will have its own special dedicated path called 'AMP' or Accelerated Mouse Port. Mice of the future will need much more bandwidth than today. About 16 GB i/o so they need their own data paths.

    And then there will be other enhancements like the tcp/ip one.

    For instance a special accelerator card for Word and Internet Explorer will be developed.

    Furious Linux users will demand their own technology, so one manufacturer will come up with a special card for running GNOME apps. This card will have 4 dual core 6 GHz processors and allow Gnome to run at normal speeds.

  8. Re:A good thing by Quobobo · · Score: 5, Funny

    Newly discovered, a simple and easy karma-gaining method! Amaze your friends, and become more eligible to moderate!

    1. Refresh your browser constantly until there's a new story on Slashdot, to post before everyone else.

    2. Post something similar to "This is good/bad, for INSERT_OBVIOUS_REASON_HERE. And fuck the INSERT_RIAA-LIKE_ORGANIZATION_HERE." (second sentence is optional)

  9. So finally! by Trogre · · Score: 5, Funny

    buying Intel really will make the internet go faster!

    --
    "Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
  10. And the integrated DRM? by tjlsmith · · Score: 5, Interesting

    And how much DRM are they going to build onto the motherboard, just in passing?

    Don't think for a minute the big boys aren't trying to take the Internet away from us. They missed the opportunity once, never twice.

    --
    Mumia Abu-Jamal is *laughably guilty*. Check the evidence.
  11. Re:Ethernet controllers by Matt_Bennett · · Score: 5, Insightful

    The critical aspect you leave out is that Gigabit ethernet is (inherently) Full Duplex. That means that a 32/33 PCI bus would be saturated at a gigabit out, but have no bandwidth for anything incoming.

    In truth, a gigabit ethernet card can saturate a 1X PCI-E link (2Gb/s after the 8B/10B encoding is removed) when sending small packets, basically due to packet overhead.
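
    A quick sketch of the numbers behind this, with hypothetical names. It assumes a first-generation PCI-E lane at 2.5 GT/s and standard Ethernet framing (64-byte minimum frame, 8-byte preamble, 12-byte inter-frame gap). The per-packet cost on the bus itself (descriptors, transaction headers) is not modeled, so this only shows how many small frames per second the card has to move.

```python
GIGABIT_BPS = 1_000_000_000


def pcie_gen1_data_bps(lanes: int) -> int:
    """Usable bits per second, per direction, after 8B/10B encoding is removed."""
    return lanes * 2_500_000_000 * 8 // 10


def min_size_frames_per_second(line_bps: int = GIGABIT_BPS) -> float:
    """Frames per second at line rate when every frame is the 64-byte minimum."""
    wire_bytes = 64 + 8 + 12  # frame + preamble/SFD + inter-frame gap
    return line_bps / (wire_bytes * 8)


full_duplex_gigabit_mb_s = 2 * GIGABIT_BPS / 8 / 1e6  # both directions at once
pci_32_33_mb_s = 32 / 8 * 33.33                       # shared by both directions

print(f"full-duplex gigabit: {full_duplex_gigabit_mb_s:.0f} MB/s "
      f"vs 32/33 PCI: {pci_32_33_mb_s:.0f} MB/s")
print(f"x1 PCI-E after 8B/10B: {pcie_gen1_data_bps(1) / 1e9:.0f} Gb/s per direction")
print(f"minimum-size frames per second: {min_size_frames_per_second():.0f}")
```

    Full-duplex gigabit asks for roughly twice what a 32/33 PCI bus can carry, and at close to 1.5 million small frames per second even a modest fixed overhead per frame on the bus adds up quickly.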