Slashdot Mirror


Intel Develops Hardware To Enhance TCP/IP Stacks

RyuuzakiTetsuya writes "The Register is reporting that Intel is developing I/OAT, or I/O Acceleration Technology, which allows the CPU, the mobo chipset and the ethernet controller to help deal with TCP/IP overhead."

7 of 271 comments

  1. Interesting by miyako · · Score: 4, Insightful

    This seems interesting, though given Intel's track record I wonder if it will really be as useful as they are speculating, as the article has no real technical information.
    Granted, I've never administered a server that was under anywhere remotely near the types of loads we are talking about for this to be useful, but I have a hard time imagining that dealing with the TCP/IP stack would be more intensive than running applications (as the article claims).
    So, for all you people out there much more qualified to discuss this than I am: will having some part of the processor dedicated to handling TCP/IP really speed things up, or is this primarily a marketing technology?

    --
    Famous Last Words: "hmm...wikipedia says it's edible"
  2. Re:Ethernet controllers by afidel · · Score: 5, Insightful

    No, a gigabit adapter can't saturate a PCI bus by itself: 32bit 33MHz PCI is 133MB/s, while gigabit is about 125MB/s raw. Then there is 32bit 66MHz PCI, and if you want you could run a 32bit card at 133MHz, as the standard supports it (though I've never heard of such a card; if you need 133MHz you generally also need 64bit, but I assume an ADC could use the faster speed without needing the wider word size). The fastest current implementation of the slot local bus is 16 channel PCI-express, which could handle 4 10gigabit adapters. The problem would be coming up with enough data to keep those pipes full: no disk subsystem is fast enough, and any meaningful SQL transactions are going to be CPU limited on even the biggest of servers, so why would you need a bus with more bandwidth than that? Add to this the fact that servers which actually need more throughput have long had the faster PCI slots, and you realize that it's not a problem in the real world.
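
    The arithmetic behind these figures can be sketched in a few lines of Python. The function names are illustrative, not from the post, and the results are theoretical peaks only: bus width times clock for parallel PCI, and line rate divided by eight for one direction of Ethernet. Real throughput is lower on both sides.

```python
# Back-of-the-envelope peak bandwidth figures (theoretical, no protocol overhead).

def pci_peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak parallel-PCI bandwidth in MB/s: one transfer of width_bits per clock."""
    return width_bits / 8 * clock_mhz


def ethernet_peak_mb_per_s(line_rate_gbps: float) -> float:
    """Raw Ethernet line rate in MB/s, counting one direction only."""
    return line_rate_gbps * 1000 / 8


if __name__ == "__main__":
    gige = ethernet_peak_mb_per_s(1)
    # Width/clock combinations mentioned in the comment, plus 64-bit variants.
    for width, clock in [(32, 33.33), (32, 66.66), (64, 66.66), (64, 133.33)]:
        bus = pci_peak_mb_per_s(width, clock)
        print(f"{width}-bit {clock:.0f}MHz PCI: {bus:.0f} MB/s "
              f"({bus / gige:.2f}x one direction of gigabit)")
```

    On these numbers plain 32bit 33MHz PCI has only a few percent of headroom over one direction of gigabit traffic, and the bus is shared with every other device on it.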

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  3. Re:the good, the bad, the ugly? by pc486 · · Score: 3, Insightful

    I can't believe the parent got modded up. This kind of thing has been done before (RTFA. Yeah yeah, I know. I must be new here...). It's called TOE (TCP Offload Engine) and many networking companies have done TOE. However, most cards are expensive and don't have much support across platforms.

    What's new here is that Intel wants to put this in their chipsets everywhere and not just in $700+ NICs. This has already been happening with checksum offloading, TCP segmentation, smart interrupts, and so on in most GigE chips.
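
    To give a feel for what checksum offloading takes off the CPU, here is a minimal sketch of the 16-bit ones'-complement Internet checksum described in RFC 1071. The function name is illustrative, and the sketch is simplified: a real TCP checksum also covers a pseudo-header built from the IP addresses, which is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")  # sample bytes from RFC 1071
    csum = internet_checksum(payload)
    print(f"checksum = {csum:#06x}")
    # A receiver that sums the data together with the checksum gets zero.
    print(internet_checksum(payload + csum.to_bytes(2, "big")) == 0)
```

    Without offload the host touches every byte of every segment in a loop like this one; with offload the NIC computes the sum as the data streams through it.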

    So yes, people have done this before and have been since at least 2000.

    As far as DRM is concerned, look at the NIC market and look at the TCP/IP spec. TCP/IP? It's a standard, and anything non-standard won't work with the stuff that's out there. Weird NICs? I've been getting Linux source-code drivers for even the cheapest of cheap NICs for years now. There's too much competition to sneak in something restrictive.

  4. And the CPU doesn't have other things to do? by Moderation+abuser · · Score: 3, Insightful

    My boxes all run tens to hundreds of processes for tens to hundreds of people. Offloading the processing to a networking subsystem isn't going to hurt, especially with gig and 10gig.

    Not that this is a new idea. It's been done for donkey's years.

    --
    Government of the people, by corporate executives, for corporate profits.
  5. Re:White elephant? by Uhlek · · Score: 4, Insightful

    Comparing the two is completely valid when you're discussing the benefits of task-customized hardware and general purpose computing. Are there limits on where a hardware-based TCP/IP stack will be useful in the desktop/server market? Yes, of course there are. But for high-bandwidth applications, I can assure you that offloading the TCP/IP overhead onto an ASIC will not only give you better performance, but also free up primary processor time for other applications.

    Also, Catalyst switches are not highly parallel. They can be parallel, depending on the exact model and configuration, as well as the exact path inside the switch that the traffic takes, but it's not even remotely the same in execution as having "hundreds of linux routers side by side."

    Instead, it is the exacting way in which the various components of the switch pass data, and the very specific purpose of each chip and circuit in the device, that gives modern routers their speed. They rely on special components such as content-addressable memory, ternary content-addressable memory (memory that lets you store 0s, 1s, and wildcard values instead of just 0s and 1s, allowing wire-speed match comparisons against ACLs and routing tables), and so on. It isn't merely a stack of GP CPUs all running in parallel to achieve a particular task.
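
    The ternary matching described above can be modelled in software as a value/mask pair per entry. This is only a sketch of the matching rule: the class and function names are hypothetical, the table contents are made up for illustration, and a real TCAM compares the key against every entry in parallel in hardware rather than looping.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TernaryEntry:
    value: int   # bits to compare wherever the mask bit is 1
    mask: int    # 1 = bit must match, 0 = wildcard ("don't care")
    action: str

    def matches(self, key: int) -> bool:
        return (key & self.mask) == (self.value & self.mask)


def tcam_lookup(table: Sequence[TernaryEntry], key: int) -> Optional[str]:
    """Return the action of the first matching entry (lowest index wins)."""
    for entry in table:
        if entry.matches(key):
            return entry.action
    return None


def prefix_entry(cidr: str, action: str) -> TernaryEntry:
    """Build an entry from an IPv4 prefix; host bits become wildcards."""
    net = ipaddress.ip_network(cidr)
    return TernaryEntry(int(net.network_address), int(net.netmask), action)


# Illustrative routing table, most specific prefix first.
table = [
    prefix_entry("10.1.2.0/24", "port 3"),
    prefix_entry("10.0.0.0/8", "port 1"),
    prefix_entry("0.0.0.0/0", "default route"),
]

if __name__ == "__main__":
    for addr in ("10.1.2.7", "10.9.9.9", "192.168.0.1"):
        print(addr, "->", tcam_lookup(table, int(ipaddress.ip_address(addr))))
```

    The software loop costs time proportional to the table size, while the hardware answers in a fixed number of cycles regardless of how many entries it holds. That gap is the point being made about purpose-built components.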

    Systems guys often mistake routers and switches for computers with a bunch of Ethernet jacks. They're far from it. They are highly specialized pieces of hardware designed from the bottom up to do one thing and do it well -- transport data. Computers are the opposite. They're designed from the bottom up to be able to do whatever you wish them to as fast as possible, but that flexibility comes with a price.

    If you ever get the urge, you should read up on Catalyst switching architecture. You'll find it quite interesting.

  6. Re:Ethernet controllers by Matt_Bennett · · Score: 5, Insightful
    The critical aspect you leave out is that Gigabit ethernet is (inherently) Full Duplex. That means that a 32/33 PCI bus would be saturated at a gigabit out, but have no bandwidth for anything incoming.

    In truth, a gigabit ethernet card can saturate a 1X PCI-E link (2Gb/s after the 8B/10B encoding is removed) when sending small packets, basically due to packet overhead.
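
    A rough sketch of the numbers involved, with illustrative function names. It covers only what can be computed from the line rates: the 8B/10B conversion for a PCIe 1.x lane, and how many Ethernet frames per second a gigabit link carries at minimum and maximum frame size. The per-packet cost on the PCI-E side (transaction headers, descriptor traffic) depends on the NIC and chipset and is not modelled here.

```python
GIGE_BITS_PER_S = 1_000_000_000


def pcie_gen1_lane_gbps(raw_gtps: float = 2.5) -> float:
    """Usable Gb/s per direction on one PCIe 1.x lane after 8B/10B encoding."""
    return raw_gtps * 8 / 10


def frames_per_second(frame_bytes: int, line_rate: int = GIGE_BITS_PER_S) -> float:
    """Maximum Ethernet frame rate in one direction.

    Each frame also occupies 8 bytes of preamble and a 12-byte inter-frame gap.
    """
    wire_bytes = frame_bytes + 8 + 12
    return line_rate / (wire_bytes * 8)


if __name__ == "__main__":
    print(f"PCIe 1.x x1: {pcie_gen1_lane_gbps():.1f} Gb/s per direction")
    small = frames_per_second(64)     # minimum-size frame
    large = frames_per_second(1518)   # maximum standard frame
    print(f"64-byte frames:   {small:,.0f} per second")
    print(f"1518-byte frames: {large:,.0f} per second")
    print(f"ratio: {small / large:.1f}x more per-packet work at minimum size")
```

    Whatever fixed overhead the bus pays per packet is paid roughly eighteen times more often with minimum-size frames than with full-size ones, which is why small packets are the hard case.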

  7. This old bit of snake-oil... by Ancient_Hacker · · Score: 4, Insightful
    The nightmare continues. It goes something like this: Some drooling "computer scientist" is too dumb to do anything useful, so they speculate: "Wouldn't it be nice to free up this $XXXX CPU from this humdrum task (choose: moving bits, bytes, pixels, or packets)?" He finds a brain-addled silicon-stuffer to design a chip to do just that. All rejoice at the increased efficiency.

    Except:

    • The silicon-stuffer only has access to the slow processes of maybe two silicon generations back, unlike the CPU which paid for the latest whizzy xx picofurlong process. So the supposedly whizzy chip is still not particularly faster than the CPU.
    • The whizzy chip shows up late, just about when the associated CPU is going to take a 2x speed hike.
    • The chip is on the I/O bus, requiring many slow I/O cycles, with interrupts masked, to get its commands.
    • Said whizzy bit-banger doesn't have any software support from the main operating systems.
    • The silicon-etcher guy can't write English worth a damn, so nobody can understand the spec sheet.
    • And oh, he didn't know the bus was active-low, so all the data packets have to be inverted.
    • And sometimes byte-reversed too.
    • The chip designer doesn't know or care about the whole system, so the chip does several things that spoil the overall performance, like hogging the bus, saturating the bus snoop logic, poisoning the cache, interrupting too often, etc.
    • The droolers forgot to think about the multi-processor option, so the chip doesn't share well with multiple CPUs.
    • The chip is all hard-wired gates, so there's no way to fix the problems.
    Finally some software wizard finds a way of speeding up the code that runs in the CPU so it's now faster than the separate chip, so the chip is now useless and just an extra power waster.

    We've seen successive waves of this concept, and none of them has had much success. Graphics processors are one partial exception, and it took almost a decade of mis-designs before they became stable enough to be usable.