Slashdot Mirror


A 50 Gbps Connection With Multipath TCP

First time accepted submitter Olivier Bonaventure writes "The TCP protocol is closely coupled with the underlying IP protocol. Once a TCP connection has been established through one IP address, the other packets of the connection must be sent from this address. This makes mobility and load balancing difficult. Multipath TCP is a new extension that solves these old problems by decoupling TCP from the underlying IP. A Multipath TCP connection can send packets over several interfaces/addresses simultaneously while remaining backward compatible with existing TCP applications. Multipath TCP has several use cases, including smartphones that can use both WiFi and 3G, or servers that can pool multiple high-speed interfaces. Christoph Paasch, Gregory Detal and their colleagues who develop the implementation of Multipath TCP in the Linux kernel have achieved 50 Gbps for a single TCP connection [note: link has source code and technical details] by pooling together six 10 Gbps interfaces."

33 of 150 comments

  1. Request For Comments by Nethead · · Score: 4, Informative

    RFC 6182 if anyone is interested.

    --
    -- I have a private email server in my basement.
    1. Re:Request For Comments by dreamchaser · · Score: 3, Insightful

      The first part I read when I heard of this was the security concerns. While there's been a good attempt to address them I am not 100% sold. I guess the proof will be in the pudding as the old saying goes. Anytime you make a new protocol, especially one that is more complex, you run the risk of increased vulnerability.

    2. Re:Request For Comments by swillden · · Score: 5, Informative

      RFC 6182 if anyone is interested.

      I think RFC 6824, linked in the summary, is the more relevant RFC.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    3. Re:Request For Comments by swillden · · Score: 4, Interesting

      What sort of security concerns are you thinking of?

      An attacker who controls one of the paths can obviously modify, replace, delay or delete portions of the stream which are multiplexed onto that path. Such an attacker could probably perform a DoS that would shut down the entire stream (disclaimer: I haven't read the details). But of course ordinary TCP is subject to all the same attacks, if the attacker has control of the path that carries it. In many cases an attacker without control of the path can also execute DoS attacks against TCP (e.g. sending RSTs).

      I'm not saying there aren't any new vulnerabilities exposed, but I'm not seeing where they would lie. TCP is not secure in any useful sense, so it's hard to see how MPTCP could be worse.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    4. Re:Request For Comments by fleisher · · Score: 3, Informative

      The old saying is, "The proof of the pudding is in the eating," not "The proof is in the pudding."

      --
      Max
  2. Re:what's happening with SCTP? by swillden · · Score: 5, Informative

    Doesn't SCTP provide for these scenarios (and many more)?

    No.

    SCTP supports multiple paths between endpoints, but doesn't use them simultaneously. Rather, it picks a primary path to use for data transfers and has the ability to fail over to an alternate path in the event the primary fails.

    A quick glance at the MPTCP RFC shows that it is essentially multiplexing packets over n separate TCP streams (called subflows). It's the responsibility of the TCP/IP stack (in the OS, generally) to make this multiplexing transparent to the application, so the application only sees one stream.
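To make the multiplexing idea concrete, here is a toy sketch in Python. It is not how the kernel does it: the chunk size, the round-robin scheduling and the function names `stripe` and `reassemble` are all assumptions made for illustration. It only shows one byte stream being dealt onto several subflows and handed back to the application as a single stream.

```python
from itertools import cycle


def stripe(data, n_subflows, chunk=4):
    """Deal one byte stream onto n subflows, round-robin (assumed policy).

    Each chunk is tagged with its offset in the original stream so the
    receiving side can put the pieces back in order.
    """
    subflows = [[] for _ in range(n_subflows)]
    targets = cycle(range(n_subflows))
    for offset in range(0, len(data), chunk):
        subflows[next(targets)].append((offset, data[offset:offset + chunk]))
    return subflows


def reassemble(subflows):
    """Merge the chunks from every subflow into the one stream the app sees."""
    pieces = sorted(piece for flow in subflows for piece in flow)
    return b"".join(payload for _, payload in pieces)


message = b"one stream, several paths"
flows = stripe(message, n_subflows=3)
restored = reassemble(flows)
```

The application-facing result, `restored`, is identical to `message`, even though no single subflow carried all of it.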

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  3. Don't even! by Impy+the+Impiuos+Imp · · Score: 2

    I remember getting dual-channel ISDN, which was 128k, but it was split into two 56k data channels and a 16k control channel. You could never download from any one site faster than 56k because a connection couldn't straddle more than one data channel.

    Still, I could play EQ and surf at the same time on a different computer, a novel thing you young punks take for granted get off my lawn!

    --
    (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    1. Re:Don't even! by BitZtream · · Score: 3, Informative

      Wow, sucks to be you. ISDN channel bonding was well known and I personally used it to achieve higher speeds than you could on a single channel even over a single TCP connection. The bonding had nothing to do with the modem/circuit actually and in reality was just a standard feature of the PPP protocol called multilink-PPP. You can still do the exact same thing today with multiple connections and pretty much any PPP client on ANY OS on the planet.

      Of course, ISDN is actually 2 64kbps data channels and a 16kbps control channel, as it was meant to carry 2 voice channels, which by standard are 64kbps data channels, so I'm guessing you really don't know that much about it in general.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  4. API support by AveryRegier · · Score: 2

    One of the barriers to this technology will be API support. Many APIs provide the IP address (on both sides) with the connection object. Implementors will have to choose which IP to expose while remaining backward compatible.

    1. Re:API support by c0lo · · Score: 3, Informative
      In my understanding, this will still rely on multiple IP addresses (not using a single IP address for all the network connections). The difference: it will ride on top of multiple TCP/IP connections - assuming they are available - to multiplex their different paths into a single socket connection (that is: no API changes).

      Sort of: if both WiFi and cell channels are available (think: wandering in a shopping mall with public hotspots), one's Android mobile will use both of them at the same time to manage one's plot in Farmville (or to download the MP3's using that magnet from the PirateBay, or placing whatever buy/sell orders on stock exchange); if one walks out of WiFi spot coverage, the mobile will use only what's available - the cell connection.

      Why did I use Android in my example? Well, it runs a Linux kernel, and the first implementation is already available. Besides, that should be great news for Google: their "goggles" will be able to transmit what you see much faster and more reliably. What I understand from the MPTCP guys' presentation makes me believe MPTCP is able to cope with IP addresses being added and dropped dynamically (as they are assigned to the many network devices one's mobile has): thus stepping from one hot-spot to another will not impede Google's capability to receive the data from your (their?) glasses.

      --
      Questions raise, answers kill. Raise questions to stay alive.
    2. Re:API support by CAIMLAS · · Score: 4, Interesting

      Yep. And this is a godsend, in some ways: "multipath NFS" should soon be inexplicably easier to accomplish on a high scale. I will be able to put in a single redundant/HA host with eight 1 Gbps NICs and not have to worry about setting up multipath on each of the individual VM heads I run. This has the significant advantage of not being stuck with immobile "SAN storage" LUNs or, for that matter, "enterprise" hardware vendors which can't bring the reliability of their hardware anywhere near what generic Intel or even bcm network cards can provide.

      All the better if I've got unified storage at the backend with abstracted paths (eg. lustre, unionfs).

      And from the looks of it, it's designed 'forward' - it's going to be MUCH easier to do HA TCP connectivity with this than it is with misc. service level TCP (eg. heartbeat), particularly when you're dealing with (mostly) centrally assigned IPv6 addresses. Awesome.

      Granted, from the looks of it, we may have to wait for switch support first, too... I didn't read that carefully.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    3. Re:API support by olivier.bonaventure · · Score: 4, Informative

      The current implementation in the Linux kernel only exposes the first address used in the connection to the application. If the addresses change, the application is not informed but the TCP connection remains alive. Exposing addresses to the application is an old mistake of the socket interface. The socket interface does not expose packet losses because TCP deals with them and provides a bytestream abstraction to the application. Multipath TCP does the same, it handles all changes in address transparently to the application.
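As a hedged aside on what "transparent to the application" looks like in code: the out-of-tree kernel discussed in this thread applies Multipath TCP to ordinary TCP sockets, so no application code changes at all. Later mainline Linux kernels instead make it opt-in per socket through a separate protocol number, which Python 3.10+ exposes on Linux as `socket.IPPROTO_MPTCP`. The sketch below assumes that later mainline behaviour, not the implementation described above; the helper name `open_stream_socket` is hypothetical.

```python
import socket

# Assumption: mainline Linux protocol number for MPTCP. Python 3.10+ on
# Linux provides socket.IPPROTO_MPTCP; 262 is the Linux constant used as
# a fallback when the attribute is missing.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(family=socket.AF_INET):
    """Return (socket, using_mptcp), falling back to plain TCP.

    The fallback covers kernels and platforms without MPTCP support,
    where creating the socket raises OSError.
    """
    try:
        return socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP), True
    except OSError:
        return socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP), False


sock, using_mptcp = open_stream_socket()
# From here on the application uses connect/send/recv exactly as with TCP;
# which addresses the subflows use is handled below the socket interface.
sock.close()
```

Either way the rest of the program is unchanged, which is the point being made about the bytestream abstraction.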

    4. Re:API support by olivier.bonaventure · · Score: 5, Informative

      Multipath TCP transparently supports both IPv4 and IPv6. A Multipath TCP connection can start over IPv4 and then use IPv6 without the application being aware of the utilisation of IPv6. This could help the utilisation of IPv6 paths by IPv6 unaware TCP applications.

    5. Re:API support by funkboy · · Score: 2

      Great idea.

      The fact that the protocol supports this without requiring changes to the applications is pretty impressive.

  5. Re:Bad math? by Zapotek · · Score: 2

    I assume 10Gbps were eaten by protocol overhead and arbitrary resource restrictions. Perfect distribution/load-balancing is seldom the case in the real world and this does seem like quite an achievement, all things considered. Easy link aggregation at the protocol level, a big thank you to the devs. :)

  6. Re:what's happening with SCTP? by c0lo · · Score: 4, Informative
    In my understanding (I might be wrong):

    1. SCTP - identified by a protocol number (132) - acts at the network layer. Disadvantage: if a router along the route refuses SCTP, you are screwed. Advantage: is capable of UDP as well.

    2. MPTCP - relies on pure TCP for the whole connection (acts at the transport layer and fixes the protocol to TCP) and sets in place conventions between client and server to talk over multiple paths. Advantage: no sane public network will try to block it (pretty much like using http on port 80). Disadvantage: TCP only.

    --
    Questions raise, answers kill. Raise questions to stay alive.
  7. Re:cell networks already have issues by ebno-10db · · Score: 3

    Sheesh, you wanna put even more people out of work? More cell bandwidth needed? Ok, more base stations, new and improved protocols, new frequency allocations, etc. etc., etc. As someone who once made a living working on cellular (phy layer) stuff, I say 12 year old Tiffany has both a Constitutional and a God given right to stream Justin Bieber videos while texting her buddy sitting right next to her. I'll even write the manifesto!

    More seriously, a lot of what we take for granted started out as frivolous luxuries. I tell my daughter about days before cell phones, or PC's, and having seven channels of broadcast TV (and having to get up to change the channel!) and she's convinced I come from the age of dinosaurs. She's probably right. That was good, because I made a living changing it.

  8. Re:cell networks already have issues by c0lo · · Score: 4, Informative

    without every user making 3 connects to view their friends cat picture.

    Rest assured: there'll be a single connection using a cell tower. A second flow will be made using the connection with a nearby WiFi hot-spot, and Tiffany's chatting to her buddy sitting next to her will be really faster (without quotes); even better, the above will happen without Tiffany knowing or the extra requirement for Tiffany to have a geek father that's not lazy and does have spare time (even if one may wonder what good being a geek will be in the future).

    --
    Questions raise, answers kill. Raise questions to stay alive.
  9. Re:Uh, I get this with lacp by LordLimecat · · Score: 4, Informative

    No, you don't. If I remember correctly, LACP will give you the maximum bandwidth provided by a single link, per connection. You can't just hook up LACP / LAGG / whatever your vendor calls it, fire up iSCSI, and magically have a 2gbps link to your SAN, because iSCSI does a single connection per LUN: you will get a 1gbps connection even with LACP.

    LACP gets you higher total capacity, so if you were running two iSCSI connections you could get 1gbps on each with no contention. If the summary is to be believed, this would give you a truly multi-gbps link off of aggregated gbit connections.

  10. Re: Uh, I get this with lacp by c0lo · · Score: 2

    If cell manufacturers designed their equipment and built the right drivers

    And if Apple refuses to implement it, you will still be able to grab an Android, compile/install the MPTCP stack and do it (without waiting for Apple to resist the mobile providers' pressure not to support a feature that would hurt their bottom line. Or, for that matter, waiting for the mobile providers to upgrade their towers and hurt their bottom line by themselves).

    --
    Questions raise, answers kill. Raise questions to stay alive.
  11. Re:You're supposed to get an AS number. by Guido+von+Guido+II · · Score: 2

    If you want to use multiple links all at the same time, with the packets spread over them, you're supposed to get an Autonomous System number.

    This is more akin to link aggregation than it is multihomed Internet connections. Any two hosts could use this. They could be in the same autonomous system. They could be on the same subnet. There's no need to get a separate AS number for each host.

    Note that one of the other use cases suggested is for smartphones.

  12. Support available already for most unices by c0lo · · Score: 4, Informative

    For those wanting to try, their install howto. Seems supported on:
    1. Linux - either debian binaries or compiling from source. Both kernel module and UserSpace ways.
    2. Virtualized Linuxes - their example is provided for Amazon EC2
    3. Mac OSX - but, obviously, not on iPhone (I estimate slim chances for this to happen in the near future - it's a technology disruptive for the mobile providers income, as it makes the multi-pathing over cell/WiFi hot-spots transparent to end user)
    4. Android (Opinion: see? This is one of the reasons relying on "walled gardens" is bad: you have to wait for the mercy of the garden lord to benefit from something).

    --
    Questions raise, answers kill. Raise questions to stay alive.
  13. Re:Standard DSL + custom host file = 50gbps connec by Anonymous Coward · · Score: 2, Interesting

    or maybe we could just filter comments based on length or number of links. >1000 words or >20 links

  14. Re:Use Cases? by aXis100 · · Score: 3, Informative

    You're missing the point. One of the big reasons to have multiple interfaces is for redundancy - with a company's internet interface, redundancy would be vastly improved by choosing two different providers, and even better with different mediums. The subnets will definitely be different.

    Having both of these links acting simultaneously would be great and I could see a lot of people being excited about it.

  15. Re:Uh, I get this with lacp by silas_moeckel · · Score: 3, Interesting

    Not unless they changed something recently. Read http://www.ieee802.org/3/hssg/public/apr07/frazier_01_0407.pdf LACP requires that any conversation goes over only a single link at a time. Out of order packets can do some rather nasty things to tcp connections and adding buffers to correct that does nasty things to voip / other latency sensitive bits. Sure, linux boxes have some non standard modes that might work if you're sitting one switch away, but that's not conforming to the LACP spec. They also do not scale as they require keeping state of every session running through them. What networking gear are you using?

    --
    No sir I dont like it.
  16. Re:Uh, I get this with lacp by LordLimecat · · Score: 4, Informative

    According to both the article which silas linked below (which is the original source for what I said), as well as a whole boatload of other documentation, that's not correct; it's an 802.3ad issue.

    I did find this on serverfault which indicates that ONLY balance-roundrobin can get you 2gbps on a single tcp connection; and it also notes that some protocols don't like it, which means that it's not really a transparent bonding technology. All of the other methods of distributing packets rely on a hash of various values, for instance source and destination MAC addresses, and regardless of method the hash will ALWAYS be the same on a single TCP connection, which means that the same single link will be used.

    Regardless, the Linux Bonding driver is NOT the same thing as LACP, and it's not something you implement on the switch.
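The difference between hash-based and round-robin distribution can be sketched in a few lines of Python. This is an illustration only: the CRC32 hash stands in for whatever transmit hash a real bonding driver or switch uses, and the MAC addresses, link count and function names are made up.

```python
import zlib

LINKS = 2  # assumed number of aggregated links


def hashed_link(src_mac, dst_mac, n_links=LINKS):
    """Pick a link from a hash of the frame's addresses.

    CRC32 is a stand-in for a real transmit hash policy.
    """
    key = f"{src_mac}->{dst_mac}".encode()
    return zlib.crc32(key) % n_links


def round_robin_links(n_packets, n_links=LINKS):
    """Pick a link per packet in strict rotation, ignoring the addresses."""
    return [i % n_links for i in range(n_packets)]


# One TCP connection: every packet carries the same address pair.
flow = ("00:11:22:33:44:55", "66:77:88:99:aa:bb")
links_used_by_hash = {hashed_link(*flow) for _ in range(1000)}
links_used_by_round_robin = set(round_robin_links(1000))
```

Because the hash inputs never change within one connection, `links_used_by_hash` contains a single link, while `links_used_by_round_robin` contains all of them, at the cost of possible reordering between links.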

  17. Re:What am I missing? by Anonymous Coward · · Score: 3, Informative

    You want to send a shitload of data to a destination but it takes too long? Not a problem, throw a couple quad nics in those bitches and bond them up, problem solved providing your network can support the throughput.

    What am I missing?

    This is layer 4, not 2. So long as both endpoints support it, it doesn't matter where the traffic goes. The packets can go over entirely different paths. This is doing what you describe, but over the internet. Transparent to the network, and the higher levels of the protocol stack.

  18. Re:what's happening with SCTP? by butlerm · · Score: 4, Informative

    On the contrary, SCTP is a transport protocol just like TCP, except with a large number of added features. The main problem with SCTP has nothing to do with SCTP at all. It is that NAT devices do not support any transport protocol that they haven't been programmed for in advance. This makes SCTP next to impossible to deploy on a broad scale - NAT, that wart upon router-kind, is ubiquitous.

    TCP would have exactly the same problem if it were a new protocol. A NAT device requires relatively deep knowledge of TCP to support it at all. It plays games with both ports and addresses, keeps track of connection state, and so on. Ordinary routers do no such thing. A NAT device is a transport layer proxy by another name.

  19. Re:what's happening with SCTP? by butlerm · · Score: 5, Informative

    Work is underway for concurrent multipath transfer for SCTP as well. Also known as CMT-SCTP. There are significant challenges in doing this sort of thing though. SCTP wasn't designed for CMT, and probably needs much more radical changes than the current architects are proposing to do it well.

    Changes like subflows with independent sequence numbers and congestion windows, to start with. SCTP is much further ahead in the connection handling and security department, but MPTCP has the odd advantage of resorting to independent subflows to begin with, and if it can handle path failure properly, it might well be ahead in the CMT game, if byte stream semantics are all you need.

  20. Re:what's happening with SCTP? by olivier.bonaventure · · Score: 4, Informative

    SCTP is cleaner than Multipath TCP, but it suffers from two drawbacks that hinder its deployment in today's Internet:

    - many middleboxes only support IP, ICMP and TCP and discard SCTP packets (or do not perform NAT correctly)
    - applications need to be modified to support SCTP

    Multipath TCP is an evolution to TCP that works with unmodified applications and unmodified middleboxes.

  21. Re:Bad math? by olivier.bonaventure · · Score: 4, Informative

    The limit here is the CPU on the sender and the receiver. Both servers used in the test reached 98% CPU load to achieve 52 Gbps. Note that 52 Gbps is the goodput at the application and not the bandwidth used on the links (which is higher due to the various overheads).
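For a rough feel of the gap between goodput and link bandwidth, here is a back-of-the-envelope estimate in Python. Every default below is an assumption, not a figure from the test: a 1500-byte MTU, 20-byte IP and TCP headers, 32 bytes of TCP options, and 38 bytes of Ethernet framing per packet (preamble, header, FCS and inter-frame gap). It also ignores the ACK traffic flowing in the other direction.

```python
def wire_rate_gbps(goodput_gbps, mtu=1500, ip_header=20, tcp_header=20,
                   tcp_options=32, ethernet_overhead=38):
    """Estimate the on-the-wire rate needed to carry a given goodput.

    All header sizes are assumed values; real traffic varies per packet.
    """
    payload = mtu - ip_header - tcp_header - tcp_options  # application bytes
    on_wire = mtu + ethernet_overhead                      # bytes per frame
    return goodput_gbps * on_wire / payload


estimate = wire_rate_gbps(52.0)
```

Under these assumed numbers, 52 Gbps of goodput corresponds to roughly 56 Gbps on the links; larger frames would shrink the gap.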

  22. Re:fault tolerance by patch11 · · Score: 4, Informative

    MPTCP has separate sequence-number spaces. One for the subflow, inside the regular TCP header. And the data sequence-numbers, included inside the TCP option-space.

    These data sequence numbers include data-acks. So, this is your mentioned "cross-subflow ack machinery".
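The two sequence-number spaces can be illustrated with a small Python model. It is a sketch under simplifying assumptions: sequence numbers start at 0 instead of random initial values, nothing is lost or retransmitted, and the names `Segment`, `send` and `Receiver` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    subflow: int      # which subflow carried the segment
    subflow_seq: int  # per-subflow sequence number (regular TCP header)
    data_seq: int     # connection-level sequence number (TCP option)
    payload: bytes


def send(data, schedule, size=5):
    """Cut data into segments; `schedule` lists the subflow for each segment."""
    next_subflow_seq = {}
    segments = []
    for i, offset in enumerate(range(0, len(data), size)):
        flow = schedule[i % len(schedule)]
        payload = data[offset:offset + size]
        ssn = next_subflow_seq.get(flow, 0)
        segments.append(Segment(flow, ssn, offset, payload))
        next_subflow_seq[flow] = ssn + len(payload)
    return segments


class Receiver:
    def __init__(self):
        self.buffer = {}   # data_seq -> payload waiting for reassembly
        self.data_ack = 0  # next connection-level byte expected
        self.stream = b""

    def receive(self, seg):
        """Store a segment and return the cumulative data-level ack."""
        self.buffer[seg.data_seq] = seg.payload
        while self.data_ack in self.buffer:
            payload = self.buffer.pop(self.data_ack)
            self.stream += payload
            self.data_ack += len(payload)
        return self.data_ack


data = b"subflows share one data sequence space"
segments = send(data, schedule=[0, 1, 1])
rx = Receiver()
# Deliver everything from subflow 1 first to mimic subflow 0 being slower.
arrival = ([s for s in segments if s.subflow == 1]
           + [s for s in segments if s.subflow == 0])
acks = [rx.receive(s) for s in arrival]
```

Each subflow's own sequence numbers stay contiguous, yet the data-level ack does not move until the missing bytes from the slower subflow arrive; that connection-wide ack is what spans the subflows.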

  23. Re:what's happening with SCTP? by fa2k · · Score: 3, Insightful

    Your comment is correct, but NAT is not the core problem. In a world without NAT people would still use stateful firewalls. Those firewalls should be configured to drop anything unknown, because as a principle whitelisting is better than blacklisting.