Linux 3.3: Making a Dent In Bufferbloat?

← Back to Stories (view on slashdot.org)

Linux 3.3: Making a Dent In Bufferbloat?

Posted by Unknown on Wednesday March 28, 2012 @03:35AM from the drop-15-packets-in-ten-minutes-a-day dept.

mtaht writes "Has anyone, besides those that worked on byte queue limits, and sfqred, had a chance to benchmark networking using these tools on the Linux 3.3 kernel in the real world? A dent, at least theoretically, seems to be have made in bufferbloat, and now that the new kernel and new iproute2 are out, should be easy to apply in general (e.g. server/desktop) situations." Dear readers: Have any of you had problems with bufferbloat that were alleviated by the new kernel version?

8 of 105 comments (clear)

Min score:

Reason:

Sort:

What is bufferbloat? by stillnotelf · 2012-03-28 03:44 · Score: 5, Informative

TFS doesn't mention, and it's hardly an obvious term. From TFA:

Bufferbloat...is the result of our misguided attempt to protect streaming applications (now 80 percent of Internet packets) by putting large memory buffers in modems, routers, network cards, and applications. These cascading buffers interfere with each other and with the flow control built into TCP from the very beginning, ultimately breaking that flow control, making things far worse than they’d be if all those buffers simply didn’t exist.
16550A by Anonymous Coward · 2012-03-28 03:52 · Score: 5, Insightful

In my day, if your modem had a 16550A UART protecting you with its mighty 16 byte FIFO buffer protecting you, you were a blessed man. That little thing let you potentially multitask. In OS/2, you could even format a floppy disk while downloading something thanks to that sucker.
Re:Hm by nosh · 2012-03-28 04:17 · Score: 5, Insightful

People might be sceptical how big the problem is, but the analysis itself and the diagnosis is sound. Most people are only suprised they did not think about it before.
The math is simple: If you have a buffer that is never empty, every package will have to wait in the buffer. If you have a buffer full all the time, it serves no purpose but only defers every packet. And given that RAM got so cheap that buffers in devices grew so much more than bandwidth, you now often have buffers big enough to hold packets needing full seconds to send them all. Such a buffer running in always-full mode means high latency for no gain.
All additional factors of TCP going harvoc when latency is too high and no longer being able to compute how fast to optimally send if no packages get dropped only make the situation worse, but the basic is simple: A buffer always full is a buffer only having downsizes. The more the bigger it is....
Re:Hm by Anonymous Coward · 2012-03-28 04:37 · Score: 5, Interesting

If your TCP flow-control packets are subject to QoS prioritisation ( as they should be ) then bufferbloat is pretty much moot.
TCP flow-control makes the assumption that dropped packets and retransmission requests are a sufficient method of feedback. When it doesn't receive them, it assumes everything upstream is going perfectly and keeps sending data over (in whatever amounts your QoS setup dictates, but that's not the point).
Except everything upstream is maybe not going well. Other devices on the way may be masking problems -- instead of saying "I didn't get that, resend", they use their local buffers and tell downstream nothing. So TCP flow-control is being kept in the dark. Eventually a buffer runs out and that's when that device starts asking for a huge pile of fresh data at once... which is not how it was supposed to work. So the speed of the connection keeps fluctuating up and down even though pipes seem clear and underused.
The immediate workaround is to trim buffers down everywhere. Small buffers are good... but they shouldn't grow so much as to take a life of their own, so to speak.
One thing done in Linux 3.3 to start adressing this properly is the ability to express buffer size in bytes (Byte Queue Limits (BQL)). Historically they worked with packet amounts only, because they were designed in an era where packets were small... but nowadays they get big too. It gives us better control and somewhat alleviates the problem.
The even better solution is to make buffers work together with TCP flow-control. That would be Active Queue Management (AQM), which is still being developed. It will basically be an algorithm that decides how to adapt use of buffers to traffic bursts. But a good enough algorithm has not been found yet (there's one but it's not very good). When it will be found it will need testing, then wide-scale deployment and so on. That might still take years.
You missed the feedback loop (TCP flow contol) by Richard_J_N · 2012-03-28 06:03 · Score: 5, Informative

Unfortunately, I think you haven't quite got this right.
The problem isn't buffering at the *ends* of the link (the two applications talking to one another), rather, it's buffering in the middle of the link.
TCP flow control works by getting (timely notification of) dropped packets when the network begins to saturate. Once the network reaches about 95% of full capacity, it's important to drop some packets so that *all* users of the link back off and slow down a bit.
The easiest way to imagine this is by considering a group of people all setting off in cars along a particular journey. Not all roads have the same capacity, and perhaps there is a narrow bridge part way along.
So the road designer thinks: that bridge is a choke point, but the flow isn't perfectly smooth. So I'll build a car-park just before the bridge: then we can receive inbound traffic as fast as it can arrive, and always run the bridge at maximum flow. (The same thing happens elsewhere: we get lots of carparks acting as stop-start FIFO buffers).
What now happens is that everybody ends up sitting in a car-park every single time they hit a buffer. It makes the end-to-end latency much much larger.
What should happen (and TCP flow-control will autodetect if it gets dropped packet notifications promptly) is that people know that the bridge is saturated, and fewer people set off on their journey every hour. The link never saturates, buffers don't fill, and nobody has to wait.
Bufferbloat is exactly like this: we try to be greedy and squeeze every last baud out of a connection: what happens is that latency goes way too high, and ultimately we waste packets on retransmits (because some packets arrive so late that they are given up for lost). So we end up much much worse off.
A side consequence of this is that the traffic jams can sometimes oscillate wildly in unpredictable manners.
If you've ever seen your mobile phone take 15 seconds to make a simple request for a search result, despite having a good signal, you've observed buffer bloat.
Re:Hm by RulerOf · 2012-03-28 06:12 · Score: 5, Informative

Why would my buffer be never empty? Just because you have more buffers doesn't mean you process anything slower than you have to. It just means if you can't get around to something immediately, you can catch up later.
That's exactly the problem. TCP relies on packets being dropped in order to manage connections. When buffers are instead allowed to fill up, delaying packets instead of outright dropping them, the application relying on those packets experiences extremely high latency instead of being rate-limited to fit inside of the available bandwidth.

The problem has come to pass because of how counterintuitive this really is. It's a GOOD THING to discard data you can't transfer RIGHT NOW, rather than wait around and send it later.

I suppose one of the only analogs I can think of might be the Blackbird stealth plane. Leaks like a sieve on the ground, spitting fuel all over the place, because at altitude the seals expand so much that they'd pop if it hadn't been designed to leak on the ground. Using gigantic packet buffers would be like "fixing" a Blackbird so that it didn't leak on the runway.

--
Boot Windows, Linux, and ESX over the network for free.
Re:oversimplified PR noise ignores decade of resea by jg · 2012-03-28 06:15 · Score: 5, Interesting

You are correct that replacing one bad constant with another is a problem, though I certainly argue many of our existing constants are egregiously bad and substituting a less bad one makes the problem less severe: that is what the cable industry is doing this year in a DOCSIS change that I hope starts to see the light of day later this year. That can take bloat in cable systems down by about an order of magnitude, from typically > 1 second to of order 100-200ms; but that's not really good enough for VOIP to work as well as it should. The enemy of the good is the perfect: I'm certainly going to encourage obvious mitigation such as the DOCSIS changes while trying encourage real long term solutions, which involve both re-engineering of systems and algorithmic fixes. There are other places where similar "no brainer" changes can help the situation.
I'm very aware of the research over a decade old, and the fact that what exists is either *not available* where it is now needed (e.g. any of our broadband gear, our OS's, etc.), and *doesn't work* in today's network environment. I was very surprised to be told that even where AQM was available, it was often/usually not enabled, for reasons that are now pretty clear: classic RED and derivatives (the most common available) require manual tuning, and if untuned, can hurt you. As you, I had *thought* this problem was a *solved* problem in the 1990's; it isn't....
RED and related algorithms are a dead end: see my blog entry on the topic: http://gettys.wordpress.com/2010/12/17/red-in-a-different-light/ and in particular the "RED in a different light" paper referenced there (which was never formally published, due to reasons I cover in the blog posting). So thinking we just apply what we have today is *not correct*; when Van Jacobson tells me RED won't hack it (which was originally designed by Sally Floyd and Van Jacobson) I tend to believe him.... We have an unsolved research problem at the core of this headache.
If you were tracking kernel changes, you'd see "interesting" recent patches to RED and other queuing mechanisms in Linux; this shows you just how much such mechanisms have been used, that bugs are being found in this day and age in such algorithms in Linux: in short, what we have had in Linux has often been broken, showing little active use.
We have several problems here:
1) basic mistakes in buffering, where semi-infinite statically sized buffers have been inserted in lots of hardware/software. BQL goes a long way toward addressing some of this in Linux (the device driver/ring buffer bufferbloat that is present in Linux and other operating systems).
2) variable bandwidth is now commonplace, in both wireless and wired technologies. Ethernet scales from 10Mbps to 10 or 40Gps.... Yet we've typically had static buffering, sized for the "worst case". So even stupid things like cutting the buffers proportionately to the bandwidth you are operating at can help a lot (similar to the DOCSIS change), though with BQL we're now in a better place than before.
3) the need for an AQM that actually *works* and never hurts you. RED's requirement for tuning is a fatal flaw; and we need an AQM that adapts dynamically over orders of magnitude of bandwidth *variation* on timescales of tens of milliseconds, a problem not present when RED was designed or most of the AQM research of the 1990's done. Wireless was a gleam in people's eyes in that era.
I'm now aware of at two different attempts at a fully adaptable AQM algorithms; I've seen simulation results of one of those which look very promising. But simulations are ultimately a guide (and sometimes a real improving insight): running code is the next steps, and comparison with existing AQM's in real systems. Neither of these AQM's have been published, though I'm hoping to see either/both published soon and their implementation happening immediately thereafter.
So no, existing AQM algorithms won't hack it; the size of t
Re:Hm by Kidbro · 2012-03-28 09:50 · Score: 5, Funny

A stealth plane analogy. I didn't see that coming!
(very informative post, btw - thank you:)

--
May we live long and die out