Is Your Internet Connection Free From Bufferbloat? (blogspot.com)
Bufferbloat is that "undesirable latency that comes from a router or other network equipment buffering too much data," according to the site for
an ongoing project trying to address it. Now long-time Slashdot reader mtaht writes: Inside the lede-project, two core new bufferbloat-fighting techniques are poised to enter the Linux mainline kernel and thousands of routers -- the first being a fq-codel'd and airtime fair scheduler for wifi, and the second, the new "cake" qdisc, which outperforms fq_codel across the board for shaping inbound and outbound connections.
His submission ends with a question for Slashdot readers. "It's been nearly six years since the start of the bufferbloat project. Have you or has your ISP fixed your bufferbloat yet?"
Not like I'm writing router or server code, I'm just a clueless dude surfing the web. Bad stuff happens, "the network" is hosed.
Why the fuck would Tin Man have a woody?
While bufferbloat was patched out, my router is still under the control of a cargo ship, which claims to actually be an aircraft carrier. What is to be done about blufferboat?
Judging from the first 25 replies, the Slashdot readership is suffering from an overdose of eggnog. Here's a link (which has links to results from every ISP) that shows latency under load often measured in seconds: http://www.dslreports.com/spee... The problem with this survey is that there are now plenty of folk who get sub-30ms latencies on their internet, which is what those using bufferbloat fixes get, and the question was whether you or your ISP was driving improved hardware to get those results. The problem seems to be that 99% of the results are still worse than that, 4+ years after the code to fix it first arrived in Linux.
The tangible problem is if you need low latency, or want to maintain the latency you have, when your upstream connection is saturated. At least I think that is what it means.
...is what slows my connection speed down. Fuck, I could have a gigabit connection and would spend 80% of my time waiting for the next version of ad.doubleclick.net, etc. Really? Bufferbloat? I wish!
And how would that improve things?
It wouldn't. Nagle's algorithm doesn't cause congestion, it reduces it.
"Solving" a problem by going back to a probably worse one isn't really "solving it"
The first step in "solving" a problem is verifying that it is actually a problem. I am not convinced that "bufferbloat" (whatever that means) is a problem. Buffering can reduce latency, especially under heavy load, by better bandwidth utilization, and allowing faster retransmission of dropped packets. If it is slowing things down, then you should fix the buffering rather than eliminating it.
... and yes, I read TFA. It is a bunch of poorly labeled graphics that didn't make any sense to me, and seem to be designed to obfuscate rather than enlighten, although that may just be a result of Hanlon's Razor.
DSL is unfortunately the best internet connection in the small town I live in. The upload rate of these connections is really slow, and large uploads can saturate the connection. What this translates to in the real world is constant complaints from people about how their internet connection has just died for no good reason. What's happening in 99% of these cases is that some iPad in their house is backing up to iCloud, and bufferbloat from this upload is temporarily wiping out download speeds.
What I did was install the OpenWRT firmware on my TP-Link router, and install the SQM (Smart Queue Management) QoS application on it. This shapes uploads so that bufferbloat is greatly reduced. I tested all of this on DSLReports' Bufferbloat page, and it works great.
which sqm mode are you using?
The latency measurements in the article are meaningless. Reducing seconds of latency to milliseconds! Where is point a and b? The driver layer adds ten seconds of latency? None of this makes sense.
It is entirely probable we've been inside our own filter bubble so long (6 years) that we cannot properly communicate with first-time readers! Some folk explaining the problem... the IETF video shows the benefit from fixing it: https://www.bufferbloat.net/pr... Showing the extent: http://www.dslreports.com/spee...
You have this entirely backwards: "Buffering can reduce latency, especially under heavy load, by better bandwidth utilization, and allowing faster retransmission of dropped packets. If it is slowing things down, then you should fix the buffering rather than eliminating it." You want enough buffering to absorb bursts, but any more just adds latency. Van Jacobson and Kathie Nichols call this distinction good queue and bad queue: https://tools.ietf.org/html/dr... Less buffering (and fair queuing) allows for faster retransmission in particular.
Buffers are not a problem for latency, the growing internet is. Back in the early 2000s, from a particular place in Europe to the west coast of the US, we averaged over 220ms RTT because traffic was going up to a satellite, landing in Newark and then traveling over 4 hops to the west coast. Around 2003, when we switched to fiber, we got down to about 110ms, with the fiber going via two landing stations on the north shore of Africa, then via France across the Atlantic Ocean to Maine (or Delaware), then over land to the west coast. That was something like 12 to 14 hops.
As the years passed, newer and faster fibers were put in place, but more routers were also added to branch the backbone more. Now the same geographic location in Europe to the west coast of the US is again at 220ms RTT, because the hop count is around 36. That is almost 3 times more routers today than 14 years ago. This is where the latency problem comes from: packet switching in the multitude of hops and MPLS tunnels that you don't even see, not from some imaginary buffers.
Now that I have a fiber connection, it is all gone. My old *DSL line connected at 50/10 Mbit (error-free), but I couldn't get anywhere near that (30 Mbit at most) and latency was way too high. The only place it caused me some problems was the Citrix connection when I worked from home, as I don't play online games.
Badly managed buffers are a massive problem for latency. Just look at this graph from the article. You see the four ping time measurements on the right? You see how one of them is 100-250ms and the rest are more like 20ms? That's exactly the same link in all cases, but the first measurement has a giant pile of latency introduced purely by poor buffer management.
I'm not going to dismiss the problem you described, because I agree it's a problem. But it makes no sense to worry about 100ms on cross-Atlantic links and yet completely dismiss 200ms right on the first hop.
Maybe in their core, but what happens when they try to fit that traffic down your 10 Mbit/s DSL link? There is going to be a buffer there.
cut through routing works when there is no congestion. http://www.dslreports.com/spee...
You're right, of course. The trouble is, the latency increases aren't reasonable for common consumer networks under load.
Two speedtests I just did on my lightly-loaded hardwired home network (30Mbps cable from Time Warner):
With QoS
Without QoS
Throughput is less (rather surprisingly less -- I may want to check some things) with my QoS rules that group connections into individually-throttled categories, but bufferbloat is sane-ish (a brief peak at 250ms was observed, but otherwise under 100ms).
Without QoS, bufferbloat starts at around 1000ms (x10 increase!) and goes up from there.
I'm currently using Shibby's version of Tomato-USB on an overkill dual-core Asus router to accomplish this, though I have used other consumer-ish hardware with reasonable success (including the venerable WRT54G/L/GS) using similar software.
The trick, as I see it, is primarily to ensure that the cable modem (and whatever is directly upstream of it at the head-end) never see enough throughput for their buffers to begin filling by keeping all nearby bottlenecks under my own control.
The other benefit of QoS is that on heavily bandwidth-constrained networks, some tasks can be given higher priorities than other tasks, which is easy when we control the neck of the bottle.
I dated a girl for a bit who had the cheapest Internet she could get: 2Mbps down. Her kids hated it, and web browsing with tablets and phones and laptops was terrible for all of them if anyone was streaming a video (badly) or downloading (slowly). Loud banter over who was "hogging the Internet" and ruining gaming was common, and not unreasonable. It got worse when people would visit. It was really bad.
Best case: They were taking turns using the Internet. In 2014.
After observing this and suggesting she get faster Internet ("no, it's not important to me," she said) I gave her a router with Tomato, set up some obvious QoS priorities tweaked for that particular situation, and voila: The games worked fine. Web browsing was always quite responsive. YouTube worked (worked meh, but worked), and downloads and BT didn't trash any of the above. Anyone could do whatever they wanted, and the inevitable slowdowns were graceful while responsiveness remained good. The gamer of the house no longer got upset seemingly at random.
But that's just one success story. I've been doing tricks like this for over a decade on a myriad of non-enterprise networks, using cheap hardware and thoughtful software.
(Now it's time for someone to pop up and tell me that I've done it all wrong, and that my results are impossible. This always happens on /. when I write about using Tomato and QoS to solve real, practical problems. I'm ready.)
Kid-proof tablet..
You've done it all wrong! Those results are impossible!
Silence is a state of mime.
I have 32GB of ram in my system, 64mb barely registers on the radar. Nice try, thank you for playing.
So rise up, all ye lost ones, as one, we'll claw the clouds.
At least in Tranna
davecb@spamcop.net
I'll need to see your CCNA before I can accept your retort.
Steely Dan?
A bullet may have your name on it, but artillery is addressed to " Whom It May concern"
I built my own router because I don't want any of these mass-produced, consumer piece of shit routers with more holes in them than swiss cheese.
Nice success story and the exact circumstances we were trying to make easier to solve with cake. (and the dream is more ISPs would just be doing it for you on their default supplied boxes)
I would like to benchmark more things like Tomato's QoS against cake. The equivalent (single!) command line for outbound would be:
tc qdisc add dev your_device root cake bandwidth 2mbit nat
which automatically applies per host fairness, qos, and queue length management.
inbound requires a slightly more complex setup but not much.
It doesn't matter how big or small the buffer is, what matters is why it's filling up to start with.
If you're buffering because of a transient traffic spike or network load, then the buffer helps. If it's constantly filling up and evicting then there's a deeper problem that won't be solved either by using, eliminating, or changing the buffering strategy.
Buffering is more useful for UDP, and is primarily intended to smooth out transient congestion on a network link or interface.
On a Carrier network, it's used to help deal with bursty traffic... TCP is a rather "slow" response mechanism which works fine in general, but can't handle congestion which comes and goes on millisecond or sub millisecond time scales.
For example, if an ISP has ten 100gig links running in a bundle, and a 1ms duration traffic burst fills one link up, the buffer will soak it up and allow for the load balancing to recalculate traffic flows across the other links. Without the buffer, those packets would drop, and all the TCP clients would react and throttle back, even though the overall bundle is still under 50% capacity, and even though the congestion has long since cleared before the packet times out.
The other place buffers are useful is when enforcing QoS on a network. They allow the router to send higher priority traffic ahead of other traffic, so that when congestion hits you can still guarantee some traffic types.
Buffer bloat can't happen without congestion. Congestion is the real problem and talk of buffer bloat is a bit off-point. Sure, if you combat congestion with very large buffers (and hence significant queuing), you get increased latency due to the queuing. Reasonable increase in latency (say 20%) is not a huge hit on performance. Remember that you're trading that extra latency for lower probability of dropped packets.
You're correct that bufferbloat "only happens" when there's traffic. But I don't think you appreciate the current nature of internet traffic.
With web pages averaging 2 megabytes these days, you're "doing large file transfers" all the time. And if your iPhone kicks off an upload of its pictures, or your child starts watching videos, or your spouse starts their own web browsing/mail session, you're at the mercy of your router's queue management algorithm.
I don't think a "20% increase" in latency is reasonable, given that the Smart Queue Management (fq_codel, and soon cake) that's available in the Linux kernel (not to mention LEDE/OpenWrt/DD-WRT for your home routers) provides a "no-settings" way to limit lag/latency to an increase of only a few msec (or a couple dozen msec on a crummy DSL link).
Well... I disagree that the "modern internet does not suffer from this problem." I have seen it at my house, and in measurements at many other places. (If you're only considering FTTH as "modern", I have still seen bufferbloat there...)
The ISP does have an opportunity to control buffering in two places: at both ends of the bottleneck link (which is likely to be your cable/DSL/FTTH/etc. link) between your house and their facility.
a) Their "head end" gear might control queues for traffic going *to* you
b) Their Customer Premise Equipment (CPE) also would have the ability to control outgoing queues
If the ISP did both, then no one would have need to coin the term "bufferbloat". But the fact of the matter is that the vast majority of ISPs do *neither*.
Consequently, in late 2016, I believe it's prudent to provide my own solution and use one of the Smart Queue Management solutions (fq_codel, cake) that's available in LEDE/OpenWrt/DD-WRT so that I can get on with useful work.
Buffering can reduce latency, especially under heavy load, by better bandwidth utilization
You have no idea what you're talking about. Buffering is one of the main causes of latency. Ever see a 1,000ms ping? That's not because the speed of light is too slow, that's because there is a backlog of packets in the buffer. With the speed of light through fiber, no one should ever see a ping above 300ms to anywhere in the world. The highest ping I see from Midwest USA to Australia, India, or China is about 220ms.
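As a rough check on the 300ms figure above, here is a minimal Python sketch of propagation-only round-trip time over fiber. The 2/3 velocity factor and the 20,000 km antipodal distance are assumptions, not numbers from this thread, and real cable routes are longer than the great-circle path.

```python
# Propagation-only round-trip time over fiber between two distant points.
# Assumptions (not from the thread): light in fiber travels at roughly 2/3 of c,
# and the path is half of the Earth's circumference with no detours.

SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_VELOCITY_FACTOR = 2 / 3      # assumed typical value for glass fiber
HALF_CIRCUMFERENCE_KM = 20_000     # approximate antipodal distance


def fiber_rtt_ms(distance_km: float,
                 velocity_factor: float = FIBER_VELOCITY_FACTOR) -> float:
    """Return the round-trip propagation delay in milliseconds."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    rtt = fiber_rtt_ms(HALF_CIRCUMFERENCE_KM)
    print(f"antipodal RTT over fiber: {rtt:.0f} ms")
```

Under those assumptions the antipodal round trip comes out near 200 ms, so a ping far above 300 ms points to queuing or routing rather than distance.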
Buffers are not inherently bad, but "bufferbloat" is, because the buffers are too large. Buffers that are too large actually reduce throughput because TCP takes longer to respond to changes in congestion. Even worse is when bufferbloat starts to get up into the 3-second range (yes, seconds, not milliseconds): TCP treats it as a lost packet and resends the data. I regularly see bloated Linux ISO seeders with 2,000-4,000ms pings resending nearly 50% of their packets, most of which were not actually lost but only highly delayed.
Good anti-bufferbloat AQMs like fq_codel and Cake increase effective bandwidth, while isolating light traffic from heavy traffic and keeping latency almost idle-link low. Want 10ms pings while playing games and downloading/uploading torrents? I have that already.
My response to this, it's not even wrong.
The article talks about shaping download, which isn't possible at the endpoint. The traffic is already there and you have to deal with it. Dropping it will create retransmits for TCP and make the problem worse.
Wrong. Dropped packets signal congestion. If you don't signal congestion, the congestion will only get worse. You eventually have to drop a packet. The sooner you drop the packet after congestion has started, the less the congestion will be. The flip side is if you signal too early, you lose effective bandwidth. I shape my download and it has caused my average to go up because it stabilizes the flows.
With normal FIFO buffers, once the buffer is full, you get a burst of lost packets. This is much worse than dropping a single packet earlier.
Maintaining a hosts file is a PITA
The only real way to solve this is to timestamp packets as they enter a buffer and drop the ones that are too old.
You don't have to timestamp them to get the same effect. Codel and RED both effectively use time without timestamping. But yes, the "tracking time" is pretty much the only way.
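The "timestamp on entry, drop what is too old" idea from the exchange above can be sketched in a few lines of Python. This is only an illustration of that one idea, not an implementation of CoDel or RED; the class name `AgeLimitedQueue` and its `max_age_s` parameter are hypothetical, and the caller supplies the clock so the behaviour is easy to follow.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class AgeLimitedQueue:
    """FIFO queue that discards packets which waited longer than max_age_s.

    Hypothetical illustration only: real AQMs use more careful control laws.
    """

    def __init__(self, max_age_s: float) -> None:
        self.max_age_s = max_age_s
        self._items: Deque[Tuple[float, bytes]] = deque()
        self.dropped = 0

    def enqueue(self, packet: bytes, now: float) -> None:
        # Remember when the packet entered the buffer.
        self._items.append((now, packet))

    def dequeue(self, now: float) -> Optional[bytes]:
        # Skip (drop) packets from the head until one is fresh enough to send.
        while self._items:
            enqueued_at, packet = self._items.popleft()
            if now - enqueued_at <= self.max_age_s:
                return packet
            self.dropped += 1
        return None


if __name__ == "__main__":
    q = AgeLimitedQueue(max_age_s=0.1)
    q.enqueue(b"old", now=0.0)
    q.enqueue(b"fresh", now=0.5)
    print(q.dequeue(now=0.55), "dropped:", q.dropped)
```

Dropping at dequeue time keeps the check cheap: only the packet at the head of the queue is ever inspected.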
With the speed of light through fiber, no one should ever see a ping above 300ms to anywhere in the world.
Even to places where there's no fiber connection? In a lot of places, the only route to the Internet with a throughput greater than the 0.15 Mbps of IDSL is through a satellite in geostationary Earth orbit, 36,000 km up. An ICMP ECHO request from a subscriber to a satellite ISP, such as Exede, needs to go up to the satellite and down to the destination network, and its response needs to come out of the network and then go up to the satellite and back down to the subscriber. That's 0.12 light seconds for each of four legs, already nearly half a second, plus whatever latency is in the destination network.
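The four-leg arithmetic above can be checked with a short Python sketch. It uses the 36,000 km altitude quoted in the comment and assumes the satellite is directly overhead, so real slant paths would be somewhat longer.

```python
# Minimum ping time through a geostationary satellite, propagation only.
# Assumption: the satellite sits directly overhead, so each leg equals the altitude.

SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 36_000  # figure used in the comment above


def geo_ping_s(legs: int = 4, altitude_km: float = GEO_ALTITUDE_KM) -> float:
    """Return propagation delay in seconds for the given number of up/down legs."""
    return legs * altitude_km / SPEED_OF_LIGHT_KM_S


if __name__ == "__main__":
    print(f"one leg:   {geo_ping_s(legs=1):.3f} s")
    print(f"four legs: {geo_ping_s():.3f} s")
```

This reproduces roughly 0.12 s per leg and just under half a second for the full echo request and reply, before any delay in the destination network is added.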
I've been using CeroWrt (https://www.bufferbloat.net/projects/cerowrt/wiki/ - the initial testbed for all of the bufferbloat work) for at least four years. For the majority of that time I had 1.5Mbps DSL service, but now I'm connected via a 12Mbps ADSL2+ link.
Prior to the installation of CeroWrt, it was painful for me to attempt to work remotely using an SSH tunnel if someone was watching a show via Netflix, but after setting up CeroWrt everyone was happy (me for not having to yell at my daughter and my daughter for being able to watch Netflix without me yelling).
With the 12Mbps link, it doesn't seem to be the ingress traffic that causes issues, but the egress traffic (at times, I upload large data sets). Without shaping the outbound traffic, I can see round-trip times in excess of 2 seconds which is just a bit excessive. ;-)
I recently installed LEDE (https://lede-project.org/) (an OpenWrt (https://openwrt.org/) fork) on a spare router (the same model as the CeroWrt router - WNDR3800) and it is obvious that the software continues to improve.
It appears that LEDE may be approaching its first stable release (https://forum.lede-project.org/t/criteria-for-first-lede-stable-release/552). If you have a spare router that is supported by LEDE, please consider installing a current build and report any issues found.
If you would like to learn more, here are a few random links to get you started:
I feel that the work that Dave (and everyone else who is involved) does is so important that I send a few coins his way every month via Patreon. Here's his most recent update: "Where your donations go" (https://www.patreon.com/posts/where-your-go-7564906).
Dave, a belated Merry Christmas to you and I'm looking forward to a New Year where all of the efforts to tame bufferbloat and make WiFi fast benefit everyone.
If it's constantly filling up and evicting
This is actually normal for any congestion control algorithm that uses only packet loss to signal congestion. TCP? It keeps sending data until the buffer fills and drops "a packet", but we really know FIFO taildrop buffers drop bunches of packets. Then TCP backs off. But wait, there's more! You have many TCP flows going over the connection, so they are all fluctuating, keeping the buffer either in a state of steady full, which causes high latency and lots of dropped packets, or wildly swinging between empty and full because of global synchronization.
Remember that you're trading that extra latency for lower probability of dropped packets.
Not once you've gotten into the "bloated" range of buffer sizes. Increased latency from large buffers also increases the latency to signal to the sender that the route is congested. The sender will spend more time sending packets that will ultimately just get dropped. If the latency was lower, the sender would have known sooner to reduce its rate. Latency and loss go hand-in-hand once you get into unnaturally large buffers. I'm not sure of the exact recommended buffer size, but I think it's around 10ms of the bandwidth. Many people are seeing 1,000ms+, which is two orders of magnitude above optimal.
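To put the 10ms and 1,000ms figures above into bytes, here is a minimal Python sketch that converts between buffer size and worst-case queuing delay. The 10 Mbit/s link rate is an assumed example, and the 10ms target is the commenter's estimate rather than an established recommendation.

```python
# Relationship between buffer size, link rate, and worst-case queuing delay.
# Assumption: a full FIFO buffer drains at exactly the link rate.

def buffer_bytes_for_delay(link_mbit_s: float, delay_ms: float) -> float:
    """Bytes of buffer that take delay_ms to drain at link_mbit_s."""
    bytes_per_second = link_mbit_s * 1_000_000 / 8
    return bytes_per_second * delay_ms / 1000


def drain_time_ms(buffer_bytes: float, link_mbit_s: float) -> float:
    """Milliseconds needed to drain a full buffer at link_mbit_s."""
    return buffer_bytes * 8 / (link_mbit_s * 1_000_000) * 1000


if __name__ == "__main__":
    link = 10  # Mbit/s, assumed example link
    for delay in (10, 1000):
        size = buffer_bytes_for_delay(link, delay)
        print(f"{delay:>5} ms of queue at {link} Mbit/s = {size:,.0f} bytes")
```

On that example link, 10ms of queue is about 12,500 bytes (roughly eight full-size Ethernet frames), while 1,000ms is 1.25 megabytes.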
That is not the definition of "shaping", that is Cisco's definition for their own internal terminology. Regardless of what you want to call it, I can control the amount of bandwidth a flow or group of flows can use regardless of direction (ingress/egress), assuming they respond to normal lost, marked, or delayed packets. Most people call this "shaping bandwidth", but you can call it whatever you want.
I have a spare Asus RT-N16.
Where do I get started with Cake?
It's pretty much not that at all. It's closer to:
* The provider is selling 100/100 Mbit/s to 20 people with a 1 Gbit/s uplink.
* You hook a WiFi router up to the 100/100 connection.
* While trying to VoIP/Skype on one WiFi device, somebody else starts watching Netflix on another.
* The latency on your WiFi (and thus your VoIP call) jumps up to 50-100ms due to bad buffer management on the WiFi.
* A third device starts trying to sync photos to a backup service, introducing another 100-250ms* of latency by tying up your upstream and generating another badly managed queue on your router.
That's a ton of unnecessary latency being generated right in your own house, by your own gear, and none of it will be helped by the ISP putting in more upstream bandwidth.
...on the topic of which, it would be insanely unnecessary to have 2 Gbit/s of bandwidth for twenty 100 Mbit/s users. You don't need enough bandwidth for every user to max out their connection simultaneously, because that never happens; you only need enough to cover whatever your actual peak traffic is without dropping any packets. When averaging over thousands of customers, this actually works out to needing something around 100 Kbit/s(!) per customer today.
Of course 20 is much less than "thousands" and the traffic profile of 20 customers will be much more peaky than the one of 1000 customers, but I suspect even then that 1000 Mbit/s would be enough to cover twenty 100 Mbit/s connections without dropping any packets. It certainly wouldn't be anywhere near having a "real uplink of 50 Mbit/s".
(*: Probably it wouldn't be this bad with a symmetric 100/100 connection; the graph I linked is for a 140/12 connection, but those are probably more common than symmetric connections anyway.)
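The oversubscription numbers in this comment can be worked through with a small Python sketch. The function names are hypothetical, and the inputs are the 20 users, 100 Mbit/s plans, 1 Gbit/s uplink, and roughly 100 Kbit/s average per customer quoted above.

```python
import math

# Oversubscription arithmetic for a shared uplink, using figures from the comment.


def contention_ratio(users: int, per_user_mbit_s: float, uplink_mbit_s: float) -> float:
    """Sum of sold capacity divided by actual uplink capacity."""
    return users * per_user_mbit_s / uplink_mbit_s


def users_needed_to_saturate(per_user_mbit_s: float, uplink_mbit_s: float) -> int:
    """How many users must run flat out at line rate to fill the uplink."""
    return math.ceil(uplink_mbit_s / per_user_mbit_s)


def average_load_mbit_s(users: int, avg_kbit_s_per_user: float) -> float:
    """Expected aggregate load if every user sits at the long-run average."""
    return users * avg_kbit_s_per_user / 1000


if __name__ == "__main__":
    print("contention ratio:", contention_ratio(20, 100, 1000))
    print("users to saturate:", users_needed_to_saturate(100, 1000))
    print("average load (Mbit/s):", average_load_mbit_s(20, 100))
```

The averages only say how much headroom exists; as the comment notes, 20 users are far peakier than thousands, so short bursts can still exceed the mean by a wide margin.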
bufferbloat is definitely still a thing.
I've been using this script for years to drop packets early to improve latency. it uses HFSC (built into linux since forever) and works great:
https://gist.github.com/eqhmco...
from that:
Congestion avoidance algorithms (such as those found in TCP) do a great job of allowing network endpoints to negotiate transfer rates that maximize a link's bandwidth usage without unduly penalizing any particular stream. This allows bulk transfer streams to use the maximum available bandwidth without affecting the latency of non-bulk (e.g. interactive) streams.
In other words, TCP lets you have your cake and eat it too -- both fast downloads and low latency all at the same time.
However, this only works if TCP's afore-mentioned congestion avoidance algorithms actually kick in. The most reliable method of signaling congestion is to drop packets. (There are other ways, such as ECN, but unfortunately they're still not in wide use.)
Dropping packets to make the network work better is kinda counter-intuitive. But, that's how TCP works. And if you take advantage of that, you can make TCP work great.
There are definite peak hours for customer traffic, eg work hours for businesses, and evenings and weekends for home users, so even the very generous 2:1 contention ratio that you seem to be suggesting probably would still result in a saturated backhaul from time to time.
Thus shifting as much discretionary stuff away from that peak as possible will help, just as for power grids, but that's a separate topic.
Rgds
Damon
http://m.earth.org.uk/
You know, I'd love to use your program and do a comparison between it and a few other ones I use, but I do have a few questions.
Rather than clutter up this forum, drop me a line, let's talk, geek to geek
I have an LTE Verizon Jetpack as my primary Internet connection, and the firmware is proprietary and not user-modifiable, and of course they refuse to implement bufferbloat mitigations on their own. So, no, it's not free from bufferbloat.
The link to my homepage has my email address there.
It depends on the number of users involved. For this case of 20 users... yeah, you're probably right that you couldn't guarantee no packet loss all of the time, but you would probably get quite close. "It's a weekend" wouldn't be enough to saturate it; even when actively using the internet, most people's bandwidth use is short large spikes surrounded by lots of idle time. Torrents would be a better bet, but maxing out the link would require 10 users torrenting at their max line speed. There are people who will do that, sure, but your odds of having 10 of them at once in a building of 20 people are low.
At the main ISP level, where you're aggregating thousands of customers together, you can overprovision far more than 2:1 safely because customers average each other's traffic out and unusually high peaks become even rarer.
(I should point out that I have no operational experience running a network like this so a lot of this is educated guesswork, but the ~100 Kbit/s figure I gave in the last post comes from people who do have that experience.)