Bufferbloat — the Submarine That's Sinking the Net

Correction: JIM GETTYS by Anonymous Coward · 2011-01-07 01:07 · Score: 4, Informative

http://en.wikipedia.org/wiki/X_Window_System

Really? by Anonymous Coward · 2011-01-07 01:08 · Score: 2, Insightful

Latency is bad? Bigger buffers = more latency?

Re:Really? by Anonymous Coward · 2011-01-07 02:40 · Score: 3, Informative

I've no idea if this post explains it correctly or not (one of the replies implies that it doesn't), but if it does, it should be nearer the top of the page, hence my posting it here. :-)

Definition, please by Megane · 2011-01-07 01:10 · Score: 5, Insightful

I'm so glad the term has been defined so that I know what the hell we're talking about here. Oh wait, no it hasn't.

Okay, then I'll RTFA. Oh wait, two screens worth of text later and it still hasn't.

I'd like to change the topic now to the submarine that's sinking the English language: jargonbloat.

--
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }

Re:Definition, please by Megane · 2011-01-07 01:15 · Score: 5, Informative

For what it's worth, TFS seems to be linking into the middle of the story, so maybe that's part of my problem. Still, it's really annoying to be told about this new problem with a new jargon word that's going to make the sky fall any day now, without knowing just what the hell it is.
The previous article seems to explain things a little better: http://gettys.wordpress.com/2010/12/03/introducing-the-criminal-mastermind-bufferbloat/

--
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
Re:Definition, please by drooling-dog · 2011-01-07 01:29 · Score: 4, Insightful

They know something you don't, they want you to know it, and they want to keep it that way for as long as possible...
Re:Definition, please by Nursie · 2011-01-07 01:37 · Score: 2

He's written a whole series on this over the course of months, if he doesn't explain it a long way into the series then blame the slashdot summary, not the guy doing the research/testing and telling the world about it.
Re:Definition, please by mcgrew · 2011-01-07 01:45 · Score: 3, Informative

There are two reasons I can think of why people write like that. One is they're poor communicators, the second is they want to appear intelligent.
It seems there are two kinds of stories posted here lately -- science and tech stories written for the non-nerd by non-nerds like one last week that explained what a CPU was (!), and stories like this that coin new jargon and don't explain it, or use an acronym that most folks here will misunderstand, like using BT when referring to British Telecom when most of us think of BitTorrent when we see BT.
Maybe I'm just getting old.

--
Free Martian Whores!
Re:Definition, please by Megane · 2011-01-07 01:55 · Score: 5, Insightful

Actually, I blame the submitter. It is well known that Slashdot "editors" don't edit. They merely choose the least worthless articles out of the slush pile and push the button, sometimes using copy and paste to combine two similar submissions. Even my above link was still to the middle of the story, but it explains the core concept best.
I also place a teensy bit of blame on the blogger, for not linking the first use of the word to the previous article. But he couldn't expect to get linked into the middle of the series.

--
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
Re:Definition, please by jg · 2011-01-07 02:10 · Score: 5, Insightful

You asked, I just provided:
http://gettys.wordpress.com/what-is-bufferbloat-anyway/
Good question.
Bufferbloat is the cause of much of the poor performance and human pain using today’s internet. It can be the cause of a form of congestion collapse of networks, though with slightly different symptoms than that of the 1986 NSFnet collapse. There have been arguments over the best terminology for the phenomena. Since that discussion reached no consensus on terminology, I invented a term that might best convey the sense of the problem. For the English language purists out there, formally, you are correct that “buffer bloat” or “buffer-bloat” would be more appropriate.
I’ll take a stab at a formal definition:
Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems.
Systems suffering from bufferbloat will have bad latency under load under some or all circumstances, depending on if and where the bottleneck in the communication’s path exists. Bufferbloat encourages congestion of networks; bufferbloat destroys congestion avoidance in transport protocols such as HTTP, TCP, Bittorrent, etc. Without active queue management, these bloated buffers will fill, and stay full.
More subtly, poor latency, besides being painful to users, can cause complete failure of applications and/or networks, and extremely aggravated people suffering with them.
Bufferbloat is seldom detected during the design and implementation of systems because engineers, though methodical people, seldom if ever test latency under load systematically, and because today’s memory is so cheap that buffers are often added without thought of the consequences; the bloat can be hidden in many different parts of network systems.
You see manifestations of bufferbloat today in your operating systems, your home network, your broadband connections, possibly your ISP’s and corporate networks, at busy conference wireless networks, and on 3G networks.
Bufferbloat is a mistake we’ve all made together.
We’re all Bozos on This Bus.
Re:Definition, please by Nefarious+Wheel · 2011-01-07 02:19 · Score: 3, Funny

The Evil Buffer Stuffer Strikes Again!

--
Do not mock my vision of impractical footwear
Re:Definition, please by GooberToo · 2011-01-07 02:42 · Score: 4, Insightful

Yeah, I see this a lot with nerds. It's pretty fucking annoying when someone launches in a long winded dissertation on some obscure subject, without even bothering to put an introductory paragraph at the top giving even the briefest overview of what the fuck they're even talking about.
It's a series of blog articles. He presumes you've been following his series of articles whereby he introduces the topic and experimentally validates his assertions. If you didn't get the introduction, blame your own laziness or the failure of the poster to also provide a link to the first blog post in the series.
Basically you're complaining because you jumped to the middle of a book and then bitched that the chapter you started reading doesn't have an introduction. Most people will wonder what the hell is wrong with you. To then attack the author for others' failings is bizarre to say the least. And all this ignores that blogs are frequently written to be familiar and casual reading; which also entirely invalidates your general tone.
Re:Definition, please by hairyfeet · 2011-01-07 03:17 · Score: 3, Interesting

Now someone can point out if this humble repair guy is wrong, but from what I read of TFA it sounds to me like TCP is the problem and not the buffers. What TCP is doing is slamming the network until it drops packets, and then using that to determine speed, correct? Doesn't sound like an efficient way to allocate resources when even grandma has a fat cable or DSL pipe. It would be like everyone trying to shove their way to the front of a line and only backing off when getting punched in the face.
So again feel free to correct if I'm missing something, but wouldn't the better solution be to come up with a new way to allocate space? Perhaps a packet every so often that says "Hi, how much bandwidth may I have please?" which the server or node would reply "You can have X" and then everyone wouldn't be trying to slam and your route would be based on which gives you the largest X.
So maybe I'm missing something, but it seems to be a choice of that or rip every buffer out of everything from the home modem on up. Considering even the el crappo motherboards are gigabit now, and even the CCC (Cheapo Chinese Crap) home routers have decent sized buffers on them this would seem like a more doable solution. Because while the "slam and back off" method probably worked really well in the past I just don't see having millions of people trying to slam their way to the head of the line as the most efficient way to utilize a limited resource, and no matter how big a pipe you have it is still just that, limited.

--
ACs don't waste your time replying, your posts are never seen by me.
Re:Definition, please by betterunixthanunix · 2011-01-07 03:25 · Score: 2

I think the idea is that there is a certain size beyond which these buffers will create more problems than they solve, or perhaps more accurately that there is a threshold ratio of buffer size to the bandwidth of a link which, once exceeded, will create problems. Bufferbloat is just an easy term to refer to this condition.

But I am not the author, so perhaps he can chime in.

--
Palm trees and 8
Re:Definition, please by davidbrit2 · 2011-01-07 03:39 · Score: 5, Informative

I'll attempt to translate.
TCP has to be able to estimate how fast* it can send data, because there's no way it can know definitively the link speed, capacity, and reliability between your system and a remote system. It does this by progressively getting faster until it starts detecting transmission problems between the two systems, at which point it backs off and slows down. Ideally, you hit a nice equilibrium at some point.
On a proper network, if some router along the path is at capacity, either internally, or along one of its outgoing paths, it should drop the packets it can't handle in a timely fashion. This seems counterintuitive at first, but remember that TCP handles the guaranteed transmission already - it will retransmit packets that didn't arrive. If the router is holding these packets in a buffer, and sending them along once the links clear up, i.e. "when it gets around to it", the packets will reach their destination with hugely inflated latency. This in turn confuses TCP, as it can't get a reliable estimate of link capacity, and the whole speed negotiation falls apart. The latency becomes wild and unpredictable as packets are sometimes buffered, sometimes not, but they always reach their destination, so TCP thinks it's sending at an acceptable rate. So now you've got all the endpoints conversing through this router that's claiming, "No problem, I can handle it!" when it really can't, and the problem just compounds itself as the router gets slammed harder and harder.
By getting timely notification of dropped packets, TCP can say, "Oh, I'm transmitting too fast for this link, time to shrink the sliding window and slow down." This both smooths out latency, and minimizes further dropped packets, not just for the two hosts involved, but for everyone else transmitting through the affected routes as well. This is how it's supposed to work, but excessive buffering of packets within routers prevents it from happening.
Moral: Dropped packets are perfectly normal and in fact required for TCP to manage its own speed and latency. Stop trying to buffer and guarantee packet delivery - TCP is handling that already.
(Disclaimer: I'm a DBA, not a network engineer. Feel free to clarify or correct anything I've mucked up.)
* "Fast" in this case means "How many packets should I send at once before stopping to wait for acknowledgment of those packets getting where they're going". "Faster" equates to "more of them".
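The feedback loop described above can be sketched as a toy simulation. This is a minimal illustration and not real TCP: the function name `simulate`, the growth of one packet per round, the halving on loss, and the link rate of 10 packets per round are all assumptions chosen to keep the arithmetic visible.

```python
def simulate(buffer_pkts, link_rate=10, rounds=200):
    """Toy model of one sender behind one bottleneck queue.

    Each round the sender offers `cwnd` packets. The queue holds at most
    `buffer_pkts`; anything beyond that is dropped, and a drop makes the
    sender halve its window. Without a drop the window grows by one.
    Returns (worst queueing delay in rounds, total packets dropped).
    """
    cwnd = 1          # packets the sender puts on the wire per round
    queue = 0         # packets waiting at the bottleneck
    worst_delay = 0.0
    drops = 0
    for _ in range(rounds):
        queue += cwnd
        if queue > buffer_pkts:
            # Buffer overflow: the only congestion signal the sender gets.
            drops += queue - buffer_pkts
            queue = buffer_pkts
            cwnd = max(1, cwnd // 2)
        else:
            cwnd += 1
        # A packet at the back of the queue waits queue / link_rate rounds.
        worst_delay = max(worst_delay, queue / link_rate)
        queue = max(0, queue - link_rate)
    return worst_delay, drops


if __name__ == "__main__":
    for size in (20, 1000):
        delay, dropped = simulate(size)
        print(f"buffer={size:5d} pkts  worst delay={delay:6.1f} rounds  drops={dropped}")
```

In this model the worst delay is bounded by the buffer size divided by the link rate, so the larger buffer postpones the first drop and lets the standing queue, and with it the latency, grow much further before the sender hears about the congestion.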
Re:Definition, please by Idbar · 2011-01-07 05:12 · Score: 2

Perhaps the Wikipedia article about Random Early Detection may help to understand the issue. RED (proposed by Floyd and Jacobson, the latter cited in the article posted) was proposed in 1993 to overcome this issue, along with Explicit Congestion Notification (a mechanism to mark packets instead of dropping them). Windows did not implement ECN until Vista (and even then it wasn't enabled by default), which made it complicated to actually take advantage of such schemes.

Many mechanisms have been proposed (I'm even proposing one myself), yet Internet service providers have kept throwing more hardware at the problem, and manufacturers have worked on increasing speeds and memory rather than focusing on TCP's congestion control mechanisms (a problem already flagged in 1992 by Jain's paper, Myths about congestion in high speed networks). Unluckily, nobody enables their WRED (Cisco's proprietary mechanism) or implements any other algorithm. And so, almost 20 years later, people are realizing that this may be an issue.
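For readers who have not met RED, its core idea fits in a few lines. The sketch below is a simplified reading of the scheme, assuming only the averaged queue length and the linear drop ramp; it leaves out the spacing of drops between packets and the handling of idle periods, and the default weight `w_q=0.002` and the thresholds used in the example are illustrative assumptions.

```python
def update_average(avg, queue_len, w_q=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - w_q) * avg + w_q * queue_len


def red_drop_probability(avg, min_th, max_th, max_p):
    """Drop (or mark) probability for a packet arriving at average queue `avg`.

    Below min_th nothing is dropped; at or above max_th everything is;
    in between the probability rises linearly from 0 to max_p.
    """
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


if __name__ == "__main__":
    # Hypothetical thresholds, in packets.
    for avg in (5, 15, 20, 25, 30):
        p = red_drop_probability(avg, min_th=10, max_th=30, max_p=0.1)
        print(f"avg queue={avg:2d}  drop probability={p:.3f}")
```

The point of the ramp is that senders start hearing about congestion while the queue is still short, instead of only when a large buffer finally overflows.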
Re:Definition, please by myrdos2 · 2011-01-07 06:07 · Score: 3, Insightful

Solutions which require the internet's infrastructure to be replaced (all the routers and switches and so forth) have been proposed for many years, and never go anywhere. The only one I'm aware of is IPv6, and look how slowly that beast has taken off. That said, TCP sawtooth isn't as bad as you make it out to be - in most cases. Whenever a packet is dropped, the TCP connection drops its speed to around half, then gradually ramps up to where it was previously. You don't get 100% of your bandwidth utilization, but you do get to automatically adjust to changing network conditions. And as the number of TCP connections over one pipe increases, you get closer and closer to max utilization rates.
TCP fails when:
-competing against UDP, which has no congestion control and will clog a line even if every UDP packet is dropped
-there is interference in the line causing packet corruption, which TCP interprets as congestion
-competing against Microsoft products, which have TCP stacks that are tweaked to grab more than their fair share of the bandwidth
My understanding is that TCP congestion control generally isn't applied to backbones - I believe that ISPs throttle your traffic before sending it over an optic link so as not to overbook its capacity. You're probably just competing with your household, and possibly people on your block - can someone verify this?

Name wrong by ebcdic · 2011-01-07 01:11 · Score: 2, Informative

He's Jim Gettys, not Getty.

Awsum, TTY in your name by cerberusss · 2011-01-07 01:13 · Score: 5, Funny

Jim Getty, one of the original X Window System developers and editor of the HTTP/1.1 spec

I'd murder four people just to have TTY in my name. Five if I could capitalize them, and postfix with a number. I'd name my son Dev.

You'd get a business card with something like Dev GeTTY1, Armadillo Avenue 64, Seattle, Washington

--
8 of 13 people found this answer helpful. Did you?

Re:Awsum, TTY in your name by jmyers · 2011-01-07 01:31 · Score: 3, Funny

So you are the reason I keep getting this in my logs "getty keeps dying. There may be a problem".
Re:Awsum, TTY in your name by DikSeaCup · 2011-01-07 01:45 · Score: 3, Funny

Oh my God, you've killed getty! You bastard!

--
I talk about stuff.

First link in the first article by mangu · 2011-01-07 01:14 · Score: 5, Insightful

Just start RTFAing: "In my last post I outlined the general bufferbloat problem."

Follow the link:

"Each of these initial experiments was designed to clearly demonstrate a now very common problem: excessive buffering in a network path. I call this bufferbloat."

Re:First link in the first article by GooberToo · 2011-01-07 02:34 · Score: 3, Interesting

Demands for definition are a bit pompous...
A bit?
Even more pompous is making a post about it when everyone can clearly see, "bufferbloat" is shorter than constantly saying something tedious like, "excessive packet buffering in the entirety of a network path."
Perhaps this will help the uninitiated. The article describes a wide problem of excessive packet buffering in the entirety of a network path, which has been dubbed, "bufferbloat."
Re:First link in the first article by ls671 · 2011-01-07 02:56 · Score: 2

A better definition could be:
"A user saturating its broadband connection by transferring 20GB files and not taking care of using the --bwlimit (limit bandwidth) option with rsync"
I have been using it for ages to prevent this very problem.
Other types of traffic shaping can be done, with Linux tc as an example, but it is always best to do it at the application level when possible.

--
Everything I write is lies, read between the lines.
Re:First link in the first article by PybusJ · 2011-01-07 12:06 · Score: 2

The reason you traffic shape to 90% line speed is to stop your upstream from buffering at all. It's all just a workaround for the fact that your ISP has large buffers.
The fact we nerds can configure Linux routers to avoid the issue doesn't mean it's a non-issue to everyone else.

pegged connection == latency, who'd of thunk it? by Shakrai · 2011-01-07 01:18 · Score: 5, Insightful

I read TFA and I'm not seeing the problem. He can't duplicate this issue unless he maxes out his connection and then his latency goes to hell. No shit Sherlock, that's what happens when your pipe is full and the packets have to wait in the queue to be transmitted. Am I stupid or could he avoid this issue entirely by using QoS and/or rate-limiting his connection to some amount <100% of its maximum throughput? I have QoS at the office that keeps our connection from pegging (it's limited to around 75% on the download and 90% on upload) and have never once encountered an issue with latency or jitter. At home I only throttle the upload (to 90% of maximum) and have successfully run VPNs, bittorrent uploads and VoIP calls all at the same time without any headaches.

Really, what's the problem here?

--
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.

cringely explains by szo · 2011-01-07 01:26 · Score: 4, Interesting

http://www.cringely.com/2011/01/2011-predictions-one-word-bufferbloat-or-is-that-two-words/

--
Red Leader Standing By!

So, let me get this straight... by CFD339 · 2011-01-07 01:31 · Score: 5, Insightful

RAM is cheap.
High speed uplink is not cheap.
Peering agreements are manipulative, expensive, and sometimes extortionate.

So...

The poorly designed, poorly peered, under allocated back haul links can't handle the traffic that routers want to push through them -- but since RAM is cheap, operators just add RAM to the buffers so that when those back-haul lines slow down for a second the packets can get pushed through.

And we're blaming the buffer for the problem?

--
The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln

Re:So, let me get this straight... by phantomcircuit · 2011-01-07 01:55 · Score: 5, Insightful

TCP assumes that packets will be dropped when there is congestion; if they aren't, the congestion control algorithms fail (hard).
Re:So, let me get this straight... by gl4ss · 2011-01-07 02:33 · Score: 2

the real interesting bit would be what would the internet be without those buffers. packetloss at 80%?

--
world was created 5 seconds before this post as it is.
Re:So, let me get this straight... by complete+loony · 2011-01-07 03:26 · Score: 4, Informative

I was sharing a connection with a friend once who was throttling my upload bandwidth in an attempt at fairness. Trying to run something like bittorrent would fill all the buffers in my PC, his router and his modem, adding 1.5 seconds of latency to the link (I used to ping the host on the other end of the modem to confirm it).

--
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
Re:So, let me get this straight... by compro01 · 2011-01-07 03:39 · Score: 2

A lack of a buffer is bad, as even if a packet had to be buffered for a microsecond, it would still be dropped as it couldn't be transmitted immediately.
But a too large buffer is also bad, as it delays the packet being dropped until it times out, preventing TCP's congestion control functionality from rapidly responding to the congestion (by shrinking the window size). Eventually, it will detect the packets getting dropped as they time out, but in the meantime (possibly several seconds), it continues merrily transmitting packets, which will also get dropped, resulting in a massive spike in latency whenever it happens, as it then needs to retransmit all of those dropped packets.
With an appropriately sized buffer, small delays would result in data being buffered for short times, but if it would face a too long delay (determined by the buffer being full), it would simply get dropped, alerting everyone to slow down, resulting in only a small number of packets needing to be retransmitted and preventing a large latency spike.

--
upon the advice of my lawyer, i have no sig at this time
Re:So, let me get this straight... by compro01 · 2011-01-07 03:44 · Score: 3, Insightful

Yes, lack of buffers would be bad, as even a trivial delay would result in a packet getting dropped. Oversized buffers are also bad, as they simply delay the packet getting dropped, preventing congestion control from reacting in a timely manner. The buffers need to be sized appropriately relative to the link speed and typical latency.
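One common rule of thumb for "sized appropriately" is the bandwidth-delay product: roughly the amount of data in flight on the path during one round trip. The sketch below assumes that rule of thumb; the helper name `bdp_bytes` and the example link figures are hypothetical, and real sizing depends on traffic mix as well.

```python
def bdp_bytes(rate_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes for a link rate and round-trip time."""
    return rate_bits_per_s * rtt_ms / 1000 / 8


if __name__ == "__main__":
    packet_bytes = 1500  # assumed full-size Ethernet packet
    # Hypothetical links: (label, rate in bit/s, round-trip time in ms)
    links = [
        ("3 Mbit/s, 100 ms RTT", 3_000_000, 100),
        ("100 Mbit/s, 20 ms RTT", 100_000_000, 20),
    ]
    for label, rate, rtt in links:
        size = bdp_bytes(rate, rtt)
        print(f"{label}: about {size:.0f} bytes, {size / packet_bytes:.0f} packets")
```

A buffer many times larger than this figure holds several round trips of data, which is exactly the delay in congestion feedback described above.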

--
upon the advice of my lawyer, i have no sig at this time
Re:So, let me get this straight... by rickb928 · 2011-01-07 05:01 · Score: 3, Insightful

No.
Packet losses would be handled by adjusting to the conditions.
Look at the trace Gettys posted in the referenced article. Lots of dup packets. Get rid of those, and there's some bandwidth that can be *used*. And allowing TCP to adjust to prevailing conditions should result in less packet loss. It might seem to be less bandwidth also, but we may be in a vicious circle of increasing bandwidth to solve a problem that is NOT bandwidth. Packet loss by itself is a symptom, not a problem.

--
deleting the extra space after periods so i can stay relevant, yeah.

Naming Your Son Dev by djdevon3 · 2011-01-07 01:39 · Score: 2

Easy, name him Devone = Dev1

Re:pegged connection == latency, who'd of thunk it by vadim_t · 2011-01-07 01:42 · Score: 5, Informative

Several issues:

1. People who aren't networking engineers don't know about QoS, or don't know/want to know how to configure it.
2. QoS used that way is a hack to work around an issue that doesn't have to be there in the first place
3. How do you determine the maximum throughput? It's not necessarily the official line's speed. The nice thing about TCP is that it's supposed to figure out on its own how much bandwidth there is. You're proposing a regression to having to tell the system by hand.
4. QoS is most effective on stuff you're sending, but in the current consumer-oriented internet most people download a lot more than they upload.

Re:pegged connection == latency, who'd of thunk it by TheThiefMaster · 2011-01-07 01:46 · Score: 2

The problem is that maxing your connection from one site is causing everything else you do on your connection to be delayed / dropped as well, because it ends up queued behind anything that got buffered mid-transit from the first site. With a smaller buffer the large transfer would start to drop packets and back off sooner, allowing packets from other sources to "hop the queue".

You have not RTFA or not UTFA.. by bmajik · 2011-01-07 01:46 · Score: 5, Informative

What Jim is saying is that TCP flows try to train themselves to the dynamically available bandwidth, such that there is a minimum of dropped packets, retransmits, etc.

But in order for TCP to do this, packets must be dropped _fast_.

When TCP was designed, the assumptions about the price of ram (and thus, the amount of onboard memory in all the devices in the virtual circuit) were different -- namely, buffers were going to be smaller, fill up faster, and send "i'm full" messages backwards much sooner.

What the experimentation has determined is that many network devices will buffer 1 megabyte or MORE of traffic before finally dropping something and telling the tcp originator to slow down. And yet with a 1 meg buffer and a rate of 1 megabyte per second.. it will take 1 second simply to drain the buffer.

The pervasive presence of large buffers all along the tcp vc, and the non-specified or tail-drop drop behavior of these large queues means that tcp's ability to rate limit is effectively nullified, and in situations where the link is highly utilized, many degenerate behaviors occur, such that the overall link has extremely high latency, and that bulk traffic will cause interesting traffic to be randomly dropped.

Personally, I used pf/altQ on openBSD to try and manage this somewhat.. but it's a dicey business.

--
My opinions are my own, and do not necessarily represent those of my employer.

Re:pegged connection == latency, who'd of thunk it by suv4x4 · 2011-01-07 01:49 · Score: 4, Funny

Really, what's the problem here?

You really don't see the problem? How can you be so naive. Maybe you're new to this. All signs point to the fact there is a problem.

Of course the problem is not obvious. The article itself says it'll completely surprise us. They know we won't believe it at first. But that's why we must believe it, or else it's Armageddon.

Would you risk an Armageddon, because of your inability to understand and see?

And that's, in short, why we must attack Iraq.

Wait, what were we talking about :P?

Re:pegged connection == latency, who'd of thunk it by TheThiefMaster · 2011-01-07 01:54 · Score: 5, Interesting

As an extreme example, say you request a 1GB file from a download site. That site has a monster internet connection, and manages to transmit the entire file in 1 second. The file makes it to the ISP at that speed, who then buffers the packets for slow transmission over your ADSL link, which will take 1 hour. During that time you try to browse the web, and your PC tries to do a dns lookup. The request goes out ok, but the response gets added to the buffer on the ISP side of your internet connection, so you won't get it until your original transfer completes. How's 1 hour for latency?

The situation is only not that bad because:
A: Most download sites serve so many people at once and/or rate limit so they won't saturate most people's connections
B: Most buffers in network hardware are still quite small

Re:I think buffers are a good thing by Coriolis · 2011-01-07 01:59 · Score: 5, Interesting

He's not arguing against application-level caching. He's saying that too much caching at the IP layer is confusing TCP's algorithm for deciding how fast the link between two points is. This in turn causes massive variability in how fast the data can be downloaded; or in your terms, how fast the video can be buffered (and, in fact, how much buffer the video player needs).

--
Rgasuya aata! : I have been coding Perl and cannot tell where my fingers are now!

Re:QoS by Megane · 2011-01-07 02:00 · Score: 4, Informative

After reading TFSeries, the problem is excessive buffering (as in 1-10 or more seconds worth of data) screwing up TCP/IP's automatic bandwidth detection. QoS helps a little bit by getting the important packets (especially ACKs) through, but high-bandwidth TCP connections are still going nuts when they hit a slower link with excessive buffering.

And one of the major offenders is Linux commonly defaulting to a txqueuelen of 1000.
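To see why a transmit queue of 1000 packets matters, it helps to convert packets into seconds. A rough sketch, assuming every queued packet is a full 1500-byte frame and the queue is completely full; the function name `queue_delay_s` and the uplink rates are hypothetical.

```python
def queue_delay_s(queue_pkts, packet_bytes, link_bits_per_s):
    """Time for the last packet in a full queue to reach the wire."""
    return queue_pkts * packet_bytes * 8 / link_bits_per_s


if __name__ == "__main__":
    txqueuelen = 1000     # the default mentioned above
    packet_bytes = 1500   # assumed full-size Ethernet packet
    # Hypothetical link rates in bit/s
    for rate in (1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
        delay = queue_delay_s(txqueuelen, packet_bytes, rate)
        print(f"{rate / 1e6:6.0f} Mbit/s -> {delay:8.3f} s of queued data")
```

A queue that is harmless at gigabit speed represents whole seconds of latency once the same machine is sending over a slow uplink.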

--
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }

Re:Buffering of what? by nedlohs · 2011-01-07 02:02 · Score: 2

Buffering of packets, "network path" can't be referring to anything else.

Re:pegged connection == latency, who'd of thunk it by Dunbal · 2011-01-07 02:04 · Score: 3, Informative

but in the current consumer-oriented internet most people download a lot more than they upload.

Because the current consumer infrastructure forces it onto you. I would happily seed my torrents all year long, except I only have 1/12th the uploading bandwidth as I have for downloading. Since I need some of it for other things, uploading becomes impractical.

It's easy to blame the consumer, but there's a certain model imposed on him from the start.

--
Seven puppies were harmed during the making of this post.

Concerning Boiled Frogs by wiredog · 2011-01-07 02:05 · Score: 4, Informative

If you put a frog in a pot of water and slowly raise the temperature it will try to jump out before the water reaches a temperature that is fatal to the frog.

--

Best Slashdot Co

Re:Concerning Boiled Frogs by TheRaven64 · 2011-01-07 02:14 · Score: 5, Funny

Only if you use a real frog. You can kill a hypothetical frog in this way.

--
I am TheRaven on Soylent News
Re:Concerning Boiled Frogs by ladadadada · 2011-01-07 02:30 · Score: 4, Funny

If you put a frog in a pot of water and don't even bother boiling it, the frog will jump out anyway.
If you were to find a frog in its natural habitat where it's happy to sit all day waiting for food to drift past and boil that environment slowly, you might actually have an experiment on your hands... and an ethics committee on your tail.
Boiling a lagoon is left as an exercise for the reader.

--
Sig matters not. Judge me by my sig, do you?
Re:Concerning Boiled Frogs by alien-alien · 2011-01-07 03:01 · Score: 2

You can keep the frog in there longer though if you buy a ram and have him watch the pot. Having the ram watch the pot may help a bit as the water will heat up slower - and most certainly will never boil.
Efficiency may decrease if you put all this on a buffé (or RAM Table), which TFA clearly states will make the frog more jittery.
The one exception to this appears to be when you employ a Serf (or Page) to watch over the buffé. Supervised Page Tables actually benefit from more RAM and you will need fewer Serfs i.e. it will make serfing more efficient.
YMMV
Re:Concerning Boiled Frogs by zm · 2011-01-07 03:15 · Score: 5, Funny

Use a lid.

--
Sig ?

Re:Buffering of what? by SuricouRaven · 2011-01-07 02:07 · Score: 2

Packets.

Re:pegged connection == latency, who'd of thunk it by TheRaven64 · 2011-01-07 02:10 · Score: 3

Mod parent up. Half way down the comments, and this is the first post that actually explains why 'bufferbloat' is something I should care about.

--
I am TheRaven on Soylent News

Re:Looks like a hype by ledow · 2011-01-07 02:12 · Score: 5, Insightful

You haven't read the article (or the many others around on LWN.net on the same topic). Basically, large buffers in networking gear, from DSL routers on your home network through to ISP's, mean that interactivity is *shite*. You might download Gb's but in terms of interactive applications it's useless and we're facing ever-increasing latency and problems through wanting to cope too much with errors and delays (e.g. huge buffers to keep resending instead of just letting packets drop and having TCP sort it out by retransmission). TCP windows never shrink because errors are buffered and retried so much by intermediate devices that any sort of window scaling is worthless because it doesn't *see* any packet-loss.

Same devices, smaller buffers, everything works fine and "faster" / "more responsive" all around. It actually would *save* money on new devices because you don't need some huge artificial buffer, you can just drop the occasional packet. But the problem is so deeply embedded into run-of-the-mill hardware that it's almost impossible to escape at the moment and thus EVERYONE from large businesses to home users are running on a completely sub-optimal setup because of it. Almost every networking device made in the last few years has buffers so large that they cause problems with interactivity, bandwidth control, QoS, etc. It's NOT just that a "faster connection" solves the problem - we are getting a percentage of optimal service that's steadily decreasing as buffers increase even though we're improving all the time. That's the point. And it *is* caused by memory prices because memory is so cheap that a huge thoughtless buffer costs no more than a tiny, thought-out buffer.

Re:pegged connection == latency, who'd of thunk it by TheThiefMaster · 2011-01-07 02:20 · Score: 3, Funny

So naturally, I instantly get modded down.

QoS by leuk_he · 2011-01-07 02:25 · Score: 2

QoS generally does not work beyond the first hop. Your provider will most likely drop any QoS data. Some providers will try to make their own QoS systems (e.g. to show a low ping). However, if the latency has a great variance due to all kinds of buffers, any algorithm will get the bandwidth wrong.

QoS based on network types will get it wrong. For pure browsing /downloading it is relatively simple, but for VPN Encrypted skype udp traffic, game data it will never be optimal.

And as the blogger wrote, there is not a simple solution, because the end user has a "dad the internet is slow today" mentality. Couple that with a "reinstall your windows" helldesk and the solution becomes VERY HARD.

Re:QoS by Shakrai · 2011-01-07 02:27 · Score: 5, Interesting

Given that most traffic on a domestic connection is incoming, that doesn't help much.

It's not that hard to shape downstream traffic. Take a Linux router with two ethernet cards. eth0 is the LAN and eth1 is the internet. You shape eth0 with a maximum throughput of 75%-80% of your line speed. All of the downstream traffic has to go out on that interface so that's your opportunity to shape it. I do this at work and successfully share a 3.0 Mbit/s connection with 60+ employees. We use latency sensitive services like VoIP and RDP alongside streaming video and other large downloads without any major hassles. It stinks to lose some of your bandwidth because of this (you have to shape it to a number less than 100% of your line speed, otherwise buffering occurs at your ISP and your QoS scheme is defeated) but I'll take responsiveness over throughput any day of the week.

--
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.

ECN - Explicit Congestion Notification by ei4anb · 2011-01-07 02:32 · Score: 2

The issue is that many IP stacks do not handle ECN (Explicit Congestion Notification) and only know when the link is saturated by packet loss. Huge buffers hide the problem. A solution is to use ECN, that's what it's for http://en.wikipedia.org/wiki/Explicit_Congestion_Notification

Re:ECN - Explicit Congestion Notification by leuk_he · 2011-01-07 03:17 · Score: 2

No. There is no simple solution. ECN might help, but as the link already points out, it is disabled by most common implementations. I think that all hops need queue management for this to work, but I am too lazy to read the entire RFC.
I don't dare think about relying on ECN in tunneled protocols (VPN).
I am not saying that ECN is bad, but it is a different discussion from bufferbloat. The problem is large buffers causing jitter in the delay. ECN might help in minimizing congestion, but if the end result of ECN is that ECN clients are slower than non-ECN clients I know what will happen....

Re:pegged connection == latency, who'd of thunk it by Teancum · 2011-01-07 02:35 · Score: 4, Insightful

This is an excellent explanation of what issues are happening here. I can clearly see that this is an issue, and the problem is something that over time will impact everybody.

The problem is really focused on trying to deal with differences in bandwidth between computers... always a problem but in this case trying to match up slow connections with fast connections is particularly difficult. Since memory is cheap, a 1 GB buffer certainly can be found in some devices now and perhaps much more. I don't see this example as being really too far off the mark in the near future.... which is the point being raised and why buffer bloat is such a big deal.

More to the point, some of the complaints that triggered the "quality of service" debate are rooted in this problem. As mentioned in the original article triggering this whole slashdot thread, setting up "quality of service" priorities only creates multiple buffer queues.... it doesn't solve the problem of the monster queue to begin with. That is why the author of the blog post suggests that the debate over network neutrality is not based upon the real problem that is facing network engineering and why it is a political solution in search of a problem.

It takes awhile to "grok" this problem, but once you do it becomes obvious why this is such a huge deal.

Re:pegged connection == latency, who'd of thunk it by Shakrai · 2011-01-07 02:36 · Score: 2

People who aren't networking engineers don't know about QoS, or don't know/want to know how to configure it.

*shrug*, not my problem :)

QoS used that way is a hack to work around an issue that doesn't have to be there in the first place

The issue is always going to be there. Pegged connection == FIFO queuing, absent some sort of QoS scheme.

How do you determine the maximum throughput? It's not necessarily the official line's speed.

If you aren't getting the line speed you paid for then you need to find another ISP.

The nice thing about TCP is that it's supposed to figure out on its own how much bandwidth there is

And it does, even with QoS. All you do with QoS is force the buffering to happen on equipment that you control rather than equipment your ISP controls. In this manner you can ensure that time sensitive packets (interactive VPNs, VoIP, etc.) don't sit in the queue behind someone's Windows Update download.

QoS is most effective on stuff you're sending

It's not really all that difficult to shape downstream traffic. All you need is a router between your internet connection and LAN clients. I've done this for years at my office using the QoS functionality of the Linux kernel. We are located out in the middle of nowhere with T-1s as our only means of connectivity. Sharing a 3.0mbit/s connection with 60+ employees without QoS is virtually impossible if you need to run interactive protocols.

--
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.

Re:Someone with networking chops by SuricouRaven · 2011-01-07 02:45 · Score: 2

It's not, really. There are many buffers involved. Some of them will be in the network infrastructure - routers, firewalls - that users have no control over. A lot more are in devices they control, but that don't allow configuration of such low-level parameters. Cable modems, home routers, access points.

Re:pegged connection == latency, who'd of thunk it by bcmm · 2011-01-07 02:50 · Score: 3, Informative

That makes no sense. It doesn't matter how fat their pipe is because your computer needs to receive and ack those TCP packets. They can't just dump the file and close the connection.

OK, not on the (intentionally ridiculous) scale used in the example, but people are doing something very similar to what you describe, even though they "can't do that". http://slashdot.org/article.pl?sid=10/11/26/1729218

--
# cat /dev/mem | strings | grep -i llama
Damn, my RAM is full of llamas.

It will be a hack by dachshund · 2011-01-07 03:02 · Score: 4, Insightful

2. QoS used that way is a hack to work around an issue that doesn't have to be there in the first place
3. How do you determine the maximum throughput? It's not necessarily the official line's speed. The nice thing about TCP is that it's supposed to figure out on its own how much bandwidth there is. You're proposing a regression to having to tell the system by hand.
4. QoS is most effective on stuff you're sending, but in the current consumer-oriented internet most people download a lot more than they upload.

While the Internet in-theory is beautiful, our modern implementation really is a series of layered hacks. And the solution to Bufferbloat is going to be another hack. You're crazy if you think that the solution to the Bufferbloat 'problem' is going to be some fundamental redesign of the TCP protocol (how would you force 10 people to use it?), or the total re-architecture of millions of consumer devices to remove buffering. You're also crazy if you think the ISPs and backbone providers are going to stand by while this thing kills the Internet.

So the question is: which hack will it be? The GP poster already identified one that works well enough --- using QoS to control flows. Your final objection about content providers stressing connections is the real one. But there's probably some good hack to deal with it --- or more likely a series of hacks, some at the content providers themselves (e.g., Netflix), some in the backbone, and some at your ISP. It won't be elegant, but it will keep this problem from ever becoming anything more than a few cranky blog posts.

Re:Looks like a hype by tippen · 2011-01-07 03:05 · Score: 2

A big part of the problem is customers complain mightily when network devices congest (drop packets). Congestion is easily monitored. Additional latency is much harder to measure and most customers are less likely to notice it.

Less support calls == better margins

For some devices, it's clear that the engineers that built it don't understand networking very well and that's where the problem crops up. For others, it reduces pain on the field support organization so, in some ways, customers are doing it to themselves.

And this is a known problem, and fairly intuitive by ebrandsberg · 2011-01-07 03:07 · Score: 3, Interesting

Let me summarize the problem that is being observed: On a given interface, if you have more buffer memory than is needed as packet buffer on the transmit side, it can induce latency. As an example, consider a 1Mb/s link. If you want to have a peak of .1s latency added by buffering at high load, then you want 1Mb/s * .1s = 100,000 bits = 12,500 bytes of buffer. If you have 1MB of buffer, then you have 8 seconds of buffer, therefore triggering the "buffer bloat" issue. Part of the problem is that buffer size would be set based on the top speed a piece of hardware could drive, i.e. if you want a 1Gb/s interface to be able to buffer .1s, and then you use it at 100Mb/s, it has 1s worth of buffer. In most home deployments where you have a router that may have a 1Gb/s upstream, maybe 4 100Mb/s physical connections, and a 54Mb/s wireless router, you probably have a shared buffer for all the interfaces. The result of this is that when using the 54Mb/s wireless, you can easily have the buffer oversaturated, while the buffer size may be just right for the 100Mb/s interfaces.
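The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of the numbers quoted above; the function names are made up for illustration, and "1MB" is taken to mean 1,000,000 bytes.

```python
def buffer_bytes(link_bps: float, max_delay_s: float) -> float:
    """Buffer size (bytes) that takes max_delay_s to drain at link_bps."""
    return link_bps * max_delay_s / 8


def drain_time_s(buffer_size_bytes: float, link_bps: float) -> float:
    """Worst-case delay (seconds) a full buffer adds on a link of link_bps."""
    return buffer_size_bytes * 8 / link_bps


if __name__ == "__main__":
    # 1 Mb/s link, 0.1 s of allowed queueing delay
    print("buffer for 0.1 s at 1 Mb/s:", buffer_bytes(1e6, 0.1), "bytes")
    # 1 MB of buffer on the same 1 Mb/s link
    print("1 MB buffer at 1 Mb/s drains in:", drain_time_s(1_000_000, 1e6), "s")
    # Buffer sized for 0.1 s at 1 Gb/s, but the port runs at 100 Mb/s
    gig_buffer = buffer_bytes(1e9, 0.1)
    print("same buffer at 100 Mb/s drains in:", drain_time_s(gig_buffer, 100e6), "s")
```

The same buffer is ten times "deeper" in seconds every time the link speed drops by a factor of ten, which is the mismatch described above.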

What is the solution to this? Realistically, the alternative is to drop packets that have resided in the buffer longer than a configured amount of time, which causes its own performance issues. Net result: TCP would slow down for a period of time, but would speed up again, resulting in a sawtooth behavior. This would result in periodic issues with other protocols as well, i.e. VOIP would have dropped packets every time TCP ramps up again, etc.

Solution: Don't download porn when you are trying to do VOIP calls.

Yes, buffers can introduce latency by perpenso · 2011-01-07 03:09 · Score: 5, Informative

Latency is bad? Bigger buffers = more latency?

Buffers increasing latency is not exactly a new phenomenon. It's been observed and taken into design considerations for quite some time. For example back-in-the-day serial chips essentially had a buffer of one byte. The CPU fed data one byte at a time as the buffer became available and latency was pretty low since data was immediately transmitted. As more capable serial chips became available larger buffers were introduced. A newer chip may have a larger buffer but it may also not transmit data as soon as it has a single byte. It was common to have two programmable thresholds to begin a data transmission, (1) when a certain amount of data has accumulated in the buffer or (2) when a certain amount of time has elapsed. So if a "packet" to transmit was small enough it may sit in the buffer until (2), hence more latency with larger buffers. Software that cared generally began to issue flush commands to cause anything in the buffer to be sent immediately.
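The two-threshold behaviour can be modelled in a few lines. This is a toy Python model, not the register interface of any real serial chip; the class and method names are invented for the example, and time is passed in explicitly so the behaviour is easy to follow.

```python
class ThresholdBuffer:
    """Transmit buffer that sends when enough data OR enough time has built up."""

    def __init__(self, size_threshold: int, time_threshold: float) -> None:
        self.size_threshold = size_threshold  # threshold (1): bytes queued
        self.time_threshold = time_threshold  # threshold (2): seconds waited
        self._pending = bytearray()
        self._oldest = None  # time at which the oldest pending byte was queued

    def write(self, data: bytes, now: float) -> bytes:
        """Queue data; return whatever gets transmitted right now (maybe nothing)."""
        if data and not self._pending:
            self._oldest = now
        self._pending += data
        return self.poll(now)

    def poll(self, now: float) -> bytes:
        """Transmit if either threshold has been reached."""
        if not self._pending:
            return b""
        full = len(self._pending) >= self.size_threshold
        stale = now - self._oldest >= self.time_threshold
        return self.flush() if (full or stale) else b""

    def flush(self) -> bytes:
        """Explicit flush: send everything immediately."""
        out = bytes(self._pending)
        self._pending.clear()
        self._oldest = None
        return out


if __name__ == "__main__":
    buf = ThresholdBuffer(size_threshold=8, time_threshold=0.010)
    print(buf.write(b"hi", now=0.0))    # too small, nothing goes out yet
    print(buf.poll(now=0.010))          # timer expired, the two bytes go out
```

A small write only leaves when the timer fires, so the added latency is bounded by the time threshold unless the software calls flush() itself.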

Network cards and/or the operating system may try to similarly accumulate data before transmitting a packet.

Re:Yes, buffers can introduce latency by GooberToo · 2011-01-07 04:49 · Score: 5, Insightful

It doesn't help that massive numbers of people actively insist on breaking protocols which specifically exist to alleviate some of these types of problems.
Far too many people ignorantly block all ICMP traffic. As a result, the network path in between the two communicating hosts is forced to buffer more data as the destination host becomes saturated. Worse, this type of filtering has a tendency to quickly compound, which in turn creates the exact type of bufferbloat he's describing.
I wish people would understand there is a difference between, "No route to host", and a black hole. When you find a black hole, chances are really good you've found a host. As such, purposely breaking protocols for people to have an imagined increase in security only breaks the Internet as a whole when it becomes a wide spread tactic. And before people start rattling off that it opens a whole new can of worms, please realize that unlike in the past, stateful firewalls are extremely common today - so no.
Re:Yes, buffers can introduce latency by petes_PoV · 2011-01-07 05:26 · Score: 2

Ring buffers in serial ports are not quite the same thing. With a serial port, once the ring buffer had filled (i.e. input pointer == output pointer) the sourcing program would either be descheduled, pause for a time or loop until there was space in the buffer to put the next byte of data. Nothing was lost.
With network buffers, what JG seems to be saying is that this does not happen. As packets arrive at whatever the choke point is in the circuit, there is no method for telling the sender to stop sending - the packets just keep coming. As a consequence, once the buffer has filled something starts dropping them - relying on the TCP error correcting protocols to resend "lost" packets.
The problem he's describing is the lack of an XOFF or DTR/DSR handshaking in the lower-level transports. Either that or incorrectly set window sizes, so packets are sent even though a certain number of earlier packets have yet to be ACK'd.
I have to say that I have not experienced the issues JG raises. I can easily get 1.4MByte/second off my 14269 kbit/s ADSL downlink and it will send me data at this speed all day. Maybe our European infrastructure is adequately sized for the number of users and amount of traffic?

--
politicians are like babies' nappies: they should both be changed regularly and for the same reasons

Jim Gettys did the world a great service with this by iwbcman · 2011-01-07 03:15 · Score: 5, Interesting

I discovered this series of blog posts about 2 months ago, when he accidentally published one of his blog posts prematurely. I started reading it and followed the links and saw that this was like a sleuth tale - if I had started reading this with his very first blog on the topic I would have had no idea where he was going with this. Now as to why this contribution by Jim Gettys does the world a great service:

Gettys is not pointing fingers at someone. The problem he is describing is truly vast, and involves lots of different people in different industries (router manufacturers, ISPs, kernel driver authors, carrier grade network manufacturers, etc.) with, presumably, a myriad of different intentions. The problem has been building over a long time - this didn't start yesterday, and won't be solved in a short time span without a concerted effort on the part of everyone involved in all of these divergent industries, who often have quite divergent interests.
This approach that Gettys takes allows him to describe a problem which confronts everyone. By taking the high road and not pointing fingers he is able to address an issue in such a way that a lot of the people who did contribute to this problem can recognize what they have done and own it, without being labelled, accused or feeling attacked. This should be a lesson to anyone who really wants to redress an issue that affects everyone.
Gettys develops this theme over many, many blog posts. It makes for some of the best internet reading I have experienced in years. Things only gradually become clearer-not merely what the problem is, but also all of the issues involved in it. I can read away in the internet for months at a time and not learn as much as I did by reading this series of posts.
Gettys knows what he is talking about. He developed this theme by talking with lots of experts -engineers at the ISP, people who played a pivotal role in the creation of the www and network specialists. He himself is not a network specialist, but he went out and met with people to discuss his findings and took clues and information from these exchanges to inform him and his quest to find out what was going on.
The series is short on answers. It may prove frustrating to many that he offers so little in the way of solutions to this problem. But this is due to the fact that the problem cannot be resolved by you, the end user. To solve this problem means rearchitecting countless millions of devices and altering hundreds of thousands of lines of code in multiple OSes.
Failure to redress this problem means that every effort to decrease latency by upping available bandwidth or upgrading network infrastructure will fail to deliver. If packets are not dropped fast, due to excessive buffering, the negotiation process fails, which invariably means congestion, which means latency - only something that addresses this issue has any chance of actually effecting change. Saying that this problem is just an issue already solved by QoS shows that you don't understand the problem.
One of the first thoughts I had reading this was: if the techs on Wall Street read this article they will inevitably exploit this issue to win precious milliseconds on the stock exchange - ring a bell?
Any ISP could exploit this issue to offer a relative market advantage. Sadly, when resolving an issue is in everybody's interest, market players will exploit the issue for their own relative gain. Getting everyone to actually tackle this is going to be a gargantuan task.

Hats off to Jim Gettys. Thanks for your service.

Re:Buffering of what? by mikael · 2011-01-07 03:18 · Score: 4, Informative

Within a router it would be the actual IP data packets that are being buffered. A standard router has a number of network interfaces (token ring, ethernet, wireless, ISDN, whatever....). Each network interface is a piece of hardware that is memory mapped to allow the CPU to send and receive packets. Each hardware device also has a small online memory buffer to store the most recently received or transmitted addressed data packets (every protocol layer down to the MAC source and destination address, IP address, sequence number as well as the data). Depending on system and packet size, that could be anything between 1 and 16.

The usual implementation was to have each hardware device generate an interrupt whenever some data had been received and to transfer the data from internal memory to a common pool in system RAM. The latter was divided up into pre-allocated blocks with a few large blocks (>1000 bytes) and many smaller blocks (512 bytes). Someone might have done a statistical analysis of the theoretical distribution of the size of packet data being sent through the network. Most of the time this worked out, but there were problems that happened sometimes. If all the smaller blocks were in use, then the larger blocks were used instead. For efficiency, these wouldn't be transferred through the system until the entire block had been filled up with data, so if you have a stream of 128 byte packets, it would take eight of them before the larger block was filled. For some systems, packet sizes were enhanced to 4K or even 8K. A constant high-speed stream of small packets was most likely to do this.

Also, many of the hardware devices would simply overwrite the contents of one unprocessed data packet with the contents of the latest arrival if it wasn't collected fast enough. So that could really mess up sequence numbers.

--
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads

Blue Öyster Cult by tepples · 2011-01-07 03:54 · Score: 2

In Canada references to the Bank of Canada in news stories have lately been abbreviated to BOC.

That's because unlike "Federal Reserve" and "Federal Express", "Bank of Canada" doesn't have a snappy, pronounceable contraction (Fed and FedEx respectively).

When I read "BOC to raise interest rates" I always wonder why the Blue Oyster Cult is doing that.

No, that'd be "BÖC to raise interest rates". BÖC was probably the first rock band to incorporate a gratuitous diaeresis in its name. The root problem here is that BOC's dis am bigger than yours.

Re:pegged connection == latency, who'd of thunk it by Keramos · 2011-01-07 04:04 · Score: 4, Informative

There is no 'bufferbloat because RAM is getting cheaper'. What he is seeing is what happens when you want to saturate your link. ... ...you get either a buffered or a dropped packet.

Yes, and if a link is saturated, there should be packet drops, which TCP senses, then automatically throttles back to reduce the required bandwidth and avoid saturation. But what is happening, is that these huge buffers are holding packets that would otherwise be dropped, and so TCP doesn't get the feedback it needs to detect saturation. So it continues transmitting at full speed, believing it has uncongested pipes, which in turn continues to fill the buffers, and so on.

Because of the buffers, most of these packets are eventually getting through, but maybe in seconds instead of tens or low hundreds of milliseconds. Thus you're getting huge latency.

Jitter is caused by the buffers eventually filling or TCP timing out (registering packet loss), dropping the rate for a little bit, the buffers draining, then TCP upping the rate again as the buffers refill, hiding the saturation, until they're full again. Rinse and repeat.

It's related to the "bloat" of buffering (due to the increasing affordability of RAM and the "more of a good thing must be better than a little of a good thing - QED" mindset) because, if the size of the buffer is kept below a certain point related to the pipe bandwidth and number of traffic streams, it tends to act just as a temporary "buffer" against spikes in the traffic (the intention of buffering), and can't cause the scenario above, having insufficient capacity to overload the bandwidth just from buffer contents alone. Above this threshold, the latency issues and back-and-forth thrashing noted above occurs. The bigger the buffers, the worse the effect.

And it's not just a "well, keep your traffic below x mbit if you're on ADSL2" issue, because it happens anywhere a high capacity pipe interfaces with a low capacity or otherwise congested (of any capacity) pipe. This might be your ISP's backbone which is getting hit by several thousand people downloading the latest WOW patch simultaneously, causing your 300kbps Skype call to go to hell through latency and jitter. If the ISP's equipment had smaller buffers, the servers would be throttling back as packet loss occurred. You'd probably still be losing packets, but they'd be detected and re-transmitted pretty quickly and you possibly wouldn't notice the latency or have jitter.

What he is seeing is what happens when you want to saturate your link.

So, no, what you get with appropriate buffers is your TCP connection moderating itself to the appropriate link capacity and availability, and latency remaining approximately the same (relative to what you're seeing in bufferbloat, but worse than an uncongested link, obviously).

With bufferbloat, your bandwidth appears to remain about the same, but your latency balloons massively and you get jitter effects as above.
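The feedback loop described above can be seen in a toy simulation. This is a deliberately crude Python sketch, not real TCP: one sender adds one packet per tick to its rate until the bottleneck queue overflows, then halves it. The function name, the tick-based time unit and all the numbers are assumptions chosen for illustration.

```python
def simulate(buffer_pkts: float, capacity: float = 100, ticks: int = 2000):
    """Return (average throughput per tick, worst queueing delay in ticks)."""
    rate = 1.0         # packets the sender offers per tick
    queue = 0.0        # packets waiting in the bottleneck buffer
    delivered = 0.0
    worst_delay = 0.0
    for _ in range(ticks):
        queue += rate
        sent = min(queue, capacity)    # the link drains at most `capacity` per tick
        queue -= sent
        delivered += sent
        if queue > buffer_pkts:
            queue = buffer_pkts        # tail drop: the excess is lost
            rate = max(1.0, rate / 2)  # loss seen, sender backs off
        else:
            rate += 1.0                # no loss seen, sender keeps ramping up
        worst_delay = max(worst_delay, queue / capacity)
    return delivered / ticks, worst_delay


if __name__ == "__main__":
    for size in (10, 5000):
        throughput, delay = simulate(size)
        print(f"buffer={size:5d} pkts  throughput={throughput:6.1f}/tick  "
              f"worst queue delay={delay:5.1f} ticks")
```

In this model the large buffer buys a bit of throughput, but the sender only hears about congestion once the queue is full, so the worst-case queueing delay grows in proportion to the buffer size.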

Enlightment by pinkeen · 2011-01-07 04:09 · Score: 2

This has been enlightening. I've suffered very similar problems at home, but instead of figuring out the problem I replaced the hardware... After reading TFA all fits perfectly. I had occasional "chokes" - sites would take ages to load, pings wouldn't return from my Wi-Fi router, DNS queries took ages. All while downloading a big file or something. But what's significant is that the throughput would stay high. It was strange as hell - high throughput (an ongoing large transfer [but not large enough to saturate the connection]) but other things choke.

Re:Looks like a hype by PseudonymousBraveguy · 2011-01-07 04:11 · Score: 2

Larger buffers do not really decrease congestion as far as TCP is concerned: With a large buffer TCP will simply send more/faster, until the buffer overflows. The congestion will simply manifest a tiny bit later, but much, much more severely.

Things change at large scale by farnz · 2011-01-07 04:18 · Score: 5, Informative

How much bandwidth can I have, though? Take the link between my desktop and a Slashdot server; is the correct answer "1GBit/s, no more" (speed of my network card)? Is it "20MBit/s, no more" (speed of my current Internet connection)? Is it "0.5MBit/s, no more" (my fair share of this office's Internet connection)? In practice, you need the answer to change rapidly, depending on network conditions - maybe I can have the full 20MBit/s if no-one else is using the Internet, maybe I should slow down briefly while someone else handles their e-mail.

TCP doesn't slam the network; it starts off slowly (TCP slow start currently sends just two packets initially), and gradually ramps up as it finds that packets aren't dropped. When packet drop happens, it realises that it's pushing too hard, and drops back. If there's been no packet drop for a while, it goes back to trying to ramp up. RFC 5681 talks about the gory details. It's possible (bar idiots with firewalls that block it) to use ECN (explicit congestion notification) instead of packet drop to indicate congestion, but the presence of people who think that ECN-enabled packets should be dropped (regardless of whether congestion has happened) means that you can't implement ECN on the wider Internet.

This works well in practice, given sane buffers; it dynamically shares the link bandwidth, without overflowing it. Bufferbloat destroys this, because TCP no longer gets the feedback it expects until the latency is immense. As a result, instead of sending typically 20MBit/s (assuming I'm the only user of the connection), and occasionally trying 20.01MBit/s, my TCP stack tries 20.01MBit/s, finds it works (thanks to the queue), speeds up to 20.10MBit/s, and still no failure, until it's trying to send at (say) 25MBit/s over a 20MBit/s bottleneck. Then packet loss kicks in, and brings it back down to 20MBit/s, but now the link latency is 5 seconds, not 5 milliseconds.

--
I appear to have a blog. Odd.

Re:Things change at large scale by willmorton · 2011-01-07 06:15 · Score: 2

It's actually even worse than this. Using your example, your TCP stack ramps up to 25mbps, overflows the buffer, and loses a lot of packets at once, rather than just one or a few packets that would be lost with a sane buffer. Lots of lost packets at once leads to a RTO timeout rather than a Fast Retransmit and Fast Recovery, and essentially you're starting over from zero instead of reducing your speed a little and continuing.

Re:Buffering of what? by TheLink · 2011-01-07 04:19 · Score: 2

I don't think buffering is the real problem. Buffering can help with throughput especially TCP throughput.

For example if there is a brief burst of packets higher than a router's outbound connection bandwidth supports, the router has two options:

a) buffering or queuing up the packets (assuming enough buffer space), till the output queue empties.
b) dropping the packets.

UDP latency doesn't go up if UDP packets are dropped - since there are no retransmissions.

But if TCP packets are dropped too often just because of bursts (from other connections), the speed drops significantly and the latency goes up a lot too - because the relevant parties have to wait for acks and retransmissions and the TCP window size also decreases. If the TCP packets are buffered, the TCP transmitter won't slow its transmission rate drastically (it will still slow a bit when waiting for acks), but there would be a brief increase in latency for the period the packets are being buffered.

Trouble is if it's not a brief burst, it means you're just delaying the inevitable packet drops, meanwhile the sustained delay becomes more noticeable.

What the author doesn't seem to realize is the upstream network providers should buffer (within reason - even up to a few seconds worth) - because they usually don't know which of your packets are more important to you, and thus which packets to drop first.

The solution is for the network providers to have enough internal bandwidth so that THEIR buffers rarely start to fill up, and there is minimal packet loss, AND for the user to do traffic shaping and policing at their connection (and the servers too).

Say you have a 10Mbps connection, and the ISP actually does give you a full 10Mbps. If you download from a fast site, the packets queue up when they reach your 10Mbps connection before dropping and finally reaching an equilibrium at 10Mbps. But meanwhile your interactive ssh or game connections suffer from increased latency since they share the queue - as mentioned before the ISP doesn't know which packets/connections are more important to you (it can guess of course and favour small packets, ACKs, DNS; or you can give QoS hints, but most ISPs don't bother).

To prevent this, you get your modem/firewall to "shape" your traffic at say 9.5Mbps (leave enough headroom for the initial TCP connection bursts) and choose which packets you want to drop when it goes past 9.5Mbps (for instance drop http packets first instead of your interactive or VOIP stuff, or the other way round if http traffic is more important to you for some reason). This way you control what happens AND the buffers never fill up at the ISP end since the traffic doesn't even hit 10Mbps (if their buffers don't fill up, there is no increase in latency). This also applies to your outbound connections - if your upload bandwidth is 5Mbps, shape it at 4.5Mbps or so.
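One common way to build that kind of shaper is a token bucket. The Python below is a minimal sketch under stated assumptions: the class name, the burst size and the "reserve" rule (bulk traffic may not spend the last part of the bucket, so interactive packets still find tokens) are all invented for the example, and packets that don't fit are simply dropped rather than queued.

```python
class PriorityTokenBucket:
    """Token-bucket shaper that refuses bulk packets before interactive ones."""

    def __init__(self, rate_bps: float, burst_bytes: float, reserve_bytes: float) -> None:
        self.rate = rate_bps / 8       # refill rate in bytes per second
        self.burst = burst_bytes       # maximum tokens the bucket can hold
        self.reserve = reserve_bytes   # tokens kept back for interactive traffic
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, size_bytes: int, now: float, interactive: bool) -> bool:
        """Return True if the packet may be forwarded at time `now` (seconds)."""
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        floor = 0.0 if interactive else self.reserve
        if self.tokens - size_bytes >= floor:
            self.tokens -= size_bytes
            return True
        return False  # over the shaped rate: drop (or queue) this packet


if __name__ == "__main__":
    # Shape a 10 Mbps line at 9.5 Mbps, as in the example above.
    shaper = PriorityTokenBucket(rate_bps=9.5e6, burst_bytes=3000, reserve_bytes=1500)
    print(shaper.allow(1500, now=0.0, interactive=False))  # bulk packet
    print(shaper.allow(1500, now=0.0, interactive=False))  # second bulk packet
    print(shaper.allow(200, now=0.0, interactive=True))    # small interactive packet
```

Because the shaped rate sits below the line rate, the queue forms here, where the user decides what gets dropped, rather than in the ISP's buffer.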

The narrowest part of the straw controls the flow. If the ISP is the narrowest part of the straw - the buffering and shaping is up to them. If your system is the narrowest part of the straw - it controls the flow.

If there's something wrong and somehow there's a 1Mbps bottleneck in the ISP for ALL your traffic, that bottleneck will control the flow - since it will drop packets before it reaches 9.5Mbps point where you start dropping packets - in which case the ISP is cheating you and giving you a 1Mbps connection (note though that if the 1Mbps bottleneck is only for some of the traffic your other traffic will be fine ).

Note: I'm assuming the case of sustained TCP connections, and lower UDP traffic speeds. This all doesn't work if some site is blasting UDP packets at 100Mbps to you[1].

[1] The UDP blasting could actually be a way of bypassing certain ISP throttling methods - sending lots of packets, and coping with packet loss by using forward error correction and some feedback. Fortunately not many are doing this yet :).

Re:Someone with networking chops by Neil+Boekend · 2011-01-07 04:25 · Score: 2

nope. The problem can be in the server on the ISP side. The problem appears where a big pipe (1 Gbit for example) is poured into a small pipe (1 Mbit for example) see TheThiefMaster

As an extreme example, say you request a 1GB file from a download site. That site has a monster internet connection, and manages to transmit the entire file in 1 second. The file makes it to the ISP at that speed, who then buffers the packets for slow transmission over your ADSL link, which will take 1 hour. During that time you try to browse the web, and your PC tries to do a dns lookup. The request goes out ok, but the response gets added to the buffer on the ISP side of your internet connection, so you won't get it until your original transfer completes. How's 1 hour for latency?

The situation is not that bad in practice, but only because:
A: Most download sites serve so many people at once and/or rate limit, so they won't saturate most people's connections
B: Most buffers in network hardware are still quite small
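
The latency in the example is just the amount of data queued ahead of you divided by the speed of the slow link. A rough Python sketch of that arithmetic follows; the 256 KB buffer is an illustrative assumption, not a measurement of any real device, and the function name is made up.

```python
def queue_delay_seconds(buffered_bytes, link_bits_per_second):
    """Time for a new packet to drain through a full FIFO buffer."""
    return buffered_bytes * 8 / link_bits_per_second


# The extreme example: 1 GB drained in 1 hour implies this link rate.
implied_rate = 1_000_000_000 * 8 / 3600  # bits per second, roughly 2.2 Mbit/s

# A smaller, assumed buffer: 256 KB sitting in front of a 1 Mbit/s link.
delay = queue_delay_seconds(256 * 1024, 1_000_000)

print(round(implied_rate / 1e6, 2), "Mbit/s implied by the 1 GB example")
print(round(delay, 2), "seconds of added latency for a full 256 KB buffer")
```

Even a buffer that sounds small adds seconds of delay once it sits in front of a slow enough link and stays full.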

--
Well, I might have a way, but it only works on a semi spherical planet in a vacuum.

The Sky is Falling.....NOT! by rocker_wannabe · 2011-01-07 04:50 · Score: 2

TCP contains some of the most incredible heuristic algorithms I've ever seen. Each algorithm, like Slow Start, RTT Estimation, SACK, etc., is relatively simple, but together they work incredibly well at keeping data flowing across heterogeneous networks. They work so well that I've seen TCP overcome broken ethernet drivers and make them appear to work. Unfortunately, as someone who used to look at TCP traces for a living, I can tell you it can be really hard to work backwards from packet traces to figure out what is going on in the TCP/IP stack, because there can be so much going on at the same time. This means that Wireshark in the hands of a weekend hacker can easily lead to erroneous conclusions. If you follow this link and go to section 14.5 Random Early Detection (RED) you can see that the issue is already known and there are already solutions to mitigate the problem.
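
For anyone who hasn't met RED: the core idea is to start dropping packets with a small probability before the queue is full, so TCP senders back off early. Below is a simplified Python sketch of the textbook drop-probability curve. It leaves out the count-based correction of the full algorithm, and the thresholds, `max_p` and averaging weight are illustrative assumptions, not values taken from the linked text.

```python
def update_average(avg_queue, current_queue, weight):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Simplified RED: no drops below min_th, certain drop at or above max_th,
    and a linear ramp up to max_p in between."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Illustrative parameters (assumed): thresholds in packets.
MIN_TH, MAX_TH, MAX_P = 5, 15, 0.1

for avg in (2, 10, 20):
    print(avg, red_drop_probability(avg, MIN_TH, MAX_TH, MAX_P))
```

The averaging step is what lets short bursts through while still reacting to a queue that stays long.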

Relax and take a deep breath. Now you can move on to something more important......... like where you're going to spend your eternity

--
"Meaningless!, Meaningless!" says the Teacher. "Utterly meaningless!"

Re:Buffer bloat or inadequate bandwith by houstonbofh · 2011-01-07 05:02 · Score: 2

No, the problem is that the self-regulation built into TCP is delayed by buffers. The size of the pipe is only relevant to how long it takes to fill. (And like a garage, it will always fill.) If you read the articles (have a few hours free), he shows how to find the issue on a gigabit link. That is fast even in Korea.

Re:Buffering of what? by houstonbofh · 2011-01-07 05:12 · Score: 2

The solution is for the network providers to have enough internal bandwidth so that THEIR buffers rarely start to fill up, and there is minimal packet loss, AND for the user to do traffic shaping and policing at their connection (and the servers too).

There is no such thing as enough bandwidth. It will always fill. You need to allow the built-in mechanisms to recognize when it is full. And while I agree that traffic shaping is nice (and easy with firewalls like m0n0wall), it is not in most home routers. Besides, expecting most users to do this properly when they cannot even patch their systems is folly at best.

Re:QoS by Shakrai · 2011-01-07 05:27 · Score: 2

Another possible fix is to shorten the TTL on packets, where the packets are discarded if the route has too much delay in it.

Umm, that's not what the TTL does. The TTL gets decremented by 1 for every router that touches the packet. If it hits zero without reaching its final destination, the packet is dropped. It has nothing to do with latency. Traceroute works by sending out packets with increasing TTLs (i.e. the first packet has a TTL of 1, the second has a TTL of 2, and so on) and looking at the returning ICMP time exceeded packets.
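
A toy Python model of that behaviour, in case it helps. This is a simulation only: the router names and the `forward`/`traceroute` helpers are hypothetical, and no real packets are sent.

```python
def forward(path, ttl):
    """Walk a packet along path (routers first, destination host last).

    Each router decrements the TTL by 1 and drops the packet if it reaches
    zero. Note that delay never appears anywhere in this logic.
    Returns (reached_destination, responder).
    """
    routers, destination = path[:-1], path[-1]
    for router in routers:
        ttl -= 1
        if ttl <= 0:
            # This router would answer with ICMP time exceeded.
            return False, router
    return True, destination


def traceroute(path):
    """Probe with TTL 1, 2, 3, ... and record who answers each probe."""
    hops = []
    for ttl in range(1, len(path) + 1):
        reached, responder = forward(path, ttl)
        hops.append(responder)
        if reached:
            break
    return hops


path = ["router-a", "router-b", "router-c", "server"]
print(traceroute(path))
```

A slow hop and a fast hop each cost exactly one TTL unit, which is why shortening the TTL does nothing about delay.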

Or use UDP for streaming applications rather than TCP, like it was intended, and let the end points buffer the difference.

This. UDP is tailor made for streaming applications. Nobody is going to notice if their video feed drops a frame here or there. They are going to notice if it freezes because a TCP connection is waiting for a packet retransmit.

--
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.

Re:QoS by Shakrai · 2011-01-07 05:30 · Score: 2

You just perfectly described the procedure for shaping upstream traffic.

You are shaping the "upstream" from your LAN interface. As I described it:

Linux router eth1 == internet
Linux router eth0 == lan
Path of packet from the internet: the cloud -> your ISP -> eth1 -> (NAT/routing occurs here) -> eth0 -> your LAN PC

If you shape it before it goes out on your LAN the TCP stack of your clients will respond accordingly and the overall bandwidth consumption is appropriately limited.

I don't know if wondershaper can do this, as I've always configured my QoS by hand. More flexible that way. What I've described is very easy to set up with iptables, tc and the relevant Linux kernel modules.

--
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.

Re:Buffer bloat or inadequate bandwith by OeLeWaPpErKe · 2011-01-07 08:12 · Score: 2, Interesting

You forgot the tiny little fact that unless one pulls his connection to the limit with a lot of tcp connections, there isn't any problem.

Turn off your bittorrent client while you're playing starcraft online, and the problem disappears.

The post fails to explain what happens in the case of insufficient buffers and dropped packets: it can take up to 2 minutes for TCP to recover from a single dropped packet (granted, on slow links or long-distance connections). Would you really feel that interactive response has improved if things work fast 95% of the time, and then your web browser*, for no apparent reason at all, takes 2 minutes to load pages**?

(yes, I'm an ISP's network engineer. Big buffers or small buffers ? Trust me, you want big)

* The fun thing about web browsers is that they open lots of TCP connections, then barely send any packets at all (i.e. a connection generally closes after 4-5 packets tops, sometimes after 2 packets). If you lose the first packet in a connection, which is quite likely when browsing, the SRTT algorithm has no choice but to wait 2 minutes before retrying, guaranteeing the user will have to intervene (i.e. "F5"). This results in massive deterioration of the web browsing experience with trivially small packet loss. Ever wondered why the internet is near-unusable with 0.1% packet loss on your link, and nothing at all gets through at 1% packet loss? You'd think 99% correctly transmitted packets would translate into 99% of bandwidth available, no? (In case you have this problem, a simple trick is to put everything through a single TCP connection: ssh -D 1025 server_at_work, then set up firefox to use a SOCKS5 proxy at localhost:1025. You'll have 40-50% of your link bandwidth available on recent Windows or Linux.)
** Just try it: go to a big company that's upgraded to cheap gigabit switches with tiny buffers. Ask them if, perchance, they've been experiencing sudden "timeouts". Ask again if they like this.

The way to fix this is not to go with small buffers but to have intelligent queuing algorithms in all devices. (Not that I'm expecting political interference from large groups of idiots to go in a sensible direction all of a sudden. All large groups are large groups of idiots, because most people are idiots in most subjects, so "let's get the mob to 'fix' this" doesn't work regardless of good intentions.) Of course, bittorrent will always cause this behavior, because one of two things will happen when bittorrent opens its 5000 connections:
1) either routers slow down bittorrent traffic in favor of http, which gives much better performance but results in underwear kids who haven't seen sunlight in a year shouting "NET NEUTRALITY !"
2) or they "treat all traffic the same", and with TCP the one with the most connections "wins" the most bandwidth, meaning if you open 500 connections, your web browser is only going to get 1/500th of the link bandwidth, resulting in abysmal performance
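
The "most connections wins" arithmetic in case 2 can be sketched in a few lines of Python. This assumes an idealized bottleneck that gives every TCP flow an equal share, which real networks only approximate (RTT and loss differences skew it); the function name and the connection counts are illustrative.

```python
from fractions import Fraction


def per_app_share(connections):
    """Share of the link each application gets if every flow gets an equal slice.

    connections: {application_name: number_of_tcp_flows}
    """
    total = sum(connections.values())
    return {app: Fraction(flows, total) for app, flows in connections.items()}


shares = per_app_share({"bittorrent": 500, "browser": 1})
for app, share in shares.items():
    print(app, share, f"({float(share):.2%})")
```

With 500 torrent flows against a single browser flow, the browser ends up with 1/501 of the link, which is the "1/500th" in the post.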

This is what network engineers mean when they're saying bittorrent is destroying network performance. As to what lawyers and politicians mean, yes that's something else, and frankly, I don't care.

Re:pegged connection == latency, who'd of thunk it by nedlohs · 2011-01-07 10:24 · Score: 2

There are no layers involved. It's a high level description of something that ignores lots of stuff and "oh shock horror" even gets some details wrong.

Did they ever tell you in school that electrons orbit the nucleus in shells? Oh shit! They lied! Clearly they made no sense at all!

The example is describing what the packets do when you request the file. And yes, it's exaggerating things, and yes, it's simplifying things. But that's how you describe things to people who don't know the technical details when those technical details don't really matter to understanding the overall issue.

Obviously you don't get 1GB of packets in the queue, but then again the queue isn't 1GB in size in the first place anyway. Oh look, I explicitly mentioned that too, but you're clearly too stupid to read.

Getting hung up on trivial technical details is what you're bringing; lots of other people manage to understand how analogies and high-level approximations work.

Understanding fail. by gottabeme · 2011-01-07 11:27 · Score: 2

I can see why you're posting as an AC, because you don't understand the difference between an HTTP request and the TCP connection that fulfills it. There is no requesting of packets; the request is made via HTTP, and the receiver then ACKnowledges TCP packets from the server, which may send more quickly than it receives ACKs so as to increase throughput--this then fills buffers and causes cascading latency.

You are compounding the problem by spreading misinformation. Please stop and go educate yourself.

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
