Yes, thank you. I should have said that sooner. :) I hope we can look back on your work (and the contributions of others you mentioned) as helping to avoid serious problems in the future.
Ideally I suppose you're right, depending on the CPU in the router.
Ok, thanks for the insights. I will have to log in to the modem's own control panel and check the line stats. It'd be great if that were the problem.
Theoretically it could work. In practice, if it works, it will indeed take years--hopefully not ten. Perhaps one of the largest obstacles will be convincing the marketeers that having smaller buffers is not a bad thing. They love to have the biggest numbers on their spec sheets.
I'm not sure why I'm bothering to respond to an AC, but anyway...
I'm sure you're right about QoS and buffers working well on backbones, but I'm talking about the problem end-to-end. I think bufferbloat is more an edge phenomenon, what with last-mile cable/DSL/fiber connections interfacing with poor-quality home networking gear--and wireless networks have their own special bufferbloat problems.
Most people on here have been touting QoS as a solution--mostly claiming that rate-limiting connections on LAN routers and throwing away a percentage of your rated bandwidth is the way to solve the latency problems. Your backbone/enterprise-grade QoS is a whole 'nother matter, but even that can't fix the problems at the edges.
I disagree. Netalyzr shows 3000+ms buffers on this connection, and the observed latency behavior is consistent with the "bufferbloat" problem. Throughput reaches the connection's maximum, and packet loss is normal. The only problem is extreme latency and jitter when the connection is saturated (even with a single HTTP download). The other things you mention would cause reduced throughput and increased packet loss. This is an entirely wired connection, by the way: DSL and Ethernet only.
Eh, I didn't read it as advocating replacing HTTP, merely exploring this CCNx thing to see where it goes.
He ain't senile or delusional: the problem is real. Don't shoot the messenger.
While I generally agree with you regarding UDP for streaming, it's not quite that simple. For one thing, application-level buffering combined with application-level retransmitting should handle dropped UDP packets (though reimplementing what TCP does might not perform so well in practice). However, it doesn't work out to a simple dropped frame here and there. Media transport stream packets aren't going to have video frames byte-aligned with network packets. Besides, variable-rate encoding and partial-frame encoding throw all that out the window--compressed video isn't a stream of bitmaps. When video data is dropped or corrupted, you see ugly artifacts that sometimes aren't resolved for a few seconds until the next keyframe. Dropped audio data might result in ugly, loud sound artifacts.
For VOIP or Skype-type stuff, maybe those artifacts would be tolerable. But not for all streaming video.
You forgot the tiny little fact that unless one pulls his connection to the limit with a lot of tcp connections, there isn't any problem.
Turn off your bittorrent client while you're playing starcraft online, and the problem disappears.
The post fails to explain what happens in the case of insufficient buffers - and dropped packets : it can take up to 2 minutes for tcp to recover from a single dropped packet (granted - on slow links or long distance connections). Would you really feel that interactive response has improved if things work fast 95% of the time, and then your web browser* - for no apparent reason at all - takes 2 minutes to load pages** ?
You're wrong. Netalyzr is showing 3000+ms buffering on the DSL connection I'm using at the moment. I've discovered that downloading a single file via HTTP causes multi-second latency for browsing web pages (that "2 minutes to load pages" you mentioned). Forget about BitTorrent--I'm talking about downloading one Linux ISO with wget causing new HTTP requests to take many seconds to receive the first packet back from the server.
I'm glad for you that you aren't having such serious trouble with your connections, but there are those of us who are.
You have a 14 Gigabit ADSL connection? I'm jealous. =)
Assuming you meant 14 Megabits, still, that's much better than the ~1.2 Mbit DSL connection here in Texas. While visiting family here, I've found that downloading one file via HTTP causes all other traffic to suffer multi-second latency. Netalyzr shows 3000+ms buffers. But using my own connection at home I don't suffer this problem. I'm guessing the ISP hardware here has oversized buffers now, because it wasn't always this bad.
Solution: fix the software in the routers, modems, etc. to use appropriate buffer sizes, and use the Internet as it was intended, to do whatever you want whenever you want, sharing bandwidth and latency equally.
That's not really a solution, merely a hackish workaround--an extra layer of software complexity to work around a hardware problem that shouldn't exist. If the buffers were sized appropriately, rate-limiting wouldn't be necessary to avoid latency. Rate-limiting and QoS should be used to prioritize bandwidth, not latency.
It's not that simple. Ideally, if, for example, a 1 Mbit DSL connection were saturated with one HTTP download, and the user then started loading a web page, the packets of the web page's request should have no more latency than the download's packets. However, when buffers are several thousand milliseconds in length, it doesn't work that way; the download's packets are constant and fill the buffers, and the web page's packets, being small and bursty, have to wait to get through the queue. If the user started a second download, both downloads would end up having equal shares of bandwidth. However, when one connection is constant and saturates the buffers, and another connection is interactive and bursty, the bursty one will suffer latency.
The correct solution is to size buffers so they are short when measured in time (the size in bytes being relative to bandwidth). This way, even if the buffers are saturated, our hypothetical web page's packets won't have to wait long to get through the buffers, and latency won't be much worse than on an idle connection. TCP is made to rate limit connections itself--it's just that these pesky, oversized buffers defeat TCP's rate-limiting mechanisms. Having to rate limit your software or your own upstream bandwidth is an ugly hack that wastes bandwidth. Just shrink the buffers with software patches.
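The arithmetic behind "size in bytes being relative to bandwidth" is simple enough to sketch. The function names and the 100 ms target below are hypothetical choices of mine, just to show the conversion in both directions.

```python
def buffer_bytes_for_delay(link_bps, max_delay_ms):
    """Largest buffer (in bytes) that drains within max_delay_ms at link_bps.
    Integer math: bits/s * ms / 1000 gives bits, / 8 gives bytes."""
    return link_bps * max_delay_ms // 8000


def drain_time_ms(buffer_bytes, link_bps):
    """How long a full buffer of buffer_bytes takes to drain at link_bps."""
    return buffer_bytes * 8 / link_bps * 1000.0


if __name__ == "__main__":
    # Assumed target: at most 100 ms of queueing, at a few link speeds.
    for bps in (1_000_000, 1_200_000, 14_000_000):
        print(f"{bps} bit/s -> {buffer_bytes_for_delay(bps, 100)} bytes")
    # Going the other way: 450,000 bytes queued on a 1.2 Mbit/s link.
    print(f"{drain_time_ms(450_000, 1_200_000):.0f} ms to drain")
```

Taken at face value, a buffer that drains in 3000 ms on a 1.2 Mbit link holds about 450,000 bytes, while a 100 ms buffer on the same link would be only 15,000 bytes. The same byte count that is harmless on a fast link is seconds of delay on a slow one, which is why the size has to be thought of in time.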
Hear, hear! A voice of reason!
Another piece of the puzzle, perhaps.
Theoretically, hardware wouldn't have to be replaced--all that's needed is to patch the software to use smaller buffers. So some RAM goes unused; big whoop.
A few people do get it.
As others mentioned, dropped packets are intended--they're part of how the flow-control mechanisms work.
Dropped packets may be quite annoying for VOIP, gaming, perhaps even streaming media (though multi-second application-level buffers should compensate for streaming media), but for most protocols (HTTP web browsing, BitTorrent, etc.), dropped packets aren't really a big deal.
Mr. AC, you will have to explain how it's a non-issue, since Mr. Gettys has shown how it is indeed an issue. I can testify to the problem myself: Netalyzr shows 3000+ms buffers on my current DSL connection, and downloading a single large file via HTTP fills these buffers, resulting in multi-second latency for simple web browsing and jittery pings while downloading.
I think you give too much credit to these "hardware engineers" of yours. If "they" knew so much about queueing, "they" wouldn't have made buffers which are so large that they defeat the built-in TCP congestion-control mechanisms. "They" tend to think that more is always better.
Perhaps, but that would require more router CPU. Simply using sensible buffer sizes would allow the existing TCP flow control mechanisms to do what they were engineered to do without additional software complexity. I think that's the "real" solution: to undo that which should never have been done.
I can see why you're posting as an AC, because you don't understand the difference between an HTTP request and the TCP connection that fulfills it. There is no requesting of packets; the request is made via HTTP, and the receiver then ACKnowledges TCP packets from the server, which may send more quickly than it receives ACKs so as to increase throughput--this then fills buffers and causes cascading latency.
You are compounding the problem by spreading misinformation. Please stop and go educate yourself.
I disagree that it works pretty well overall. At my apartment in one place, I can use my AT&T DSL connection to download large files or use BitTorrent or stream Netflix and still browse the web with acceptable latency. Visiting my family in another place, also on an AT&T DSL connection, the bufferbloat is so bad (over 3000ms according to Netalyzr) that downloading a single large file via HTTP makes for 5-10 second latency in browsing other web sites. I'm not kidding. And it hasn't always been this way. Years ago, at this same location, with the same AT&T DSL service, it didn't behave so poorly. And ten years ago, living in another state but also with the same speed AT&T DSL service, such latency was never a problem, even with two people constantly using the connection. My best guess is that at some point the AT&T equipment here was changed and the buffers got much bigger.
You're a fool. You don't understand the difference between an HTTP request and the resulting TCP session that fulfills the request. The foolishness comes from refusing to admit your mistake and refusing to learn.
The HTTP request for a file or a range of a file is made once. The resulting TCP session works by the server sending packets and the receiver ACKnowledging packets--not requesting packets.
You shouldn't attempt to comment authoritatively on an issue until you actually understand how the systems work.
The use of 1 GB was merely as an illustration.
You have just demonstrated the uphill battle that is fixing this problem. You do not understand.
It's absurd for the end-user to rate limit his connection at his end--TCP is engineered to take care of that on its own. Unnecessarily large buffers defeat the very mechanism TCP uses to control congestion and data rates.
You're right about one thing: TCP links do not send data willy-nilly. But when a buffer is, e.g., 3500ms in length (buffer size in bytes / connection speed), any change in rate resulting from the receiver acknowledging packets won't happen for at least 3500ms. Packets aren't dropped until the buffer is full, so the data rate won't reduce in response to packet loss for at least 3500ms. Then it will drop by 50%, and gradually increase again until the buffers are full. Repeat ad nauseam (i.e., jitter).
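Here's a very crude sketch of that cycle in Python, in 1 ms steps: one sender that raises its rate steadily, halves it when it learns of a drop, and only learns of the drop after the full buffer plus a base round trip. Every number in it (the 50 ms base RTT, the rate of increase, the starting rate) is an assumption I picked for illustration; real TCP is far more detailed than this.

```python
def simulate(buffer_ms, link_bps=1_000_000, base_rtt_ms=50,
             increase_bps_per_ms=100, ticks=60_000):
    """Toy model of one additive-increase/halve-on-loss sender feeding a
    tail-drop FIFO. Returns the queueing delay (ms) at every 1 ms tick."""
    capacity_bits = link_bps * buffer_ms / 1000
    rate = link_bps / 2        # sender rate in bits/s (assumed starting point)
    queue = 0.0                # bits currently sitting in the buffer
    loss_known_at = None       # tick at which the sender learns of a drop
    delays = []
    for t in range(ticks):
        # Net change in the queue over one millisecond.
        queue += (rate - link_bps) / 1000
        if queue < 0:
            queue = 0.0
        if queue > capacity_bits:
            queue = capacity_bits          # overflow: packets are dropped
            if loss_known_at is None:
                # The loss signal is stuck behind the full buffer.
                loss_known_at = t + buffer_ms + base_rtt_ms
        if loss_known_at is not None and t >= loss_known_at:
            rate /= 2                      # back off by 50%
            loss_known_at = None
        else:
            rate += increase_bps_per_ms    # keep ramping up, unaware
        delays.append(queue * 1000 / link_bps)
    return delays


if __name__ == "__main__":
    for buffer_ms in (3500, 100):
        d = simulate(buffer_ms)
        print(f"{buffer_ms} ms buffer: peak queueing delay {max(d):.0f} ms")
```

In this toy model the queueing delay can never exceed the buffer length, so a 100 ms buffer caps the damage at 100 ms, while the 3500 ms buffer lets the delay climb to multiple seconds before the sender ever hears about a drop.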
The best, simplest, and cheapest way to fix this problem is to patch software in routers, etc. to reduce buffers to sane sizes and let TCP do what it's already engineered to do. QoS and rate-limiting by every user is absurd, illogical, and wasteful--it's throwing useful bandwidth away because of a problem that shouldn't exist in the first place.
Please, at least study the issue before you try to debunk it.
Haha, compression? Adding another stage, another buffer, another process, to try to reduce latency? That's like turning on another light because it's too bright in the room.
And these guys are "the network guys"? Sad. Just shows what this problem is up against.