Bufferbloat: Dark Buffers In the Internet
Expanding on earlier work from Jim Gettys of Bell Labs with a new article in the ACM Queue, CowboyRobot writes that Gettys "makes the case that the Internet is in danger of collapse due to 'bufferbloat,' 'the existence of excessively large and frequently full buffers inside the network.' Part of the blame is due to overbuffering; in an effort to protect ourselves we make things worse. But the problem runs deeper than that. Gettys' solution is AQM (active queue management) which is not deployed as widely as it should be. 'We are flying on an Internet airplane in which we are constantly swapping the wings, the engines, and the fuselage, with most of the cockpit instruments removed but only a few new instruments reinstalled. It crashed before; will it crash again?'"
the existence of excessively large and frequently full buffers
Seems better than the existence of excessively large and seldom if ever full buffers.
Cringely has been writing about this all year. He cites Jim Gettys too. See: http://www.cringely.com/tag/bufferbloat/
I paid the going retail price for a Windows screen reader and got a free Unix computer!
To configure your active queue management, the first thing I need to know is: do you have a push system, or a pull system?
Neither, sir, we have a suck system.
As this is not my area of expertise, I have no idea if this is valid or not; but something having dire consequences does not allow you to simply dismiss it as "alarmism."
Not only that, but Gettys' bufferbloat theory has been featured on Slashdot before. Maybe the dupe queue was full?
Do you even lift?
These aren't the 'roids you're looking for.
Maybe posting a new article on an issue that was also an issue a year ago is not a "dupe", but an acceptable and possibly even normal thing for a news site to do?
Except it is alarmist. The current situation isn't optimal, but not being optimal and having a critical issue are two different things. The crux of the problem is basically "Long delays from bufferbloat are frequently attributed incorrectly to network congestion, and this misinterpretation of the problem leads to the wrong solutions being proposed." That means administrators *might* mistake large-buffer slowdowns for other causes of network congestion. Ideally, it should definitely be dealt with better, but it's hardly a collapse of the network.
A network buffer acts as just that: a buffer to smooth out traffic spikes. It does this at the cost of latency. If a buffer is large AND consistently full, the link is always fully utilized, so the large buffer isn't absorbing spikes at all; it just adds a large queueing delay on top of waiting for the link to clear, for no benefit (the extra latency *may* confuse administrators, which is basically the "danger"). On the other hand, if the link is underutilized most of the time, then a large buffer is beneficial for dealing with traffic spikes. The majority of networks are the latter and hence are designed as such. Two solutions: get faster links or deal with it more intelligently.
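To put rough numbers on the latency cost described above: the time a packet waits behind a full FIFO buffer is just the buffer size divided by the link rate. A minimal sketch, assuming a hypothetical 256 KiB buffer and two made-up link speeds (none of these figures come from the thread):

```python
def full_buffer_delay_ms(buffer_bytes: int, link_bits_per_s: float) -> float:
    """Time for the last byte of a completely full FIFO buffer to drain, in ms."""
    return buffer_bytes * 8 / link_bits_per_s * 1000.0


# Hypothetical example values: the same 256 KiB buffer on a slow and a fast link.
BUFFER_BYTES = 256 * 1024
for label, rate in [("1 Mbit/s uplink", 1e6), ("1 Gbit/s link", 1e9)]:
    print(f"{label}: {full_buffer_delay_ms(BUFFER_BYTES, rate):.1f} ms")
```

The same buffer that costs a couple of milliseconds on a fast link costs a couple of seconds on a slow one, which is why a "large" buffer only makes sense relative to the rate of the link it sits in front of.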
...he said, over the internet.
A replacement for PATA or PCI has to interoperate only with other components in the same chassis, or possibly on the same desk in the case of eSATA and Thunderbolt. A replacement for TCP would have to interoperate with every other computer in the world. Imagine what a flag day that would be.
Feel free to try it out yourself...I have and the problem is real.
And just maybe some of us are interested in how research has progressed since the last article...
Democracy is a sheep and two wolves deciding what to have for lunch. Freedom is a well armed sheep contesting the issue
Each hop has its own buffer. Endpoints can fix their own buffers, but they can't do anything about buffering in the next hop. If something changes in the network to reduce the available bandwidth, the ideal behaviour is for packets to start getting dropped right away so that the originator gets notified of the drop and can slow itself down to compensate.
If some device in the core network just buffers up seconds' worth of packets instead of dropping them, it destroys the ability of the sender to adapt to the changing conditions.
I'm also on the internet, but he's right. This pretty much sums up how I feel about the internet.
But that's the point: the buffers smooth the link, but not the streams going across it. With enough bufferbloat, the buffers actually cause the same data to be retransmitted multiple times due to the design of TCP congestion avoidance.
a handful of selfish greedy people are no match for millions of selfish, greedy people -u4ya
As soon as I start trying to shove (or suck) more bits through the pipe than it can handle, round trip latency to "nearby" points of the Internet increases from ~25 ms to ~1 second. When I need to transfer a lot of data, I use rsync or wget if at all possible, and throttle the transfer to just below the rate the connection can handle; this results in ping times staying sane while only slowing down the transfer slightly. We shouldn't need to resort to doing stuff like this to make the network function properly!
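The throttling workaround above (rsync has `--bwlimit` and wget has `--limit-rate` for this) amounts to pacing the sender so the bottleneck queue never builds. A minimal token-bucket sketch of the idea, assuming hypothetical names (`TokenBucket`, `throttled_send`) and a caller-supplied `send` function; it is not how either tool is implemented:

```python
import time


class TokenBucket:
    """Pace a sender to roughly rate_bytes_per_s, allowing a small initial burst."""

    def __init__(self, rate_bytes_per_s, burst_bytes, clock=time.monotonic):
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)
        self.clock = clock
        self.last = clock()

    def delay_for(self, nbytes):
        """Charge nbytes against the bucket; return seconds to sleep before sending."""
        now = self.clock()
        # Refill credit for the time that has passed, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # In debt: wait long enough for the refill to pay it off.
        return -self.tokens / self.rate


def throttled_send(chunks, send, bucket, sleep=time.sleep):
    """Send each chunk via send(), sleeping as needed to stay under the bucket rate."""
    for chunk in chunks:
        wait = bucket.delay_for(len(chunk))
        if wait > 0:
            sleep(wait)
        send(chunk)
```

Setting the rate slightly below what the link can actually carry keeps the queue in the modem or router close to empty, which is why ping times stay sane at the cost of a little throughput.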
This analogy is like a bathtub, full of spiders, and on fire. It sounds dangerous, but it's self limiting.
“Common sense is not so common.” — Voltaire
fine then, ignore the warnings, but don't come crying to me when you can't download pornography one day.
The bad Bufferbloat setup is on the left (yellow dots), and the 'good' setup (i.e. how things used to be configured about 10-20 years ago when RAM was more expensive!) is on the right (cyan/blue dots).
Both sides start off okay, but notice how the left side 'queues' (tall yellow dot columns) keep on growing over time, while the right side blue columns stop short because of the small buffer size. As they stop short, some data 'packets' must be dropped, and this gets reported back to the upload site that it's shoving data to the user too fast. As a result, the upload site temporarily slows the sending of data, and thus the system self-corrects.
Meanwhile, on the left side, these packets of data never get dropped, so the giant bloated yellow buffers get filled more and more, but the computer at the upload site doesn't realise the carnage of these giant queues further down the line, and instead thinks "All is okay, let's keep sending data fast!".
Finally, when a smaller piece of data needs to be sent to the user (see 2:30+ signified by red dots on the left and dark blue dots on the right), the left side shows the red dots (which could be say, a small email) wading through giant queues to reach their destination, really slowly. Furthermore these tiny bits of data often need special 'emergency' treatment as they hold up other larger data associated with it. On the good right side, the dark blue dots have no such giant queues.
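For anyone who can't watch the video, the same behaviour can be reproduced with a toy model. This is only a sketch under loud simplifying assumptions: one sender, one tail-drop FIFO, time in abstract "ticks", loss feedback that reaches the sender instantly, and a crude additive-increase/multiplicative-decrease rule standing in for real TCP. The function name and all numbers are made up for illustration.

```python
def simulate(buffer_pkts, link_rate=10, ticks=200):
    """Toy AIMD sender feeding one tail-drop FIFO.

    Returns (peak_queue, worst_delay_ticks, first_drop_tick).
    """
    queue = 0            # packets waiting in the bottleneck buffer
    send_rate = 1        # packets the sender offers per tick
    peak_queue = 0
    first_drop_tick = None
    for tick in range(ticks):
        space = buffer_pkts - queue
        dropped = max(0, send_rate - space)      # tail drop once the buffer is full
        queue += send_rate - dropped
        peak_queue = max(peak_queue, queue)      # depth a newly arriving packet sees
        queue -= min(queue, link_rate)           # the link drains at a fixed rate
        if dropped:
            if first_drop_tick is None:
                first_drop_tick = tick
            send_rate = max(1, send_rate // 2)   # multiplicative decrease on loss
        else:
            send_rate += 1                       # additive increase while no loss is seen
    worst_delay_ticks = peak_queue / link_rate
    return peak_queue, worst_delay_ticks, first_drop_tick


for name, size in [("small buffer", 20), ("bloated buffer", 2000)]:
    peak, delay, first_drop = simulate(size)
    print(f"{name}: peak queue {peak} pkts, worst delay {delay:.0f} ticks, "
          f"first drop at tick {first_drop}")
```

With the small buffer the sender hears about congestion early and the queue (and so the wait for a small packet arriving behind it) stays short; with the bloated buffer the first drop comes much later and the standing queue is far deeper.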
Why OpalCalc is the best Windows calc
So none are an appropriate analogy.
HTH
If you look at buffers allocated to fast multi-gigabit interfaces at the core of the network they are simply not large enough compared to forwarding rates involved to be able to induce the kinds of delays needed to cause Internet wide problems.
You can argue they may not be ideal for real time voice, game or video communication when these links are oversubscribed but no doomsday is possible.
Today buffer bloat effects are mostly observed at the edge even though they need not always be.
Failure of a congestion control algorithm to control link saturation does not translate into congestive collapse of the larger network. It just results in *your* network connection turning to shit. When Netalyzr runs, it intentionally saturates your link at that time. In the real world only a few portions of the edge are ever saturated to the extent that congestion control failure becomes an issue leading to more packets through core routers. The number of edge machines in this category would need to be significant to cause a rerun of previous issues.
That condition cannot be met because of self-limiting feedback. If everyone maxed out their pipes at once, the core would saturate, which would itself limit edge saturation: edge bandwidth is grossly over-provisioned relative to core bandwidth, and that would ensure congestion control algorithms function properly.
I'm not arguing there is not a problem or more can't be done. I'm just arguing the doomsday congestive collapse scenario is bullshit.
That's a pretty simplified way of putting it, but basically correct. Major equipment vendors have been slow to adopt more advanced queuing strategies (Stochastic Fair Queuing integrated with some of the more advanced flavors of early discard.)
Right. The problem is not big buffers, per se. It's big dumb FIFO queues. There's nothing wrong with one big flow, like a file transfer, having a long latency, provided that other flows with less data in flight aren't stuck behind it. That's what "fair queuing" is all about. Each flow has its own queue, and the queues are serviced in a round-robin fashion. (With stochastic fair queuing, some hashing is done to eliminate some of the bookkeeping on flows, but the effect is roughly the same.)
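A small sketch of the difference being described, assuming a backlog that is already sitting in the buffer and simple packet-by-packet round robin. Real fair queuing also has to account for packet sizes, and the stochastic variant hashes flows into a fixed number of queues; both are left out here. Function names and flow labels are made up for illustration.

```python
from collections import deque


def fifo_order(arrivals):
    """One big dumb queue: packets leave in exactly the order they arrived."""
    return list(arrivals)


def round_robin_order(arrivals):
    """Per-flow queues served one packet at a time in round-robin order.

    arrivals is a list of (flow_id, packet_label) tuples in arrival order.
    """
    queues = {}                                   # dicts keep insertion order
    for flow, pkt in arrivals:
        queues.setdefault(flow, deque()).append(pkt)
    out = []
    while queues:
        for flow in list(queues):                 # one pass = one round
            out.append((flow, queues[flow].popleft()))
            if not queues[flow]:
                del queues[flow]                  # flow has drained
    return out


# A bulk transfer has five packets queued when a single ping packet arrives.
backlog = [("bulk", i) for i in range(5)] + [("ping", 0)]
print("FIFO position of ping:       ", fifo_order(backlog).index(("ping", 0)))
print("round-robin position of ping:", round_robin_order(backlog).index(("ping", 0)))
```

In the FIFO case the ping waits behind the whole bulk backlog; with per-flow queues it goes out in the first round, no matter how much the bulk flow has queued.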
I figured this out in the early 1980s (see RFC 970) and by the late 1990s, it was an established technology. We shouldn't be having this problem at this late date.
I wonder how much of the trouble comes from devices that are doing TCP-level processing in the middle of the network. Stateful firewalls and ISP ad-insertion engines can introduce substantial latency.
If you want to test for bad behavior, try running two flows, one that never has more than one packet outstanding, and one that just does a big file-transfer like operation like a download. If the latency of the low-traffic flow goes up to the same as that of the bulk flow, there's a big dumb buffer in the middle. If the packet loss rate of the low-traffic flow goes up, there's a small dumb buffer in the middle.
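The decision rule in that test can be written down as a tiny classifier. This is a sketch only: it compares the low-traffic flow against its own idle baseline rather than against the bulk flow's latency (a simplification of what is described above), and the function name, thresholds, and example numbers are all hypothetical.

```python
def diagnose(probe_rtt_idle_ms, probe_rtt_loaded_ms,
             probe_loss_idle, probe_loss_loaded,
             rtt_factor=3.0, loss_delta=0.02):
    """Classify the bottleneck from how a one-packet-at-a-time probe flow behaves.

    rtt_factor and loss_delta are arbitrary illustrative thresholds.
    """
    latency_up = probe_rtt_loaded_ms > rtt_factor * probe_rtt_idle_ms
    loss_up = (probe_loss_loaded - probe_loss_idle) > loss_delta
    if latency_up:
        return "big dumb buffer"      # probe is stuck behind the bulk flow's queue
    if loss_up:
        return "small dumb buffer"    # probe is being tail-dropped along with the bulk flow
    return "no shared dumb FIFO evident"


# Hypothetical measurements: probe RTT idle vs. while a bulk transfer runs.
print(diagnose(30.0, 900.0, 0.0, 0.0))
print(diagnose(30.0, 35.0, 0.0, 0.10))
print(diagnose(30.0, 33.0, 0.0, 0.0))
```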
the pedophile's favorite invention of all time
Surely you confuse internet with organized religion & enforced celibacy.
Bloated or big buffers cause more latency than necessary only if the buffer is designed as a single queue for all flows. If each flow gets its own queue in the buffer and all queues are read and sent out round-robin, the ping packet would not have to wait until an earlier big file transfer, which has completely filled the buffer, is through. The ping packet would effectively overtake the large amount of queued bytes from the big file transfer instead of waiting behind it in a single queue.
This government has literally spent billions through DHS to feign internet safety. I may be simple, but that same overreaching government could give $500m to 4 private non-profits to actively deal with this and other "infrastructure" issues like IPv6, more rural broadband, urban wireless, and other issues.
It has been done in the past through Red Cross for disaster relief, in floods, earthquakes, epidemics and the like. Society impact issues.
NGO's have far more efficient resource allocation than the government and they also receive assistance from industry and citizens on a tax favorable basis.
JJ
IMHO. If ISP's would build out their networks instead of relying on buffers there would not be an argument here. My attention would be on fixing dumb wireless devices and drivers that ignore every attempt of making them play nice.
Having to work for a living is the root of all evil.
we need more than one FAT cable to connect two points. the FAT ONE cable is a latency bottle-neck. ... latency problems go away. but it's $$MORE EXPENSIVE$$.
if you connect two points with many smaller cables
-
many on-ramps to the one-lane highway (which goes very very fast). you have to wait on the one ramp until it's your time to go on the very-fast highway.
if you had many more "smaller" fast highways, you wouldn't have to wait on the ramp so long. but it's $MORE EXPENSIVE$.
This is a classic problem of economics. Publicly owned resources that are not owned by any one individual or company are very difficult for market factors to work on. A good example is fishing. The fish in a bay are not owned by any particular person, so their welfare is not in the economic interest of any particular person. It may be in a commercial fisherman's long-term interest to conserve the fish population and not overfish, but he's not the only fisherman. If he cuts back on his catch, other fishermen can simply catch the fish he left behind, and the fishery is depleted just as if he'd exploited it; the only difference is the cut in profits he took. The other fishermen are thinking the same thing. They may all collectively want to conserve the fish, but it's impossible for them to trust each other and agree to cut back on fishing. Sadly, if the fishery were owned by a single person, even a terrible, fish-hating monster, he would never allow the damage being done to the population that occurs when it's a public resource. A healthy amount of fish is his income and retirement; it's worth millions to him. But we cannot allow such a thing to be "owned" and so we're stuck.
The same applies here. The "internet" is not owned by any particular person; as a result you have dozens of ISPs all fighting to provide the same service at the expense of all of the others. They're cutting their own throats to stay in business and have no way of trusting one another. Government regulation is woefully inadequate and will likely never catch up to technology. At the same time, the thought of a government-owned system for transmitting information/data sounds horrifying given the recent actions by our elected officials.
This is unfortunately a situation that is very much like the classical Gordian Knot, and sadly I think the problem will likely be "solved" by a tyrant just like the original. The constitutional and privacy problems the solution causes will probably dwarf the congestion problems we started with.
Disclaimer I'm the author. I covered this in my June 2011 column: http://www.linuxpromagazine.com/Issues/2011/127/Security-Lessons-Bufferbloat/%28kategorie%29/0 direct link to the PDF http://www.linux-magazine.com/w3/issue/127/058-059_kurt.pdf. In a nutshell: my link latency at home is usually ~50ms to seifried.org, but with one single outbound file transfer to saturate my uplink ping times go to over 1000ms (1 second) reliably (which completely breaks VOIP/games/etc.).
What can I do with my own laptop and wifi router to make my own situation better?
Thought it said buffalo bloat for a second. Really there are that many fat buffalos on the internet.
The proposed solution of active queue management is exactly the sort of discrimination the net neutrality folks want to forbid, no?
The problem is *we don't know* what will happen.
The network is operating in a regime it wasn't intended to... And we lack the instrumentation we need to understand how the internet is operating.
The buffers are now so large that we've defeated the basic congestion avoidance algorithms (slow start and congestion avoidance). Technically, TCP congestion avoidance is operating, but the time gets so long that TCP thinks the path changed, and probes more aggressively for a new operating point, as explained in the article. We are flying in a piece of the flight envelope that has not been tested. That was a big surprise, when we dug into the traces.
That should make us nervous.
We've also had one credible report of a significant network that collapsed and was very difficult to restart. We don't know exactly what all happened (and hopefully never will; we just are frustrated we don't have packet traces).
For that http://tools.ietf.org/html/rfc970 : Not every /.'er has things like that to their credit/name as you do, hence my subject-line.
(For my part here, it sounds like those buffers need a form of tunable/parameterizable aging system to do more than a FIFO queue/buffer is doing currently).
* Lastly: I recall seeing you mentioned on this before here in the past, & iirc, I said the same thing pretty much in response!
APK
P.S.=> Anyhow/anyways: I try to give credit where it's due, & thus, I have to give folks like yourself some respect, in that your "type" actually gets out there & does things that help "improve the human condition" (yes, even in telecommunications)...
... apk
So, let's see if YOU have done more, earlier, & better than I have, ok? Here goes:
----
Windows NT Magazine (now Windows IT Pro) April 1997 "BACK OFFICE PERFORMANCE" issue, page 61
(&, for work done for EEC Systems/SuperSpeed.com on PAID CONTRACT (writing portions of their SuperCache program increasing its performance by up to 40% via my work) albeit, for their SuperDisk & HOW TO APPLY IT, for a finalist position @ MS Tech Ed, two years in a row 2000-2002, in its HARDEST CATEGORY: SQLServer Performance Enhancement).
WINDOWS MAGAZINE, 1997, "Top Freeware & Shareware of the Year" issue page 210, #1/first entry in fact (my work is there)
PC-WELT FEB 1998 - page 84, again, my work is featured there
WINDOWS MAGAZINE, WINTER 1998 - page 92, insert section, MUST HAVE WARES, my work is again, there
PC-WELT FEB 1999 - page 83, again, my work is featured there
CHIP Magazine 7/99 - page 100, my work is there
GERMAN PC BOOK, Data Becker publisher "PC Aufrusten und Repairen" 2000, where my work is contained in it
HOT SHAREWARE Numero 46 issue, pg. 54 (PC ware mag from Spain), 2001 my work is there, first one featured, yet again!
Also, a British PC Mag in 2002 for many utilities I wrote, saw it @ BORDERS BOOKS but didn't buy it... by that point, I had moved onto other areas in this field besides coding only...
Being paid for an article that made me money over @ PCPitstop in 2008 for writing up a guide that has people showing NO VIRUSES/SPYWARES & other screwups, via following its point, such as THRONKA sees here -> http://www.xtremepccentral.com/forums/showthread.php?s=ee926d913b81bf6d63c3c7372fd2a24c&t=28430&page=3
It's also been myself helping out the folks at the UltraDefrag64 project (a 64-bit defragger for Windows), in showing them code for how to do Process Priority Control @ the GUI usermode/ring 3/rpl 3 level in their program (good one too), & being credited for it by their lead dev & his team... see here -> http://ultradefrag.sourceforge.net/handbook/Credits.html or here http://sourceforge.net/tracker/?func=detail&aid=2993462&group_id=199532&atid=969873
AND lastly: http://g-off.net/software/a-python-repeatable-threadingtimer-class where I got other programmer's work WORKING RIGHT (in PyThon no less, which I just started learning only 2 week ago no less) by showing them how to use a "Dummy Proxy Function" as I call it, to make a RepeatTimer class (Thread sub-class really) to take PARAMETERIZED FUNCTIONS, ala:
def apkthreadlaunch():
    getnortonsafeweb(sAPKFileName = "APK_1_NortonSafeWeb360Extracted.txt".rstrip())
a = RepeatTimer(900, apkthreadlaunch) # 900 is 15 minutes... apk
Where it was NOT working for many folks there, before (submitted to the maker of the RepeatTimer class no less, & yes, it WORKS!)
----
What do I have to say about that much above? I can't say it any better, than this was stated already (from the greatest book of all time, the "tech manual for life" imo):
"But by the grace of God I am what I am: and his grace which was bestowed upon me was not in vain; but I labored more abundantly than they all: yet not I, but the grace of God which was with me." - 1 Corinthians Chapter 15, Verse 10
(And, because I got LUCKY to have been exposed to some really GREAT classmates, professors, & colleagues on the job over time as well)