Slashdot Mirror


Blackout Shows Net's Fragility

It doesn't come easy wrote to mention a ZDNet article discussing a recent outage between Level 3 Communications and Cogent Communications. A business feud inadvertently highlighted the fragility of the Internet's skeleton. From the article: "In theory, this kind of blackout is precisely the kind of problem the Internet was designed to withstand. The complicated, interlocking nature of networks means that data traffic is supposed to be able to find an alternate route to its destination, even if a critical link is broken. In practice, obscure contract disputes between the big network companies can make all these redundancies moot. At issue is a type of network connection called 'peering.' Most of the biggest network companies, such as AT&T, Sprint and MCI, as well as companies including Cogent and Level 3, strike 'peering agreements' in which they agree to establish direct connections between their networks."

6 of 287 comments

  1. The small should pay for the big? by hkmwbz · · Score: 5, Interesting
    As I understand it, these were about the same size and had an agreement, or didn't bother to bill each other. Then suddenly one of them figured out that "hey, we are bigger, so they should pay us!"... And the smaller one cut off the connection because they didn't want to pay since they considered themselves to be as big as their rival.

    What I don't get is why one of them would suddenly want the other to pay up. What's changed now, and why does the smaller company have to pay the big one's bills?

    Am I missing something here?

    --
    Clever signature text goes here.
  2. Efficiency can be the enemy of robustness by dpilot · · Score: 5, Interesting

    This statement popped up in some of my security readings. It's most "efficient" to have one path between two places, and it's most "efficient" to set up peering agreements to route packets. But these efficient measures can introduce single points of failure.

    On a similar note, that's why there are 13 root DNS servers, and why most of us aren't supposed to use them. The DNS example, though, is one where efficiency and robustness agree. It's more efficient, at least in terms of net bandwidth, to use a DNS server closer than the root servers.

    --
    The living have better things to do than to continue hating the dead.
  3. Call the helpdesk...wait, THEY don't even know! by digitaldc · · Score: 4, Interesting

    http://www.gamergod.com/article_display.cfm?article_id=329
    Good article on this situation here

    This situation has adversely affected various users of both companies' services. The inability of Level 3 to handle this situation in a fair and equitable manner to the consumers has alienated many customers and will continue to do so until the current situation is remedied. At what point is it good customer service to discontinue services due to no fault of said consumer base? Market history shows us that the single worst thing a company can do is to arbitrarily allow influences beyond the control of consumers to negatively impact services, determined by consumers to be status quo, without any warning or notification. If left unresolved and unaddressed, the current situation could set dangerous precedents for internet users across the country by allowing service providers to instantly discontinue provided services at the moment they feel that the services they provide are not being adequately compensated for by outside companies.

    On a side note, I was listening to Howard Stern (oh no!) this morning and he said that his Time Warner internet connection at home didn't work. Howard then called a tech guy to come and fix the problem, only for him to call a help desk to figure out what happened. The help desk didn't even know what was wrong. It sounds like Level 3 just pulled the plug and didn't notify ANYONE. Or maybe it was Cogent; the point is that nobody outside of that dispute KNEW what was going on.
    This sounds like a good way to alienate your customers and/or ruin your business model. But that is just my opinion.

    --
    He who knows best knows how little he knows. - Thomas Jefferson
  4. Not a redundancy issue... by boldtbanan · · Score: 4, Interesting

    As I understood the problem, redundancy wasn't an issue. Level 3 was actively filtering out requests to Cogent, however they came in. The redundancy was working, but Level 3 was playing NetNanny and blacklisting all Cogent IPs.
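
    To make that distinction concrete, here is a minimal sketch of the claim as stated above, not a model of what Level 3 actually did. The topology, the network names, and the `reachable` helper are all made up for illustration. It shows that losing a single link can leave a destination reachable, while a network that refuses to forward toward that destination cuts its customers off no matter how many links exist.

```python
from collections import deque

# Hypothetical topology; every name and link here is illustrative only.
LINKS = {
    "customer_a": ["level3"],
    "level3": ["customer_a", "cogent", "other_tier1"],
    "other_tier1": ["level3", "cogent"],
    "cogent": ["level3", "other_tier1", "customer_b"],
    "customer_b": ["cogent"],
}


def reachable(src, dst, links, refuses=None):
    """Breadth-first search from src to dst.

    `refuses` maps a network to the set of destinations it will not
    forward traffic toward (a policy filter, not a broken link).
    """
    refuses = refuses or {}
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        # A filtering network drops the traffic whatever links it has.
        if dst in refuses.get(node, set()):
            continue
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Same topology with the direct level3 <-> cogent link removed.
CUT = {net: [n for n in peers if {net, n} != {"level3", "cogent"}]
       for net, peers in LINKS.items()}

print(reachable("customer_a", "customer_b", LINKS))
print(reachable("customer_a", "customer_b", CUT))
print(reachable("customer_a", "customer_b", LINKS,
                refuses={"level3": {"customer_b"}}))
```

    In this toy graph the first two checks succeed (the second by detouring through `other_tier1`), and only the policy filter makes the destination unreachable.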

  5. This was predictable by PhilipPeake · · Score: 4, Interesting
    The Internet was designed to be resilient to malfunctions and automatically take appropriate action to ensure connectivity.

    Unfortunately, that is not the Internet that we have today. In the original Internet, every router knew about every network connected to the Internet. Most networks had connectivity to many other networks. Discovery protocols allowed alternative routes to be discovered if one failed.

    Today, we don't have a (mostly) fully connected net, we have ISPs who don't know anything about networks which they don't "own", only that certain IP prefixes need to be passed to ISP x, y or z.

    This makes the infrastructure much more fragile than it was originally intended to be. We ended up with this for a few reasons. First, the wimpy routers in use at the time had limited memory available to hold the network maps. The answer chosen was to no longer attempt to hold a full world view, but to divide the world into regions: certain IP prefixes would "belong" to those regions, and all any router would need to know about was the networks in its own region, plus how to route traffic to other regions, which would take care of routing within themselves. This led to "backbone" connections: high-capacity links needed because traffic between regions no longer "diffused" through the network, but was channeled into specific connections. It also set the scene for the net to be commercialised: those regional centers were obvious "choke points" that an enterprising company could own, letting it pretty much dictate pricing to the lower-level enterprises who would do the dirty work of dealing with end users.
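
    As a rough illustration of routing by prefix rather than by full network map, here is a minimal sketch. It assumes longest-prefix matching, which is one common way to choose among overlapping prefixes; the comment above does not name a mechanism. The prefixes are documentation address ranges, and the next-hop names and the `next_hop` helper are hypothetical.

```python
import ipaddress

# Hypothetical routing table: the router knows prefixes and where to hand
# traffic off, not the topology behind each next hop.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream_isp_x"),       # default route
    (ipaddress.ip_network("198.51.100.0/24"), "isp_y"),          # another region
    (ipaddress.ip_network("198.51.100.128/25"), "local_region"), # our own region
]


def next_hop(address, routes=ROUTES):
    """Return the next hop for the most specific prefix containing address."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None
    # Longest prefix (largest prefixlen) wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


for ip in ("198.51.100.200", "198.51.100.5", "203.0.113.9"):
    print(ip, "->", next_hop(ip))
```

    Anything not covered by a specific prefix falls through to the default route, so if that one upstream stops carrying the traffic, this router has no other view of the network to fall back on.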

    Slowly but surely the Internet evolved into a system dependent upon a few companies with high-speed links between them (prime candidates, BTW, as locations for government control to be imposed). The self-healing nature of the original Internet was lost because all traffic HAS to pass via the top-level companies' infrastructure and over their interconnect backbone connections.

    The "self healing" Internet is long gone.

  6. Monitor it yourself by dereference · · Score: 4, Interesting
    I found this site while trying to research the problem. I wish I had known of it earlier; it provides a very nice (near) real-time snapshot of all the Tier 1 peering:

    http://www.internetpulse.net/

    I'm not affiliated with them in any way, and I'm sure there are other similar sites, but I thought it was worth mentioning.