Slashdot Mirror


Blackout Shows Net's Fragility

It doesn't come easy wrote to mention a ZDNet article discussing a recent outage between Level 3 Communications and Cogent Communications. A business feud inadvertently highlighted the fragility of the Internet's skeleton. From the article: "In theory, this kind of blackout is precisely the kind of problem the Internet was designed to withstand. The complicated, interlocking nature of networks means that data traffic is supposed to be able to find an alternate route to its destination, even if a critical link is broken. In practice, obscure contract disputes between the big network companies can make all these redundancies moot. At issue is a type of network connection called 'peering.' Most of the biggest network companies, such as AT&T, Sprint and MCI, as well as companies including Cogent and Level 3, strike 'peering agreements' in which they agree to establish direct connections between their networks."

68 of 287 comments

  1. The small should pay for the big? by hkmwbz · · Score: 5, Interesting
    As I understand it, these were about the same size and had an agreement, or didn't bother to bill each other. Then suddenly one of them figured out that "hey, we are bigger, so they should pay us!"... And the smaller one cut off the connection because they didn't want to pay since they considered themselves to be as big as their rival.

    What I don't get is why one of them would suddenly want the other to pay up. What's changed now, and why does the smaller company have to pay the big one's bills?

    Am I missing something here?

    --
    Clever signature text goes here.
    1. Re:The small should pay for the big? by Daniel+Boisvert · · Score: 2, Funny

      Am I missing something here?

      Yes. :)

    2. Re:The small should pay for the big? by hkmwbz · · Score: 3, Funny

      Ah, thanks! That certainly made it clearer ;)

      --
      Clever signature text goes here.
    3. Re:The small should pay for the big? by Daniel+Boisvert · · Score: 5, Informative

      NANOG has been on fire with posts about this issue over the past few days. The following two from Leo Bicknell do a good job of explaining why this sort of thing would happen, why nobody in particular is The Bad Guy[tm], and why this issue has no relevance to the issue of internet resilience in the case of natural or manmade disaster:

      http://www.merit.edu/mail.archives/nanog/msg12302.html
      http://www.merit.edu/mail.archives/nanog/msg12350.html

    4. Re:The small should pay for the big? by Cally · · Score: 4, Informative

      Check the NANOG archive over the last few days for far, far more than you ever wanted to know about "The Art of Peering: The Peering Playbook"... or read the book yourself.

      --
      "None are more hopelessly enslaved than those who falsely believe they are free." -- Goethe
    5. Re:The small should pay for the big? by 99BottlesOfBeerInMyF · · Score: 2, Informative

      Am I missing something here?

      I only read about this very briefly, but my understanding is it went beyond that. Just cutting the peering connection is fine and proper and packets then are rerouted through other peers, possibly costing more money, possibly not. Then the internet goes on as before and everyone is happy and the peers involved can negotiate a new link if they want and figure it will save them money by avoiding other routes where they have to pay for traffic.

      My understanding is that in this case they not only cut the link, but they advertised routes to their other peers for traffic from the first peer, which they then maliciously, and probably in breach of those other contracts, filtered out, resulting in failed traffic routing. Basically they intentionally lied (to the routers) and said sure we'll route those and then did not.

      I don't think this highlights the fragility of the internet, so much as the fact that end users usually rely upon a single peer (ISP) and if they can't trust them to not intentionally break traffic they had better find a new ISP.

    6. Re:The small should pay for the big? by Feyr · · Score: 2, Informative

      that's wrong, no one is filtering them. not anymore than they normally would to maintain their network.

      what we are seeing here is a pissing contest between two "tier1" networks, so there literally is no other route the packets can take to reach each other's network (contractually speaking, not technically). each of these networks has peering contracts with other companies, not transit. a peer is only used to reach the other's network; a transit lets you reach networks beyond the network you are transiting through.
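
      The peer/transit distinction can be sketched with a toy "valley-free" reachability check. This is only an illustrative model, not how any real router or contract works: the network names A, B, C, the customers a1, b1, c1 and the function reachable are all hypothetical. The assumed rule is that a path may climb through paid transit providers, cross at most one peering link, and after that only descend to customers.

```python
from collections import deque

def reachable(src, dst, providers, peers):
    """Toy valley-free check: can src reach dst under peer/transit rules?

    providers: dict mapping a network to the set of networks it buys transit from.
    peers: set of frozenset pairs that exchange only their own customers' traffic.
    """
    # Invert the provider map so we can also walk provider -> customer.
    customers = {}
    for cust, provs in providers.items():
        for prov in provs:
            customers.setdefault(prov, set()).add(cust)

    start = (src, "up")  # "up": still allowed to climb or cross one peering link
    seen = {start}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        if node == dst:
            return True
        steps = []
        if phase == "up":
            # Paid transit: keep climbing toward a provider.
            steps += [(prov, "up") for prov in providers.get(node, ())]
            # One peering hop; after it, traffic may only go down to customers.
            steps += [(other, "down") for pair in peers if node in pair
                      for other in pair if other != node]
        # Going down to a customer is always allowed.
        steps += [(cust, "down") for cust in customers.get(node, ())]
        for step in steps:
            if step not in seen:
                seen.add(step)
                queue.append(step)
    return False

# Hypothetical topology: A and B are the two "tier 1" networks, C peers with both.
providers = {"a1": {"A"}, "b1": {"B"}, "c1": {"C"}}
all_peers = {frozenset("AB"), frozenset("AC"), frozenset("BC")}
after_cut = all_peers - {frozenset("AB")}

print(reachable("a1", "b1", providers, all_peers))  # direct A-B peering in place
print(reachable("a1", "b1", providers, after_cut))  # C peers with both but is not transit
```

      In this model, cutting the A-B peering link leaves a1 unable to reach b1 even though C is physically connected to both, because C only peers and carries nobody else's traffic. If A bought transit from C (adding "A": {"C"} to providers), the path would exist again.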

    7. Re:The small should pay for the big? by NanoGator · · Score: 3, Funny

      "NANOG has been on fire with posts about this issue over the past few days."

      What?! No! I haven't said a word about it!

      --
      "Derp de derp."
    8. Re:The small should pay for the big? by tchuladdiass · · Score: 2, Interesting

      The thing I don't understand is, say you have two network providers, A and B. If A's customers are sending more data to B's customers, then should A pay B for the route, or should B pay A because it is their customers that are requesting the packets?

    9. Re:The small should pay for the big? by fafaforza · · Score: 2, Interesting

      Level3's stock price is tanking, they are fighting for survival and the jobs of all their employees, from engineers to secretaries, while Cogent is undercutting the price of bandwidth by a factor of 3 while taking advantage of their peering with not many being able to compete on price, and you call Level3 greedy? C'mon.

    10. Re:The small should pay for the big? by drakaan · · Score: 2, Informative
      That doesn't make any sense. It's not as if there's no other route at all between the two networks. Routing protocols and ICMP unreachables exist to allow traffic to route around trouble like this. Unless the link was deliberately broken and packets unceremoniously dropped, the source for a given connection attempt would see its packet routed in what appeared to be an excessive manner, but it'd still get from point A to point B.

      If Cogent users can get to Qwest and L3 users can get to Qwest, but Cogent users can't talk to L3 users, then Cogent and L3 are doing something intentionally bad and screwing everyone on the internet.

      --
      "Murphy was an optimist" - O'Toole's commentary on Murphy's Law
  2. No worries by WormholeFiend · · Score: 5, Funny

    The pr0n industry was designed to find alternative routes of delivery in case of Internet outages.

    1. Re:No worries by JPriest · · Score: 3, Funny

      And slashdot runs redundant stories on the same thing in case the first one is lost on the way.

      --
      Saying Java is nice because it works on all OS's is like saying that anal sex is nice because it works on all genders.
  3. Background info by NicolaiBSD · · Score: 3, Funny

    Hey, I've found some interesting background info on this novel story here.

    1. Re:Background info by KDan · · Score: 2, Funny

      I think this source is blocked by the filters the editors use when selecting stories... It's pretty untrustworthy and contains lots of potentially inappropriate material...

      Daniel

      --
      Carpe Diem
  4. Efficiency can be the enemy of robustness by dpilot · · Score: 5, Interesting

    This statement popped up in some of my security readings. It's most "efficient" to have one path between two places, and it's most "efficient" to set up peering agreements to route packets. But these efficient measures can introduce single points of failure.

    On a similar note, that's why there are 13 root DNS servers, and why most of us aren't supposed to use them. The DNS example though, is one where efficiency and robustness agree. It's more efficient, at least in terms of net bandwidth, to use a DNS server closer than the root servers.

    --
    The living have better things to do than to continue hating the dead.
    1. Re:Efficiency can be the enemy of robustness by PhilHibbs · · Score: 2, Insightful

      Sure, they have the connections, but routing extra traffic through those peering links will probably only cascade this problem. The intermediate providers will see a jump in traffic coming through from L3 and Cogent, and they will have to consider how to recoup the costs that that is imposing on them.

      It's a web, and when one strand breaks, it increases the strain on the other strands.

  5. Call the helpdesk...wait, THEY don't even know! by digitaldc · · Score: 4, Interesting

    http://www.gamergod.com/article_display.cfm?article_id=329
    Good article on this situation here

    This situation has adversely affected various users of both companies' services. The inability of Level 3 to handle this situation in a fair and equitable manner to the consumers has alienated many customers and will continue to do so until the current situation is remedied. At what point is it good customer service to discontinue services due to no fault of said consumer base? Market history shows us that the single worst thing a company can do is to arbitrarily allow influences beyond the control of consumers to negatively impact services, determined by consumers to be status quo, without any warning or notification. If left unresolved and unaddressed, the current situation could set dangerous precedents for internet users across the country by allowing service providers to instantly discontinue provided services at the moment they feel that the services they provide are not being adequately compensated for from outside companies.

    On a side note, I was listening to Howard Stern (oh no!) this morning and he said that his Time Warner internet connection at home didn't work. Howard then called a tech guy to come and fix the problem, only for him to call a help desk to figure out what happened. The help desk didn't even know what was wrong. It sounds like Level 3 just pulled the plug and didn't notify ANYONE. Or maybe it was Cogent, the point is nobody outside of that dispute KNEW what was going on.
    This sounds like a good way to alienate your customers and/or ruin your business model. But that is just my opinion.

    --
    He who knows best knows how little he knows. - Thomas Jefferson
    1. Re:Call the helpdesk...wait, THEY don't even know! by peragrin · · Score: 4, Funny

      You want scary, I can show you scary. I emailed Roadrunner saying I would drop them if they couldn't do something.

      I got a semi canned response but it did have some technical details. It also stated that if you wish to discuss the technical nature of the problem go to www.ask.slashdot.org, with a full link to the other article.

      Yep Roadrunner sent me to slashdot to get more information.

      --
      i thought once I was found, but it was only a dream.
  6. Re:When did this blackout happen by varmittang · · Score: 2, Informative

    I think since Wednesday.

    --
    -----BEGIN PGP SIGNATURE-----
    12345
    -----END PGP SIGNATURE-----
  7. A New Approach by mysqlrocks · · Score: 2, Interesting

    So, it appears a big part of the Internet traffic is controlled by large companies like Cogent or Level 3. No big surprise. I think this highlights the need for a new approach to connecting people together. I know there's been talk of wireless mesh networks where everybody is both an end point and a router. This would work in populated areas but I'm not sure how well it would work for "long haul" connections which is what the issue is here. Can anybody think of (or know of) any alternatives that gives control and power of the Internet back to the people who use it?

    1. Re:A New Approach by fireboy1919 · · Score: 2, Insightful

      I know there's been talk of wireless mesh networks where everybody is both an end point and a router. This would work in populated areas

      This would work in populated areas in theory. In practice, though, 95% of the bandwidth in any given system gets eaten up by 5% of the users unless there is heavy regulation. Actually, we pretty much need the big internet companies in order to get a particular level of QoS.

      Like I said, all it takes is one in fifty who won't play nice to ruin it for everybody else. I'd be willing to bet that 1 in 50 people is a sociopathic jerk - probably even more. Ultimately, we need something to keep the sociopaths from going nuts. "Power to the people," like the anarchy that is mesh networking won't work.

      --
      Mod me down and I will become more powerful than you can possibly imagine!
    2. Re:A New Approach by BeBoxer · · Score: 4, Informative

      I know there's been talk of wireless mesh networks where everybody is both an end point and a router. This would work in populated areas but I'm not sure how well it would work for "long haul" connections which is what the issue is here.

      If by "work in populated areas" you mean "slow the network to a crawl" then yes, it would work. Mesh networking is cool stuff, but you aren't going to build a backbone out of it. Wireless is really fast compared to your DSL line or cable modem. But it isn't even in the same ballpark as what you can do on fiber. Backbone links are running at 10Gbps or even 40Gbps. Full duplex, so that is 20Gbps or 80Gbps of "marketing bandwidth". Compared to what, 22Mbps or 54Mbps half-duplex for your wireless? You aren't going to build a comparable backbone out of wireless links running at roughly 1/1000th of the speed. Physics pretty much guarantees that fiber links will always be faster than wireless.
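
      The gap between those figures can be worked out directly. This is just arithmetic on the nominal link rates quoted above (10/40 Gbps fiber, 22/54 Mbps wireless); the helper name speed_ratio is made up for illustration, and real-world throughput would differ from these signalling rates.

```python
def speed_ratio(backbone_bps, wireless_bps):
    """How many times faster the backbone link is than the wireless link."""
    return backbone_bps / wireless_bps

GBPS = 1_000_000_000
MBPS = 1_000_000

# Compare each quoted backbone rate with each quoted wireless rate.
for backbone in (10 * GBPS, 40 * GBPS):
    for wireless in (22 * MBPS, 54 * MBPS):
        ratio = speed_ratio(backbone, wireless)
        print(f"{backbone // GBPS} Gbps vs {wireless // MBPS} Mbps: about {ratio:.0f}x")
```

      On these nominal numbers the backbone comes out somewhere between roughly two hundred and two thousand times faster, so "roughly 1/1000th" is an order-of-magnitude figure rather than an exact one.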

  8. Peering by Neurotoxic666 · · Score: 5, Funny

    At issue is a type of network connection called 'peering.'

    In other news, the RIAA announced they've stopped an extremely large P2P network.

    --
    You are more than the sum of what you consume. Desire is not an occupation.
  9. Internet can route against natural calamities by anandsr · · Score: 4, Informative

    Internet cannot route when your providers do not want you to communicate.
    Nothing can protect you in this case.
    If on the other hand there was a natural calamity and every one was trying to get you access
    then you would get it. Like it happened during Katrina.
    This is not a natural calamity.

    The best option is to ditch your provider if they are not a monopoly and, if they are, to lobby your government to create multiple providers.

    1. Re:Internet can route against natural calamities by mjh · · Score: 2, Interesting
      Internet cannot route when your providers do not want you to communicate. Nothing can protect you in this case.
      I agree with the first part, not the second part. What protects you if your providers don't want you to communicate is:
      1. contracts that state that your provider is required to allow you to communicate
      2. competition from other providers
      --
      Key to financial independence: Spend less than you earn. Save and invest the difference. Do it for a long time.
  10. It's dupealicious! by mrpotato · · Score: 3, Funny

    But for easy karma, just go get a +5 comment in the other thread, and repost it here without attribution.

    Not that I would ever do such a thing...

    --

    cheers
  11. It always will be fragile by squoozer · · Score: 4, Insightful

    The Internet will IMVHO always be quite fragile. While the design lends itself to robustness the reality is that there is only money for a few very big connections and therefore a disaster that affects one of these connections is going to cause widespread outages.

    Take, for instance, the connections running between Europe and America. I bet most of them run in almost exactly the same place on the sea bed because it's the cheapest / shortest path to take. A fairly localized geological disaster (at least in geological terms) could cut all the cables at once; or at least enough to make a difference.

    If we wanted the network to be robust we would need to run cables up over the north pole and round the equator and probably stick in some satellite links as well. There just isn't money for that. People are willing to accept the risk that it might fail in extreme situations.

    FWIW I think the problem is worse on the global scale than the country scale. I imagine most developed countries probably have enough redundancy in their own country. It's the interconnects between countries that are probably the biggest problem.

    --
    I used to have a better sig but it broke.
    1. Re:It always will be fragile by brunes69 · · Score: 3, Informative
      Take, for instance, the connections running between Europe and America. I bet most of them run in almost exactly the same place on the sea bed because it's the cheapest / shortest path to take. A fairly localized geological disaster (at least in geological terms) could cut all the cables at once; or at least enough to make a difference.

      This isn't a good example, because in this case most traffic would automatically be re-routed to go through Asia and the trans-Pacific cables. And if those went down it would go over South America and Oceania.

      It would get much slower, sure, but would not cause an outage.

      There is no *technical* reason this peering relationship breaking down should be causing an outage either. If they both also peered with some third party that could service them both, like MCI or something, then the traffic would still get through. The companies are just being bull-headed.

    2. Re:It always will be fragile by WuphonsReach · · Score: 2, Interesting

      Ah yes, the joys of thinnet. OTOH, it was very easy to debug if you knew how the thinnet was routed from cubicle to cubicle. When you had a broken segment, you went halfway down the line and terminated it off. If the segment started working, your problem was farther away from the bridge (repeater?). Otherwise, you would head back upstream towards the head of the segment and try again. Where you typically ran into trouble were users who constantly moved equipment (test labs, laptop users). User training fixed most of those issues due to the informal posse of coworkers who would hunt down the frequent offender.
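
      The halve-and-terminate procedure is a bisection search. A minimal sketch, assuming exactly one fault on the segment and taps numbered outward from the bridge; find_fault and works_when_terminated_at are hypothetical names standing in for walking down the line and fitting a terminator by hand:

```python
def find_fault(num_taps, works_when_terminated_at):
    """Locate the single faulty tap on a bus segment by bisection.

    Taps are numbered 0..num_taps-1, counting away from the bridge.
    works_when_terminated_at(i) should return True if the segment works
    when a terminator is fitted just past tap i (so taps 0..i are good).
    """
    lo, hi = 0, num_taps - 1  # the fault lies somewhere in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if works_when_terminated_at(mid):
            lo = mid + 1  # segment works: the fault is farther from the bridge
        else:
            hi = mid      # still broken: head back upstream
    return lo

# Simulated segment of 10 taps with the break at tap 6.
print(find_fault(10, lambda tap: tap < 6))
```

      Each test halves the stretch of cable still under suspicion, which is why knowing how the cable was routed from cubicle to cubicle made the hunt quick.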

      The previous topology in that office had been thicknet (where you had to manually tap the cable). Thinnet was seen as better. Or at least easier to build a network out of in a cubicle environment.

      Token Ring wasn't all that bad. Unlike thinnet, the physical wiring was more of a topology like today's ethernet where you had a dedicated cable running from the patch panel to the workstation's network jack. At least, it was wired that way in the buildings where I've seen it. So it was easy enough to plug/unplug stations from the network in a central location. The topology was also designed to deal with a single break (the stations before/after the break would loopback).

      The usual problems we had with TR were the fragile connectors (problematic for test environments / laptop users with frequent plug/unplug). Plus the issue that you only had 4Mbps (later 16Mbps) and a 4Mbps card wouldn't work on a 16Mbps network. Ethernet hubs/switches did a much better job of handling the upgrade path automatically where one port might be 10Mbps another 100Mbps and a third port running at 1Gbps without redoing your entire network topology.

      --
      Wolde you bothe eate your cake, and have your cake?
    3. Re:It always will be fragile by beebware · · Score: 2, Informative

      Oh, for the record, the BBC has a brilliant network in the UK and the US - http://support.bbc.co.uk/support/network/ . I believe, although I haven't even attempted to confirm it, that they have peering agreements with most of the major UK ISPs.

  12. Keep it private by dada21 · · Score: 2, Insightful

    If you make the Top Tiers a government-controlled service, expect long term problems like censorship, taxation and regulations on sub-level tiers.

    Neither company involved in this dispute wants to do this. They need to work it out, or other companies will find a solution and take the customers.

    If you're desperate to provide data to multiple top tiers, pay for a host that is connected to multiple backbones.

    There is zero need to mandate anything. Let the free market provide and we'll be safer in the long run. Let government provide and we'll see a slowly creeping tyranny online.

  13. Re:Didn't notice at all. by lostlogic · · Score: 3, Informative

    You would only notice if you are on one of these two networks. I am personally on UUNet at home and MCI at work, and my server is on SpringLink (via Schlund, who I am not familiar with). As a result, all of my traffic is completely unaffected. Customers on a single-homed connection through Cogent, or through L3 cannot see other single homed customers on the other network. The rest of us don't know the difference. The dumb thing that this article points out is that both Cogent and L3 are refusing to route packets destined for each other through the rest of the internet (probably for fear of fucking up other peering agreements by dumping too much traffic on their other peers). I believe there was a comment in the previous thread about this issue saying that traffic in one direction could be routed, but that even return packets were being null-routed at some point, preventing any type of connection from being established.

    --
    --Brandon
  14. Not a redundancy issue... by boldtbanan · · Score: 4, Interesting

    As I understood the problem, redundancy wasn't an issue. Level 3 was actively filtering out requests to Cogent, however they came in. The redundancy was working, but Level 3 was playing NetNanny and blacklisting all Cogent IPs.

  15. Re:Ask Slashdot by AlexTheBeast · · Score: 3, Interesting

    The problem with web services is that they need for the internet to be completely secure and completely reliable. The internet of today is neither.

    Physicians trying to use the internet to take care of critically ill patients are already experiencing this. Radiologists sitting home reading films are seeing this as well.

    Is 100% on necessary? Hell, VoIP is making money like crazy over this unstable network of ours.

    My suggestion is to test with people that will understand the limitations of your service. Then get a little VC money to spread your servers out.

  16. The fragility of the net by elfguygmail.com · · Score: 5, Informative

    It's very true, and anyone can see how a few big companies basically make the net work in North America. Simply do traceroutes to various big web sites, and you'll notice the packets always go across the same networks. The biggest one seems to be alter.net (MCI), with others including Level3, above.net, AT&T and UUnet. Basically you remove any of these and the North American part of the Internet would be in chaos. The problem is because most ISPs do the same thing. They pick a primary provider, and get a backup one. The problem is they all pick the same few primary companies, and their backup links are much smaller pipes.

  17. Re:designed to withstand? Says who? by 'nother+poster · · Score: 2, Insightful

    No, the internet was never designed with anonymity in mind, but it was designed to be a communications network that would not experience systemic problems when individual nodes and connections went down.

  18. ah peering by bigpat · · Score: 3, Interesting

    The only time peering should involve an ongoing exchange of money for bandwidth should be when a network is primarily serving as an intermediary between other networks, such as long haul or backbone networks.

    But if most of the traffic from other networks is going to customers that are connected and already paying for your network's service then it makes no sense and is simply wrong for a network to start charging other network providers. It breaks the end to end communication model and is providing your customers with less than the service they are paying for. People pay for internet connectivity so they can transfer data between other users on the internet, not just the ones on your company's network.

    If money exchanging hands is at all appropriate in this case it might be for the actual installation of routing equipment which establishes the physical connection between networks.

  19. Re:A solution can be... by cloudmaster · · Score: 2, Funny

    Be sure to let the UN know about that - this is surely something they'll want to take care of when they take "control of the Internet" away from the US. :)

  20. not a blackout by bradk500 · · Score: 2, Interesting

    All this crap about it showing weakness in the internet is uninformed bs. They didn't just stop peering, but they are actively blocking traffic from Cogent. If Level3 had just stopped peering the traffic would reroute around the problem. The only time you will see problems is if you're a Cogent customer trying to get to a single homed computer on Level3's network. We are a Cogent customer and an Internap customer, and to get around the problem I just rerouted traffic destined for Level 3 networks over one of our Internap T's. This solved the issue for us.
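
    A workaround like that comes down to a more specific route overriding the default one. The sketch below shows longest-prefix matching with Python's standard ipaddress module; it is an illustration only, not the poster's actual router configuration. The prefix 198.51.100.0/24 is a documentation range standing in for a Level 3 network, and the uplink labels and pick_uplink helper are hypothetical.

```python
import ipaddress

def pick_uplink(routes, destination):
    """Return the uplink of the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    matching = [(net, uplink) for net, uplink in routes if addr in net]
    if not matching:
        return None  # no route at all, not even a default
    # Longest prefix wins: a /24 beats the /0 default route.
    return max(matching, key=lambda entry: entry[0].prefixlen)[1]

ROUTES = [
    # Default route: everything goes out via the primary provider.
    (ipaddress.ip_network("0.0.0.0/0"), "cogent"),
    # Override: a prefix behind the unreachable network goes out the second uplink.
    (ipaddress.ip_network("198.51.100.0/24"), "internap"),
]

print(pick_uplink(ROUTES, "198.51.100.25"))  # covered by the override
print(pick_uplink(ROUTES, "203.0.113.9"))    # falls through to the default
```

    This only helps a multihomed site; a single-homed customer has no second uplink to point the override at.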

  21. So the internet is breaking down by iminplaya · · Score: 2, Insightful

    Privatization strikes again. You put the infrastructure into the hands of a few powerful people and this is what you will get. Those big power outages happened for the same reason. We aren't holding those in charge responsible. There is no redundancy when there is only one provider. They can cut you off and what are you going to do? Only community services and coops can provide the necessary robustness. But it seems to be more convenient to just hand it over to corporate pirates.

    --
    What?
    1. Re:So the internet is breaking down by the_real_bto · · Score: 3, Insightful

      "Privatization strikes again. You put the infrastructure into the hands of a few powerful people and this is what you will get."

      Are you arguing that government control moves power from the few to the many? That is backwards to my way of thinking. The quickest way I can think of to concentrate power is to put the government in charge of it.

  22. Cogent Sucks by Lamont · · Score: 2, Interesting

    As a customer who has had Cogent inflicted on us (when Verio sold all their domestic internet lines to Cogent), we've had nothing but pain and bumbling inefficiency from them for the last six months.

    I contacted Cogent's "premium" help desk last night when I found that I was suddenly no longer able to get to our networks in Australia. The tech had no idea that his own company was in the middle of a huge peering battle with L3. I had to tell them!

  23. North Pole Run coming soon by davidwr · · Score: 2, Funny

    As soon as all that pesky arctic ice melts away, it'll be cheap enough to run cable across the pole.

    As a bonus, Santa's new underwater toy factory can tap into it.

    Woo-hoo, faster email to Santa! Hope the jolly old elf doesn't discover online pr0n or he'll never get those presents made on time.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  24. This was predictable by PhilipPeake · · Score: 4, Interesting
    The Internet was designed to be resilient to malfunctions and automatically take appropriate action to ensure connectivity.

    Unfortunately, that is not the Internet that we have today. In the original Internet, every router knew about every network connected to the Internet. Most networks had connectivity to many other networks. Discovery protocols allowed alternative routes to be discovered if one failed.

    Today, we don't have a (mostly) fully connected net, we have ISPs who don't know anything about networks which they don't "own", only that certain IP prefixes need to be passed to ISP x, y or z.

    This makes the infrastructure much more fragile than it was originally intended to be. We ended up with this for a few reasons. First, the wimpy routers in use at the time had limited memory available to hold the network maps. The answer chosen was to no longer attempt to hold a full world view, but to divide the world into regions, certain IP prefixes would "belong" to those regions, and all any router would need to know about was networks in its region, plus how to route traffic to other regions, who would take care of routing within the region. This led to "backbone" connections - high capacity links needed because all traffic between regions now didn't "diffuse" through the network, but was channeled into specific connections. It also set the scene to allow the net to be commercialised, those regional centers were obvious "choke points" that an enterprising company could own and pretty much dictate the pricing to lower level enterprises who would do the dirty work of dealing with end-users.

    Slowly but surely the Internet evolved into a system dependent upon a few companies with high-speed links between them - prime candidates BTW, as locations for government control to be imposed. The self-healing nature of the original Internet was lost because all traffic HAS to pass via the top level companies' infrastructure and over their interconnect backbone connections.

    The "self healing" Internet is long gone.

    1. Re:This was predictable by Red+Flayer · · Score: 2, Interesting

      Slowly but surely the Internet evolved into a system dependent upon a few companies with high-speed links between them - prime candidates BTW, as locations for government control to be imposed. The self-healing nature of the original Internet was lost because all traffic HAS to pass via the top level companies' infrastructure and over their interconnect backbone connections.

      This is what happens when you have an industry based upon a high cost of entry (physical infrastructure, here) and a low marginal cost of supply. We need fat pipelines because we demand fast speeds and high volumes for our traffic. If we didn't have regions, but instead had the "original self-healing internet," how long do you think it would take to download big files if the source didn't happen to be just 2 or 3 routers away? Say goodbye to streaming video, etc.

      Net cost of transmission would be far higher for packets that are many routers away in a truly web-based system, since not all paths are equal.

      The problem is, how do we balance cheap efficiency (fatline "superhighways") with expensive redundancy to optimize the system for all participants?

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
  25. Re:The small should pay for the big? (mod this up) by gskouby · · Score: 5, Informative

    About 4 months ago I got a call from a sales critter at Cogent saying "We will knock 50% off of the price you are paying for your L3 connectivity if you drop them and come be our customer." I was kind of surprised at the boldness of this proposition because they were specifically targeting current L3 customers. I was even more surprised to find out from others that this sales pitch from Cogent was company wide. Of course this pissed off L3 and that was the start of this pissing contest.

  26. Re:Net blackouts by Evil+W1zard · · Score: 2, Insightful

    Ah I see so by giving control over to the UN it will magically put in place all the hardware, software and correctly configure the web to never fail? Hopefully this statement was made to be a joke because otherwise it doesn't make a bit of sense. I picture the web now as a 100,000 foot long giant slinky that someone has twisted into oblivion. I don't even know if the web can be fixed at this point...

    --
    News Reporters Make Tasty Polar Bear Treats!
  27. Re:Didn't notice at all. by LurkerXXX · · Score: 2, Insightful
    It's not a blackout. This is a FUD story. There are plenty of alternate routes in place to 'heal' a broken link between the Level3 and Cogent networks. The problem is Level3 is deliberately black-holing traffic to Cogent on the Level3 routers.

    If Level3 didn't want to peer anymore with Cogent, that's understandable; it wasn't an even exchange of traffic anymore. They could have done the right thing and simply stopped the peering. Instead, they have decided to be vindictive and filter any traffic to/from Cogent's IP range, even if the traffic is coming through some other ISP's network that Level3 still has peering or paid relationships with.

    Once again, the internet routers are perfectly able to find routes; Level3 is just deliberately trashing the packets before they get there. The Internet isn't 'unstable'. Any ISP can filter packets entering or leaving their network, and Level3 has decided to do so in a bad way. This just means Level3 customers should be pissed. This is nothing for anyone to get their panties in a bunch over except Cogent and Level3 customers, whose ISPs are being dicks.

  28. Monitor it yourself by dereference · · Score: 4, Interesting
    I found this site while trying to research the problem. I wish I had known of it earlier; it provides a very nice (near) real-time snapshot of all the Tier 1 peering:

    http://www.internetpulse.net/

    I'm not affiliated with them in any way, and I'm sure there are other similar sites, but I thought it was worth mentioning.

  29. Re:Crazy Idea by gfilion · · Score: 2, Insightful

    Damage: Level3 won't accept Cogent traffic. Horrible hack: tunnel BGP traffic to Level3 customer who masquerades requests as local traffic.

    You don't need to masquerade anything, if you're connected to Level3 and Cogent, just configure your router to advertise your route to the Level3 network on the Cogent side and vice-versa.

    Then watch your router melt under the hundreds of gigabits of traffic -- that you'll have to pay for both ways. Congratulations, you're the new peering agreement between Level3 and Cogent!

  30. Re:This is so strange... by cloudmaster · · Score: 2, Funny

    Knowing that it pissed Howard Stern off and wasted some of his time, I now feel much better about this outage.

  31. The problem here is conflicting business models by Mulligan · · Score: 2, Informative

    At the fringes there are really two types of internet service offered: upstream and downstream. Most consumers (individuals) need a lot of downstream and very little upstream. They typically are sold asymmetric service that is heavily biased in this direction. My cable connection, for example, gives me ~5Mbps down and 768kbps up. On the flip side are the content providers who typically need a lot of upstream bandwidth and less downstream bandwidth. ISPs have found that these customers are willing/able to pay quite a bit more for their internet connections. Therefore, the law of supply and demand has increased the cost of connections with higher upstream capacity.

    Several levels up the ISP hierarchy, however, there are mostly only symmetric lines (T3, OCx, ...) providing equal upstream and downstream bandwidth. In order to maximize the use of this bandwidth, many providers try to balance the number of content providers with content consumers in order to use the upstream and downstream capacity equally. In theory, this usage should be well balanced by the time it reaches the Tier 1 providers.

    The problem we are having right now is caused by Cogent not subscribing to that business model. They have found that the cost to support content consumers is much higher than the cost to support providers. (If for no other reason than there are far more of them.) So, their business model skews heavily towards the provider customers, reducing their operational costs. This, in turn, means that they are able to offer lower costs to those content providers -- in many cases undercutting the other big service providers such as Level 3.

    This, of course, makes the other providers unhappy because it cuts into their high-yield business. So, occasionally, one of them demands compensation for "transit" instead of providing free peering. They do this because they feel (rightly IMO) that Cogent is able to make more money on these high paying content providers by using an asset owned by the other service providers -- the online customer/consumer base. Basically, Level 3 is telling Cogent that because Cogent is making money by using that virtual asset owned by Level 3, Cogent owes Level 3 some sort of compensation. It is worth noting that several other Tier 1 providers already take this approach with Cogent and Cogent is forced to pay for "transit" service to those providers' customers.

    As long as all the Tier 1 providers cooperate, the system works reasonably well. However, in this case, Cogent is trying to take advantage of that informal cooperation to make some extra money. So, they are being capitalists. In this case, capitalism is at odds with cooperation and the system is not working well.

    Many people are calling for government regulation to prevent this sort of situation. I expect this to cause some major problems. The issue could be resolved if all the Tier 1 providers would realize that there is a different market value for ingress and egress traffic. In a free market, I expect that the ingress traffic (corresponding to upstream traffic of content providers from the lower levels) would have substantially more value than the egress traffic (downstream traffic to consumers). The egress traffic might even have negative value (meaning that a service provider would charge to take care of it). In the case that two peers balance their traffic well (the ideal cooperative solution) no money needs to change hands. In the other cases (like this one) the ISP with excess egress usage should probably be charging the one with excess ingress.
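
    A rough sketch of that settlement idea, in Python. The 2:1 ratio threshold, the function name and the traffic figures are all assumptions made up for illustration; nothing in the thread says what ratio Level 3 or Cogent actually use. The network that pushes more traffic into the other one is treated as the payer, which is how the paragraph above reads.

```python
def settlement(name_a, a_to_b, name_b, b_to_a, max_ratio=2.0):
    """Decide whether two peers exchange traffic for free.

    a_to_b and b_to_a are traffic volumes in any common unit.
    Returns None when the exchange is balanced within max_ratio,
    otherwise (payer, payee, excess), where the payer is the net sender.
    """
    heavy, light = max(a_to_b, b_to_a), min(a_to_b, b_to_a)
    if heavy == 0 or heavy <= max_ratio * light:
        return None  # balanced enough: no money changes hands
    if a_to_b > b_to_a:
        payer, payee = name_a, name_b
    else:
        payer, payee = name_b, name_a
    return payer, payee, heavy - light


# Hypothetical volumes: a content-heavy network sends three times what it receives.
print(settlement("content-heavy", 900, "consumer-heavy", 300))
print(settlement("content-heavy", 500, "consumer-heavy", 400))
```

    The first call reports the content-heavy network as owing for 600 units of excess; the second returns None because 500:400 is inside the assumed ratio.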

    Unfortunately, there is no fluidity to the system between the true market (the upstream and downstream bandwidth consumers) and the core market (the Tier 1 providers). If there were, Level 3 could justify their demand for more money based on the value of the traffic they were accepting from lower down the food chain.

  32. Re:A solution can be... by Angostura · · Score: 2, Funny

    Wow, that's a nice idea. That'll mean that all I have to do is run a bit of Ethernet into a peering point and I'll get free connections to all the tier ones. Fabulous.

    Oh - hang on, if someone else runs a bit of Ethernet in, do I have to connect to them? Damn.

  33. Baloney. It's just bad companies. by Spazmania · · Score: 2, Informative

    If either Level3 or Cogent was buying a "default" service from a third party, their customers wouldn't have a problem. The moment the peering connection was cut the lower-priority BGP routes from the third party would have taken over and their traffic would have gone through the third-party link.
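
    A minimal sketch of that failover in Python, assuming a toy routing table where the higher preference wins (in the spirit of BGP's LOCAL_PREF). The Route class, the neighbour names and the prefix are hypothetical; this is not real router configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # destination prefix (a documentation range here)
    next_hop: str    # neighbour the route was learned from
    local_pref: int  # higher is preferred


def best_route(routes, prefix, down=frozenset()):
    """Return the preferred route to prefix that avoids failed neighbours."""
    usable = [r for r in routes if r.prefix == prefix and r.next_hop not in down]
    return max(usable, key=lambda r: r.local_pref, default=None)


routes = [
    Route("198.51.100.0/24", "peer-link", local_pref=200),
    Route("198.51.100.0/24", "transit-provider", local_pref=100),
]

# Peering up: the free peer link wins.
print(best_route(routes, "198.51.100.0/24").next_hop)
# Peering cut: the lower-priority transit route takes over.
print(best_route(routes, "198.51.100.0/24", down={"peer-link"}).next_hop)
# No transit bought at all: the destination simply becomes unreachable.
print(best_route(routes[:1], "198.51.100.0/24", down={"peer-link"}))
```

    The last call is the situation described here: with only the peering route in the table, cutting the peer leaves nothing to fall back on.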

    The reason these two jokers are having this problem is that they made a business decision to only move traffic with reciprocal peering and then failed to keep that peering alive. That's because they're both cheap-ass bastards; peering costs a heck of a lot less than buying transit.

    Go buy from someone else who isn't a cheap-ass. Someone who buys transit for anything they can't peer. You won't have a problem.

    The only lesson here is that most time honored of lessons: you get what you pay for.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  34. Re:The small should pay for the big? (mod this up) by monkeydo · · Score: 2, Insightful

    The pitch is even better now. If you are an L3 transit customer, Cogent will give you free service for a year. For L3's current customers this solves the immediate problem, and they wind up multi-homed, so they don't get bitten by this in the future.

    --
    Si vis pacem, para bellum
    The only thing more annoying than a Libertarian is an (un|mis)informed Libertarian
  35. It has probably been said... by cr0sh · · Score: 2, Informative
    I am late to this thread, and it has probably been said anyhow, but I want to reiterate:

    The problem isn't solely with the business arrangements between the "big providers" - oh, certainly, that does have impact, but the internet would be as robust as ever, if every participant on it could be a peer.

    This is how the network was meant to be, a mesh comprised of stupid interconnects and smart nodes. Every node on the internet, from the largest colo to the smallest wireless handheld, should have the ability to be a true peer on the internet. In practice, this isn't really possible, but imagine a mesh network with a distributed p2p DNS system which many people could run if they wanted to - if only a fraction were running it, and were distributed enough, such outages might not occur (the traffic could continue to be routed, albeit at a slower pace).

    Everyone should be able to be a peer on the network, everyone should be able to get at least one static IP, everyone should be able to run their own server(s) if they want to. Right now, the only way you can do it is by paying huge amounts of $$$ so you can get a garden hose instead of a straw. I am not saying access to the internet should be or could be free, but peering should be a natural right of being a part of the internet, not something you have to pay extra (a LOT extra) for.

    --
    Reason is the Path to God - Anon
  36. Roadrunner affected by Jeff85 · · Score: 2, Informative

    I had a friend on Roadrunner who complained he couldn't connect to many sites. I think he happened to know that they used Level 3. Is there a way to determine what backbone your ISP or a particular site uses?

    --
    Fetch Text URL - Firefox Extension
    1. Re:Roadrunner affected by Secrity · · Score: 2, Informative

      Is there a way to determine what backbone your ISP or a particular site uses?

      The Unix traceroute command can be used to do this:

      $ traceroute slashdot.org
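
      If you want to pull the carrier names out of that output, something like the Python sketch below might do. It assumes the common Unix traceroute layout of "hop number, hostname, (address), timings", and it takes the last two labels of each hostname as the carrier's domain, which is only a rough heuristic. The sample output is invented, using reserved example names and documentation addresses.

```python
import re

# Hop line: number, hostname, then an IPv4 address in parentheses.
HOP = re.compile(r"^\s*\d+\s+(\S+)\s+\((\d{1,3}(?:\.\d{1,3}){3})\)")


def carriers_on_path(traceroute_output):
    """Return the distinct two-label domains of named hops, in path order."""
    seen = []
    for line in traceroute_output.splitlines():
        match = HOP.match(line)
        if not match:
            continue  # header line, or a hop that timed out ("* * *")
        host, address = match.group(1), match.group(2)
        if host == address:
            continue  # no reverse DNS name, only the bare address
        domain = ".".join(host.split(".")[-2:])
        if domain not in seen:
            seen.append(domain)
    return seen


# Invented sample in the assumed layout.
sample = """\
traceroute to www.example.org (203.0.113.80), 30 hops max, 60 byte packets
 1  gw.lan.example (192.0.2.1)  1.0 ms  1.1 ms  0.9 ms
 2  edge1.isp.example.net (198.51.100.1)  8.2 ms  8.0 ms  8.4 ms
 3  * * *
 4  core7.backbone.example.com (198.51.100.77)  15.3 ms  15.1 ms  15.6 ms
 5  www.example.org (203.0.113.80)  20.2 ms  20.0 ms  20.5 ms
"""
print(carriers_on_path(sample))
```

      Router hostnames usually give the carrier away, but not every hop has a reverse DNS entry, so treat the result as a hint rather than proof.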

  37. Re: Fucking Kids stuff by EddyPearson · · Score: 2, Insightful

    HOW PATHETIC! Two major ISPs are willing to piss off thousands of people just because they've thrown their toys out of the pram. Back at school they'd get told to shut up and get along, now it'll become a legal action. GET A FUCKING GRIP!!! I thought the world was too sensible for this kind of thing. I was wrong.

    --
    You feel sleepy. Close your eyes. The opinions stated above are yours. You cannot imagine why you ever felt otherwise.
  38. Re:A solution can be... by frost22 · · Score: 2, Informative

    *Sigh*. Why do you spew nonsense if you actually have not even found out what a clue looks like, not to mention ever acquired one?

    So you claim there are no Internet Exchange Points?

    Pray tell, what is this thing? Or that one, not to mention the middle one.
    Oh, and what do you think those guys do for a living?

    Nobody expects you to be a fucking genius or know everything. But why are some folks constantly touting stupid nonsense instead of keeping their mouths shut and learning something?

    --
    ...and here I stand, with all my lore, poor fool, no wiser than before.
  39. Re:It's Nobody's Fault by Alioth · · Score: 3, Informative

    Cogent COULD route around the damage - if they wanted to, but they don't.

    If the peering point had been taken out by a bomb, the re-routing would have been performed in fairly short order. However, this is not the case here.

    Level3 think that Cogent is taking the piss and is not a real peer. Level3 want Cogent to buy transit to reach Level3, either directly from them (or from someone else) because at the moment the peering is very lopsided, and costing Level3 a bucketload of money and giving Cogent a boatload of free bandwidth.

    Cogent on the other hand doesn't want to pay for transit to Level3.

    Right now, Cogent could route all their traffic for Level3 over transit they pay for. They don't want to do that because it won't force Level3 back into the peering agreement. So what they do is leave the link severed and do not re-route so that Level3 customers cannot get to sites hosted by Cogent. This means Level3 customers will grumble at Level3. Additionally, they offer a year's free transit to single homed Level3 customers just to raise the brinkmanship with Level3 a notch higher. Basically it's war between L3 and Cogent.

    If Cogent re-routes their traffic, they are defeated and L3 will never re-peer. What Cogent are hoping is that enough angry customers on the L3 end will whine at L3 so L3 will be forced to re-peer.

    For the rest of us in the peanut gallery (i.e. those of us who aren't single homed customers of Cogent or Level3) we can just watch the fun and games and throw peanut shells at the squabbling combatants because we don't see any black hole at all.

  40. Re:Didn't notice at all. by LurkerXXX · · Score: 2, Informative
    I have friends working for other Level3 clients and peers. The packets are getting to Level3. Then they disappear. The routes ARE advertised to them. That's the beauty of good internet routing, it heals around wounds.

    FYI, smaller ISPs pay larger ISPs for bandwidth all the time. The larger ISPs have huge costs. Switches costing hundreds of thousands of dollars, filled with a bunch of cards that each cost hundreds of thousands. Lots of them. Lots of fiber and other costs. It gets real easy to have billions invested just in hardware. They offset those costs in part by selling bandwidth to smaller ISPs. That's the way the net works.

    Try telling some small ISP that they should stop paying their upstream provider. That the upstream provider should give them bandwidth free so that the larger ISP's customers can access websites hosted by the smaller ISP. They will tell you you are living in a dream world. That's not the way the net works.

  41. Re:Didn't notice at all. by LurkerXXX · · Score: 2, Interesting

    That's exactly what it looks like. And yes, the routers are set to find another path. The problem is when it finds a new path through some 3rd or 4th ISP to the Level3 network, as soon as the Level3 router sees the packet originated from a Cogent IP address, it null routes it. That's not a problem with fragility of the net, it's Level3 behaving badly. (Note: Cogent should have ponied up money for traffic to a larger provider to avoid this mess in the first place. There are no good guys involved in this.)
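
    To make the filtering concrete, here is a toy Python version of "drop anything whose source is in a blocked range". The blocked prefix is a documentation range standing in for the other network's address space, not anyone's real allocation, and the function name is made up. Whether Level3's routers are configured this way is the claim above, not something this sketch can confirm.

```python
import ipaddress

# Hypothetical blocked source range (documentation prefix, not a real allocation).
BLOCKED_SOURCES = [ipaddress.ip_network("203.0.113.0/24")]


def handle_packet(src, dst):
    """Return "drop" if the source is in a blocked prefix, else "forward".

    dst is deliberately ignored: the decision depends only on where the
    packet came from, not on which path it took or where it is going.
    """
    src_addr = ipaddress.ip_address(src)
    if any(src_addr in net for net in BLOCKED_SOURCES):
        return "drop"
    return "forward"


print(handle_packet("203.0.113.25", "192.0.2.10"))   # source in blocked range
print(handle_packet("198.51.100.7", "192.0.2.10"))   # any other source
```

    Because the check keys on the source address, rerouting through a 3rd or 4th ISP changes nothing: the packet still arrives with the same source and still gets dropped.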

  42. Userfriendly? by wembley+fraggle · · Score: 2, Interesting

    Is this why I can't read userfriendly or Something Positive this morning? Or is it just some weird coincidental webcomic blackout?

  43. The peering is back up by Anonymous Coward · · Score: 2, Informative

    The problem has been solved. I can ping Level3 from Cogent and I have one connection. I don't know yet who flinched first......

  44. Fixed now? by dereference · · Score: 3, Informative

    The availability grid for the past 4 hours shows ~40% and the grid for the past 1 hour shows 100%. As noted by "Cally" below, I honestly have no idea how exactly this grid has been generated (hence my original disclaimer) but this certainly seems to indicate, from a practical standpoint, that the L3/Cogent issue has been very recently resolved. Indeed, from my (single-homed) L3 server I can now traceroute directly to a (single-homed) Cogent host.

  45. That's not how peering works - here's the diff by billstewart · · Score: 3, Informative
    There are two basic ways that networks connect to each other - peering and transit. In a transit arrangement, one network (typically the big one) agrees to deliver any traffic the other network hands it, in return for a bunch of money, and it typically either advertises a default route (telling a small customer that they can send it all their packets) or a bunch of detailed routes and a default (telling a dual-homed medium-large customer how good its connections are to lots of places, but that customer might use another carrier for destinations that are closer with that carrier.) If you're an end customer, or a small ISP buying service from a big ISP, that's usually what you buy.

    Peering arrangements are different. Two networks that have a lot of traffic for each other will set up direct connections, split the direct costs of the connections, and not charge for accepting packets from the other carrier. But they'll only advertise the routes for their *own* customers. If two small ISPs peer with each other, typically they're each also buying transit service from big ISPs, but it's cheaper for them to dedicate a connection or put bits on a public peering point like MAE-West than to both pay their upstream ISPs.
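
    The difference in what gets advertised can be sketched in a few lines of Python. This is a simplified model, not router configuration: the LearnedRoute class, the relationship labels and the prefixes (documentation ranges) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearnedRoute:
    prefix: str
    learned_from: str  # "customer", "peer", or "provider"


def exported_routes(table, neighbor_kind):
    """Pick the prefixes announced to a neighbour of the given kind."""
    if neighbor_kind == "customer":
        # Transit: the paying customer is told how to reach everything.
        return [r.prefix for r in table]
    if neighbor_kind == "peer":
        # Peering: only routes for our *own* customers are announced.
        return [r.prefix for r in table if r.learned_from == "customer"]
    raise ValueError(f"unknown neighbour kind: {neighbor_kind}")


table = [
    LearnedRoute("192.0.2.0/24", "customer"),
    LearnedRoute("198.51.100.0/24", "peer"),
    LearnedRoute("203.0.113.0/24", "provider"),
]

print(exported_routes(table, "customer"))  # all three prefixes
print(exported_routes(table, "peer"))      # only the customer prefix
```

    Under this model a peer never learns a path through you to some third network, which is why losing a peering link leaves no automatic detour unless somebody is also buying transit.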

    The biggest ISPs in the US are called "Tier 1" ISPs, and they all peer with each other rather than buying transit, though they might buy transit for international connections, if they can't get the other side to buy transit from them. It seems flaky, but it makes business sense, or at least it did for a while. In some sense, being big enough that all the other Tier 1s will peer with you is what defines Tier 1, and aside from technical issues, it's a marketing thing - "See, we're one of the big players!" Peering and Transit don't mix very well - you either connect to a given carrier by peering, or by transit, or else you spend a long time hammering out custom arrangements about exactly which routes you'll accept and tweaking routing tables.

    Cogent is a Wannabe-Tier-1. Their main business model is to put fiber into big multi-tenant office buildings and sell everybody 100-meg Ethernet for about the price other carriers charge for one or two T1s. If I were a customer, I wouldn't expect there to be enough upstream to really get that much bandwidth all the time, but I'd expect to get more than a T1 all the time, and a lot more than a T1 almost all the time. Level 3 has apparently decided they're not getting enough value out of the relationship (i.e. not sending Cogent enough packets to make it worth their while) to keep peering, and wants Cogent to either pay them for service or get transit from somebody else. They gave them about 50 days to make other arrangements, but Cogent decided to play chicken with them.

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks