Slashdot Mirror


Why Google Went Offline Today

New submitter mc10 points out a post on the CloudFlare blog about the circumstances behind Google's services being inaccessible for a brief time earlier today. Quoting: "To understand what went wrong you need to understand a bit about how networking on the Internet works. The Internet is a collection of networks, known as "Autonomous Systems" (AS). Each network has a unique number to identify it known as AS number. CloudFlare's AS number is 13335, Google's is 15169. The networks are connected together by what is known as Border Gateway Protocol (BGP). BGP is the glue of the Internet — announcing what IP addresses belong to each network and establishing the routes from one AS to another. An Internet "route" is exactly what it sounds like: a path from the IP address on one AS to an IP address on another AS. ... Unfortunately, if a network starts to send out an announcement of a particular IP address or network behind it, when in fact it is not, if that network is trusted by its upstreams and peers then packets can end up misrouted. That is what was happening here. I looked at the BGP Routes for a Google IP Address. The route traversed Moratel (23947), an Indonesian ISP. Given that I'm looking at the routing from California and Google is operating Data Centre's not far from our office, packets should never be routed via Indonesia."
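
To make the misrouting concrete, here is a minimal sketch of how a simplified router might choose between two announcements for the same destination. It is an illustration under stated assumptions, not the real BGP decision process, which compares local preference and several other attributes before AS path length. The function name `select_route`, the prefix `192.0.2.0/24` (documentation range) and the AS numbers 64500 and 64501 are hypothetical; 15169 and 23947 are the AS numbers quoted above.

```python
import ipaddress

def select_route(dest, routes):
    """Pick the route a simplified router would use for dest.

    routes is a list of (prefix, as_path) pairs, with the origin AS
    last in each path. Longest matching prefix wins; ties go to the
    shortest AS path. Real BGP uses more tie-breakers than this.
    """
    addr = ipaddress.ip_address(dest)
    candidates = [
        (ipaddress.ip_network(prefix), path)
        for prefix, path in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None
    # Sort key: more specific prefix first, then fewer AS hops.
    return min(candidates, key=lambda r: (-r[0].prefixlen, len(r[1])))

# Hypothetical table: one legitimate path to AS 15169 and one bad
# announcement of the same prefix that happens to look shorter.
routes = [
    ("192.0.2.0/24", [64500, 64501, 15169]),
    ("192.0.2.0/24", [64500, 23947]),
]

best = select_route("192.0.2.10", routes)
print(best[1])
```

In this toy model the bad announcement wins simply because nothing checks whether AS 23947 should be announcing that prefix at all.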

28 of 110 comments

  1. Re:And I thought.. by YodasEvilTwin · · Score: 4, Funny

    Only 6 and a half years late on that joke.

  2. All your packets are belong to... by Adeptus_Luminati · · Score: 5, Interesting

    ... Network Admins who have no clue. Like when just 4 years ago, Pakistan took down YouTube...
    http://securitywatch.pcmag.com/dns/285152-pakistan-takes-youtube-down

    Clearly this should be on the agenda for the new "Cyber Reserves" of the department of Homeland Security. If Google can be taken down by accident in parts of the world, then it certainly can be taken down on purpose. Route filters are your friends!

    CYBER RESERVES: http://www.techradar.com/news/internet/department-of-homeland-security-recruiting-for-cyber-reserve-1109906

    --
    No trees were killed in the making of this post; however, many trillions of electrons were horribly inconvenienced.
    1. Re:All your packets are belong to... by aaaaaaargh! · · Score: 4, Funny

      If Google can be taken down by accident in parts of the world, then it certainly can be taken down on purpose.

      Oh my god, that would be the end of the world as we know it... I'd have to use Bing for a few minutes!

      You're right, we need to get the cyber-army ready!

    2. Re:All your packets are belong to... by icebike · · Score: 4, Insightful

      How is it a problem, again? Something bad happened, it got fixed right quick. I fail to see how it's a call to arms for anything. or anybody. If idiots keep broadcasting bad routes, then other networks will be more rigorous about their filtering. This doesn't need a committee.

      Something bad happened, it got fixed right quick. This Time.

      What about next time, when the whole mess is run by the UN?

      If idiots are currently accepting bad routes from idiots that broadcast them, then it surely does need fixing.
      Why would you rely on bottom-up security?

      --
      Sig Battery depleted. Reverting to safe mode.
  3. Re:Will DNSSEC help with this? by X0563511 · · Score: 5, Informative

    Nope. DNS doesn't mean shit if the routers are sending your traffic to the wrong place. (DNS points to an IP, which is (supposed to) point to the target machine. If that last part isn't working, the first part won't work no matter what)

    --
    For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
  4. Root cause was PCCW, not Moratel by Aqualung812 · · Score: 5, Interesting

    From TFA:

    Someone at Moratel likely "fat fingered" an Internet route. PCCW, who was Moratel's upstream provider, trusted the routes Moratel was sending to them. And, quickly, the bad routes spread.

    Yes, someone at Moratel screwed up, but this is exactly why upstream ISPs should never allow advertisements from their customers for networks that their customer does not control.

    PCCW is to blame for allowing this to happen. Never trust customers with things that don't belong to them.
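
A rough sketch of the kind of customer filter being argued for here, assuming the upstream keeps a table of which prefixes each customer AS may announce. The table `CUSTOMER_PREFIXES`, the function `accept_announcement`, AS 64496 and the prefixes are all hypothetical; real routers express this as prefix lists in their own configuration language.

```python
import ipaddress

# Hypothetical record of what each customer AS is authorised to announce.
CUSTOMER_PREFIXES = {
    64496: [ipaddress.ip_network("203.0.113.0/24")],
}

def accept_announcement(customer_asn, prefix):
    """Return True only if prefix lies inside space registered to the customer."""
    announced = ipaddress.ip_network(prefix)
    allowed = CUSTOMER_PREFIXES.get(customer_asn, [])
    return any(
        announced.version == net.version and announced.subnet_of(net)
        for net in allowed
    )

print(accept_announcement(64496, "203.0.113.0/24"))  # the customer's own space
print(accept_announcement(64496, "192.0.2.0/24"))    # someone else's space
```

An unknown customer AS has an empty allow list, so everything it announces is dropped by default.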

    --
    Grammer Nazis - I mod you "troll" unless you actually add something on-topic. Yes, I know I have mispellings in my sig.
    1. Re:Root cause was PCCW, not Moratel by Anonymous Coward · · Score: 5, Interesting

      PCCW is to blame for allowing this to happen.

      Again. They were also the upstream for the Pakistan-takes-down-YouTube fiasco.

    2. Re:Root cause was PCCW, not Moratel by vlm · · Score: 5, Informative

      Yes, someone at Moratel screwed up, but this is exactly why upstream ISPs should never allow advertisements from their customers for networks that their customer does not control.

      Another important point is that it's twenty freaking twelve, and at a "respectable" ISP this was part of my job a decade ago. Too many customers try advertising too much stupid space. Rule number one for a BGP operator... never trust what's incoming from nobody. Rule number two is when you call in for support and 1st level call center tells you to reboot everything, tell them to F off and transfer directly to my desk unless you want to learn the joys of route flap dampening. Rule 2 is hilarious when there's a genuine catastrophic failure and like 30 customers all want to talk to me personally because all their sessions dropped when the Juniper caught fire or whatever it was... so beware.

      There are only three things funnier than a fat finger BGP route advertisement:
      1) Why can't I advertise my old /28 from AT&T on your network? Well dumbass, that's their space, not "your" /28, and secondly on the civilized internet everyone filters at /24 or bigger to keep out the riff raff, so even if I was dumb enough to advertise a subnet of another ISP's space, no one's gonna see it past our borders.
      2) Multihomed people who basically accidentally try to turn themselves into a transit network. Oh, you connect to L3? How nice. You don't really want to advertise that the whole freaking internet can route thru you to reach it, do you?
      3) Advertising space in BGP, maybe redistributing a static or null route, doesn't mean you can actually route it on your internal network. OK I see your measly little /20 and now that you let me know to update our filters, we can all see it via us on any looking glass in the world. Yes I'm quite sure it doesn't work and no, it's not BGP's fault, go fix your internal routing protocol and filters and GTF off my phone so I can go back to sleep. No, for the 20th time, it's not a BGP problem, just look at the looking glass, I'm not filtering you anymore.
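
Points 1 and 2 above can be sketched as two small checks. This is an illustration only: `too_specific` and `looks_like_transit_leak` are hypothetical names, the AS numbers are from the private/documentation ranges, and the leak check assumes a stub customer with no downstream customers of its own (otherwise a foreign origin AS could be legitimate).

```python
import ipaddress

MAX_IPV4_PREFIXLEN = 24  # "filters at /24 or bigger": drop anything longer than /24

def too_specific(prefix):
    """True for IPv4 prefixes more specific than a /24, e.g. a /28."""
    net = ipaddress.ip_network(prefix)
    return net.version == 4 and net.prefixlen > MAX_IPV4_PREFIXLEN

def looks_like_transit_leak(as_path, customer_asn):
    """True if a stub customer passes along a route it did not originate.

    as_path lists the nearest AS first and the origin AS last.
    """
    return (
        bool(as_path)
        and as_path[0] == customer_asn
        and as_path[-1] != customer_asn
    )

print(too_specific("198.51.100.16/28"))
print(looks_like_transit_leak([64496, 64500, 64510], 64496))
```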

      The primary problem is BGP is a social layer 8 protocol for how network managers... manage. You don't learn that shit in a weekend training class where they teach you the exact syntax of "show ip bgp neighbor" or by memorizing AS path regex syntax or whatever. At least up till I got out of the business half a decade ago, no one was teaching anything like a "this is how to use BGP while not making an ass outta yourself" class. No book either. I think "Internet Routing Architectures" and maybe the name Halabi sticks in my mind as a good theoretical book as I recall, but no one had a practical "real" hands on class or book. I suppose I shoulda done something about that but it's been a long time now. Then again I've probably forgotten more about BGP than most one week CCNP bootcampers will ever know, so maybe it's not too late anyway. Another "in my infinite spare time" project.

      Sorry if I've offended any /.er I've actually talked to on the job who Fed up, nothing personal... But since I carefully identified no one by name, at least no one knows you Fed up. If today I failed to offend anyone who Fed up while I was doing front line BGP support then I'll try harder next time. BGP is kind of the network engineering version of giving little kids boxes of matches. It's surprising more networks don't burn down, but boxes of matches are so blasted useful if you actually know how to use them safely, so it's not like we'll ever get rid of it.

      --
      "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    3. Re:Root cause was PCCW, not Moratel by steelfood · · Score: 4, Informative

      In the age of information, there is one thing people continue to forget: information relies on trust. And like sociology tells us, trust as a commodity is only easy to trade on a small scale. Trust is very hard to acquire in large populations.

      There are two fundamental flaws with the internet. The first is that it was originally designed and built on a small scale. Trust was not an issue. This is apparent everywhere, at every layer. Every piece of information received is inherently considered true. Validation is limited only to determining the accuracy of the reproduction.

      When trust became a problem, people attempted to address this issue via a glorified whitelist. Certificates were meant to address both concerns of the accuracy of the information, and the validity of the origin. Trust in the contents of the whitelist was implicit. It worked on small scales, but on large scales, it fails.

      The whitelist was used because of the second fundamental problem: statelessness. Trust relies on the continual accuracy throughout many interactions. It cannot be calculated or created out of materials, but is acquired over time. The more times the information is accurate from a particular source, the greater the trust in the information. Time requires state. It requires having both a before, and an after.

      The stateless nature of the internet makes it impossible to be fully trusted. Even if the internet had state, it is difficult enough to devise an algorithm that will accurately calculate the trustworthiness of a piece of information. Trust is a judgment call. It is a product of emotion, not of logic. Without state, it is an impossibility.

      --
      "If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be."
  5. On the other hand: escape from bad ISP by DamonHD · · Score: 4, Interesting

    This sort of 'feature' did allow me once to escape from a misbehaving ISP holding me hostage and preventing me getting my mail to, for example, change my DNS glue records many many years ago. A helpful friendly new ISP managed to reroute traffic to me via them with a "bogus" routing announcement long enough for me to fix those records and then escape the old ISP when the new records propagated.

    Rgds

    Damon

    --
    http://m.earth.org.uk/
  6. Re:Happens all the time, just not usually to Googl by Adeptus_Luminati · · Score: 3, Insightful

    Seriously, a porn link in your sig?

    Anyway... clearly Anonymous hasn't learned how to delete BGP filters and inject fake routes yet.

    --
    No trees were killed in the making of this post; however, many trillions of electrons were horribly inconvenienced.
  7. China already did this in 2010 by hydrofix · · Score: 5, Interesting

    China Telecom also hijacked web traffic to US government websites in April 2010 for 17 minutes. At least that incident seems to have been a purposeful disruption to capture sensitive data and/or try out a novel cyberwarfare tactic.

  8. Re:Happens all the time, just not usually to Googl by Anonymous Coward · · Score: 5, Funny

    Errr, yeah, what about that porn link? That's really... that's awful. I can't believe that they would have that there. Man, porn. Anyway, I've just got to go and do... a thing. Nothing interesting, don't you worry about it, just... Go about your business.

  9. Re:And I thought.. by jhoegl · · Score: 3, Funny

    Quite, the noise resonates off the tubes causing packet loss and errors!

  10. Re:Happens all the time, just not usually to Googl by Anonymous Coward · · Score: 2

    Since when do erotic nudes immediately equal "porn"? And clearly you haven't visited the site.

  11. BGP Attack! by Jeremiah+Cornelius · · Score: 4, Informative
    --
    "Flyin' in just a sweet place,
    Never been known to fail..."
    1. Re:BGP Attack! by Jeremiah+Cornelius · · Score: 2

      "BGP, I choose YOU!"

      --
      "Flyin' in just a sweet place,
      Never been known to fail..."
  12. Filtering by Todd+Knarr · · Score: 2

    I get the feeling that upstreams should start to not completely trust BGP announcements from peers. I know in my firewalls the configuration knows which networks ought to appear where, and the rules are set to block traffic when that network shouldn't be able to appear on that interface. Perhaps it's time to look into having an administrative communication of which ASes each peer ought to be handling, and having the BGP system at the upstream filter out or ignore announcements for ASes that that peer isn't supposed to be handling. The problem I see with that though is that it works well at the edges, but the closer to the core you get the larger the list of potentially valid ASes and I can see it getting unmanageable pretty quickly. But with the number of these incidents, I think we need to do something to change the assumption that you can unconditionally trust peers to only hand you valid routing data, because that assumption pretty clearly isn't true anymore.
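
The administrative list suggested above might look something like this in miniature. It is only a sketch: `PEER_ALLOWED_ORIGINS` and `accept_from_peer` are hypothetical names, AS 64496 and 64497 are documentation-range numbers, and 23947 and 15169 are the Moratel and Google AS numbers from the summary. Nothing here describes how PCCW or anyone else actually configured their sessions.

```python
# Hypothetical agreement: which origin ASes each peer may announce to us.
PEER_ALLOWED_ORIGINS = {
    64496: {64496, 64497},  # a peer plus one downstream it handles
    23947: {23947},         # this peer may only originate its own AS
}

def accept_from_peer(peer_asn, as_path):
    """Accept a route only if its origin AS is on the peer's agreed list.

    as_path lists the nearest AS first and the origin AS last.
    """
    if not as_path:
        return False
    origin = as_path[-1]
    return origin in PEER_ALLOWED_ORIGINS.get(peer_asn, set())

print(accept_from_peer(23947, [23947]))         # its own routes
print(accept_from_peer(23947, [23947, 15169]))  # routes for AS 15169
```

As the comment says, a table like this is easy to keep at the edge; the set for each peer grows quickly as you move toward the core.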

    1. Re:Filtering by vlm · · Score: 3, Interesting

      I get the feeling that upstreams should start to not completely trust BGP announcements from peers.

      Start? This was BAU at respectable ISPs a decade ago. Guess what I was doing at that time: endless Fing around with filtering. Bureaucratic level varied a lot over time, but when I left that part of the biz it was crystallizing around something like the 800 number letter of agency process, where you need a company officer to fax a signed sheet verifying that's really your space and yes, we really do have permission to advertise it. At least at that time ARIN did not do Dun & Bradstreet numbers and there's no way to verify via whois and everyone's merging, so we needed that signed letter to protect us legally just as much as the internet needed it so we could protect the internet from them. At least as I recall.

      Basically if you are "Ford dealer of chicago" I have no legal idea if you're allowed to advertise ARIN's ford.com space, but if we have a LOA then at least if it all hits the legal fan we have a signed letter from a corporate officer at the dealership to get us off the hook (at least partially) when the real ford goes after us, or at least we can tell the "real ford" who to add to the lawsuit. Many a time I had to call the ARIN registered owners to verify an apparently unrelated minion should be advertising some of their space. Sometimes yes, sometimes no. It was always an entertaining conversation. Except for when the ARIN contact info was invalid. Then the swearing began.

      Most of the time, obviously, it's just a dude advertising additional space with identical ARIN contact info as the old space, so it doesn't come to this level of paperwork.

      I don't know if the situation has gotten better or worse since the mid 00s.

      but the closer to the core you get the larger the list of potentially valid ASes

      Ah but that's not where you need it. At least not for black hole events like this. If I'm properly filtering at the border, I don't need to filter in the middle; in fact it shouldn't ever be even theoretically necessary, and it's none of the core's business what business deal I've signed at the border anyway. Also, god help us, there were people trying to do what amounts to dynamic load balancing and disaster recovery using BGP, not necessarily a "stable" situation anyway. Route flap dampening is enough of a PITA.

      --
      "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
  13. Re:Google should know by Shrike+Valeo · · Score: 4, Funny

    As long as those looking to fix the problem don't start by Googling the problem..

  14. Re:Happens all the time, just not usually to Googl by Beardo+the+Bearded · · Score: 3, Funny

    Seriously, a porn link in your sig?

    Anyway... clearly Anonymous hasn't learned how to delete BGP filters and inject fake routes yet.

    The only reason you replied was to bookmark!

    --

    ---
    ECHELON is a government program to find words like bomb, jihad, plutonium, assassinate, and anarchy.
  15. Re:Happens all the time, just not usually to Googl by Jonah+Hex · · Score: 4, Informative

    We don't do Porn, we try to keep on the erotic art side of things, and thanks for drawing attention to it; lots of visitors from your mention! - HEX

  16. The Real Reason by guttentag · · Score: 2

    The Google logo got caught with its hand in the ballot box cookie jar! It's all over Google's front page!

  17. Re:Shift by 93+Escort+Wagon · · Score: 4, Funny

    It's okay, those of us who aren't network admins just need to type "Border Gateway Protocol" into Google and... oh crap!

    --
    #DeleteChrome
  18. Re:Happens all the time, just not usually to Googl by kasperd · · Score: 2

    Another networking issue that is probably never going to go away

    Oh, really? I thought Route Origin Authorisations were designed to address exactly this issue?
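
For anyone wondering what checking a route against Route Origin Authorisations involves, here is a minimal sketch of the origin-validation logic along the lines of RFC 6811: a route is "valid" if some covering ROA matches its origin AS and allows its prefix length, "invalid" if ROAs cover it but none match, and "not-found" otherwise. The `Roa` tuple, the `validate_origin` function and the sample ROA are hypothetical, and the prefix is from the documentation range. Note that this checks only the origin AS, not the rest of the path.

```python
import ipaddress
from typing import NamedTuple

class Roa(NamedTuple):
    prefix: str      # address block the holder authorised
    max_length: int  # longest prefix length allowed inside that block
    asn: int         # AS permitted to originate it

def validate_origin(prefix, origin_asn, roas):
    """Classify a route as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the route
        covered = True
        if route.prefixlen <= roa.max_length and origin_asn == roa.asn:
            return "valid"
    return "invalid" if covered else "not-found"

# Hypothetical ROA: only AS 15169 may originate 192.0.2.0/24.
roas = [Roa("192.0.2.0/24", 24, 15169)]

print(validate_origin("192.0.2.0/24", 15169, roas))
print(validate_origin("192.0.2.0/24", 23947, roas))
print(validate_origin("198.51.100.0/24", 23947, roas))
```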

    --

    Do you care about the security of your wireless mouse?
  19. Re:Will DNSSEC help with this? by kasperd · · Score: 2

    Can this system of Network addresses and border gateways also be protected by DNSSEC?

    No, but I think Route Origin Authorisations can help.

    --

    Do you care about the security of your wireless mouse?
  20. Re:Shift by DMUTPeregrine · · Score: 2

    Slashdot is targeted at the tech-oriented crowd. The set of all tech-oriented people is quite a bit larger than the set of network administrators. It's therefore a good idea to explain what BGP is so that the mathematicians, scientists, engineers, etc, can understand what the article is about. Even for many network administrators BGP will be a thing they learned about and then mostly forgot, since it's not used directly by smaller organizations, and larger organizations likely have some admins responsible only for internal systems.

    --
    Not a sentence!
  21. UCLA Cyclops by jroysdon · · Score: 2

    UCLA's Cyclops is a great tool to monitor your own IP space and make sure you know immediately when this sort of thing occurs.