Google Takes Blame For Internet Disruption Across Japan (theregister.co.uk)
An anonymous reader shares a report: Google on Saturday accepted responsibility for the widespread internet disruptions Japan experienced the previous day. The search engine giant apologized for the trouble, saying it was caused by an errant network setting that was corrected within eight minutes of its discovery. Google did not say whether human error or a technical malfunction was to blame. The disrupted services used internet connections provided by NTT Communications Corp. and KDDI Corp., both of which said Friday that the issues were caused by a change in the flow of data traffic. From a report on The Register: The trouble began when Google 'leaked' a big route table to Verizon, the result of which was traffic from Japanese giants like NTT and KDDI was sent to Google on the expectation it would be treated as transit. Since Google doesn't provide transit services, as BGP Mon explains, that traffic either filled a link beyond its capacity, or hit an access control list, and disappeared. The outage in Japan only lasted a couple of hours, but was so severe that Japan Times reports the country's Internal Affairs and Communications ministries want carriers to report on what went wrong.
TFA explains how, and ends with: "There are various proposals to tweak BGP to stop this sort of thing happening, but as is so often the case, implementation is lagging far behind requirement." In short, it's a known problem, and eventually maybe someone will care enough to fix it.
Multihoming. I'd imagine Google provides transit service, but only for their own IP blocks. Each Google datacenter almost certainly has multiple Internet connections to the world. As a result, they have multiple netblocks provided by multiple ISPs.
If, through some unlucky DNS accident, a client on ISP A looks up google.com and gets an IP address provided by ISP B, it would take many more hops to reach that server via public Internet routes than by sending traffic to that datacenter's nearest router (on ISP A) and asking that router to forward traffic through the datacenter to the other set of IPs.
Hmm, actually we *knew* it was them the moment it started. Most non-joke internet network engineers refuse to fly blind, so there are probes and monitors everywhere in the Internet control plane (DFZ BGP routing). E.g., read: https://bgpmon.net/bgp-leak-causing-internet-outages-in-japan-and-beyond/
Also, most internet network engineers will place a lot of the blame on *Verizon*, not Google. Route leaks are a *fact of life*; they will happen at least once to everyone. You *MUST* filter the routing plane so that you do not accept crap from other autonomous systems you peer with, and Verizon *utterly failed* at doing it. Had they filtered, they'd have rejected the bogus routing from Google and avoided most, if not all, of the damage.
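For anyone wondering what "filtering" means in practice, here is a rough Python sketch of the idea, not anyone's actual router config. It assumes a hypothetical table of prefixes each peer is expected to announce (ALLOWED) and a hypothetical per-peer limit (MAX_PREFIXES); the AS number and prefixes are documentation values, not real ones. Real routers do this with prefix lists and max-prefix limits in their own config languages.

```python
import ipaddress

# Hypothetical data: prefixes each peer AS is expected to announce.
# AS 64496 and these prefixes are reserved documentation values.
ALLOWED = {
    64496: [
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("198.51.100.0/24"),
    ],
}
MAX_PREFIXES = 3  # hypothetical per-peer limit


def filter_announcements(peer_asn, announced,
                         allowed=ALLOWED, max_prefixes=MAX_PREFIXES):
    """Return (accepted, rejected) lists of networks for one peer.

    If the peer announces more prefixes than the limit, reject everything,
    on the theory that a sudden flood is more likely a leak than a real change.
    """
    nets = [ipaddress.ip_network(p) for p in announced]
    if len(nets) > max_prefixes:
        return [], nets

    accepted, rejected = [], []
    for net in nets:
        # Accept only prefixes that fall inside something the peer may announce.
        ok = any(net.version == a.version and net.subnet_of(a)
                 for a in allowed.get(peer_asn, []))
        (accepted if ok else rejected).append(net)
    return accepted, rejected


accepted, rejected = filter_announcements(
    64496, ["192.0.2.0/25", "203.0.113.0/24"])
print("accepted:", accepted)
print("rejected:", rejected)
```

The point is only that the receiving side decides what to believe: a prefix outside the expected set, or a sudden pile of prefixes, gets dropped instead of propagated.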
So, Google might have publicly taken the blame since it was their operational error that triggered the damage in the first place, but they are at most responsible for half of it... and *everyone in the field* knows it, Google included.
BGP routing has no authority mechanism. Anyone can publish any route. This is not the first time this has happened, nor will it be the last.
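One reason "anyone can publish any route" hurts so much is that routers forward on the longest matching prefix, so an unvalidated more-specific announcement pulls traffic away from the legitimate one. A toy Python sketch of that selection rule follows; the route table, the next-hop labels ("legitimate-origin", "leaked-path") and the prefixes are all made up for illustration, and real BGP best-path selection has many more steps than this.

```python
import ipaddress


def best_route(table, destination):
    """Pick the (network, next_hop) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)


# Hypothetical table with a single legitimate route (documentation prefix).
table = [
    (ipaddress.ip_network("203.0.113.0/24"), "legitimate-origin"),
]
before = best_route(table, "203.0.113.10")

# Someone announces a more-specific prefix and nothing checks whether they may.
table.append((ipaddress.ip_network("203.0.113.0/25"), "leaked-path"))
after = best_route(table, "203.0.113.10")

print("before leak:", before)
print("after leak: ", after)
```

Addresses outside the leaked /25 (say 203.0.113.200) still follow the original route, which is part of why a leak can look like a partial, patchy outage.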
"I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.