How Google Routes Around Outages
1sockchuck writes "Making changes to Google's search infrastructure is akin to 'changing the tires on a car while you're going at 60 down the freeway,' according to Urs Holzle, who oversees the company's massive data center operations. In a Q-and-A with Data Center Knowledge, Holzle discusses Google's infrastructure, how it has engineered its system to route around hardware failures, and how it responds when something goes awry. These updates usually go unnoticed, but during system maintenance last month a software bug triggered an outage for Gmail."
I thought about it for approximately 30 seconds. Then I realized that it is a bad analogy. A Google car would have hundreds of redundant wheels, so changing one is easy.
Basically, all this means is Google designs like Mack while everyone else designs like Chrysler...
Isn't this how the *internet* is (at least in theory) supposed to work anyhow? Instead we have 90% of the cables that carry Middle East/Europe traffic running through the same canal. And I know of VERY few ISPs who actually make their systems redundant anymore. /sadface
I would wager you would learn all this if you were hired onto Google's site reliability team. Most of the info you're curious about is probably something they're not willing to talk about in great detail for competitive reasons.