Slashdot Mirror


How Google Routes Around Outages

1sockchuck writes "Making changes to Google's search infrastructure is akin to 'changing the tires on a car while you're going at 60 down the freeway,' according to Urs Holzle, who oversees the company's massive data center operations. In a Q-and-A with Data Center Knowledge, Holzle discusses Google's infrastructure, how it has engineered its system to route around hardware failures, and how it responds when something goes awry. These updates usually go unnoticed, but during system maintenance last month a software bug triggered an outage for Gmail."

5 of 105 comments

  1. Re:Just me? by Yetihehe · · Score: 4, Interesting

    A car is a bad analogy; building an airplane in mid-air is better.

    --
    Extreme Programming - Redundant Array of Inexpensive Developers
  2. Article doesn't really say anything. by girlintraining · · Score: 5, Interesting

    You know, the article read like a press release. Hasn't slashdot whored itself out enough lately on these kinds of things? Google is so ultra-reliable, blah blah, 24x7, blah blah, commitment, blah blah, premier service partner, blah blah... I get that kind of talk enough in staff meetings. Where's the meat already!?

    Why not write an article with some nice graphics saying what happens to my request from the time I hit "Search" to the time I click a result. List off all the servers it goes through, their roles, how they're monitored, etc. Give examples of failure and show the mode decisions the software makes (and where this software is running) -- show the latencies and other performance impacts as my request bounces over failure after failure. That's what I expect when I pull up an article entitled "How Google Routes Around Outages". Something useful, professionally enriching, intellectually stimulating, etc. In short, tell me why I (should) never see a "500 Internal Server Error" from Google, but I do from just about every other major website I've used.

    --
    #fuckbeta #iamslashdot #dicemustdie
    1. Re:Article doesn't really say anything. by Red+Flayer · · Score: 2, Interesting

      You know, the article read like a press release. Hasn't slashdot whored itself out enough lately on these kinds of things?

      YMBNH.

      This has been happening for as long as I've been lurking on Slashdot (2000?), and it didn't go away once I set up an account (2002? maybe 2003). And from the YMBNH posts I saw when I began lurking, this has apparently been an issue since the beginning (or shortly thereafter).

      At any rate, complaining about it won't do much good. There's a saying that might help you to repeat:

      Give me the strength to change the things I can, the humility to accept the things I can't, and the wisdom to know the difference.

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
  3. Changing tires by Spazmania · · Score: 2, Interesting

    akin to 'changing the tires on a car while you're going at 60 down the freeway,'

    This is not so hard. Just design the car with 4 axles instead of 2 and lift one off the road at a time. Helps if it can swivel for easy access to the lugnuts.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  4. Re:Just me? by zonky · · Score: 2, Interesting

    Trying to lower costs of competing. Also, it could be argued that mousse meant that off-line mistakes were not 'punished'.