Slashdot Mirror


How Google Broke Itself and Fixed Itself, Automatically

lemur3 writes "On January 24th Google had problems with several of its services. Gmail and various other Google services were affected just as the Google Reliability Team was about to take part in an Ask Me Anything on Reddit. Everything seemed to be resolved and back up within an hour. The Official Google Blog has a short note about what happened from Ben Treynor, a VP of Engineering. According to the blog post, it appears the outage was caused by a bug in a system that generates configurations, which led it to send a bad one to various 'live services.' An internal monitoring system noticed the problem a short time later and caused a new configuration to be pushed out to the services. Ben had this to say about it on the Google Blog: 'Engineers were still debugging 12 minutes later when the same system, having automatically cleared the original error, generated a new correct configuration at 11:14 a.m. and began sending it; errors subsided rapidly starting at this time. By 11:30 a.m. the correct configuration was live everywhere and almost all users' service was restored.'"
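
For readers who want the general idea in code, here is a minimal Python sketch of a "push, check, fall back to last known good" loop. It is not Google's system, which the blog post does not describe in any detail; the `ConfigPusher` class, the `is_healthy` callback, and the `backends` key are all hypothetical names made up for illustration.

```python
from typing import Callable, Optional


class ConfigPusher:
    """Toy model: push a config live, keep it only if a health check passes."""

    def __init__(self, is_healthy: Callable[[dict], bool]) -> None:
        # is_healthy stands in for a monitoring system (hypothetical).
        self.is_healthy = is_healthy
        self.live: Optional[dict] = None
        self.last_known_good: Optional[dict] = None

    def push(self, candidate: dict) -> dict:
        """Make candidate live; revert to the last good config if it is unhealthy."""
        self.live = candidate
        if self.is_healthy(candidate):
            # Healthy: remember it as the new fallback.
            self.last_known_good = candidate
        elif self.last_known_good is not None:
            # Unhealthy: roll back to the most recent config that passed.
            self.live = self.last_known_good
        else:
            # Nothing to fall back to, so surface the failure.
            self.live = None
            raise RuntimeError("bad configuration and no last known good one")
        return self.live


# Usage: a config counts as healthy here only if it lists at least one backend.
pusher = ConfigPusher(lambda cfg: bool(cfg.get("backends")))
pusher.push({"backends": ["a", "b"]})   # healthy, becomes last known good
current = pusher.push({"backends": []})  # unhealthy, rolled back
```

In this sketch the bad configuration is briefly live before the check runs, which loosely mirrors the summary's sequence of "bad config sent, problem noticed, correct config sent". A real system would more likely validate before pushing and roll out gradually.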

7 of 125 comments

  1. Re:Well congratulations by Anonymous Coward · · Score: 5, Funny

    On recovering by using the "last known good" configuration. What wizardry!

    I expect we'll be seeing the Google patent application on that shortly </sarcasm>

    Give Google a little credit (but not too much please). If they were Apple they'd have already patented it.

  2. Reminds me of something... by stjobe · · Score: 5, Funny

    "The Google Funding Bill is passed. The system goes on-line August 4th, 2014. Human decisions are removed from configuration management. Google begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug."

    --
    "Total destruction the only solution" - Bob Marley
    1. Re:Reminds me of something... by Immerman · · Score: 5, Funny

      Google perceives this as an attack by humanity, and routes all search queries to goat.se in self defense.

      --
      --- Most topics have many sides worth arguing, allow me to take one opposite you.
  3. Re:Well congratulations by 93 Escort Wagon · · Score: 1, Funny

    Give Google a little credit (but not too much please). If they were Apple they'd have already patented it.

    Whereas Google would just look for a small company holding a relevant patent, then buy it.

    --
    #DeleteChrome
  4. Re:Well congratulations by Anonymous Coward · · Score: 5, Funny

    Yeah that totally must be it. Me, the guys who write configuration management tools who'll tell you how hard it is (and sell you consultancy to try to make it slightly less hard) and the guys who write monitoring tools who'll tell you how hard it is (and sell you consultancy to try to make it slightly less hard). All those guys from companies like Facebook and Google who give talks at conferences about how difficult it is. We all suck at it and don't know what we're talking about. If only we'd listened to Slashdot, all our troubles would be but a dream.

  5. Re:Well congratulations by phantomfive · · Score: 3, Funny

    "Our system is high-availability, it can return 404s all day for decades without going down"

    --
    "First they came for the slanderers and i said nothing."
  6. Re:Well congratulations by Anonymous Coward · · Score: 3, Funny

    Careful. Only the advice of Anonymous Cowards is trustworthy. All the other people on Slashdot are not to be trusted. After all, they are not even able to find out how to post anonymously! ;-)