Confirmed Gmail / Google App Outage
mbone writes "Earlier today there was a confirmed Google outage which got a lot of attention from network operators. From a post to NANOG after everything calmed down: 'Google ack'd a maintenance on their core network did not go as planned-Forced traffic to one peer link that was unable to handle all the traffic. Maintenance has been rolled back. Issue has been restored.' This is exactly what makes me nervous about cloud computing and data storage. It's bad enough when I screw up a config and it takes down my mail, but what about when it happens to the entire globe at once?" Several readers also point to CNET's coverage of the outage.
Update: 05/14 19:25 GMT by T : CWmike adds this: "Steven J. Vaughan-Nichols writes that what may be happening is a massive DDoS attack. Based on the size of the attack that would be needed to interfere with Google, I believe that it's quite likely to be the result of an attack from the controllers of the Windows worm, Conficker. Another theory that has been put about — that the problem was due to AT&T NOC routing problems — does not appear to hold water, writes Steven."
Update: 05/14 21:01 GMT by T : Google's put up a low-detail explanation on their blog that says "An error in one of our systems caused us to direct some of our web traffic through Asia, which created a traffic jam. As a result, about 14% of our users experienced slow services or even interruptions."
I love all the fucktards who keep saying: "oop, the cloud's down I'll go for a stroll" or "welp, google's down, I'll go home." Where in the hell do you work? Your phone is going to be lit up like Times Square with all the user calls/complaints for hours. And just up and leaving offers zero customer service to users who rightfully don't know what is actually wrong.
Shit's down, and whether it is yours or not, you are seen as responsible and at least have to offer some communication and support. The problem is you look bad because you can't give anyone an actual ETA or a valid explanation beyond a shrug of the shoulders and a "hopefully CompanyX gets it fixed soon."
Cloud computing can be a great thing, but this shows that there are still fundamental flaws. I have run systems that could tolerate zero downtime and achieved it. Yes, it requires redundancy. Yes, it is expensive. Yes, it requires geographically separate, ISP-independent sites. Yes, it requires planning. But it can be done, so stop all the bullshit praise that because it is Google and they are big, this is OK. It isn't. If anything, they should NEVER have this kind of issue.
The Google-colored glasses need to be taken off.
http://teasphere.wordpress.com - A little spot of tea