Lightning Strikes Amazon's Cloud (Really)

Posted by ScuttleMonkey on Friday June 12, 2009 @03:25PM from the better-than-ball-lightning-in-the-cage dept.

The Register has details on a recent EC2 outage that is being blamed on a lightning strike that zapped a power distribution unit of the data center. The interruption only lasted around 6 hours, but the irony should last much longer. "While Amazon was correcting the problem, it told customers they had the option of launching new server instances to replace those that went down. But customers were also able to wait for their original instances to come back up after power was restored to the hardware in question."

14 of 109 comments

Irony? by Anonymous Coward · 2009-06-12 15:31 · Score: 5, Insightful

Isn't cloud computing supposed to tackle such instances?
1. Re:Irony? by log0n · 2009-06-12 23:42 · Score: 3, Insightful
  
  The irony is that a cloud was struck by lightning. Lightning usually comes from clouds.
  Sometimes we all need to tone back the nerd a bit :)
What irony? by MrMista_B · 2009-06-12 15:44 · Score: 3, Insightful

What irony?
Maybe I'm just tired, but I'm not sure what irony is being referred to by the poster.
1. Re:What irony? by Anonymous Coward · 2009-06-12 16:13 · Score: 2, Insightful
  
  That a computing technology that was supposed to be largely immune to damage of individual "nodes" in the cloud could be taken down by lightning hitting a single point?
2. Re:What irony? by quanticle · 2009-06-12 16:44 · Score: 5, Insightful
  
  Perhaps they were referring to the irony of Amazon's EC2 being affected by one of the very natural disasters it advertises protection against.
  It's rather like an "unsinkable" vessel going down on her maiden voyage.
  
  --
  We all know what to do, but we don't know how to get re-elected once we have done it
Inconceivable! by binaryspiral · 2009-06-12 15:47 · Score: 5, Insightful

While everyone is talking up the cloud and how resilient it is... this is yet another reminder never to put all your eggs in one basket. If your service is so damn important that it can't go down, have it hosted in two places.
Notice, Amazon.com didn't go down... :)
Re:Lightning once striked our office building. by xrayspx · 2009-06-12 16:18 · Score: 5, Insightful

I'm thinking critically because Amazon, EMC, VMWare, etc bill The Cloud as a mystical place where you throw your shit and then it's universally available 100%. Nothing bad happens in The Cloud.

So what's the deal with having all copies of these VMs in one datacenter? That's not very The Cloud of them. Maybe they should replicate all of EC2 to GFS. Would The Cloud win then?

Customers being given the option of redeploying their VMs or waiting an unspecified period of time until The Cloud is back online isn't The Cloud we were promised.

/cloud

--
I like music
Re:Lightning once striked our office building. by Logic+and+Reason · 2009-06-12 16:58 · Score: 3, Insightful

"I'm thinking critically because Amazon, EMC, VMWare, etc bill The Cloud as a mystical place where you throw your shit and then it's universally available 100%. Nothing bad happens in The Cloud."
No, they don't. You're either being disingenuous, or idiotic.

"So what's the deal with having all copies of these VMs in one datacenter? That's not very The Cloud of them."
So you expect Amazon to somehow be running the same VM simultaneously on multiple machines? The point of EC2 is that you have machine images prepared in advance, which you can launch at any time to instantiate a new, ready-to-go VM. The VMs themselves are obviously still running on actual machines, which are (surprise!) still vulnerable to things like lightning strikes and other random hardware failures.

If a few minutes' downtime when something like that happens is unacceptable, then you should be running multiple machines in different availability zones, which is exactly what you'd be doing in a more traditional environment. EC2 just makes it easier to do this in a flexible way. Yes, you pay for that privilege, but it's clearly worth it to some people.
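
To make the "multiple machines in different availability zones" point concrete, here is a minimal sketch in plain Python. It does not call any Amazon API; the zone names and the helpers place_instances and surviving_instances are made up for illustration. It only shows that spreading instances round-robin across zones leaves some capacity running when one zone goes dark.

```python
from collections import Counter
from itertools import cycle


def place_instances(count, zones):
    """Assign `count` instances to zones round-robin (hypothetical helper)."""
    if not zones:
        raise ValueError("at least one zone is required")
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(count)]


def surviving_instances(placement, failed_zone):
    """Count the instances that are not in the failed zone."""
    return sum(1 for zone in placement if zone != failed_zone)


# Hypothetical zone names, not real identifiers from the article.
zones = ["zone-a", "zone-b"]

spread = place_instances(4, zones)
single = place_instances(4, ["zone-a"])

print(Counter(spread))                         # two instances per zone
print(surviving_instances(spread, "zone-a"))   # 2
print(surviving_instances(single, "zone-a"))   # 0
```

With everything in one zone, a single strike takes out all four instances; spread over two zones, half keep running. Whether the application copes with losing half its machines is a separate problem that the developer still has to solve.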
Re:Do any of you know how they survived? by RsG · 2009-06-12 17:36 · Score: 5, Insightful

I'm reading between the lines here (it doesn't actually say this in TFA), but it sounds like this was a direct hit. Not an outage, which is a different beast.
A UPS is about as useful in this instance as antibiotics against a virus - it's a solution to a different problem. Surge protectors don't help much either, not unless the strike was a fairly mild and/or remote one. You could switch over to a disconnected UPS system every time there's a thunderstorm on the horizon, but that seems needlessly complicated and expensive.
That being said, the GP referred to an outage, so you've quite correctly answered his question; it's just the wrong question to ask in this instance. And of course I could be misreading (or Amazon could be misrepresenting) the exact nature of the failure - if it were a regular outage, none of the above would apply.

--
Erotic is when you use a feather. Exotic is when you use the whole chicken.
Re:Who covers the cost? by TheLink · 2009-06-12 17:40 · Score: 2, Insightful

What if they insured with AIG?

Who covers the cost then? :)
--
- Too many replies beneath your current threshold
Re:Lightning once striked our office building. by Anonymous Coward · 2009-06-12 17:58 · Score: 1, Insightful

Endless arguing. Did or didn't Amazon say that using the cloud you "protect your application from failure of a single location"? And did or didn't this happen? Answering the two questions in the right order will explain what the OP meant even to you.
Re:Lightning once striked our office building. by lena_10326 · 2009-06-12 19:32 · Score: 2, Insightful

"The irony here is that 6 hours in a year is 99.93% so they've already blown it for the year."
A region consists of multiple datacenters. 99.93% would be for 1 datacenter, not the region.
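
For anyone checking the 99.93% figure, the arithmetic is short enough to sketch in Python. The function names are made up for illustration, the year is taken as 365 days, and the 99.95% target in the second call is an illustrative number rather than one quoted in this thread.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability_percent(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - downtime_hours / period_hours)


def downtime_budget_hours(target_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed in the period at a given availability target."""
    return period_hours * (1.0 - target_percent / 100.0)


# Six hours down in one year, as in the outage described in the story.
print(round(availability_percent(6), 2))        # 99.93

# Illustrative target only (assumption, not from the thread).
print(round(downtime_budget_hours(99.95), 2))   # 4.38
```

This is a per-location figure: it says nothing about a region unless every datacenter in it was down for the same six hours.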

--
Camping on quad since 1996.
Re:Lightning once striked our office building. by dhall · 2009-06-12 20:00 · Score: 3, Insightful

"Amazon EC2 provides developers the tools to build failure resilient applications and isolate themselves from failure scenarios."
Let's highlight the words that need emphasis.
"provides", "developers", "tools"
Whether the developers actually use them isn't automatic.
"you can protect your applications from failure of a single location"
"can"
Highly available does not mean fault tolerant. Fault tolerance allows an application to continue functioning after a component failure. Regardless of the snake oil that has been thrown around, there is no silver bullet that can automagically make an application multi-node aware with no chance of deadlock or data corruption. You need to program for this. Again, tools are provided, but that doesn't mean everyone will use them. So in the absence of a fault-tolerant application, the cloud provides high availability.
Re:Do any of you know how they survived? by drinkypoo · 2009-06-13 00:37 · Score: 2, Insightful

"You could switch over to a disconnected UPS system every time there's a thunderstorm on the horizon, but that seems needlessly complicated and expensive."
Actually, that's NOT a bad idea at all. If you used fiber to the rack and you had big ugly relays that would open the connections, it might be a useful strategy in lightning country. It shouldn't be too hard to detect when lightning is striking nearby, and open the contacts. You would definitely need to do it per-rack at minimum though, because having a battery in every system is an ecological nightmare.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"