The Risk of a Meltdown In the Cloud

Posted by timothy on Tuesday March 20, 2012 @02:54AM from the precipitating-danger dept.

zrbyte writes "A growing number of complexity theorists are beginning to recognize some potential problems with cloud computing. The growing consensus is that bizarre and unpredictable behavior often emerges in systems made up of 'networks of networks,' such as a business using the computational resources of a cloud provider. Bryan Ford at Yale University in New Haven says the full risks of the migration to the cloud have yet to be properly explored. He points out that complex systems can fail in many unexpected ways, and he outlines various simple scenarios in which a cloud could come unstuck."

It has to happen by Anonymous Coward · 2012-03-20 02:58 · Score: 5, Interesting

At some point, there is going to be a massive failure. Someone big is going to lose *all* of their data. I still don't trust virtualization despite it being years old. It's still nascent in the grand scheme.
Someone wake me when they invent the holodeck.
1. Re:It has to happen by Chrisq · 2012-03-20 03:00 · Score: 5, Funny
  
  At some point, there is going to be a massive failure. Someone big is going to lose *all* of their data.
  I just hope it's my mortgage company and not my bank.
2. Re:It has to happen by jedidiah · 2012-03-20 03:37 · Score: 5, Insightful
  
  If the cloud is not more robust than what your grandma could come up with on her own then what's the point really?
  Isn't the whole point of "the cloud" the fact that you aren't managing this stuff yourself? You don't have the burden? You don't need the expertise?
  If you push it back on the cloud consumer then a lot of it is really quite pointless.
  
  --
  A Pirate and a Puritan look the same on a balance sheet.
3. Re:It has to happen by jriding · 2012-03-20 04:22 · Score: 5, Insightful
  
  Just a thought. Forget actual failure. What happens when a company has IP data or licensed data hosted by a cloud provider, and a company-to-company lawsuit comes up? The court case starts. Could or would they seize all the computers / servers that could house the data? What would happen to the other people's data that resides on the same physical hardware?
  
  --
  love the taste, hate the texture
Why not stick to real risks? by msobkow · 2012-03-20 03:03 · Score: 5, Insightful

I don't understand the intent of the article other than to provide a knee-jerk chicken-little response to cloud processing and storage.
Not one of the items mentioned is unique to the cloud. It can happen to any data center with more than two nodes involved in a cluster.
But that's not surprising, because "the cloud" is just a distributed collection of cluster servers, the same as large multi-nationals have been running pretty much since their customer loads exceeded the ability of one server to span the global community.

--
I do not fail; I succeed at finding out what does not work.
1. Re:Why not stick to real risks? by Anonymous Coward · 2012-03-20 03:13 · Score: 5, Interesting
  
  Problem is that half the people buying cloud services think their system is immune to any new problem, and might not implement the same failback procedures they would have in a traditional setup.
  In many ways we are seeing a new generation of IT people and managers emerge onto the scene with far less understanding of the fragility of their systems than the previous generation.
2. Re:Why not stick to real risks? by NatasRevol · 2012-03-20 03:18 · Score: 5, Insightful
  
  Doubtful.
  More likely, bean counters don't listen to the IT people who say offline backups are important. Bean counters just hear extra expense, think everything in the cloud is secure, and deny redundancy/backups.
  
  --
  There are two types of people in the world: Those who crave closure
3. Re:Why not stick to real risks? by gweihir · 2012-03-20 03:20 · Score: 5, Interesting
  
  In the cloud, you have no idea who else is on the same hardware as you and what their usage patterns are. You cannot be careful yourself anymore; you have to trust your service provider. The past shows that unless you have huge contractual penalties in place, that is a losing game.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
4. Re:Why not stick to real risks? by Opportunist · 2012-03-20 03:35 · Score: 5, Funny
  
  This. A million times this!
  Our brass is also abuzz with "the cloud". Nobody has the foggiest (pun intended) idea what it is about, what it is or what the hell is going on, but it is the best thing since bread has been sliced. The cloud. It will help us save millions. How? I dunno, I don't care, but it does! We hear it everywhere, we see it in our manager magazines, and there it is kinda-sorta explained but I didn't understa... I mean, I didn't read it thoroughly, I don't have the time, my time is valuable, ya know? But it will help us cut costs in a big way, we gotta push towards the cloud!
  The risks? They are not aware of the risks. They don't even know they exist. Why would they exist, they didn't exist so far, right? Redundan..what? Backup... whazat? Now why the hell do we need that suddenly?
  
  --
  We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
Cloud services should complement, not replace by sandytaru · 2012-03-20 03:09 · Score: 5, Insightful

I think the safest bet is to have local copies in addition to copies in the cloud, even if all the processing and computing is actually done in the cloud. Companies should set things up to keep a local copy of critical data and services on a good old-fashioned tape drive or backup server. This is sort of a reverse of the cloud-based backup solutions, where processing and databases ran on local servers but had backups in the cloud in case of a local disaster. Same idea: have a local backup in case of a meltdown of the cloud. You may find your primary app is temporarily useless, but you at least have all your critical data (hopefully in a format that can be transferred).

--
Occasionally living proof of the Ballmer peak.
Re:Oscillator by dkf · 2012-03-20 03:20 · Score: 5, Insightful

The general point is that it's possible to get bad emergent behavior which is unexpected. This shouldn't be surprising to anyone (but it is, alas). We see it over and over in complex systems, and it's got pretty much nothing to do with what you implement the complex system with.
What to do about it? Well, the only real fix is to stop the drive for efficiency at all costs. All those little inefficiencies that hit your bottom line, they also mean that when things go wrong you can weather the storm more easily. And yes, that resilience means things are going to cost more. How much more? Well, depends how much risk you want to take out of the system and how much you're willing to pay. Your call. (A local backup removes a lot of risk from things like cloud providers going belly up unexpectedly, but it does mean you're stuck with actually having to pay real money to do the backup and make sure it is working.)

--
"Little does he know, but there is no 'I' in 'Idiot'!"
Re:Oscillator by IcyHando'Death · 2012-03-20 05:16 · Score: 5, Insightful
Pardon me mods, but +4 informative? This is a terrible summary from someone who doesn't seem to have understood what he's read. The novel "cloudy-thing" aspect of the article's argument is the very part the parent misses when he dismisses this as "nothing new".
The cloud is an abstraction that intentionally hides detail. Cloud providers do that to make the service being offered simple to package, sell and use. They also do what they can to keep the tricks of their trade secret from competitors. But their infrastructure is actually very complex relative to what the average small to medium client would need for themselves. This is important in three ways:
1. Your own engineers can't take all aspects of a deployment into account when making decisions.
2. As a moderately sized company, using the cloud will expose you to the risks of emergent behaviour that would simply not be an issue on the smaller scale you would operate on if you ran your own infrastructure.
3. Your system may be humming along smoothly one moment, then start thrashing disastrously the next in the absence of any action on your part and for no apparent reason, simply because your cloud provider has tweaked some seemingly innocuous parameter (even after extensive testing).
This is an important and novel issue and worthy of some real consideration.