Slashdot Mirror


The Risk of a Meltdown In the Cloud

zrbyte writes "A growing number of complexity theorists are beginning to recognize some potential problems with cloud computing. The growing consensus is that bizarre and unpredictable behavior often emerges in systems made up of 'networks of networks,' such as a business using the computational resources of a cloud provider. Bryan Ford at Yale University in New Haven says the full risks of the migration to the cloud have yet to be properly explored. He points out that complex systems can fail in many unexpected ways, and he outlines various simple scenarios in which a cloud could come unstuck."

23 of 154 comments

  1. It has to happen by Anonymous Coward · · Score: 5, Interesting

    At some point, there is going to be a massive failure. Someone big is going to lose *all* of their data. I still don't trust virtualization despite it being years old. It's still nascent in the grand scheme.

    Someone wake me when they invent the holodeck.

    1. Re:It has to happen by Chrisq · · Score: 5, Funny

      At some point, there is going to be a massive failure. Someone big is going to lose *all* of their data.

      I just hope it's my mortgage company and not my bank.

    2. Re:It has to happen by Anonymous Coward · · Score: 4, Insightful

      If someone loses "all" their data in the cloud, their problem has nothing to do with the cloud. If you lose all your data, it's because you kept all your data in one place, with no backups in a different place, and all fault lands on you, not the cloud, not your cloud provider, and not on any given piece of technology. There have already been large failures, and some companies have already lost massive amounts of data, but it doesn't change anything, because these problems have nothing to do with whether you host your own servers or rent them from somebody else, which is really all "the cloud" boils down to.

      Also, inherently not trusting virtualization as a concept in 2012 is moronic and baseless. It's a technology, just like any other. It can be implemented well or it can be implemented poorly, but as a concept it is not novel or revolutionary to any degree that it should engender trust or distrust, at least not any more than the hardware underneath it.

    3. Re:It has to happen by jedidiah · · Score: 5, Insightful

      If the cloud is not more robust than what your grandma could come up with on her own then what's the point really?

      Isn't the whole point of "the cloud" the fact that you aren't managing this stuff yourself? You don't have the burden? You don't need the expertise?

      If you push it back on the cloud consumer then a lot of it is really quite pointless.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    4. Re:It has to happen by lightknight · · Score: 4, Insightful

      "If the cloud is not more robust than what your grandma could come up with on her own then what's the point really?"

      Money.

      --
      I am John Hurt.
    5. Re:It has to happen by datavirtue · · Score: 4, Insightful

      I keep remembering the Google applications that keep disappearing. Data is one thing, application providers going out of business or discontinuing a service is another issue entirely. Hopefully the competition still stands and you can migrate your applications over to their service.

      --
      I object to power without constructive purpose. --Spock
    6. Re:It has to happen by Anonymous Coward · · Score: 4, Funny

      But, but... you don't understand. Your money's not here. It's in Joe's house, and Jimmie's house and... it really is a wonderful life.

    7. Re:It has to happen by jriding · · Score: 5, Insightful

      Just a thought. Forget actual failure. What happens when IP data or licensed data is being hosted by a cloud provider and a company-to-company lawsuit comes up? Court case starts. Could or would they seize all computers / servers that could house the data? What would happen to the other people's data that resides on the same physical hardware?

      --
      love the taste, hate the texture
    8. Re:It has to happen by DarkOx · · Score: 4, Informative

      I am sorry, but we have been virtualizing things by one name or another going back to 1960s mainframes. In other words, almost as long as commercial computing has existed.

      The cloud is a different matter. The issue is not with virtualization but with creating dependencies on and between parties who don't really talk to each other.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    9. Re:It has to happen by CAIMLAS · · Score: 4, Interesting

      While I agree with you, a little perspective.

      I've seen systems, verified backups, and duplicate backups simply fall over. Every best practice was followed, backups were taken regularly, and the backups had been verified by Industry Leading Backup Software Everyone Still Uses (arg). But the system and data on the backups could not be restored.

      It eventually got fixed through some virtualized gyrations, but it took the better part of a week for the company's top engineer to figure it out.

      Shit happens.
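
      The gap described above, between a backup that "verifies" and one that actually restores, is the kind of thing a periodic restore drill is meant to catch. A minimal sketch, assuming a plain tar.gz archive of a directory tree; the helper names make_backup and restore_matches_source are made up for illustration and have nothing to do with whatever backup product was involved here:

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def make_backup(source_dir, archive):
    """Hypothetical backup step: pack source_dir into a gzip-compressed tarball."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source_dir, arcname=".")


def restore_matches_source(archive, source_dir):
    """Restore the archive into a scratch directory and compare it to the source.

    Only file contents are compared; permissions, ownership and empty
    directories are ignored in this sketch.
    """
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            tar.extractall(scratch)  # only do this with archives you created yourself
        return tree_digests(scratch) == tree_digests(source_dir)
```

      A drill like this only proves the archive can be unpacked and that the bytes match. It says nothing about whether the restored system boots or the application starts, which is where the week of gyrations tends to go.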

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    10. Re:It has to happen by postbigbang · · Score: 4, Insightful

      Yes, a lot of people lost data on mainframes, and the problem was generally behind the console. Online production systems and job-based systems need to be designed with practicality and component failure in mind.

      The cloud is a system, and the system needs redundancy, checks and reality checks, and quality ins-and-outs. That's right, just like what you've been doing all along. Same security, same backups, same contingencies-- as there are no shortcuts, just cheaper hardware.

      I have to dismiss the cited article as it's entirely ephemeral, with not one single citation to back it up. Financial markets, while important, are a somewhat unique context to cite; they're built differently than many cloud components. There is nothing tangible cited, no case history, just some Yale-y's bad, after-lunch tummy growling coinciding with thinking about the complexity of modern infrastructure.

      This whackamole approach serves no purpose, except to sell more inside, rather than outside, infrastructure.

      --
      ---- Teach Peace. It's Cheaper Than War.
  2. Why not stick to real risks? by msobkow · · Score: 5, Insightful

    I don't understand the intent of the article other than to provide a knee-jerk chicken-little response to cloud processing and storage.

    Not one of the items mentioned is unique to the cloud. It can happen to any data center with more than two nodes involved in a cluster.

    But that's not surprising, because "the cloud" is just a distributed collection of cluster servers, the same as large multi-nationals have been running pretty much since their customer loads exceeded the ability of one server to span the global community.

    --
    I do not fail; I succeed at finding out what does not work.
    1. Re:Why not stick to real risks? by Anonymous Coward · · Score: 5, Interesting

      Problem is that half the people buying cloud services think their system is immune to any new problem, and might not implement the same fallback procedures they would have in a traditional setup.

      We are in many ways seeing a new generation of IT people and managers emerge onto the scene with far less understanding of the fragility of their systems than the previous generation.

    2. Re:Why not stick to real risks? by NatasRevol · · Score: 5, Insightful

      Doubtful.

      More likely, bean counters don't listen to the IT people who say offline backups are important. Beancounters just hear extra expense, think everything in the cloud is secure, and deny redundancy/backups.

      --
      There are two types of people in the world: Those who crave closure
    3. Re:Why not stick to real risks? by gweihir · · Score: 5, Interesting

      In the cloud, you have no idea who else is on the same hardware as you and what their usage patterns are. You cannot be careful yourself anymore, you have to trust your service-provider. The past shows that unless you have huge contractual penalties in place, that is a losing game.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    4. Re:Why not stick to real risks? by Opportunist · · Score: 5, Funny

      This. A million times this!

      Our brass is also abuzz with "the cloud". Nobody has the foggiest (pun intended) idea what it is about, what it is or what the hell is going on, but it is the best thing since bread has been sliced. The cloud. It will help us save millions. How? I dunno, I don't care, but it does! We hear it everywhere, we see it in our manager magazines, and there it is kinda-sorta explained but I didn't understa... I mean, I didn't read it thoroughly, I don't have the time, my time is valuable, ya know? But it will help us cut costs in a big way, we gotta push towards the cloud!

      The risks? They are not aware of the risks. They don't even know they exist. Why would they exist, they didn't exist so far, right? Redundan..what? Backup... whazat? Now why the hell do we need that suddenly?

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    5. Re:Why not stick to real risks? by Dan667 · · Score: 4, Insightful

      "the cloud" is just dumb terminals all over again. After businesses make a bunch of money on the cloud they will then start selling local solutions again to try and mitigate problems with being on the cloud.

  3. Oscillator by vlm · · Score: 4, Informative

    The TLDR version of the article is that load balancers can oscillate.
    It's spun into a cloudy-thing because that's trendy, but the basic argument is nothing new.
    Perhaps there's more "meat" in the original paper?

    One common thread is that nothing is ever really "new" in computer science / IT. Clouds are just a rehash of ye olde mainframe outsourcing from decades ago. I worked at a place that was doing that in the early to mid 90s.
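
    For anyone who hasn't watched a balancer oscillate, here is a toy model of the effect. It is a sketch under loud assumptions, not the scenario from the article or the paper: two servers, one balancer, and a made-up `gain` parameter that says how hard the balancer corrects the imbalance it saw on the previous step.

```python
def simulate(gain, steps, start_share=0.6):
    """Return server A's traffic share at each step of a toy two-server balancer.

    Hypothetical model: on every step the balancer measures how far A's share
    is from an even split and moves `gain` times that imbalance to the other
    server. Nothing here is taken from a real load balancer.
    """
    share_a = start_share
    history = []
    for _ in range(steps):
        history.append(share_a)
        imbalance = share_a - 0.5              # positive when A is the busier server
        share_a -= gain * imbalance            # gain > 1 overshoots the even split
        share_a = min(1.0, max(0.0, share_a))  # a share has to stay within [0, 1]
    return history


if __name__ == "__main__":
    for gain in (0.5, 2.0):
        shares = simulate(gain, steps=8)
        print(f"gain={gain}:", " ".join(f"{s:.2f}" for s in shares))
```

    With a gentle gain (0.5) the split settles toward 50/50. With a gain of 2.0 the balancer overcorrects by exactly the imbalance it saw, so traffic flips between the two servers forever and never settles.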

    --
    "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    1. Re:Oscillator by dkf · · Score: 5, Insightful

      The general point is that it's possible to get bad emergent behavior which is unexpected. This shouldn't be surprising to anyone (but it is, alas). We see it over and over in complex systems, and it's got pretty much nothing to do with what you implement the complex system with.

      What to do about it? Well, the only real fix is to stop the drive for efficiency at all costs. All those little inefficiencies that hit your bottom line, they also mean that when things go wrong you can weather the storm more easily. And yes, that resilience means things are going to cost more. How much more? Well, depends how much risk you want to take out of the system and how much you're willing to pay. Your call. (A local backup removes a lot of risk from things like cloud providers going belly up unexpectedly, but it does mean you're stuck with actually having to pay real money to do the backup and make sure it is working.)

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    2. Re:Oscillator by IcyHando'Death · · Score: 5, Insightful

      Pardon me mods, but +4 informative? This is a terrible summary from someone who doesn't seem to have understood what he's read. The novel "cloudy-thing" aspect of the article's argument is the very part the parent misses when he dismisses this as "nothing new".

      The cloud is an abstraction that intentionally hides detail. Cloud providers do that to make the service being offered simple to package, sell and use. They also do what they can to keep the tricks of their trade secret from competitors. But their infrastructure is actually very complex relative to what the average small to medium client would need for themselves. This is important in three ways:

      1. Your own engineers can't take all aspects of a deployment into account when making decisions.
      2. As a moderately sized company, using the cloud will expose you to the risks of emergent behaviour that would simply not be an issue on the smaller scale you would operate on if you ran your own infrastructure.
      3. Your system may be humming along smoothly one moment, then start thrashing disastrously the next in the absence of any action on your part and for no apparent reason, simply because your cloud provider has tweaked some seemingly innocuous parameter (even after extensive testing).

      This is an important and novel issue and worthy of some real consideration.

  4. Cloud services should complement, not replace by sandytaru · · Score: 5, Insightful

    I think the safest bet is to have local copies in addition to copies in the cloud, even if all the processing and computing is actually done in the cloud. Companies should set stuff up to keep a local copy of critical services on a good old fashioned tape drive or backup server. This is sort of a reverse of the cloud based backup solutions, where local processing and databases took place on local servers, but had backups in the cloud in case of a local disaster. Same idea: Have a local backup in case of the meltdown of the cloud. You may find your primary app is temporarily useless, but you at least have all your critical data (hopefully in a format that can be transferred.)

    --
    Occasionally living proof of the Ballmer peak.
  5. Serious problem, but not a surprise by gweihir · · Score: 4, Interesting

    Complex systems almost always exhibit surprising behavior. Cloud computing is no exception, and it is new in addition. This leads to a high level of risk of such events emerging without warning. Of course, people with a stake in the business side will never admit the risk. For examples of this happening in other fields, look at TEPCO, BP, RSA, ... All safe and risk-free. Until things blow up.

    Put simply: "The Cloud - where other people's servers can crash yours."

    Also appropriate:
        "A distributed system is one in which I cannot get something done because a machine I've never heard of is down." --Leslie Lamport
    This holds even more for the cloud.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  6. The cloud by javascriptjunkie · · Score: 4, Informative

    I've been working in the cloud since July. The company I work for really likes the idea of it. But I'll tell you something. As a programmer and systems administrator responsible for something that lives in the cloud, I'm just not seeing the value of it. At least the way it's implemented at Rackspace. We've had problems that are absolutely bizarre, that seemingly have no explanation, that take weeks to resolve, that don't originate on our side. We've had issues with data integrity that don't happen on regular servers, and while we're able to "scale," we're very limited in the ways we're allowed to do it. Maybe this kind of setup works for other companies and groups, but I can't see myself choosing a cloud provider over traditional colocation and the standard three-tier server model for 99% of what I need to do.