Researcher: Interdependencies Could Lead To Cloud 'Meltdowns'

Posted by Soulskill on Saturday June 9, 2012 @03:16PM from the we-can-only-hope dept.

alphadogg writes "As the use of cloud computing becomes more and more mainstream, serious operational 'meltdowns' could arise as end-users and vendors mix, match and bundle services for various means, a researcher argues in a new paper set for discussion next week at the USENIX HotCloud '12 conference in Boston. 'As diverse, independently developed cloud services share ever more fluidly and aggressively multiplexed hardware resource pools, unpredictable interactions between load-balancing and other reactive mechanisms could lead to dynamic instabilities or "meltdowns,"' Yale University researcher and assistant computer science professor Bryan Ford wrote in the paper. Ford compared this scenario to the intertwining, complex relationships and structures that helped contribute to the global financial crisis."

This is why you cloud your cloud... by houstonbofh · 2012-06-09 15:27 · Score: 4, Insightful

If you have a critical service, have it at more than one host... That way when AWS has a bad hair day, you are still up.

Or, have your entire business totally dependent on someone else. (Sounds kinda scary that way, don't it?)
1. Re:This is why you cloud your cloud... by girlintraining · 2012-06-09 15:43 · Score: 5, Funny
  
  If you have a critical service, have it at more than one host... That way when AWS has a bad hair day, you are still up.
  While we're at it, we should probably backup the internet too. You'd think someone would have done it by now, in case it crashes, but I can't find any record of anyone doing it.
  
  --
  #fuckbeta #iamslashdot #dicemustdie
2. Re:This is why you cloud your cloud... by c0lo · 2012-06-09 15:50 · Score: 4, Funny
  
  You'd think someone would have done it by now, in case it crashes, but I can't find any record of anyone doing it.
  Heh... the real thing crashed long ago; you are now using the backup.
  
  --
  Questions raise, answers kill. Raise questions to stay alive.
3. Re:This is why you cloud your cloud... by flonker · 2012-06-09 15:58 · Score: 4, Informative
  
  http://archive.org/
4. Re:This is why you cloud your cloud... by martin-boundary · 2012-06-09 16:42 · Score: 4, Insightful
  
  There's a limited number of cloud hardware providers on the internet, and the rest are middle men. It's useless to diversify across the middle men; they will all be affected when the common underlying hardware provider has an issue. Thus there's a limit to the reliability that can be achieved, irrespective of how much mixing and matching is performed at the "business end".
  Diversification only "works" when the alternatives are provably independent. That's not true in a highly interconnected and interdependent world, which is TFA's point, I believe.
5. Re:This is why you cloud your cloud... by im_thatoneguy · 2012-06-09 16:52 · Score: 4, Informative
  
  That's one of the problems though that the researcher is flagging.
  1) If a company has one instance on AWS and one on Azure and AWS fails... Azure suddenly doubles in load (and also fails due to everybody piling on unexpectedly).
  the other being:
  2) Everybody uses Azure for SQL and AWS for hosting and Azure goes down... suddenly SQL dies and the AWS hosts all fail with the database down. Or the converse happens and AWS goes down and the SQL is useless without a head.
  The more services you rely on, the more likely it is that on any given day one of them will be down. If each service has 99% reliability and you depend on 20 of them (without any redundancy), then your failure rate could be up to 20%, since any one of the 1% failures could kill your service.
  It's interesting but it seems like most of the cloud failures have been due to #1 internally so far. One sector fails and in an effort to load balance it starts taking out its peers who then also overload and take out their peers.
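The "up to 20%" figure in the comment above is the worst-case bound for 20 services at 99% each. As a rough sketch of that arithmetic, assuming the services fail independently (the comment does not say so) and using illustrative function names:

```python
def serial_availability(per_service: float, n: int) -> float:
    """Availability of a chain of n services that must all be up.

    Assumption: failures are independent, so availabilities multiply.
    """
    return per_service ** n


def union_bound_failure(per_service: float, n: int) -> float:
    """Worst-case failure rate: every outage hits at a different time."""
    return min(1.0, n * (1.0 - per_service))


if __name__ == "__main__":
    availability = serial_availability(0.99, 20)
    print(f"independent failures: {1 - availability:.1%} chance of being down")
    print(f"worst-case bound:     {union_bound_failure(0.99, 20):.1%}")
```

Under the independence assumption the combined failure rate comes out a little under the 20% bound, because some outages overlap. Correlated failures, which are the researcher's concern, are not modeled here.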
The analogy the author uses doesn't work. by stephanruby · 2012-06-09 15:34 · Score: 4, Insightful

The analogy the author uses doesn't work.
A better analogy would be the airline industry. The airline industry likes to over-book airplane seats it may not have because it's always trying to optimize its profit-margin.
The same will happen with cloud-services. Cloud-services will always try to optimize their own profit-margins, at the risk of triggering significant outages.
And I don't see what this has to do with the financial crisis at all.
1. Re:The analogy the author uses doesn't work. by pitchpipe · 2012-06-09 16:02 · Score: 4, Insightful
  
  A better analogy would be the airline industry.
  I think a better analogy is the power grid. System hits a peak, one line goes down, others try to compensate becoming overloaded, another can't handle the load and goes down, and behold: cascading failures.
  
  --
  Look where all this talking got us, baby.
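The power-grid cascade described in the comment above can be sketched as a toy model. This is only an illustration, not anything from the paper: it assumes a fixed total load is split evenly among surviving nodes and that a node fails as soon as its share exceeds its capacity. The function name and the numbers are hypothetical.

```python
def cascade(capacities, total_load, first_failure):
    """Return the indices of nodes still alive once the cascade settles.

    Assumptions: load is shared evenly among survivors, and a node
    drops out the moment its share exceeds its capacity.
    """
    alive = set(range(len(capacities))) - {first_failure}
    while alive:
        share = total_load / len(alive)
        overloaded = {i for i in alive if capacities[i] < share}
        if not overloaded:
            break  # survivors can carry the load; cascade stops
        alive -= overloaded  # overloaded nodes trip, load shifts again
    return alive


if __name__ == "__main__":
    # Plenty of headroom: losing one node is absorbed by the other three.
    print(cascade([40, 40, 40, 40], total_load=100, first_failure=0))
    # Tight headroom: losing one node takes the rest down in turn.
    print(cascade([30, 30, 40, 50], total_load=100, first_failure=3))
```

In the second case the loss of one node pushes the two smallest survivors over capacity, and the last node then cannot carry the whole load alone.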
Low hanging fruit of a research piece by mcrbids · 2012-06-09 15:39 · Score: 4, Interesting

Efficiency normally comes with economies of scale. As a partner in an outsourced vertical software company, we have hundreds of clients running in our highly tuned hosting cluster, and are able to bring economies of scale to an otherwise ridiculously expensive software niche. Yes, that means that if we have an outage, all of our clients experience an outage as well.
However, we have carefully laid plans for multiple recovery points in a disaster scenario (Plan B, Plan C, Plan D, etc.) and have maintained an uptime significantly better than our clients would typically attain if left to their own devices. We easily manage close to 4 nines of uptime in an industry where the average is realistically around 2 nines. (Having "the computer is down" a day or two every year or so is typical.)
Although the Internet is a "network of ends", the truth is that not all ends are created equal. Having a high quality, high speed (100 Mb), reliable (99.99%+) Internet feed in my small-ish hometown of around 80,000 people is ridiculously expensive. But in a nearby city (500,000 people, 2 hours' drive away) we host our servers in a tier 1 colo at 1/10th the cost of running it all ourselves, with dramatically improved reliability and network performance.
Yes, putting all your eggs in one basket means that if that basket fails, you lose all your eggs. But it also makes it easy to buy just one, really nice basket that won't break and lose your eggs.

--
I have no problem with your religion until you decide it's reason to deprive others of the truth.
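For readers unfamiliar with the "nines" shorthand in the comment above, here is a minimal conversion to downtime per year. It assumes a 365-day year and that "N nines" means availability of 1 - 10^-N; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours_per_year(nines: int) -> float:
    """Expected yearly downtime for an availability of N nines."""
    unavailability = 10 ** (-nines)  # e.g. 2 nines -> 0.01
    return HOURS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4):
        hours = downtime_hours_per_year(n)
        print(f"{n} nines: about {hours:.2f} hours of downtime per year")
```

Two nines works out to a few days of downtime a year, the same order of magnitude as the "day or two" the comment mentions, while four nines is under an hour.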
just like mainframes by Dan667 · 2012-06-09 16:05 · Score: 4, Insightful

I think it is funny that lessons learned years ago with mainframes are being presented as new by just changing the word mainframe to cloud.