Researcher: Interdependencies Could Lead To Cloud 'Meltdowns'
alphadogg writes "As the use of cloud computing becomes more and more mainstream, serious operational 'meltdowns' could arise as end-users and vendors mix, match and bundle services for various means, a researcher argues in a new paper set for discussion next week at the USENIX HotCloud '12 conference in Boston. 'As diverse, independently developed cloud services share ever more fluidly and aggressively multiplexed hardware resource pools, unpredictable interactions between load-balancing and other reactive mechanisms could lead to dynamic instabilities or "meltdowns,"' Yale University researcher and assistant computer science professor Bryan Ford wrote in the paper. Ford compared this scenario to the intertwining, complex relationships and structures that helped contribute to the global financial crisis."
If you have a critical service, have it at more than one host... That way when AWS has a bad hair day, you are still up.
Or, have your entire business totally dependent one someone else. (Sounds kinda scary that way, don't it?)
The analogy the author uses doesn't work.
A better analogy would be the airline industry. The airline industry likes to over-book airplane seats it may not have because it's always trying to optimize its profit-margin.
The same will happen with cloud-services. Cloud-services will always try to optimize their own profit-margins, at the risk of triggering significant outages.
And I don't see what this has to do with the financial crisis at all.
we live in an age where information is distributed, even if statistical. (hell I made a fake Facebook account and somehow they found my mom, and she is no where close to me) a meltdown of information can't happen unless there is a world wide melt down of power. we have backups, but also ways of statistically restoring those backups.
Redundancy helps but it is not bullet-proof. A good chunk of it is the "topology" in which this redundancy is engaged in events of failure.(e.g. we had cascading blackouts in the past even if the energy network had enough total power to serve all consumers)
Have a look on cascading failures.
Questions raise, answers kill. Raise questions to stay alive.
I think it is funny that lessons learned years ago with mainframes are being presented as new by just changing the word mainframe to cloud.
Unmanaged systems are hard to manage.
It's a leap year, February 28, and all over the world, completely out of the blue (or azure if you prefer) cloud clusters crash as the local clocks swing around to midnight, then stay down all day. :)
Still, it's three nines of uptime when it's spread out over a few years
A highly interdependant system is only as reliable as the QC on the weakest link. Who would have thought that somebody from a company that had a lot of embarrassing press about a leap year stuffup would make such a stupid and obvious mistake four years later? That's the cloud, where even the biggest names still don't care anywhere near as much as you would about your own systems and so don't pay enough attention to detail.