Slashdot Mirror


Microsoft Azure Outage Across the Globe

hawkinspeter writes: The BBC reports that overnight an outage of Microsoft's Azure cloud computing platform took down many third-party sites that rely on it, in addition to disrupting Microsoft's own products. Office 365 and Xbox Live services were affected.

This happened at a particularly inopportune time, as Microsoft has recently been pushing its Azure services in an effort to catch up with other providers such as Amazon, IBM, and Google. Just a couple of hours previously, Microsoft had screened an Azure advert in the UK during the Scotland v. England soccer match.
(Most services are back online. As of this writing, Application Insights is still struggling, and Europe is having problems with hosted VMs.)

7 of 167 comments

  1. Re:Yawn ... by i kan reed · · Score: 5, Interesting

    Yeah, but it's never really been about the reliability. It's always been the "not paying your own IT maintenance staff" thing that's the big draw.

  2. You Still Need Geographic Diversity by digsbo · · Score: 3, Interesting

    Just like the Amazon AWS failure that took down Netflix, architecting your cloud infrastructure for geographic diversity can significantly reduce the likelihood of these kinds of outages.
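
    As a rough illustration of the point above (an editorial sketch, not part of the original comment), the snippet below estimates combined availability when a service runs in several regions. The function name and the 99.9% per-region figure are assumptions chosen for illustration, and the formula only holds if regions fail independently. A platform-wide failure that takes down every region at once breaks that assumption.

```python
def combined_availability(per_region: float, regions: int) -> float:
    """Probability that at least one region is up.

    Assumes (hypothetically) that regions fail independently of each other.
    """
    if not 0.0 <= per_region <= 1.0:
        raise ValueError("per_region must be between 0 and 1")
    if regions < 1:
        raise ValueError("regions must be at least 1")
    # All regions must be down at once for the service to be down.
    return 1.0 - (1.0 - per_region) ** regions


if __name__ == "__main__":
    # 0.999 is an illustrative per-region availability, not a vendor figure.
    for n in (1, 2, 3):
        print(f"{n} region(s): {combined_availability(0.999, n):.9f}")
```

    Under these assumptions, adding a second region moves the estimate from three nines to roughly six, which is why correlated failures matter so much in practice.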

  3. Wow, I'd be pretty angry by ErichTheRed · · Score: 4, Interesting

    Everyone forgets that Azure is a way-beyond-massive Hyper-V implementation, and that AWS is a way-beyond-massive Xen-like-thing implementation. Even though both cloud providers let you be smart in designing your infrastructure (multi-site, redundancy, etc.; the tools are there), nothing will save you from an outage of the core guts of the system. Wasn't Azure's last failure due to a certificate expiration? There's no way an end customer can plan around that.

    I'm a big fan of the private or hybrid cloud version of this fad. You get all the good stuff that Azure and AWS customers get like dynamic provisioning and software defined networking, without having to rely on a third party. Unfortunately, CIOs and other execs just see the numbers on a spreadsheet and don't take the costs of outages that you can't control into account. Power fails, networks drop, and people do stupid things in on-site implementations also. But you can at least have your staff working on it with the incentive being "you get to keep your job." With a public cloud provider or even a hoster, the responsibility ends with "oops, here's 7 hours of free service" and you have to wait in line with everyone else.

  4. Re:Yawn ... by Crudely_Indecent · · Score: 4, Interesting

    There is something you can do about all of those conditions.

    With cloud, you just wait for the rain (outage). You can pray (call an outsourced tech support department) for it to stop raining (services restored), but until god (cloud provider) decides the rain is done (fixes the problem), you're getting wet (offline).

    That gives me a new "Cloud" tagline:

    Cloud - We will definitely rain on your parade.

    --
    "Lame" - Galaxar

  5. Does MS offer any guarantees? by Anonymous Coward · · Score: 2, Interesting

    I mean, what happens now? If I use Azure in my business, and because of this outage I have lost x dollars in business transactions that I could not carry out, is MS going to compensate me in any way? Or is Azure one of those services that comes without any guarantees?

  6. Re:Yawn ... by Trailer Trash · · Score: 5, Interesting

    Let me explain it from my point of view. I own and operate a one- or two-man software company that also hosts web sites. I work in the film & TV music industry, meaning I have a shit load of music (literally terabytes) that has to be available for download.

    8 years ago I owned a rack of servers downtown here that I managed myself. Honestly, it wasn't that bad. I bought reliable used 1U servers (mainly IBM and Dell) off ebay and stocked them with disks. I ran FreeBSD and Linux, used RAID, etc. But I always had two issues to deal with. The main one was "I have to always be available to handle hardware issues".

    My company isn't big enough to hire someone to do it, but I managed for nearly 10 years with no disasters. In that time I had a motherboard crap out (when I was starting out with one server - ouch) and a few disks fail. In all of those cases I had to go in - sometimes in the middle of the night - and fix/replace whatever was wrong.

    Then I found Amazon AWS. Here's the kicker - it was actually cheaper for me to simply "rent" storage from them than to rent rack space for my own servers. I moved my servers to linode.com - again it was cheaper although they're nowhere near as fast as my former dedicated servers were, but they're fast enough for my applications and I can always move to larger instances where needed. And that eliminated my maintenance issues for hardware while costing less per month and maintaining the same 3-4 nines level of availability that I've always had. Oh, one other thing - S3 makes it just as easy to secure my audio files but the delivery speed can easily saturate any pipe that the files are being delivered to.

    So the cloud might not be "magical" and solve all the world's problems, but for small IT shops it's great. Everything I do is on the internet so the whole "what if your connection goes down?" issue doesn't exist for me. I do not recommend such a solution for everybody. I have clients in the industrial wholesale space and their inventory & sales system definitely should be on-site with off-site backups. But their web site can be hosted elsewhere.

    Anyway, yes, the "cloud" is very useful for many businesses.
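
    For reference (an editorial sketch, not part of the original comment), the "3-4 nines" availability mentioned above can be turned into a yearly downtime budget. The function name is hypothetical, and the calculation assumes a 365-day year and that "N nines" means an unavailability of 10^-N.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def yearly_downtime_hours(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines.

    Assumes "N nines" means availability = 1 - 10**-N (3 nines = 99.9%).
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** -nines
    return HOURS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4):
        hours = yearly_downtime_hours(n)
        print(f"{n} nines: {hours:.3f} hours/year ({hours * 60:.1f} minutes)")
```

    Under these assumptions, three nines allows about 8.76 hours of downtime a year and four nines about 53 minutes, so a single multi-hour outage can use up most of a three-nines budget.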

  7. Re:Yawn ... by Dutch Gun · · Score: 3, Interesting

    I don't think anyone is disputing that hosted online services are both useful and, in some cases, absolutely essential, especially for smaller businesses. Well, maybe some people are, but they're pretty much Luddites, so we can ignore them. It's just that in the rush to push everything to the cloud since that's seen as some sort of panacea, people tend to forget that there are serious consequences to outages, and the more you push services to the cloud, the greater the impact of those outages will be. It's essentially putting all your technological eggs in one basket.

    As much as people complain about proprietary file formats, those really don't hold a candle to proprietary services as far as vendor lock-in. If the service you chose, for instance, starts to go south on a regular basis, and you've built your entire ecosystem inside a specific vendor's cloud, you could be in a world of hurt.

    That being said, my feeling is that these sorts of system-wide outages are simply part of these services' growing pains. Even now, keep in mind that these sorts of large-scale failures are rare enough that they make international headlines. In another five to ten years, they're going to be rarer still. Otherwise, fewer large players will trust them for critical infrastructure over the long haul. For smallish businesses, even with occasional outages, it's still probably a net win.

    --
    Irony: Agile development has too much inertia to be abandoned now.