Slashdot Mirror


Microsoft Azure Outage Across the Globe

hawkinspeter writes: The BBC reports that overnight an outage of Microsoft's Azure cloud computing platform took down many third-party sites that rely on it, in addition to disrupting Microsoft's own products. Office 365 and Xbox Live services were affected.

This happened at a particularly inopportune time, as Microsoft has recently been pushing its Azure services in an effort to catch up with other providers such as Amazon, IBM, and Google. Just a couple of hours previously, Microsoft had screened an Azure advert in the UK during the Scotland v. England soccer match.
(Most services are back online. As of this writing, Application Insights is still struggling, and Europe is having problems with hosted VMs.)

13 of 167 comments

  1. Yawn ... by gstoddart · · Score: 5, Insightful

    Cloud fail, like nobody saw that coming.

    If you don't own and operate your own infrastructure, you're at the mercy of someone else.

    And clearly that someone else can't guarantee you robustness with this magic cloud.

    All of these people who say "awesome, because, cloud" -- well, I have yet to be convinced that any of these vendors can provide as much uptime and reliability as a decent IT department.

    I suggest we start calling it Clown Computing -- you cram a lot of Clowns into a tiny little car, and hope it keeps going.

    When something goes wrong, hilarity ensues.

    --
    Lost at C:>. Found at C.
    1. Re:Yawn ... by i kan reed · · Score: 5, Interesting

      Yeah, but it's never really been about the reliability. It's always been the "not paying your own IT maintenance staff" thing that's the big draw.

    2. Re:Yawn ... by dontbemad · · Score: 4, Insightful

      Once again, missing the point. In my (small) shop, by using Azure (which has worked well for us), we avoid having to spend money on admins to maintain any in-house servers we might have. We can then put that money towards more developers (or better salaries for us current devs), as well as training, nicer dev machines, etc. At the same time, if we do have a problem with any hosted service on Azure, support is literally a phone call away, and I can't remember the last time a resolution didn't happen within a couple of hours.

      Sure, cloud computing has its shortcomings. But it has also allowed a host of small companies who simply can't afford to own their own infrastructure to do business.

    3. Re:Yawn ... by Crudely_Indecent · · Score: 4, Interesting

      There is something you can do about all of those conditions.

      With cloud, you just wait for the rain (outage). You can pray (call an outsourced tech support department) for it to stop raining (services restored), but until god (cloud provider) decides the rain is done (fixes the problem), you're getting wet (offline).

      That gives me a new "Cloud" tagline:

      Cloud - We will definitely rain on your parade.

      --
      "Lame" - Galaxar
    4. Re:Yawn ... by serviscope_minor · · Score: 4, Insightful

      Sure, but when you have outages and stability issues which impact your business, is it really a good trade off?

      Of course it is. Outsource to the cloud and cut the quarterly costs massively by laying off staff. Get a big bonus. Possibly share options go up due to better profits and blathering to the shareholders about the cloud. Sure, 3 years down the line it might tank for a few days and in one fell swoop wipe out all the savings and then some.

      Not my problem, I'll be long gone.

      So is it worth it? Hell yes!

      --
      SJW n. One who posts facts.
    5. Re:Yawn ... by Bengie · · Score: 4, Insightful

      There are many reasons to use the cloud.

      1) You're too small to afford enough full time IT
      2) You can't afford the capital investment into your own servers
      3) You need a low-latency global CDN-like service, but you can't afford dedicated servers running everywhere
      4) You only temporarily need to scale up your servers to handle burst load
      5) I'm sure there are other reasons.

    6. Re:Yawn ... by Trailer Trash · · Score: 5, Interesting

      Let me explain it from my point of view. I own and operate a one- or two-man software company that also hosts web sites. I work in the film & TV music industry, meaning I have a shit load of music (literally terabytes) that has to be available for download.

      8 years ago I owned a rack of servers downtown here that I managed myself. Honestly, it wasn't that bad. I bought reliable used 1U servers (mainly IBM and Dell) off eBay and stocked them with disks. I ran FreeBSD and Linux, used RAID, etc. But I always had two issues to deal with. The main one was "I have to always be available to handle hardware issues".

      My company isn't big enough to hire someone to do it, but I managed for nearly 10 years with no disasters. In that time I had a motherboard crap (when I was starting out with one server - ouch) and a few disks fail. In all of those times I had to go in - sometimes in the middle of the night - and fix/replace whatever was wrong.

      Then I found Amazon AWS. Here's the kicker - it was actually cheaper for me to simply "rent" storage from them than to rent rack space for my own servers. I moved my servers to linode.com - again it was cheaper although they're nowhere near as fast as my former dedicated servers were, but they're fast enough for my applications and I can always move to larger instances where needed. And that eliminated my maintenance issues for hardware while costing less per month and maintaining the same 3-4 nines level of availability that I've always had. Oh, one other thing - S3 makes it just as easy to secure my audio files but the delivery speed can easily saturate any pipe that the files are being delivered to.
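
      To put a number on "3-4 nines": a minimal sketch, assuming a 365-day year, that converts an availability level into the downtime it allows per year. The function name is made up for illustration and is not from the thread.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours_per_year(nines: int) -> float:
    """Allowed downtime per year for `nines` nines of availability (3 -> 99.9%)."""
    unavailability = 10 ** -nines  # e.g. 3 nines -> 0.001
    return HOURS_PER_YEAR * unavailability


for n in (3, 4):
    hours = downtime_hours_per_year(n)
    print(f"{n} nines: about {hours:.2f} hours of downtime per year")
```

      So three nines leaves room for most of a working day of outage per year, and four nines for under an hour.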

      So the cloud might not be "magical" and solve all the world's problems, but for small IT shops it's great. Everything I do is on the internet so the whole "what if your connection goes down?" issue doesn't exist for me. I do not recommend such a solution for everybody. I have clients in the industrial wholesale space and their inventory & sales system definitely should be on-site with off-site backups. But their web site can be hosted elsewhere.

      Anyway, yes, the "cloud" is very useful for many businesses.

    7. Re:Yawn ... by nine-times · · Score: 4, Insightful

      Yes, the "cloud" servers sometimes have outages. So do managed hosting providers. So do internal servers. And frankly, although every business thinks that what they're doing is super-important and they can't afford even the briefest outage, the fact is that most businesses can.

      If Azure or AWS go down for an hour, it makes news and everyone freaks out because a lot of people are using them. If your business's server goes down for an hour, it does not make news, and people don't freak out. But for the business experiencing that 1 hour of downtime, what difference does it make whether they own the hardware or it's in "the cloud"?

    8. Re:Yawn ... by Jaime2 · · Score: 4, Insightful

      The calculations are simple when you assume the cloud will fail and your infrastructure will not. A real tradeoff calculation has to include estimates of the reliability of both scenarios. The answer to "Is it really a good trade off?" will be entirely based on estimates and opinions. I'm not saying you're wrong, I'm just saying that the math does not spit out "no-brainer".

      Some cloud providers will even give you SLAs with real money behind them. So, they could conceivably come up with a no-brainer deal where the cloud provider guarantees your $80,000 every day, whether it's from having your business up and running or writing you a check.
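
      A rough sketch of that tradeoff calculation, assuming you can put an estimate on outage days for each option. Only the $80,000 per day comes from this thread; every other figure, and the `Option` and `expected_annual_cost` names, are hypothetical placeholders to be replaced with your own estimates.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    fixed_cost_per_year: float              # staff, hardware, or subscription fees
    expected_outage_days: float             # an estimate, not a measurement
    sla_payout_per_outage_day: float = 0.0  # what the provider pays back, if anything


def expected_annual_cost(opt: Option, revenue_per_day: float) -> float:
    """Fixed cost plus the revenue expected to be lost to outages, net of SLA payouts."""
    lost_per_day = max(revenue_per_day - opt.sla_payout_per_outage_day, 0.0)
    return opt.fixed_cost_per_year + opt.expected_outage_days * lost_per_day


REVENUE_PER_DAY = 80_000  # the figure used in this thread

# Hypothetical estimates, for illustration only.
options = [
    Option("in-house", fixed_cost_per_year=300_000, expected_outage_days=1.0),
    Option("cloud", fixed_cost_per_year=120_000, expected_outage_days=2.0),
    Option("cloud with full SLA", fixed_cost_per_year=150_000,
           expected_outage_days=2.0, sla_payout_per_outage_day=80_000),
]

for opt in options:
    print(f"{opt.name}: ${expected_annual_cost(opt, REVENUE_PER_DAY):,.0f} per year")
```

      Change the outage estimates and the ranking changes with them, which is the point: the answer depends on the estimates, not on the arithmetic.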

    9. Re:Yawn ... by uncqual · · Score: 4, Insightful

      However, in a widespread outage like this, I'll bet the big cloud providers have a better record of rapid recovery than their customers had in-house. By necessity, MS, Amazon et al. have very competent engineers who know the product well and can be pulled off whatever they are doing (including sleeping) to jump into any really serious problem. There simply are not enough such engineers to go around all the mid-sized IT organizations in the world, nor interesting enough work to keep them interested and sharp at most of these IT organizations (to say nothing of the cost of keeping such engineers around).

      For a car analogy... When your high end car has a nagging problem that your local mechanic can't figure out, the dealer often can figure it out quickly, possibly with the help of a factory specialist who deals with (say) ECUs on only this make all day, every day. Rarely can an independent mechanic specialize enough to come close to the factory specialists in diagnosis. Now, if your car just has a dead battery, your local mechanic may give you faster, better, and cheaper service than the dealer.

      --
      Why is there an "insightful" mod and why isn't it "-1"? If I wanted insight, I wouldn't be reading /.
  2. Re:Out of band patch.. by afidel · · Score: 5, Informative

    I installed it last night on all domain controllers after testing it in my isolated testing network. It's not really optional since it allows any domain user to become domain admin and the only resolution to that is a domain rebuild or authoritative restore. It's also already been seen in attacks in the wild so you can assume the next client to get driveby malware will be going for domain admin.

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  3. Wow, I'd be pretty angry by ErichTheRed · · Score: 4, Interesting

    Everyone forgets that Azure is a way-beyond-massive Hyper-V implementation, and that AWS is a way-beyond-massive Xen-like-thing implementation. Even though both cloud providers let you be smart in designing your infrastructure (multi-site, redundancy, etc.; the tools are there), nothing will save you from an outage of the core guts of the system. Wasn't Azure's last failure due to a certificate expiration? There's no way an end customer can plan around that.

    I'm a big fan of the private or hybrid cloud version of this fad. You get all the good stuff that Azure and AWS customers get like dynamic provisioning and software defined networking, without having to rely on a third party. Unfortunately, CIOs and other execs just see the numbers on a spreadsheet and don't take the costs of outages that you can't control into account. Power fails, networks drop, and people do stupid things in on-site implementations also. But you can at least have your staff working on it with the incentive being "you get to keep your job." With a public cloud provider or even a hoster, the responsibility ends with "oops, here's 7 hours of free service" and you have to wait in line with everyone else.

  4. Re: Out of band patch.. by Eosi · · Score: 4, Insightful

    Interesting... What about all the OpenSSL or SSH issues that happened this year, which in many cases were default as part of Linux servers???
    Regardless of OS, poor testing of third-party apps / services or poor security as part of your deployment can cause you to be violated. I have seen many Linux servers still using Telnet or VNC for management, and allowing root to log in directly to them....
    Secure your environment regardless of what you run......