Slashdot Mirror


Microsoft Azure's Southern US Data Center Goes Down For Hours, Impacting Office365 and Active Directory Customers (geekwire.com)

New submitter courcoul alerted us to an Azure outage that is affecting customers in many parts of the world: Some Microsoft Azure customers with workloads running in its South Central US data center are having big problems coming back from the holiday weekend Tuesday, after shutdown procedures were initiated following a spike in temperature inside one of its facilities. Around 2:30 a.m. Pacific Time, Microsoft identified problems with the cooling systems in one part of its Texas data center complex, which caused a spike in temperature and forced it to shut down equipment in order to prevent a more catastrophic failure, according to the Azure status page. These issues have also caused cascading effects for some Microsoft Office 365 users as well as those who rely on Microsoft Active Directory to log into their accounts. The cooling system is the most critical part of a modern data center, given the intense heat produced by thousands of servers cranking away in an enclosed area. More resources: The official status page of Azure; and third-party web tracking tool DownDetector's assessment. Further reading: Microsoft Azure suffers outage after cooling issue.

16 of 86 comments

  1. This is why by Anonymous Coward · · Score: 5, Insightful

    I do not like software that requires you to phone home to the mothership. The second something goes wrong outside of your control it borks all your work. Office 365 is a bad joke if I have ever seen one.

    Aside: Yes I know video games do this a lot but games are games and work is work.

  2. Comment by WallyL · · Score: 4, Interesting

    My employer was affected. Many employees could not authenticate to our third-party webapps because we use whatever the cloud-provided Active Directory SSO solution is. Ah, well. I wonder if this violated SLAs and we get some money back... My company is always concerned about not violating our SLAs to our customers (SaaS), so hopefully we extract the same pound of flesh from our vendors.

    1. Re:Comment by Darinbob · · Score: 2

      Being dependent upon "the cloud" is not a good thing, and yet so many companies are throwing out their brains and signing up in the hope to reduce costs. The company that recently purchased my previous employer is in whole hog for Microsoft, Microsoft 360, Microsoft cloud, and anything with the word Microsoft attached, most of it all online only. To read some corporate announcements I have to log into a third party site which just seems absurd to me. When the cloud servers eventually get their inevitable downtime, I predict a lot of hand wringing.

      I haven't seen this level of slavish devotion to a single vendor since the IBM administration.

    2. Re:Comment by hawguy · · Score: 3, Interesting

      For most small to mid-sized businesses, "the cloud" is more reliable than any solution they'd be willing to pay for. I don't know Microsoft's redundancy model, but AWS's multi-AZ model gives much more redundancy than most businesses would build themselves -- even more so for multi-region redundancy since most companies aren't going to spend the money to duplicate their production environment in another region on the other side of the country (or world).

      Though the side effect of using a cloud provider is that when a major cloud provider goes down, so do a *lot* of businesses -- but that doesn't mean they would have been better off building their own datacenter.

  3. Cloud - I don't think that word... by RelaxedTension · · Score: 4, Insightful

    I don't think that word means what they think it means. They need to rethink their distributed model if one data center takes down customers. Isn't the pitch for those services that they're basically bulletproof for businesses?

  4. Same old same old by Tough Love · · Score: 2
    --
    When all you have is a hammer, every problem starts to look like a thumb.
  5. Irony by gregarican · · Score: 3, Funny

    The most ironic part is that the Azure Support Twitter account keeps pointing customers to the Azure status page. Which also happens to be down with 503 errors. Guess they could e-mail for support, unless they are using Office 365. Or request help via the Management Portal, but guess that's down too. lol.

  6. Re:4.5 hours downtime! by mu51c10rd · · Score: 2

    On the positive side, at least you can blame the outage on Microsoft and not take the heat yourself for Exchange crashing and being down for 4.5 hours.

  7. Minor correction by sjames · · Score: 5, Funny

    It is now "Office 364"

    1. Re:Minor correction by Anne Thwacks · · Score: 2

      Not so much "cloud" as "smog".

      --
      Sent from my ASR33 using ASCII
  8. Texas? by Anne Thwacks · · Score: 3, Funny

    You have something that needs to be cool, and you put it in Texas?

    --
    Sent from my ASR33 using ASCII
  9. Re:And this... by sjames · · Score: 2

    Or the software you need to view them or the services you need to keep machines on your LAN running.

  10. Infrastructure management by klubar · · Score: 2

    Isn't this what backup generators and N+1 infrastructure are for? I can understand Joe's hosting and bait shop emporium going down, but power and HVAC are pretty well solved sciences. The weather in Texas is hot -- this is not a surprise. There are lightning storms in Texas, this is also not a surprise.

    It seems like if you are positioning a data center in Texas (which there are some reasons for), you prepare for both heat and lightning. I could understand if there was an incredibly unusual weather event (asteroid landing on data center, or death rays from the moon) but hot is not unusual in Texas.

    However, when major cloud service providers go down, it does provide an excuse for everyone else who manages a data center -- even the biggest cloud provider occasionally has an outage, so when our data center has an issue it's no worse. We say thank you Microsoft/AWS/Joe's!

    1. Re:Infrastructure management by Anonymous Coward · · Score: 2, Interesting

      I maintain HVAC for cell sites. EVERY ONE I've worked on had TWO independent HVAC systems.

      They toggle back and forth to equalize wear and tear, but when one fails the other system takes control and sends me an email asking for attention.

      There is no reason in the world for a data center NOT to have multiple HVAC systems in place. The equipment is pocket change compared to the electronics it protects.

      It could be M$ should put more thought in the design of their data centers than was put into Win95 or Vista.

      Just saying....
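
      The toggle-and-failover behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the unit labels "A" and "B", and the alerts list (standing in for the email) are all hypothetical and not taken from any real HVAC controller.

```python
class LeadLagController:
    """Hypothetical two-unit lead/lag controller (illustrative only)."""

    def __init__(self, units=("A", "B")):
        self.units = list(units)              # exactly two independent systems
        self.healthy = {u: True for u in units}
        self.lead = 0                         # index of the unit currently cooling
        self.alerts = []                      # stand-in for the email to the technician

    def active(self):
        """Name of the unit currently in control."""
        return self.units[self.lead]

    def rotate(self):
        """Scheduled swap to equalize wear; skipped if the other unit is down."""
        other = 1 - self.lead
        if self.healthy[self.units[other]]:
            self.lead = other

    def report_failure(self, unit):
        """Mark a unit failed, raise an alert, and fail over if it was in control."""
        self.healthy[unit] = False
        self.alerts.append(f"{unit} failed, needs attention")
        other = 1 - self.lead
        if self.units[self.lead] == unit and self.healthy[self.units[other]]:
            self.lead = other


ctl = LeadLagController()
ctl.rotate()               # routine swap: B takes over from A
ctl.report_failure("B")    # B dies: A takes control and an alert is queued
print(ctl.active(), ctl.alerts)
```

      The sketch deliberately stops at two units; if both fail, nothing is left to take control, which is the case the alert is meant to prevent.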

    2. Re:Infrastructure management by sjames · · Score: 2

      N+1 is common. So you take the total cooling need and divide by (for example) 4, then install 5 systems of that size so you can lose one and be at full capacity. That's necessary anyway since you may need to shut one down for routine maintenance from time to time.

      Ideally you don't let everything be interdependent so if you lose 2, you can still get by with shutting down 1/4 of the hardware.
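
      The arithmetic in that example can be written out as a small sketch. The function names and the 1000 kW load figure are assumptions made up for illustration; only the N = 4, install 5 numbers come from the comment.

```python
def n_plus_one(total_load_kw, n):
    """Size N+1 cooling: N units together cover the full load, plus one spare."""
    return {"unit_kw": total_load_kw / n, "units_installed": n + 1}


def supported_fraction(n, failed):
    """Fraction of the full load that can still be cooled after `failed`
    of the N+1 installed units are lost (capped at 1.0)."""
    working = max(n + 1 - failed, 0)
    return min(working / n, 1.0)


# Hypothetical 1000 kW cooling need, split as in the comment's N = 4 example.
plan = n_plus_one(1000, 4)
print(plan)                        # five units of 250 kW each
print(supported_fraction(4, 1))    # lose one unit: still full capacity
print(supported_fraction(4, 2))    # lose two: 3/4 capacity, so shed 1/4 of the hardware
```

      This assumes the units are independent and identically sized, which is the "don't let everything be interdependent" condition.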

  11. There is no cloud.... by klubar · · Score: 2

    There is no cloud, just other people's computers.

    Back in my day, we called this time sharing. Now you kids get off my lawn.