Slashdot Mirror


Microsoft Azure Failure: SSL Certificates Were Updated... Sort Of

judgecorp writes "Microsoft has published an explanation of the failure of Windows Azure earlier this month. Users of the Azure storage saw that an SSL certificate had expired. Microsoft's explanation says that the certificate had in fact been renewed, but an update with the new certificate details was not prioritized, and hadn't actually been implemented till after the old certificate expired. There are more interesting details, but Microsoft says better alerts and more automation will stop this particular fault happening again."

21 of 103 comments

  1. Re:When will they accept Windows 8 as a failure? by theRunicBard · · Score: 5, Insightful

    Look, I know nobody cares, but Microsoft Azure has nothing to do with Windows 8. I'm also not sure it's a failure. Microsoft tried something new after getting great positive reviews for Windows 7, which is the BEST time to try something risky. Worst case, people skip one generation of Windows, and stick with... Windows. Best case, you redefine the PC interface. It is innovative, no matter how poorly implemented. Besides, Microsoft has a history of creating a shitty first version and then fixing kinks as time goes by. Was anyone expecting a good first version of Metro? The slow adoption numbers can easily be credited to how good Windows 7 is. Why would you switch? It costs $0 for me to stay on 7, and > $0 to upgrade. We won't be seeing many Windows 8 devices for a while. The timely upgrades brought about by Windows Blue might even spur more adoption (too early to tell, I think). Windows Phones I won't attempt to defend since I know nothing of them.

  2. It won't happen again by Nerdfest · · Score: 3, Insightful

    Unless I'm horribly mistaken, they've let certificates expire before. Why would I think they won't let it happen again?

    1. Re:It won't happen again by phantomfive · · Score: 4, Interesting
      Yeah, and they also had the Sidekick outage with actual data loss. A lovely quote from that article:

      "I asked Microsoft for comment Saturday when I was writing this, in particular as to how the rest of its cloud might differ from the Danger set up. Microsoft said Sunday that its the fabric controller that manages the Azure service is built with redundancy in mind. "

      It may be built with redundancy in mind, but apparently it still has at least one single point of failure.

      --
      "First they came for the slanderers and i said nothing."
    2. Re:It won't happen again by 93+Escort+Wagon · · Score: 4, Informative

      Some of us remember when they forgot to renew hotmail.com. I'd say that might be worse...

      --
      #DeleteChrome
    3. Re:It won't happen again by Anonymous Coward · · Score: 5, Insightful

      I always back up my cloud data to a local harddrive, just to be safe.

    4. Re:It won't happen again by girlintraining · · Score: 2

      It may be built with redundancy in mind, but apparently it still has at least one single point of failure.

      Yeah. It's the same single point of failure present in every IT project: It's called The Manager, and it goes something like this:

      Engineer: "I sent you the e-mail!"
      Manager: "Oh? I never got it."
      Users: "Oh f---."

      --
      #fuckbeta #iamslashdot #dicemustdie
    5. Re:It won't happen again by phantomfive · · Score: 3, Interesting

      Maybe. It seems to me that if the engineers have let the manager become powerful enough to be a single point of failure, they've designed the system wrong.

      --
      "First they came for the slanderers and i said nothing."
    6. Re:It won't happen again by girlintraining · · Score: 3, Insightful

      Maybe. It seems to me that if the engineers have let the manager become powerful enough to be a single point of failure, they've designed the system wrong.

      You're fired. Anyone else have a problem with the manager?

      --
      #fuckbeta #iamslashdot #dicemustdie
    7. Re:It won't happen again by phantomfive · · Score: 4, Insightful

      Of course I don't tell THEM that, I just build around them and let them think they are useful. Sometimes we have meetings whose sole purpose is to affirm the usefulness of the manager.

      --
      "First they came for the slanderers and i said nothing."
    8. Re:It won't happen again by girlinatrainingbra · · Score: 3, Informative

      Nice! I like the fact that it was a Linux user who paid the renewal fee and got passport.com back up again, allowing further logins into Hotmail. Link to the credit card receipt of the individual user:

      The lapse, which was first reported on the Internet news service Slashdot.org, was apparently caused when Microsoft's registration for the Passport.com domain name expired sometime Dec. 24, Chaney said. The Passport.com site verifies user identification and passwords for access to Hotmail and about 25 other services, according to Chaney. Chaney said he paid the bill Dec. 25 at about 2 p.m. EST and was given invoice #11395965 documenting the transaction. An electronic copy of the receipt can be viewed at his Web site at "www.doublewide.net."

    9. Re:It won't happen again by Alioth · · Score: 2

      Certificates need an expiry for the same reason that passwords ought to have them. The probability that a certificate has fallen into unauthorized hands increases with the passage of time, so having certificates expire means you can limit the usefulness of a stolen certificate.

    10. Re:It won't happen again by wvmarle · · Score: 2

      Interesting you mention expiry dates on passwords as plenty of security people will argue that having expiry dates on passwords tends to decrease the security of passwords, as people select easier ones.

      Having a multi-year expiry date pretty much defeats the purpose: after falling into the wrong hands, the certificate is useful only until it's detected that it's in the wrong hands. And that's usually not very long after it starts being used.

      And a short expiry date (weeks, months), where it might actually have an effect on stopping unauthorised people from using it, is so inconvenient that no-one uses such short periods. In the case of passwords that have to change monthly, expect many users to have a password like "March03" for this month. Or "March!03". Dictionary word? Well, "hcraM!03" should circumvent that one, too. So much for security. And yes, first-hand experience on that part.

    11. Re:It won't happen again by Trailer+Trash · · Score: 2

      And I would have renewed their cert had I been able.

      Look, the bottom line is that they haven't learned anything in the past 13 years (wow, I feel old). The sloppiness that allowed a domain registration to lapse is the sloppiness that allows a cert to expire. This is a cultural issue that will likely never be overcome.

      To step into another industry, let's look at phone service. The "Phone Company" (AT&T back in the day, then the baby bells) had a culture of "this service has to work, period". I'm 45 today and there have now been 3 times in my life that I've picked the phone up and there wasn't a dial tone. In our parlance, they have good "up time".

      The cable company, on the other hand, has never had that culture. Their product isn't necessary in the way that a phone is and outages are fairly common. If they have to work on something your service may be unexpectedly down for a few minutes. Or hours. Whatever. My internet has more outages every month than my phone service has had over my lifetime.

      The point is that Microsoft's culture is more like the cable company. They are a software company and having to keep servers "up" hasn't been their deal until kind of recently. Companies like Amazon or Google, on the other hand, have had to have a phone-company-like culture from the beginning. They write software, yes, but their main product is a web site that has to be up come hell or high water. And, yes, I know that there have been a couple of high-profile outages, but those outages weren't caused by the kind of sloppiness that results in someone forgetting to renew a domain registration.

      So I use Amazon for my stuff but would switch to Google if there were problems at Amazon.

      But Microsoft? Are you kidding? I feel like I diapered them on Christmas Day of 1999, so I probably have less respect due to that.

    12. Re:It won't happen again by frinkster · · Score: 3, Interesting

      None of which can claim to be better than 99.999% uptime, since it's practically impossible to achieve.

      Having worked for half a decade on mobile communications infrastructure that regularly exceeds 99.999% uptime, I feel qualified to say that it is neither impossible nor super difficult. If it is a goal and you are willing to spend a lot of money, then you can accomplish it.

      But nobody is going to pay $X for 99.99999% uptime when 98% uptime is available for $X / 100 unless they are forced to. Look at all of the various highly-funded internet services that go down completely when a single Amazon data center has an outage. They aren't even willing to pay a little bit extra and do the extra work to make their services run on multiple data centers at a time. Clearly, it is not a requirement of the venture capital that they are getting.
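
      For a sense of scale, here is a rough sketch of what those percentages mean as a yearly downtime budget. It assumes a 365-day year and treats uptime as a simple fraction of wall-clock time; the function name is made up for illustration.

```python
# Downtime allowed per year for a given uptime percentage.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Minutes of downtime per year permitted at the given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


# Print the yearly budget for the figures mentioned in the thread.
for pct in (98.0, 99.999, 99.99999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% uptime allows {minutes:.4f} minutes of downtime per year")
```

      Under those assumptions, 98% leaves roughly 7.3 days a year, 99.999% leaves a bit over 5 minutes, and 99.99999% leaves about 3 seconds.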

  3. Typical scenario of ... by Skapare · · Score: 4, Insightful

    ... managers saying "we need to get this up and running sooner ... automating it reliably is hard to do ... just get it working and update things manually for now and we will automate it later". When later comes, everyone is working on something else.

    --
    now we need to go OSS in diesel cars
  4. Re:When will they accept Windows 8 as a failure? by phntm · · Score: 3, Informative

    the adoption rate among students who get windows 8 for free is nonexistent, at least by the anecdotal evidence in my faculty (computer science).
    even during exam season (when you suddenly get the urge to clean the room, re-check the fridge or format your laptop).
    you can piss on my face but don't tell me it's raining.

  5. Car analogy... by girlinatrainingbra · · Score: 2
    Read the MSDN blog for how screwed up this really was. Here's the car analogy: We have a "Secret Store" that tells "the team that owns the tires" that the tires are just about worn out and that they will be useless on a certain specific date. The "team that owns the tires" buys new tires and tells the "Secret Store" that new tires have been bought. The team does not install the new tires, though; it places the task of installing the tires in an "unprioritized queue"!!!! Somehow, more important tasks like replacing the windshield washer fluid and replacing that pine-tree air freshener hanging off the mirror get prioritized on the queue and performed. Lo and behold, the tires get too old, expire, and are taken off of the car. No one bothers putting new tires on the car. The car is nonfunctional. MS FTW, yet again!

    It's incredible how they keep shuffling blame around, or hot-potato-ing it:

    In this case, the Secret Store service notified the Windows Azure Storage service team that the SSL certificates mentioned above would expire on the given dates. On January 7th, 2013 the storage team updated the three certificates in the Secret Store and included them in a future release of the service. However, the team failed to flag the storage service release as a release that included certificate updates. Subsequently, the release of the storage service containing the time critical certificate updates was delayed behind updates flagged as higher priority, and was not deployed in time to meet the certificate expiration deadline. Additionally, because the certificate had already been updated in the Secret Store, no additional alerts were presented to the team, which was a gap in our alerting system. [source link] [bold emphasis mine]

    Laughable, if it were not so stupid.
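
    The gap described there is that the alert watched the certificate in the store, not the one actually being served. A minimal sketch of the other approach, checking what the live endpoint presents, might look like this. It uses only Python's standard `ssl` and `socket` modules; the function names, the 30-day threshold, and the dates are made up for illustration, and this is not Microsoft's tooling.

```python
import socket
import ssl


def days_until_expiry(not_after: str, now: float) -> float:
    """Days left before a certificate's notAfter time (negative if already expired)."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def needs_alert(not_after: str, now: float, warn_days: float = 30) -> bool:
    """True when the certificate expires within warn_days (hypothetical threshold)."""
    return days_until_expiry(not_after, now) <= warn_days


def deployed_not_after(host: str, port: int = 443) -> str:
    """Read notAfter from the certificate the live endpoint is serving.

    With the default context, an already-expired certificate makes the
    handshake raise ssl.SSLCertVerificationError, which is itself a signal.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Offline demonstration with made-up dates (no network access needed).
example_now = ssl.cert_time_to_seconds("Dec  2 00:00:00 2029 GMT")
example_not_after = "Jan  1 00:00:00 2030 GMT"
print(days_until_expiry(example_not_after, example_now))
print(needs_alert(example_not_after, example_now))
```

    In a real check you would pass `deployed_not_after("your.host.example")` and `time.time()` instead of the fixed strings, so the alert keeps firing until the new certificate is actually deployed, whatever the store says.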

  6. Re:When will they accept Windows 8 as a failure? by DaveV1.0 · · Score: 2

    Guess what. Almost nobody cares that it comes with a secure bootloader. The only people who do care are a small number of geeks.

    --
    There is no "-1 offended" or "-1 you don't agree with me" mod options for a reason.
  7. Re:When will they accept Windows 8 as a failure? by RaceProUK · · Score: 2

    When you charge an arm and a leg for an OS and your company basically has unlimited money, then there is no excuse for not delivering perfect software with no bugs. So yes I was expecting a perfect version of Metro.

    The cost of certifying a modern OS totally bug-free would exceed the GDP of the entire world, hundreds of times over.

    --
    No colour or religion ever stopped the bullet from a gun
  8. Re:When will they accept Windows 8 as a failure? by X0563511 · · Score: 2

    Almost nobody cares about a lot of things that matter a great deal.

    --
    For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
  9. Re:When will they accept Windows 8 as a failure? by phantomfive · · Score: 2

    You are wrong. There is nothing compiled for OSX before 2005 that still works on their most recent OS. The shift to 64 bit is further causing Apple to remove public APIs. Apple has demonstrated again and again they have no commitment to backwards compatibility, and there is nothing you can do as a programmer to avoid it.

    --
    "First they came for the slanderers and i said nothing."