Skype Blames Microsoft Patch Tuesday for Outage
brajesh writes to tell us that Skype has blamed its outage over the last week on Microsoft's Patch Tuesday. Apparently the huge numbers of computers rebooting (and the resulting flood of login requests) revealed a problem with the network allocation algorithm resulting in a couple days of downtime. Skype further stressed that there was no malicious activity and user security was never in any danger.
Care to elaborate, Hercule Poirot?
"You don't need a weatherman to know which way the wind blows." - Bob Dylan
It was just a few days ago that the Open Source elders asked people to stop bashing Microsoft. Skype did not blame Microsoft for the outage. They admitted the fault was in their software. We are not children here or part of a cult. This type of childishness is not appreciated here.
Skype's network was overloaded by the zillions of Windows PCs rebooting after the patch installations.
Skype Blames Microsoft Patch Tuesday for Outage
For the love of God, editors, I understand that it is fine to write a sensationalist title on some articles, but that is blatantly FALSE. It is a complete LIE. People at Skype specifically stated that the fault was in *their* log-in mechanisms.
Really this kind of journalism is disgusting... I am tagging this story as LIE which I hope other people do as well, unless editors change the title.
I find it hard to believe Slashdot has sunk so low... this and the speculative Digg-like "articles" ending with a question mark. What the fuck?
Ubuntu is an African word meaning 'I can't configure Debian'
Perhaps it would be troubling if they were blaming Microsoft. In this case they explained that the large number of simultaneous reboots and subsequent logins simply stressed their servers. They further stated that their "self healing" did not function as designed. It is strange that earlier "patch Tuesdays" did not cause this to occur, but as I write code I find that many behaviors I see in my applications are strange until I truly understand their root cause. It may have been that the software was resilient to a point and then just fell over. Perhaps the point that it fell over was when the "self healing" kicked in and hit its fatal bug.
Load testing is hard. I know. I used to do it. It is hard to anticipate what your peak load might be. It can also be hard to generate the right kinds and volumes of loads that your service might experience. Proper load testing requires a realistic test bed with enough machines running client simulation scripts to sufficiently load the machine. This requires a deep understanding from management that spending large amounts of money on non-production systems is essential. Your setup might deal with some kinds of load well and fail on others. Perhaps Skype had considered what might happen during a natural disaster with a large number of calls originating at the same time, but neglected to see login as a significant risk, especially if they had weathered that storm before.
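To make the login-burst risk concrete, here is a toy back-of-the-envelope model in Python. It is only a sketch under invented assumptions: the function name `simulate_login_queue`, the capacity of 100 logins per tick, and the arrival numbers are all hypothetical and say nothing about Skype's real architecture. It just shows how a short spike above capacity leaves a backlog that drains slowly.

```python
def simulate_login_queue(arrivals_per_tick, capacity_per_tick):
    """Return the backlog of pending logins after each tick.

    Hypothetical model: each tick the server clears at most
    `capacity_per_tick` logins; anything left over waits in the queue.
    """
    backlog = 0
    history = []
    for arrivals in arrivals_per_tick:
        backlog = max(0, backlog + arrivals - capacity_per_tick)
        history.append(backlog)
    return history


# Made-up numbers: steady traffic sits below capacity, so nothing queues up.
steady = simulate_login_queue([80] * 5, capacity_per_tick=100)

# A reboot-style burst: two ticks far above capacity, then back to normal.
burst = simulate_login_queue([80, 80, 500, 300, 80, 80], capacity_per_tick=100)

print(steady)  # [0, 0, 0, 0, 0]
print(burst)   # [0, 0, 400, 600, 580, 560]
```

With only 20 logins per tick of spare capacity, the backlog from a two-tick burst would take about 30 ticks to clear in this model, which is why a test bed that only exercises steady-state load can miss the problem entirely.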
My least proud moment in quality assurance was seeing my company's service go down for a weekend due to excessive database load. We had a new version of our web service software that required significant database changes to each user account (including database structure redesign... go ahead and wade through that hard book on database principles before you start coding, my friends... funny, it's what I'm doing right now as I go from QA dude to programmer). We made an upgrade script that ran when each user logged in, which brought the user's data up to date with the current version of our software. The thing is, I knew about the risk, measured a high load at user login, and notified engineering about the potential problem, but I didn't demand that the upgrade be placed on hold until the issue could be better quantified. Ah, live and learn.
-Jon
Aren't people usually complaining that Windows users don't install the security patches? Now people complain that they actually DO install them... WHEN OH WHEN are people satisfied?
It just goes to show that you DON'T have control over your machine when it's running Microsoft Windows and it's on the internet. We have seen problems that result from this level of consumer trust in Microsoft before. I just have to wonder how much more consumers will tolerate. Seems like plenty, since most people think that anything Microsoft does is normal.
How do you know your phone service has never been out in 60 years? Do you monitor it? How many calls a day do you make? Are you home 24/7 and do you use the phone all the time, as in more than 10,000 minutes per month?
Sure, you've never been affected by an outage of your phone service, but that doesn't mean it hasn't been out of service ever.
Plus, you pay for it too. At $30-40/month per line, you expect minimal outages. When you are paying $30/year or even nothing, a two day outage, while annoying, isn't surprising, especially when operated on a public network. Your phone line is on a private, dedicated network. You simply can't compare the two when it comes to uptime.
If all of Skype's customers paid $30-40/month, I'd be much more confident that they wouldn't have had this outage.
There's a difference between a reason and an excuse. The *reason* the network went down was related to the MS patches. That's not an excuse -- Skype admits there is no excuse, and is now fixing their code.
Isn't this how it's supposed to work?
Under the circumstances, I think it was funny that they recommended leaving the client running in order to reconnect automagically once the login service was fixed. Sounds like a bad idea while having login issues...
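Whether leaving clients running helps or hurts depends on how they retry. One common way to keep reconnecting clients from hammering a struggling login service is exponential backoff with random jitter. The Python below is a minimal sketch of that general technique, not Skype's actual client behavior; the function name `backoff_delays` and the `base` and `cap` values are made up for illustration.

```python
import random


def backoff_delays(attempts, base=1.0, cap=300.0, rng=None):
    """Return a list of wait times (seconds) before each retry.

    Hypothetical sketch of "full jitter" backoff: the ceiling doubles
    with every failed attempt up to `cap`, and the actual wait is drawn
    uniformly between 0 and that ceiling so clients spread out.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays


# Example: the waits one client might use across 8 failed login attempts.
for attempt, delay in enumerate(backoff_delays(8), start=1):
    print(f"retry {attempt}: wait {delay:.1f}s")
```

If every client instead retried on a fixed short timer, the login servers would see the same synchronized flood over and over, which is the scenario the comment above is worried about.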