Slashdot Mirror


WGA Meltdown Blamed On Human Error

Erris writes "As commentators like Ars Technica slam WGA as deeply flawed, Microsoft is blaming human error and swears it won't happen again. 'Alex Kochis, Microsoft's senior WGA product manager, wrote in a blog posting that the troubles began after preproduction code was installed on live servers. ... rollback fixed the problem on the product-activation servers within 30 minutes ... but it didn't reset the validation servers. ... "we didn't have the right monitoring in place to be sure the fixes had the intended effect"' Critics were not impressed. '"A system that's not totally reliable really should not be so punitive," said Gartner Inc. analyst Michael Silver. Michael Cherry, an analyst at Directions on Microsoft in Kirkland, Wash., said he was surprised that it was even possible to accidentally load the wrong code onto live servers ... [and asks], "what other things have they not done?"' This is not the first time this has happened, either."

26 of 250 comments

  1. Have we gone backwards? by Ckwop · · Score: 4, Insightful

    This sort of ties in with what I was saying on IRC with my friends yesterday. My central point was that all operating systems have got worse over the past ten years.

    I'm currently reading the Mythical Man Month (which I imagine most of you have heard of and already read) and in it he talks about the OS/360 operating system in great detail. I'm recalling this from memory so I'm sure someone will correct my mistakes but anyway, the machine had 2MB of memory and the operating system cost 400Kb of the memory. They charged something like $9.50 a month for 1Kb of system memory. That meant that every Kilobyte of memory saved was worth hundreds or even thousands of dollars over the lifetime of the machine.
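As a rough check on that claim, the arithmetic can be sketched in a few lines. The rent and footprint are the commenter's from-memory figures; the five-year machine lifetime is purely an assumption made here for illustration and does not come from the thread.

```python
# Back-of-the-envelope check of the figures recalled above.
RENT_PER_KB_PER_MONTH = 9.50   # dollars, as recalled in the comment (unverified)
OS_FOOTPRINT_KB = 400          # memory used by the operating system, per the comment
ASSUMED_LIFETIME_YEARS = 5     # assumption: the comment does not give a lifetime


def rental_cost(kilobytes: float, years: float) -> float:
    """Total rent in dollars for `kilobytes` of memory held for `years`."""
    return kilobytes * RENT_PER_KB_PER_MONTH * 12 * years


per_kb = rental_cost(1, ASSUMED_LIFETIME_YEARS)
whole_os = rental_cost(OS_FOOTPRINT_KB, ASSUMED_LIFETIME_YEARS)
print(f"1 KB for {ASSUMED_LIFETIME_YEARS} years: ${per_kb:,.2f}")
print(f"{OS_FOOTPRINT_KB} KB for {ASSUMED_LIFETIME_YEARS} years: ${whole_os:,.2f}")
```

Under those assumptions a single kilobyte costs hundreds of dollars over the machine's life, which is in line with the comment; a longer lifetime pushes the figure toward the thousands.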

    It made me realise what is in retrospect a fairly obvious statement. The cost of the operating system on your hardware is an effect that should be minimized. The operating system exists as a framework for running tasks and applications, not for being a self-serving excuse to munch resources.

    While Moore's Law technically means something different, the adage has held true that computing power has doubled every eighteen months. This means that my machine which I bought in January should be roughly 100 times more powerful than the machine I had in 1997. Yet do I have a hundred times more power to run my applications on a modern Operating System? Absolutely not.
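The "roughly 100 times" figure follows from compounding a doubling every eighteen months. A minimal sketch, assuming a ten-year gap (1997 to the January purchase, which the comment implies but does not state exactly):

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Multiplicative growth after `years` if capacity doubles every `doubling_months`."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


# Assumption: a ten-year gap between the two machines.
factor = growth_factor(10)
print(f"about {factor:.0f}x after 10 years")
```

Ten years is about 6.7 doublings, which lands close to the factor of 100 the comment uses.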

    Strictly speaking, there are no tasks I do today that I couldn't do in 1997. I can be honest that computing hasn't really got easier since then either. There's the odd innovation here and there that's nice from a usability point of view, but fundamentally nothing has really changed. For an example, Office 97 and Windows 98 are no harder to use than XP and Office 2003. The addition of an extra monitor to my computer has impacted my productivity more than the choice of software in this period.

    In short, where did all these cycles go?

    Now Microsoft Vista is a sort of a post-modern operating system. In every sense it is a regression. It does not allow tasks to be managed easier yet requires an enormous amount of extra resources just to operate. WGA in a sense breaks the very stability of the system. The point of the OS is to perform tasks and applications yet Microsoft can take this away from you either by malice or stupidity.

    When are we going to demand more from OS vendors? When are we going to demand that future versions do the same as the previous version with less memory and less CPU overhead? Why do we pay to upgrade only to find our upgrades are wiped out by OS bloat? All of these are interesting questions, and while off-topic slightly, I'd like to see what you think!

    Simon

    1. Re:Have we gone backwards? by PeeAitchPee · · Score: 5, Interesting

      Strictly speaking, there are no tasks I do today that I couldn't do in 1997.

      Speak for yourself. Just because *you personally* don't use the extra processing power, memory, and storage that are available doesn't mean that lots of others don't. For example, I'm in the middle of digitizing and OCRing 110 years of local newspapers from microfilm into archival-quality PDFs for an historical society. Quite simply, you *cannot* have too much processing power when doing OCR -- I'm running multiple instances of ABBYY FineReader Corporate on a 2x Quad Core Xeon that has been pegged for two weeks now. It's quick, multithreads across all 8 cores and does a great job, but there's simply too much data. Note that this project would have been completely impossible in 1997 -- there simply wasn't enough processing power, memory or storage available to do it on anything less than a supercomputer. And that's not even considering truly bandwidth- and processor-intensive tasks related to video, weather modeling, etc.

    2. Re:Have we gone backwards? by Dunbal · · Score: 4, Insightful

      When are we going to demand more from OS vendors?

      I would extend this to "software" as a whole. Software seems to be in a special protected class, since companies are able to KNOWINGLY deliver a defective product and be immune from prosecution. Computer games I am looking at you. There seems to be a mentality in the industry of "ship now, patch later".

      I can't let this go without a car analogy (this is slashdot after all):

      It's like buying a new car from a dealership, only to find out it comes with 5 flat tires. But the salesman puts his arm on your shoulder and says "hey, no worries, look - there's a gas station just over there and you can get those tires fixed in no time".

      It's high time the software industry as a whole was held accountable for this sloth. And don't give me the crap about "oh but there are so many different computers and hardware and configurations". After all, ISN'T THAT WHAT WINDOWS WAS SUPPOSED TO FIX? We certainly were sold on that idea in 1995. Windows was supposed to be a common application interface that smoothed over all the hardware differences. But because it's the poorly documented, bloated, kludge that it is, programmers yet again have to rely on little tricks and cheats to get top performance out of it. Resulting in crashes/bugs on non-standard systems.

      --
      Seven puppies were harmed during the making of this post.
    3. Re:Have we gone backwards? by pandrijeczko · · Score: 4, Insightful
      The Amiga was taken down, not because there was not enough demand for it, but because it was too efficient.

      Rubbish! The Amiga was a far superior machine to the IBM PC but Commodore/Escom/Gateway/Amiga Inc. did not have a single clue as to how to market it and expand it correctly. It was their total lack of competence that caused its death.

      Amiga users (and I know because I was one of them once) were the most loyal bunch of users there could possibly be, a bunch of people who remained loyal for years despite being continually f*cked in the arse by unfulfilled promises by David Pleasance and whoever else controlled the Amiga name over the years.

      --
      Gentoo Linux - another day, another USE flag.
    4. Re:Have we gone backwards? by Generic+Guy · · Score: 4, Interesting

      I think you're more on-topic than you think. I feel compelled to respond to your observations with my own:

      the OS/360 operating system...the machine had 2MB of memory and the operating system cost 400Kb of the memory.

      Keep in mind that 400K is about 20% of the machine's available resources, which doesn't seem too different from today. Although today we have a lot more choice in how many 'resources' to put into a workstation or server type system.

      There is also the difference between hosting old world text terminal interfaces and the modern high color depth, fancy windowing systems we have today.

      They charged something like $9.50 a month for 1Kb of system memory. That meant that every Kilobyte of memory saved was worth hundreds or even thousands of dollars over the lifetime of the machine.

      Now this is the interesting point, IMO. In the past, you would often lease your 'mainframe' software, and need to renew it every year. Often you would have to contact your sales rep, get a new key, and 'activate' the software for another year. With a computer on every desktop, people were sold on the idea that you 'buy' your OS and software from the store and it's yours -- forever. While 'Activation' and WGA are ostensibly an anti-pirating measure, in my eyes Microsoft is trying to steer the desktop PC market back to the old mainframe model of paying a yearly (or perhaps monthly) tithe to keep your computer working. Get the market used to phone-home features, and slowly close the net. They've been interested in subscription models for quite a while now.

      The problem for Microsoft is that, unlike mainframe vendors, they suck at reliability. So while Microsoft is eager for a lease-type model, they don't have the corporate culture or experience to make a robust system; they still have a lot of design issues with the tracking and activation back end, which is of course necessary for a 'rental' paradigm.

      --
      { - Generic Guy - }
    5. Re:Have we gone backwards? by Ckwop · · Score: 4, Informative

      Quite simply, you *cannot* have too much processing power when doing OCR -- I'm running multiple instances of ABBYY FineReader Corporate on a 2x Quad Core Xeon that has been pegged for two weeks now.

      This is an application task and I'm inclined to agree with you. You can never have enough resources, whether you're encoding HD-DVDs all day or just using Notepad.

      However, I was talking about the operating system. The role of an operating system should be to provide a framework for performing tasks and running applications as cheaply as possible; that is, using as few resources as possible.

      It's a fair bet your program would work on Windows 2000 and Windows Vista. Yet Windows Vista will "tax" your system more to achieve exactly the same result. This is my point - the operating system is gobbling more and more resources that should be used by your applications without giving the user anything in return. In this sense, we are moving backwards.

      Simon

    6. Re:Have we gone backwards? by PeeAitchPee · · Score: 5, Interesting

      As for your task, it may not have been done on a single machine in a reasonable timeframe and certainly not in a point and click fashion. However you could have easily integrated the ABBYY engine into a networked batch OCR solution and then hired the capacity to run it (eg: a renderfarm).

      Ahhh, spoken like someone who's never done a project like this before. So easy to plan in your head on Slashdot in 30 seconds, isn't it?

      Even if the integration work needed to connect ABBYY's OCR engine to some sort of distributed processing farm weren't cost-prohibitive (which it is -- historical societies aren't exactly made of money), how would you suggest I upload over a terabyte of raw image data in a timely fashion to said render farm? And then download it again once completed (not as big of a problem, but still an issue)?

      The bigger question is whether or not to take on OCR in-house at all. If you want to sub-out OCR, then you have to wait until the scanning is complete (weeks) -- sending partial jobs via hard drive is more expensive than sending everything at once at the end. It's still too much money at the end of the day -- much, much cheaper to keep it in-house, and the QA process is better. The cheapest option is to buy the fastest server your budget permits and run it 24x7 in parallel with scanning and final PDF assembly / burning. ABBYY FineReader multithreads on recognition, but NOT on opening batches or writing out PDFs. That is the real bottleneck, and the reason it's necessary to run multiple instances.

    7. Re:Have we gone backwards? by ScrewMaster · · Score: 4, Interesting

      If I have 2 cores at my disposal, I'm going to be even more inclined to let the OS do some extra stuff on one of them.

      Yes, but you paid for those cores, the OS vendor did not. The problem is this: what is that extra stuff, and why should your operating system be doing anything that isn't of benefit to you?

      Take Vista for example. It is a resource hog. Some of that piggishness is the user interface, but there's a lot of other "extra stuff" in Vista that has no right to be there. Hopefully, someone will figure a way to strip most of it out at some point: maybe then it will be actually usable. Until then, I'm personally going to stick with XP and Linux. There's less extra stuff.

      --
      The higher the technology, the sharper that two-edged sword.
    8. Re:Have we gone backwards? by Sancho · · Score: 4, Insightful

      From Win95 to Win98 to Win2000 to WinXP, I've seen nothing but stability and security improvements. Vista has some security improvements, too, but in my experience, it isn't any more stable than XP. What's also come with every single new release of Windows is a changed UI, more eye-candy, and features that many geeks find useless.

      That doesn't mean that they're useless to everyone.

      Part of the issue is that you're focusing on the operating system. Windows is really quite a bit more than that--it's an operating environment (or a desktop environment, as GNOME/KDE are described.) This means that they aren't just there to provide a framework for performing tasks--the operating environment performs tasks on your behalf, provides feedback, allows the user access to information in a subtle, yet useful way (many OS X widgets, for example, and whatever Microsoft is calling their clone of it in Vista.)

      In the Unix world, we separate the operating system (kernel) from the shell (bash/ksh/whatever) from the window manager (metacity/fluxbox/xwm) from the desktop environment (GNOME/KDE). This separation allows for immense flexibility. I can mix-and-match flavors, and even eliminate some of these layers entirely, depending upon my needs.

      Windows, however, caters to the mass market. It needs consistency in order to maintain its marketshare, while simultaneously requiring each version to have a distinct look in order to differentiate itself from the earlier versions. It has to be everything to everyone in order to keep existing users and attract new ones. It makes sense to throw in as much stuff as you can, so that people will want to use their product.

      Most people buying a computer will use it for the Internet (browsing, email) and maybe for creating documents and managing finances. Yes, they could do this on a 10 year old machine. The only reason to upgrade, then, is for the new UI or because their old computer broke. In either case, they aren't really losing anything. They're gaining more cycles in their new computer, and they're getting an OS that uses those cycles. If their tasks don't change, their CPU power needs (over what the OS requires) probably haven't changed, either.

      In more specialized circumstances, yes, it matters. And that's part of the reason that new OS are adopted fairly slowly in the business world. Not only do we want to ensure that the change is as easy as possible, but we want to make sure that we aren't losing anything.

      I think I've rambled a bit much, but the gist is, you aren't the target of Windows Vista, and Microsoft isn't just making an operating system. And that you're bringing Unix-like preconceptions into the Microsoft world.

  2. Why didn't they kill the server? by G4from128k · · Score: 4, Interesting

    One of the articles I read (http://www.betanews.com/article/Microsoft_WGA_Outage_Not_an_Outage/1188405961) suggested that if the server had actually gone down, then this would not have been a problem. The article, based on comments from Microsoft, suggested that WGA defaults to "genuine" if it can't reach the WGA server. So why didn't MSFT just kill the server to let people's software default to "genuine" instead of leaving the server connected with faulty software?

    --
    Two wrongs don't make a right, but three lefts do.
    1. Re:Why didn't they kill the server? by Technician · · Score: 4, Insightful

      So why didn't MSFT just kill the server to let people's software default to "genuine" instead of leaving the server connected with faulty software?

      It's an anti-piracy feature. It prevents a business from firewalling the WGA server to get "genuine" status. Remember there was an un-authorised software update site? If it works without the real MS saying it's OK, the anti-piracy feature does not work.

      Unfortunately for MS, this feature does not prevent users from migrating to the alternatives. It's hard to run a monopoly when Ubuntu is legal and free for the taking. If they had a choice, the first would be that I run Windows fully paid for. Second choice is that I run a pirated copy, but they are using WGA to prevent that to encourage me into the first choice, but the result is I have gone to their worst option: I've gone legal to the competition. MS is helping themselves break their monopoly by reducing piracy.

      --
      The truth shall set you free!
  3. "won't happen again"? by haeger · · Score: 5, Insightful
    So, if it's human error that caused the problem, how can they swear that it won't happen again? Will there be no more humans working at Microsoft anymore?
    I don't get it.
    People make mistakes and as long as people are involved in any process they will cock up from time to time.

    The point about systems not being so punitive is a valid one and should be brought up more often and louder. People who've paid money for their product should not be punished for an error on Microsoft's end.

    .haeger

    --
    You are not entitled to your opinion. You are entitled to your informed opinion. -- Harlan Ellison
  4. It's a fair point by Joe+Jay+Bee · · Score: 5, Interesting

    Critics were not impressed. '"A system that's not totally reliable really should not be so punitive," said Gartner Inc. analyst Michael Silver. Michael Cherry, an analyst at Directions on Microsoft in Kirkland, Wash.,

    WGA is a natural, if not perfect (or even good) business response to the problem of piracy (leaving out all the debate over whether it's a good or bad thing for Microsoft as a whole). But the technical implementation leaves a lot to be desired; if anything, the response to a WGA server failure should be automatic pass (fail safe) instead of an automatic fail (fail deadly).

    Sure, for a 24 hour window pirates would have a free-for-all in getting perfectly valid WGA results, but at the same time legitimate customers would not be inconvenienced. As far as I can see, that's the only way to keep WGA while minimising the backlash against it.
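The fail-safe versus fail-deadly distinction can be made concrete with a small sketch. This is not Microsoft's implementation; the names `ServerReply`, `Policy`, and `is_treated_as_genuine` are hypothetical and exist only to show where the two designs differ.

```python
from enum import Enum


class ServerReply(Enum):
    GENUINE = "genuine"
    NOT_GENUINE = "not_genuine"
    UNAVAILABLE = "unavailable"   # hypothetical: timeout or outage


class Policy(Enum):
    FAIL_SAFE = "fail_safe"       # an outage counts as a pass
    FAIL_DEADLY = "fail_deadly"   # an outage counts as a failure


def is_treated_as_genuine(reply: ServerReply, policy: Policy) -> bool:
    """Decide how a client treats an install, given the server's reply."""
    if reply is ServerReply.GENUINE:
        return True
    if reply is ServerReply.NOT_GENUINE:
        return False
    # Only an unreachable server is affected by the choice of policy.
    return policy is Policy.FAIL_SAFE


for policy in Policy:
    outcome = is_treated_as_genuine(ServerReply.UNAVAILABLE, policy)
    print(f"{policy.value}: outage treated as genuine = {outcome}")
```

Note that the policy only matters when the server cannot be reached. A server that is up but answers wrongly, as the betanews article cited earlier in the thread describes, produces a NOT_GENUINE reply under either design.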

    1. Re:It's a fair point by Anonymous Coward · · Score: 5, Insightful

      Sure, for a 24 hour window pirates would have a free-for-all in getting perfectly valid WGA results.

      Actually, pirates would probably very quickly figure out how to set the WGA server failure condition in Windows to get the automatic pass without ever actually contacting the real WGA servers, which would render WGA completely worthless. Well... more so.

      I don't use Windows, can't stand Microsoft, and had a hearty laugh at the news of the WGA meltdown, but the problem is not as easy to solve from a technical standpoint as you believe.

  5. What happens in Safety Critical Windows installs? by Anonymous Coward · · Score: 4, Informative

    So if you were stupid enough to use Windows in a safety critical application you risk WGA putting people's lives at risk?

    Imagine if you used Windows in a doctor's surgery to hold patient records, or store drug allergy data on it. WGA flags the PC as counterfeit, after that only Windows Explorer works, and you can't get their records or allergy info.

    As long as Microsoft can deliberately or accidentally remove your right to use your PC, then you can't use it in any cases where you may find yourself in future dispute with MS, or where you need to rely on the PC. Having backups is no fix for the Windows Genuine Advantage bugs, because all Windows PCs go down in one go. It represents the ultimate single point of failure.

  6. Microsoft is blaming human error by suv4x4 · · Score: 4, Insightful

    Microsoft is blaming human error and swears it won't happen again.

    Self-contradictory: of all things that could happen out there, one thing will keep happening, and that's human error.

    Realistically, it's just another fail point on your OS that will blow up from time to time.

  7. Re:Zoom by gatzke · · Score: 5, Insightful


    Slashdot is not about journalistic integrity, it never has been. It is about nerd topics and dupes.

    ACs complaining about twitter does look like astroturfing. MS has enough money to pay a few guys to beat back public opinion on well-known public tech sites. Without facts disputing the current article, it looks like you are just pro-MS ranting against an anti-MS article without any substance.

    Fact- WGA broke for a while causing many people troubles.

    Fact- Some people don't like having to phone MS all the time to keep a product running.

    Fact- MS has paid astroturfers to anonymously post pro-MS grassroots stuff online.

  8. Monitoring by Dunbal · · Score: 4, Insightful

    "we didn't have the right monitoring in place to be sure the fixes had the intended effect"

    This sounds a lot like the Bush administration's excuse... oops!

    Seriously, Microsoft is great at monitoring YOUR computer, but they can't monitor their own?

    --
    Seven puppies were harmed during the making of this post.
  9. I've said it before and I'll say it again by FoolsGold · · Score: 4, Insightful

    If the pirates are having no problems and it's the legit users who are getting fucked in the ass, why the hell does Microsoft continue to bother with WGA?

    What do they gain? Was WGA supposed to convince people using illegitimate versions of Windows to turn to the light? Fuck that, they'll just download the latest cracked WGA .DLL and get on with it, while the legit users will get boned because their serial key wasn't recognized or whatever.

    WGA does NOTHING to hinder piracy, at least not with any level of success that compensates for the negative effects on legit users. It's a complete joke - and yet Microsoft doesn't have the balls to admit this yet. It pisses me off to see such short-sightedness from a bunch of guys who are supposed to be experienced in business.

  10. Not an acceptable answer by Anonymous Coward · · Score: 5, Insightful

    Look, most of us here work (directly or indirectly) in software. Who hasn't had a launch fail, or a product go bad, in a way that's negatively impacted customers? Such things DO happen. Usually not out of malice, and even sometimes not from carelessness--there are things that sometimes you can't catch on a test system. So to that extent, I feel for the folks who caused this problem.

    So why do I call it unacceptable? Because of the difference in standards. On Microsoft's side, they are holding the user to a high level of scrutiny, and reserve the right to cripple some OS features if Microsoft believes the install is pirated. No discussions. Go directly to "aero jail".

    Which is possibly understandable if their stance is "look, we're losing billions here--we need to fight piracy." But if they're going to take such radical and punitive measures as locking down OS features based on their tool, then they have to have an absolutely rock solid fail resistant totally monitored system. Basically, they need to hold WGA to a higher standard than most business software. This needs to be the gold standard if they want people to trust the system (and TFA links to a number of other reasonably well-balanced Ars articles that suggest it is not).

    Oops, we forgot to monitor the validation boxes? You can't be organic about this--add monitoring for problems as they're discovered on a system this critical not just to Microsoft, but to their customers. You have to anticipate what MIGHT happen, even if "there's no way that should ever occur." You have to think of things that should never happen, but would be problematic if they did.
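One way to picture the missing check is a post-rollback verification that compares what every server group reports against the build that is supposed to be live. This is only a sketch: the group names, build labels, and the `unverified_groups` helper are hypothetical, and nothing here reflects how Microsoft's systems are actually organised.

```python
def unverified_groups(expected_build: str, reported_builds: dict[str, str]) -> list[str]:
    """Return the server groups whose reported build differs from the expected one."""
    return sorted(
        group for group, build in reported_builds.items() if build != expected_build
    )


# Hypothetical state after a rollback that reached only one of two groups.
reported = {"activation": "prod-1", "validation": "preprod-7"}
stale = unverified_groups("prod-1", reported)

if stale:
    print("rollback NOT confirmed on:", ", ".join(stale))
else:
    print("rollback confirmed on all groups")
```

The point of the sketch is that the rollback is only declared done once every group has been checked, rather than assuming a fix applied to one group reached the others.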

    The fact that they failed here, if it never happens again, might not be a huge deal. But their answer shreds confidence that this is an isolated issue. The fact that this specific failure might not happen again gives me no comfort. Because their answer indicated that they didn't get it when they designed the system, and they don't get it now.

    What they SHOULD have said is "boy, this was something we never thought could happen. We have fixed the issue, and are confident we have the monitoring to prevent this specific issue going forward. And we are undertaking a comprehensive review of our validation and monitoring systems to make sure nothing even remotely close to this could ever possibly happen again." Nothing less should be acceptable.

  11. You have a choice, people! by pandrijeczko · · Score: 4, Insightful
    You do not need to keep yourselves tied to Microsoft's apron strings, there are alternative operating systems you can use.

    If WGA or other Microsoft activities are p*ssing you off as a user, then have some strength of conviction and DO SOMETHING ABOUT IT!

    Just stop with the continual whining about it...

    --
    Gentoo Linux - another day, another USE flag.
  12. paying for updates around the corner by gelfling · · Score: 4, Interesting

    Some division head inside Redmond is crafting his internal proposal to convert the update realm from a cost center to a revenue center. The rationale will be to collect the funding to staff up that function appropriately so as not to harm MS from mistakes such as this.

    The ironic thing is that few people will pay - and while the level of installed patches will go down the overall level of security will not materially change given the overall poor security stance in the first place. What will happen is that interoperability will begin to fail badly.

  13. Human error by Bromskloss · · Score: 4, Funny

    ...as opposed to an error in the actual WGA, which is not coded by humans, but by Microsoft's programmers.

    --
    Swedish plasma phys. PhD student; MSc EE; knows maths, programming, electronics; finance interest; seeks opportunities
  14. Re:What happens in Safety Critical Windows install by Technician · · Score: 4, Informative

    So if you were stupid enough to use Windows in a safety critical application you risk WGA putting people's lives at risk?

    Imagine if you used Windows in a doctor's surgery to hold patient records, or store drug allergy data on it. WGA flags the PC as counterfeit, after that only Windows Explorer works, and you can't get their records or allergy info.


    Read the EULA. Pay attention to the section regarding life critical application. It clearly states it is not to be used in life support applications. It simply isn't reliable for that. MS is avoiding lawsuits from people depending on Windows for life support by explicitly stating it is not designed, manufactured, or intended for that.

    "Note on Java Support. The SOFTWARE may contain support for programs written in Java. Java technology is not fault tolerant and is not designed, manufactured, or intended for use or resale as online control equipment in hazardous environments requiring fail-safe performance, such as in the operation of nuclear facilities, aircraft navigation or communication systems, air traffic control, direct life support machines, or weapon systems, in which the failure of Java technology could lead directly to death, personal injury, or severe physical or environmental damage. Sun Microsystems, Inc. has contractually obligated MS to make this disclaimer."

    snipped from here;
    http://www.microsoft.com/msdownload/ieplatform/ie/license.txt

    --
    The truth shall set you free!
  15. Re:Zoom by iminplaya · · Score: 4, Insightful

    Well, let's be honest. Any program or OS that requires activation deserves a good bashing, and we should not support it in any fashion. And I proudly champion those who develop workarounds. Those who complain about bootleggers while benefiting from them as Microsoft and Adobe do are just as hypocritical as gay bashing republicans.

    --
    What?
  16. Re:tagged as "blamebill" by Doctor+O · · Score: 5, Funny

    Bill is still chairman. Ballmer is CEO. Last thing I heard, Ballmer indeed is the chair man. I don't think Bill has *ever* thrown a chair.
    --
    Who is General Failure and why is he reading my hard disk?