Failed Win XP Upgrade Wipes Out UK Government Agency
Lurker McLurker writes "The BBC and the Register report that the UK Government's Department for Work and Pensions attempted to upgrade seven PCs from Windows 2000 to Windows XP, and ended up with BSODs on over 60,000 machines. I wonder if the National Health Service is regretting awarding Microsoft a £500 million contract now." The Guardian also has a good story.
They wanted that new version of Internet Explorer with the fancy built-in pop-up blocker.
Never email donotemail@WeAreSpammers.com
I can imagine it now
Intern: "Sir, Microsoft have brought out Windows XP Service Pack 2. It's had numerous bug reports of dying PCs and software not working anymore. THIS is the time to upgrade to Windows XP, then upgrade to SP2 because Windows Update won't stop bugging the hell out of us until we do!"
Boss: "You mean we could cock something up, and it might not even be our fault for a change?! Let's pay someone vast amounts of money to do it!"
The Guardian reports it was a week-long outage. Now, I may be completely wrong here, but surely all they had to do was restore those PCs to their previous Windows 2000 state using the daily backups they do... I mean, it's only common sense to do backups on such a critical syst...oh, wait, never mind.
</cynical>
OH SHI-
If only they had reached the conclusion hinted at in this BBC News article a year or two ago, this would not have happened.
It's certainly bad PR for Microsoft though, perhaps this will serve as a wake-up call to other governments that "other options" are out there.
But still I have to say it: "HAHA!"
Every time I hear about a big government IT fuck-up it seems to be caused by EDS. Yet the government keep awarding them contracts. Why?
Incompetent admins can turn any minor upgrade into a catastrophic failure. Don't blame M$ for this one unless there is irrefutable proof that the admins did everything by the numbers.
Quidquid latine dictum sit, altum sonatur.
Every Desktop Shutdown.
All those moments will be lost in time, like tears in rain.
It's like a thousand solitaire players suddenly cried out in frustration and then silence...
I like muppets.
From the Guardian article: "At this point there is no known solution or ETA"
I RTFA and all I see is a money discussion, not a technical discussion. I would speculate that an SMS or ZENworks push or something similar, which was supposed to be restricted to the 7 PCs, went almost everywhere. It might be a fair bet that the remaining 20,000 would have been upgraded too if those people had been at work and turned on their computers. IT management tools give the department a lot of power, which can do plenty of damage in the wrong hands.
Jon.
The BBC article mentions that EDS is responsible for the upgrade. They're partnered with Altiris, so I'd be willing to bet that the upgrade was carried out using the Altiris Client Management Suite.
It's a great set of tools--we own it at work and managed our own Win2k -> WinXP upgrade using the PC Transplant and Deployment Server tools--but it can massively bone you if you don't do enough testing. PC Transplant, in particular, can hurt you--that's the application that lifts your profile off of one PC and slaps it down on another, so that you don't have to re-configure your Exchange settings, Office personalizations, backup documents and application settings and bookmarks, and a whole mess of other things. When doing an OS migration, if you don't design your personality transplant template correctly, you can end up with all kinds of Win2k-specific settings stuffed into your WinXP profile, which can lead to all kinds of crazy-ass problems.
From the article: Another source says that the DWP was trialing Windows XP on a small number ("about seven") of machines. "EDS were going to apply a patch to these, unfortunately the request was made to apply it live and it was rolled out across the estate, which hit around 80 per cent of the Win2k desktops. This patch caused the desktops to BSOD and made recovery rather tricky as they couldn't boot to pick any further patches or recalls. I gather that MS consultants have been flown in from the US to clear up the mess." EDS is also thought to be flying in fire brigades.
/.
Brilliant work on the part of EDS, trying to patch the wrong systems, lord only knows what can happen then.
You could force an XPSP2 onto a 2k machine... would you still blame Microsoft for it? That seems to be the case here, EDS screwed up, and of course it's Microsoft's fault in the eyes of
Help Brendan pay off his student loans
"On another note, How did upgrading seven machines to XP BSOD 60000"
If you read the register article, it says that they were attempting to only push the update out to 7 PCs, but it actually went to all 60,000.
I would imagine they were using something like Microsoft's SMS or BigFix to push out packages, and simply selected "push out to all" instead of a test community.
I don't think this is a nail in Microsoft's coffin; I have seen similar things happen in the mainframe world, where patches intended for dev hit live production systems with similarly bad consequences. It has to count as a bad day at the office for the person pushing the button though.
It also highlights the difficulty of pushing out big updates to major networks of PCs, be they running Windows or Linux. Moving from Win NT to XP has proved so complex in my organisation that for the future Longhorn upgrade and beyond we are now looking to Citrix to allow the migration of applications across servers, essentially using the PC as a thin client for all but core office and email apps.
When a government ends up with BSODs on 60000 computers, it can't be good for Microsoft.
Yea, I can just see them going bankrupt over this. Their coffin was half closed before, but now they're bound to be pennystock.
Obviously these sysadmins were incompetent. Everybody knows that a BSOD is impossible under Windows XP. If they had simply upgraded the other 60,000 machines to XP first, and then updated these 7 problem systems, this whole problem would easily have been avoided.
So ... 5 working days, 60,000 PCs (= 60,000 employees?)
Assume £8/hr employee. 40 hours of work a week. 60,000 unusable systems.
=> TCO increased by £19.2m for the 7 PCs they upgraded (before costs incurred fixing the problem)! Roughly £2.7m TCO per system for Windows XP, eh? A clear example that Windows TCO can increase rather horribly if something goes wrong, and this was a standard upgrade. It's £320 per PC if you count all 60,000 systems - that's still horrendous.
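For anyone who wants to check the sums, here is a minimal sketch of that back-of-the-envelope estimate. The wage, hours and machine count are the assumptions stated above; the pilot size of seven is the "about seven" figure from the Register, and the one-idle-employee-per-PC mapping is a guess, not a reported fact.

```python
# Back-of-the-envelope outage cost, using the assumptions from the comment.
HOURLY_WAGE_GBP = 8      # assumed wage
HOURS_PER_WEEK = 40      # five working days
AFFECTED_PCS = 60_000    # assumes one idle employee per dead PC
PILOT_PCS = 7            # "about seven" trial machines, per the Register


def outage_cost(wage, hours, machines):
    """Lost wages for `machines` idle employees over `hours` hours."""
    return wage * hours * machines


total = outage_cost(HOURLY_WAGE_GBP, HOURS_PER_WEEK, AFFECTED_PCS)
per_affected_pc = total / AFFECTED_PCS   # cost spread over every dead PC
per_pilot_pc = total / PILOT_PCS         # cost pinned on the pilot machines

print(f"total lost wages: £{total:,}")
print(f"per affected PC:  £{per_affected_pc:,.0f}")
print(f"per pilot PC:     £{per_pilot_pc:,.0f}")
```

This counts lost wages only; it ignores recovery costs and anyone who could still work without a PC.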
the UK Government's Department for Work and Pensions attempted to upgrade seven PCs from Windows 2000 to Windows XP, and ended up with BSODs on over 60,000 machines.
In actual fact, the Register quotes:
According to one, a limited network upgrade from Windows 2000 to Windows XP was taking place, but instead of this taking place on only a small number of the target machines, all the clients connected to the network received a partial, but fatal, 'upgrade.'
and then below it:
Another source says that the DWP was trialing Windows XP on a small number ("about seven") of machines. "EDS were going to apply a patch to these, unfortunately the request was made to apply it live and it was rolled out across the estate, which hit around 80 per cent of the Win2k desktops.
So, by merging them you get the following story:
There was a trial on seven PCs; instead of patching only those seven, the request to roll it out was accidentally performed live and every computer attempted to install a botched version of XP.
Somewhat different from the Slashdot version, wouldn't you say?
In addition, I'm pretty sure that if you accidentally deployed a botched version of the Linux kernel then it too would probably have a similar effect.
I wish I could take one of you Linux "experts" up on your idea. "Here, upgrade these 2000 PCs, all of which are from different manufacturers and different configurations, to Linux. I need it done in the off hours and I need everything to work like it did before."
*crickets*
Of course someone will reply and say "ok!" knowing it won't happen. It's not because I don't have the ability to make that decision but it's because I know better than to get real information/insight about IT from most /. posters.
It's painfully obvious that only a scant few here actually have a clue about running a business that relies on IT. It's more than ripping CDs and DVDs, kids. Sure, the company that made the mistake is at fault, but the problem is not in the chosen OS, it's in the chosen technicians and management.
When a government ends up with BSODs on 60000 computers, it can't be good for Microsoft.
No, but that doesn't necessarily mean it's bad for the rest of us!
Let's hope Congress plans to upgrade soon!
See? Even Microsoft is good for something!
I'm pretty embarrassed for my country right now. How the fuck did we go from technological pioneers to this? And it's only the tip of the iceberg, what with Ken Livingstone's numerous stupid ideas, David Blunkett's insanity and the incompetence of 100s of 'IT' projects (hint: if it's called an IT project it means it's run by incompetent MCSEs and it will fail catastrophically, leaving millions of people without a service or having planes crashing into the ground, time and time again) with tax money falling out of their pockets, fuck them! Why do these idiots get the contracts? What happened to all the competent people??
This comment does not represent the views or opinions of the user.
and you missed out big time. 4 years later you could have been naming your own price for Y2k fixes.
You'd probably be retired now! Pity you chose long hair, and have another 40 years of work to go.
was the government spokesperson. After the intro to this piece on Radio 4 this morning, her opening sentence was "Let me correct you, 20% of our workstations are functioning". Talk about a positive spin.
I once knew a bean-counter (quite senior) on nearly 3 times my engineer's salary. He was sat there in front of a spreadsheet adding up a column of numbers on a pocket calculator.
Welcome to the UK Public Sector. That was your tax money.
Stick Men
I have found that many MPs when questioned on anything related to technology simply say that "it is a complex issue", which to me isn't good enough when such huge amounts of money and such significant impact on people's lives are involved.
There is a huge contract that'll be up for grabs soon - EDS are preparing themselves to manage the UK national identity database and identity card scheme. This is one we could lobby our representatives on to ensure they do it right.
Where to have the debate where it might be read by those who matter:
Free service to fax your MP
Boris
Richard Allan
Tom Watson
Shaun Woodward
Citing the recent and ongoing failures such as the one in the article, the UK Child Support Agency's computer failure, and the UK NHS computer system.
UK Laptops
Read the article. EDS applied a patch intended to update 7 Windows XP boxes to 60,000 Windows 2000 machines. The TCO here applies to the contract to EDS, not the software. It's like saying that a prison guard intending to open one gate to let someone out accidentally opened all of the gates and then they blamed the door manufacturer.
Microsoft sells itself as easy to administer, which in management terms means that the systems are so /user friendly/ that any moron can administer them.
So admin stupidity can also be blamed on MS; it's part of the TCO studies that drive the decision to buy MS.
Aside from that, a point-and-click update shouldn't be able to fail so miserably. A script written by the admin, of course, could, because you can assume that someone smart (and bold) enough to write a little script is responsible for their decisions. Some guy clicking checkboxes shouldn't be able, by those means, to break 60000 computers through a /user friendly/ GUI program.
GUIs for dummies should have enough checks to prevent such undesirable effects; they have a sufficiently constrained domain to be able to do so. If the guy wanted to do a legal task that the tool doesn't allow, he could always write some Visual Basic Script, and then he would be on his own. Bringing down an organization by mis-clicking checkboxes is the responsibility of the guy who provided the checkboxes, too.
Something that makes me curious: you hear Ballmer go on about the lower TCO of Windows. You hear the Linux community shriek about its lower TCO. The bottom line is really this: if your sysadmins are less than competent and bugger up something like this, which system would have a lower cost to recover? This is a really good thing to know when you are considering any enterprise system. Call it TCCR (total cost of catastrophic recovery). Ballmer, Linux communities, answer me this!
All your database are belong to us
Don't worry, knowing Linux and the IT of the public sector, they'd have chmodded root to 777 long before any upgrade.
Yes. It's not like the upgrade could detect the version of the program it's being applied to, and only install if the version matches the version it is intended for. That is completely unheard of, and would be impossible technically.
This was sarcasm, FYI.
This situation is more analogous to a wrong signal causing the door to open and then jam. And yes, such a door manufacturer deserves to be blamed.
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
The question about all of this that I am left with is, how did the patch even install? Microsoft has had sanity checking on their patches for ages, checking not only the Windows version, but even service pack levels and any other prerequisites. Ever tried installing a patch intended for IE6-SP1 over plain IE6 for example? I'm assuming that this is some custom patch rolled by EDS, rather than an official Microsoft one downloadable by all and sundry. Still, the story appears to have made it onto UK prime time news, so no doubt more details will emerge...
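To make the idea of patch sanity checking concrete, here is a minimal sketch of a prerequisite check. `Target`, `PatchManifest` and `can_install` are hypothetical names invented for illustration; nothing here reflects how Microsoft's installers or EDS's package actually work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of the machine a patch is about to touch."""
    os_name: str        # e.g. "Windows 2000"
    service_pack: int   # e.g. 4


@dataclass(frozen=True)
class PatchManifest:
    """Hypothetical manifest shipped alongside a patch."""
    required_os: str
    min_service_pack: int


def can_install(patch: PatchManifest, target: Target) -> bool:
    """Return True only if the target meets every stated prerequisite."""
    return (target.os_name == patch.required_os
            and target.service_pack >= patch.min_service_pack)


# Illustrative values only: an XP patch meeting a pilot box and a Win2k desktop.
xp_patch = PatchManifest(required_os="Windows XP", min_service_pack=1)
pilot_box = Target(os_name="Windows XP", service_pack=1)
production_box = Target(os_name="Windows 2000", service_pack=4)

for box in (pilot_box, production_box):
    verdict = "install" if can_install(xp_patch, box) else "skip"
    print(f"{box.os_name} SP{box.service_pack}: {verdict}")
```

A check like this only helps if the package actually runs it before writing files, which is exactly what a custom-rolled patch might skip.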
UNIX? They're not even circumcised! Savages!
Easy, a dialog like this appeared:
"Do you want to update the machines on your network now?"
[Accept]
No cancel button.
--
Wiki de Ciencia Ficcion y Fantasia, un cuento por Fly.
you can call them senators if it makes you feel better.
Jeez, sometimes Slashdot readers are blind and zealous like headless chickens...
1. The patch they tried to update with wasn't a complete one for an OS upgrade.
2. Then they deployed it to their entire network by mistake.
This interesting piece of information can be gathered by RTFA.
I wonder what would happen to, say, Linux boxes if they had 60,000 and they applied an incomplete kernel patch?
Maybe some... thing... would panic?
Beware: In C++, your friends can see your privates!
...but is there any actual evidence it was a Microsoft error? I like bashing Windows as much as the next guy, but it seems this is at least as likely to be a huge fumble by the admins.
They're probably using something like Novadigm's Radia. And instead of linking the correct 7 PCs, they linked to all of them (misconfigured group). In that case, it's not a case of installing a patch through the newer mechanisms; the "Patch Manager" simply dumps the files onto all the machines that connect using its client, and forces an overwrite.
Given, they should actually have an install script that checks the OS before it actually dumps the install package on there, but hey.
Not normally an MS apologist, but this isn't really Microsoft's problem. It's the contracted company that made the update package failing to ascribe it to the right download group.
So, the analogy. It's like some perfectly good system being installed, and someone presses the button marked 'open all doors' instead of simply open door 7.
I don't see anyone really blaming the door manufacturer here (Microsoft or the contractors), although I'd hazard a guess that the person who skipped over the part of the process that said 'double check the groups you assign this patch to' will be sorely chastised...
Upgrades NEVER work! Not for Windows 95, 98, ME, 2000, XP, Longhorn, whatever! It will never be a good idea to try and replace a MS OS without doing a clean install.
This is first day stuff.
Without any specific details on the failure or what exactly happened, it seems like this is a huge admin error. My guess is they're using something like Altiris to do their builds, and if an admin were to accidentally "drop" the package meant for the test group on to the production group, wham-o... every PC starts installing a build that probably isn't meant for them, and won't work. And you can imagine how that would go.
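As a rough illustration of the kind of guard that could catch a package dropped on the wrong group, here is a minimal sketch. `push_package`, `resolve_targets` and the group names are all hypothetical; this is not the Altiris API or any real deployment tool, just the idea of making the operator state how many machines they expect to hit.

```python
class ScopeError(Exception):
    """Raised when a push would reach a different number of machines than declared."""


def resolve_targets(group, inventory):
    """Return the hostnames in `inventory` that belong to `group`."""
    return [host for host, groups in inventory.items() if group in groups]


def push_package(package, group, inventory, expected_count):
    """Hypothetical push: refuse unless the resolved scope matches expectations."""
    targets = resolve_targets(group, inventory)
    if len(targets) != expected_count:
        raise ScopeError(
            f"{package}: group '{group}' resolves to {len(targets)} machines, "
            f"operator expected {expected_count}; aborting"
        )
    return targets  # a real tool would start the deployment here


# Toy inventory: 7 pilot machines inside a larger estate of 100 desktops.
inventory = {f"pilot-{i}": {"xp-trial", "all"} for i in range(7)}
inventory.update({f"desk-{i}": {"all"} for i in range(100)})

print(len(push_package("xp-patch", "xp-trial", inventory, expected_count=7)))
try:
    push_package("xp-patch", "all", inventory, expected_count=7)
except ScopeError as err:
    print("blocked:", err)
```

The design choice is simply that the tool refuses to guess: a mismatch between the declared and resolved scope stops the push before anything is installed.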
As much as I'm sure the zealots among us would like to make this seem like a Windows failure, it looks like it's more of an example of how outsourcing leads to disconnected, incompetent, and unmotivated IT staff. And that, of course, leads to mishaps like this.
Either way, if you work for a company that brings EDS in house in any way, drop your shit and run. And don't look back. The flash could be blinding.
The public sector in the UK is nothing more than unemployment benefit for the middle classes.
In my experience (having worked for both), in terms of inefficiency and stupidity, there's only one thing worse than the British public sector and that's the British private sector.
My company used to be part of a large public sector concern and was sold off. Since then we seem to spend nearly all of our time/money:
Changing company logo and name every 6-12 months
Adding a new problem management system which we have to learn every 6 months (we currently have about 5 each of which was supposed to replace all the others).
Paying huge bonuses to upper management.
Paying huge car allowances to middle management including those who refuse to drive.
Not giving any rises under the so-called performance related pay scheme for 4 years despite meeting profit targets because all the money has gone on the above 2 items.
Making skilled people redundant then recruiting at vast expense people with the same skills 2 months later.
Making skilled people redundant then reemploying them at twice the pay as contractors for the next 2 years because they're still needed.
Repeatedly shuffling kit from datacenter to datacenter around the country at vast expense and disruption to our customers.
Ordering expensive buffets for management meetings, 95%+ of which get thrown away.
Managers having a schedule involving meetings all over the country which means that they spend about 25 hours out of 40 driving.
Managers refusing to use video-conferencing for meetings even in the light of the above.
How many of these things happened when I was in the public sector? Virtually none. We didn't have the money to throw around on such things. We were forced to be efficient.
Also if this private sector company I'm referring to was atypically inefficient, presumably it would do so badly it would collapse or be taken over. So this implies that many private sector companies are like this.
It's very easy to slag off the public sector if you use stereotypes, generalizations and distortions.
She added that the emergency payments system was "working perfectly."
Jones agreed, "I still have plenty of blank cheques. My pen is at room temperature."
Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.
This isn't Microsoft that ballsed it up, nor is it inherently the fault of DWP. Chances are it's an underpaid sysadmin somewhere who hit the wrong checkbox when rolling out the patch.
If someone can manage this by selecting the "wrong checkbox" then the system is broken by design.
Microsoft sell a complex system with the claim idiots can administer it. The DWP employ/contract idiots to administer a complex, but vital, system. Neither of these are "innocent parties".
If you give a chimp an Uzi with a defective trigger mechanism and a bunch of people get shot, whose fault is it: the chimp's or the Uzi's? My first networking experience was with AppleTalk; plug it in and you had a network. I was subsequently required--with co-worker--to learn everything we could about Windows networking so we could implement it in one of our products.
My co-worker and I spent the next period AMAZED that Windows networking even worked at all. The system of domain controllers and WINS servers and browse lists and host files... it's too byzantine to be believed. There is, without doubt, a corporate network somewhere that could be completely undone by someone opening a wireless laptop in the wrong place at the wrong time. Add Windows XP and the attendant SP2 fun they're having and you get chaos.
Yes, those delightful folks at EDS are the chimps in this scenario, but Microsoft's products are definitely the defective Uzi. And I note that the BBC News article studiously avoided mentioning either of them. Hmm... Microsoft wouldn't be doing everything it can to tamp down this PR disaster, would it?
Naaah!
It's much more reliable to back up your data and do a fresh install. I experiment with upgrades, but even (or especially) with Linux, I prefer to clean the disk and start fresh. Apple, on the other hand (before OS X anyway, don't know if it still is), was great. It would just create a clean new system folder. With the old one still there, I could just "bless" it if necessary. Oh, well... There's still nothing more trustworthy than pen and paper, and a good ol' mimeograph machine (the hand crank variety) for makin' copies... And they smell great.
What?