How Would You Handle a $1,000,000 Coding Error?
theodp writes "The Chicago Tribune's efforts to upgrade its computer system over the weekend turned into a fiasco when the system crashed, halting all printing operations and leaving about half of the Trib's subscribers without papers. The software contained 'a coding error,' according to a spokesman who estimated the cost to resolve the problem at 'under $1 million.' Any advice for the poor schmuck who's going to get the blame?"
Well, ok so that might not fly, but hey, it works when it's true if you work for a modestly forgiving employer...
;-)
Now if the cause was insufficient testing, well then QA has to answer for it.
And if there's no QA, well that's management's fault...
Now if it all comes down to dumb circumstances, it's poor planning on the paper's part for not testing it themselves.
That said, fess up. Worst comes to worst, you now have national infamy, and any fame is good fame, right??
Where was the pre-install testing?
A good test should have identified some errors, especially if it blew up IMMEDIATELY.
Management frequently makes mistakes which cost much more. The difference is that their mistakes are not as easily identified or attributed to a single person.
The culprit should just admit it. Shit happens, it's unavoidable even if you take all precautions. Don't make the same mistake again, though.
In my experience being honest about your mistakes and having the willingness to learn from them always pays off.
With any large rollout, if only one person is at fault for a fiasco like this, then the project was mismanaged. They should have had a plan in place to back out the change.
Where was the phased or parallel deployment?
You don't just change a system like this in a weekend. There WILL be problems, so you have to have ways of dealing with them. Maybe that means flicking the switch back to the old system if it fails, or maybe it means running with degraded capacity for a while, but whatever it is, dead-in-the-water is not your Plan B.
Forget thrust, drag, lift and weight. Airplanes fly because of money.
Good planning would have had an abort procedure, so the show would go on. Everything changed should be undone if it did not work. They could figure it out after the paper was printed.
Errors are inevitable. Good planning and implementation keep you from falling on your face even when you publish seven days a week. It's not the coder's fault.
Friends don't help friends install M$ junk.
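The abort procedure suggested above, where everything changed gets undone if the upgrade does not work, could be sketched roughly like this. It is only an illustration under stated assumptions: `run_upgrade`, the step names, and the `state` dictionary are all hypothetical, and nothing here reflects how the Tribune's system actually worked.

```python
from typing import Callable, List, Tuple

# Each step is (name, apply, undo). All names here are hypothetical.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_upgrade(steps: List[Step], log: List[str]) -> bool:
    """Apply steps in order; if one fails, undo the completed ones in reverse."""
    done: List[Step] = []
    for step in steps:
        name, apply, _undo = step
        try:
            apply()
        except Exception as exc:
            log.append(f"FAILED {name}: {exc}")
            # Assumption: a failed step leaves nothing behind, so only the
            # steps that completed earlier need to be undone.
            for prev_name, _apply, undo in reversed(done):
                undo()
                log.append(f"undid {prev_name}")
            return False
        done.append(step)
        log.append(f"applied {name}")
    return True


# Toy stand-in for the production system.
state = {"schema": "old", "press_driver": "old"}


def upgrade_schema() -> None:
    state["schema"] = "new"


def restore_schema() -> None:
    state["schema"] = "old"


def upgrade_driver() -> None:
    raise RuntimeError("coding error")


def restore_driver() -> None:
    state["press_driver"] = "old"


log: List[str] = []
ok = run_upgrade(
    [
        ("schema", upgrade_schema, restore_schema),
        ("press_driver", upgrade_driver, restore_driver),
    ],
    log,
)
```

When the second step blows up, the first one is rolled back and the old system is intact again, so the paper can still be printed while someone figures out what went wrong.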
Software testing is boring boring boring. You have to try things out again and again after each change. Modules that haven't changed are assumed to still work and might not be retested, but omitting those tests can end up being the Achilles heel. There can be an overwhelming desire when a project nears completion to just get things done and over with. After all, the hard problems may well be solved and it's all down to seemingly inconsequential details.
These days programmers have a Sword of Damocles hanging over them. Once they finish a major piece of code they may have a hard time finding new work. The economy has not lived up to forecasts of more jobs. Outsourcing has reduced computer opportunities. Management at many companies does not see new uses for computers. Off-the-shelf programs abound for almost every aspect of computerized work.
Stress may distract software engineers enough that someone will make a major mistake.
Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.
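The point made above about skipping tests for modules that haven't changed can be illustrated with a small sketch. Everything in it is made up for the example (`format_headline`, `page_layout`, `run_checks`); it only shows how a change in one module can break an untouched module that depends on it.

```python
from typing import Callable, Dict, List


def format_headline(text: str) -> str:
    # The "changed" module: a recent edit started upper-casing headlines.
    return text.strip().upper()


def page_layout(headline: str) -> str:
    # The "unchanged" module: it still depends on format_headline.
    return f"[{format_headline(headline)}]"


def check_format_headline() -> None:
    assert format_headline("  news ") == "NEWS"


def check_page_layout() -> None:
    # Written before the change, when headlines kept their original case.
    assert page_layout(" News ") == "[News]"


ALL_CHECKS: Dict[str, List[Callable[[], None]]] = {
    "format_headline": [check_format_headline],
    "page_layout": [check_page_layout],
}


def run_checks(modules: List[str]) -> List[str]:
    """Run the checks for the given modules and return the names that failed."""
    failures: List[str] = []
    for module in modules:
        for check in ALL_CHECKS[module]:
            try:
                check()
            except AssertionError:
                failures.append(check.__name__)
    return failures


# Testing only the module that was edited looks clean...
only_changed = run_checks(["format_headline"])
# ...while the boring full run catches the breakage in the untouched module.
full_suite = run_checks(list(ALL_CHECKS))
```

Here `only_changed` comes back empty, while `full_suite` reports `check_page_layout`, which is exactly the kind of failure that gets missed when unchanged modules are skipped.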