Slashdot Mirror


How Would You Handle a $1,000,000 Coding Error?

theodp writes "The Chicago Tribune's efforts to upgrade its computer system over the weekend turned into a fiasco when the system crashed, halting all printing operations and leaving about half of the Trib's subscribers without papers. The software contained 'a coding error,' according to a spokesman who estimated the cost to resolve the problem at 'under $1 million.' Any advice for the poor schmuck who's going to get the blame?"

10 of 878 comments

  1. It's my first week! by Fubar420 · · Score: 5, Insightful

    Well, OK, so that might not fly, but hey, it works when it's true, if you work for a modestly forgiving employer...

    Now if the cause was insufficient testing, well then QA has to answer for it.

    And if there's no QA, well that's management's fault...

    Now if it all comes down to dumb circumstances, it's poor planning on the paper's part for not testing it themselves ;-)

    That said, fess up. If worst comes to worst, you now have national infamy, and any fame is good fame, right??

    1. Re:It's my first week! by Soko · · Score: 5, Insightful

      I'm giving up moderation on this story to post this, so listen the fuck up.

      I work in newspapers, and have for the past 7 years. The blame for this fiasco should be pinned directly on the project manager. Not the coders, not the people trying to get the thing running, but the project manager. Right in the middle of his fucking forehead.

      I've torn the guts out of many newspaper networks upgrading or improving them, but never have I ever put anyone in the position of "If the new system doesn't work, we're fucked." I've always made ab-so-fucking-lutely certain there was a fallback position where the paper would hit the press. I actually had this conversation before:

      <Management weenie> What happens if this new server fails?
      <me> I haven't touched the old server. If the new one hiccups one whit, we fire up the old box and produce product.
      <Management weenie> I don't like that - we've spent a million bucks on the new gear. Delays make me look bad.
      <me> Well, if you're willing to man the phones when the advertisers call demanding re-prints of their ads because of human error somewhere, I have no problem with it.
      <Management weenie> You're an asshole. I could have you fired.
      <me> In this instance, I'm paid to be an asshole. You can't fire me for doing my job.
      <Management weenie> Heh. OK, we'll go with your plan.

      Not planning some way to get the paper on the press is dereliction of duty, and deserves your professional head to be lopped off.

      Is there _no_ professionalism anymore? Fuck, I should be paid more. Morons like that burn me - when you blow up a critical system with no backup, it's not just your livelihood at stake, but that of everyone who depends on that system functioning as needed - it's their livelihood as well. Fucking morons.

      Soko

      --
      "Depression is merely anger without enthusiasm." - Anonymous
  2. Testing? by buff_pilot · · Score: 5, Insightful

    Where was the pre-install testing?

    A good test should have identified some errors, especially if it blew up IMMEDIATELY.

    1. Re:Testing? by ryen · · Score: 5, Insightful

      I agree.
      Blame the project manager (hopefully there was one) who should have led thorough testing of the services before deployment. Individual coders shouldn't be held to any legal liability.
      Any legal action should be directed towards the 'outside provider' (as noted in the article).

  3. 1 million is not that much by Anonymous Coward · · Score: 5, Insightful

    Management frequently makes mistakes which cost much more. The difference is that their mistakes are not as easily identified or attributed to a single person.

    The culprit should just admit it. Shit happens, it's unavoidable even if you take all precautions. Don't make the same mistake again, though.

  4. Re:Dogbert Strategy by Pulse_Instance · · Score: 5, Insightful

    In my experience being honest about your mistakes and having the willingness to learn from them always pays off.

  5. No one person should be at fault by David Frankenstein · · Score: 5, Insightful

    With any large rollout, if only one person is at fault for a fiasco like this, then the project was mismanaged. They should have had a plan in place to back out the change.

  6. Deployment? by BiggerIsBetter · · Score: 5, Insightful

    Where was the phased or parallel deployment?

    You don't just change a system like that in a weekend. There WILL be problems, so you have to have ways of dealing with them. Maybe that means flicking the switch back to the old system if it fails, or maybe it means running with degraded capacity for a while, but whatever it is, dead-in-the-water is not your Plan B.

    --
    Forget thrust, drag, lift and weight. Airplanes fly because of money.
  7. planning? by twitter · · Score: 5, Insightful

    "A good test should have identified some errors, especially if it blew up IMMEDIATELY."

    Good planning would have had an abort procedure, so the show would go on. Everything changed should be undone if it did not work. They could figure it out after the paper was printed.

    Errors are inevitable. Good planning and implementation keep you from falling on your face even when you publish seven days a week. It's not the coder's fault.

    --
    Friends don't help friends install M$ junk.

  8. Testing is Boring by PingPongBoy · · Score: 5, Insightful

    Software testing is boring boring boring. You have to try things out again and again after each change. Modules that haven't changed earn confidence they may not deserve and might not be retested after changes elsewhere, but omitting those tests can end up being the Achilles' heel. There can be an overwhelming desire when a project nears completion to just get things done and over with. After all, the hard problems may well be solved, and it's all down to seemingly inconsequential details.

    These days programmers have a Sword of Damocles hanging over them. Once they finish a major piece of code they may have a hard time finding new work. The economy has not lived up to forecasts of more jobs. Outsourcing has reduced computer opportunities. Management at many companies does not see new uses for computers. Off-the-shelf programs abound for almost every aspect of computerized work.

    Stress may distract software engineers enough that someone will make a major mistake.

    --
    Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.