Slashdot Mirror


History's Worst Software Bugs

bharatm writes "Wired has an article on the 10 worst software bugs. From the article: 'Coding errors have sparked explosions, crippled interplanetary probes -- even killed people. Here's our pick for the 10 worst bugs ever, but the judging wasn't easy.'"

29 of 645 comments

  1. Predictions are hard by Teppy · · Score: 4, Interesting

    Wonderful article. Twenty years ago I believed that writing software would soon become a licensed profession. (Need a
    license to own a compiler, for instance.) I thought that the event that would inevitably trigger this is when a software
    bug caused a human death.

    I still believe that programming will eventually require a license, but I now think that lobbying by the big media
    companies will be the cause. Depressing, huh?

    1. Re:Predictions are hard by j-cloth · · Score: 3, Interesting

      What constitutes programming is so blurry, though.
      Does it count when someone puts some HTML in a blog? What about JavaScript? A DIY PHP site? A batch file or shell script? An Excel function/macro?
      Do you only want to licence compilers? How do I install my OSS? What about the power of interpreted or JIT languages? So much can be done with uncompiled code.

    2. Re:Predictions are hard by idontgno · · Score: 5, Interesting
      I can't cite any documentation, but I recall seeing studies which show that the number one critical attribute of persistently optimistic personalities is a chronic inability to clearly see reality. Is this the same phenomenon?

      In the words of the old chestnut, "If you're calm and confident when everyone around you is running around in blind panic, you clearly don't understand the situation."

      --
      Welcome to the Panopticon. Used to be a prison, now it's your home.
    3. Re:Predictions are hard by servicemaster · · Score: 2, Interesting

      Often I find it's a difference in personality type. Using the Keirsey Temperament Sorter you see where people are separated by how they perceive the world. One aspect of personality is how you perceive the world to be, either in concrete or possible terms. Objectively the two are not entirely opposite, it's just in how the world is organized. One views their surroundings as possibilities, the other as actualities. The actual view is the same, but the actions and course determined are different. I.e., if you're calm and confident when everyone around you is running around in a *BLIND* panic... it's just as likely you can see something they can't.

    4. Re:Predictions are hard by TemporalBeing · · Score: 2, Interesting

      We've all seen it: the employee who's convinced she's doing a great job and gets a mediocre performance appraisal, or the student who's sure he's aced an exam and winds up with a D

      In all reality, that is hardly a way to rate someone. Some people, despite how good they may be at the subject, just don't test well - the student could very well be a genius at the subject and still flunk.

      As for performance reviews - you have to have an accurate picture of what you are going to be reviewed on in order to do well in the review. This is also flawed, as many reviewers don't describe or tell how they will actually review someone until after the review has been done, so the person being reviewed has no concept of what they can do to get a good review. As a result, people will think they are going to be rated well, and end up rated poorly because the person rating did not clarify well enough how they were going to do the rating.

      As I said, both of those are majorly flawed ways to evaluate someone. Hopefully, the guy from Cornell took that into account, but I wouldn't be surprised if he didn't. And even if he did, he would have to take it into account so many different ways that it would be too hard to really test its accuracy.

      --
      Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
  2. They are just very, VERY careful. by Poromenos1 · · Score: 1, Interesting

    When you are writing software for life-critical applications, there are various tools and techniques that ensure bug-free code. Just look at all the airplanes, powerplants, car computers, etc. It's not very usual at all to see one fail critically.

    --
    Send email from the afterlife! Write your e-will at Dead Man's Switch.
    1. Re:They are just very, VERY careful. by Coryoth · · Score: 3, Interesting

      When you are writing software for life-critical applications, there are various tools and techniques that ensure bug-free code. Just look at all the airplanes, powerplants, car computers, etc. It's not very usual at all to see one fail critically.

      When you are writing software for life-critical systems, there are methods you can follow that allow you much greater assurance of correct code and drastically reduce the testing burden (by being able to prove that certain classes of errors don't exist in the code). It's akin to static types, which allow you to statically catch a lot of type errors, reducing the need to spend time testing for possible type errors.

      The languages and methods used are things like SPARK and the B-Method. The beauty of systems like SPARK is that they provide a degree of flexibility in how much work you go to depending on how much extra assurance you want. You can do anything from simply specifying critical portions of code with a little extra formality (basically extended static checks beyond what type checking alone can give you) through to fully specifying everything and doing formal proofs for the whole system. You can tailor the effort and assurance to the needs of the project.

      (This time without that dangling link - that'll teach me not to preview)

      Jedidiah
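
      As a rough illustration of the "little extra formality" idea above: SPARK checks contracts statically, before the program ever runs. Python has no such prover, so the sketch below only checks a precondition and a postcondition at run time. It is an analogy, not SPARK, and the `contract` decorator and `clamp_all` function are hypothetical names invented for this example.

```python
from functools import wraps


def contract(pre, post):
    """Attach a precondition and a postcondition to a function.

    Hypothetical helper: the checks happen at run time, unlike SPARK,
    which aims to prove such conditions statically.
    """
    def decorate(func):
        @wraps(func)
        def wrapper(*args):
            if not pre(*args):
                raise ValueError(f"precondition failed for {func.__name__}{args}")
            result = func(*args)
            if not post(result, *args):
                raise AssertionError(f"postcondition failed for {func.__name__}{args}")
            return result
        return wrapper
    return decorate


@contract(
    pre=lambda values, limit: limit > 0 and all(v >= 0 for v in values),
    post=lambda result, values, limit: all(0 <= r <= limit for r in result),
)
def clamp_all(values, limit):
    """Clamp every non-negative reading to at most `limit`."""
    return [min(v, limit) for v in values]
```

      Calling `clamp_all([1, 5, 9], 6)` passes both checks, while `clamp_all([-1], 6)` is rejected before the body runs. A static tool would aim to reject the bad call at analysis time instead.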

  3. Moth. by Poromenos1 · · Score: 4, Interesting

    The moth was trapped, removed and taped into the computer's logbook with the words: "first actual case of a bug being found."

    Why would they say that, if the term "bug" didn't exist? I mean, you wouldn't find a rat in your car and say "First actual case of a car 'rat' being found" if you didn't use it as a term to indicate something. You'd just say "this bug caused computing errors". I smell a car rat.

    --
    Send email from the afterlife! Write your e-will at Dead Man's Switch.
  4. Re:Microsoft's striking absence by IdleTime · · Score: 2, Interesting

    Yes, I saw that too and I guess they have forgotten the most devastating MS bug which is present in all releases from NT 3.1 and at least up to 2k. I haven't tested XP.
    I couldn't find the description right now, but I'm sure others know the bug. The one where you can basically type a special text file using the type command or similar and it will BSOD the machine. The file consists of tabs, spaces and newline/carriage return pairs and nothing else. MS never fixed the bug.

    --
    If you mod me down, I *will* introduce you to my sister!
  5. Re:You get what you pay for by jurt1235 · · Score: 2, Interesting

    If continuing education for programmers was a priority, quality would be better.

    This also requires more than the current courses, which are pretty much starter-level courses. It is sad that after a few days of working with a language before a course, you will already find mistakes/bugs or just better ways to do it than what is promoted in the course.

    For example after a 3 day crash course (I missed day 1, else it would have been 4 days), I became a certified Stellent developer. So a "real" test at the end to determine if you are worth it or not.

    --

    My wife's sketchblog Blob[p]: Gastrono-me
  6. Re:omg by Jason+Hood · · Score: 2, Interesting

    And I suppose if I had a "broken" gun in my basement and you broke in and stole it, then tried to use it and injured yourself, you could sue me, right?

    Sorry, I am having a hard time seeing the correlation to terrorism here. It seems that you have a predisposition to the US's stance on terror and are desperately trying to make a connection for a political statement. Unfortunately, typical slashdot readers will agree with you =)

    This would be very different if the US broke into the USSR and altered their software to malfunction. That definitely would be a criminal act, but perhaps more importantly an act of war. I doubt the Soviets ever figured out what happened until they were told.

    --
    Are you intolerant of intolerant people?
  7. Re:omg by Yahweh+Doesn't+Exist · · Score: 1, Interesting

    I don't think you can justify the largest non-nuclear explosion ever just because it was a "side-effect" of economic damage. otherwise it becomes very easy to justify 9/11 since all the targets were economic/military.

    on a much smaller scale, I think it's illegal here in UK to set "traps", for example a landmine in your house in case of thieves breaking in. I believe the reasoning is the indiscriminate nature - it could kill a fireman trying to save the house from burning down.

    similarly, even in war, indiscriminate killing is ethically wrong and I doubt the gas workers were wearing military uniforms (and I guess the US still pretended to care about the Geneva convention back then)

  8. Re:Time for stressing more on formal specification by Anonymous Coward · · Score: 1, Interesting

    That's easier said than done. After all, buggy software is usually better than no software. But who's to say that it will even prevent the problem?

    Mariner I software was correct, but failed because the software was incorrectly typed into the computer. The Ariane 5 software was correct when it was written for Ariane 4. The only way to find that bug is with simulation of the whole system. The Therac software was correct because it was part of a system of hardware interlocks. Later machines took half the system without replacing the other half and people expected it to work the same way.

    Formal specs won't help you if your software is not being used as designed or if the designer can't know all possible inputs (such as fly-by-wire software for aircraft).

    dom

  9. Medical Systems by koehn · · Score: 5, Interesting

    I designed and built a diagnostic radiology workstation (in 1997, in Java 1.1, 4x5 megapixel monitors, still in use today). During the development effort we were regaled with stories of software glitches in medical systems resulting in disaster. It really keeps you focused.

    In one case, a radiation treatment system had a bug where if you used the backspace key when entering the dose a patient received, the display would show you deleted the last digit, but internally you hadn't. So the patient would receive 10^backspace times the intended dose of radiation. Not a big deal normally, since the techs would typically shut the machine off between treatments. Until one day when they had two patients needing treatment back to back. The tech knew something was wrong when the machine was running for an unusually long time. The patient knew something was wrong when he died.

    On our team a defect that crashed the system was considered severity 2. Severity 1 was reserved for defects that could result in a mis-diagnosis, which most patients agree is worse than a crash.
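
    The backspace failure in the story above comes down to keeping two copies of the same value and editing only one of them. The sketch below shows that pattern and one way to avoid it. It follows the anecdote as told here, not the real machine's code, and every class and function name in it is hypothetical.

```python
class BuggyDoseEntry:
    """Keeps a display string and an internal string that can drift apart."""

    def __init__(self):
        self.display = ""
        self.internal = ""

    def type_digit(self, digit):
        self.display += digit
        self.internal += digit

    def backspace(self):
        # Bug: only the on-screen text is edited; the internal value keeps the digit.
        self.display = self.display[:-1]

    def dose(self):
        return int(self.internal)


class FixedDoseEntry(BuggyDoseEntry):
    """Single source of truth: the display is derived from the internal value."""

    def backspace(self):
        self.internal = self.internal[:-1]
        self.display = self.internal


def enter(entry, keys):
    """Feed a key sequence to an entry field; '<' stands for backspace."""
    for key in keys:
        if key == "<":
            entry.backspace()
        else:
            entry.type_digit(key)
    return entry
```

    Feeding the keys `200<` to both fields leaves `20` on each display, but `BuggyDoseEntry.dose()` still returns 200 while `FixedDoseEntry.dose()` returns 20: one backspace, a factor of ten.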

  10. Re:only 10? by ScentCone · · Score: 2, Interesting

    I think it would better be called terrorism.

    Why? Because code that the Soviets stole from the US turned out to be (from their perspective) defective? I don't think it's terrorism if my car blows up while you, having stolen it, are driving it around.

    More to the point, though, the CIA's objective was to cripple the cash flow of the Soviet Union, an entity that really was busy terrorizing much of the world. Their murderous, oppressive grip on Eastern Europe and attempts at foisting their cheerful utopia on South America and Africa weren't going to get anywhere without the cash they were trying to raise by selling Siberian natural gas to the west. Making the Soviet government's cold cash sales operation less workable for them was part of what finished pulling the rug out from under that hellish government. That they so desperately needed western cash was a sign of how hollow that regime actually was, and that event just added clarity to the picture. I doubt the CIA expected that exact outcome, but you never really know what someone's going to do with the stuff they steal from you. Makes you wonder what's ticking under the hood in North Korea's squalid little IT universe, doesn't it? No doubt our team, and China's as well, have planted similar things in case they're needed. Tactics like that are going to be more subtle now, probably.

    --
    Don't disappoint your bird dog. Go to the range.
  11. Re:The meat of the article... by arrow014 · · Score: 5, Interesting

    I actually did a research report on the Therac-25 incident while I was in Software Engineering class a few semesters ago (I was also in Technical Writing at the time, so I could kill two assignments with one report!) ;-) The details of the incident(s) are actually quite fascinating and sometimes spine-chilling.

    Here's the report in PDF if anyone's interested: reportfinal.pdf

    And in HTML for those of you who prefer it: link

  12. Re:Peer review for software? by tree_frog · · Score: 2, Interesting

    And what makes you think that phone network software isn't peer reviewed?

    regards,
    treefrog

  13. 2003 North American Power Outage???? by darthnoodles · · Score: 5, Interesting
    en.wikipedia.org/wiki/2003_North_America_blackout

    From Wiki page:

    It also found that FirstEnergy did not take remedial action or warn other control centers until it was too late because of a bug in the Unix-based General Electric Energy's XA/21 system that prevented alarms from showing on their control system, and they had inadequate staff to detect and correct the software bug. The cascading effect that resulted ultimately forced the shutdown of more than 100 power plants.

  14. intentional bug? by GuyinVA · · Score: 2, Interesting

    "1982 -- Soviet gas pipeline. Operatives working for the U.S. Central Intelligence Agency allegedly plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline"

    Can this really be considered a bug? It was an intentional software error.

  15. Not a bug but I think this is appropriate by Iphtashu+Fitz · · Score: 4, Interesting

    My dad tells this story from time to time. I don't know if it's true, but it makes a good story. Back in the early days of computers when only big corporations had them, most software was written in-house by staff programmers. One of the major soda manufacturers had a new mainframe and had one of their top programmers write an accounting package for them. It so happens that the manufacturer was a major competitor of 7-Up. Well, for whatever reason the programmer left the company on not-too-good terms. The very next time the manufacturer went to print out a report from the accounting package, every 7th page contained the phrase "Drink 7-Up" in big block letters. They had their remaining programmers go back through the code and try to remove this new "feature" but they were unable to. This guy was so good that he'd embedded the logic for this nastygram right into the actual logic of the accounting package. Supposedly there was code that would dynamically generate other instructions that, when executed, would generate other instructions, etc. They were supposedly unable to get rid of the 7-Up message without breaking other parts of the program, so they ended up having to go back to square one and write a whole new accounting package.

    So the story goes...

  16. Did the original post actually quote correctly? by theshowmecanuck · · Score: 2, Interesting
    It would be nice if someone quoting a post in an article to sensationalize, at least made sure the quote was not misleading or wrong... there were no satellites in space during World War One, so of course the Halifax Explosion (which really was the largest non-nuclear explosion recorded) was not the largest non-nuclear explosion seen from space.

    From the post:

    The resulting event is reportedly the largest non-nuclear explosion in the planet's history.

    The actual quote from a hyperlink in the article mentioned in the post:

    "The result was the most monumental non-nuclear explosion and fire ever seen from space"

    The actual largest non-nuclear explosion occurred during World War One in Halifax Harbour when a munitions ship collided with another ship and exploded. It is known as the Halifax Explosion. It was picked up on seismographs and created an 18 metre tsunami.

    --
    -- I ignore anonymous replies to my comments and postings.
  17. Understanding the end user by gmerideth · · Score: 2, Interesting

    Years ago, while working on a project for a medical firm, I found out first hand just how horribly things can go wrong with what we eventually agreed was a "bug" but was more of a "human bug" issue that made me sit up and realize that it's not just programmers who will use our programs.

    Without getting too detailed, the end users were allowing certain conditions to go unchecked as the software was telling them it was "OK". There was a rather neat explosion (read, small) that hurt nobody and damaged some equipment because instead of being "OK" it was telling the operator that there was exactly "ZERO K" of space available for data storage on a recording device and the test needed to be shut down.

    Now, the operators were told that when the counter got low they would see a warning and be told to stop the tests so, was it a bug, was it my assumption that these 11.95/hour service techs would "understand" what "0K" means from "OK" (that's a zero(0) and an O there)? Either way, there was some damage, we had a bit of a laugh, but at least nobody got irradiated and died.

    --
    Why do overlook and oversee mean opposite things?
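
    One way to avoid the "0K" versus "OK" confusion described above is to never show a bare number and unit where a status word is expected. The sketch below is only an illustration: the function name, the message wording and the 64 KB warning threshold are all assumptions, not details from the original system.

```python
def storage_status(free_kilobytes, warn_below_kb=64):
    """Return an operator-facing message that cannot be misread as 'OK'.

    warn_below_kb is a made-up threshold for this example.
    """
    if free_kilobytes < 0:
        raise ValueError("free space cannot be negative")
    if free_kilobytes == 0:
        # Spell out the condition and the required action, not just "0K".
        return "STOP TEST: recorder storage is FULL (0 KB free)"
    if free_kilobytes < warn_below_kb:
        return f"WARNING: only {free_kilobytes} KB free on recorder"
    return f"Recorder storage: {free_kilobytes} KB free"
```

    The zero case leads with the action the operator has to take, and the digit is always followed by a space and "KB", so it no longer looks like a status word at a glance.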
  18. Re:And don't forget your roots... by drinkypoo · · Score: 4, Interesting

    Right, but back then you had to know how they worked to operate them. In fact I've never seen a modernized steam engine that ran itself. You couldn't even crest a hill too fast, or you'd have a flash in the boiler and blow the thing up, potentially killing people who weren't even in or near the engine at the time since there's a lot of energy involved in phase change and the boiler parts are all heavy. Thus steam engineers actually knew something, or they (and many people around them) were in a lot of trouble.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  19. Re:Whatever happened to the US Navy? by Anonymous Coward · · Score: 1, Interesting

    > First, the ship did not need to be towed back into port, though it did sit dead in the water for a bit ..

    "The ship had to be towed into the Naval base at Norfolk, Va., because a database overflow caused its propulsion system to fail, according to Anthony DiGiorgio, a civilian engineer with the Atlantic Fleet Technical Support Center in Norfolk."

    "Using Windows NT, which is known to have some failure modes, on a warship is similar to hoping that luck will be in our favor," DiGiorgio said

    Curiously enough DiGiorgio later wrote a retraction and 'resigned' from the Navy as did Vice Adm. Henry Giffin.

    "DiGiorgio denies reported statements"

    "I did not say that the Yorktown was towed into Norfolk"

    http://www.gcn.com/17_20/news/33292-1.html

    "Ron Redman, deputy technical director of the Fleet Introduction Division of the Aegis Program Executive Office, said there have been numerous software failures associated with NT aboard the Yorktown."

    "Refining that is an ongoing process," Redman said. "Unix is a better system for control of equipment and machinery, whereas NT is a better system for the transfer of information and data. NT has never been fully refined and there are times when we have had shutdowns that resulted from NT."

    "The Yorktown has been towed into port several times because of the systems failures" [Ron Redman - Aegis]

    "This is the only time this casualty has occurred and the only propulsion casualty involved with the control system since May 2, 1997, when software configuration was frozen," Vice Adm. Henry Giffin

    > Second, the problem was in the software running on top of Windows

    But the software made a call to Windows to divide by zero and Windows made a call to the fpu which did just that.

    http://www.slothmud.org/~hayward/mic_humor/nt_navy.html

    http://www.jerrypournelle.com/reports/jerryp/yorktown.html
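
    Whichever layer gets the blame, the trigger discussed above was a zero reaching a division. A minimal sketch of rejecting such a value when it is entered, before anything divides by it, could look like this. The names `InvalidEntry`, `validate_divisor` and `ratio` are hypothetical and have nothing to do with the actual shipboard software.

```python
class InvalidEntry(ValueError):
    """Raised when an operator-entered value would break a later calculation."""


def validate_divisor(name, value):
    """Reject zero (or non-numeric) input at entry time, before it is stored."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        raise InvalidEntry(f"{name} must be a number, got {value!r}") from None
    if number == 0:
        raise InvalidEntry(f"{name} must not be zero")
    return number


def ratio(numerator, divisor_entry):
    """Divide only after the divisor has passed validation."""
    return numerator / validate_divisor("divisor", divisor_entry)
```

    With this in place a bad entry produces one rejected input and an error message for the operator, rather than an exception deep inside a control loop.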

  20. this was not a bug by Anonymous Coward · · Score: 1, Interesting

    According to several docs, this system was taken down by mod.

    January 15, 1990 -- AT&T Network Outage. A bug in a new release of the software that controls AT&T's #4ESS long distance switches causes these mammoth computers to crash when they receive a specific message from one of their neighboring machines -- a message that the neighbors send out when they recover from a crash.

    One day a switch in New York crashes and reboots, causing its neighboring switches to crash, then their neighbors' neighbors, and so on. Soon, 114 switches are crashing and rebooting every six seconds, leaving an estimated 60 thousand people without long distance service for nine hours. The fix: engineers load the previous software release.
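
    The failure mode in the quoted description (a recovery message that crashes whoever receives it) can be shown with a toy simulation. This is a heavily simplified sketch: it ignores timing and treats every switch as always able to receive. The function name, the four-switch ring in the usage note and the event cap are all assumptions made for illustration.

```python
from collections import deque


def simulate_cascade(neighbors, first_crash, buggy=True, max_events=1000):
    """Count the crashes triggered by one initial crash.

    neighbors: dict mapping a switch name to the switches it talks to.
    buggy: if True, a "recovered" message crashes whoever receives it.
    max_events: cap on total crashes, since the buggy network never settles.
    """
    crashes = {name: 0 for name in neighbors}
    recovering = deque([first_crash])  # switches about to reboot and announce it
    crashes[first_crash] += 1
    events = 1
    while recovering and events < max_events:
        switch = recovering.popleft()
        for peer in neighbors[switch]:  # the "I'm back" message goes to every neighbor
            if buggy and events < max_events:
                crashes[peer] += 1
                recovering.append(peer)
                events += 1
    return crashes
```

    On a ring such as `{"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}`, calling the function with `buggy=False` records a single crash, while the default keeps every switch crashing until the cap is hit. That is the same shape as the outage, where rolling back to the previous release removed the crash-on-message behaviour.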

  21. Re:And don't forget your roots... by Anonymous Coward · · Score: 2, Interesting

    Engines have existed for centuries. Roman engineers, for example, built siege engines (ballistas, and the like).

    Dictionary.com
    Engineer
    [Middle English enginour, from Old French engigneor, from Medieval Latin ingenitor, contriver, from ingenire, to contrive, from Latin ingenium, ability. See engine.]

    Engine
    n. 1.1. A machine that converts energy into mechanical force or motion.
          2.1. A mechanical appliance, instrument, or tool: engines of war.

    [Middle English engin, skill, machine, from Old French, innate ability, from Latin ingenium. See gen- in Indo-European Roots.]

  22. 22222222 missiles ... almost launched WWIII by Khopesh · · Score: 2, Interesting
    My favorite bug is a computer chip in the US surveillance of Soviet Russia's missile silos. Basically, some early warning system stated that Russia had launched something like 2222222 missiles from every source they had. (I'm not sure of the actual number, but it only contained 2's.)

    Some person down the line noticed that the Russians didn't have that many missiles, couldn't have launched them all with such synchronization, and that there were an awful lot of two's in the report ... actually, every digit of every number was a two. It turned out to be a fried chip somewhere, always pumping out the same bit regardless of input (I have no understanding of the technical side of the issue; maybe it hit the 32-bit limit and the int->string function reacted with 2's).

    Good thing we were not too automated, and that we employed somebody smart enough to critically examine his printouts.

    Disclaimer, this is a favorite tidbit of one of my professors ... I have no real source to refer to.

    --
    Use my userscript to add story images to Slashdot. There's no going back.
  23. Licensing - ACM Position by Embedded+Geek · · Score: 2, Interesting
    I recently completed an "Ethics in the Information Age" class for grad school (my earlier M.S. and undergrad predated such focused classes). As part of the discussions, we talked quite a bit about the Software Engineering Code of Ethics created by the ACM and IEEE and how such a code was a precursor to making software engineering a licensed, certified profession (akin to a CPA). So I figured it'd be neat to link to ACM's page advocating licensing.

    Guess what: They don't, although they appear to be hedging their bets with safety critical software.

    An interesting read...

    --

    "Prepare for the worst - hope for the best."

  24. The problem is more fundamental than competence by MOBE2001 · · Score: 1, Interesting

    Consider how much software is written by people with five years or less of professional experience, on short schedules, with no time allocated for continuing education. If software projects weren't always rush jobs, and on relative shoestring budgets, the quality would be better.

    The software reliability crisis has very little to do with greed, engineering incompetence or the lack of big budgets, in my opinion. There is something fundamentally wrong with the way we program our computers, something that no amount of quality control measures can ever cure.

    The reason that software is bad has to do with a custom that is as old as the computer: the practice of using the algorithm as the basis for software construction. Switch to a synchronous, signal-based approach and the problem will disappear. Complex algorithmic software is essentially unreliable, something that Fred Brooks has shown in his now famous "No Silver Bullet" paper back in 1987. For an alternative approach to software construction see this article in The Silver Bullet News.

    Regardless of what has been said in the past, the problem can be solved. Otherwise, we are in big trouble, very big trouble.