Slashdot Mirror


History's Worst Software Bugs

bharatm writes "Wired has an article on the 10 worst software bugs. From the article: 'Coding errors have sparked explosions, crippled interplanetary probes -- even killed people. Here's our pick for the 10 worst bugs ever, but the judging wasn't easy.'"

107 of 645 comments

  1. Predictions are hard by Teppy · · Score: 4, Interesting

    Wonderful article. Twenty years ago I believed that writing software would soon become a licensed profession. (Need a
    license to own a compiler, for instance.) I thought that the event that would inevitably trigger this is when a software
    bug caused a human death.

    I still believe that programming will eventually require a license, but I now think that lobbying by the big media
    companies will be the cause. Depressing, huh?

    1. Re:Predictions are hard by ZiakII · · Score: 5, Insightful

      Wonderful article. Twenty years ago I believed that writing software would soon become a licensed profession. (Need a license to own a compiler, for instance.) I thought that the event that would inevitably trigger this is when a software bug caused a human death.

      This is like saying you need a license to operate a soda vending machine because some idiot decided that tipping it over to get a free soda was a smart idea. You might have to put warnings on compilers, like "do not code if you have no clue what you are doing," but requiring a license won't ever happen. I am sure there will be lawsuits in the future regarding software bugs, but any software being used where an error could cause a human death is going to have a corporation behind it that can be held responsible.

    2. Re:Predictions are hard by bunratty · · Score: 4, Insightful
      You might have to put warnings on compilers, like "do not code if you have no clue what you are doing"
      Unfortunately, in my experience the ones who have no clue about what they are doing seem to be the most confident that they are top experts.
      --
      What a fool believes, he sees, no wise man has the power to reason away.
    3. Re:Predictions are hard by Jason+Ford · · Score: 5, Informative
      Several recent studies lend support to this observation. From an article at the American Psychological Association:

      We've all seen it: the employee who's convinced she's doing a great job and gets a mediocre performance appraisal, or the student who's sure he's aced an exam and winds up with a D.

      The tendency that people have to overrate their abilities fascinates Cornell University social psychologist David Dunning, PhD. "People overestimate themselves," he says, "but more than that, they really seem to believe it. I've been trying to figure out where that certainty of belief comes from."

      Dunning is doing that through a series of manipulated studies, mostly with students at Cornell. He's finding that the least competent performers inflate their abilities the most; that the reason for the overinflation seems to be ignorance, not arrogance; and that chronic self-beliefs, however inaccurate, underlie both people's over and underestimations of how well they're doing.

      --
      I did not become a vegetarian for my health, I did it for the health of the chickens. --Isaac Bashevis Singer
    4. Re:Predictions are hard by j-cloth · · Score: 3, Interesting

      What constitutes programming is so blurry, though.
      Does it count when someone puts some HTML in a blog? What about Javascript? A DIY PHP site? A batch file or shell script? An Excel function/macro?
      Do you only want to licence compilers? How do I install my OSS? What about the power of interpreted or JIT languages? So much can be done with uncompiled code.

    5. Re:Predictions are hard by idontgno · · Score: 5, Interesting
      I can't cite any documentation, but I recall seeing studies which show that the number one critical attribute of persistently optimistic personalities is a chronic inability to clearly see reality. Is this the same phenomenon?

      In the words of the old chestnut, "If you're calm and confident when everyone around you is running around in blind panic, you clearly don't understand the situation."

      --
      Welcome to the Panopticon. Used to be a prison, now it's your home.
    6. Re:Predictions are hard by BenEnglishAtHome · · Score: 3, Funny
      The tendency that people have to overrate their abilities fascinates Cornell University social psychologist David Dunning, PhD.

      I'll bet the guy just LOVES the first few installments each season of American Idol.

    7. Re:Predictions are hard by servicemaster · · Score: 2, Interesting

      Often I find it's a difference in personality type. Using the Keirsey Temperament Sorter you see where people are separated by how they perceive the world. One aspect of personality is how you perceive the world to be, either in concrete or possible terms. Objectively the two are not entirely opposite; it's just in how the world is organized. One views their surroundings as possibilities, the other as actualities. The actual view is the same, but the actions and course determined are different. I.e., if you're calm and confident when everyone around you is running around in a *BLIND* panic... it's just as likely you can see something they can't.

    8. Re:Predictions are hard by TemporalBeing · · Score: 2, Interesting

      We've all seen it: the employee who's convinced she's doing a great job and gets a mediocre performance appraisal, or the student who's sure he's aced an exam and winds up with a D

      In all reality, that is hardly a way to rate someone. Some people, despite how good they may be at the subject, just don't test well - the student could very well be a genius at the subject and still flunk.

      As for performance reviews - you have to have an accurate picture of what you are going to be reviewed on in order to be able to do well in the review. This is also flawed, as many reviewers don't describe how they will actually review someone until after the review has been done, so the person being reviewed has no concept of what they can do to get a good review. As a result, people will think they are going to be rated well, and end up rated poorly because the person rating did not clarify well enough how they were going to do the rating.

      As I said, both of those are majorly flawed ways to evaluate someone. Hopefully, the guy from Cornell took that into account, but I wouldn't be surprised if he didn't. And even if he did, he would have to take it into account so many different ways that it would be too hard to really test its accuracy.

      --
      Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
  2. only 10? by Lucan_UK · · Score: 4, Insightful

    I wouldn't say they are the 10 worst bugs ever... more like the 10 most widely known, media-announced bugs. Okay, I have no examples of any others, but I'm sure there must be worse bugs out there...

    anyone think of any others?

    --
    why?
    1. Re:only 10? by plover · · Score: 5, Insightful
      I don't think they should count the "pipeline bug."

      That was a trojan. It was a deliberate attack on their system by an enemy. It didn't even arrive via the now classical "worm" or "virus" route, which would have implied that a "bug let it in the door." No, this one was deliberately planted carefully at the root. It's not a bug, it was an attack.

      --
      John
    2. Re:only 10? by c_fel · · Score: 4, Insightful

      I remember the Mars Polar Lander crash in 1999 [http://www.space.com/missionlaunches/mars_polar_lander_031222.html]. At the time there was a rumor that said it was a human error: somebody had mixed a foot and a meter. Now we know that it was a software bug that was contained in a single line of code.

      --
      I hate all sigs, mine included.
    3. Re:only 10? by arivanov · · Score: 5, Insightful

      Seconded.

      The two radiation bugs combined have killed fewer people than the shiteware used in the Patriot missile system. Ariane and Mariner get an honorable mention, Raytheon doesn't. Why?

      There is also no mention of power grid system bugs. The recent US blackout was a good example.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
    4. Re:only 10? by ChodeMonkey · · Score: 3, Insightful

      The worst bug ever is the one that's there but you don't know about. Yet.

      --
      All your attention are belong to my old internet meme.
    5. Re:only 10? by ScentCone · · Score: 2, Interesting

      I think it would better be called terrorism.

      Why? Because code that the Soviets stole from the US turned out to be (from their perspective) defective? I don't think it's terrorism if my car blows up while you, having stolen it, are driving it around.

      More to the point, though, the CIA's objective was to cripple the cash flow of the Soviet Union, an entity that really was busy terrorizing much of the world. Their murderous, oppressive grip on Eastern Europe and attempts at foisting their cheerful utopia on South America and Africa wasn't going to get anywhere without the cash they were trying to raise by selling Siberian natural gas to the west. Making the Soviet government's cold cash sales operation less workable for them was part of what finished pulling the rug out from under that hellish government. That they so desperately needed western cash was a sign of how hollow that regime actually was, and that event just added clarity to the picture. I doubt the CIA expected that exact outcome, but you never really know what someone's going to do with the stuff they steal from you. Makes you wonder what's ticking under the hood in North Korea's squalid little IT universe, doesn't it? No doubt our team, and China's as well, have planted similar things in case they're needed. Tactics like that are going to be more subtle now, probably.

      --
      Don't disappoint your bird dog. Go to the range.
    6. Re:only 10? by glesga_kiss · · Score: 2, Insightful
      It's always struck me as a little odd that there are jurisdictions where someone trespassing can be shot (dead!) without any crime having been committed... but if the trespasser shows up when you're not around, picks your front door lock, and a 100-pound safe falls on his head and kills him, then that's a crime.

      Look at the article you just read (or probably didn't ;-). Absorb it. Consider that systems engineered by humans can contain flaws. A human won't shoot the postman or a paramedic arriving unannounced at your door, but the 100-pound safe doesn't differentiate between legitimate and non-legitimate visitors. So, when you kill the eight-year-old asking for his ball back from your garden, you go to jail.

      OK, so you are now thinking, "what if the trap was indoors, behind lock & key?". Then I say, "what if you had dialed 911, were lying with a broken back and the paramedics had to break down your door?".

      Makes perfect sense to me.

    7. Re:only 10? by mattOzan · · Score: 4, Insightful
      Why? Because code that the Soviets stole from the US turned out to be (from their perspective) defective? I don't think it's terrorism if my car blows up while you, having stolen it, are driving it around.

      Actually, they didn't steal it--they bought it. From the Canadians. After we refused to sell it ourselves.

      These days, the Soviets could probably just have filed an unfair restraint of trade complaint with the WTO!

      Seriously, though, culpability here is convoluted. The Soviets had a legitimate need for this technology, and we said, "No, you can't have it." So they went to someone else to buy it, and we sabotaged it. And the only justification is that the Soviet Union was the "evil empire," which had to be destroyed no matter what.

      Yeah, yeah, it was a "tense time," and "they wanted to bury us, too." But every time we talk about how capitalism beat communism because it is inherently better, we should remember all of these incidents which were expressly designed to choke out the Soviet State. Did it wither away because it was inefficient and inferior? Or because we had the strength at the time to hound it into oblivion?

  3. Well, that makes me feel better by PIPBoy3000 · · Score: 3, Funny

    Bringing down the company's intranet countless times over the years almost seems like an amusing little distraction. No one died, nothing blew up, and I've even managed to keep my job. It must be that people are getting used to these "software bug" excuses for the various problems that pop up with computers. I'll have to remember that for next time.

    Caller: "My computer exploded and I'm bleeding profusely!"
    911 Operator: "Must be a software bug."

  4. hey, we're all still here by Colin+Smith · · Score: 3, Funny

    So nobody's hit on the really big one yet.

    --
    Deleted
  5. Moth. by Poromenos1 · · Score: 4, Interesting

    The moth was trapped, removed and taped into the computer's logbook with the words: "first actual case of a bug being found."

    Why would they say that, if the term "bug" didn't exist? I mean, you wouldn't find a rat in your car and say "First actual case of a car 'rat' being found" if you didn't use it as a term to indicate something. You'd just say "this bug caused computing errors". I smell a car rat.

    --
    Send email from the afterlife! Write your e-will at Dead Man's Switch.
    1. Re:Moth. by BridgeBum · · Score: 4, Informative

      The term predates computers. In the original usage, any sort of mechanical device or system could have bugs.

      http://www.silicon.com/software/webservices/0,39024657,10005407,00.htm

      --
      My UID is the product of 2 primes.
  6. Please try to pay attention by mister_llah · · Score: 4, Insightful

    1995/1996 -- The Ping of Death. A lack of sanity checks and error handling in the IP fragmentation reassembly code makes it possible to crash a wide variety of operating systems by sending a malformed "ping" packet from anywhere on the internet. Most obviously affected are computers running Windows, which lock up and display the so-called "blue screen of death" when they receive these packets. But the attack also affects many Macintosh and Unix systems as well.

    ===

    WinNuke made it...

    --
    MoM++ - A Classic Expanded - [Master of Magic 1.5]
    http://mompp.sourceforge.net/
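The "lack of sanity checks" in the quoted Ping of Death entry comes down to one missing bounds test during fragment reassembly. Below is a minimal Python sketch of such a test, not code from any real IP stack. It relies on two facts about IPv4: the total length field is 16 bits, so a datagram can be at most 65535 bytes, and fragment offsets are counted in 8-byte units. The function name `fragment_fits` and the default 20-byte header are assumptions made for illustration.

```python
MAX_IP_DATAGRAM = 65535  # largest total length a 16-bit IPv4 length field can express


def fragment_fits(offset_units: int, payload_len: int, header_len: int = 20) -> bool:
    """Return True if this fragment ends inside a legal reassembled datagram.

    offset_units is the fragment offset in 8-byte units, as carried in the header.
    header_len defaults to a header without options (an assumption of this sketch).
    """
    if offset_units < 0 or payload_len < 0:
        return False
    end = header_len + offset_units * 8 + payload_len
    return end <= MAX_IP_DATAGRAM


# A final fragment placed near the top of the offset range with a large payload
# would run past the 65535-byte limit, so a careful reassembler drops it.
print(fragment_fits(0, 1480))     # ordinary first fragment
print(fragment_fits(8189, 1480))  # oversized tail fragment
```

A reassembly routine that runs this kind of check before copying a fragment into its buffer rejects the malformed packet instead of writing past the end of the buffer.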
  7. Bug or User error? by Lucan_UK · · Score: 5, Insightful

    The last one on the list is this

    "Multidata's software allows a radiation therapist to draw on a computer screen the placement of metal shields called "blocks" designed to protect healthy tissue from the radiation. But the software will only allow technicians to use four shielding blocks, and the Panamanian doctors wish to use five.

    The doctors discover that they can trick the software by drawing all five blocks as a single large block with a hole in the middle. What the doctors don't realize is that the Multidata software gives different answers in this configuration depending on how the hole is drawn: draw it in one direction and the correct dose is calculated, draw in another direction and the software recommends twice the necessary exposure.

    At least eight patients die, while another 20 receive overdoses likely to cause significant health problems. The physicians, who were legally required to double-check the computer's calculations by hand, are indicted for murder. " ... to me that sounds like a user not using the software correctly..

    --
    why?
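How can the direction in which a shape is drawn change a numeric answer? The quoted description does not say how Multidata's code worked, so the following Python sketch is purely illustrative and makes no claim about that product. It shows one well-known way drawing direction leaks into geometry code: the shoelace formula yields a signed area whose sign flips with the winding order of the outline. The function names are hypothetical.

```python
def signed_area(points):
    """Shoelace formula: positive for counter-clockwise outlines, negative for clockwise."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the outline
        total += x1 * y2 - x2 * y1
    return total / 2.0


def area(points):
    """Direction-independent area: take the magnitude of the signed result."""
    return abs(signed_area(points))


square_ccw = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_cw = list(reversed(square_ccw))

print(signed_area(square_ccw), signed_area(square_cw))  # same shape, opposite signs
print(area(square_ccw), area(square_cw))                # same shape, same area
```

Code that adds or subtracts signed areas without normalizing the winding order first can give different totals for the same picture, which is the general kind of hazard the comment is arguing about.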
  8. Re:Microsoft's striking absence by IdleTime · · Score: 2, Interesting

    Yes, I saw that too and I guess they have forgotten the most devastating MS bug which is present in all releases from NT 3.1 and at least up to 2k. I haven't tested XP.
    I couldn't find the description right now, but I'm sure others know the bug. The one where you can basically type a special text file using the type command or similar and it will basically BSOD the machine. The file consists of tabs, spaces and newline/carriage return pairs and nothing else. MS never fixed the bug.

    --
    If you mod me down, I *will* introduce you to my sister!
  9. Intel FP divide is -not- a software bug by RootsLINUX · · Score: 2, Insightful

    Why do they have the Intel Pentium floating point divide error listed as a bug? That was a hardware design error in the circuit, it was not a software bug. Of course it caused software to behave unexpectedly, but still I'm surprised that Wired put that one in there.

    --
    Hero of Allacrost, a FOSS RPG for *NIX/*BSD/OS X/Win
    1. Re:Intel FP divide is -not- a software bug by ameline · · Score: 4, Informative

      That is correct -- Modern processors perform divides by having a reciprocal estimate lookup table.
      This table produces an estimate with 12 or so good bits of precision. Iterative refinement (typically microcoded) then produces the rest of the bits. After that the reciprocal is multiplied in, and you get the result.

      More recently this has been somewhat exposed, as most all modern processors have a reciprocal estimate instruction which executes in a single cycle. This is very useful if, for example, you want to normalize a bunch of normal vectors before passing them into the graphics pipeline. 12 bits is almost always enough for this purpose, and the reciprocal sqrt instruction is very much your friend here. So something that was dominated by the ~60 cycles of 1.0f/sqrt(sum_of_squares) becomes 1 cycle. Total speedup is about 10x -- and it's vectorizable -- the SSE unit will do a vector rsqrte.

      My understanding of the pentium fdiv bug is that a section of the reciprocal estimate table had bad data in it.

      This, in my opinion, counts as software, as would the microcode. If the bug had been in the multiplier, adder, or logic circuitry of the lookup table, then it would count as hardware.

      Many, if not all, of the complex CISC-y instructions are implemented in microcode -- so I believe that a bug in them would count as a software bug.

      --
      Ian Ameline
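The estimate-then-refine scheme described in the comment above can be sketched in a few lines of Python. This illustrates the general technique (a coarse reciprocal followed by Newton-Raphson steps), not the Pentium's actual divider. The function names are hypothetical, and `reciprocal_estimate` only imitates a lookup table by rounding the true reciprocal to about 12 significant bits.

```python
import math


def reciprocal_estimate(d: float, bits: int = 12) -> float:
    """Stand-in for a hardware lookup table: 1/d rounded to `bits` significant bits."""
    m, e = math.frexp(1.0 / d)  # 1/d == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)


def refine(d: float, x: float, steps: int = 2) -> float:
    """Newton-Raphson for 1/d: x <- x * (2 - d*x). Each step roughly doubles the good bits."""
    for _ in range(steps):
        x = x * (2.0 - d * x)
    return x


def divide(n: float, d: float) -> float:
    """Multiply the numerator by the refined reciprocal, as the comment describes."""
    return n * refine(d, reciprocal_estimate(d))


print(divide(1.0, 3.0))
```

With a 12-bit starting estimate, two refinement steps leave a relative error on the order of 2^-48. The iteration can only square whatever error it starts from, so a bad table entry gives a bad starting point and that mistake carries into the quotient.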
  10. The meat of the article... by cytoman · · Score: 4, Informative

    July 28, 1962 -- Mariner I space probe. A bug in the flight software for the Mariner 1 causes the rocket to divert from its intended path on launch. Mission control destroys the rocket over the Atlantic Ocean. The investigation into the accident discovers that a formula written on paper in pencil was improperly transcribed into computer code, causing the computer to miscalculate the rocket's trajectory.

    1982 -- Soviet gas pipeline. Operatives working for the U.S. Central Intelligence Agency allegedly (.pdf) plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline. The Soviets had obtained the system as part of a wide-ranging effort to covertly purchase or steal sensitive U.S. technology. The CIA reportedly found out about the program and decided to make it backfire with equipment that would pass Soviet inspection and then fail once in operation. The resulting event is reportedly the largest non-nuclear explosion in the planet's history.

    1985-1987 -- Therac-25 medical accelerator. A radiation therapy device malfunctions and delivers lethal radiation doses at several medical facilities. Based upon a previous design, the Therac-25 was an "improved" therapy system that could deliver two different kinds of radiation: either a low-power electron beam (beta particles) or X-rays. The Therac-25's X-rays were generated by smashing high-power electrons into a metal target positioned between the electron gun and the patient. A second "improvement" was the replacement of the older Therac-20's electromechanical safety interlocks with software control, a decision made because software was perceived to be more reliable.

    What engineers didn't know was that both the 20 and the 25 were built upon an operating system that had been kludged together by a programmer with no formal training. Because of a subtle bug called a "race condition," a quick-fingered typist could accidentally configure the Therac-25 so the electron beam would fire in high-power mode but with the metal X-ray target out of position. At least five patients die; others are seriously injured.
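The hazard in the Therac-25 entry above is two pieces of machine state that are supposed to change together but can get out of step when an edit lands at the wrong moment. The Python sketch below is a deliberately simplified, deterministic toy model of that hazard, not a reconstruction of the Therac-25 software. Every name in it (`Machine`, `safe_to_fire`, `treat`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    beam_power: str = "low"        # "low" (electron mode) or "high" (X-ray mode)
    target_in_place: bool = False  # True when the metal X-ray target is in the beam path


def safe_to_fire(m: Machine) -> bool:
    """Interlock rule: high power is only acceptable with the target in place."""
    return m.beam_power == "low" or m.target_in_place


def treat(edit_during_setup: bool, check_interlock: bool) -> str:
    m = Machine()
    # Operator selects X-ray mode: setup raises the power and moves the target in.
    m.beam_power = "high"
    m.target_in_place = True
    if edit_during_setup:
        # A fast correction to electron mode arrives while setup is still running.
        # In this toy model only the target position picks up the edit; the power
        # setting was already latched, so the two halves of the state now disagree.
        m.target_in_place = False
    if check_interlock and not safe_to_fire(m):
        return "blocked"
    return "safe" if safe_to_fire(m) else "overdose"


print(treat(edit_during_setup=True, check_interlock=False))
print(treat(edit_during_setup=True, check_interlock=True))
```

An independent final check, whether an electromechanical interlock or a software one that re-reads the actual state just before firing, turns the inconsistent configuration into a refusal instead of a dose.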

    1988 -- Buffer overflow in Berkeley Unix finger daemon. The first internet worm (the so-called Morris Worm) infects between 2,000 and 6,000 computers in less than a day by taking advantage of a buffer overflow. The specific code is a function in the standard input/output library routine called gets() designed to get a line of text over the network. Unfortunately, gets() has no provision to limit its input, and an overly large input allows the worm to take over any machine to which it can connect.

    Programmers respond by attempting to stamp out the gets() function in working code, but they refuse to remove it from the C programming language's standard input/output library, where it remains to this day.
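The fix for an unbounded read is to tell the read how much room the buffer has, which is what C's fgets() does and gets() cannot. (gets() was eventually dropped from the C standard in C11.) Python does not have this class of memory bug, so the sketch below only models the idea: a read capped at a caller-supplied limit, and a copy into a fixed-size buffer that refuses oversized input. The helper names and the 512-byte size are assumptions chosen for illustration.

```python
import io

BUF_SIZE = 512  # hypothetical fixed buffer size for this sketch


def read_line_bounded(stream, limit: int) -> bytes:
    """Read one line but never more than `limit` bytes, in the spirit of fgets()."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return stream.readline(limit)


def store(buf: bytearray, data: bytes) -> None:
    """Copy data into a fixed-size buffer, refusing anything that would not fit."""
    if len(data) > len(buf):
        raise ValueError("input longer than buffer")
    buf[:len(data)] = data


request = io.BytesIO(b"A" * 1000 + b"\n")  # an overly long "line" from the network
line = read_line_bounded(request, BUF_SIZE)
buf = bytearray(BUF_SIZE)
store(buf, line)
print(len(line))  # never exceeds BUF_SIZE, however long the sender's line was
```

The remainder of the long line stays in the stream for the caller to discard or reject; nothing beyond the buffer is ever written.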

    1988-1996 -- Kerberos Random Number Generator. The authors of the Kerberos security system neglect to properly "seed" the program's random number generator with a truly random seed. As a result, for eight years it is possible to trivially break into any computer that relies on Kerberos for authentication. It is unknown if this bug was ever actually exploited.
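Why does a poorly seeded generator make break-ins trivial? If the seed comes from a small, guessable space, an attacker can regenerate every candidate key and compare. The Python sketch below illustrates that with a hypothetical 8-byte key and a seed range of a thousand values. It is not Kerberos code, and the way the real seed was derived is not modeled here. `random.Random` stands in for any deterministic generator, and `secrets.token_bytes` for an operating-system entropy source.

```python
import random
import secrets
from typing import Iterable, Optional


def weak_key(seed: int) -> bytes:
    """Key derived from a deterministic generator: same seed, same key."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(8))


def brute_force(key: bytes, candidate_seeds: Iterable[int]) -> Optional[int]:
    """Recover the seed by regenerating keys for every plausible seed value."""
    for seed in candidate_seeds:
        if weak_key(seed) == key:
            return seed
    return None


def strong_key() -> bytes:
    """Key drawn from the operating system's entropy source instead."""
    return secrets.token_bytes(8)


victim = weak_key(1_000_123)  # e.g. a seed derived from a clock value
print(brute_force(victim, range(1_000_000, 1_001_000)))
```

The search cost is set by the number of possible seeds, not by the length of the key, which is why the seed has to come from a source the attacker cannot enumerate.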

    January 15, 1990 -- ATT Network Outage. A bug in a new release of the software that controls ATT's #4ESS long distance switches causes these mammoth computers to crash when they receive a specif

    1. Re:The meat of the article... by meringuoid · · Score: 4, Funny
      The resulting event is reportedly the largest non-nuclear explosion in the planet's history.

      The former dinosaurian population of the Yucatan Peninsula might disagree...

      --
      Real Daleks don't climb stairs - they level the building.
    2. Re:The meat of the article... by arrow014 · · Score: 5, Interesting

      I actually did a research report on the Therac-25 incident while I was in Software Engineering class a few semesters ago (I was also in Technical Writing at the time, so I could kill two assignments with one report!) ;-) The details of the incident(s) are actually quite fascinating and sometimes spine-chilling.

      Here's the report in PDF if anyone's interested: reportfinal.pdf

      And in HTML for those of you who prefer it: link

    3. Re:The meat of the article... by endersdouble · · Score: 2, Informative

      Not to mention anyone at Tunguska.

  11. Whatever happened to the US Navy? by Lead+Butthead · · Score: 3, Informative

    Something about their latest toy... ahm, ship that had to be towed back to port because the Windows NT they used to run everything on the ship kept blue-screening.

    --
    ELOI, ELOI, LAMA SABACHTHANI!?
    1. Re:Whatever happened to the US Navy? by Phanatic1a · · Score: 4, Informative

      There are no "Aegis class" cruisers. Aegis is a ship combat system, specifically an AN/SPY-1 radar system, a computer based command-and-control system, and one of a number of missile systems either in current-tech VLS cells or older cylindrical magazines and launchers.

      The Aegis system can be found on Ticonderoga-class cruisers and Arleigh Burke-class destroyers in the USN, Kongo-class destroyers in the IJN, and some Spanish frigate whose designation I forget.

      The ship we're talking about is the USS Yorktown, CG-48, and the problem was pretty much as you describe. A user input an erroneous zero value for some quantity (fuel pressure, I think), and the system ate itself and took the engines offline.

      The Yorktown was decommissioned last year. Shame that the practice of using Windows in ship-critical systems wasn't.

  12. You get what you pay for by cryptoguy · · Score: 2, Insightful

    Consider how much software is written by people with five years or less of professional experience, on short schedules, with no time allocated for continuing education. If software projects weren't always rush jobs, and on relative shoestring budgets, the quality would be better. If continuing education for programmers was a priority, quality would be better. If a couple of decades of experience was properly appreciated, quality would be better.

    1. Re:You get what you pay for by jurt1235 · · Score: 2, Interesting

      If continuing education for programmers was a priority, quality would be better.

      This also requires more than the current courses, which are pretty much starter-level. It is sad that after a few days of working with a language before a course, you will already find mistakes/bugs or just better ways to do things than what is promoted in the course.

      For example after a 3 day crash course (I missed day 1, else it would have been 4 days), I became a certified Stellent developer. So a "real" test at the end to determine if you are worth it or not.

      --

      My wife's sketchblog Blob[p]: Gastrono-me
  13. This bug reminds me of a Dilbert comic by technoextreme · · Score: 4, Funny
    1988-1996 -- Kerberos Random Number Generator. The authors of the Kerberos security system neglect to properly "seed" the program's random number generator with a truly random seed. As a result, for eight years it is possible to trivially break into any computer that relies on Kerberos for authentication. It is unknown if this bug was ever actually exploited.

    Hehehe.... This reminds me of a Dilbert cartoon. Here is what I can remember:
    Some guy: And here is our random number generator.
    Another guy: 2 2 2 2 2 2 2 2 2 2 2 2.
    Dilbert: That isn't very random though.
    Some guy: He is randomly getting the same number.
    Anyone actually know which comic I am thinking of?
    --
    Ooo man the floppy drive is broken. No wait. The computer is just upside down.
    1. Re:This bug reminds me of a Dilbert comic by Yahweh+Doesn't+Exist · · Score: 3, Informative

      if you subscribe (I don't any more) you can probably find it by searching for "random".

      I think the last line is actually something like
      Dilbert: That isn't very random though.
      Some guy: That's the trouble with randomness - you can never tell.

    2. Re:This bug reminds me of a Dilbert comic by Lucan_UK · · Score: 5, Informative

      Here is the Dilbert Strip... Enjoy
      http://www.geocities.com/raptorred42/Dilbert0001.jpg

      --
      why?
  14. Why not both? by brouski · · Score: 5, Insightful

    I've read about this instance before, and I think it's attributable to ignorance on the part of both the user and the developer. The software developer in this case knows the life of a human being is resting on his code, so it should have been nigh impossible to "trick" the software into allowing anything other than what the specs said it could do.

    --
    Proud member of the American Non Sequitur Society. We might not make much sense, but boy do we love pizza!
    1. Re:Why not both? by Lucan_UK · · Score: 2, Insightful

      I agree... to a certain extent. I think this instance shouldn't be blamed on the developer but on the testing team... I'm sure that a piece of software like this had to have been tested... so why was it not found?

      A developer is really only as good as the testing team telling him it's wrong!

      --
      why?
  15. Time for stressing more on formal specifications by ashtophoenix · · Score: 2, Insightful

    I found it a hard subject in school and have never used it practically, but it seems to be the only SURE way of proving the correctness of a program. Shouldn't we be using it, at least in real-time mission-critical applications, by now? I think it needs to be stressed a lot more in school from the start, as compared to topics like web development and Java and all the other pragmatic things that can be learned more easily.

    --
    Life is about being a Phoenix!
  16. Re:Microsoft's striking absence by Shaper_pmp · · Score: 2, Informative

    Jesus Christ. I'd somehow never heard of this bug, and I've been developing for Windows machines for years.

    How on earth was such a basic and low-level bug ignored for so long? It doesn't seem like rocket-science to fix it with a small bounds-checking if statement!

    --
    Everything in moderation, including moderation itself
  17. Re:Microsoft's striking absence by mattyohe · · Score: 3, Informative

    Do people just open an article, do a Ctrl+F and type microsoft to find something 'juicy'? If you had RTFA you would have seen that the 'Ping Of Death' was mentioned, which did impact Windows machines.

    --
    - what is the definition of simultanagnosia?! I've been meaning to look it up!
  18. Re:omg by PhilHibbs · · Score: 4, Informative

    Its intent was not to cause terror, but to inflict economic damage. I heard about a similar incident where a Japanese shipbuilder was stealing blueprints from a UK shipyard tendering for a contract and undercutting them. The UK shipbuilder deliberately designed a ship that would capsize on launch, which the Japanese duly stole, built, and launched. I don't know if anyone was killed, but ethically it's a tricky one.

  19. Suuuuure by kcurtis · · Score: 3, Insightful

    Indeed. Causing a *NON FATAL* explosion in a country that imprisoned as many as 2.5 million political prisoners in Gulags at one time, and is estimated to have murdered upwards of 60 MILLION of its own citizens. Terrorism?

    Terrorism is an act of mayhem designed to terrorize. This did not.

    Sabotage? Yes.
    Act of war? Probably.
    Terrorism? Not even close.

    Your statement is just a display of anti-American rhetoric with no basis in reality.

  20. Airbus Crash by CruddyBuddy · · Score: 5, Informative
    Here is video of an Airbus crashing into the trees because the autopilot didn't like the landing conditions. IIRC (remember), the pilot's pull-up was ignored because the flight conditions weren't optimum despite an obvious life threatening situation. If this isn't a software bug, what would you call it? (Maybe the software considered crash modes and this configuration allowed the black box to survive intact.)

    http://www.alexisparkinn.com/photogallery/Videos/Airbus320_trees.mpg/

    (Let the slashdotting begin! (poor servers))

    All things considered, I don't know if the pilots survived.

    --
    ----------
    Any problem can be made unsolvable if there are enough meetings made to discuss it.
    1. Re:Airbus Crash by be-fan · · Score: 5, Informative

      I actually know why this happened. We learned about it in our flight dynamics class. The problem was the result of a mismatch between what the pilot thought the airplane was doing, and what it was actually doing. The A320 had software that prevented the pilot from stalling the airplane during flight. However, the protection only kicked in above 90', because the software assumed that if you were below that, you wanted to land (which involves a stall right at touchdown). The pilot was trying to do a flyby, and was supposed to be above 100', but for whatever reason he came in at around 30'. Now, the reasons he didn't pull up and ramp up the engines are debatable, but the equitable explanations suggest that he assumed that the airplane's stall protection would kick in, while the airplane had disabled it because it thought it was about to land.

      --
      A deep unwavering belief is a sure sign you're missing something...
    2. Re:Airbus Crash by Tim+Browse · · Score: 3, Informative
      Just to add yet another explanation, when I worked for Rediffusion (UK flight simulator manufacturer), this air show crash was discussed during our induction. If I remember correctly, the pilot span down the engines to lower the aircraft, and then tried to power them up again to lift the aircraft out of the descent and fly over the trees. The pilot claimed the system over-rode his desire to power up the engines, causing the crash. (I believe he had already over-ridden some safety mechanism to allow him to perform this descent in the first place.)

      However, the actual problem was that airliner engines aren't like some awesome fighter jet with afterburner. They take time to spin up - from examining the black box, they determined that at the point the pilot wanted to ascend, even if the engines span up at the maximum rate, it was still nowhere near enough to pull the plane out of the descent. Hence, pilot error.

  21. MOD PARENT BACK DOWN IT WAS A SOFTWARE BUG by gorim · · Score: 4, Informative

    Because it was actually implemented as microcode and stored into the CPU, whether as mask rom or some other means of storing, but it was indeed software either way you look at it.

  22. Same old tiresome error: "BUG" was old then by SysKoll · · Score: 4, Insightful
    The Wired article perpetuates the same old tiresome mistake, that is, that the term "bug" originated from a moth found in a 1947 computer.

    That is wrong. This is a myth that has been disproved several times. See for example the "IEEE Annals of Computer History" where Adm. Grace Hopper said that the term "bug" was used at least since the 30s, and maybe earlier, to describe an electrical problem in a system. See also here.

    In an interview, Hopper confirmed that the notebook moth's caption, "First actual case of bug being found", clearly shows that it was a joke referring to a term that was already in use at the time.

    Any idiot researching this anecdote for five minutes could have found out about it. I guess Wired couldn't be bothered. At this level of laziness and incompetence, one wonders why they don't just start publishing printouts of slashdot laced with ads. At least this place contains occasional nuggets of truth.

    Once again, Wired blew it. Nice job, guys.

    --
    Mad science! Robots! Underwear! Cute girls! Full comic online! http://www.girlgeniusonline.com/

    1. Re:Same old tiresome error: "BUG" was old then by ScottForbes · · Score: 2, Informative
      I hate to interrupt your rant, but... the Wired article doesn't say that the term "bug" originated in 1947. It merely notes that the first widely known "buggy computer" was the Harvard Mark I:
      With that recall, the Prius joined the ranks of the buggy computer -- a club that began in 1947 when engineers found a moth in Panel F, Relay #70 of the Harvard Mark 1 system. The computer was running a test of its multiplier and adder when the engineers noticed something was wrong. The moth was trapped, removed and taped into the computer's logbook with the words: "first actual case of a bug being found."
      Unless you demand that anyone retelling the 1947 anecdote immediately prove their street cred -- "Of course the term 'bug' did not originate with this incident, blah blah blah, I mention this to prove that I'm smarter than you are" -- then the Harvard Mark I's moth is the earliest example of a computer glitch that the public might have heard about. Since the rest of the article is about other bugs the public might have heard about, and since the article repeated Hopper's exact words about finding an "actual bug" (which, as you note, implies that they'd been calling them "bugs" long before they found a genuine moth), how about easing up a little and cutting writer Simson Garfinkel some slack?
  23. Re:omg by Jason+Hood · · Score: 2, Interesting

    And I suppose if I had a "broken" gun in my basement and you broke in and stole it, then tried to use it and injured yourself, you could sue me, right?

    Sorry, I am having a hard time seeing the correlation to terrorism here. It seems that you have a predisposition about the US's stance on terror and are desperately trying to make a connection for a political statement. Unfortunately, typical slashdot readers will agree with you =)

    This would be very different if the US had broken into the USSR and altered their software to malfunction. That definitely would be a criminal act, but perhaps more importantly an act of war. I doubt the Soviets ever figured out what happened until they were told.

    --
    Are you intolerant of intolerant people?
  24. Re:Microsoft's striking absence by varmittang · · Score: 4, Informative

    Remember when the LA air traffic control tower crashed after 49 days, due to a bug in MS software? I would think that this one would make the list. http://www.itgarage.com/node/459
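
    A side note on the 49 days: that figure matches the time it takes a 32-bit counter of milliseconds to run out, which is the usual explanation given for this outage. Below is a minimal sketch of the arithmetic, assuming an unsigned 32-bit millisecond tick counter. The function names are illustrative only and are not taken from the actual system.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 milliseconds in a day
COUNTER_MODULUS = 2 ** 32          # assumption: unsigned 32-bit tick counter


def days_until_wrap() -> float:
    """Days before a 32-bit millisecond counter runs out of values."""
    return COUNTER_MODULUS / MS_PER_DAY


def tick_after(elapsed_ms: int) -> int:
    """Counter value after elapsed_ms milliseconds, with 32-bit wraparound."""
    return elapsed_ms % COUNTER_MODULUS


def naive_elapsed(start_tick: int, now_tick: int) -> int:
    """Plain subtraction: goes negative once the counter has wrapped."""
    return now_tick - start_tick


def wrap_safe_elapsed(start_tick: int, now_tick: int) -> int:
    """Modular subtraction: correct across a single wraparound."""
    return (now_tick - start_tick) % COUNTER_MODULUS


if __name__ == "__main__":
    print(f"wraps after about {days_until_wrap():.1f} days")
```

    Code that compares raw tick values with plain subtraction misbehaves at the wrap point, which is why such systems either need the modular form or a scheduled restart before the counter runs out.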

    --
    -----BEGIN PGP SIGNATURE-----
    12345
    -----END PGP SIGNATURE-----
  25. Re:They are just very, VERY careful. by Coryoth · · Score: 3, Interesting

    When you are writing software for life-critical applications, there is various software and techniques that ensures bug-free code. Just look at all the airplanes, powerplants, car computers, etc. It's not very usual at all to see one fail critically.

    When you are writing software for life-critical systems, there are methods you can follow that allow you much greater assurance of correct code and drastically reduce the testing burden (by being able to prove that certain classes of errors don't exist in the code). It's akin to static types, which allow you to statically catch a lot of type errors, reducing the need to spend time testing for possible type errors.

    The languages and methods used are things like SPARK and B-method. The beauty of systems like SPARK is that they provide a degree of flexibility in how much work you go to depending on how much extra assurance you want. You can do anything from simply specifying critical portions of code with a little extra formality (basically extended static checks beyond what type checking alone can give you) through to fully specifying everything and doing formal proofs for the whole system. You can tailor the effort and assurance to the needs of the project.

    (This time without that dangling link - that'll teach me not to preview)

    Jedidiah

  26. Medical Systems by koehn · · Score: 5, Interesting

    I designed and built a diagnostic radiology workstation (in 1997, in Java 1.1, 4x5 megapixel monitors, still in use today). During the development effort we were regaled with stories of software glitches in medical systems resulting in disaster. It really keeps you focused.

    In one case, a radiation treatment system had a bug where if you used the backspace key when entering the dose a patient received, the display would show you deleted the last digit, but internally you hadn't. So the patient would receive 10^backspace times the intended dose of radiation. Not a big deal normally, since the techs would typically shut the machine off between treatments. Until one day when they had two patients needing treatment back to back. The tech knew something was wrong when the machine was running for an unusually long time. The patient knew something was wrong when he died.

    On our team a defect that crashed the system was considered severity 2. Severity 1 was reserved for defects that could result in a mis-diagnosis, which most patients agree is worse than a crash.

  27. Re:Whoops forgot to hit preview by CastrTroy · · Score: 3, Informative

    There are software engineers in Canada now. They can legally sign off on a software project. The problem is that you don't want to have every one of your programmers be a licensed software engineer, all signing off on their own code. It would be too expensive to try to hire that many engineers, and managing all the signatures for all the code, when different people work on the same piece of code, would be a nightmare. Basically you'd have to have one engineer, or team thereof, overseeing the entire project to be sure that proper methods are being followed to ensure that there aren't any bugs. What you're asking for is more like saying that everyone building a bridge be licensed, and that they should all have to sign off on every rivet they put in.

    The problem is that most companies producing software do not want to pay for an engineer to oversee their project. Also, the way most software operations are run, you wouldn't see an engineer signing off on the projects. The engineer would force things to be much more thoroughly tested in order to ensure that they were actually worthy of being signed off on. There is lots of this kind of software being built for planes, and other situations where it really matters if there are bugs. I don't think this kind of situation will ever happen with off-the-shelf software. For one thing, software would cost too much, and most people aren't willing to pay $2000 to run an operating system on their home computer. For another, most engineers wouldn't sign off on a system when they didn't know the computer their software would run on. There are too many variables on a home computer to be able to guarantee, at that level, that your software will operate completely as expected.

    --

    Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
  28. I think this one should have made the list by Phanatic1a · · Score: 5, Insightful

    For potential severity, this one's worse than a few they listed.

    Basically, the Navy was running critical ship systems on a Windows NT platform, and a divide-by-zero in a database caused a buffer overrun that resulted in a shutdown of the engines, leaving the ship dead in the water for 2.5 hours.

    Fortunately, it was on maneuvers off of Cape Charles, and not at war off the coast of Yemen or something. Scratch a billion-dollar destroyer and most of her crew because of an NT bug, in that case.

    1. Re:I think this one should have made the list by YrWrstNtmr · · Score: 3, Insightful
      Mostly incorrect. An application bug, not an NT bug. The exact same situation could have occurred no matter what the platform. Poor development, minimal training, and pretty much zero testing, was the cause.

      "NT played no role in the Yorktown's LAN crash, Baker said."
      "The Yorktown is unique because it was a proof-of-concept [ship] put out to sea without formal testing and software certification, which our products normally go through," Baker said.

  29. Re:Whoops forgot to hit preview by Balthisar · · Score: 3, Informative
    You are not even allowed to call yourself an engineer without getting that license. That person is actually held legally responsible for the projects he signs off on.

    Actually, you're confusing the title "P.E." (professional engineer) with the generally accepted term "engineer." One (the P.E.) is a licensed engineer, and others are used traditionally and arbitrarily with no legal recourse. For example, I and my co-workers are bona fide engineers, and most of us have engineering or engineering technology degrees. None of what we do requires a P.E. to sign off on anything, although there are other aspects of our business (and many other businesses) that do require a P.E.

    Of course, there are all kinds of "engineers" that have that title but don't truly merit it -- customer service engineer; field service engineer; applications engineer; and so on. Most of these don't hold engineering degrees. For many of them, I don't begrudge them their title, either. But we also know that they're not P.E.'s.

    --
    --Jim (me)
  30. Couple of Bugs I thought of by randomErr · · Score: 2, Funny

    2004 Luxembourg blackout
    Patriot Missile - Missiles had to be shut down once a day because the targeting system would cycle every minute and change the internal coordinate system a fraction of a degree. Over the course of a few days the targeting system would be completely useless.
    PS/2 shutdown bug - The fuser components of analog copiers at the time worked at the same frequency as the processor's shutdown signal.
    Minus World - Super Mario Bros. - A hidden water glitch
    Ermac - Mortal Kombat
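
    On the Patriot entry: the commonly cited account of the 1991 Dhahran failure is a slow clock drift rather than a per-minute cycle. Time was kept as an integer count of tenths of a second and converted to seconds with a truncated binary approximation of 1/10, so the error grew with uptime. Here is a rough sketch of that arithmetic, assuming 23 fractional bits survive the truncation and a Scud speed of about 1,676 m/s. Both figures are the usual textbook values, not something stated in this thread, and the function names are made up for illustration.

```python
from fractions import Fraction

FRACTION_BITS = 23          # assumption: fractional bits kept for the constant 1/10
TICKS_PER_SECOND = 10       # the clock counted tenths of a second
SCUD_SPEED_M_PER_S = 1676   # assumption: approximate Scud speed


def truncated_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """1/10 chopped (not rounded) to the given number of binary places."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours_up: int, bits: int = FRACTION_BITS) -> Fraction:
    """Accumulated timing error after hours_up hours of continuous operation."""
    per_tick_error = Fraction(1, 10) - truncated_tenth(bits)
    ticks = hours_up * 3600 * TICKS_PER_SECOND
    return per_tick_error * ticks


def tracking_error_metres(hours_up: int) -> float:
    """Distance a target at the assumed speed covers during the drift."""
    return float(clock_drift_seconds(hours_up) * SCUD_SPEED_M_PER_S)


if __name__ == "__main__":
    drift = float(clock_drift_seconds(100))
    print(f"drift after 100 h: {drift:.4f} s")
    print(f"target displacement: {tracking_error_metres(100):.0f} m")
```

    Under these assumptions the drift after 100 hours is about a third of a second, which is enough for a fast target to be more than half a kilometre away from where the system expects it. That also explains why periodic restarts worked as a stopgap.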

    --
    You say things that offend me and I can deal with it. Can you?
  31. Re:Worst _software_ bugs, huh ? by squiggleslash · · Score: 2, Informative

    The bug was about missing data in a lookup table. Intel said the problem was caused by a bug in the script designed to populate that table when the CPU was being designed, though legend has it that someone erroneously "proved" that the data wasn't needed. So, I guess, either way you look at it, be it the script, the table, or the alleged logic flaw, it's a software, not a hardware bug (or at least, it's a bug caused by software.)

    --
    You are not alone. This is not normal. None of this is normal.
  32. What? No Outlook Express? by edunbar93 · · Score: 4, Insightful

    Why isn't Outlook Express in here? Early versions basically changed unopened e-mail viruses from a hoax to reality, when Microsoft decided it was a *good* idea to automatically run any VB script that was received. That's cluelessness like trusting everyone to be good and decent human beings while you walk through a prison shower with "Please rape me" painted on your back.

    Later versions tried to fix the problem while keeping the functionality, as if somehow the bad guys would intentionally include the Evil Bit in their code.

    --
    "No problem. I have the capacity to do infinite work so long as you don't mind that my quality approaches zero."-Dilbert
    1. Re:What? No Outlook Express? by Scoria · · Score: 2, Insightful

      Later versions tried to fix the problem while keeping the functionality, as if somehow the bad guys would intentionally include the Evil Bit in their code.

      If the newer build didn't contain the same functionality, then nobody would upgrade their software. Outlook Express has also served to reinforce the idea that this functionality should exist and be activated by default in all modern e-mail clients. If you were to install a different e-mail client -- Thunderbird, for instance -- on a computer belonging to an individual that had become accustomed to Outlook Express, you would receive complaints due to the more secure behavior.

      Good security is transparent to an end-user. The problem is that functionality is not, and end-users often prefer functionality at any cost.

      --
      Do you like German cars?
  33. Re:Peer review for software? by tree_frog · · Score: 2, Interesting

    And what makes you think that phone network software isn't peer reviewed?

    regards,
    treefrog

  34. Not a bug by stlhawkeye · · Score: 4, Informative
    In a series of accidents, therapy planning software created by Multidata Systems International, a U.S. firm, miscalculates the proper dosage of radiation for patients undergoing radiation therapy.

    I used to work with the lead programmer on this software package from Multidata. We worked together at two different companies for a total of about four years.

    Multidata's software allows a radiation therapist to draw on a computer screen the placement of metal shields called "blocks" designed to protect healthy tissue from the radiation. But the software will only allow technicians to use four shielding blocks, and the Panamanian doctors wish to use five.

    This is also made very clear in the documentation. This isn't a bug at all; the dosimetrists misused the software.

    The doctors discover that they can trick the software by drawing all five blocks as a single large block with a hole in the middle. What the doctors don't realize is that the Multidata software gives different answers in this configuration depending on how the hole is drawn: draw it in one direction and the correct dose is calculated, draw in another direction and the software recommends twice the necessary exposure.

    Exactly. They tried to create a feature that the software did not support, and they did so in a manner that broke the software.

    At least eight patients die, while another 20 receive overdoses likely to cause significant health problems. The physicians, who were legally required to double-check the computer's calculations by hand, are indicted for murder.

    It's not a software bug, it's a user error. This isn't a bug any more than it's a "bug" that your Linux box stops working properly if you do sudo rm -rf /. The users of the product knew better.

    To be fair, Multidata was not a great shop from a procedural standpoint - the guy who ran it was insane, but the software was rock solid. I actually worked with a number of former Multidata employees who jumped ship and went to a rival shop that builds similar software, and they were all fairly competent and intelligent.

    --
    "I have never won a debate with an ignorant person." -Ali ibn Abi Talib
    1. Re:Not a bug by JMan1 · · Score: 3, Insightful
      Exactly. They tried to create a feature that the software did not support, and they did so in a manner that broke the software.

      Except that the software didn't break well. It should have either reported that the action wasn't allowed or calculated correctly. It shouldn't look like it's working but give erroneous results. If a single block with a hole isn't supported, why are you allowed to select it?

    2. Re:Not a bug by bit01 · · Score: 4, Insightful

      It's not a software bug, it's a user error.

      It's both. The program should not have accepted easily recognised invalid input and the user should not have entered it.

      I don't care if it's not in the spec, it's commonly accepted programming practice that all input should be bounds checked and any program that doesn't do that is crap.

      Your rm example is not equivalent as command line programs are by design flexible; in unusual circumstances it may be exactly what the operator wants to do.

      ---

      Keep your options open!

    3. Re:Not a bug by frankie · · Score: 2, Insightful
      It's not a software bug, it's a user error.

      No. When you design software that is explicitly intended to perform potentially lethal actions on human beings, you absolutely make sure it's foolproof. You do input validation at every freaking step, then double-check the result before you pull the trigger.

      If I go in for LASIK and get my retina burned off because some technician turned the wrong dial up to 11, you bet your ass I'm suing the manufacturer right along side the clinic. It should not be possible for the user to screw up the software when life is on the line.
    4. Re:Not a bug by barole · · Score: 3, Insightful
      I develop medical software for a living and this is the scariest thing I have ever read.

      Your example is incomplete. Imagine that you type "rm -rf / junk" and the system responds "Delete /junk?", so you answer "Y" and it then deletes the whole filesystem.

      It is most certainly a bug. First, there is a mismatch between what is shown on the screen and what the system is doing. That is a bug by any definition. Second, the system obviously had gaps in its validation of input. This makes it no less of a bug than many of the others listed (eg fingerd bug).

      Furthermore, it is the responsibility of designers and developers of medical software to ensure that potential hazards are identified and mitigated. A hazard of "calculated dose does not match image shown on screen" is not some obscure hazard that no one would have thought of - it is the first that comes to mind!

      Please tell me that these people are not involved in medical software anymore.

    5. Re:Not a bug by Ernesto+Alvarez · · Score: 2, Insightful

      Although in this incident there is a clear operator error (attempting to do some function clearly out of spec), the creators of the software are also to blame, if the problem was as you described it.

      Changing the order of the vertices of a geometric figure should not affect what counts as the "inside" of the figure, since the order of the points is irrelevant (geometrically, as in mathematics).
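
      The point about vertex order can be made concrete. A common way to compute a polygon's area is the shoelace formula, whose raw result flips sign when the same outline is traced in the opposite direction. The sketch below only illustrates that general pitfall, with made-up function names; nothing here is known about how the Multidata software actually did its calculation.

```python
def signed_area(vertices):
    """Shoelace formula: positive for counter-clockwise order, negative for clockwise."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to close the outline
        total += x1 * y2 - x2 * y1
    return total / 2.0


def area(vertices):
    """Order-independent area: the same value whichever way the outline is drawn."""
    return abs(signed_area(vertices))


# The same 2x2 square, traced in both directions.
square_ccw = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_cw = list(reversed(square_ccw))

if __name__ == "__main__":
    print(signed_area(square_ccw), signed_area(square_cw))
    print(area(square_ccw), area(square_cw))
```

      Any downstream calculation that consumes the signed value directly will give different answers for the two drawing directions, while one built on the normalised value will not.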

      The software should probably have asked the user (in all cases) which area should be the inside, rather than assuming something that is not clearly defined (especially since we're talking about a potentially lethal assumption).

      As the sibling posts say, a better UI would have probably helped a lot, but there was a fatal mistake in the software from the beginning.

      You shouldn't call software made under insane management and disregarding procedures "rock solid" (especially if there are deaths involved). It is definitely not. I would have supposed that software developers would have taken a hint after Therac-25.

  35. 2003 North American Power Outage???? by darthnoodles · · Score: 5, Interesting
    en.wikipedia.org/wiki/2003_North_America_blackout

    From Wiki page:

    It also found that FirstEnergy did not take remedial action or warn other control centers until it was too late because of a bug in the Unix-based General Electric Energy's XA/21 system that prevented alarms from showing on their control system, and they had inadequate staff to detect and correct the software bug. The cascading effect that resulted ultimately forced the shutdown of more than 100 power plants.

  36. Re:omg by greginnj · · Score: 2, Informative
    And i suppose if I had a "broken" gun in my basement and you broke in and stole it, then tried to use it and injured yourself, you could sue me right?
    ...believe it or not, this is essentially true. A lawyer friend of mine (in NJ) tells me that if you booby-trap your house against thieves, a thief breaks in, and is injured, he can sue you and has some chance of winning. I forget what the actual liability is (it's not 'unsafe working conditions' or something urban-legend-sounding like that), but there are grounds for a suit.
    --
    Read the best of all of Slash: seenonslash.com
  37. intentional bug? by GuyinVA · · Score: 2, Interesting

    "1982 -- Soviet gas pipeline. Operatives working for the U.S. Central Intelligence Agency allegedly plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline"

    Can this really be considered a bug? It was an intentional software error.

  38. Re:Stupid question about the gets() problem... by SnarfQuest · · Score: 2, Insightful

    It is inherently broken by design. You pass it a buffer of indeterminate size (selecting one large enough for your purposes), but you don't have any way of telling the function how big the buffer is. If you read more data than the buffer can handle, bad things happen.

    No, the size of the buffer cannot reliably be determined from inside the function. Not even if you make it a macro.

    Why do they retain it? Because dropping it would break a LOT of existing code. So they have been modifying the compilers to generate warning messages when they see it.
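
    To make the design flaw concrete, here is a toy model of it. This is Python pretending to be a block of memory, not real C: an 8-byte buffer sits directly in front of 8 bytes of unrelated data, and the two helper functions (hypothetical names) mimic a gets()-style write that is never told the buffer size and an fgets()-style write that is.

```python
BUF_START = 0
BUF_SIZE = 8                 # the caller's buffer occupies bytes 0..7
NEIGHBOUR = slice(8, 16)     # unrelated data stored right after the buffer


def fresh_memory() -> bytearray:
    """16 bytes of pretend memory: an empty buffer followed by other data."""
    mem = bytearray(16)
    mem[NEIGHBOUR] = b"RETADDR!"
    return mem


def gets_like(mem: bytearray, line: bytes) -> None:
    """Writes the whole line plus a NUL; has no way to know where the buffer ends.

    Model limitation: keep line at 15 bytes or fewer so it fits the toy memory.
    """
    data = line + b"\0"
    mem[BUF_START:BUF_START + len(data)] = data


def fgets_like(mem: bytearray, size: int, line: bytes) -> None:
    """Writes at most size - 1 bytes plus a NUL, because the caller passed the size."""
    data = line[:size - 1] + b"\0"
    mem[BUF_START:BUF_START + len(data)] = data


if __name__ == "__main__":
    too_long = b"A" * 12

    mem = fresh_memory()
    gets_like(mem, too_long)
    print("after gets-like: ", bytes(mem[NEIGHBOUR]))   # neighbouring data clobbered

    mem = fresh_memory()
    fgets_like(mem, BUF_SIZE, too_long)
    print("after fgets-like:", bytes(mem[NEIGHBOUR]))   # neighbouring data intact
```

    The only difference between the two helpers is the extra size argument, which is exactly the piece of information gets() has no parameter for.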

    --
    Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
  39. Re:omg by daniil · · Score: 2, Insightful
    I'd say that its intent was to INDIRECTLY cause terror.

    You keep using that word, 'terror'. Are you sure you know what it means?

    The fact that there was an explosion of such magnitude doesn't bother me a bit. And I bet the majority of the citizens of the USSR weren't shaken a bit by this explosion, because (drum roll) they never knew such an accident had happened (and that's, for me, the scary part). And nothing spells success better than an act of terror no one finds out about, now does it?

    --
    Man is a slave because freedom is difficult, whereas slavery is easy.
  40. Not a bug but I think this is appropriate by Iphtashu+Fitz · · Score: 4, Interesting

    My dad tells this story from time to time. I don't know if it's true, but it makes a good story. Back in the early days of computers when only big corporations had them, most software was written in-house by staff programmers. One of the major soda manufacturers had a new mainframe and had one of their top programmers write an accounting package for them. It so happens that the manufacturer was a major competitor of 7-Up. Well, for whatever reason the programmer left the company on not-too-good terms. The very next time the manufacturer went to print out a report from the accounting package, every 7th page contained the phrase "Drink 7-Up" in big block letters. They had their remaining programmers go back through the code and try to remove this new "feature" but they were unable to. This guy was so good that he'd embedded the logic for this nastygram right into the actual logic of the accounting package. Supposedly there was code that would dynamically generate other instructions that, when executed, would generate other instructions, etc. They were supposedly unable to get rid of the 7-Up message without breaking other parts of the program, so they ended up having to go back to square one and write a whole new accounting package.

    So the story goes...

  41. Well, they're ok, but not quite the worst by douthat · · Score: 5, Informative

    I think the two worst computer bugs of all time are the two that quite possibly could have wiped us all out. More information here.

    (Copied from the article:)
            * November 9, 1979, when the US made emergency retaliation preparations after NORAD saw on-screen indications that a full-scale Soviet attack had been launched. No attempt was made to use the "red telephone" hotline to clarify the situation with the USSR and it was not until early-warning radar systems confirmed no such launch had taken place that NORAD realised that a computer system test had caused the display errors. A Senator at NORAD at the time described an atmosphere of absolute panic. A GAO investigation led to the construction of an off-site test facility, to prevent similar mistakes subsequently. A fictionalized version of this incident was filmed as the movie WarGames, in which the test system is inadvertently triggered by a teenage hacker believing himself to be playing a video game.

            * September 26, 1983, when Soviet military officer Stanislav Petrov refused to launch ICBMs, despite computer indications that the US had already launched.

            If it weren't for two humans who said "fuck what the computer says!", we might be in a very different place right now.

    --
    She loves me: 09F911029D74E35BD84156C5635688C0 She loves me not: 09F911029D74E35BD84156C5635688BF ...
    1. Re:Well, they're ok, but not quite the worst by hackstraw · · Score: 2, Insightful

      If it weren't for two humans who said "fuck what the computer says!", we might be in a very different place right now.

      I guess that is why they were there.

      Computers are excellent at performing according to the logic that is programmed into them. For the most part, they cannot "think" or take a step back and say, "I'm sure I did everything right, but something still looks wrong". I used to put on my math tests something like, "I know this is not the right answer, but here is my work". To me, that is much more important than purporting that the answer is correct, and most of the time, I had done something stupid that given more time than the class allowed, I would have found the error.

      Just recently, I had an issue with my bank because I had just over my minimum balance to not receive a maintenance fee. But one month I did dip below that minimum value, and I put the money back in shortly after that. Anyway, for a few months I was still getting maintenance fees because the balance was going below the minimum value because the maintenance fee was causing the balance to go below the minimum again. I would go to the bank, show them my balance history, and they would say sorry and refund my account. However, the refund was not applied at the time of incident, but immediately, so next month I would get another fee.

      Finally, I said, "Look, I can't keep coming in here to get this fee removed. Especially when the fee is because of a fee, and I've been able to keep the balance at the agreed-upon amount with the exception of when you keep billing me. I could put more money in the account to compensate for a fee so that it would not drop below the minimum, but in my eyes, that is similar to extortion. I can close the account if necessary." Finally, the banker put a fee hold on my account for 45 days or so, and it's only a memory now.

  42. Probably BS by hughk · · Score: 2, Informative

    I looked at this a while back because many millennia ago, I worked at the company that produced the telemetry/control system for the Trans-Sib pipeline. It was a specialised outfit based in Warwickshire, UK. It is very doubtful that their systems could have been nobbled by anyone. The network was closed, based on an X.25ish HDLC, and the software was blown on to UV-erasable EPROMs. The CIA may have modified the s/w at the pump stations, but again it is doubtful.

    --
    See my journal, I write things there
  43. Did the original post actually quote correctly? by theshowmecanuck · · Score: 2, Interesting
    It would be nice if someone quoting a post in an article to sensationalize, at least made sure the quote was not misleading or wrong... there were no satellites in space during World War One, so of course the Halifax Explosion (which really was the largest non-nuclear explosion recorded) was not the largest non-nuclear explosion seen from space.

    From the post:

    The resulting event is reportedly the largest non-nuclear explosion in the planet's history.

    The actual quote from a hyperlink in the article mentioned in the post:

    "The result was the most monumental non-nuclear explosion and fire ever seen from space"

    The actual largest non-nuclear explosion occurred during World War One in Halifax Harbour when a munitions ship collided with another ship and exploded. It is known as the Halifax Explosion. It was picked up on seismographs and created an 18 metre tsunami.

    --
    -- I ignore anonymous replies to my comments and postings.
  44. Understanding the end user by gmerideth · · Score: 2, Interesting

    Years ago, while working on a project for a medical firm, I found out first hand just how horrible things can go wrong with what we eventually agreed was a "bug" but was more of a "human bug" issue that made me sit up and realize that it's not just programmers who will use our programs.

    Without getting too detailed, the end users were allowing certain conditions to go unchecked as the software was telling them it was "OK". There was a rather neat explosion (read, small) that hurt nobody and damaged some equipment because instead of being "OK" it was telling the operator that there was exactly "ZERO K" of space available for data storage on a recording device and the test needed to be shut down.

    Now, the operators were told that when the counter got low they would see a warning and be told to stop the tests. So was it a bug, or was it my assumption that these 11.95/hour service techs would "understand" what "0K" means as opposed to "OK" (that's a zero (0) and an O there)? Either way, there was some damage, we had a bit of a laugh, but at least nobody got irradiated and died.

    --
    Why do overlook and oversee mean opposite things?
  45. Y2K by Dausha · · Score: 2, Insightful

    What about the Y2K bug? I believe that had a greater economic impact than many of the other "worst."

    --
    What those who want activist courts fear is rule by the people.
  46. Mars Climate Orbiter -- English/metric SNAFU by alumshubby · · Score: 2, Informative

    Speaking of NASA foulups, remember this one? "(CNN) -- NASA lost a $125 million Mars orbiter because a Lockheed Martin engineering team used English units of measurement while the agency's team used the more conventional metric system for a key spacecraft operation, according to a review finding released Thursday."
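
    For a sense of scale: the quantity that got mixed up was thruster impulse, reported in pound-force seconds by one piece of software and read as newton-seconds by another, so every value was off by a factor of about 4.45. One common defence is to make the unit part of the type instead of passing bare numbers around. What follows is a small sketch of that idea; the Impulse class and its method names are invented for illustration and have nothing to do with the actual flight software.

```python
from dataclasses import dataclass

LBF_S_TO_N_S = 4.4482216152605   # one pound-force second expressed in newton-seconds


@dataclass(frozen=True)
class Impulse:
    """An impulse stored in one canonical unit (newton-seconds)."""
    newton_seconds: float

    @classmethod
    def from_lbf_s(cls, value: float) -> "Impulse":
        """Build from pound-force seconds, converting at the boundary."""
        return cls(value * LBF_S_TO_N_S)

    @classmethod
    def from_n_s(cls, value: float) -> "Impulse":
        """Build from newton-seconds; no conversion needed."""
        return cls(value)


def misread_factor() -> float:
    """How far off a bare lbf*s number is when it is taken to be N*s."""
    return LBF_S_TO_N_S


if __name__ == "__main__":
    reported = 10.0   # a bare number with no unit attached
    print(Impulse.from_lbf_s(reported).newton_seconds)   # what was meant
    print(Impulse.from_n_s(reported).newton_seconds)     # what the reader assumed
```

    With a bare float, both interpretations type-check and run; with a wrapper like this, the caller has to say which unit the number is in at the point where it enters the system.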

    --
    "How many light bulbs does it take to change a person?" --BMcC-->
  47. Re:And don't forget your roots... by drinkypoo · · Score: 4, Interesting

    Right, but back then you had to know how they worked to operate them. In fact I've never seen a modernized steam engine that ran itself. You couldn't even crest a hill too fast, or you'd have a flash in the boiler and blow the thing up, potentially killing people who weren't even in or near the engine at the time since there's a lot of energy involved in phase change and the boiler parts are all heavy. Thus steam engineers actually knew something, or they (and many people around them) were in a lot of trouble.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  48. Re:You get what you pay for NONSENSE by mumblestheclown · · Score: 4, Insightful
    Some of you marked the parent comment "insightful" because it doubtlessly seemed like commonsense, reasonable analysis.

    However, you have been fooled. The parent comment is completely at odds with the article.

    The article shows largely a series of examples where you DID have HIGHLY PAID and HIGHLY trained professionals with plenty of experience and oversight, but nevertheless very significant bugs occurred. So, the real lesson from this article is not "you get what you pay for," but rather that "software development is very hard" and perhaps that "by nature of its hardness, we can expect critical flaws to pop up from time to time, even when highly trained, experienced, and monitored programmers are involved."

  49. Abuse of language by Presence1 · · Score: 2, Insightful

    "That CIA gas plant explosion 'bug' is disgusting and has America == No.1 Terrorist written all over it if true."

    I might as well say: "Idiots like you that corrupt the language are worse than terrorists."

    Both are absurd exaggerations that have nothing to do with reality, and only degrade the ability of our language to carry meaning.

    Get Real. Terrorism is the deliberate use of violence against civilians in order to induce a state of terror in the general population, as a method intended to achieve political, religious or ideological goals.

    The CIA were not using violence, they were attempting to cause stolen technology to fail.

    The CIA were not targeting civilians. Moreover, AFAIK, not one person was even killed in the explosion, which happened in a very remote area, and the specific explosion was certainly not planned (they had no knowledge of or control over how the Soviets used the stolen technology).

    The CIA were certainly neither attempting to induce a state of terror, nor to cause change by inducing a state of terror.

    You want to oppose the US government? Great -- there are many good bases on which to do so. But please, before you speak up next time, get some facts, learn how to use the language, and THINK! You might then have a chance of convincing somebody of your point, instead of just annoying them with your ignorance.

  50. Not that "bug found in relay" story again by Anonymous Coward · · Score: 2, Informative
    With that recall, the Prius joined the ranks of the buggy computer -- a club that began in 1947 when engineers found a moth in Panel F, Relay #70 of the Harvard Mark 1 system. The computer was running a test of its multiplier and adder when the engineers noticed something was wrong. The moth was trapped, removed and taped into the computer's logbook with the words: "first actual case of a bug being found."

    I hate to be pedantic (well no, I love it), but according to the Jargon file's entry on "bug":

    Indeed, the use of bug to mean an industrial defect was already established in Thomas Edison's time, and a more specific and rather modern use can be found in an electrical handbook from 1896 (Hawkin's New Catechism of Electricity, Theo. Audel & Co.) which says: "The term 'bug' is used to a limited extent to designate any fault or trouble in the connections or working of electric apparatus." It further notes that the term is "said to have originated in quadruplex telegraphy and have been transferred to all electric apparatus." ...
    Actually, use of bug in the general sense of a disruptive event goes back to Shakespeare! (Henry VI, part III - Act V, Scene II: King Edward: "So, lie thou there. Die thou; and die our fear; For Warwick was a bug that fear'd us all.") In the first edition of Samuel Johnson's dictionary one meaning of bug is "A frightful object; a walking spectre"; this is traced to 'bugbear', a Welsh term for a variety of mythological monster which (to complete the circle) has recently been reintroduced into the popular lexicon through fantasy role-playing games.

    But then again, why expect more from Wired.

  51. Re:And don't forget your roots... by mrisaacs · · Score: 3, Informative

    Actually, engineers existed long before the steam engine. The title of Civil Engineer was created to differentiate the practitioners from the Military Engineers - the most common and probably the oldest usage of the title of engineer.

    --
    ...carrier dead.....
  52. Re:Goose and gander by Jherek+Carnelian · · Score: 3, Informative

    Chile
    Colin Powell's statement: "With respect to your earlier comments about Chile in the 1970s and what happened with Mr. Allende, it is not a part of American history that we're proud of."

    Iran

    Guatemala

    Greece

    There's lots more where those came from -- all democratically elected too. I hope you survive the cognitive dissonance.

  53. Mangement problems by gr8_phk · · Score: 4, Insightful
    "...trained professionals with plenty of experience and oversight, but nevertheless very significant bugs occurred."

    Some of the bugs reported in the story were not so much the fault of programmers, but of management. The phone network bug was a misplaced { character in a nested if-else construct. The code had already been through extensive testing, and then a small change was needed. Because it was a "minor" change someone said it didn't need to go through the extensive (expensive) testing again. It's always easy to point at the code or the guy who wrote it. Especially when the boss is the one tasked with finding out what went wrong.

    1. Re:Mangement problems by TFloore · · Score: 2, Informative

      The phone network bug was a misplaced { character in a nested if-else construct.

      Is that what it was? I thought I'd heard that the AT&T outage was from a missing break; in a switch-case statement.

      I found that more believable, because a missing { would cause a compiler error, where a missing break; is a valid way to purposely fall into the next case.
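
      To make the fall-through point concrete, here is a small sketch in C. The message kinds and function names are invented for illustration and have nothing to do with the actual phone switch code:

```c
/* Hypothetical message kinds. */
enum message_kind { MSG_STATUS, MSG_RESET };

/* Buggy version: MSG_STATUS has no break, so control runs on into
   MSG_RESET. This is legal C and compiles without an error. */
int actions_buggy(enum message_kind kind) {
    int actions = 0;
    switch (kind) {
    case MSG_STATUS:
        actions += 1;   /* stands in for "log the status message" */
        /* BUG: no break here, execution continues into the next case */
    case MSG_RESET:
        actions += 10;  /* stands in for "reset the line" */
        break;
    }
    return actions;
}

/* Fixed version: each case ends with its own break. */
int actions_fixed(enum message_kind kind) {
    int actions = 0;
    switch (kind) {
    case MSG_STATUS:
        actions += 1;
        break;
    case MSG_RESET:
        actions += 10;
        break;
    }
    return actions;
}
```

      A status message returns 11 from the buggy version (both actions ran) and 1 from the fixed one, and nothing in the language itself flags the difference.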

      Though, really, I suspect both of us are just repeating rumors we heard.

      --
      This is my sig. There are many like it but this one is... Oops. Frank, I've got your sig again! Where's mine?
  54. Re:You get what you pay for NONSENSE by tyen · · Score: 5, Insightful

    ...you DID have HIGHLY PAID and HIGHLY trained professionals with plenty of experience and oversight, but nevertheless very significant bugs occurred...So, the real lesson from this article is not "you get what you pay for," but rather that "software development is very hard"...

    It doesn't matter how highly paid and trained your professionals are, if the environment that produces the software is not conducive to eliminating these types of flaws. Like if they are not given enough resources to test and QA the projects they are assigned, there is no organizational commitment to take the time and expense to document properly, or leadership overrides technical objections to project timeframes, etc. Most of the cited projects could probably be classified as failures of project management rather than failures of the end product (the software) that these flawed projects produced. Yes, software is hard and the software profession should continue its efforts to improve quality, but that doesn't let the organizational culture, leadership and processes that produced the software in these cases off the hook.

    Why is it when the accounting profession makes spectacular mistakes that take down entire Fortune 500 class organizations, there is a critical analysis of the processes that led to these failures, and remedies often comprise prescriptive measures for these processes, but similar analysis for software failures focus upon the software flaw but not the environment that allowed the flaw to emerge? Now sometimes the remedy in the accounting case might not make complete sense (like SOX), but the point here is people don't look at just the end result (the accounting system transactions) of the accounting process.

  55. From the quoted article... by Anonymous Coward · · Score: 2, Informative

    Some outside observers, however, said they are not convinced NT is blameless.

    "It still boggles the mind that any divide by zero error on NT would cause a system to crash, let alone" 27 end-user terminals, said Gil Young, corporate network engineer for a systems integration firm in Orlando, Fla. "I don't care what operating system, computer or application I'm using, I should be able to type in a zero and expect the computer not to crash, especially if that zero is to represent a closed valve."
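
    For what it's worth, guarding a division against a zero input takes only a few lines. A minimal sketch in C with a hypothetical helper name; it says nothing about what the application in the article actually did:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores numerator / denominator in *quotient and
   returns true. Returns false and leaves *quotient untouched for the two
   inputs where integer division is undefined behavior in C: a zero
   denominator, and LONG_MIN / -1 (the result does not fit in a long). */
bool checked_divide(long numerator, long denominator, long *quotient) {
    if (denominator == 0) {
        return false;
    }
    if (numerator == LONG_MIN && denominator == -1) {
        return false;
    }
    *quotient = numerator / denominator;
    return true;
}
```

    The caller then has to decide what a zero means (here, perhaps "valve closed") instead of letting the arithmetic decide for it.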

  56. Re:Whoops forgot to hit preview by RobinH · · Score: 4, Insightful

    Look, I write software for control systems (and I design them electrically too). Just because programmers at Microsoft or EA Games have tight schedules where they are just too stressed to write code well doesn't mean all code needs to be written like that.

    Back to what you were saying, if you have a system that could cause damage or whatever, then you start by writing your output routines, and you create rules to govern the machine (i.e. outputs A and B can't come on at the same time, or output C can't exceed this value). Then you write another module that monitors the inputs AND outputs looking for fault conditions that shuts down the machine if you do anything dangerous. Only this part of the code needs to be signed off by an engineer. Typically it's simple code, and easy to prove correct, with peer review. Then you write other modules that essentially make requests through the safety checks to do anything. You don't have to review the complex other code so much, because your output stage should catch any mistakes.
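
    The structure described above can be sketched roughly as follows in C. The output names, the limit of 100, and the choice of "everything off" as the safe state are assumptions made for illustration:

```c
#include <stdbool.h>

/* Hypothetical output image: two interlocked outputs and one analog value. */
typedef struct {
    bool output_a;
    bool output_b;
    int  output_c;
} Outputs;

/* Assumed upper limit for output C. */
#define OUTPUT_C_MAX 100

/* The small rule set that gets the careful review and sign-off. */
bool outputs_are_safe(const Outputs *requested) {
    if (requested->output_a && requested->output_b) {
        return false;  /* A and B may never be on together */
    }
    if (requested->output_c < 0 || requested->output_c > OUTPUT_C_MAX) {
        return false;  /* C must stay within its limits */
    }
    return true;
}

/* Every other module requests outputs through this gate. An unsafe
   request forces the assumed safe state (everything off) and is refused. */
bool apply_request(Outputs *actual, const Outputs *requested) {
    if (!outputs_are_safe(requested)) {
        actual->output_a = false;
        actual->output_b = false;
        actual->output_c = 0;
        return false;
    }
    *actual = *requested;
    return true;
}
```

    The complicated application code can be as buggy as it likes; the worst it can do is ask for something that apply_request() refuses.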

    That's how you make a machine safe. Unfortunately, most engineers I know just go out and write the software figuring there's no difference, and that's how bad things happen. It comes from believing you won't make a mistake, or believing that testing will catch all problems. If you plan from the start that you're going to be making mistakes, you can catch them before damage is done. It's too bad this isn't taught, even in the software engineering classes I took at a Canadian university.

    --
    "I have never let my schooling interfere with my education." - Mark Twain
  57. Re:You get what you pay for NONSENSE by loose_cannon_gamer · · Score: 4, Insightful
    I think both the parent and grandparent have some validity. I'm a master's student in CS who has managed never to take a software engineering class before this semester (and I graduate in December). This has been an eye opening experience. Let me point out some of the well known highly advocated techniques which, as far as I can tell, most graduates and many 'out in the field' software engineering professionals don't do that would help avoid these bugs.

    1. Design reviews, by peers and independents

    2. Code reviews, by peers and independents

    3. Regular, organized unit testing

    4. Correctness proving

    5. Documentation in about a bazillion forms

    6. Defect tracking

    7. Effective software process metrics measurement and improvement

    8. Continuing education

    9. Humility / egoless programming

    This list was assembled in about a minute off the top of my head. I work in a CMM3/4 type organization, and although there are processes for these things, most people don't use them, or consider them a hassle.

    So my point is, the parent is right -- creating good software, even when done by properly trained experts with great experience -- is hard. But the grandparent is right too -- doing all of the above to 'do it right' takes time and money, and many organizations, and by this I mean software process management as well as the actual engineers, don't understand the value / aren't willing to pay for or aren't willing to do all that work. And occasionally, as the article shows, the piper comes and takes his payment.

    --
    In Soviet Russia, us are belong to all your base.
  58. Re:And don't forget your roots... by Anonymous Coward · · Score: 2, Interesting

    Engines have existed for centuries. Roman engineers, for example, built siege engines (ballistas, and the like).

    Dictionary.com
    Engineer
    [Middle English enginour, from Old French engigneor, from Medieval Latin ingenitor, contriver, from ingenire, to contrive, from Latin ingenium, ability. See engine.]

    Engine
    n. 1.1. A machine that converts energy into mechanical force or motion.
          2.1. A mechanical appliance, instrument, or tool: engines of war.

    [Middle English engin, skill, machine, from Old French, innate ability, from Latin ingenium. See gen- in Indo-European Roots.]

  59. Re:Microsoft's striking absence by uncqual · · Score: 2, Insightful
    But then if I walked in and saw that the machine that was about to pump tons of radiation into me was running Windows, I'd turn around and walk back out.
    I would also - but it's probably pretty impractical to tell if just the operator interface is running Windows, or if the low level controller of the electromechanical sensors, switches and actuators is also running Windows. I wouldn't worry about the former, I'd worry a lot about the latter. (Well, to be honest, I probably would accept the treatments anyway - risk analysis would lead me to trust Windows more than trusting my body to eradicate the cancer on its own)

    My guess is that the operator really wouldn't know either (although she would probably assure me that "it's very safe").

    --
    Why is there an "insightful" mod and why isn't it "-1"? If I wanted insight, I wouldn't be reading /.
  60. Re:You get what you pay for NONSENSE by twiddlingbits · · Score: 2, Informative

    You missed one.

    Item 0: Requirements reviews by peers and independents. If you don't have good requirements you obviously don't know things well enough to be building them. Sure, you can catch some requirements issues in 1 and 2, but the longer you wait the costlier it is to fix.

    An MSCS is NOT a Software Engineering degree, so why WOULD you take courses in SE? I'd say that CS and SE are two different professions. There are places to get an MS SE (Texas Tech comes to mind) if you are interested.

  61. Disagree with them on some bugs by AviLazar · · Score: 2, Insightful

    Some of the bugs they listed are not truly bugs.
    Soviet Gas Pipeline...This was a desired feature working just as intended (unless the CIA didn't want to blow up the pipeline).

    Buffer Overflow in Berkeley - a worm is not a bug. It is a program designed to infiltrate a system and do something. While the people utilizing the program may not have intended this to happen (duh), the makers of the worm did.

    A bug is an unwanted aspect of the code as implemented by the people who wrote (or edited) the code, but this does not include something affected by a virus/worm. A program that crashes every six minutes for no apparent or intended reason has a bug...a program that gets infected by a virus which causes it to crash every six minutes does not. Also, a piece of code that is intentionally inserted in the hopes of crashing a system is not a bug...it is a feature. It may be undesirable, but it is a feature.

    --

    I mod down so you can mod up. Your welcome.
  62. 22222222 missiles ... almost launched WWIII by Khopesh · · Score: 2, Interesting
    My favorite bug is a computer chip in the US surveillance of Soviet Russia's missile silos. Basically, some early warning system stated that Russia had launched something like 2222222 missiles from every source they had. (I'm not sure of the actual number, but it only contained 2's.)

    Some person down the line noticed that the Russians didn't have that many missiles, couldn't have launched them all with such synchronization, and that there were an awful lot of two's in the report ... actually, every digit of every number was a two. It turned out to be a fried chip somewhere, always pumping out the same bit regardless of input (I have no understanding of the technical side of the issue; maybe it hit the 32-bit limit and the int->string function reacted with 2's).

    Good thing we were not too automated, and that we employed somebody smart enough to critically examine his printouts.

    Disclaimer, this is a favorite tidbit of one of my professors ... I have no real source to refer to.

    --
    Use my userscript to add story images to Slashdot. There's no going back.
  63. Re:Whoops forgot to hit preview by CastrTroy · · Score: 2, Insightful

    That works great for the kind of machines you described, and I wasn't saying good code couldn't be written, I'm saying that it isn't usually written, and won't be written in off the shelf software. The problem is, if you look at something like a Mars lander, then you can't just shut it down if it gets some bad inputs. Also, even good inputs can result in the machine not doing what it needs to do. If it has to land on Mars, and some input tells it to fire the left rocket for 4 seconds, then the input may fall within proper values, but may push it way off course. There's no GPS in space, so you can't get your position very accurately, and if you go way off course, you may not have enough fuel to get back on course.

    --

    Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
  64. Licensing - ACM Position by Embedded+Geek · · Score: 2, Interesting
    I recently completed an "Ethics in the Information Age" class for grad school (my earlier M.S. and undergrad predated such focused classes). As part of the discussions, we talked quite a bit about the Software Engineering Code of Ethics created by the ACM and IEEE and how such a code was a precursor to making software engineering a licensed, certified profession (akin to a CPA). So I figured it'd be neat to link to ACM's page advocating licensing.

    Guess what: They don't, although they appear to be hedging their bets with safety critical software.

    An interesting read...

    --

    "Prepare for the worst - hope for the best."

  65. Re:A radiology system written in Java 1.1????! by koehn · · Score: 4, Insightful

    It really doesn't matter what language you use: bugs can be written in any of them. In this case, the customer wanted a GUI workstation running on Windows, with the possibility of being cross-platform. Java was new and cool (1.1 had been out for six weeks when we started), and they decided to give it a shot. This is a company with fifty years' experience in medical systems, not some dotcom startup, so the procedures are in place to make sure that their products don't kill people.

    As it turns out, JDK1.1 (along with a native-C library for quick image processing, and a custom PCI card for doing 30MB/sec image transfers) was just fine for the task. We had a team of seven testers working on the project full time for a year, and were able to ship with zero severity 1-2 defects.

    We set a new record for lowest defects/KLOC at the customer (a major player in the medical systems industry), despite running JDK 1.1 on Windows NT 4. Our product was several times faster than the C-based product it replaced, had more functionality, and provided more accurate diagnosis for the patient.

    Good design is the most important thing in developing good software. The language/runtime/OS can provide crutches to save you if you screw up, but bad design will result in defects no matter how sturdy the crutches are.

  66. Re:Whoops forgot to hit preview by RobinH · · Score: 2, Insightful

    The problem is, if you look at something like a mars lander, then you can't just shut it down if it gets some bad inputs.

    True, it's just that in my line of work, off is usually the safe state, but what should be done is to go to some kind of safe state, whatever that may be. Sometimes you revert to a manual operation, for instance.

    Also, even good inputs can result in the machine not doing what it needs to do.

    Which is why you need to also hire a mechanical and an electrical engineer to design those aspects so that the mechanical and electrical systems fail in a safe and detectable way.

    For instance, it used to be that stoplights were designed with a physical disk inside that rotated creating the "program" of the different lights. You also had interlocked electrical circuits so that both greens could never come on at the same time. These are mechanical and electrical ways to make the system fail to a safe condition. I have recently been at an intersection where I saw the traffic lights had green both ways (during a storm). This is because some vendor is selling a traffic light system on the market that is completely software based, and they hired a bargain basement programmer and/or engineer to design it, and we should find them and shoot them for their incompetence, but I doubt that will happen.

    --
    "I have never let my schooling interfere with my education." - Mark Twain
  67. Re:gets() by mfrank · · Score: 2, Funny

    All you really have to do is force everyone at the company to use an include file with this:

    #define gets() DONT_USE_GETS_YOU_MORON()
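
    The macro only gets you as far as a failed build; the call still has to be replaced with something. gets() cannot be told how large the buffer is (and was removed from the language in C11), whereas fgets() takes the size. A minimal sketch of a replacement, where read_line is a hypothetical helper name:

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from stream into buf (capacity
   size bytes) and strips the trailing newline. Returns 1 on success and
   0 on end of file, read error, or an unusable size. Input longer than
   the buffer is truncated, and the remainder stays in the stream for the
   next call. */
int read_line(FILE *stream, char *buf, size_t size) {
    if (size == 0 || size > (size_t)INT_MAX) {
        return 0;
    }
    if (fgets(buf, (int)size, stream) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';  /* drop the newline if one was read */
    return 1;
}
```

    Called as read_line(stdin, buf, sizeof buf), it never writes past the end of buf no matter how long the input line is.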

  68. Re:The more I learn, the less I know by jp10558 · · Score: 2, Insightful

    I think the whole thing has to do with the different spheres of knowing (IIRC - the actual title might be different):

    1. Knowledge you have that you are aware of
    2. Knowledge you have that you are ignorant of
    3. Knowledge you are aware you are ignorant of
    4. Knowledge you are not aware you are ignorant of

    So, as you move knowledge from the other areas into area 1, you tend to pull things "up" if you will. Knowledge moves from 4 to 3 as well.

    2 isn't a contradiction, just that you might not be aware that some "tip" is true, or may not realize at a certain time that certain stuff you know is actually relevant to the situation at hand.

    The scary part is area 4 is a default, so the less you move "up", the less you are aware that you don't know things. This is why lots of people say things like you did - the more you learn, the more you learn you don't know.

    --
    Opera, Proxomitron-Grypen,GPG 0x0A1C6EE3
  69. Re:If Engineers built buildings by legirons · · Score: 4, Insightful

    "If Engineers built buildings the way computer programmers wrote programs, the first woodpecker that came along would destroy civilization."

    If engineers built buildings the way computer programmers wrote programs, an average engineer would be able to build an array of radio telescopes by himself in one evening. A team of 30 engineers would be able to build a ringworld in 3 months.

    i.e. it would be nice if software were like designing an office, where there were 3 architects, 5 engineers, a building inspector, and 50 professional workmen to examine a system containing just a few hundred variables, and almost identical to the last 20 buildings they'd constructed.

    And in case that didn't start a flamewar, how about...

    "Just one unexpected input (of an aeroplane) caused the failure of two of New York's biggest civil engineering projects -- imagine how they'd cope with being attacked every 3 seconds like some internet software"

  70. Re:Goose and gander by minus_273 · · Score: 2, Insightful

    "Dude, focus here - democractically elected - not sham election"

    Right, and you verified the elections of these people how? Let me give you an example: Hugo Chavez led a military coup to take over his country and then later was "democratically elected" according to Jimmy Carter. I am going to assume you consider his government legitimate.

    I can tell you one thing, if the US hadn't intervened in many of those countries, they wouldn't have free democratic governments today. It is far easier to remove a dictator who is just a man than it is to eliminate an entrenched political party/idea like communism. Compare N. Korea to S. Korea or Taiwan to China... etc etc.

    --
    The war with islam is a war on the beast
    The war on terror is a war for peace