Debug your Code, or Else!
Trevor Lovett writes "I ran across a collection of famous software bugs that have caused large-scale disasters, including the explosion of the Ariane 5 rocket due to integer overflow and the misfiring of a US Patriot missile that caused 28 deaths because of accumulated floating point error."
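The accumulated error behind the Patriot failure can be reproduced with a few lines of arithmetic. The sketch below follows the commonly cited analysis of the incident: time was counted in tenths of a second, and the constant 0.1 was stored chopped to 23 binary fractional bits. Those figures, and the names chopped_tenth and clock_drift_seconds, are assumptions for illustration and do not come from the linked collection.

```python
from fractions import Fraction

# Assumptions for illustration (not taken from the linked article):
FRACTION_BITS = 23      # binary fractional bits kept when storing 0.1
TICKS_PER_SECOND = 10   # the clock counts tenths of a second


def chopped_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 1/10 truncated (not rounded) to `bits` binary fractional bits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours: int, bits: int = FRACTION_BITS) -> float:
    """Total timing error after `hours` of uptime, in seconds."""
    ticks = hours * 3600 * TICKS_PER_SECOND
    per_tick_error = Fraction(1, 10) - chopped_tenth(bits)
    return float(ticks * per_tick_error)


if __name__ == "__main__":
    for hours in (1, 8, 100):
        print(f"{hours:>3} h uptime: drift {clock_drift_seconds(hours):.4f} s")
```

Under these assumptions the error per tick is tiny (about 0.000000095 s), but after 100 hours of continuous operation it adds up to roughly a third of a second, which is a long time when tracking a fast-moving target.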
It's really amazing how many software project managers don't fully understand what regression testing is all about.
Software engineers simply cannot be trusted to do more than small unit level testing! We get into a pattern of behavior, we know what to expect, and simply do not stress test the system.
That's why I like hiring salespeople and 2-year-olds to test my code at the unit/integration level.
Old age and treachery almost always overcome youth and skill.
Software Horror Stories linked from the post's link
I believe more patients' lives are lost because of mistakes by doctors/hospitals/nurses, or sheer negligence. In some parts of India, for example, private hospitals are afraid to admit victims of accidents or crimes because the hospital itself might get into some trouble. Personally, I have seen doctors giving stupid advice, and people losing lives.
To put things in perspective, fatalities caused by human errors (non-programming-related) outnumber those caused by software errors by orders of magnitude in many fields (except, say, in launching unmanned space vehicles).
Sure, some people here gripe about this not being newsworthy. But as a hardware guy, I am happy to see that software guys are finally going to be held to some sort of standard.
In electronics, if your hardware has ONE little problem, it's almost bankruptcy time. Remember the Pentium FP bug? And how little it would actually have affected? Remember the hoopla, people wanting new processors, etc.?
But software bugs? Who cares! It's NORMAL, it's EXPECTED. Well, geeks and nerds, time to get your asses in gear and live up to the same standards mechanical and electrical engineers have been living up to for decades.
I'm tired of being held to a standard of perfection that the software people (who make more money than me!) don't even KNOW about.
The bug that caused the Ariane explosion was a requirements analysis bug. The Pentium FP bug was a hardware bug.
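Whatever label the root cause gets, the operation that failed on Ariane 5 is usually described as a conversion of a 64-bit floating point value to a 16-bit signed integer. A minimal sketch of the difference between an unchecked and a range-checked narrowing conversion follows. The function names and sample values are hypothetical, and this is not a reconstruction of the flight software.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_to_int16(value: float) -> int:
    """Unchecked narrowing: keep only the low 16 bits (two's complement)."""
    raw = int(value) & 0xFFFF
    return raw - 0x10000 if raw >= 0x8000 else raw


def checked_to_int16(value: float) -> int:
    """Range-checked narrowing: refuse values that do not fit in 16 bits."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a signed 16-bit integer")
    return truncated


if __name__ == "__main__":
    sample = 40000.0  # hypothetical reading, larger than INT16_MAX
    print("unchecked:", wrap_to_int16(sample))
    try:
        checked_to_int16(sample)
    except OverflowError as exc:
        print("checked:  ", exc)
```

The check itself is trivial. The harder question, and the one the requirements argument is about, is deciding what the system should do when the check fails.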
A quick skim of the rest nets me at least 6 more non-software software bugs
After seeing that, I can't really trust the list on things I don't have a good knowledge about.
Here's a challenge for someone: Go through the list and find out how many (if any) of the listed software bugs are actually software bugs.
and goes on to say:
seven times the loss of human life, but less of a tragedy? I guess they are soldiers so fuck 'em, eh?
This story is over two years old, so they have had ample opportunity to correct it. The "comment" button on that page just takes me to the front page. Nice.
Also on that page, "The DoubleSpace automati hard disk comparision software included in Microsoft MS-DOS 6.0 [. .
Ironic that there are such glaring errors in an article about buggy software.
Well, I wasn't particularly a fan of Byte before, but now I'm convinced that they suck.
-Peter
Your first link is a translation of a patriotic Israeli article cheerleading the competence of their military. That doesn't necessarily make what they're saying false, but it does make it suspect.
The second link is way low on content, I'm not sure how to judge it. All it says is "we looked at a bunch of videotapes and arrived at this conclusion". And then goes on to mention the bitter dispute between the U.S. and Israeli military over why the system didn't work so well in Israel.
I'm not sure I'm going to buy either argument. I know enough about flight characteristics to question the assertion that the Scuds were so good at jinking and releasing chaff that the Patriots (which were originally designed to hit jinking, chaff-releasing aircraft) couldn't hit them.
If the scuds were dropping debris because extra fuel tanks made them unstable:
1) Why wasn't the wobble a pronounced problem at launch when the extra weight would have completely thrown off the trim characteristics of the missile?
2) Dropping "debris" is a bad thing, and it's only a matter of time before doing so results in an uncorrectable failure of the missile's flight aerodynamics. Why weren't most of them failing earlier?
3) Missiles don't fly in smooth trajectories nearly as often as you think. They jink to try to make anti-missile systems (like, say, the Phalanx close-in weapons system) miss them or think they are dead and not worth any more attention.
Even if the Patriots did fail, why would that have grave implications for our anti-ballistic missile shield? SCUDs are cruise missiles, not ballistic missiles. Why do you think those big computers at NORAD can accurately predict where the warheads will hit just after boost?
Education is a better safeguard of liberty than a standing army.
Edward Everett (1794 - 1865)
I think most of the bugs in software are the result of "Coding Under the Influence". Whether it is a strict time limit, ambiguous specifications, no sleep or other disturbances, it leads to blatantly dumb assumptions or similar faults. Everyone knows that driving under the influence is dangerous and can lead to accidents. Why do "software architects" think this is different when someone writes important programs?
I think part of the problem is that writing software is a rather new craft compared to, e.g., metalworking. Programmers don't have a union, and they often work under poorer conditions than workers on conveyor belts, especially considering the greater responsibility they carry.
I'm pretty sure the media has mentioned this, beyond those two media links you already posted, I mean. The issue has been debated since the first Patriot experiences during the Gulf War.
But I don't really see how this has "grave implications" for an anti-ballistic missile shield. The effectiveness of the Patriot missile used during the Gulf War era is in doubt, but that does nothing to invalidate the general concept of destroying a ballistic missile with an interceptor missile. It certainly isn't easy to do, and there may be better ways to accomplish the same goal or things more worthy of our limited resources, but to claim that it's somehow physically impossible is both disingenuous and incorrect.