Can Software Kill?
mykepredko writes "Eweek has an interesting, if somewhat long article titled Can Software Kill? The article focuses on a programming error that resulted in 28 Panamanian cancer patients receiving many times an expected lethal dose of radiation. The article briefly mentions, but doesn't go into detail, the 1991 Patriot Missile Failure that resulted in the deaths of 28 American service men and women."
If a software maker is found negligible and convicted of manslaughter (unintentionaly causing death) due to buggy software, would that void out the whole EULA business since they all claim they can't be held responsible? Or would the burden pass on the poor chap that used it for being irresponsible enough to use something where the maker couldn't be held accountable? Lets's face it, why are only software companies able to make themselves free from accountability when every other industry has to design for it?
the therac-25 actually injured a fair number of people in the US 10-15 years ago. yeah, software fucks up sometimes. it's old news. for the article:
Nancy G. Leveson and Clark S. Turner. An investigation of the Therac-25 accidents. Computer 26, 7 (July, 1993) pages 18-41.
-ninjaneer
The software is only one piece of the puzzle, of course. Its killing was enabled by the hubris of its developers and the blind trust of its users.
Not by itself, no.
An autopilot that is consistently 1000 feet off, a poorly written control routine for an MRI, miscalibrated antilock brakes...can certainly cause death.
But ultimately, it comes back to whoever wrote it. Or specced it. Or tested it.
Software by itself is benign.
Human implementation of it may be lacking, though.
Can negligence in any area kill? Yes.
Software is no different from hardware in this aspect. If it is handling mission-critical or potentially-lethal equipment... great care should be taken to ensure its integrity.
Trusting those that make your irraditation software is no different from trusting the those that made your life-support hardware.
Human error, or mechanical, can mean death in both cases. If the error is glaring, it becomes a case of negligence.
Unfortunately in cases of software or even computer hardware operating environment becomes an often overlooked factor. Stress tests are needed... data collisions checked for, line noise, redundancy, etc. When we're talking about people's lives, that extra parity bit can be just as important as a backup-parachute...
There must be a point where software makers can no longer say "DISCLAIMER: IF WE BREAK YOUR MACHINE, IT'S NOT OUR FAULT." If you look at every piece of software's license, you'll see a clause like that. Imagine if every industry took that approach:
DISCLAIMER: IF YOUR CAR'S BRAKES FAIL, IT'S NOT OUR FAULT. TOUGH LUCK!
DISCALIMER: IF THIS MEDICINE KILLS YOU, OH WELL.. NOT OUR FAULT!
etc.
Some laws must be passed and software makers must be held accountable- they should no longer be able to hide under the big umbrella of the disclaimer.
IIRC, Microsoft's license has had since day zero, a clause to the effect that you are not legally allowed to use their software to control nuclear reactors, medical devices, avionics or any other application that could endanger human life. THERE'S A REASON THAT'S THERE!
If you DO have such an application, the software vendor : 1) takes much greater care in design, implementation and testing, 2) carries a godawful ginourmous insurance rider to cover any such failure.
There is a segment of the industry that works in this niche and is well aware of the risks and how to best manage them. It's goddamned clueless PHBs that think Microsoft == Software that don't understand simple goddamned little nuances like this.
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
The problem is that in every other development environment, the legal liability ultimately rests on the engineer who signed off on the quality assurance. But because software developers are not professionals and have no professional code of conduct, their signatures are meaningless. The only way software can become as reliable as other engineered products is to create the profession of software engineering*. And I'm not just talking about giving CompSci students a ring: many CompSci curriculums don't require any engineering techniques at all, and those that do usually devote less time to engineering than they do to sorting algorithms. The software industry requires fundamental changes, and legal liability is at most the catalyst.
* Yes, I know there are a couple of schools out there that offer SoftEng degrees, but until industry distinguishes them from CompScists and requires the engineering designation for key positions they are meaningless.
The next time you visit the doctor watch the workflow of the office staff. Increasingly chances are they will probably be entering your medical information, and I mean the clinical stuff, not your address into some type of computer system.
I currently work for a small Electronic Medical Records company. At some level I worry about potentially killing someone every day. In fact our bug tracking tool has a special category in it called "Patient Safety" which is the highest priority bug. We deal with things most of you probably wouldn't think of such as a tool for writing Prescriptions, which given the fact that many drugs interact ( potentially fatally) has to catch and alert the physician to such cases. I also deal with lab results which if reported incorrectly could lead to a potentially fatal decision by the doctor and so forth.
Consultants and pundits like to say that computer control reduces the chances of human error and failure, this is said IMO to comfort the masses. To state the obvious I suspect EVERYONE on Slashdot knows that in reality that statement is not true, the human error has just been moved to a different point in the chain. A tired programmer is just as likely to make a mistake as a tired machinery operator. The difference is that that software might be used by 5,000 machines, whearas that operator runs 1.
Actually, if you check the link in the article, it explains all about the Patriot failoure. It was a "range gate error" caused by clock drift. The patriot was designed as a mobile anti-aircraft SAM and, as such, was never designed to run for more than a few hours at a time. The one at Dhahran had been running for over 100 hours. It was the Israelis who first noticed the clock drift problem and they reported it to Raytheon. The problem was caused by programmers who would "round off" the clock increment before storing it in order to save a couble bytes of memory. This rounding error was inconsequential so long as the system was rebooted every few hours (which a mobile SAM on the move would do), but it could easily grow to cause huge errors if the computers were left running continuously, as they were on SCUD intercept duty. Raytheon's solution was to send out a warning followed by a patch to fix the error. Unfortunately, in classic Raytheon bumbling style, the warning was "'very long run times' could affect the targeting accuracy", with no indication what "very long" was, or how much it would affect accuracy. The Alpha battery at Dhahran only ran so long because the Bravo battery was having radar trouble and Alpha was picking up the slack. The operators had no idea the range gate tracking was off by 600+ meters, otherwise obviously they'd have rebooted to fix it.
If a job's not worth doing, it's not worth doing right.
In a former life ( :-> ) I was employed by a large multi-national that worked with utilities. Some of our software used SCADA protocols to remotely switch breakers - not household breakers, these switches control significant segments of the US power grid.
All the software and documentation contained numerous warnings, because if a utility employee manually switched of a segment to make repairs, and switch was remotely turned on, someone could be killed.
There are numerous other software applications that control (potentially) deadly devices - robots, industrial equipment, etc. Failure of the software, or problems with operator headspace, create a potential for death when working with almost any software that controls physical entities.
Interested in a Flash-based MAME front end? Visit mame.danzbb.com
We write software for railroad traffic control and crossing warning systems. If it fails we could end up with two trains trying to occupy the same piece of track at the same time (ref. Clapham Junction 35 dead) or gates that stay up when the train comes. Testing is very formal and rigorous and every step is documented.
For every hour we spend making sure the system does what it's supposed to do, we spend eight hours making sure it doesn't do what it's not supposed to do.
None of them can see the clouds; The polished wings don't care.