Richard Feynman, the Challenger, and Engineering
An anonymous reader writes "When Richard Feynman investigated the Challenger disaster as a member of the Rogers Commission, he issued a scathing report containing brilliant, insightful commentary on the nature of engineering. This short essay relates Feynman's commentary to modern software development."
The problem with the shuttle disaster (both of them, really) is external pressures that are not in anyway at all scientific. The pressure from your manager at Morton Thiokol to perform better, faster and cheaper. The pressure from the government to beat those damned ruskies into space at all costs.
So this is really a case of engineering ethics, when do you push back? As a software developer, I never push back. Me: "There's a bug that happens once every 1,000 uses of this web survey but it would take me a week to pin it down and fix it." My Boss: "Screw it--the user will blame that on the intarweb, just keep moving forward." But could I consciously say the same thing about a shuttle with people's lives at stake? No, I could not.
So when an engineer at Morton Thiokol said that they hadn't tested the O-Ring at that weather temperature that fateful day and that information was either not relayed or lost all the way up to the people at NASA who were about to launch--it wasn't a failure of engineering, it was a failure of ethics. External forces had mutated engineering into a liability, not an asset.
And there's a whole slough of them I studied in college: * Space Shuttle Columbia disaster (2003)
* Space Shuttle Challenger disaster (1986)
* Chernobyl disaster (1986)
* Bhopal disaster (1984)
* Kansas City Hyatt Regency walkway collapse (1981)
* Love Canal (1980), Lois Gibbs
* Three Mile Island accident (1979)
* Citigroup Center (1978), William LeMessurier
* Ford Pinto safety problems (1970s)
* Minamata disease (1908-1973)
* Chevrolet Corvair safety problems (1960s), Ralph Nader, and Unsafe at Any Speed
* Boston molasses disaster (1919)
* Quebec Bridge collapse (1907), Theodore Cooper
* Johnstown Flood (1889), South Fork Fishing and Hunting Club
* Tay Bridge Disaster (1879), Thomas Bouch, William Henry Barlow, and William Yolland
* Ashtabula River Railroad Disaster (1876), Amasa Stone So I agree with Feynman's comments in relationship to engineering and the further comments to software development. But I don't find them to be a fault in the nature of engineering, just a fault in our ethics. What does capitalism and competitiveness drive us to do? Cut corners, often.
My work here is dung.
To be fair, the Challenger disaster actually preceeded NASA's slogan and procurement policy of "faster, better, cheaper" by a bit. More to the point, Feynman's article should be a cautionary tale to ANYONE in a engineering field. It isn't a matter of one field being subject to unscientific pressures and another field being immune. No technology or industry is immune from the pressures and problems that caused the challenger disaster. Anyone who claims to be well adapted to safety concerns enough to not spend lots of time and effort on fixing them is foolish. The nuclear industry still has to practice strong QC on parts, procedures and maintenance and CONTINUE that practice. Same with commercial aviation, acute medical care, etc. Constant vigilance is rewarded only with another uneventful day. That is the fundamental problem. Vigilance is expensive and time consuming. these are not pressures from the profit motive. They apply to government as well as civilian ventures.
(I will refrain from a four-step Profit post). Standard technique: latch on to an essay by a brilliant and insightful person. Extend the insights of that person slightly into a different field with usual compare-and-contrast, brand-extension writing techniques. Claim that resulting essay (and self) are as insightful as the original essayist.
It doesn't work 99.994% of the time, generally because very few people are as insightful as the original brilliant person.
sPh
The blog post makes a nice contribution by linking to Feynman's original thoughts (for example, here: http://www.ranum.com/security/computer_security/editorials/dumb/feynman.html ), ones I haven't read for a long time (and was happy to be reminded of). However, the author makes the mistake of thinking that the original thoughts need to be interpreted and summarized for the reader. Feynman's words by themselves are simple to understand, are concise, and contain just the tone for which geeks go gaga. Anyone interested in the subject will be able to make his or her own judgements about the engineering and politics involved in the Shuttle development, engineering in general, and the extensions to software development.
I've been in software quality and testing for 14 years. I've worked at very large corporations as well as startups. There is a WIDE gap in software development process in our industry. Many people like to call themselves software engineers when they are developers. There is a huge difference. Engineering is a discipline that follows well-defined rules, and it usually takes time. But I think the very important thing to point out is that some software requires engineering - other software does not. If I go into a startup company that is trying to develop a blog/wiki site and try to implement a NASA-like software development methodology, they will fail. Likewise, software to control a heart monitor should be engineered and closely controlled. Sometimes quality and perfection is the goal, other times it might be time-to-market that is critical. You have to fit the process to your business. A bridge is a bridge, and they should all be engineered pretty much in the same way. You can't say the same thing about software.
I think that this is a very key point to software development. I have seen companies who spent entirely too much time and money trying to eliminate all defects from their software when it wasn't the critical part of their business. Yes, we should always strive to eliminate defects, but you can't get them all. You have to know when to pick your battles, and when to accept the risks. If we're talking about life-or-death software, or security, or other very critical things - you need to focus on those.
There's a grid I have seen used that is a great tool when doing projects.
Schedule, Cost, Quality, Scope.
1 can be optimized, 1 is a constraint, and the other 2 you have to accept. Period. It is a more useful version of the "fast, good, cheap - pick two"
My beliefs do not require that you agree with them.