Slashdot Mirror


Murphy's Law Rules NASA

3x37 writes "James Oberg, former long-time NASA operations employee, now journalist, wrote an MSNBC article about the reality of Murphy's Law at NASA. Interesting that the incident that sparked Murphy's Law over 50 years ago had a nearly identical cause as the Genesis probe failure. The conclusion: Human error is an inevitable input to any complex endeavor. Either you manage and design around it or fail. NASA management still often chooses the latter."

18 of 274 comments

  1. Mark my words by zerdood · · Score: 5, Funny

    Someday all decisions will be made by machines. We'll just sit back while they do all the work. Then, no more human error.

    --
    My sig would have been a lot cooler if /. didn't filter out HTML tags 0.o
    1. Re:Mark my words by wiggys · · Score: 4, Insightful

      Except, of course, that we programmed the machines in the first place.

      When a computer program crashes it's usually down to the human(s) who programmed it, and on the rare occasions it's a hardware glitch, it was humans who designed the hardware, so we're still to blame either directly or indirectly.

      I suppose it's like the argument about whether it's the bullet that kills or the human who pulled the gun's trigger.

      --

      Sorry, but my karma just ran over your dogma.

    2. Re:Mark my words by j0yb0y · · Score: 5, Funny

      Let me restate what he said,

      Someday all errors will be made by machines. We'll just sit back while they do all the work. Then, no more human error.

  2. interesting but it's not really true by spacerodent · · Score: 4, Interesting

    While it's always possible to make a mistake, having people double-check a project from the ground up will almost always find the problems. NASA's current difficulties arise from scattered teams that each check only their own parts, rather than fully qualified teams that go over the entire vehicle. The fact that the whole thing is usually designed by committee in several pieces and then assembled at the last minute probably helps facilitate error. The Saturn V rockets and other technology we used to land on the moon had the capability of being far less reliable than today's technology, but we still managed to use them for years without error.

    1. Re:interesting but it's not really true by Moby+Cock · · Score: 4, Informative

      It's an oversimplification to say that older technology was used without errors. In fact, it's just downright incorrect. Apollo 1 and Apollo 13 both suffered catastrophic failures. Furthermore, the next generation of space vehicles, the shuttle, has had two very significant disasters and reams of other failures.

    2. Re:interesting but it's not really true by EvilTwinSkippy · · Score: 4, Insightful
      I'm still trying to figure out why the Apollo formula of contractors with NASA oversight doesn't seem to work anymore.

      Then I remember Apollo 1, that killed 3 astronauts, and Apollo 13, that nearly killed 3 more.

      To invoke Heinlein, space is a harsh mistress.

      To invoke Sun Tzu, success in defense is not based on the likelihood of your enemy attacking. It is based on your position being completely unassailable.

      --
      "Learning is not compulsory... neither is survival."
      --Dr. W. Edwards Deming
    3. Re:interesting but it's not really true by Wizzy+Wig · · Score: 5, Insightful
      ...having people double check a project from the ground up will almost always find the problems...


      Then you double-check the checkers, and so on... that's the point of the article... humans will err... Like Deming said... "you can't inspect quality into a process."

    4. Re:interesting but it's not really true by Control+Group · · Score: 5, Insightful
      No, it is true. It's the "almost always" in your statement that's the key. It's simple statistics, really. Assume that a well-trained, expert engineer has a 5% chance of making a material error. This implies that 5% of the things s/he designs have flaws.

      Now suppose this output is double-checked by another engineer, who also has a 5% chance of error. 95% of the first engineer's errors will be caught, but that still leaves a 0.25% chance (5% of 5%) of an error getting through both engineers.

      No matter what the percentages, no matter how many eyes are involved, the only way to guarantee perfection is to have someone with a zero percent chance of error...and the chances of that happening are zero percent. Any other numbers mean that mistakes will occur. Period.

      I remember reading a story somewhere about a commercial jet liner that took off with almost no fuel. There are plenty of people whose job it is to check that every plane has fuel...but each of them has a probability of forgetting. Chain enough "I forgots" together, and you have a plane taking off without gas. At the level of complexity we're dealing with in our attempts to throw darts at objects xE7 kilometers away, it is guaranteed that mistakes will propagate all the way through the process.
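The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the comment's own simplification: every person errs independently and at the same rate. The function name `escape_probability` and the 5% figure as a default input are assumptions taken from the example, not measured data.

```python
def escape_probability(error_rate: float, reviewers: int) -> float:
    """Chance that a flaw is introduced AND missed by every reviewer.

    Assumes the designer and each reviewer err independently at the
    same rate (a simplification; real reviewers share blind spots).
    """
    # One factor for the designer's error, one more per reviewer who misses it.
    return error_rate ** (reviewers + 1)


if __name__ == "__main__":
    # Hypothetical 5% error rate, as in the comment above.
    for reviewers in range(4):
        print(reviewers, "reviewer(s):", escape_probability(0.05, reviewers))
```

Each extra reviewer shrinks the number by another factor of the error rate, but for any nonzero rate it never reaches zero, which is the point being made.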

      --

      Reality has a conservative bias: it conserves mass, energy, momentum...
    5. Re:interesting but it's not really true by sphealey · · Score: 4, Insightful
      I'm still trying to figure out why the Apollo formula of contractors with NASA oversight doesn't seem to work anymore.
      Two reasons. First, outsourcing requires more and better project managers and technical managers than insourcing. Many organizations learned this to their sorrow in the 1980s; many more are going to learn it around 2006.

      Second, the stable of competent contractors that existed in the 1940-1960 time frame is gone. North American, Grumman, McDonnell, dozens of others that could be named have been absorbed into 2-3 borg-like entities. The result is less competition, less choice, less innovation, few places for maverick employees to go, and in the end worse results from outsourcing.

      sPh

  3. Cost Effective by clinko · · Score: 4, Interesting

    It's actually more cost effective to allow for failures. You build the same sat 5 times and if 4 fail in a cheaper launch situation, you still save money.

    From this article:

    "Swales engineers worked closely with Space Sciences Laboratory engineers and scientists to define a robust and cost-effective plan to build five satellites in a short period time."

  4. Good Point by RAMMS+EIN · · Score: 5, Insightful

    ``Human error is an inevitable input to any complex endeavor. Either you manage and design around it or fail.''

    This is a very good point, and I wish more people would realize it.

    For software development, the application is: Just because you can write 200 lines of correct code does not mean you can write 2 * 200 lines of correct code. Always have someone verify your code (not yourself, because you read over your errors without noticing them).
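One simple way to read the "2 * 200 lines" remark is as compounding odds. The sketch below assumes, purely for illustration, that each line is correct independently with the same probability; the name `all_lines_correct` and the 0.999 per-line figure are hypothetical, and real defects are rarely this independent.

```python
def all_lines_correct(per_line_ok: float, lines: int) -> float:
    """Probability that every one of `lines` lines is correct.

    Assumes each line is correct independently with probability
    `per_line_ok` (an illustrative assumption, not a measured rate).
    """
    return per_line_ok ** lines


if __name__ == "__main__":
    p = 0.999  # hypothetical per-line correctness
    print("200 lines:", all_lines_correct(p, 200))
    print("400 lines:", all_lines_correct(p, 400))
```

Under this model, doubling the line count squares the chance of a clean result, so it always gets worse, never stays the same.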

    --
    Please correct me if I got my facts wrong.
  5. That is NOT correct. by Puls4r · · Score: 4, Insightful

    >>Either you manage and design around it or fail. >>NASA management still often chooses the latter.

    This is hindsight at its best, and is the classic comment by bureaucrats who have no concept of what cutting edge design is about. F1 race cars, Racing Sailboats, Nuclear Reactors - NO design is failsafe, and NO design is foolproof. Especially a one-off design that isn't mass produced. Even mass produced designs have errors, like in the Auto Industry. It is a simple fact of life that engineers and managers balance Cost and Safety constantly.

    What you SHOULD be comparing this against is other space agencies that launch a similar number of missions and satellites - i.e. other real world examples.

    Expecting perfection is not realistic.

    1. Re:That is NOT correct. by orac2 · · Score: 4, Insightful

      This is hindsight at its best, and is the classic comment by bureaucrats who have no concept of what cutting edge design is about.

      You only get to play the hindsight card the first time this kind of screw-up happens. If you actually read the article you'll see that Oberg (who isn't a bureaucrat but a 22-year veteran of mission control and one of the world's experts on the Russian space program) is indicting NASA for having a management structure that leads to technical amnesia: the same type of oversight failure keeps happening again and again.

      Oberg is not alone in this. The Columbia Accident Report despairingly noted the similarities between Columbia and Challenger: both accidents were caused by poor management, but what was worse with Columbia was that NASA had failed to really internalise the lessons of Challenger, or heed the warning flags about management and technical problems put up by countless internal and external reports.

      Sure, space is hard. But it's not helped by an organization that has institutionalised technical amnesia and abandoned many of its internal checks and balances (at least this was the case at the time of the Columbia report, maybe things have changed).

      And if you really want to compare against other agencies, NASA's astronaut bodycount does not compare favorably against the cosmonaut bodycount...

      Sadly, your post is a classic comment by slashdotters who have no concept of what effective technical management of risky systems looks like. (Hint: not all cutting edge designs get managed the same way. There's a difference between building racing sailboats and spaceships. This is detailed in the Columbia accident report. Read it and get a clue).

      --
      "Just once, I'd like to meet an alien menace that wasn't immune to bullets." -- The Brigadier, Dr. Who
  6. where would we be without mistakes... by woodsrunner · · Score: 5, Insightful

    If you compare the advances to Science and Knowledge due to mistakes rather than deliberate acts, it might come out that everything is a mistake.

    Recently I took a class on AI (insemination, not intelligence) and apparently the two biggest breakthroughs by Dr. Polge in preserving semen were due to mistakes. First, his lab mislabeled glycerol as fructose, and that is how they found a good medium for suspension. Second, he blew off finishing freezing semen to go get a few pints and didn't make it back to the lab until the next day, thus discovering that it was actually better not to freeze the stuff right away.

    Mistakes are some of the best parts of science and life in general. It's best to try to make more mistakes (i.e. take risks) than it is to try and always be right. (unless you are obsessive compulsive).

  7. Human Factor by xnot · · Score: 4, Insightful

    I think the biggest difficulty surrounding large organizations is the lack of communication tools linking the right engineers together. It seems unfathomable that some of these mistakes were able to propagate throughout the entire engineering process and nobody caught them.

    Unless you consider the fact that often in large organizations, the left hand typically has no clue what the right hand is doing. I work at Lockheed Martin, and typically I'm involved in situations where one group makes an improvement that then none of the other groups know about, changes/decisions are poorly documented (if at all) so nobody knows where the process is going, people making poor decisions due to lack of proper procedures from management about what to do, teams not being co-located, poor information about which people have the necessary knowledge to solve a particular problem, or any number of things that confuses the engineering process, to the detriment of the product. Most of these situations are caused by a lack of communication throughout the organization as a whole.

    This is a serious problem, and it needs to be acknowledged by the people in a position to make a difference.

  8. armchair rocket science by onion_breath · · Score: 5, Insightful

    I love how journalists and others like to sit back and criticize these engineers' efforts. They are human, and they will do stupid things. Having been trained as a mechanical engineer (although I mostly do software engineering now), I have some idea of how many calculations have to be made to design even one aspect of a project. I couldn't imagine the complexity of such a system, trying to account for every scenario, making sure algorithms and processes work as planned for ONE mission. No second chances. That we have individuals willing to dedicate the mental effort to this cause at all is worthy of praise. These people have pride and passion in what they do, and I'm sure they will continue to do their best.

    For anyone wanting to yack about poor performance... put your money where your mouth is. I just get sick of all the constant nagging.

    --
    this is my sig, be amazed.
  9. John Gall's Systemantics by Anonymous Coward · · Score: 4, Interesting
    Systems display antics. John Gall has written a great book which vastly expands on Murphy's law, called Systemantics - The Underground Text of Systems Lore. I cannot recommend this book enough. It contains some truths about the world around us that are blindingly obvious once you see them, but until then you're part of the problem. Systemantics applied to political systems is very enlightening. Too bad that the only people who think like this in politics are the selfish and egomaniacal Libertarians (yeah, yeah.. I know. Libertarianism is the new cool for the self-styled nerd political wannabe).

    Here are some of the highlights:
    • 1. If anything can go wrong, it will. (see Murphy's law)
    • 2. Systems in general work poorly or not at all.
    • 3. Complicated systems seldom exceed five percent efficiency.
    • 4. In complex systems, malfunction and even total non-function may not be detectable for long periods (if ever).
    • 5. A system can fail in an infinite number of ways.
    • 6. Systems tend to grow, and as they grow, they encroach.
    • 7. As systems grow in complexity, they tend to oppose their stated function.
    • 8. As systems grow in size, they tend to lose basic functions.
    • 9. The larger the system, the less the variety in the product.
    • 10. The larger the system, the narrower and more specialized the interfaces between individual elements.
    • 11. Control of a system is exercised by the element with the greatest variety of behavioral responses.
    • 12. Loose systems last longer and work better.
    • 13. Complex systems exhibit complex and unexpected behaviors.
    • 14. Colossal systems foster colossal errors.
  10. The REAL REAL Reason for Errors! by Ced_Ex · · Score: 5, Funny

    Here's the real reason for NASA and their errors, as quoted by Gordon Cooper, a former astronaut.

    "Well, you're sitting on top of this rocket, about to be flung into the most hostile environment known to man, and you keep thinking, 'Everything here was supplied by the lowest bidder.'"

    --
    Live forever, or die trying.