Mars Global Surveyor Died from Single Bad Command
wattsup writes "The LA Times reports that a single wrong command sent to the wrong computer address caused a cascade of events that led to the loss of the Mars Global Surveyor spacecraft last November. The command was an orientation instruction for the spacecraft's main communications antenna. The mistake caused a problem with the positioning of the solar power panels, which in turned caused one of the batteries to overheat, shutting down the solar power system and draining the batteries some 12 hours later. 'The review panel found the management team followed existing procedures in dealing with the problem, but those procedures were inadequate to catch the errors that occurred. The review also said the spacecraft's onboard fault-protection system failed to respond correctly to the errors. Instead of protecting the spacecraft, the programmed response made it worse.'"
One bad command started the chain, but it needed a series of system failures to kill it. In other words, a slight misalignment of the solar panels (or whatever it was) may have been a necessary cause, but not sufficient. The thing needed a safe-mode that wasn't safe, and battery logic that failed to consider environmental variables. All the conditions lined up.
It's like saying that a mid-air collision occurred because two jetliners were assigned the same altitude and jetway in opposite directions at the same time. Yeah, but A) How they got that assignment is kinda complicated and B) any number of traffic control and collision avoidance systems have to fail too.