Software Problem Linked to Osprey Crash
An Anonymous Coward sent in: "While not the only problem facing the ill-fated V-22 Osprey, a bug in the software controlling the pitch of the Osprey's rotors was listed as a contributing factor to the crash of a Marine Osprey last December. It appears that a hydraulic leak initiated a sequence of events that included the pilot pressing a computer reset button. Rather than resetting the computer, the software changed the pitch of the rotors. Not so good... One more reason to fear too much technology. Has anybody ever seen a bug-free piece of software of any complexity greater than "Hello World"?"
It's a shame that programming still has this bullshit mystique of "art" to it.
In many cases, it's not at all a matter of not knowing how to properly plan, design, and test (though sometimes it is). It's a matter of what the customer will pay for, how long they will wait for it, and how important it is for the software to operate flawlessly.
Usually, they want it cheap and yesterday. They want the latest whizz-bang while you're at it, never mind that it has no track record.
When, if ever, the customer wants to pay what it costs, wait as long as it takes (even if the initial estimates are wrong), and use what's known to work, things will change.
anyone who has ever written anything, 'nuff said, right?
Now, what's described in this Osprey crash, that's definitely a bug. The expected behavior was a reset. Is the expected behavior of a reset to change the pitch of the rotor? If so, then the bug was in the TRAINING procedure that recommended hitting that button.
In the real world, bugs range from misspelled words in dialog boxes, to crashes, to having an OK button two pixels too small. Whether a given feature "worked" is a matter of opinion for any single user of a piece of software. Worked how? Fulfilled the expected requirements? In 99.999% of the cases out there, the requirements were not well enough defined. For that matter, when you THINK you've got them defined well enough, you start running into semantics issues that would make a lawyer drool. Marketing guy writes requirements, engineer interprets requirements. It's a beautiful world eh? You thought you were writing "Hello World", but now you've got to spit it out in 50 different languages on 10 different OS/Hardware platform combinations, and it's got to be able to notify SNMP and email the administrator if it was unable to do so. And you've got to be able to get it to print out Hello World in the correct language from a single mouse click, and for some languages it has to determine the time of day, and present the greeting as a "good morning, world" or "good evening, world".
Pretty soon you're talking about 200 pages of specs.
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
Go download the core part of TeX (written in WEB), which was written by probably the greatest computer scientist of our day, Donald Knuth. I don't think a bug has been found in TeX in at least a decade. It's gotten to the point where Knuth will cut you a check for several hundred "hexadecimal" dollars (256 cents ** years since he made the offer) if you find one, which you would never cash anyway but would rather mount on your wall. TeX is definitely a bit more complex than "Hello World". Many people in the publishing industry will tell you that the features it provides could be sold for many thousands of dollars or even tens of thousands as a closed-source software package. It's highly complex.
--
I think there is a world market for maybe five personal web logs.
Maybe this is the ultimate version of hello world:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
int main(void)
{
    int rc;
    /* printf returns a negative value if the write fails */
    rc = printf("Hello, World!\n");
    if (rc < 0)
        exit(errno);
    exit(0);
}
I think the real problem is the change in the meaning of the word art in the past ~100 years. What most people think of these days when they hear the word "Art" is what is traditionally thought of as "Fine Arts". The traditional meaning of the word Art was more akin to "Technique". The phrase "State of the art" refers to the most advanced techniques available. This is also why many universities still call their science departments "The college of arts and letters".
It used to be that professions that required a lot of talent and/or practice in order to master a technique, such as painting or dancing were considered the "fine arts". Then, in the latter part of the 19th century or early part of the 20th century some revolutionary fine artists decided to discard these old-school techniques. In the process they succeeded in changing the traditional notion of art from something that was more technique to something that was more creative.
Unfortunately, the traditional use of the word art wasn't obliterated in the process, so many early computer scientists (particularly Knuth) started to talk about "The Art of Computer Programming." What they meant wasn't that it's a creative process, but rather that it's a technique that requires talent and must be practiced. Knuth preaches a lot of things about computer programming: programs should be simple and comprehensible, and they should implement algorithms that are mathematically proven. It might not be anything as formal as methodologies like ISO-9000, but there's no implication of creativity like, "Let's see if we can fit 30 function points on one line because that would be so cool!"
Anyway, I think the only reasonable solution at this point is to abandon the antique word "art" and start talking about software development as a "technique" or other similar term.
My $.02,
-"Zow"
In reply to all those who say that using methodical, systematized approaches to software is overwhelmingly expensive:
Bullshit.
The cost of fixing your own goddamn mistakes, and the cost of maintaining your P.O.S. application is far, far, *FAR* higher than the cost of taking the time to do it right. And the cost to your users is even greater, in terms of downtime, data loss, rework, and inefficiency.
Every naysayer needs to pull his head out of his ass. Go buy some quality education and/or books on software project management.
There has been extensive, exhaustive, and rigorous research on software project management methodologies, software programming methodologies, and software maintenance practices.
They all consistently come up with the same conclusions: the more time spent in planning and design, the less time spent in programming, debugging, maintenance, and end-user failures.
There are NO excuses for the shoddy practices in use today. Better ways have been clearly identified. Your ignorance or slothfulness is an embarrassment to the profession.
Do it right, or get the hell out.
--
Don't like it? Respond with words, not karma.
NASA's software development methodology is remarkable. Their work should be the standard against which every programmer measures himself.
Unfortunately, most programmers are underinformed, and haven't the foggiest idea that there are methodologies that will reduce their error rates, increase their productivity, and meet their customers' needs fully.
It's a shame that programming still has this bullshit mystique of "art" to it. "Art" is just a lame excuse for laziness: instead of approaching the problems methodically and scientifically, it's just ever-so-much easier to take a half-assed hack-and-patch approach.
I think I'd better stop here, before I really kick into a rant...
--
Don't like it? Respond with words, not karma.
Dude, I work in a hospital. The firmware on the monitoring systems is reasonable. Almost everything else (financial, scheduling, even pharmaceutical) is the buggiest most bloated crap you're ever likely to see.
It's pretty sad... our main system is basically a Pick (that's right, Pick!) emulation layer running on top of HP-UX. It's horrible.
You try building something under that kind of methodology, and see what the cost to your productivity is.
Go you big red fire engine!
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
And you've got to be able to get it to print out Hello World in the correct language from a single mouse click,
and nowadays you also have to do exhaustive research to find out if someone has already patented the idea of printing "Hello World" with a single mouse click.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
I can't believe nobody's posted about Knuth yet. Donald E. Knuth is famous for writing high-quality software, and even proving some of it (all of it?) correct. He offers rewards to people who find bugs in his code. The reward for TeX and METAFONT is described here: http://www-cs-faculty.stanford.edu/~knuth/abcde.html, under the heading "Rewards".
--Jim
Folks,
I think people don't understand that fly-by-wire systems in general are extremely tricky to set up and work properly.
Remember, when the F-16 was designed, the use of FBW was considered extremely daring--and it took a long time to work the bugs out of that system. Remember the Airbus A320 jetliner? Airbus thought its system was better until that bizarre crash at Mulhouse, France in 1988 where the plane literally flew straight ahead into the hill. Or the numerous problems Saab had developing the FBW system for the JAS 39 Gripen fighter.
Even now, FBW systems are still tricky to use--witness the problem with the Boeing 777, where there were many complaints from passengers about the plane pitching up and down substantially during turns, causing motion sickness; Boeing had to carefully reprogram the FBW software so the plane wouldn't pitch up and down so much during turns.
I think the problem with the V-22 Osprey is that the FBW systems are extremely complicated, and it will take a tremendous amount of work to get everything to work correctly.
Raymond in Mountain View, CA
A) Their own buggy hello world programs gone wrong (modest)
B) Other people's buggy hello world programs gone wrong (liars)
C) People who fear technology (luddites)
D) People who say it was human error (technocrats)
E) People who don't know what Hello World is (people who use AOL)
---
--
Insert Witty Sig Here
Add in software to control some aspect of flight and the pilot is automatically "further away" from the craft he's aviating. It sounds obvious and I'm not going so far as to say that computer-controlled systems are evil in and of themselves, but it's an aspect of modern flight where the negatives are often overlooked.
I've seen compelling evidence that another layer on top of the data that the pilot must interpret is a Bad Thing. Anything that changes the pitch of a chopper's rotors other than your left hand on the collective should be viewed with extreme caution IMHO.
--- Hot Shot City is particularly good.
I was going to cite the Shuttle as well. Here is a great article about it.
So mod me down, if you must...
The article is kinda thin on details in exactly what way the aircraft acted, but basically these three events occurred:
1. Hydraulic line is severed, warning lights go off, including one on the reset button.
2. Per training, pilot hits reset button once, then multiple times.
3. The rotor pitch changes, causing the craft to inevitably crash.
Now, the article says that the hydraulics have had problems in the past... Here is my take:
In the three pieces I outlined above, two stand out as being really underdefined (or wrongly defined) in the article: Number 1, in that they don't mention what the hydraulic line controlled, and number 3, in that they make it ambiguous as to whether the blade pitch changed, or the rotor (ie, the pod) pitch changed.
The V22 is a tilt rotor craft. Say perhaps it was the pod pitch that changed, and not the blade pitch, and the hydraulics that were damaged were the ones controlling the pod tilt on one side. The reset button is hit, computer says "go into hover mode" and only one pod tilts...
See where I am going? Perhaps what the reset button did was intentional - but the programmer assumed that both pods were working. I am not saying there wasn't a bug - somewhere along the line there was - but the way the article was written doesn't really tell what happened...
You may mod me down - I am sure I have various things incorrect (for all I know the tilting is done with electric motors - but I doubt it)...
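To make the "programmer assumed both pods were working" guess concrete, here is a rough C sketch of a reset handler that checks that assumption before commanding anything. Everything in it is made up for illustration (`struct pod`, `handle_reset`, the two action names); it is not based on the actual V-22 software, and it only shows the idea of checking a precondition first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of pressing the reset button. */
enum reset_action { RESET_REFUSED, RESET_TILT_BOTH };

/* Hypothetical per-pod status; a real system would track far more. */
struct pod {
    bool hydraulics_ok;
};

/* Only command a tilt when BOTH pods can actually follow it.
 * If either side has lost hydraulics, refuse rather than tilt one pod. */
enum reset_action handle_reset(const struct pod *left, const struct pod *right)
{
    assert(left != NULL && right != NULL);
    if (!left->hydraulics_ok || !right->hydraulics_ok)
        return RESET_REFUSED;
    return RESET_TILT_BOTH;
}
```

With both pods healthy, `handle_reset` returns `RESET_TILT_BOTH`; with either one marked as failed it returns `RESET_REFUSED`, which is the case the speculation above is worried about.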
Worldcom - Generation Duh!
Reason is the Path to God - Anon
I was waiting for a flight on an Airbus one night when they announced that they needed to replace a part, and while it might arrive that night, they told everyone the flight was probably cancelled. Not having anything better to do, and being stuck at the airport anyway, I decided to wait it out.
By the time the plane boarded at 3AM, I was one of only four passengers. Shortly after we boarded, the Airbus computers started having fits. They opened and closed the door to the plane 5 times, went through one shift change of mechanics, pulled away from the gate once, power-cycled twice and finally got the computers working properly at 8AM, exactly 12 hours from when I was originally scheduled to leave and minutes before the flight crew timed out.
The cool thing was, I was the only passenger left at that point. The other three chickened out and decided to go on other flights. I think the pilot would've liked to have done the same thing, actually... he was a little nervous. He told me at one point, after a flap malfunction light came on, that he had a real bad feeling about this flight. They kept asking me what I thought, and my answer was always the same: hey, we're flying light, we don't *need* flaps, right? Let's just go! Haven't they got real long runways in Detroit?
The pilots say they like flying the Airbus, but I dunno. My Win98 machine isn't as buggy as that airplane. You should've heard the mechanics bitching about Airbus computers... I already hate the seats they use, and now I sure don't trust the systems. I try to avoid Airbuses wherever possible these days...
But it sure was fun being the only passenger on a big jet like that...
Has anybody ever seen a bug-free piece of software of any complexity greater than "Hello World"?"
I once wrote a "Hello Universe" that I was fairly confident was relatively bug free.
When will Windows be ready for the desktop?
maybe that sentence would be better as "one reason to fear poorly implemented technology".
Let's not forget that the technology in the V22 is akin to the technology that allowed us to wipe out most of the Iraqi army in 100 hours with an incredibly small number of Coalition casualties.
This reminds me of the negative press coverage the M1 tank received during trials because it used a turbine engine similar to those used in helicopters. The engine had problems during trials and was roundly criticized for being costly and "gold plated".
That same turbine engine, after the glitches were fixed, turned out to be more reliable than the old engines it replaced. The M1 went on to become the most feared and capable main battle tank on the planet, easily taking out T-72 and T-80 tanks before they could even DETECT the M1.
Technology in military systems is always risky in one way or another, but the ultimate payoff is usually more than worth that initial risk.
The V22 has been plagued with problems from the beginning, and they may be pushing the envelope a bit too far with an overall concept that is just not workable. But if they do work out the glitches, the V22 will go from being an example of "too much" technology to an example of "amazing" technology.
Read the EFF's Fair Use FAQ
The important question to ask is "Why doesn't good software exist?". The reason is that the consumer wants it yesterday, and they will pay for an unfinished and incomplete project (see almost all Windows software). This leads to cut & paste programming, re-use of lots of pre-done code supplied by various manufacturers and the like. If the customer demanded well sorted out code, and refused to "upgrade" to the next best thing, then you would see better coding.
Last but not least, let's not forget the customer, the US govt. The Osprey has been considered a boondoggle from the get go. The pressure from Congress and the media to get the project completed has been extremely intense. I'm sure a lot of pressure was put on the coding team to "get it done now". It's too bad that they didn't learn from the Space Shuttle: over budget and over designed, it works. It has surpassed all expectations and is doing things now that it was never designed to do.
There is an engineering credo there, the famous "factor of safety": a number usually applied to equations to take the "unknown" into account. Looks like the Osprey designers didn't make their "factor of safety" high enough.
"Science is about ego as much as it is about discovery and truth " - I said it, so sue me.
A group of technical leaders were at a seminar called "Making Reliable Software". The teacher at the seminar, in order to make the participants appreciate the problems of reliable software, posed this question: "How many of you would be willing to fly in an airplane that was controlled by software that your company created?" The technical leaders thought about it, and looked around at each other. Nobody raised their hands, except for one lone guy in the back. The teacher was surprised. "You, in the back," he asked. "Why are you so sure that it would be safe?" "Simple," replied the programmer. "Knowing how my guys code, the plane wouldn't even be able to pull away from the gate."
I used to write drivers that communicated with medical monitors. Talk about garbage! It was the exception rather than the rule that the monitors worked exactly how they were documented.
My vote for the medical monitor with the worst software and by far the worst communication protocol has to be the Corometrics 115 monitor. Anyone else ever have to deal with that piece of crap? My condolences if so. It took me literally *years* to get the driver to work perfectly in all situations, because there were so many versions of the firmware that caused it to act differently in different situations. It was always fun debugging software running in a live patient environment. :)
--
Sometimes it's best to just let stupid people be stupid.
Greater complexity than Hello World? That's nothing! How about concatenating the strings "Hello" and " World!" together and THEN outputting the result to the screen! Man, oh man. Still need to do the regression testing, though.
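For the record, the concatenating version might look something like this in C. It's only a sketch: the helper name `hello_concat` and the idea of passing in a caller-supplied buffer are my own choices for illustration, nothing more.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins "Hello" and " World!" into buf.
 * Returns 0 on success, -1 if buf is too small to hold the result. */
int hello_concat(char *buf, size_t size)
{
    const char *first = "Hello";
    const char *second = " World!";

    assert(buf != NULL);
    /* Check the space up front so strcat can never overflow. */
    if (strlen(first) + strlen(second) + 1 > size)
        return -1;
    strcpy(buf, first);
    strcat(buf, second);
    return 0;
}
```

Call it with something like a 32-byte buffer and `puts()` the result. Hand it a 4-byte buffer and it returns -1 instead of scribbling over the stack, which is already one more failure mode than plain Hello World has.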
-- dR.fuZZo
Planes don't "crash" anymore. Now they "blue screen."
Milo
Jeez, some people!
-atrowe: Card-carrying Mensa member. I have no toleranse for stupidity.
Back in the early '90s I worked for a defense contractor that was involved with the testing of the V22's engines. I remember sitting in a meeting and looking at some of the mechanical drawings and saying to myself, "glad I'm not the test pilot". It is a very complicated design. Most military aircraft like the V22 have quadruple redundant electronic systems, and even the engine ECUs will talk to each other and monitor the one that is currently in charge; if the others determine that the main ECU is malfunctioning, the other three will vote to remove it. Very complicated algorithms. Also, to make sure both rotors have power when one engine fails, a high speed drive shaft is used to transfer power from the good engine to the transmission on the side that lost power. The drive shaft has a very small diameter but would spin at something like 10,000 rpms. Low torque but high power. The designers are thinking about safety and I think this bird can work; it's just not being funded with the enthusiasm of the old cold-war projects.
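The "other three vote to remove it" idea can be sketched in a few lines of C. This is a toy, not anything from a real ECU: the function name `vote_out`, the use of a single reading per unit, and the tolerance parameter are all assumptions made for illustration. Real voting logic also has to deal with timing, stale data and units that fail intermittently.

```c
#include <assert.h>

#define NUM_ECUS 4

/* True when two readings differ by more than the allowed tolerance. */
static int disagrees(double a, double b, double tol)
{
    double diff = a > b ? a - b : b - a;
    return diff > tol;
}

/* Returns the index of the one unit that all the others disagree with,
 * or -1 when there is no single clear outlier. */
int vote_out(const double readings[NUM_ECUS], double tol)
{
    int suspect = -1;

    assert(tol >= 0.0);
    for (int i = 0; i < NUM_ECUS; i++) {
        int votes = 0;
        for (int j = 0; j < NUM_ECUS; j++)
            if (j != i && disagrees(readings[i], readings[j], tol))
                votes++;
        if (votes == NUM_ECUS - 1) {
            if (suspect != -1)
                return -1;  /* more than one outlier: no clear verdict */
            suspect = i;
        }
    }
    return suspect;
}
```

With readings of 100.0, 100.2, 99.9 and 140.0 and a tolerance of 1.0, the fourth unit (index 3) gets voted out. If all four readings are scattered, the function returns -1 rather than guessing.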
Actually, a decent embedded system (RTOS or not) will re-boot itself when it needs to. That is, you got a "watchdog timer" which the main program loop keeps resetting; if it doesn't get reset, it resets the computer. But apparently they gave the pilot a pushbutton just in case the watchdog timer crapped out too. But he pushed it when the real trouble was a chafed-through hydraulic line. And when it didn't help, he pushed it nine more times...
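For anyone who hasn't met a watchdog timer, here is a minimal software simulation of the idea in C. The names (`struct watchdog`, `watchdog_kick`, `watchdog_tick`) are invented for this sketch; on real hardware the timer is a peripheral, and expiry resets the processor instead of returning a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated watchdog: counts down once per tick and fires when the
 * main loop has failed to "kick" it in time. */
struct watchdog {
    unsigned timeout_ticks;  /* ticks allowed between kicks */
    unsigned remaining;      /* ticks left before expiry */
    unsigned resets;         /* how many times it has fired */
};

void watchdog_init(struct watchdog *wd, unsigned timeout_ticks)
{
    assert(timeout_ticks > 0);
    wd->timeout_ticks = timeout_ticks;
    wd->remaining = timeout_ticks;
    wd->resets = 0;
}

/* Called by a healthy main loop on every pass. */
void watchdog_kick(struct watchdog *wd)
{
    wd->remaining = wd->timeout_ticks;
}

/* Called from the timer tick. Returns true when the watchdog fires;
 * on real hardware this is where the processor would be reset. */
bool watchdog_tick(struct watchdog *wd)
{
    if (wd->remaining > 0)
        wd->remaining--;
    if (wd->remaining == 0) {
        wd->resets++;
        wd->remaining = wd->timeout_ticks;  /* start over after the "reboot" */
        return true;
    }
    return false;
}
```

As long as the main loop keeps calling `watchdog_kick`, `watchdog_tick` never fires. Stop kicking (a hung loop) and it fires after `timeout_ticks` ticks, no pilot and no pushbutton required.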
As I see it (and I'm not claiming expertise), the problems were, in descending order of importance:
1. The designers failed to route and fasten hydraulic lines so they wouldn't chafe.
2. The Marine Corps and aircraft mfg both ignored warnings about chafing lines.
3. The system for some reason told the pilot he needed to reset the computers although the computer had nothing to do with the hydraulics failure.
4. The software changed propeller settings erratically while booting up.
5. They didn't give the pilot an effective manual override for any controls the computer might be messing up.
6. The pilot panicked a little. Understandable, but pilots and Marines are supposed to be panic-proof...
Atomic Energy of Canada, Ltd. (AECL) built a medical radiation device called the THERAC that had a major software bug in it that killed at least three people. If the operator hit the keys on the keyboard in the wrong sequence, the machine would deliver 1,000,000 times the dose displayed on the screen. Massive radiation poisoning, and the victims died in a few days. They finally traced the problem to a coding error made by a contractor.
This, of course, is a massive over-simplification of the problem. The full story can be found here
-----------------
www.lucernesys.com Horizon: Calendar-based personal finance