Software Problem Linked to Osprey Crash
An Anonymous Coward sent in: "While not the only problem facing the ill-fated V-22 Osprey, a bug in the software controlling the pitch of the Osprey's rotors was listed as a contributing factor to the crash of a Marine Osprey last December. It appears that a hydraulic leak initiated a sequence of events that included the pilot pressing a computer reset button. Rather than resetting the computer, the software changed the pitch of the rotors. Not so good... One more reason to fear too much technology. Has anybody ever seen a bug-free piece of software of any complexity greater than "Hello World"?"
"Sorry Team.
You're going to have to stay late.
Until it's done.
And debugged.
And tested.
And documented."
Gawd that's fscking annoying. The moral of the ad? Why let staff rest when they can just be pumped full of drugs, I forget which particular ones were being advocated, to keep 'em working so they can meet the deadline that the boss horribly mismanaged and allowed to become a panic.
Well SCREW YOU! We are not slaves to be worked to death. It's management attitudes like this that cause code to have bugs and look like shit. No wonder the code is buggy. The staff writing it was in a big fucking hurry to get it the hell done any way they could so they could go home and enjoy the weekend. What other kind of code could you expect to be produced?
Airbus, while showing off one of the first A320 aircraft at the Paris Airshow (I don't remember what year), suffered a fatal software error in its fly-by-wire system. The pilot was doing a low, slow pass down the runway at an altitude of 50-100 ft AGL. The flight control software mistook this for a landing and refused to let the engines throttle back up. The plane plowed into a heavily wooded area near the end of the runway.
If you watch TLC or Discovery channel long enough you will see the footage of the crash. They've been using it in a commercial for the last few weeks.
Of course, this whole thing prompted the (poor) joke: Q: What's the difference between an Airbus and a chainsaw? A: 10,000 trees per minute
He pointed out that the software that controls nuclear arms is required to have "path coverage" testing, where every possible execution path through the code has been tested. That is a combinatorial nightmare. (Not to mention that I have no idea how loops work with this sort of testing.)
I suppose it can be useful to formulate a program in a thoughtful, formalized system. But that's what good code is anyway, so...
It's a shame that programming still has this bullshit mystique of "art" to it.
In many cases, it's not at all a matter of not knowing how to properly plan, design, and test (though sometimes it is). It's a matter of what the customer will pay for, how long they will wait for it, and how important it is for the software to operate flawlessly.
Usually, they want it cheap and yesterday. They want the latest whizz-bang while you're at it, never mind that it has no track record.
When, if ever, the customer wants to pay what it costs, wait as long as it takes (even if the initial estimates are wrong), and use what's known to work, things will change.
??=include <stdio.h>
??=include <errno.h>
??=include <stdlib.h>
int main(void)
??<
    int rc;
    rc = printf("Hello, World!??/n");
    if (rc < 0)
        exit(EXIT_FAILURE); /* a failed write should not report success */
    return 0;
??>
Now that's proper and portable C code! Behold its beauty!
Remember, many characters are not portable across all character sets. The square brackets, for example, do not exist in EBCDIC. The hash/pound character is often replaced by the British currency symbol in some ASCII variants. Reverse slants do not exist in Baudot or Morse code. And so on. So ANSI C (2nd revision) defines 'trigraphs' of the form ??x, where x is from a list of more universal characters, and the preprocessor treats the whole sequence as the appropriate C character. Note that trigraphs can even be used in quoted strings (see ??/ above).
I wuz speaking rhetorically to illustrate the point.
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
anyone who has ever written anything, 'nuff said, right?
Now, what's described in this Osprey crash, that's definitely a bug. The expected behavior was a reset - is the expected behavior of a reset to change the pitch of the rotor? Then the bug was in the TRAINING procedure that recommended hitting that button.
In the real world, bugs range from misspelled words in dialog boxes, to crashes, to having an OK button two pixels too small. It's all a matter of the opinion of any single user of a piece of software whether a given feature "worked". Worked how? Fulfilled the expected requirements? In 99.999% of the cases out there, the requirements were not well enough defined. For that matter, when you THINK you've got them defined well enough, you start running into semantics issues that would make a lawyer drool. Marketing guy writes requirements, engineer interprets requirements. It's a beautiful world eh? You thought you were writing "Hello World", but now you've got to spit it out in 50 different languages on 10 different OS/Hardware platform combinations, and it's got to be able to notify SNMP and email the administrator if it was unable to do so. And you've got to be able to get it to print out Hello World in the correct language from a single mouse click, and for some languages it has to determine the time of day, and present the greeting as a "good morning, world" or "good evening, world".
Pretty soon you're talking about 200 pages of specs.
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
My experience with realtime embedded control systems has been that (with Ada anyway) they don't usually die that way. In fact as I type this I'm trying to recall the last time the system I'm currently coding on went off in the weeds, and I can't. In this case, it most likely didn't go out of bounds either.
...
In my experience, most deeply embedded systems, like flight controls, rarely use pointers. They rarely use any "exotic" language features. They rarely (read as never) use anything that is allocated in a dynamic manner.
When things like this happen (reset problems), you usually end up with a situation where your box gets reset and you output default values on all of your interfaces, but these default values may not apply in all situations.
Rather than blaming the coders for these issues and calling them bugs, blame the systems engineers for not covering the situation when they wrote the requirements. Not that it's easy to do this.
Failure mode analysis in a system as complex as the V-22 is a job I would not want.
"There's no secret. You just press the accelerator to the floor and keep turning left." -- Bill Vukovich
Read the manual page for mf(1) under the "BUGS" section.
--
I think there is a world market for maybe five personal web logs.
I think you are mistaken. Knuth will double the size of the bounty for every year that has passed since he made the offer (in 1986). See my other post in this thread.
--
I think there is a world market for maybe five personal web logs.
Go download the core of TeX (the part written in WEB), which was written by probably the greatest computer scientist of our day, Donald Knuth. I don't think a bug has been found in TeX in at least a decade. It's gotten to the point where Knuth will cut you a check for several hundred "hexadecimal" dollars (256 cents ** years since he made the offer) if you find one, which you would never cash anyways but rather mount it on your wall. TeX is definitely a bit more complex than "Hello World". Many people in the publishing industry will tell you that the features it provides could be sold for many thousands of dollars or even tens of thousands as a closed-source software package. It's highly complex.
--
I think there is a world market for maybe five personal web logs.
I'm going to guess that the errors (I count eight so far) are intentional -- he's making a joke, see, in the guise of a stuffy response that failed to get the previous guy's joke, while also accusing him of failing to get the previous guy's joke. Very recursive; maybe his program should also have called main() recursively -- ooh, I'm not continuing it, am I?
David Gould
main(i){putchar(340056100>>(i-1)*5&31|!!(i<6)<< 6)&&main(++i);}
Not that anyone will still be reading this, but...
It's fun to argue over which ones "count", huh? The ones I had in mind were 1, 2, 3, 4, 5, 6, and 8 above, plus no "#include <stdio.h>", which different environments might or might not give you for free. I didn't count 7 since I didn't think main() is required to take the standard "(int argc, char **argv)", or are we debugging my sig now, too?
David Gould
main(i){putchar(340056100>>(i-1)*5&31|!!(i<6)<< 6)&&main(++i);}
A story in yesterday or today's (Raleigh, N.C.) News and Observer (the people who started Nando), and possibly in the local fish wrapper as well, indicates that the military knew about the problem with the wiring bundle rubbing against the hydraulic lines 18 months ago. The military may be at fault here, but not the aircrew.
I see even classic Slashdot is now pretty much unusable on dial up anymore.
Didn't know there were any other Slashdotters in town except the guy who went up to New Bern when Coastalnet bought out toddalan. Did they ever get the VR trainer for the Osprey installed out there? I was out there a couple of summers ago but forget the building number.
I see even classic Slashdot is now pretty much unusable on dial up anymore.
I think his point is that the basic hello world program:
int main(void}
(
printf ["hello, worldn\'),
}
will always be considered practically bug free. Jeez - some people never catch a joke.
I don't think that's really a bug, because if stdout is not valid, rc will be less than zero and errno will be set to EBADF. If stdout was a pipe that was closed, the program will receive SIGPIPE, which is not handled and thus the program will exit via the default handler.
errno.h was needed to declare errno. Otherwise, gcc won't build without warnings. If I were to just declare extern int errno, I might be stepping on an implementation that uses macros for errno.
As you note, there were several problems. The original design was weak. Then the builder unilaterally decided to build the box beams the way they did. They also did not want to thread the entire length of the rod, so they changed the design to two rods. The hotel's prime contractor would not allow the engineering firm to be represented on site, for cost reasons. All of this added up to the disaster. The hotel was even fucked up before it was complete: 2000 square feet of roof collapsed during construction. One of my old professors put up a page about it with lots of pictures. Check out this picture of the distressed box beam from the walkway that didn't collapse.
Maybe this is the ultimate version of hello world:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
int main(void)
{
    int rc;
    rc = printf("Hello, World!\n");
    if (rc < 0)
        exit(errno);
    exit(0);
}
I think the real problem is the change in the meaning of the word art in the past ~100 years. What most people think of these days when they hear the word "Art" is what is traditionally thought of as "Fine Arts". The traditional meaning of the word Art was more akin to "Technique". The phrase "State of the art" is referring to the most advanced techniques available. This is also why many universities still call their science departments "The college of arts and letters".
It used to be that professions that required a lot of talent and/or practice in order to master a technique, such as painting or dancing were considered the "fine arts". Then, in the latter part of the 19th century or early part of the 20th century some revolutionary fine artists decided to discard these old-school techniques. In the process they succeeded in changing the traditional notion of art from something that was more technique to something that was more creative.
Unfortunately, the traditional use of the word art wasn't obliterated in the process, so many early computer scientists (particularly Knuth) started to talk about "The Art of Computer Programming." What they meant wasn't that it's a creative process, but rather that it's a technique that requires talent and must be practiced. Knuth preaches a lot of things about computer programming: programs should be simple and comprehensible, and they should implement algorithms that are mathematically proven. It might not be anything as formal as methodologies like ISO-9000, but there's no implication of creativity like, "Let's see if we can fit 30 function points on one line because that would be so cool!"
Anyway, I think the only reasonable solution at this point is to abandon the antique word "art" and start talking about software development as a "technique" or other similar term.
My $.02,
-"Zow"
#define RANT 1
#define BLEARYNESS 1
Software comes in two forms nowadays, it seems:
1. "Not for mission critical applications."
and
2. "We told you it's not for mission critical applications!"
This needs to change. There needs to be a
grade of software that its programmers will
stand behind. It should be governed
by slow and laborious reviews, written by
programmers certified for it. In other words,
we need a software version of the rigors of
civil engineering. (How often does a bridge come
with a disclaimer?) It should be expensive,
(for enough eyeballs, all bugs are shallow,
but also the price gets steep), and it should
come without deadlines. When a piece of software
can kill, the author should not be afraid to
miss his deadlines.
It's my understanding that the rotor size, dictated by the need to fit on carrier decks, prevents autorotation. I don't know enough aerodynamics to know why that is. But I believe it cannot autorotate.
--
Infuriate left and right
In reply to all those that say that using methodical, systematized approaches to software is overwhelmingly expensive:
Bullshit.
The cost of fixing your own goddamn mistakes, and the cost of maintaining your P.O.S. application is far, far, *FAR* higher than the cost of taking the time to do it right. And the cost to your users is even greater, in terms of downtime, data loss, rework, and inefficiency.
Every naysayer needs to pull his head out of his ass. Go buy some quality education and/or books on software project management.
There has been extensive, exhaustive, and rigorous research on software project management methodologies, software programming methodologies, and software maintenance practices.
They all consistently come up with the same conclusions: the more time spent in planning and design, the less time spent in programming, debugging, maintenance, and end-user failures.
There are NO excuses for the shoddy practices in use today. Better ways have been clearly identified. Your ignorance or slothfulness is an embarrassment to the profession.
Do it right, or get the hell out.
--
Don't like it? Respond with words, not karma.
NASA's software development methodology is remarkable. Their work should be the standard against which every programmer measures himself.
Unfortunately, most programmers are underinformed, and haven't the foggiest idea that there are methodologies that will reduce their error rates, increase their productivity, and meet their customers' needs fully.
It's a shame that programming still has this bullshit mystique of "art" to it. "Art" is just a lame excuse for laziness: instead of approaching the problems methodically and scientifically, it's just ever-so-much easier to take a half-assed hack-and-patch approach.
I think I'd better stop here, before I really kick into a rant...
--
Don't like it? Respond with words, not karma.
Dude, I work in a hospital. The firmware on the monitoring systems is reasonable. Almost everything else (financial, scheduling, even pharmaceutical) is the buggiest most bloated crap you're ever likely to see.
It's pretty sad... our main system is basically a Pick (that's right, Pick!) emulation layer running on top of HP-UX. It's horrible.
> Has anybody ever seen a bug-free piece of software of any complexity greater than "Hello World"?"
I've even seen Hello World with a bug in it. Forgetting the \n on the end of the printf in a shell which didn't LF when a program exited resulted in the shell prompt being printed over the top of the output...
You try building something under that kind of methodology, and see what the cost to your productivity is.
Go you big red fire engine!
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
And you've got to be able to get it to print out Hello World in the correct language from a single mouse click,
and nowadays you also have to do exhaustive research to find out if someone has already patented the idea of printing "Hello World" with a single mouse click.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
My Win98 machine isn't as buggy as that airplane.
And you stayed on it? Of the four passengers to board that plane, we know which one isn't the smart one!
--Jim
I can't believe nobody's posted about Knuth yet. Donald E. Knuth is famous for writing high-quality software, and even proving some of it (all of it?) correct. He offers rewards to people who find bugs in his code. The reward for TeX and METAFONT is described here: http://www-cs-faculty.stanford.edu/~knuth/abcde.html, under the heading "Rewards".
--Jim
eight? i only see six:
1. right-curly-brace after `void' instead of right-parenthesis.
2. left-parenthesis starts the block of code, instead of left-curly-brace.
3. printf args start with left-bracket, instead of left-parenthesis.
4. n\ instead of \n.
5. string arg to printf closes with a single-quote, not double-quote.
6. comma after printf(), not semicolon.
now, i could be lame and count not returning a value from main(), but that's a warning, not an error. and it only shows up if you use -Wall. and let's face it, no real hacker ever uses -Wall.
---
Folks,
I think people don't understand that fly-by-wire systems in general are extremely tricky to set up and work properly.
Remember, when the F-16 was designed, the use of FBW was considered extremely daring--and it took a long time to work the bugs out of that system. Remember the Airbus A320 jetliner? Airbus thought its systems were better until that bizarre crash at Mulhouse, France in 1988, where the plane literally flew straight ahead into the hill. Or the numerous problems Saab had developing the FBW system for the JAS 39 Gripen fighter.
Even now, FBW systems are still tricky to use--witness the problem with the Boeing 777 and the many complaints from passengers about the plane pitching up and down substantially during turns, causing motion sickness; Boeing had to carefully reprogram the FBW software so the plane wouldn't pitch up and down so much during turns.
I think the problem with the V-22 Osprey is that the FBW systems are extremely complicated, and it will take a tremendous amount of work to get everything to work correctly.
Raymond in Mountain View, CA
A) Their own buggy hello world programs gone wrong (modest) B) Other peoples buggy hello world programs gone wrong (liars) C) People who fear technology (luddites) D) People who say it was the human error (technocrats) E) People who don't know what Hello World is (people who use AOL)
---
--
Insert Witty Sig Here
Right, and hand-held calculators aren't correct to 9,000 significant digits. What's your point? That limitations are bugs? I would suggest that well-documented limitations are not bugs.
Even if Naur proved his program correct, about 5% of bugs are caused by operating system, library, or hardware bugs. He would still need to test. For example from work, I work on a Win32 program that uses sockets. Windows 95, NT4, and 2000 have slightly different Winsock bugs. :-(
cpeterso
Most programmers, including myself, do money-critical software. Unless we are specifically trained to write life-critical software, it would be dangerous to even try. Money-critical coders would write unacceptable bugs into life-critical code, killing people. OTOH, life-critical coders working on money-critical code would get fired for failing to show progress.
Life-critical software is rare; the Osprey fly-by-wire and similar code is part of that. Note that its nav computer, comm scrambler (if it has one), and similar systems are not life-critical.
The reliability of life-critical software must be incredible, by commercial standards. Think of a pacemaker. To make sure that failures are rare, it needs to have an uptime of centuries; that is, we should have hundreds of pacemaker-years before a software glitch.
Even Linux doesn't have that sort of reliability. It claims uptime in years, not even decades.
If you want to write life-critical software, you have to know exactly what you are doing. You need to test extensively. And you need to limit the scope to the bare minimum.
When you build an airplane, the life-critical software they put on board takes about as long to write as the plane takes to build. That means that a "simple" fly-by-wire system can take five to ten years to write.
Imagine writing commercial software that way. If you really wanted to, you could build a life-critical office suite. However:
The priorities in money-critical software are radically different. We throw away the need for as much quality so that we can get the code out the door in six months, rather than ten years. For that, we get an amazing array of features, and very affordable software.
If we built money-critical software to life-critical standards, we might just have gotten to QDOS by now. If we built life-critical software to money-critical standards, you'd never see me anywhere near a hospital or an airport.
--The basis of all love is respect
Add in software to control some aspect of flight and the pilot is automatically "further away" from the craft he's aviating. It sounds obvious and I'm not going so far as to say that computer-controlled systems are evil in and of themselves, but it's an aspect of modern flight where the negatives are often overlooked.
I've seen compelling evidence that another layer on top of the data that the pilot must interpret is a Bad Thing. Anything that changes the pitch of a chopper's rotors other than your left hand on the collective should be viewed with extreme caution IMHO.
--- Hot Shot City is particularly good.
I was going to cite the Shuttle as well. Here is a great article about it.
The hospital that I work at recently spent a couple of hundred thousand dollars on a very complicated piece of patient scheduling (it does other stuff too) software.
The company is MediServe (just so that you guys can know and avoid them in the future) and they sold the hospital this software that was at version 1.x. We should have gotten a discount from them for being a beta site.
First of all, this thing requires an SQL server in the backend to handle the data. OK, that's no problem. But this software can't even talk to M$ SQL server! It required an interface pc to translate between the desktop software and the SQL server.
After that, the real trouble started. Even though it was a 1.x release, the project manager felt that it should have been in the early beta cycle. He would spend hours on the phone relaying bugs to the developers. These were bugs that our poor users were wasting time documenting and working around. (Luckily it is currently running in parallel to the original system.) In one week, we received 5 (yes, five!) new versions of the software. That was only in one week!
My theory is that some moron messed up on the budget, they started running out of capital, and they had to find some suckers to pay them. And they got paid to have someone else test.
A word of caution to everyone out there: investigate new applications before you buy!
----------------------
Opportunities multiply as they are seized. --Sun-Tzu
In order for software to function perfectly, four things must be present: a perfect operator, perfect hardware, a perfect operating system and perfect code. If you've got buggy hardware or an unreliable OS, even something as simple as "Hello World!" can bring you down. And if your human operator makes a mistake, then the entire thing collapses like a house of cards.
Some people think that the provably-correct school of software design is a panacea to all ills. By writing code in a language like SPARK (an Ada83 subset), you can then use mathematical tools to formally prove each part of your code to be "correct". The reason why I put the word "correct" in quotes is because there is no good, formal definition of what "correct" means. The code may be provable to do exactly what you intend for it to do, but that's not necessarily the same as what it should be doing. If you're writing code to control the flight of an F-16, it doesn't matter how perfect your code is if your code makes the assumption "hey, if one engine fails, we can just switch to the other".
(For those who aren't aeronautically inclined, an F-16 has one engine.)
The way to produce better code is clear. Publish your code; do internal audits; put it out in the field in non-critical environments for real-world testing; get a good feedback loop going from your users. This results in code we can trust; it does not result in bug-free code.
Perfect systems require perfect operators, perfect hardware, perfect operating systems and perfect software. Even if you can get the last three, you'll never get the first one.
So mod me down, if you must...
The article is kinda thin on details in exactly what way the aircraft acted, but basically these three events occurred:
1. Hydraulic line is severed, warning lights go off, including one on the reset button.
2. Per training, pilot hits reset button once, then multiple times.
3. The rotor pitch changes, causing the craft to inevitably crash.
Now, the article says that the hydraulics have had problems in the past... Here is my take:
In the three pieces I outlined above, two stand out as being really underdefined (or wrongly defined) in the article: Number 1, in that they don't mention what the hydraulic line controlled, and number 3, in that they make it ambiguous as to whether the blade pitch changed, or the rotor (ie, the pod) pitch changed.
The V22 is a tilt rotor craft. Say perhaps it was the pod pitch that changed, and not the blade pitch, and the hydraulics that were damaged were the ones controlling the pod tilt on one side. The reset button is hit, computer says "go into hover mode" and only one pod tilts...
See where I am going? Perhaps what the reset button did was intentional - but the programmer assumed that both pods were working. I am not saying there wasn't a bug - somewhere along the line there was - but the way the article was written doesn't really tell what happened...
You may mod me down - I am sure I have various things incorrect (for all I know the tilting is done with electric motors - but I doubt it)...
Worldcom - Generation Duh!
Reason is the Path to God - Anon
There's a word for this, it's called "overengineered". The Marine Corps wanted a new helicopter, what they got was an over-engineered, overly-expensive, overly-complicated monstrosity that has already taken twice the time it would have taken to develop a new helicopter, and *IT'S STILL NOT READY*. In the meantime, their aged fleet of CH-46 Sea Knights (many with authentic Vietnam-era bullet-holes) is just getting older, and failing more frequently.
Folks, the Corps didn't *want* this piece of shit, they got it rammed down their throat by a congressman who is in the pocket of the manufacturer, and his buddies, who owed him a favor. Now they're stuck with it.
The Marine Corps is the smallest of the military branches, with the smallest budget. They *don't like* complexity or high cost. They want extremely reliable, easy to maintain equipment at reasonable cost. This monstrosity doesn't qualify for even one of these criteria, let alone all three. They even got the name wrong, it should have been named "Albatross".
You cannot believe how grateful I am that I am no longer with the Marine Corps, just so I *never* have to get in one of those things.
Too true. One of the stewardesses questioned my sanity, and all I could say was, hey, I don't really have a high regard for my own personal safety.
To that end, I also remember the pilot leaning back in his chair and yelling back at me something about some new system glitch, and whether or not I thought it was serious, and I yelled back my standard reply: "(whatever it was)? Hell, that's non-essential, ain't it? Screw the FAA, let's roll! How many people can we possibly kill here?"
But actually, since the "cancellation" of the flight, I knew that all the flights leaving that morning would be *packed* full of grumpy, anxious, sleep-deprived people, many of them with babies. Given the choice between that and a plane all to myself with all the first-class benefits, even a wonky Airbus plane, it wasn't much of a choice. What're the odds, right?
Besides, by that time me and the flight crew were *tight*, man. We were all goin' down together.
Uh... not in the Mile-high Club sense, though.
But if ever there were an opportunity, that would've been it...
Strangely enough, I did. Shame about the flight crew, though. Damnedest thing.
And come to think of it... I've never been sick, either. And I was only faking that car accident injury because I loved my wife. Huh. Maybe I'm... Unbreakable.
I'll be right back - I'm gonna go test that little hypothesis.
I was waiting for a flight on an Airbus one night when they announced that they needed to replace a part, and while it might arrive that night, they told everyone the flight was probably cancelled. Not having anything better to do, and being stuck at the airport anyway, I decided to wait it out.
By the time the plane boarded at 3AM, I was one of only four passengers. Shortly after we boarded, the Airbus computers started having fits. They opened and closed the door to the plane 5 times, went through one shift change of mechanics, pulled away from the gate once, power-cycled twice and finally got the computers working properly at 8AM, exactly 12 hours from when I was originally scheduled to leave and minutes before the flight crew timed out.
The cool thing was, I was the only passenger left at that point. The other three chickened out and decided to go on other flights. I think the pilot would've liked to have done the same thing, actually... he was a little nervous. He told me at one point, after a flap malfunction light came on, that he had a real bad feeling about this flight. They kept asking me what I thought, and my answer was always the same: hey, we're flying light, we don't *need* flaps, right? Let's just go! Haven't they got real long runways in Detroit?
The pilots say they like flying the Airbus, but I dunno. My Win98 machine isn't as buggy as that airplane. You should've heard the mechanics bitching about Airbus computers... I already hate the seats they use, and now I sure don't trust the systems. I try to avoid Airbuses wherever possible these days...
But it sure was fun being the only passenger on a big jet like that...
7. Incorrect parameter list to main()
8. No return value from main specified.
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
Speaking as a former employee of a defense contractor, if I were the Navy brass, I'd court-martial the guy who considered the system to have passed FQT without an adequate test.
Whenever we had a system go through QT, we were not allowed to have any priority 1 or 2 problems.
Priority 1 = System failure. Danger to operators.
Priority 2 = Danger to operators.
Priority 3 = Nuisance, no workaround.
Priority 4 = Nuisance, workaround.
Priority 5 = other (minor).
What jerk allowed this obvious priority 1 problem through? I hope they find his ass and court-martial him!
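For what it's worth, the QT rule described above is simple enough to write down as a gate. This is only a hypothetical sketch: the names `Priority` and `passes_qualification`, and the idea of passing in a plain list of open problems, are made up for illustration and are not taken from any real contractor's process.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Problem priorities as listed in the comment above (1 is most severe)."""
    P1 = 1  # system failure, danger to operators
    P2 = 2  # danger to operators
    P3 = 3  # nuisance, no workaround
    P4 = 4  # nuisance, workaround exists
    P5 = 5  # other (minor)


def passes_qualification(open_problems):
    """Return True only if no open problem is priority 1 or 2."""
    return all(p > Priority.P2 for p in open_problems)
```

Under this sketch, a single open priority 1 problem such as a misbehaving reset button fails the gate no matter how clean the rest of the list is.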
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
they haven't got any new helicopters since the Blackhawk
What the hell are you talking about? My last job was working on the avionics upgrade for the AH-1Z Cobra and UH-1Y Huey choppers!
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
Let's see, roughly 100 casualties over 100 hours.
Extend that over the eight years of the Vietnam War and you get slightly over 70,000 KIA, more than the roughly 58,000 names inscribed on the Wall.
Yeah, that's progress alright.
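The extrapolation above is easy to check. A minimal sketch, assuming a flat rate of one casualty per hour and 365-day years (both simplifications of the comment's rough numbers):

```python
HOURS_PER_YEAR = 365 * 24          # ignoring leap days
casualties_per_hour = 100 / 100    # "roughly 100 casualties over 100 hours"
war_years = 8                      # duration used in the comment

projected = casualties_per_hour * HOURS_PER_YEAR * war_years
print(projected)  # 70080.0
```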
k.
--
"In spite of everything, I still believe that people are really good at heart." - Anne Frank
The report said that the software had been tested in December 1996 but that those tests did not adequately check the reset system.
Speaking as an embedded real-time programmer, if I were designing software to control an aircraft full of servicemen, and I let my software get released into an aircraft without ever having tested pushing the "reset" button in flight, I'd take myself out in the back yard right now with a .357 and do the world a favor.
I write embedded communications software, and we spend weeks on cold and warm reboot scenarios per-blade and per-chassis.
The worst thing that happens if my software goes down is that somebody's pr0n gets cut off.
Ho-ly shit, folks. Can we all just go out, buy a clue, and mail it to these people?
--
What happens when you outlaw guns
Well SCREW YOU! We are not slaves to be worked to death...
Why heck no! Don't be absurd, sir! You're not slaves! As all students of economics know, slaves receive no recompense whatsoever beyond the minimum barely required to sustain organic life. You programmers, on the other hand, get paid so well you can almost afford to live in San Francisco! In fact, if you're really lucky in your choice of employers, in return for all that exorbitant, nerves-destroying overtime, you get richly rewarded in (drum roll) stock options!
ha ha ha ha ha ha ha WD "f*ck computers" K - WKiernan@concentric.net
and then later...
So let's see... it's not a showstopper. A reset button that instead changed the pitch of the rotors and caused uneven acceleration (which the pilot could not have anticipated). That sounds like a showstopper to me. And we trust people like Rep. Weldon to run the country? Well, as long as he keeps the factory in his district open, I guess that's all that counts, right?
Sigh.
If I could only live my life with my threshold at 4...
It only needs to return a status code when you make main an int. So step 5 is not a bug until you do step 1.
Riiiight, you deliver me a development CV-22 aircraft and I'll get right to debugging that as soon as I download the source code from Boeing's website.
http://www.fastcompany.com/online/06/writestuff.html
That has a good article on how the programmers at NASA write their software. They're on budget and on time.
Has anybody ever seen a bug-free piece of software of any complexity greater than "Hello World"?"
I once wrote a "Hello Universe" that I was fairly confident was relatively bug free.
When will Windows be ready for the desktop?
This gets into the whole Boeing vs. Airbus commercial airplane debate. Boeing has had a history of listening to what pilots want in an airplane, and they all want force-feedback on the controls. Therefore, the 777, currently Boeing's only fly-by-wire aircraft, has force-feedback controls. From a "touch and feel" point of view, you will get the same response from the controls of a 777 as you would from any other mechanical/hydraulic flight control system.
Airbus, on the other hand, decided to skip the force-feedback and use a sidestick as their control. Therefore, the pilot cannot feel what the airplane is doing by just resting his hand on the controls. Later models of their fly-by-wire jets have force-feedback as I understand.
You're also right about the air war of course, which again furthers my point. It lasted for weeks, but because of superior technology, we were able to beat down the Iraqis at minimal cost in lives.
I do remember the Valkyrie. I do remember the SGT York. But I'm also familiar with the Blackhawk, the F-16, and a few other extremely complex pieces of technology that work rather well. My whole point was that we shouldn't throw the baby out with the bathwater.
I also know that flashy technology doesn't always win the day. Remember the GAO study about all those Scuds the Air Force "wiped out"? Turns out the only Scuds that were ever provably killed were taken out by Special Ops units on the ground. The Pave Low and other very sophisticated equipment certainly helped them get there, though. ;-)
Read the EFF's Fair Use FAQ
maybe that sentence would be better as "one reason to fear poorly implemented technology".
Let's not forget that the technology in the V22 is akin to the technology that allowed us to wipe out most of the Iraqi army in 100 hours with an incredibly small number of Coalition casualties.
This reminds me of the negative press coverage the M1 tank received during trials because it used a turbine engine similar to those used in helicopters. The engine had problems during trials and was roundly criticized for being costly and "gold plated".
That same turbine engine, after the glitches were fixed, turned out to be more reliable than the old engines it replaced. The M1 went on to become the most feared and capable main battle tank on the planet, easily taking out T-72 and T-80 tanks before they could even DETECT the M1.
Technology in military systems is always risky in one way or another, but the ultimate payoff is usually more than worth that initial risk.
The V22 has been plagued with problems from the beginning, and they may be pushing the envelope a bit too far with an overall concept that is just not workable. But if they do work out the glitches, the V22 will go from being an example of "too much" technology to an example of "amazing" technology.
Read the EFF's Fair Use FAQ
Planes don't "crash" anymore. Now they "blue screen."
Only if they're landing in the water. On land it might be a black, grey, yellow, green, or white screen.
Tarsnap: Online backups for the truly paranoid
The important question to ask is "Why doesn't good software exist?". The reason is that the consumer wants it yesterday and will pay for an unfinished, incomplete project (see almost all Windows software). This leads to cut & paste programming, re-use of lots of pre-done code supplied by various manufacturers, and the like. If the customer demanded well-sorted-out code and refused to "upgrade" to the next best thing, then you would see better coding.
Last but not least, let's not forget the customer, the US govt. The Osprey has been considered a boondoggle from the get-go. The pressure from Congress and the media to get the project completed has been extremely intense. I'm sure a lot of pressure was put on the coding team to "get it done now". It's too bad that they didn't learn from the Space Shuttle: over budget and over-designed, but it works. It has surpassed all expectations and is doing things now that it was never designed to do.
There is an engineering credo there, the famous "factor of safety": a number usually applied to equations to take the "unknown" into account. Looks like the Osprey designers didn't make their "factor of safety" high enough.
"Science is about ego as much as it is about discovery and truth " - I said it, so sue me.
As many of you may remember, back in the old days commercial jetliners had three crewmembers. Pilot, copilot and radio, iirc. But, due to increasing automation and computer control, they were able to reduce it to two people. Now, things are becoming so automatic that Boeing is planning on reducing the crew requirement on their next generation of planes to just a pilot and a dog.
Why a dog?
To bite the pilot's hand if he tries to touch anything.
Whoo, I crack myself up.
The only "intuitive" interface is the nipple. After that, it's all learned.
"The question of whether a computer can think is no more interesting than that of whether a submarine can swim" -EWD
Other A320 problems have generally involved having flight deck systems in the wrong mode. This is a recurring problem with complex aircraft.
The Osprey problem, though, may be an out and out bug. That's different.
Proving software correct shouldn't hold much weight. Testing is really the only way to go.
For instance, here is a pdf that mentions Naur "proving" a very simple ALGOL program correct, but obviously not testing the code. The program was only 25 lines long and reformatted text (basically a word wrapper). Later on there were (at least) 5 bugs found in the code.
Which goes to show that testing is what makes software robust.
as i read it, the hydraulic system degraded, and the pilot hit reset 8-10 times as the plane went down (per the book).
no offense, but given all the problems with the hydraulics on this beast, and the people ordered by the military to lie and falsify records regarding those problems, i think a s/w glitch is a bit later on the list.
i do think it's a shame that the software wasn't written well enough, or tested well enough, to save those people from a mechanical failure.
maybe management is right -- just give it all to microsoft and let the H1-B's deal with it.
Treatment, not tyranny. End the drug war and free our American POWs.
See my user info for links.
A group of technical leaders were at a seminar called "Making Reliable Software". The teacher at the seminar, in order to make the participants appreciate the problems of reliable software, posed this question: "How many of you would be willing to fly in an airplane that was controlled by software that your company created?" The technical leaders thought about it, and looked around at each other. Nobody raised their hands, except for one lone guy in the back. The teacher was surprised. "You, in the back," he asked. "Why are you so sure that it would be safe?" "Simple," replied the programmer. "Knowing how my guys code, the plane wouldn't even be able to pull away from the gate."
Let's say that you have 30 million lines of code. Is it possible to say "this code works 99.999% of the time"? What about 300 million? What about 30 billion? Is there some critical mass for software where it then becomes theoretically too complex? Will there be a similar limit for hardware? Any URLs?
Monkey sense
http://www.bootyproject.org
OtakuBooty.com: Smart, funny, sexy nerds.
First they crash, then they crash again!
--
Dyolf Knip
Ever seen Pick's D3 database software? By far the most worthless piece of junk I've ever had the displeasure of working with. Didn't do one tenth of what they claimed and failed to do anything on a system, client or server, that was running IE 4.0 or above. What's the connection between a browser and a DB? Beats me, but IE 5.5 came out before they got around to fixing it.
--
Dyolf Knip
I suggest you take your know-it-all attitude and use it to do something other than sully a dead man's honor.
The Osprey is a plane which requires computer control. No human being is able to compensate or be aware of all of the servos and control surfaces on that aircraft.
You are happy to sit back and say "You try to write a program so it boots up and takes control of an on-going process without glitches", defending the team that designed the software. Yet you question the pilot's intelligence for following his procedures during a few split-second moments of time.
It's a shame that young Marines and aircrew need to die because overpaid 'engineers' who fuck up the design of systems sit back and collect their consulting fees. If a civil engineer designed a bridge which collapsed under the weight of passenger traffic, he would lose his license and possibly face criminal charges. Software 'engineers' get to continue the project and collect more fees.
Conformity is the jailer of freedom and enemy of growth. -JFK
Very good! That has a very high density of bugs, I would say. I count at least 5 bugs in that code (4 if it's pre-ANSI C):
1. "void" rather than "int" on main.
2. Need to include "stdio.h".
3. Invalid declaration of main (which is not illegal in pre-ANSI C)
4. Use of "\r" rather than "\n".
5. Need to return a status code.
Any bugs that I missed? Any way to get an even higher bug density (while still displaying "Hello World")?
I guess you could do 'puts("Hello World\r")', which not only has the "\r" rather than "\n", but is also redundant! That counts as another bug in my book.
--
Sometimes it's best to just let stupid people be stupid.
Er, 'printf' doesn't return a status, it returns the number of characters printed. You don't need to check it.
--
Sometimes it's best to just let stupid people be stupid.
But you're assuming that we, as Americans, care about internationalization issues. [RM101 ducks, as the truth of this strikes too close to home...] :)
--
Sometimes it's best to just let stupid people be stupid.
I used to write drivers that communicated with medical monitors. Talk about garbage! It was the exception rather than the rule that the monitors worked exactly how they were documented.
My vote for the medical monitor with the worst software and by far the worst communication protocol has to be the Corometrics 115 monitor. Anyone else ever have to deal with that piece of crap? My condolences if so. It took me literally *years* to get the driver to work perfectly in all situations, because there were so many versions of the firmware that caused it to act differently in different situations. It was always fun debugging software running in a live patient environment. :)
--
Sometimes it's best to just let stupid people be stupid.
Greater complexity than Hello World? That's nothing! How about concatenating the strings "Hello" and " World!" together and THEN outputting it to the screen! Man, oh man. Still need to do the regression testing, though.
-- dR.fuZZo
When Dick Cheney was Secretary of Defense the DoD tried to kill the Osprey because it's overpriced and unreliable. Idiots like this guy kept it alive, solely because the plants are in their districts.
sulli
RTFJ.
Planes don't "crash" anymore. Now they "blue screen."
Milo
Jeez, some people!
-atrowe: Card-carrying Mensa member. I have no toleranse for stupidity.
This is why I shudder when I think about these same companies rewriting the air traffic control system. Yes, it has to be done, but I worry that some people are going to be unwittingly involved in some "real world testing."
No human being is panic-proof when falling from the sky. You can train 'em all you want, and they will do as they're told, but they still get scared.
Er, no. Most planes today are still good airframes that can fly/glide by themselves with no computer assistance. The X-29 was unique in its requirement of constant computer assistance. Helicopters are different in that when the engines turn off, you don't exactly glide per se...
The cost of developing verifiably bug-free software is not justifiable in some situations. The use of the word many is a value judgement that is at best subjective.
NASA, for example, can ill afford buggy software.
Medical centers and hospitals can ill afford buggy software.
The Army and armed services can ill afford buggy software.
Now the real question is, can the average user afford buggy software?
I dunno, I would much prefer no bugs to bugs. How much would I pay for that?
I don't know.
Geek dating!
GPL Deconstructed
Back in the early '90s I worked for a defense contractor that was involved with the testing of the V22's engines. I remember sitting in a meeting, looking at some of the mechanical drawings and saying to myself, "glad I'm not the test pilot". It is a very complicated design. Most military aircraft like the V22 have quadruple-redundant electronic systems, and even the engine ECUs talk to each other and monitor the one that is currently in charge; if the others determine that the main ECU is malfunctioning, the other three will vote to remove it. Very complicated algorithms. Also, to keep both rotors powered when one engine fails, a high-speed drive shaft is used to transfer power from the good engine to the transmission on the side that lost power. The drive shaft has a very small diameter but would spin at something like 10,000 rpm. Low torque but high power. The designers are thinking about safety, and I think this bird can work; it's just not being funded with the enthusiasm of the old cold-war projects.
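To give a feel for the "vote to remove it" idea, here is a rough sketch. It is emphatically not the V22's actual algorithm: the function name `select_active_unit`, the strict-majority rule, and the "lowest-numbered peer takes over" policy are all assumptions made up for illustration.

```python
def select_active_unit(active, votes):
    """Pick the unit in control after one round of peer voting.

    active: index of the unit currently in control.
    votes:  dict mapping each monitoring unit's index to True if that
            unit believes the active unit is malfunctioning.
    """
    # Only the peers get a say; the active unit does not vote on itself.
    monitors = [u for u in votes if u != active]
    faults = sum(1 for u in monitors if votes[u])
    # Assumed rule: remove the active unit only on a strict majority of peers.
    if monitors and faults * 2 > len(monitors):
        return min(monitors)  # assumed policy: lowest-numbered peer takes over
    return active
```

With four units, `select_active_unit(0, {1: True, 2: True, 3: False})` hands control to unit 1, while a single dissenting peer leaves unit 0 in charge. The real systems also have to cope with peers that are themselves faulty, which is where the "very complicated algorithms" come in.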
Having once worked on a project where, if I told you what it was I'd have to kill you, I find the notion of open-sourced military software very funny... But aircraft control software could easily be open-sourced, even for military aircraft.
Maybe so, but since they haven't got any new helicopters since the Blackhawk (30 years old?), they do really need the Osprey now.
IIRC, the original design required the builder to thread hundreds of rods through the whole multi-storied catwalk, from bottom to top. I'm not an expert on these things, but it would be difficult to put together a scale model on a tabletop, and to build it while hanging in midair? So I'm not surprised the builder changed to short rods -- but it meant that instead of the load being carried through the intermediate box beams by the long rods, it transferred from top rods to bottom rods through the box beam. Furthermore, the box beams were formed by welding two U-channels together -- so the rods were secured by nuts bearing on the welds. The structure had only a fraction of the design strength; unfortunately, it was enough to hold until a crowd went out on the catwalk...
As I said, I'm not a structural engineer, but it sounds like the engineer on that design had no experience with actual building operations. And the builder should have gone back to the engineer with the proposed changes to re-evaluate the strength... I am an EE at an electronics manufacturing plant. A large part of my work is dealing with board designs that are unbuildable and untestable. Obviously the engineering schools still don't teach the practical aspects of making boards. We get the bad designs changed, but we get them changed by going back to the designer, not by unilaterally changing them -- whether or not the board is going into a safety-critical application...
I did mix up the dates a bit. It's at least 20 years since the first Blackhawk protos flew, and over 15 since they started using them -- however it wasn't flying right until 1988 or so, and I know that because I was peripherally involved in a critical avionics fix. The tail-plane was electronically controlled, the cable to that controller wasn't adequately shielded and filtered, and radio signals in the military voice band could be picked up and jam the tail-plane controller to full nose-down. In the typical Blackhawk mission, it would be at 100 feet or so above ground, and so it was almost impossible for the pilot to pull it out -- and difficult afterwards to tell whether something malfunctioned or the pilot just misestimated wind and trees... There were a number of crashes before the Army would admit there was a problem. I think a system I was working on in 1987 forced them to fix it -- it included a transmitter powerful enough to drive the chopper full nose-down every time, and because it was usually fired up at 5,000 feet there was time to turn off the transmitter and recover control, so the pilots got back alive to report. And there is nothing angrier than a pilot whose aircraft just tried to kill him...
I suspect the reset button did exactly what it was supposed to. It reset the computer. This is NOT going to fix a bad hydraulics line. In fact, it makes the computer forget what was just going on, so it's probably going to make a bad decision as soon as it starts taking control. It is not impossible to store enough status information so the computer could have picked up just where it left off before -- but that isn't such a great idea when the pilot punches the button because the computer was screwing up.
Actually, a decent embedded system (RTOS or not) will re-boot itself when it needs to. That is, you got a "watchdog timer" which the main program loop keeps resetting; if it doesn't get reset, it resets the computer. But apparently they gave the pilot a pushbutton just in case the watchdog timer crapped out too. But he pushed it when the real trouble was a chafed-through hydraulic line. And when it didn't help, he pushed it nine more times...
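For anyone who hasn't seen one, the watchdog idea described above can be sketched as a small simulation. Real watchdogs are hardware counters serviced from C or assembly; this Python version only models the logic, and the names `WatchdogTimer`, `kick`, `tick`, and `run` are hypothetical.

```python
class WatchdogTimer:
    """Simulated watchdog: counts ticks since the main loop last kicked it."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.ticks_since_kick = 0
        self.resets = 0

    def kick(self):
        """Called by the main loop to signal that it is still alive."""
        self.ticks_since_kick = 0

    def tick(self):
        """Advance one clock tick; reset the system if the loop went quiet."""
        self.ticks_since_kick += 1
        if self.ticks_since_kick >= self.timeout_ticks:
            self.resets += 1            # stands in for a hardware reset
            self.ticks_since_kick = 0


def run(loop_alive_per_tick, timeout_ticks=3):
    """Simulate a main loop; each entry says whether the loop ran that tick."""
    wd = WatchdogTimer(timeout_ticks)
    for alive in loop_alive_per_tick:
        if alive:
            wd.kick()
        wd.tick()
    return wd.resets
```

A healthy loop such as `run([True] * 5)` never triggers a reset, while `run([True, False, False, False, True])` triggers exactly one. Note that nothing in this mechanism knows anything about hydraulics: it only detects a stalled program.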
As I see it (and I'm not claiming expertise), the problems were, in descending order of importance:
1. The designers failed to route and fasten hydraulic lines so they wouldn't chafe.
2. The Marine Corps and aircraft mfg both ignored warnings about chafing lines.
3. The system for some reason told the pilot he needed to reset the computers although the computer had nothing to do with the hydraulics failure.
4. The software changed propeller settings erratically while booting up.
5. They didn't give the pilot an effective manual override for any controls the computer might be messing up.
6. The pilot panicked a little. Understandable, but pilots and Marines are supposed to be panic-proof...
There are a few flaws with your list, but they're moot because the answer is:
0. Nobody tested the reset button.
There's no reason not to test this feature. The button was there for one reason--safety in a critical situation--and there were no doubt several pages of requirements written for the system's behavior when it was pressed. If the article is wrong and they did test it, and if it exhibited deviations from the required behavior, the planes would have been grounded until the problem was fixed and retested. Someone either lied about the completeness of the testing, or signed off on the risk.
Systems I&T on the Osprey program just killed a whole bunch of people. The rest of the problems you list merely conflated this one long enough for it to become deadly.
--Blair
We need a better model or computers will remain the elusive and vile creatures that, to the masses, they are...
you think it's easy, but you're wrong...
Atomic Energy of Canada, Ltd. (AECL) built a medical radiation device called the Therac-25 that had a major software bug in it that killed at least three people. If the operator hit the keys on the keyboard in the wrong sequence, the machine could deliver roughly 100 times the intended dose. Massive radiation poisoning, and the victims died in a few days. They finally traced the problem to a coding error made by a contractor.
This, of course, is a massive over-simplification of the problem. The full story can be found here
-----------------
www.lucernesys.com Horizon: Calendar-based personal finance