Slashdot Mirror


Space Shuttle Software: Not For Hacks

Jeff Evarts writes: "This article in Fast Company talks about the process the Shuttle Group uses to make software. At first it seems too predictable: a very cool project but no hacks, no pizza-and-coke all-nighters, etc. Then, however, it goes on to talk about why: They have an informed customer, they talk to that customer until they have a very clear idea of what is wanted, they have a budget focused on prevention, and they focus on fixing the process and not blaming the individual.

As someone who's done more than his share of late-nighters, it was an interesting view into the mission-critical environment. Maybe there are a few software firms out there that would rather spend some of their money on better processes than on technical support engineers. Maybe a little more market research and a little less marketing, too. A good read."

These guys are "pretty thorough" the way Vlad the Impaler was "a little unbalanced." Still, you have to wonder how they can claim single-digit errors among thousands of lines of code, but I guess the proof is in the rocket-powered pudding. And lucky for them, their target platform was recently upgraded.

178 comments

  1. Who wrote the Mars landing software? by ahg · · Score: 2

    While the caliber of this group seems unbeatable, it's too bad NASA doesn't apply this rigid development model to its unmanned spacecraft. -- I still don't understand how a difference in units (English vs. metric) managed to go undetected!

    The only thing I could think of after hearing that such an error caused a multimillion-dollar craft to crash was IDIOTS - any scientist should be using SI units today.

    --

    --Aaron Greenberg

    1. Re:Who wrote the Mars landing software? by Ig0r · · Score: 1

      ... any scientist should be using SI units today.

      That might be true of the scientists and engineers, but not necessarily of the contractors or of other government agencies.

      --

      --
      Soma: because a gramme is better than a damn.
    2. Re:Who wrote the Mars landing software? by sigwinch · · Score: 1

      ... any scientist should be using SI units today.

      Wouldn't have helped -- the person reading them just blindly typed in the digits. Suppose they blindly assumed N*s, but they were supplied a value in kN*s? Both numbers are SI units, but the mission would fail.

      In fact, using a random collection of units would probably have saved the mission: they'd be forced to check everything very carefully. Dyne*fortnights, anyone? ;-)

      I'm still amazed that a highly-educated professional in charge of rocket engine firing would fail to rigorously check units...
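
To make the N*s vs. kN*s point concrete, here is a minimal Python sketch of carrying the unit along with the number so that a bare digit string can't be typed in blindly. The `Impulse` class, the `TO_NEWTON_SECONDS` table and `total_impulse_ns` are made up for illustration; nothing here is taken from any real flight software.

```python
from dataclasses import dataclass

# Hypothetical conversion table: factor that turns each unit into newton-seconds.
TO_NEWTON_SECONDS = {
    "N*s": 1.0,
    "kN*s": 1000.0,
    "lbf*s": 4.4482216152605,  # pound-force seconds
}


@dataclass(frozen=True)
class Impulse:
    """A number that cannot exist without its unit."""
    value: float
    unit: str

    def in_newton_seconds(self) -> float:
        # Refuse to guess: an unrecognised unit is an error, not a default.
        if self.unit not in TO_NEWTON_SECONDS:
            raise ValueError(f"unknown impulse unit: {self.unit!r}")
        return self.value * TO_NEWTON_SECONDS[self.unit]


def total_impulse_ns(readings):
    """Sum a list of Impulse readings, normalised to newton-seconds."""
    return sum(r.in_newton_seconds() for r in readings)
```

With this shape, mixing `Impulse(2.0, "kN*s")` and `Impulse(500.0, "N*s")` still adds up correctly, and a reading in some unit nobody planned for fails loudly instead of quietly being treated as N*s.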

      --

      --
      Kuro5hin.org: where the good times never end. ;-)

  2. Re:Processes in software engineering. by DavidpFitz · · Score: 1
    Would the next generation programmers write in "logic language" instead of C++?

    Most likely not - but automatic verification of programs using logical constructs is a big growth area.

    You can test a program with all possible inputs, and have a clean run. But this does not mean the program is 100% reliable. You must prove the program is correct if you want to be sure it is good enough for circumstances such as shuttle or aeroplane flight.

    With all the complexities of semaphore control in parallel computing, you really have to make sure a program enters and leaves critical sections at the correct times, without anything else running (that has been designated mutually exclusive).

    Many experts believe that some Airbus crashes were caused by incorrect verification.

    On a single processor machine, this is much easier, but how many space shuttles do you know of that only have one CPU?

    Have a look at some of the links at Dr. Mark Ryan's page (University of Birmingham) for some more info.
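
For anyone who hasn't met critical sections before, here is a small Python sketch of the mutual exclusion idea, using a lock from the standard `threading` module. The `Counter` class and `run_workers` helper are hypothetical names. Note that this only demonstrates the mechanism: a clean run is a test result, not the kind of proof the comment above is asking for.

```python
import threading


class Counter:
    """Shared counter whose update is a critical section."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute the read-modify-write below.
        with self._lock:
            self.value += 1


def run_workers(n_threads=4, n_increments=10_000):
    """Start n_threads threads that each increment the counter n_increments times."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Because every update happens while the lock is held, the final value equals `n_threads * n_increments`. Take the lock away and the program may still pass most test runs, which is exactly why testing alone is not convincing for this class of bug.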

  3. Re:Seems almost like ISO... by Coz · · Score: 1
    That's the point of SEI Level 5 - "Continuous Process Improvement."

    I've worked on programs assessed at Levels 3 and 4, and supposedly the folks I work with now are Level 5 (I know they made 4, but I'm not sure the certification for 5 is finished). I grind my teeth sometimes at the layers of process we have to wade through to get things done - but every six months or so, they make changes to (hopefully) make it better.

    The SEI's not just working with software - they're developing models for System Engineering and Integrated Product Development, as well as Personal and Team software process models for small and independent-minded folk. Your tax dollars at work!

    --
    I love vegetarians - some of my favorite foods are vegetarians.
  4. Structural problems with the software industry by WebSerf · · Score: 1

    The main reason Space Shuttle reliability is not a priority in the software industry in general is that the whole focus of the industry has become the quick buck, the rush to the IPO, the dazzling of the user with endless "features" that have minimal utility. The classic example was Windows 3.1. It was colorful, had lots of features - and barely worked.

    The marketroids who set timetables for software projects are another problem. Most of them think any arbitrarily complex piece of software can be designed, implemented and tested in about 3 weeks and get impatient when this doesn't happen. In the shuttle program the engineers are in charge and they determine the timetables.

    Yeah, I'm a bitter, angry little coder...

    --

    --
    Nothing to see here. Mooooove along...

  5. Suits? by Dreamweaver · · Score: 1

    Well, everything I could say about time requirements and budgets has been said a thousand times already.. so I'll just go to what annoyed me about the whole article:
    suits.

    Seriously. Why is wearing a suit such a huge thing in the business world? I can understand if you're a lawyer and need to impress people with your multi-thousand dollar clothing, or an executive who deals with customers and must appease the customers' sense of what's proper in an executive.. but other than that.. WHY?

    It's been proven and re-proven that people are more productive in an environment where they're comfortable. In this particular case the idea seems to be to make your coders as uncomfortable as possible so they can think of nothing but getting the code perfect the first time so they can go home.. but most places (as has been mentioned repeatedly) aren't like that. So why is it that a guy who works in a cubicle and never gets closer to customers than a middle-manager who is in charge of a supervisor who is in charge of the customer service department has to show up to work in a tie?

    And the worst part is that the business world seems to think people enjoy this. Sure, it's nice to look good.. if you've got a $2,000 suit you're going to want to wear it on occasion.. but how many of us can honestly say that we feel more productive in it?
    Dreamweaver

    --


    "If a man hasn't discovered something he will die for, he isn't fit to live" -- MLK, Jr.
    1. Re:Suits? by brix · · Score: 1

      I agree that people usually work better (at least at coding) when dressed comfortably.

      However, maybe I missed it, but I didn't see anywhere in the article that it mentioned that they wore suits. It said "moderately dressy ... neat but nothing flashy, certainly nothing grungy". Still sounds like it could fit the "comfortable" range to me.

      Of the photographs I could make out, one group of people was wearing jackets, but the other group didn't even have ties. Probably what separates management from the grunts would be my guess.

  6. Reliability and Tough Professionals by Sara+Chan · · Score: 1
    To be this good, the on-board shuttle group has to be very different -- the antithesis of the up-all-night, pizza-and-roller-hockey software coders who have captured the public imagination. To be this good, the on-board shuttle group has to be very ordinary -- indistinguishable from any focused, disciplined, and methodically managed creative enterprise. (from the main article)

    When I meet programmers who think that they are cool and tough, I tell them to read Bravo Two Zero by Andy McNab. It's the true story of an SAS (British army special forces) unit that operated behind the lines during the Gulf War. Here in the UK, the SAS is revered by most guys in the way that Navy SEALs are in the US. The book has a lot to teach about programming.

    Many people seem to think that special forces troops are so good that they can just be handed a task, left to get it done, and that they will deal with whatever problems arise. Wrong. According to McNab, the True Motto(tm) of the SAS is "check and plan". For example, before approaching an Iraqi military vehicle, they would rehearse opening the vehicle's door: which way the handle turns, whether the handle has to first be pushed in or pulled out, whether the door swings open or slides back, how much force needs to be used, etc. etc. etc. Every little detail is checked like this. And there are backup plans.

    Now read the first sentence of the previous paragraph, but substitute "top software programmers" for "special forces troops". You can see my point. Truly good special forces/programmers/professionals all have some things in common: they are focused, disciplined, and methodical. And they don't feel a need to prove how good they are by taking unnecessary chances.

    The main article also notes that programming teams such as those used for the Space Shuttle seem good at drawing in women. This is hardly surprising. Women naturally like men who are justifiably confident about what they do.

    How well did the eight-man SAS unit perform? They were surrounded by Iraqis, who had armored vehicles. Three were killed. The other five retreated: over 85 km (>2 marathons) in one night with 100 kg (220 lb) of equipment each. About 250 Iraqis were killed along the way, and thousands more were terrorized.

    Sara Chan

  7. Re:What's fun in software development? by aiken_d · · Score: 1

    It is definitely different, and I'm one of those nay-sayers who read and loved the article while thinking "ugh, I could never work there." Obviously, they do things the right way, and the only way it can be done in those circumstances. It's also like the difference between the pioneers who moved west in the US, traversing difficult trails, versus modern yuppies who contract a moving company. Both ways work, and the latter is infinitely safer, saner, easier, smoother, and faster -- but it's also less fun. Like Harry Tuttle said, "I got into this business for the excitement and adventure; get in, get out, move on. Now, your entire place could be on fire and I couldn't turn on a tap without filling out a form 27B stroke 6." I, for one, like working on a million things at once, and seeing a ton of stuff grow quickly (with bugs). If I was the kind of perfectionist these people have to be, I'd still be working out the bugs in the BBS system a friend and I wrote in Pascal in 1984. Hmm... I think I'll go apply for a job at writing software for hospitals.

    --
    If I wanted a sig I would have filled in that stupid box.
  8. Re:Spacecraft Design by Coz · · Score: 1
    Heh - we don't ALL use 'em. I, too, am an SAIC software person, and my previous assignment was on a 15-person development - I was brought in as a lead developer, in part to provide experience with SW process as this project tried to work its way up to SEI Level 3. Didn't make it (gov't and the prime got the program canceled - LONG story), but the value of the processes we used was established in the minds of everyone involved.

    Our Telcordia subsidiary (formerly BellCore, half of what was once AT&T Bell Labs) is one of those Level 5 organizations - we're all learning from them.

    --
    I love vegetarians - some of my favorite foods are vegetarians.
  9. CMM Level 5 does not come from hacking by Yousef · · Score: 1

    The NASA Space shuttle project is generally described as being at CMM Level 5 (Software Engineering Institute of Carnegie Mellon). The CMM is basically a system to ensure software quality.
    It is meant to ensure that the software fits the budget, is what the client actually requested, etc.
    Many major companies/consultancies try to aim for CMM Level 3, and most defence contracts require it.
    It makes the achievements of the NASA Shuttle program seem all the more impressive.

    It doesn't necessarily fulfill the hacker's development model; however, it does try to ensure software quality.

    --
    -- "To ask a question is to show ignorance; Not to ask a question means you'll remain ignorant."
  10. Re:Again, I don't understand by pyrrho · · Score: 1

    It makes the priority clear. Unlike most corporate work, in which the fear of unspoken criteria is always yielding random results, the shuttle program is sure of its priorities. Of course, it means nothing to someone who signs such a thing without caring. I'm sure someone loves to point out that it hasn't stopped the shuttle itself from having problems. But still, the answer is, the black magic is that it makes the priority clearly communicated and acknowledged as communicated.

    pyrrho

    --

    -pyrrho

  11. Re:Haven't we seen this before? by Seneca · · Score: 1

    I haven't either, but it does spark a thought. This is, apparently, a VERY good way of doing things. Why aren't other companies (who don't have NASA contracts) doing this? From what I've heard, the Love bug wouldn't have worked if there hadn't been huge 'undocumented features' in Microsoft Everything big enough to launch the shuttle through. Pointing out that intelligent quality control can be done is a good thing.

  12. Re:Who are the kernel QA gurus? by Timberwoof · · Score: 1
    You have a specification for what the kernel is supposed to do ... don't you? That document tells you about environment, inputs, expected outputs, performance, and a bunch of other stuff. So write a test program that lives on top of the kernel and inflicts a bunch of specific tests on it, whatever is suggested by the spec. And since you have access to the code you're testing, you can even write nasty devious tests that look for errors at edge conditions.

    Part of one SW development process I've worked successfully with has QA engineers designing the test plan, based on the spec, while the SW engineers write the code. When the code's done, you implement and run the test plan.

    If a change is made to part of the code, it ought to be reviewed, and the QA engineer should be present. He can then make some new tests that look specifically at the effects of the change. (And run at least a representative sample of the standard tests.)

    My experience was at the application level, on a multimedia authoring and playback system. I'd be tempted to apply similar processes to OS kernel development and testing.

    Scenario testing -- what you described -- can find bugs the formal tests didn't, for a hundred users can be more devious than one QA engineer. But you can't rely on it to find bugs at the early stage; it's too random and undirected.
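
As a rough illustration of a test plan written from the spec rather than from the code, here is a small Python sketch. Everything in it is hypothetical: `parse_percentage` stands in for the unit under test, and its "spec" is assumed to say that it accepts whole numbers from 0 to 100 and rejects everything else.

```python
def parse_percentage(text):
    """Hypothetical unit under test: the assumed spec allows whole numbers 0..100."""
    value = int(text)  # raises ValueError for non-numeric or empty text
    if not 0 <= value <= 100:
        raise ValueError(f"out of range: {value}")
    return value


# Test plan derived from the spec, independently of the implementation.
# Each entry is (input, expected result or expected exception type).
TEST_PLAN = [
    ("0", 0),             # lower edge
    ("100", 100),         # upper edge
    ("50", 50),           # ordinary value
    ("-1", ValueError),   # just below the lower edge
    ("101", ValueError),  # just above the upper edge
    ("abc", ValueError),  # not a number
    ("", ValueError),     # empty input
]


def run_test_plan(func, plan):
    """Run every entry in the plan and return a list of (input, expected, actual) failures."""
    failures = []
    for text, expected in plan:
        try:
            actual = func(text)
        except Exception as exc:
            actual = type(exc)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures
```

The plan can be written while the code is still being written, and rerun whenever a change is reviewed. A sloppy implementation such as plain `int` fails the edge entries immediately, which is the sort of directed check scenario testing can't promise.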

    --Timberwoof

    --
    I'm in it for the fun, but it's more fun when you win.
  13. Re:Seems almost like ISO... by Mr.+Slippery · · Score: 2
    hehe, well looks like you are sick of the "process system".
    I'm sick of this particular process, partly because I've worked under better ones (or perhaps, less bad ones; I've seen good bits and pieces here and there but they've all had serious flaws). It's not the major factor why I'm leaving in a few weeks (I'm partnering up with a friend who's starting a web development company) but it certainly helped.
    It makes sense because if all staff in your organization feel the same way as you, the process will simply not exist...because no one would be implementing it!
    Not all staff are equal - in any hierarchy, each individual rises to his or her own level of incompetence and stays there. The managers who determine the process have the authority, and the rank-and-file coders are supposed to shut up and follow it.
    The point is your organization has processes in place to ensure STANDARDS are met and the final product is fit for mission critical systems.
    The goal is to ensure that the final product is fit. But when that is forgotten, the processes and standards that are actually practiced (whether or not they agree with those in the policies and procedures manual that no one ever reads) will not support software quality in an efficient manner.

    What's needed is a "meta-process", a process to develop the software process and keep it directed towards the goal. I would suggest that a democratic meta-process, where developers themselves work together to evolve the procedures they will use, would work better than decrees from clueless management.

    Religions have a process too, you know. It calls for the 10 Commandments to be followed.
    Well, that's one set of religions. Others - such as Zen Buddhism - would say that such rules, or "process", are things to ultimately be transcended. The enlightened person, the sage or bodhisattva, does not refrain from killing based on some religious law; he simply acts. The practice of these religions is designed to help lead ordinary people to that state of enlightenment.

    Perhaps that should be the goal of software development practices, as well - to help lead ordinary programmers into that state where they are enlightened enough to be simply incapable of producing flawed software.

    --
    Tom Swiss | the infamous tms | my blog
    You cannot wash away blood with blood
  14. The value of an exacting process... by petark · · Score: 2

    Don't forget why the Ariane 5 rocket blew up in 1996: a conversion error caused a software shutdown that led to the self-destruct of the rocket.

    "The internal SRI software exception was caused during execution of a data conversion from a 64-bit floating-point number to a 16-bit signed integer value. The value of the floating-point number was greater than what could be represented by a 16-bit signed integer. The result was an operand error. The data conversion instructions (in Ada code) were not protected from causing operand errors, although other conversions of comparable variables in the same place in the code were protected."

    What was the estimate, about $8,000,000,000 of uninsured losses, including 10 years of work for the scientists with satellites on board?

    I wonder how many other maiden voyages have started off so poorly, other than the Titanic that is.
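
The failure quoted above, a 64-bit float that doesn't fit in a 16-bit signed integer, is easy to show in miniature. This is a Python sketch of two common ways to protect such a conversion; it is not the Ada code from the launcher and not necessarily how it was later fixed. The function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(x: float) -> int:
    """Convert to a 16-bit signed value, or raise instead of producing garbage.

    Deliberately conservative: anything outside [INT16_MIN, INT16_MAX],
    including NaN and infinities, is rejected before truncation.
    """
    if not (INT16_MIN <= x <= INT16_MAX):
        raise OverflowError(f"{x!r} does not fit in a 16-bit signed integer")
    return int(x)  # truncates toward zero


def to_int16_saturating(x: float) -> int:
    """Convert to a 16-bit signed value, clamping out-of-range inputs to the limits."""
    if x != x:  # NaN is the only value not equal to itself
        raise ValueError("cannot convert NaN")
    return int(max(INT16_MIN, min(INT16_MAX, x)))
```

Which variant is appropriate depends on what the value is used for: raising forces the caller to handle the case explicitly, while saturating keeps the system running with a pinned value. Either way the decision is made on purpose rather than left to an unhandled operand error.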

    1. Re:The value of an exacting process... by Detritus · · Score: 2

      The problem was that the software was reused from an earlier Ariane launch vehicle without rechecking the requirements and assumptions to see if they were still valid. The flight dynamics of the Ariane-5 were different enough to trigger an overflow in the software. Sort of like taking the engine control software from a Toyota and dropping it into a Porsche.

      --
      Mea navis aericumbens anguillis abundat
  15. Can you imagine just how simple those things are.. by Alex+Belits · · Score: 4

    and how slowly they are being developed? I don't mean that it's a bad thing -- it's good that the Shuttle program allows them to do it at a reasonable pace and with reasonable requirements, but if everyone else wasn't under constant pressure, and if everyone else's software wasn't a victim of feature bloat, dealing with poorly documented and even worse implemented protocols, and a never-ending stream of bullshit coming from the management, everyone else would write robust software, too. Well, not really everyone -- some "programmers" wouldn't be able to do anything because they have no skill, no education or are plain dumb, but a reasonably geeky and educated programmer can pull off something like that in ideal conditions -- and those guys _are_ working in ideal conditions.

    --
    Contrary to the popular belief, there indeed is no God.
  16. Re:Again, I don't understand by nikko · · Score: 2

    I think the point of that exercise is to promote a sense of well defined accountability and confidence, up and down the management chain. Sure, in theory, the project manager should be ultimately accountable. But all too often she can, post facto, dodge responsibility for failure by (accurately) claiming that other project stakeholders failed to provide their inputs to the project correctly. In Mr. Keller's case, he would not sign the certificate if he felt that failure was a possibility, for any reason. This also gives the decision makers a well defined "emergency brake" that perhaps could have averted a *Challenger* like disaster, where some line managers said STOP, while some higher-ups said GO!

  17. Re:Flight Software by Sunracer · · Score: 1

    >Likewise, people often ask why the shuttle continues to use such antiquated General Purpose
    >Computers: slow, 16-bit machines designed back in the seventies. There are many reasons, but a big
    >reason is that new hardware would almost certainly require massive changes to the flight software. And
    >rewriting and recertifying all that software would be a huge task. The current FSW works reliably; if
    >it ain't broke...

    Actually, AFAIK, the main reason is that old 386s are tested, tested and, once more, tested for space use. With newer processors, there are too many unknowns to risk a space shuttle. The line-widths in modern processors are so small that background radiation is beginning to cause problems in space without proper shielding. Probably they are testing 486s and Pentiums right now, but it'll be another ten years before they're ready for extensive space use.

    --
    "The Internet, of course, is more than just a place to find pictures of people having sex with dogs." - Time Magazine
  18. The importance of documentation by Glimmer_Man · · Score: 5

    I worked on some mission-critical/life-critical stuff about 2 years ago. It was aircraft related, and since it was basically carrying the data which made the plane fly it was critical by any definition. The process we followed was absolutely document driven. User specs were examined, questions asked and the user asked to add definition and clarification for several iterations of the document. Then came the software requirements and so on, each document with quite a bit of iteration. Eventually we found that typically documentation and design would take 50% of the project. Testing would take about 30 to 35%, and the actual implementation hardly took any time at all.

    Now in the commercial world, I find that the process is VASTLY different. Implementation has started shortly after user specs have hit the desk, before design or documentation has begun. As a result, the system we currently have is very patchy in places. Its mission is a lot less critical, but the bugs slow us down tremendously. The bugs are due to the process. The process is requirements driven, not documentation driven.

    But it seems that the current system I'm working on has about the same complexity as the one I used to work on. Even though we are supposed to be pushing it all out the door faster, the bugs are slowing us to the point where we have approximately the same rate of progress as the mission-critical project!!

    Lesson: If you do it by the documentation, you will push it out faster and cleaner (and more bugfree!!!)

    1. Re:The importance of documentation by TomV · · Score: 1
      According to this story there were problems with the software on the Jubilee Line extension.

      "It was called Moving Block Signaling ..."

      Oh there were certainly problems, especially with the MBP (Moving Block Processor). It was a truly great idea, no question. Idea was, roughly, that since the dawn of railways, signalling has been on the 'Fixed Block' system - divide the railway into chunks, and only allow one train per chunk. The MBP idea is that if the trains have a map of the system, and monitor their own speed, and the condition of their brakes, and the diameters of their wheels to the nearest 100th of a millimeter, and a whole bunch more data, plus information about the other trains on the network, then the MBP can work out the safe braking distance (LMA, Limit of Movement Authority, including several extra metres for safety), the upshot being that you can then cram a lot more trains per mile of track, driving themselves more safely than humans ever could.

      The old system would allow up to 12 trains per hour, MBP could potentially do 36, if you could get people on and off the trains fast enough.
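
For a feel of the arithmetic behind "safe braking distance", here is a deliberately simplified Python sketch. It assumes constant deceleration (so stopping distance is v^2 / 2a), a fixed safety margin, and a fixed station dwell time. All function names and all numbers you might plug in are hypothetical; this is not the MBP algorithm, which used far more data than this.

```python
def braking_distance_m(speed_mps, decel_mps2):
    """Stopping distance under constant deceleration: v^2 / (2a)."""
    return speed_mps ** 2 / (2.0 * decel_mps2)


def safe_braking_distance_m(speed_mps, decel_mps2, safety_margin_m):
    """Braking distance plus a few extra metres for safety."""
    return braking_distance_m(speed_mps, decel_mps2) + safety_margin_m


def trains_per_hour(speed_mps, decel_mps2, safety_margin_m,
                    train_length_m, dwell_s):
    """Very rough line capacity for trains that follow at the safe braking distance.

    Separation between train fronts = safe braking distance + one train length.
    Headway = time to cover that separation + time spent standing at a station.
    """
    separation_m = (safe_braking_distance_m(speed_mps, decel_mps2, safety_margin_m)
                    + train_length_m)
    headway_s = separation_m / speed_mps + dwell_s
    return 3600.0 / headway_s
```

Even in this toy model the dwell time enters the headway directly, which matches the caveat above about getting people on and off the trains fast enough.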

      The project went so far over budget the whole firm looked like going tits up, and after losing 25 million pounds on this project alone, it was all rather scaled down.

      But to be honest, what really blew up everyone's plans was that, when the project started, it was meant to be delivered for 2003. Then the Major Government decided to have the Millennium Dome and furthermore decided that the Jubilee Line Extension would be the preferred (i.e. the only convenient) way of getting there. Bingo - the project delivery date suddenly moved forward by three years with no possibility of compromise. Now by early 1999, we'd run simulator tests - first two fake trains, then a real train and a sim, and were about due to get to the Two-Real-Trains test. However, at this point, a) London Underground needed pretty much constant access to the tracks, and b) we still needed to get our Safety Case. That could easily take another year.

      So yes, they got Colour Light Signals (which, apart from losing the MBP benefits described above, also means you need to think about stuff like braking distances vs Line-Of-Sight).

      The trains have most of the equipment, Automatic Train Operation, Automatic Train Protection, the Common Logical Environment (effectively an Operating System for railways), and once MBP's finished it could be fairly easily retrofitted. It's still being developed for the Madrid Metro.

      Basically, if our client (London Underground) hadn't had their timescales rewritten for them by the Govt, I believe we would have delivered the most advanced railway in the world, and the repeat business would have made not millions but billions.

      Shame, really. A lot of very good people did a lot of very good work on MBP.

      TomV

    2. Re:The importance of documentation by TomV · · Score: 5
      I worked on some mission-critical/life-critical stuff about 2 years ago. It was aircraft related, [...] The process we followed was absolutely document driven.

      Likewise, I worked for a while on the signalling system for the Jubilee Line Extension for the London Underground.

      Totally documentation driven. First there was the CRS (Customer Requirements Spec). This then transformed via an SRS (Systems Requirement Spec.) into the FRS (Functional...) and the NFRS (Non-functional...). From these we had Software Design specs, Module Design Specs, Object Design Specs, Boundary Layer Design specs. In all there were around 4000 specification documents for the project, often at issue numbers well into the teens.

      What really made the difference though, was not so much the existence of documentation, as the absolute insistence on traceability - every member function of every class in the whole system could be traced back to the Customer Requirement Spec, and every Requirement could be traced to its implementation. This meant - no chrome: everything in the spec was provided, and nothing was provided that wasn't in the spec.
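
A toy version of that two-way traceability check, sketched in Python. The requirement IDs, the requirement wording, the `implements` decorator and the function names are all invented for illustration; the real project did this across thousands of Ada 95 documents and modules, not with a decorator.

```python
# Hypothetical customer requirements, keyed by an invented ID scheme.
REQUIREMENTS = {
    "CRS-001": "Train must stop short of its limit of movement authority",
    "CRS-002": "Doors must not open while the train is moving",
}

# Maps each registered function name to the requirement IDs it claims to implement.
TRACE = {}


def implements(*req_ids):
    """Decorator that records which requirement(s) a function traces back to."""
    def decorator(func):
        TRACE[func.__name__] = set(req_ids)
        return func
    return decorator


@implements("CRS-001")
def command_braking():
    pass  # placeholder body


@implements("CRS-002")
def interlock_doors():
    pass  # placeholder body


def traceability_report(requirements, trace):
    """Check both directions: spec -> code and code -> spec."""
    known = set(requirements)
    traced = set()
    for req_ids in trace.values():
        traced |= req_ids
    return {
        # Requirements nobody implements: something in the spec was not provided.
        "requirements_without_code": sorted(known - traced),
        # Functions tracing to no known requirement: "chrome".
        "code_without_requirement": sorted(
            name for name, req_ids in trace.items() if not (req_ids & known)
        ),
    }
```

Run as part of a build, an empty report in both directions is the "no chrome" property in miniature: everything in the spec has code behind it, and no registered code exists without a requirement.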

      Also worth noting: the whole thing was in Ada 95. The compiler was very carefully chosen. Coding standards were tight, and tightly enforced - function point analysis was king - anything with more than 7 function points was OUT, simple as that. Every change to anything, however small, required an inspection meeting before and after implementation, with specialists from every part of the system which could be impacted, plus one of the two people with a general overview. Then there were the two independent test teams and the validation team.

      Ye Gods it got tedious, no denying that. But in a situation where lives depended on good software...

      Now I probably apply only a tiny fraction of what I learned, but when I decide to ignore part of the methodology, at least I know I'm ignoring it. And I'm aware of what I'm missing.

      In short - learn about the safety-critical approach. Ditch most of it as excess baggage by all means - it's often simply not justifiable. But be aware of the choices you're making.

      TomV

    3. Re:The importance of documentation by seniorcrown · · Score: 1

      Tom,

      According to this story there were problems with the software on the Jubilee Line extension.

      "It was called Moving Block Signaling, and they wanted to move the train along in electronic envelopes. About a year before it opened it became clear it wasn't going to work. They decided to put in conventional color light signaling, which the line was not designed for."

      I would be interested to hear your take on this.

      In the past I worked on a project which was certified for ISO 9000/TickIT. The process involved producing many documents. Unfortunately the time needed to produce the documents was underestimated. This meant that the coding phase was squeezed, leading to poor code (lacked attention to detail & had limited unit testing). The end result was a system which was worse than similar systems with low levels of documentation.

    4. Re:The importance of documentation by Thorgal · · Score: 1

      Hey, at BellStream they work this way on non-mission critical projects, too. Or rather, they treat every project as mission-critical. ;)
      --

      --
      "Man in the Moon and other weird things" - wfmh.org.pl/thorgal/Moon/
  19. If I had their budget... by myxlplix · · Score: 2

    I don't know a software company that wouldn't implement such a strategy to ensure that their software was perfect if they had the budget to do so. As with all things of this nature it comes down to the money vs. quality contest. The better the quality the more it costs to produce, but unfortunately it's not an even rise up the scale. It may cost you 2X$ to improve quality by 50% but it might cost you another 4X$ to get the next increase in quality of only 25%. Even the article points out that the Shuttle software is the most expensive in the world and it still is run on old computers. Give me the same scale of budget/time and I'll give you a Windows operating system that a fanatical Linux user would be hard pressed to complain about. Or, even better, I'll use the funds to set up an open source group to make Linux as versatile and useful across the board, from beginners to the "Linux gurus".

    1. Re:If I had their budget... by HiQ · · Score: 1

      This all depends on the environment that you're working in. In the Shuttle example, we are talking about a machine to machine interface - therefore you *can* write software that is bugfree, just because you *can* get all the specs. It is therefore possible to write down all possible situations, and write and test your software against them. Working in user-software is completely different: a) the software *must* have a degree of flexibility b) users can do stupid things, and you can't foresee all those stupidities! So in user-software the total number of possible error situations is greater, and errors are largely unpredictable. Therefore this kind of software can *never* be 100% bugfree, no matter what methods you use!
      How to make a sig
      without having an idea

  20. Re:Flight Software by kzinti · · Score: 2

    I don't work with the FSW people, so I'm not sure about the details of their work flow, but I think it's safe to say that new code goes through several readings, probably both at the pseudo-code and code levels.

    Schedule is driven by the planned date for launch, and worked backward from there. For example, if you're going to launch a mission at date L, then the crew begins training at L minus X months, which means that the software has to be ready for the SMS at L minus Y months, which means you have to begin design at L minus Z months, etc. I'm not sure what X, Y, Z and related time deltas are, but I believe they probably start planning at least a couple of years in advance.
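    As a rough illustration of working a schedule backward from launch, here is a minimal Python sketch. The launch date and the month offsets are placeholders invented for the example (the real X, Y and Z are not known here), and `months_before` and `backward_schedule` are hypothetical helper names.

```python
import calendar
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the date `months` calendar months before `d`.

    The day is clamped to the length of the target month,
    so 31 March minus one month gives the last day of February.
    """
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def backward_schedule(launch: date, offsets: dict) -> dict:
    """Map each milestone name to launch minus its offset in months."""
    return {name: months_before(launch, m) for name, m in offsets.items()}


if __name__ == "__main__":
    # Placeholder values only: X = 6, Y = 9, Z = 24 months before launch.
    offsets = {
        "crew training begins (L - X)": 6,
        "software ready for the SMS (L - Y)": 9,
        "design begins (L - Z)": 24,
    }
    launch = date(2000, 6, 15)  # made-up launch date
    for name, when in backward_schedule(launch, offsets).items():
        print(f"{when}  {name}")
```

    The only point is that every milestone is derived from L; slip the launch date and everything upstream moves with it.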

    --Jim

  21. Re:I can understand why they want no hacks by eddy+the+lip · · Score: 1

    Or this:

    /* O2 systems monitor
    clean up later -
    too drunk right now */

    - eddy the lip

    --

    This is the voice of World Control. I bring you Peace.

  22. Italics by PhilHibbs · · Score: 2

    Sort out that closing italics tag! The front page article only has the first paragraph, and the second paragraph has the tag. All the headlines are italicised!

    My god! Where's all my karma going?

  23. Re:how the best are made... by Timberwoof · · Score: 1

    However, dropping to your knees and worshipping the brilliant scientist-programmer who wrote the core code your company's business depends on will not make you millions of dollars.

    That code still needs to be tested against specifications -- even if the specs are written afterwards -- and (re)engineered so that it can be maintained and expanded as new versions and applications demand. Trust me, it's better to write the code in a comprehensible and maintainable way from the start.

    If you have a genius who won't work within the programming *organization*'s process, you're sunk. If your genius sees the process as liberating, freeing his mind to create really good stuff ... then pay him lots of money and stock options.

    --
    I'm in it for the fun, but it's more fun when you win.
  24. Re:Interesting stuff by Speed+Racer · · Score: 1
    Maybe you could release a free 'Light' version HAL/ER, High-level Assembly Language / Estes-Rocket for the rest of us.

    Please, think of the balsa wood and cardboard tubes. For their sake, please don't release such a dangerous tool!

    --
    Free Mac Mini. Yes, I'm
  25. Re:Processes in software engeneering. by Marketolog · · Score: 1
    Thanks for your ideas.

    My idea of computer logic was the following: one of my friends is taking a course in computer engineering (in the Netherlands). He once showed me one of his sketches. It was a difficult program, with several factors involved, etc.

    But it fit into one simple "logic" line!

    On the other hand, another "simple" programme took almost 3 long lines of "logical formulas".

    What I meant was that it would be nice to write programmes in this language, and let the computer do its thing writing the code.

    Sorry if it sounded too stupid.

  26. Re:Formal Methods are the key. by El+Cabri · · Score: 1

    better link for the Methode B :

    http://archive.comlab.ox.ac.uk/formal-methods/b.html

  27. Re:The joy of PLCs by ballestra · · Score: 1
    A ladder diagram is used to represent basic logic circuits. Basically you have two lines going down the sides of your drawing, representing a voltage, and you draw little circuits across to make rungs. A straight line across would be a short circuit.

    For example:
    |--Switch1---lightbulb1---|
    |--Switch2-/

    This represents two switches in parallel, so lightbulb1 will get juice if either Switch is on. So this is the equivalent of OR.

    |--Switch1--Switch2--Light1---|
    This is AND.

    You can add new rungs and include relays, so that a switch3 could be a relay driven off of lightbulb1. By cascading with relays, you can have states, which can represent steps in a process. Switches can be sensors and lightbulbs can be actuators, so you can build a very simple circuit that can control a multi-step process with safety conditions, such as "only activate the forge if there is a blank in place (detected by a proximity sensor), and the temperature is within certain limits (sensors), and previous steps were completed successfully, and the operator's hands are safely out of the way holding down switches 8 and 9." Instead of wiring all this up as actual circuits, you can connect all of the sensors and actuators to the PLC. That allows you to store your programs, it simplifies the wiring, and you don't need to use actual relays, timers, etc. (You'll still use some relays of course if you need the low voltage coming out of the PLC to activate heavy equipment.)
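    For anyone who thinks better in code than in rungs, the two diagrams and the forge interlock can be sketched as plain boolean functions. This is Python, not anything a real PLC runs, and the function names and temperature limits are made up for the illustration.

```python
def or_rung(switch1: bool, switch2: bool) -> bool:
    """Parallel contacts: lightbulb1 is lit if either switch is closed."""
    return switch1 or switch2


def and_rung(switch1: bool, switch2: bool) -> bool:
    """Series contacts: Light1 is lit only if both switches are closed."""
    return switch1 and switch2


def forge_permitted(blank_in_place: bool, temperature: float,
                    temp_low: float, temp_high: float,
                    previous_steps_ok: bool,
                    switch8: bool, switch9: bool) -> bool:
    """One long series rung: every condition must hold to fire the forge."""
    temperature_ok = temp_low <= temperature <= temp_high
    return (blank_in_place and temperature_ok and previous_steps_ok
            and switch8 and switch9)


if __name__ == "__main__":
    print(or_rung(True, False))    # parallel rung
    print(and_rung(True, False))   # series rung
    # Operator has let go of switch 9 (limits of 900-1100 are invented).
    print(forge_permitted(True, 1000.0, 900.0, 1100.0, True, True, False))
```

    A relay driven off one rung's output is just that function's result fed in as another rung's input.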

    Simple do-it-yourself application: You could connect all your home lighting, along with motion sensors and switches to a PLC, and set up any number of different logical relationships. So a single switch could be "home/away" which could control a large number of lights throughout the house. A single "movie lighting" switch could turn off certain lights, turn others on, dim a few more, turn off the dishwasher, and set a timer to go back to normal in two hours in case you fall asleep.

    I don't have one, but I think the cheapest models are probably under $100. They never crash, they can run for years, they're extremely reliable, easy to use, and cheap. If you can program a VCR, then you can program a PLC. Unfortunately, that rules it out as a product for the home market.

    "What I cannot create, I do not understand."

  28. How many of us wish by alarmo · · Score: 2

    ...that we could arrange for a situation where the requirements are all fixed and locked down, and documented, before any coding begins? In industry jobs, I've never seen a project that wasn't having some marketing group force "critical" changes the whole time something was being written.

    You get what you pay for, and take the time for. These days, most people and companies seem quite willing to settle for "bad, buggy programs now" rather than "better programs, later". Of course, without organization (also common), it's possible to wait and get nothing later, too. Process is expensive in terms of people involved and time, but it's a lot cheaper in the long run than the alternative. :)

    Open-source projects actually follow this - every successful open project I've seen has a definite hierarchy of people managing patches and controlling what winds up in the latest sub-point build, and making key architectural decisions so nothing derails them. Oh - and there's no one who'll fire you if marketing's last-minute changes aren't rushed through. :)

  29. Re:What's fun in software development? by mge · · Score: 1

    I can almost hear the moans from the pizza-and-coke crowd when they read this: "Where's the fun? Where's the creativity?". But they're under the mistaken assumption that putting lines of code into the editor is the only fun thing about developing software.
    Typing code is not what the job is about (despite what people seem to think). We're in the business of doing cool things for people. The creativity and ideas that flow from the (very smart) people around me are what drive me.
    Just sitting and typing code is a bit dull compared to human interaction...

    "The reason I was speeding is.....

  30. Re:Somewhat bogus article by frogjump · · Score: 1

    I think there are folks at NASA who are not satisfied with the SYSTEMS engineering done. When the SOFTWARE engineers had the old Apollo hex keypad (as an example) dictated to them as a system requirement, I would say that the software engineering that followed was still pretty impressive. The project was, and still is, an impressive job of software engineering.

    --
    captain america
  31. If you have enough to do it poorly ..... by DnA+Works · · Score: 3
    ... you have more than enough to do it well.

    The problem with this argument is that while many companies think that they can't afford to do it, what they really can't afford is NOT to do it. Software is becoming more complex - it's the nature of the beast. For the most part, design is not; we are all still using procedures that were brought into being at the 'dawn of the computer age', with the exception of higher-order languages and more focus on OO.

    You are correct in that it may be expensive, THE FIRST TIME. This is called a 'learning curve' and the cost is amortized over the number of times you use this technique. You may also say that the process itself is expensive but that is incorrect, or at least only partially correct. The process allows errors to be caught EARLY, which reduces cost. Please don't tell me that you believe a code-compile-fix routine can catch these sorts of errors as early as a well thought out design.

    Also, rigorous design allows for flexibility - this may sound contradictory but consider the use of design patterns. They are NOT things that can just be thrown into the code ad hoc; they require thought and intelligence. A good upfront design means the ability to use these tools. Consequently, use of these design patterns allows for a certain level of flexibility in satisfying the lower to medium level nasty customer requests, and certainly helps on the more egregious ones. Does a code now, look later approach allow this? (if you think so, I have this bridge I'd like to sell you ...)

    In short, yes, using these techniques is expensive. But they also produce code that cuts development time (i.e., no being stuck in a debug/extra-request phase for 2 years) and once people get used to the process, the extra cost/load is minimal.

  32. Haven't we seen this before? by sjpadbury · · Score: 1

    I seem to remember seeing this article before, and since the only place I read anything interesting anymore is as a result of hearing about it here... ;)

    --
    We're all full up on Crazy here...
    1. Re:Haven't we seen this before? by MikeApp · · Score: 1
      Yes, I posted it to the last shuttle article: here

      It was moderated 5; interesting, so at least a few people must have read the Fast Company article back then.

    2. Re:Haven't we seen this before? by SEWilco · · Score: 1
      The mammals had to be properly trained before their verifiers allowed them to post this story. You'll notice that now that it was posted, it was posted properly and with few errors and omissions.

      Also note that I would have posted the first message but I had to first properly review and check the specified article. I now need someone to verify and sign off on this comment, to prevent any of my errors becoming a part of the permanent record.

  33. Their code is good because... by Anonymous Coward · · Score: 2

    their process ensures it will be. The vast majority of software development is performed in an environment where individual "heroes" are the primary reason projects succeed. The Space Shuttle Onboard Software processes will seem to almost all of us to be "common sense", but how many of us work in a place where management mandates these things to ensure quality? Their environment is "ideal" because they have made it so. Unfortunately, many managers' (and too many developers', also) attitudes can be described as "get it done", and it shows!

    They were rated CMM level 5 in 1988 - one of the first organizations anywhere rated at that level of software process maturity. Another good description of their processes (and how they created them) is in the book "The Capability Maturity Model - Guidelines for Improving the Software Process" (ISBN 0-201-54664-7) in Chapter 6, "A High-Maturity Example: Space Shuttle Onboard Software".

    As far as making software error-free, a quote from the book will help illustrate the difference in attitude they have (it's talking about a graph). "These data include failures occurring during NASA's testing, during use on flight simulators, during flight, or during any use by other contractors. Any behavior of the software that deviates from the requirements in any way, however benign, constitutes a failure. Contrast this level of commitment with the cavalier attitude toward users in most warranties offered by vendors of personal computer software."

    The best place to find more about the CMM is their web site at http://www.sei.cmu.edu/

  34. Flight software crashes by eagl · · Score: 1

    I flew the F-15E for 4 years, and it was common to have to reset a system because of some sort of glitch. Whether the glitch was hardware or software based, I didn't really care. If a system stopped working reliably or failed outright, it was time to troubleshoot. That usually meant first a software reset, a hardware reset, and in the worst case (but still common) a complete power down/wait 30 seconds/power up cycle.

    2-3 times per flight is more than I usually experienced, but I think I had to reset at least one system on 50% or more of my flights. That's quite a bit more than 1 every 500 hours. Some aircraft were better than others too... One jet required its radar to be reset every 15-20 minutes. That problem was eventually traced down to a wiring harness connector...

    In addition, there were and still are known software problems in that aircraft. The known ones usually have some sort of workaround (if the heads up display freezes, cycle power on the display processor, stuff like that), but the occasional random crashes or glitches (like occasionally the plane will suddenly think it's flying 100,000 ft below the ground) have no known cause and the only fix is to reset something until the jet behaves itself again.

    My last point is that the flight control software in the F-15E is designed to go offline if the aircraft exceeds certain parameters. In that case, the flight controls must be manually reset in one of four ways. There is a quick reset switch, a "hard" reset switch for pitch, roll, and yaw, we can cycle power for those systems, and worst case we can pull and reset the circuit breakers for the flight control system components.

    The funny thing is, it works only because the rest of the design is very robust. Most systems have some sort of backup, and the plane flies just fine without any electrical power at all. Once the software problems are known, they're dealt with as simply one more environmental factor until they're fixed. The fix may take over a year, but they are usually fixed eventually.

  35. Again, I don't understand by Anonymous Coward · · Score: 1

    Before every flight, Ted Keller, the senior technical manager of the on-board shuttle group, flies to Florida where he signs a document certifying that the software will not endanger the shuttle.

    Is this supposed to be black magic or something? If something bad is bound to happen, it will happen regardless of how many "certificates" and such were signed.

    Or maybe it's about transferring responsibility?

    Maybe Mr. Keller could sign a certificate that aliens will contact us next Wednesday?

    1. Re:Again, I don't understand by homer_ca · · Score: 1

      The big deal is that he's actually taking responsibility for the software. This stands in stark contrast to the usual EULA disclaimer "this software is provided as is without warranty... you agree to hold harmless and indemnify, blah blah blah".

  36. Re:Flight Software by Maurice · · Score: 1

    They are going to use old Pentiums (no MMX) with Win95 on the new space station.

  37. Maybe they can fix 2.3's VM by Anonymous Coward · · Score: 1

    coz sure as heck, the kernel developers have lost the plot.

  38. Re:"No Pizza" is good by megamanic · · Score: 2

    I think your problem here is that you still subscribe to the fallacy that "Code like Hell" Programming is faster than doing things properly.

    It isn't.

    Many organisations are starting to find this out and are moving to proper professional engineering practices that improve reliability, increase schedule predictability and, more importantly, reduce costs.

    A couple of hundred years ago people built houses & bridges the way we build software - work until it's done. These days we have architects and project managers that build houses faster, more reliably and ON BUDGET.

    This is the way the wind's blowing. It's a lot less heroic but it's the future.

  39. Not just space shuttles. by BigStink · · Score: 4

    It's not just space shuttle code that needs extreme reliability. The embedded systems in civilian aircraft are not interrupt-driven because of the reliability issues associated with interrupt-driven code - interrupts make the software too hard to debug thoroughly (because there are so many combinations and timings of input signals to test), make faults difficult to replicate and have the potential to go wrong on a spurious set of input signals. This sort of problem doesn't really matter too much in a home or corporate computing environment, but it would be a major disaster if a plane carrying a few hundred people were to crash into a city with a population of a few million, just because of a software error. These things need 100.00 per cent reliability, so obviously software hacks are frowned upon.
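    One common alternative to interrupt-driven code is a polled, fixed-order loop, often called a cyclic executive. The sketch below is only an illustration of that general idea in Python, not how any particular aircraft does it; `run_cyclic_executive` and its arguments are names invented here.

```python
from typing import Any, Callable, Dict, List, Tuple

Inputs = Dict[str, Any]
Task = Callable[[Inputs], Any]


def run_cyclic_executive(read_inputs: Callable[[int], Inputs],
                         tasks: List[Task],
                         cycles: int) -> List[Tuple[int, str, Any]]:
    """Run every task once per cycle, always in the same order.

    Inputs are sampled once at the start of each cycle, so every task
    in that cycle sees the same snapshot and nothing preempts anything.
    """
    log = []
    for cycle in range(cycles):
        snapshot = read_inputs(cycle)      # poll all sensors together
        for task in tasks:                 # fixed, repeatable order
            log.append((cycle, task.__name__, task(snapshot)))
    return log


if __name__ == "__main__":
    def read_inputs(cycle: int) -> Inputs:
        # Stand-in for real sensor polling; the values are made up.
        return {"altitude_ft": 1000 + cycle}

    def check_altitude(inputs: Inputs) -> bool:
        return inputs["altitude_ft"] > 0

    for entry in run_cyclic_executive(read_inputs, [check_altitude], 3):
        print(entry)
```

    Because the order and the sampling points never change, a given sequence of inputs always produces the same sequence of results, which is what makes a fault replayable on the test bench.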

    1. Re:Not just space shuttles. by wass · · Score: 1

      It's not just the aerospace industry that needs extreme reliability. Real-time systems are in automobiles, too. For example, if you crash into something in your car, those accelerometers cause an interrupt which BETTER be acknowledged by the firmware to respond and trigger the airbags within X many microseconds.

      --

      make world, not war

  40. how the best are made... by darial · · Score: 1
    Some manager I had said: If you want to be successful, find a success and see how it was made.

    The obvious candidate would be Bill Joy's TCP/IP implementation. Everyone runs it:

    1. BSD's always used it

    2. SYS V incorporated it - thus it flowed to most commercial unixes

    3. LINUX borrowed heavily from it (recall that Regents of the University of California boot message?)

    4. If the TCP/IP fingerprint of WIN2000 is any indication, they borrowed it too.

    And it works right every single time you use it. So, what process made it? A single genius. All the cool process in the world won't make up for the fact that the single requirement for great software is a great designer/programmer. The required process is simple - whatever that person requires to let their genius loose.

    The only way to circumvent this requirement is to do what NASA does and spend probably literally hundreds of $ per line of code.

  41. Re:Flight Software by acre · · Score: 1
    I also work down the hall from some of the folks in this article, and I know quite a few of them from college (Go Cyclones). Anyways, I thought I would mention this project from United Space Alliance's Dual Program. USA is a joint venture of Lockheed and Boeing that took over Shuttle Operations a little while back (the group mentioned in this article is part of USA and has been for about a year or so). The Dual program is a USA/Academic partnership for research in space operations. The project that I thought you might be interested in is the development of a space shuttle flight computer emulator for Linux described here.

    On another note, the group that I work in (Flight Design and Dynamics) may start looking into moving from our IBM/AIX platform to a Linux platform. Penguins in space! I guess that is a bit offtopic, but oh well.

  42. But consider what "a crash" means ... by Seth+Finkelstein · · Score: 1

    Reliability obviously gets a big premium when crash is not a metaphor.

  43. Hey! Isn't this the way that profs say to program? by peengers · · Score: 1

    Hmm. Talk to the client until you fully understand the problem. What a concept! No doubt this will make some fast and bulletproof code. Now if only they can teach their engineers to convert units correctly....

  44. hehe... by Raymond+Luxury+Yacht · · Score: 1

    Ever hear of Boo.com?

    ;-?

    --

    Ceci n'est pas une sig.
  45. Re:Processes in software engeneering. by -brazil- · · Score: 1
    Every time I read a history of a programme and find a line "completely re-wrote the code", I begin having second thoughts about how really good the programme is.

    Have you ever programmed a half-way complex system yourself? Re-writing it from scratch is often the best thing that you can do; the more often, the better. In fact, there are software engineering models that officially choose to re-write their code often. This is called "prototype-based SE".

    The reason is that while you write the code, you invariably notice some decisions that you made earlier were false, but they affected the design so deeply that changing it would be more work than rewriting it from scratch. The alternative is to live with the design flaws; most commercial projects do that because they don't have the time to re-write their code.

    --

    The illegal we do immediately. The unconstitutional takes a little longer.
    --Henry Kissinger

  46. Re:ohh if only... by -brazil- · · Score: 1

    Well, the point is really this: There is a point beyond which making the software more stable is so much more work that it's simply not worth it. Where this point is depends, of course, largely on what the consequences of failure are. Obviously, if multi-million-dollar equipment is at stake, it is worth being extremely thorough.

    --

    The illegal we do immediately. The unconstitutional takes a little longer.
    --Henry Kissinger

  47. I bet they also have enough time and enough $$$ by Lazy+Jones · · Score: 1

    Unreasonable deadlines and too few programmers are usually the reason for pulling all-nighters, it seems to me. Other environments where that kind of thing isn't necessary can be found in the vicinity of banks and insurance companies, so look there if you want relaxed programming jobs.

    --
    "I love my job, but I hate talking to people like you" (Freddie Mercury)
    1. Re:I bet they also have enough time and enough $$$ by Delphis · · Score: 1

      I saw this story about a 'programming boot-camp' called Drop and Code me Twenty linked from the Space Shuttle software article and there's the lovely bit where they purposely try to overstress the programmers on the course to see how they react.
      "These guys pushed back hard," says TeamworX's John Rae-Grant. "It was great."

      I wish we could all do that when the management trolls figure that something can be done in X days without talking to the programmers actually doing the work..

      There's that thing called a paycheck though, that tends to curb people's unwillingness to have to rush to do things because management fucked up AGAIN :/

      --
      Delphis
    2. Re:I bet they also have enough time and enough $$$ by megamanic · · Score: 1

      Although you are right about unreasonable deadlines, we as programmers are also to blame for: a) not understanding estimation and scheduling, hence producing woefully optimistic schedules; b) not communicating with management. I know they're Micro$oft Press, but I would suggest reading the work of Steve McConnell, who has written 4 brilliant books on the subject of software engineering. Not dry academic tomes but practical, readable, usable books. No, I'm not related to him or M$ in any way :)

  48. Re:An alternative strategy by sysadmn · · Score: 1

    US Military test pilots aren't stupid people. Most of them have advanced degrees in aeronautics or aeronautical engineering -- at the insistence of the military or aerospace firm they work for.

    ...

    Or not... In truth, I suspect the first few questions would really be something like "You're kidding me, right? Do you think I'm crazy? Would you be willing to fly this deathtrap?"


    In the flight test programs I was associated with, the software had to clear several hurdles before it got near an aircraft. After unit test and integration test, there were batch-mode tests on a hardware testbed, then man-in-the-loop testing in a $Million simulator. When the test pilots accepted the results of their sim time, a final review was held and flight testing could begin.

    --
    Envy my 5 digit Slashdot User ID!
  49. Re:Interesting stuff by aclaudet · · Score: 1


    We are the only ones with a compiler, because we wrote it ourselves.

    Maybe you could release a free 'Light' version HAL/ER, High-level Assembly Language / Estes-Rocket for the rest of us.

  50. Re:Here's the difference by CaseyB · · Score: 2
    So, commercial software is lousy because we're all stupid, and choose not to use good development practices.

    Bullshit.

    NASA didn't just have a solid process, they had MONEY. They BOUGHT that quality, by hiring an order of magnitude more testers than you'd find in the commercial world. By budgeting several years of development time rather than weeks or months. By reducing the number of lines of code that any one developer is responsible for.

    There's a lot to learn from a highly structured development process like NASA's. But don't kid yourself that the quality they produced is simply because they 'had the right process' or had better management.

  51. Higher Quality != Higher Cost/Time! by severian · · Score: 3
    One of the assertions that seems to keep coming up is that higher quality code (i.e. more stable, predictable, etc.) always means more expense or time to create. That's not necessarily true. To take an example from the car industry: in the 60's/70's American car makers made cars by building them on the assembly line, and then having "quality inspectors" at the end of the line who would check for defective parts which would then get fixed. Using this model, it was always assumed that achieving higher quality naturally meant higher costs (you would have to spend more to hire more inspectors, and you'd have to replace more parts), and longer time (adding new checkpoints in the line would increase the time to manufacture a car).

    But then the Japanese came along with a radical new idea: if there are defective parts coming down the line, then we should figure out why they were created defectively in the first place and fix that. Then there would be fewer defective parts at the end of the line, thus you would need *fewer* inspectors and *less* time at the end of the assembly line. (Ironically, this principle came from an American named W. Edwards Deming; unfortunately American companies were too successful during his lifetime for them to take him seriously :-) So the Japanese were able to build cheaper cars quicker than the Americans while actually having higher quality.

    I think that's very analogous to the current argument. Under the current system of coding, you basically hack together something that sorta works, and then use sophisticated debuggers/development tools to figure out which parts are buggy. Using that system, it's true that higher quality requires more cost and time.

    But I think the point of this article is that that is the wrong way to approach programming. First, figure out why defective code gets written in the first place (be it poor client specifications, poor management, poor documentation, whatever) and then fix those processes, and you'll turn out quality code without having to spend any more time or money!

    As a practical example, I first learned C under a CompSci Ph.D. who was a quality fanatic. In order to teach me to code properly, he would give me projects and then not allow me to use a debugger. Nothing at all. Zilch. Nada. The only thing I was allowed was to place print statements within my program wherever I wanted to see what was going on. As a result, I spent *a lot* of my time planning my code out, and reviewing it over and over again before even compiling it, because I knew that if there were bugs in it, I couldn't just fire up a debugger and take a look.

    And secondly, if there were bugs, I couldn't just trace through the entire program or create a watch list of every variable. I had to study the bug and understand it, look at the code and figure out where the bug most likely was, and then use selective print statements to look at the most suspicious stuff. That way, when I encountered bugs, I'd be forced to actually understand what the bug was and then analyze my code to figure out where that error most likely was.

    If this sounds like a programming class from hell, believe me, it was incredible! I couldn't believe how much of my code worked the first time it compiled. And when there were bugs, I actually fixed the underlying flaw in the logic rather than just applying a temporary patch. What's more, since the rest of my program was well planned and documented, there were no "hidden" effects: if I found a bug, I knew exactly which parts of the program it affected, and perhaps more importantly, *how* it affected those parts. Thus they were very easy to fix.

    Believe it or not, it took me less time to program this way than using debuggers, and the resulting code was much more stable and understood.

    If you look at commercial software these days, it's not uncommon for the debugging period to take longer than the actual coding. In other words, there are more quality inspectors than there are assembly workers, and the time the code spends in inspection stations is longer than it spends being produced. It's tough to say that this is the "efficient" method of programming...

    If you want to see where this is heading, just turn once again to the car industry: once American companies got their asses kicked by the Japanese, they adopted their techniques, and Surprise! Cars now come out of their factories with higher quality, in less time, and at less cost (adjusted for inflation and new features :-). Who would've believed it? :-)

    1. Re:Higher Quality != Higher Cost/Time! by otis+wildflower · · Score: 2

      If you want to see where this is heading, just turn once again to the car industry: once American companies got their asses kicked by the Japanese, they adopted their techniques, and Surprise! Cars now come out of their factories with higher quality, in less time, and at less cost (adjusted for inflation and new features :-)

      A good book on this (from 1986-8, so it leaves off when the US auto industry was in pretty much the nadir of its decline) is David Halberstam's The Reckoning... I'd go into further detail, but you have to read the book. It goes into Ford & Nissan overall, but it's very rich with both history and personality (particularly Mr. K of Datsun 240Z fame) and an excellent read.

      There are definitely some lessons to learn, particularly regarding American hubris during fat economic times..

      Your Working Boy,

  52. Re:"No Pizza" is good by Spoing · · Score: 2

    No one wants to write buggy code...

    Well, mister know-it-all...how do you go about getting really obnoxious amounts of money out of the customer?

    --
    A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
  53. AI used in combat systems by Dr+Dick · · Score: 1

    At the other end of the scale, I saw a talk a year ago by some US Army guys. They were modelling dogfight situations, with a very real possibility of the system being used in real combat scenarios. Thing was, they were doing all this with an AI system, which is about as far as you can get from a verifiable system. Should we be worried?

  54. Cluen't by dmiller · · Score: 1

    At first it seems too predictable

    Duh. The software that these people write is responsible for *lives*. If I had to depend on the code of some "pizza and coke" programmer for my existence I would want the development process to be predictable too.

    Most programming isn't sexy. Deal with it.

  55. Hit the nail on the head by Delphis · · Score: 1

    Formal methods are also incredibly complex for any nontrivial program.

    Absofrigginlutely .. I studied formal methods in my degree course too.. Z specs and stuff. I understood its purpose, to verify correctness of a system. I think even the professor teaching it admitted that for anything real the amount of proof required would be *immense*. The professor was a very 'academic' professor - a very smart guy - you couldn't help think though that the stuff that was taught was only ever going to be useful in academia. I think he despised the real world for not being easy to define :)

    I do remember hearing though that there was a program somewhere to turn Z specs (formal specification method) into C code.. although I never did use it. Anyone else remember/hear of that or better yet actually use it on something?
    The idea being that you do the proof and feed it through the 'thingo' and it churned out C code.

    --

    --
    Delphis
  56. Several of my teachers work for NASA by Dungeon+Dweller · · Score: 1

    One of my teachers works at their Independent Verification and Validation facility. Several others do or have worked for them on some level. They are VERY good programmers. Several of my school projects are based on things that NASA has been doing.

    I find it funny to hear people talk about very BASIC things in computer science as being "cumbersome" and such. Like much of what was said about Ada.

    One should strive to do good computer science, not whine about it. Fewer bugs and better performance should be a way of life. Good code should be the only kind that you make.

    Otherwise you might as well point and click your programs together...

    --
    Eh...
  57. SEI Web Site by north.coaster · · Score: 1

    The SEI's web site is at http://www.sei.cmu.edu/.

  58. knocking those who rewrite their code. by BoLean · · Score: 2

    Even if you consider a project management approach, often you will wind up rewriting the code from scratch anyhow. From personal experience I can tell you that relying too heavily on user input in the proposal/design phase can cause a project to completely lose focus. Many rewrites are the result of "feature creep" associated with pandering to the user's every whim. The most successful projects I have seen started with a narrowly defined, strict set of goals. Even at that, the trend seems to be at least one major rewrite by the time the software reaches its third version. Code reuse is highly overrated.

  59. "No Pizza" is good by cshotton · · Score: 3
    ...no pizza-and-coke all-nighters...

    That's because pizza-and-coke all-nighters are a direct byproduct of poor planning, either by the engineer implementing the code, the architect creating the design (if there even is such a person) or the person making the engineer's schedule. And the result is usually hastily written, incompletely tested software that is typical of most product offerings for use on the desktop.

    The process of authoring mission critical, man rated software is so far removed from the ad hoc, informal, duct-tape-it-together approach that most programmers use that no direct comparison can be made. I've seen both ends of the software development spectrum and they each have their uses. You can't launch a shuttle with a bunch of last minute kernel patches and some stuff that was written the night before the launch date. But you can't compete in the commercial software marketplace with code that takes 2 or 3 years to specify, design, implement, test, and integrate, either.

    Stand in awe of the people who have the skill and discipline to write software of this quality. Learn what you can from their process and try and use the lessons they've learned. Their stuff doesn't break, because when it does, people die. If O/S developers had that same attitude about their code, we'd never see blue screens of death, kernel panics, or any of the other flakiness we tolerate on our desktop machines.

    --

    Shut up and eat your vegetables!!!
    1. Re:"No Pizza" is good by jlowery · · Score: 1

      This post should be moderated up. It explains precisely why defects in shipping code occur: almost nobody wants to wait or pay for perfection in software products today.

      --
      If you post it, they will read.
    2. Re:"No Pizza" is good by altman · · Score: 3

      The problem is, in the commercial world the product is driven by tight deadlines and getting the product out before you get eaten alive by your competitors (who are also doing the same thing).

      If your company took the time to write very stable, near-bug-free code, they'd take so long doing it they'd go out of business - their competitor would get the business with a flakey but shipping product and by the time you turned up with your perfect product, everyone would be locked into their stuff (and most likely would have been using it for a couple of years).

      No one wants to write buggy code; we all try to do our best. Logical & clear design, defensive programming & good documentation give a good base. Peer review and experience (been there, don't want to do that again) help a lot too. Just writing the comments first (saying what you're *going* to do before doing it) helps.

      Another problem is that writing bug free apps on (say) windows is almost pointless as the app will still fall over when some bit of buggy OS/windows API code falls over. Things have to be stable and bug-free from the hardware upwards to give an impression of stability to the user - the problem is, the average user can't tell the difference (and couldn't give a toss) whether the app or the os fell over, it's just "my WP crashed and I lost my work".

      Welcome to the real world. Software can be flakey because it was written to be useful before the hardware went out of date - not exactly a problem with the shuttle. You can spend ages hand-crafting efficient code to be overtaken by crap code on a faster CPU. Blame the chip companies for moving so quickly :)

      Hugo

    3. Re:"No Pizza" is good by segmond · · Score: 1

      Usually, pizza-and-coke all-nighters are a direct byproduct of poor planning, but not in all cases. I have worked on projects that I loved, and I work so much, just cuz I am having fun, that the boss will kick me home. "Go home dude, go have a life, we are 4 months ahead of schedule". :)

      The problem with software engineering is that a lot of people do not know it, and when they do, they do not practice it. In the last month or so, I have read 4 books on software engineering, because I am going to be heading a big project soon. ...and I am sad to admit that I was a hacker, but today, I am proud to say that I am an engineer.

      One of the things that bugs me tho, is how people are believing you can't write quality software today. "Quality Software" now sounds like a Myth or some dream. People only "try" to attain "quality" when something like life is involved.

      --
      ------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
  60. Re:THAT is how to write code by VSc · · Score: 1
    We are writing test suites for 3G protocols (software for mobile communications) and the downside of the whole process is that the standards are not yet finalized and our code has to be changed every time there is a change in the standard. So there's a motto on one of the office walls: "Keeping with the standards is just like walking on water: it's much easier when it's frozen"

    __________________________________________

    --

    God did not appoint us to suffer wrath but to receive salvation through our Lord Jesus Christ --1Thes5:9

  61. Re:An alternative strategy by BlaisePascal · · Score: 2

    US Military test pilots aren't stupid people. Most of them have advanced degrees in aeronautics or aeronautical engineering -- at the insistence of the military or aerospace firm they work for.

    I suspect that, upon seeing the "computer restart" button, the test pilot evaluating the aircraft would start asking a series of questions:

    1. What is the failure rate of the computers; i.e., how often will that button have to be pressed.

    2. What is the time elapsed between the computer failing and the computer operational, including the reaction time of the pilot or weapons officer? Assume that the pilot and weapons officer are already a) flying the plane, b) lining up on target, c) watching for SAM sites, ground fire, enemy aircraft, and d) coordinating with friendly aircraft.

    3. How does the computer controlled, fly-by-wire system function during the timeframe covered in question 2? Will it fly steady (given that many modern fighter airframes are inherently unstable in flight, and rely on active computer control)? Will I have any control over the plane until it restarts?

    4. If this happens in a dogfight, what are the chances of recovery and survival?

    Or not... In truth, I suspect the first few questions would really be something like "You're kidding me, right? Do you think I'm crazy? Would you be willing to fly this deathtrap?"

  62. Re:Formal Methods are the key. by StormyMonday · · Score: 1

    Don't be silly. Formal methods prove that a formalized version of a program matches a formalized version of a specification. Very good for nice, clean, precise things like floating point algorithms.

    Unfortunately, real programs are written in real computer languages from specifications written in real human language. They also have to interact with real operating systems running on real hardware. Don't forget the nice, messy asynchronous "real world" data.

    Formal methods are also incredibly complex for any nontrivial program.

    Your assignment for Monday is to prove the correctness of the Linux kernel.

    --
    Welcome to the Turing Tarpit, where everything is possible but nothing interesting is easy.
  63. Re:Processes in software engeneering. by Doctor_D · · Score: 2

    Every time I read a history of a programme and find a line "completely re-wrote the code", I begin having second thoughts about how really good the programme is.

    There have been several occasions last year where me and a co-worker ended up trashing pages and pages of code to re-write it with the same functionality, but modular and ended up being smaller in some cases.

    My company used consultants who wrote terrible code. Let's use this example... there is a program that calculates the date x days ago. The consultant's program went and tried to calculate leap years and all of that. Our program that replaced it used system library calls to get the date, and then simply subtracted the proper number of seconds. Other ones were hardcoded scripts to run SQL on our database; we replaced those with a Perl script that took the SQL as a parameter.
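
    A minimal sketch of the library-call approach described above, in Python rather than whatever the original program used. The function name `days_ago` is hypothetical, and working in UTC is an assumption made here so that every day is exactly 86,400 seconds; in local time, daylight-saving changes would complicate the subtract-the-seconds trick.

```python
from datetime import datetime, timedelta, timezone


def days_ago(days, now=None):
    """Return the UTC datetime `days` days before `now`.

    `now` defaults to the current time; it is a parameter only so the
    function can be checked against fixed dates.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # The library handles month lengths and leap years; no hand-rolled
    # calendar logic is needed.
    return now - timedelta(days=days)


if __name__ == "__main__":
    print(days_ago(30).date())
```

    The point is the same one the comment makes: the calendar rules live in the library, so the replacement is a few lines instead of pages.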

    So there are times where a re-write is better than maintaining the code. I guess the biggest case in point is Mozilla versus Navigator. Basically I agree that if projects were planned and used software engineering principles we would most likely end up with good products. Granted, game programs seem to be done best when they're a hack... But how many times have you seen long-term maintenance of games?

    --
    "If you insist on using Windoze you're on your own."
  64. Re:Formal Methods are the key. by pnkfelix · · Score: 1

    Formally proving that an implementation satisfies a specification is possible, but NOT TRIVIAL.

    This, coupled with the fact that so few developers can handle writing formal specifications (can you see the average perl hacker writing a spec?), is why it's not "that simple".

    Finally, as for your suggestion that everyone use VDM/Z or Larch/CLU, can you grab me a copy of MS Visual CLU? No? How about the CLU gcc front end? No? Well then how do you expect me to compile it? Yes, I know that compilers are available, but reliable ones with decent optimization passes? Even Barbara Liskov seems to have moved on from CLU...

    Don't get me wrong, specification languages are definitely cool in the right places, but we've got a ways to go before they become palatable to the average human being.

    --
    arvind rulez
  65. Somewhat bogus article by Animats · · Score: 5
    That article has been around for a while. It paints an excessively rosy picture of the Space Shuttle flight control software.

    Here's NASA's own history on bugs in that software:

    • So, despite the well-planned and well-manned verification effort, software bugs exist. Part of the reason is the complexity of the real-time system, and part is because, as one IBM manager said, "we didn't do it up front enough," the "it" being thinking through the program logic and verification schemes. Aware that effort expended at the early part of a project on quality would be much cheaper and simpler than trying to put quality in toward the end, IBM and NASA tried to do much more at the beginning of the Shuttle software development than in any previous effort, but it still was not enough to ensure perfection.
    Read the NASA history. They had a 200-page known-bug list in 1983, although they did fix most of them during the long downtime after the Challenger explosion.

    The Shuttle's user interface is awful. The thing has hex keyboards! Some astronaut comments include:

    • "What we have in the Shuttle is a disaster. We are not making computers do what we want" -- John Young, Chief Astronaut, 1980s
    • "We end up working for the computer, rather than the computer working for us." -- Frank Hughes, NASA flight trainer
    • "crew interfaces...more confusing and complex than I thought they would be" -- John Aaron, NASA interface designer
    • "(the) 13,000 keystrokes used in a week-long lunar mission are matched by a Shuttle crew in a 58-hour flight" -- NASA history

    This project should not be held up as a great example of software engineering. Even NASA doesn't think it is.

    1. Re:Somewhat bogus article by Stu+Charlton · · Score: 1

      Doing design completely up-front is as much a mistake as doing no design at all. The IBM person probably wouldn't have liked the results of an up-front design: more bugs and flawed assumptions.

      Iterative and incremental processes are what develop quality software.

      Now as for the quality of the software & user interfaces - IIRC the NASA group that has the SEI's CMM level 5 rating has only had this for around 5-6 years. That's not a long time given the slow pace of change of shuttle interfaces. From what I've read before, the new control system is well designed and easy to decipher.

      --
      -Stu
  66. Re:Can you imagine just how simple those things ar by Gallowglass · · Score: 1
    I'm not picking on Alex here, other posters have made this point too: "But it's Slo-o-oow!" "We don't have the time!"

    Surely you are all familiar with the mantra, "There's never enough time to make it really right, but there's always time to fix it."

    Frankly, the reason the Shuttle Group works right is that it plans before it starts to code. Good planning prevents mistakes that have to be fixed later. Note also that unplanned changes normally = introduction of chaos. I've been programming since punched cards, and all the good books on programming and system design warn against jumping into the coding before finishing the design. Yet time and again, I've been whacking out the code before the specs are really finished. (We say they are finished, but the number of changes that have to be made proves us to be liars.)

    In my experience, and from what I've read, proper planning rarely extends the length of the project. The difference is that more time is spent on the planning, and *less* time on coding and a *lot* less time on debugging. Spend enough time on planning too, and you get rid of the bulk of last-minute changes from the client. If you find out what the client wants before you start coding, then you're a lot less likely to receive change requests when you're deep in the code.

    It is, of course, a management issue. Managers are generally the ones who set the schedule. But the programming staff have a responsibility - if they really want to think of themselves as competent professionals - to fight against foolish deadlines and a rush to code.

    We also don't follow good programming practices IMNSHO. I've just been reading "The Pragmatic Programmer" and I strongly suggest you all *run* out and get your own copy imeedjutment! This is a book I wish to The Maker I'd had to read back in college days. This book talks about good programming practices - which is something that is rarely discussed in any detail.

    Which is why there is so much crappy code out there. (And I include my own, alas.)

  67. Re:THAT is how to write code by drudd · · Score: 2

    Solidifying a contract like that works when the client actually knows what they want. More often they have absolutely no clue of what they want/need, and require the programmer to help them along that stage as well.

    With these type of clients (and I've dealt with many) taking the proper long stage of design and discussion doesn't work at all. The client immediately changes their tune after seeing initial results. Not so much to add features, but that the features they actually requested were not the ones they needed, or didn't work within their business practices.

    Doug

    --
    Venn ist das nurnstuck git und Slotermeyer? Ya! Beigerhund das oder die Flipperwaldt gersput!
  68. Re:Flight Software by danale · · Score: 1

    >Sadly when you try and apply those standards to commercial quality code, it flops.

    This is so true; I worked in an SEI level 3 group. The time I spent there was invaluable in learning to design, write and test code properly. Everywhere I go, there is *so much* resistance to implementing any kind of structured process. Why is there so much reluctance to implement these ideas?

  69. Re:ohh if only... by guran · · Score: 2

    'Course if they started writing space shuttle code like that, it would be "Goodbye World"...

    --

    All opinions are my own - until criticized

  70. Re:I can understand why they want no hacks by proj_2501 · · Score: 1

    Somehow I think those comments would look more like this:

    ;Shuttle Waste Dump
    ;
    ;I dunno WHY this works, but it does!
    --
    The other side is crowded. The dead have nowhere to go.

  71. Re:Processes in software engeneering. by cutevoice · · Score: 1

    Every time I read a history of a programme and find a line "completely re-wrote the code", I begin having second thoughts about how really good the programme is.

    minor correction: "completely re-wrote the code with new insights." Anyone who's done a half-complex software engineering project can tell you there are often subtle design decisions that will not become apparent until an attempt of a full-blown implementation. These design decisions which seemed insignificant originally, might dictate the organizational structure of the program, and such a thing may not be easily altered cheaply afterwards.

    Didn't the Linux TCP/IP stack undergo several rewrites?

    megumi

  72. Re:Processes in software engeneering. by Paul+Wright · · Score: 1
    Every time I read a history of a programme and find a line "completely re-wrote the code", I begin having second thoughts about how really good the programme is.

    Really? You might want a complete re-write for a couple of reasons:

    • Often the best way to get a handle on a problem is to write a prototype. ISTR this is what's behind Fred Brooks' saying "Plan to throw one away: you will, anyhow."
    • Refactoring because the desired functionality changed and there was then a much better way to implement the remainder.

    With the ever-faster-growing complexity of programmes, it becomes more and more difficult for humans, even aided with computers, to keep track of the project. But if you teach everyone how the computer logic works, the programming would become only about writing the necessary simple code (ha! hackers, get this!).

    I don't understand your point here. What do you mean by computer logic in this context?

    There are ways of expressing solutions to problems (different programming languages, CASE tools and so on) which are well suited to particular problems, but there are no magic bullets.

  73. Re:This is what fault-tolerant systems are for by Paul+Townend · · Score: 1

    Argh! It mentions multiple versions in the first paragraph, doesn't it! *bangs head against wall*
    That'll teach me to read articles in advance!
    Sorry!

  74. 1 Bug? by Lozzer · · Score: 1

    The almighty process predicts that there is one bug in the system. This must be keeping the programmers from sleeping wondering where the hell it is...

    --
    Special Relativity: The person in the other queue thinks yours is moving faster.
  75. Re:Redundant - kinda by Anonymous Coward · · Score: 1
    Good point - The Fast Company article was actually linked to three times ( once with a rating of five ) in user comments on the Glass Cockpit story. That story itself received a complaint about being redundant, having been linked to from user comments in earlier stories.

    This is starting to look like a pattern. I suggest that either timothy be moderated down, or, in the spirit of the article, that the Slashdot story quality assurance process be improved ;)

  76. Re:Interesting stuff by SlydeRule · · Score: 1
    what language is all of this done in? Ada would be my guess, or is there something even better than that?

    I doubt that it's in Ada; I think the software predates Ada.

    As for something better: in a word, Eiffel.

    Unfortunately, there's an old but true saying:

    "Make something foolproof, and only a fool will want to use it."
    Eiffel is an amazingly clean, simple, and straightforward object-oriented language. It removes both the ability and the need for clever coding hacks, thereby shifting most of the development effort to software design rather than implementation. Most "programmers" aren't designers, though, and they scream bloody murder that they can't code something really cool like
    while (*to++ = *from++);

    obOSS: A GPL'ed Eiffel compiler is available at http://www.loria.fr/projets/SmallEiffel/.

  77. Re:Flight Software by kzinti · · Score: 1

    Actually, AFAIK, the main reason is that old 386s are tested, tested and, once more, tested for space use.

    You're thinking of the PGSCs, the IBM Thinkpad laptops that are carried onboard to help the crew do various non-critical tasks, usually (always?) related to payload operations. These are indeed old 386-based machines, radiation-hardened but still susceptible to crashes related to bit errors in RAM. They run Windows 95, so obviously they aren't used for critical ops. (BTW, linux has been run onboard too.)

    The GPCs (General Purpose Computers) that run the flight software are IBM AP101-S computers, cousins to the IBM 360/370 architecture machines of the 1970's. The AP101 is a big-endian, 16-bit, 4 MIPS (approx.) machine that can address up to a megabyte (actually 2^19 16-bit halfwords) of memory. It has been extensively tested for space use, as you note, which is another reason NASA sticks with it. An earlier version of this computer, the AP101-B, flew earlier shuttle missions and has been used in military applications like the B-52.
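
    The "megabyte (actually 2^19 16-bit halfwords)" figure can be checked with a few lines of arithmetic. This is only a worked calculation from the numbers quoted above; the constant names are made up for the example.

```python
HALFWORD_BITS = 16               # width of one addressable unit, per the comment above
ADDRESSABLE_HALFWORDS = 2 ** 19  # number of halfwords the machine can address

total_bits = ADDRESSABLE_HALFWORDS * HALFWORD_BITS
total_bytes = total_bits // 8

print(ADDRESSABLE_HALFWORDS)     # number of halfwords
print(total_bytes)               # total capacity in bytes
print(total_bytes == 2 ** 20)    # is that exactly one binary megabyte?
```

    Each halfword is two bytes, so 2^19 halfwords come to 2^20 bytes.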

    --Jim

  78. Not Exactly by nathanm · · Score: 1

    I don't know about that.

    The F-16's COG (ctr of gravity) is so far back on the fuselage it is an inherently unstable aircraft. So unstable, in fact, that w/o the computer adjusting 2 control surfaces on the underside of the fuselage 60 times/sec, it would pitch up & down continuously. That's just one reason why the F-16 is a piece of junk, IMHO.

    The B-2 couldn't fly at all w/o computer assisted flight controls.

    1. Re:Not Exactly by luckykaa · · Score: 1

      Most fighters are designed to be unstable. It makes them more maneuverable. Typically they have wings pointing down towards the wingtips to allow faster banking. Computer assistance allows them to fly in a straight line when the pilot wants to.

    2. Re:Not Exactly by Wyatt+Earp · · Score: 2

      I think the last US fighter not to rely on computer controls was probably the F-15. To be inherently unstable is a feature... not a bug. Wasn't it a software flaw that caused the prototypes of both the F-22 and the Saab Gripen to crash on landings? Although the F-22 was a walk-away crash and fire... the Gripen was a bit more spectacular, if I remember it right.

      The B-1B has seven of the GCUs that the Shuttle has. So it couldn't fly at all either. The FA-18E has a number of PowerPC-chipped flight control computers... the FA-18E is the first US fighter to use Cat-5 Ethernet to connect the computers together instead of obscure military cabling... at least that's what I read.

      IMHO the biggest problem with the F-16 is the fact that it has a single engine. If you look, single-engine jets crash more than twice as often as twin-engine jets. Single Point of Failure will get you every time.

    3. Re:Not Exactly by nathanm · · Score: 1

      You're right, they designed it to be unstable.

      I'm not sure about the crashes. They don't release many details about the F-22 yet.

      AFAIK the F-16 is the only production fighter incapable of being piloted w/o computer assistance (besides fly by wire). The F/A-18 (at least the older models) can still be flown w/o a flight computer.

      I hadn't heard about using cat-5 on the Super Hornet. I know the C-130J replaced lots of cable w/fiber optic, which reduced the weight by a few 1000 lbs.

    4. Re:Not Exactly by Ellen+Forradalom · · Score: 1

      You can find the same stability tradeoffs in the humble bicycle. Racing bicycles have a shorter wheelbase than touring bikes; this makes them more maneuverable at the expense of stability.

  79. Who are the kernel QA gurus? by hubie · · Score: 1

    After reading the article, I got to wondering what the QA process for kernel mods is. Is there a beat-the-hell-out-of-the-new-driver process that goes on, or is it a release into a beta to see if anyone has problems? I assume there aren't official QA testers, but are there any guidelines that before something is accepted into the kernel, it should at least be tested for X, Y, and Z?

  80. The SEI CMM is the real deal ! by gelfling · · Score: 1

    This work uses the SEI CMM (capability maturity model) developed by SEI and used in many areas within the civilian and military software development community. Organizations are assigned levels with specific tasks, roles and processes defined to advance to the next level. All results that are compared to the CMM levels must be verified, quantified and repeatable. A good place to start if you are interested in how the military develops robust software is the Software Technology Support Center @ http://stsc.hill.af.mil/

  81. I want my SPECS, dammit! by Anonymous Coward · · Score: 1

    From what I hear about in the real world, some (but by no means all or even most) programmers look down on clients just because they don't know much about programming. They assume that just because they have a certain expertise over others that they somehow know more than them in general.

    No, I look down on them because they don't know what they want, but they want me to do it anyway.

    I'd be relieved if somebody came to me with clear, detailed specs for once. I usually get about three sentences, and, if I ask any questions, it's "uh, I dunno, what do you think?" I'M WRITING THE SOFTWARE FOR YOU!!! You decide! I'm not going to do your job, too! "Well, uh, we need it next week, and I don't have time." (This is where I rip his heart out of his ribcage as he stands there, or at least I'd like to.)

    Um, sorry 'bout that. That just gets to me sometimes.

  82. Here's the difference by CaseyB · · Score: 2
    Consider these stats : the last three versions of the program -- each 420,000 lines long-had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.

    This software is the work of 260 women and men...

    Commercial programs of equivalent complexity would have been written by 7 or 8 people.

    1. Re:Here's the difference by segmond · · Score: 1

      LOL!!!!! LOL!!!!
      That is not the difference, my friend. If you take 260 people and put them to work on such a project, it will be even more of a disaster. The difference is competent management. Can you imagine the kind of management it takes to handle 260 people? A lot of the people are testers. In the commercial world, we have 1 tester to 3 or 4 programmers; at NASA it is the reverse: 1 programmer, 4 testers. Anyway, NASA uses very solid processes; you should check out their software engineering page. In India there are CMM level 5 companies with 0.03 defects per KLOC, that is, 3 errors in 100,000 lines, whereas most companies today... ahem (MS is having 14-17 defects per KLOC). Those private companies do not have 260 people; they just have solid process plans and management.
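
      For anyone who wants to compare the figures quoted in this sub-thread on one scale, here is a small sketch that converts them to defects per KLOC. The function name is hypothetical; the inputs are only the numbers quoted above (one error in 420,000 lines, 5,000 errors for a commercial program of the same size, 3 errors in 100,000 lines), not independent data.

```python
def defects_per_kloc(defects, lines_of_code):
    """Defect density: defects per thousand lines of code (KLOC)."""
    return defects / (lines_of_code / 1000)


SHUTTLE_LINES = 420_000  # program size quoted from the article

shuttle = defects_per_kloc(1, SHUTTLE_LINES)         # roughly 0.0024
commercial = defects_per_kloc(5_000, SHUTTLE_LINES)  # roughly 11.9
cmm5 = defects_per_kloc(3, 100_000)                  # the 0.03 figure above

for label, density in [("shuttle", shuttle), ("commercial", commercial), ("CMM 5", cmm5)]:
    print(f"{label}: {density:.4f} defects per KLOC")
```

      On these numbers the article's "commercial" estimate works out to about 12 defects per KLOC, which is in the same neighborhood as the 14-17 quoted here.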

      --
      ------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
    2. Re:Here's the difference by BigRedZX · · Score: 1

      260 people over twenty years.

  83. Ariane 5 by SlydeRule · · Score: 1
    The official report on the Ariane 5 failure can be found at:

    http://www.esrin.esa.it/htdocs/tidc/Press/Press96/ariane5rep.html

  84. Re:Safety is cool... by Alan+Shutko · · Score: 2

    If more projects worked like that, there would also be a lot less software in the world. Say goodbye to whatever you're running to watch slashdot, you couldn't afford it. (You also couldn't afford the hardware to run it on, because faultless software is of little utility without faultless hardware.)

    I would suggest that if every software project were SEI-5, there would be no internet and people would be doing papers on typewriters.

  85. Re:Interesting stuff by cheezehead · · Score: 1
    what language is all of this done in?

    That question is answered in other replies. But: the whole essence of it all is that it is the process that is the crucial factor. The implementation language is not the deciding factor (although I will admit that some languages will let you create a bigger mess than others), although I suspect that the selection of the implementation language is part of the process.

    --

    MSN 8: Now Microsoft even has bugs in their ad campaigns.

  86. 3rd Job = 1st Experience with Software Engineering by weston · · Score: 2

    My third programming job was my first experience with software engineering. I'd had 4 years of experience at two other jobs -- one where I wrote code for an InterLibrary Loan book lending database, and one where I worked on an e-commerce package. There was not a thing at either place that would qualify as a spec, and there was no process in place for engineering. I didn't know anyone who used specs. I assumed that this was something that was taught by computer science professors, but wasn't actually practiced by anyone.

    Then I got a job at the Waterford Institute. Their process probably wasn't as tight as the space shuttle's, but there WAS a process, and there were specs. Nice specs. Nearly pseudo-code.

    We were programming educational activities for kids learning math. Activities were created by design teams consisting of an educator, an artist, a tech writer, and a programmer. The tech writer would document everything that went on at the meetings, and distill it into a spec. The design team would meet regularly over a period of several months, refining the spec until it was solid.

    The spec described various states of the software. When a user did something, the state of the software changed, and did something accordingly. I'd never seen software described this way, but it made a big impression on me, and it made things easy to write and debug.
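
    To give a flavor of what a state-based spec translates into, here is a minimal sketch of a transition table. Every state and event name below is invented for illustration (none of them come from the actual Waterford specs), and the "ignore unlisted events" rule is an assumption.

```python
# Hypothetical states and events for a single math exercise.
# Each entry reads: (current state, user event) -> next state.
TRANSITIONS = {
    ("show_problem", "correct_answer"): "praise",
    ("show_problem", "wrong_answer"): "hint",
    ("hint", "correct_answer"): "praise",
    ("hint", "wrong_answer"): "show_solution",
    ("praise", "next"): "show_problem",
    ("show_solution", "next"): "show_problem",
}


def step(state, event):
    """Return the next state, or stay put if no transition is listed."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="show_problem"):
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    print(run(["wrong_answer", "correct_answer", "next"]))
```

    Written this way, debugging mostly means comparing the table against the spec line by line, which matches the "easy to write and debug" experience described above.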

    ('course, the platform we were writing on was in Java, which kept changing, and in-house developers were writing our own object library, which kept changing too, so your code would work one day, and then wouldn't the next, so everything wasn't perfect. But hey. I was impressed with the specs :)

  87. Failure Criticality by CorporateProgrammerD · · Score: 1
    Doing a Military style Failure Modes Effects and Criticality Analysis (FMECA) is rather similar to a NASA style one. The criticality is obtained by multiplying the failure rate by the severity of the failure. (in general terms that's how all FME(C)A's are performed, even those in the auto industry) Thus, a minor failure that happens frequently is as much of a problem as a severe failure that almost never occurs. You MUST design out potential failures that are both severe and likely.
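
    To make the multiplication concrete, here is a minimal sketch. The failure modes, rates and severity weights are invented for illustration and do not come from any real FMECA worksheet.

```python
def criticality(failures_per_million_hours, severity):
    """Criticality as described above: failure rate multiplied by severity."""
    return failures_per_million_hours * severity


# Hypothetical failure modes: (name, failures per million hours, severity weight).
FAILURE_MODES = [
    ("indicator lamp burns out", 1000, 1),
    ("flight computer halts", 1, 1000),
    ("fuel valve sticks open", 50, 400),
]


def ranked(modes):
    """Sort failure modes from most to least critical."""
    return sorted(modes, key=lambda m: criticality(m[1], m[2]), reverse=True)
```

    With these made-up numbers the frequent minor failure and the rare severe one score the same (1000 each), while the mode that is both fairly likely and fairly severe ranks first.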

    However, the worst severity in a Military FMECA is NOT "loss of life." It is "Mission Failure." Which makes sense. Losing a pilot (and aircraft) is bad, and very expensive. Losing a battle is rather worse. Fortunately for the pilots, a failure that kills the crew also tends to hamper their effectiveness in completing their missions.

    --
    To email, do the obvious.
  88. Re:Flight Software by kzinti · · Score: 1

    The project that I thought you might be interested in is the development of a space shuttle flight computer emulator for linux described here.

    I've seen that project description before, because I wrote it. I'm very familiar with the GPCE project because I'm the principal author of the C++ version. Unfortunately, none of our Dual partners in academia wanted to tackle the conversion of GPCE to little-endian architectures, so for now we can't run on Lintel systems.

    --Jim

  89. Re:Its ONE way to write code by dimator · · Score: 1

    I agree. Developers in the commercial world are under different constraints and requirements than the shuttle software crew. Commercial shops are motivated by $$$, not perfect code. If "good enough" will sell (and that it does!), then why bother with anything else? If no one cares about 5000 errors, then why spend $35mil (assuming the company has/can/wants to spend that much money on 1 application) writing phone books of specs?

    There's also the matter of competition. The reason we have tales of hackers writing deep into the night to get a product out the door is because if they didn't, competition could very well spring up out of the blue and beat them to market.

    There are lessons to be learned, however, even for open source developers like ourselves. Just about every project on, say, freshmeat, has a link to a source tarball on its web page. But how many of those projects have even a single design document on their sites? Why is that? Design/specification is not a development phase that can be skipped just because you want to get coding because it's fun. Proper documentation is vital before coding all but the most trivial of applications.


    --
    "And is the Tao in the DOS for a personal computer?"

    --
    python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s'; print x%(chr(34),repr(x),chr(34))"
  90. 99.9 percent by hughesma · · Score: 1

    I think this was mentioned in the show "From the Earth to the Moon", but it illustrates how important perfection is in the space industry.

    If, for example, there are 100,000 parts on the Space Shuttle, getting 99.9 percent accuracy means that 100 parts can break. Getting 99.99 percent means that 10 parts can break.
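
    The arithmetic, spelled out as a small sketch (the function name is just for illustration; the percentage is passed as a string so the result is exact rather than a rounded float):

```python
from fractions import Fraction


def parts_allowed_to_fail(total_parts, reliability_percent):
    """Number of parts that may fail at the given reliability percentage."""
    failure_fraction = 1 - Fraction(reliability_percent) / 100
    return total_parts * failure_fraction


three_nines = parts_allowed_to_fail(100_000, "99.9")   # 100 parts
four_nines = parts_allowed_to_fail(100_000, "99.99")   # 10 parts
```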

    --
    ----------------------------------------- Well damn...so that's what that does...
  91. Interesting stuff by Straker+Skunk · · Score: 2

    I saw this article a while back linked from here. Incredibly cool stuff . . . the part about "blueprinting software" and "how we design software in the future" was especially cool. It makes one aspire to code to a higher standard.

    That said, something I was curious about that the article didn't answer, and that I don't see mentioned here yet-- what language is all of this done in? Ada would be my guess, or is there something even better than that?

    --
    iSKUNK!
    1. Re:Interesting stuff by KannibalKing · · Score: 2

      I work for this project and the bulk of the flight software application stuff is written in HAL/S, while the system software (OS) is written in both HAL/S and assembly. HAL/S is a language developed specifically for real-time shuttle operations back in the mid-70s and looks a lot like Pascal. It is currently maintained by a company called AverStar specifically for this project.

    2. Re:Interesting stuff by cheeto · · Score: 5

      I work in the Flight Software (FSW) Verification group in Houston.

      The shuttle FSW code is written in something called HAL/S. This stands for High-level Assembly Language / Shuttle. The language was designed to read like mathematics is written. Superscripts like vector bars are actually displayed on the line above, subscripts like indices are displayed on the line below. Vectors and matrices can be operated on naturally, without looping.

      We are the only ones with a compiler, because we wrote it ourselves.

      Here's a sample:

      EXAMPLE:

      PROGRAM;

      DECLARE A(12) SCALAR;

      DECLARE B ARRAY(12) INTEGER INITIAL(0);

      DECLARE SCALE ARRAY(3) CONSTANT(0.013, 0.026,0.013);

      DECLARE BIAS SCALAR INITIAL(57.296);

      DO FOR TEMPORARY I = 0 TO 9 BY 3;

      DO FOR TEMPORARY J = 1 TO 3;

      A =B SCALE + BIAS;
      I+J I+J J

      END;

      END;

      CLOSE EXAMPLE;

      I couldn't get the subscripts to line up, but you get the idea.
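
      For anyone who doesn't read HAL/S, here is a rough Python rendering of the loop above. It assumes the misaligned subscript line means A[I+J] = B[I+J] * SCALE[J] + BIAS and that the arrays are 1-based, as the loop bounds suggest; the function name is only for illustration.

```python
SCALE = [0.013, 0.026, 0.013]
BIAS = 57.296


def example(b):
    """Apply A[I+J] = B[I+J] * SCALE[J] + BIAS over a 12-element array."""
    a = [0.0] * 12
    for i in range(0, 10, 3):       # DO FOR TEMPORARY I = 0 TO 9 BY 3
        for j in range(1, 4):       # DO FOR TEMPORARY J = 1 TO 3
            k = i + j               # 1-based index, runs over 1..12
            a[k - 1] = b[k - 1] * SCALE[j - 1] + BIAS
    return a


# B is declared INITIAL(0), so every element of A ends up equal to BIAS.
result = example([0] * 12)
```

      The two nested loops are only there to walk SCALE in step with every third element; the statement itself is ordinary element-wise arithmetic.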

      --
      - "Sweet merciful crap!" Homer J. Simpson
    3. Re:Interesting stuff by Detritus · · Score: 2
      The software is written in a language called HAL/S (High-Level Aerospace Language Shuttle). It was originally developed by Intermetrics.

      The Shuttle was flying before Ada had been developed.

      --
      Mea navis aericumbens anguillis abundat
    4. Re:Interesting stuff by HP+LoveJet · · Score: 1

      Hey, this is really interesting.

      Are the specs and syntax for HAL/S (or even [drool] a distribution of the compiler) publicly available?

      --
      spawn_of_yog_sothoth
    5. Re:Interesting stuff by harmonica · · Score: 2

      I was wondering about the same thing! But while Ada certainly is superior to many other languages, it's not a panacea -- the exploded Ariane 5 rocket mentioned in the article failed because of an unhandled exception in some Ada code which was simply taken over from the Ada 4 (3?) control system without checking it against the new requirements. At least that's what they told us in the first lecture of our software engineering class ;-)

    6. Re:Interesting stuff by harmonica · · Score: 1

      simply taken over from the Ada 4 (3?) control system

      Of course I mean Ariane 4!

  92. Re:Can you imagine just how simple those things ar by FarHat · · Score: 1
    but reasonably geeky and educated programmer can pull something like that in ideal conditions -

    Not necessarily. A point that was repeatedly emphasized in the article was that these guys are not ego-driven. Most people who are geeky and educated have pretty huge egos as well; they like to put their signatures in the code. Did you notice that no programmer's name came up in the entire article, while a typical article on a project like Linux or Quake would make gods of Linus and Carmack? You can't have it both ways. And although everybody agrees some tightening up is required, this kind of rigidity, which is perfectly justified here, would be harakiri in a corporate environment.

    A large part of the software in this world started as a hack in some university lab and was then improved upon till it came to a passably useful stage. That is why you have these EULAs which absolve the software makers from all responsibility in contrast with this group's software where they take full responsibility for anything going wrong.

    And lastly, the late-night coding sessions are not all that bad. I bet a large part of the kernel code for Linux and BSD was written that way. But you should have an independent review process that is responsible for catching the bugs. Peer review can do wonders for the code before the final version comes out.

    FarHat

    --
    At the intersection of computation and biology.
  93. Re:Flight Software by kzinti · · Score: 3

    Want to know what a Shuttle GPC looks like? Check out
    http://www.ksc.nasa.gov/mirrors/images/images/pao/STS39/10064134.htm.

    --Jim

  94. Spacecraft Design by Catmeat · · Score: 1

    I did an MSc in Spacecraft Engineering and it looks to me like they've adopted many of the protocols for ensuring the reliability of hardware in spacecraft. However, the question to ask is how long a private company would stay solvent if it tried doing this. Everything has its place, including the pizza-munching Xers, if no lives are at stake.

    1. Re:Spacecraft Design by snub · · Score: 2

      I work for SAIC and we use the same processes (SEI) in our software development. Our clients include banks, airlines, brokerages, the IRS, etc. We made >$5 billion last year alone doing this. It costs a bundle to set it up initially and requires a ton of training to make sure people do it right but the result is outstanding software and very, very few all-nighters.

      --
      "Shredded cabbage and mayo go good together." Cole's Law
  95. Re:Flight Software by michael.creasy · · Score: 1

    I thought they were the only group to achieve SEI-Level 5. If not, then who else has? I'd love to go and correct one of my lecturers.

    My Webcam

  96. Its ONE way to write code by BigTom · · Score: 1

    You could hold this up as an exemplar on how to write code when you have a stable, well specified requirement and lots of resources.

    Most development teams don't live in that world and never will. Business users don't change their requirements because they are capricious, their requirements change, they just do, that's life.

    I'm not saying that many (any?) development shops get it right but you have to apply an appropriate process for your circumstances.

    That said a lot of what they do: team orientation, no-blame culture, focus on process improvement, focus on quality and fixing at source, will always help.

    Tom

  97. Refactoring, Extreme programming, and other books. by lonely · · Score: 1


    I think that rewriting is a good thing. Check out the following books:

    Extreme Programming by Kent Beck
    Refactoring by Martin Fowler

    Both advocate the use of proper unit testing, which many people do not do, to make sure that any alterations to the structure of the code don't change the functionality.
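
    A minimal sketch of that idea using Python's standard unittest module. The two functions are invented stand-ins for "before" and "after" a refactoring, not examples from either book.

```python
import unittest


def total_price_original(prices_in_cents, tax_percent):
    # "Before" version: manual accumulation.
    total = 0
    for price in prices_in_cents:
        total = total + price
    total = total + total * tax_percent // 100
    return total


def total_price_refactored(prices_in_cents, tax_percent):
    # "After" version: same behaviour, simpler structure.
    subtotal = sum(prices_in_cents)
    return subtotal + subtotal * tax_percent // 100


class TotalPriceTest(unittest.TestCase):
    # (prices, tax percent, expected total) pinned down before refactoring.
    CASES = [([], 10, 0), ([100], 10, 110), ([250, 750], 20, 1200)]

    def test_refactoring_preserves_behaviour(self):
        for prices, tax, expected in self.CASES:
            with self.subTest(prices=prices, tax=tax):
                self.assertEqual(total_price_original(prices, tax), expected)
                self.assertEqual(total_price_refactored(prices, tax), expected)
```

    Saved to a file, this can be run with python -m unittest; if the restructured version ever drifts from the pinned-down results, the test fails before the change ships.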

    Also worth a look is "Software Project Survival Guide" by Steve McConnell. It has a groovy questionnaire so that you can work out how doomed your project is.

    These books seem to be the best. I found that the projects I have been involved in failed because myself and others didn't have enough knowledge of how projects go wrong and, just as importantly, of the processes that stop the failures.

    I have just spent quite a bit of time in the last month reading such books and OO design books, and was amazed how obvious it all was.

    Note that 2 out of these 3 books mention the NASA programming method.

  98. Sounds like the companies I've worked for... by Nuncio · · Score: 1

    ... where you focus on a working solution, instead of just quickly trying to get something to work and then relying on more money to make it work right.

  99. Re:Flight Software by jammz · · Score: 2

    I thought they were the only group to achieve SEI-Level 5. If not, then who else has? I'd love to go and correct one of my lecturers.

    When the Capability Maturity Model for Software was published by the SEI there was only one ML-5 organization; at the time they were known as the IBM Onboard Shuttle group. Thankfully, times are changing.

    According to the SEI's 1999 survey, 61 organizations reported a Maturity Level of 4 or 5. Of those, 40 were Level 4 groups and 21 were Level 5. The survey goes on to mention that as of 15-Feb-2000, some 71 organizations reported that they were Level 4 or 5. Those that gave their consent are listed in Appendix A.

  100. Re:THAT is how to write code by segmond · · Score: 1

    I must disagree. The client is always right; the mess is management's, so do not blame it on the client. Before you ever write any code for a client, you go through what you call the prototype phase: you work with the client and select all the features you want, you build a prototype and show it to the client, and you prioritize the features according to importance and whatnot. You throw away the features that can't make it due to cost, time and such... You agree with the client, you talk to the client and explain how making a change can greatly impact the project. For example, an innocent change in the UI might mean rewriting hundreds of pages of documentation and such. After you have made everything clear with the client, put it down in writing and have them sign it. If the client comes back, remind him, show him the signed form, send him out of your office. Now, sometimes clients may have a really valid reason to be back; if it is valid, compromise, so long as everyone understands that it will take longer. Notice that the only impact a client's demand for more features should have on a program is that it will take longer, not that the program will become a mess. The problem with today's world is lack of management: most programmers or product/technical leads are not people persons. They can grok code, but they can't manage and talk to people.

    --
    ------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
  101. Redundant - kinda by Finni · · Score: 1

    This timothy guy posted an article about the Glass Cockpit installed in the shuttle a little while ago - the comments included a link to this Fast Company article (very good, but I'm sure most of us read it.) It's also from Dec 96 - not very new.

  102. Processes in software engeneering. by Marketolog · · Score: 2
    It is clear that the more you know about the customer's wishes, the better the software you'll write. But how many programmers really study computer logic, software engineering, project management?

    Not as many as should.

    Every time I read a history of a programme and find a line "completely re-wrote the code", I begin having second thoughts about how good the programme really is.

    With the ever-faster-growing complexity of programmes, it becomes more and more difficult for humans, even aided with computers, to keep track of the project. But if you teach everyone how the computer logic works, the programming would become only about writing the necessary simple code (ha! hackers, get this!).

    Would the next generation of programmers write in a "logic language" instead of C++? Who knows, but it would IMHO make the programmes more robust and even better.

    1. Re:Processes in software engeneering. by john_many_jars · · Score: 2
      Proof of working high level language isn't even enough on a single processor machine. Optimizations in compilation use techniques that must be accounted for in some very special cases (like some uses of shared memory, function calling conventions, and row-major v. column major problems).

      The only way to know for certain is to either code directly in bits or be (extremely) intimate with the compiler and linker. At that point, a proof will be correct.

      The way the shuttle seems to work is you better have a damn good reason to write/alter/delete/modify/worship a line of code. This will catch the majority (~99.95% by their reports of errors v. standard commercial software) of reasonable errors.

      They identified the weakest link in the chain of software engineering and have fortified it quite well.

    2. Re:Processes in software engeneering. by JTB · · Score: 1
      Every time I read a history of a programme and find a line "completely re-wrote the code", I begin having second thoughts about how good the programme really is.

      Many great designers operate under the paradigm of "throw-one-away". Solve the problem once, throw out your work, and solve it again.

      The idea is that your second attempt at solving the same problem will be much more elegant, as a result of lessons learned the first time you solved it.

  103. Unintentionally humorous quote in article by Seth+Finkelstein · · Score: 3
    Bill Pate, who's worked on the space flight software over the last 22 years, says the group understands the stakes: "If the software isn't perfect, some of the people we go to meetings with might die."

    I can see many Dilbert-fans wondering if that is a bug or a feature.

  104. If they ever release this to the public by linuxonceleron · · Score: 4
    System Requirements:

    1 Space Shuttle Endeavour
    1 Launch Pad
    1 Houston Mission Control Station
    4 Astronauts

    --

    Shine on, you crazy diamond.
  105. An alternative strategy by luckykaa · · Score: 1

    I've heard that fighter aircraft have inherently unreliable software with a very low restart time. Next to the trigger there's a hardware reset button. Typically the software will go down 2 or 3 times in a flight.

    And you thought Windows was bad.

    1. Re:An alternative strategy by cburley · · Score: 2
      I've heard that fighter aircraft have inherently unreliable software

      If that's so, it's an interesting illustration of the overall system's requirements imposing lower quality standards on components of that system.

      To wit: the article (I presume; haven't read it, but have read similar ones on the same topic) discusses the importance of achieving a 100% quality rate on a given chunk of software.

      Now, that software is merely one component in a much larger system.

      Actually, these larger systems nest "outwards". I.e. the shuttle itself is a larger system than the software it contains, but so is NASA a larger system than the shuttle; so is the US government larger than NASA; so is the USA larger than the government; so is the planet's population larger than the USA; etc.

      In this case, there are specific reasons I can suggest account for the 100% quality requirement that might otherwise go unnoticed:

      • Failure resulting in death of participants, and especially of non-participants (humans), is not an option.

        However, failure resulting in not launching, not even building it in the first place, especially not building it within some timeframe, is an option. That is, failure of the "commitment to quality" approach to actually deliver the component on a "timely" basis is an acceptable option.

      • The world generally will admire a program such as the space shuttle less if it crashes and burns frequently, killing/maiming people and destroying equipment, than if it succeeds on the extremely rare occasions on which it is tried -- perhaps even less than if it never happened in the first place.

      • A delay in a shuttle launch costs, overall, far less than the cumulative risks of premature shuttle launches. (Challenger demonstrated that.)

      (Yes, there's some overlap there, but these are subtly different points, that might apply independently in other projects. E.g. a not-publicly-visible project might have no risk of embarrassment should it fail in one way vs. another, but have a huge risk of $$$ loss.)

      Compare these elements to fighter aircraft, where the software is part of a somewhat different set of larger systems:

      • The deaths of participants and non-participants is expected by most everyone of this sort of system and the activities around which it revolves.

        On the contrary, the sorts of failures that result from failing to launch a fighter plane, or never having designed it in the first place, are generally not so well-tolerated.

      • The world will likely fear a non-existent fighter plane, even one that has 100% success in its flight-control software (doesn't require rebooting) but is launched extremely rarely (it's hard to build) or too late, far less than it will a large fleet of existing, dangerous fighters that have even a 10% "kill" rate of its pilots per year.

      • A delay in a fighter-plane deployment can literally cause lost wars. In that sense, the loss of pilots due to poor design is a calculated positive compared to the loss of a nation's (and/or its peoples') freedom.

      Of course, I'm making pretty much everything up, above, so don't bother arguing details or interpretations with me -- I have no idea whether they're correct or not.

      But, they're probably correct enough to illustrate why it's probably okay for us to be using highly buggy computers on a poorly designed (for the way it's being used now, anyway) Internet rather than, as another post on this thread put it, using typewriters and plain paper.

      Not that there aren't wonderful advantages to deploying 100% correct software components in a large-scale, much-buggier system! "Creeping quality" is not a bad thing at all, since it allows people working on the system to worry less about various portions of it as they try to debug it.

      But, the effort to deploy such perfect components may well outweigh the utility of doing so, overall, given the pertinent timeframe.

      In particular, when trying to deploy such a perfect component in a large, buggy system, it can be hard figuring out which component can be made so "perfect" and still be useful in that (presumably speedily-evolving) system by the time it's ready!

      So maybe it's appropriate to view almost everything we deal with on the Internet as a very early alpha-stage prototype after all. ;-)

      --
      Practice random senselessness and act kind of beautiful.
    2. Re:An alternative strategy by BigStink · · Score: 2

      So the way that Microsoft Flight Simulator keeps crashing is actually a feature?

    3. Re:An alternative strategy by Anonymous Coward · · Score: 1

      True: modern fighter aircraft allow the pilot to "reboot" one or many of the internal systems, and the systems are also allowed to reboot other systems.
      False: the software does not crash 2 or 3 times a flight. I would not fly with that software if it did; no one would! It does happen, but it's maybe once per 500 hours or so.

  106. ohh if only... by Linus+H. · · Score: 2

    I had the time.
    I had the patience.

    Well, this is cool. It proves that you can't write perfect software*. However, you can come close.
    If only everybody would do it this way, not just some cool company.
    This probably even produces better software than the "open source" way. OpenBSD is the only open software project that comes close, which really is kind of sad. People need to relax to do it right. Down with stress!

    Well, if you meet someone who works at some dot-com (there are quite a lot of them here in Stockholm), they are always really, really stressed. That might impress the stock market but not really anyone else... That is the reason everybody talks about "When will the bubble burst?" and I can tell you this:

    The "bubble" (which consists of overstressed people) will burst very soon. The more relaxed people will take it easily.

    * Well you can, but Hello World! isn't really THAT
    complex.

    --
    It's called new wave but it's just the same.
    1. Re:ohh if only... by Emil+Brink · · Score: 2

      Of course, not everyone can even write a perfect Hello World implementation. *Sigh*

      --
      main(O){10<putchar(4^--O?77-(15&5128 >>4*O):10)&&main(2+O);}
  107. Process must suit the project by VonKruel · · Score: 1

    First, I do think the process used for the FSW stuff is probably an excellent choice for that project. This is because of the following factors:

    • s/w failure can be very expensive in human and monetary terms
    • requirements aren't changing every day - in fact they hardly ever change
    • big budget
    • long schedule

    The author of this article seems to be saying that if all software was developed the same way, we'd be living in some sort of bug-free software utopia. The reality is that many software projects proceed under very different circumstances:

    • if the s/w fails, the user will be annoyed rather than dead
    • requirements are subject to constant change
    • very tight schedule: if you deliver later than your competitor, you lose market share and your company may even fail

    Under these circumstances, you had better have some "hot shot" (!= cowboy) people on your project, or you will have a failed project. The real challenge on most software projects is to write good software in a fairly short period of time. Unfortunately, there are a lot of bad programmers and s/w project managers, and what we end up with a lot of the time is shitty or mediocre software.

    I thought the author's best point was that software development is unnecessarily crazy, and could benefit a lot from being done in a more relaxed atmosphere. Unfortunately, it seems beyond the ability of any one company to make this happen, given the need to compete in the marketplace. Basically, what would have to happen to make software better is for everyone to demand better software - to make quality a priority over speed. The markets currently are saying that speed is the bigger issue.

  108. it is not list of requirements... by silpol · · Score: 1

    it looks like BoM (Bill of Materials) :)))

    --
    this field has been intentionally left blank ;)
  109. Bug free?? by jonathanclark · · Score: 2

    This software is bug-free. It is perfect, as perfect as human beings have achieved. Consider these stats : the last three versions of the program -- each 420,000 lines long-had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.

    How can they be sure it's bug free? If the last 14 versions had 20 errors, did they think it was bug free each time - only to find more bugs? At 500k lines of code you can't prove it all mathematically and human checkers are.. well human.

    One way to measure how many bugs your code has is to purposefully introduce a bug and tell people to find it. Then you count how many new bugs they found along with the bug you introduced and scale that by the lines of code you have. But this technique won't work if you only have 1 or 2 bugs that people are actively looking for in the first place. So, my question is - how can they be sure it is bug free?
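
    For reference, a sketch of the usual error-seeding estimate. It scales by the fraction of planted bugs recovered rather than by lines of code, so it is a related textbook technique and not necessarily the exact scheme described above; the function names are illustrative.

```python
def estimate_real_bugs(seeded_total, seeded_found, real_found):
    """Estimate the total number of real bugs from an error-seeding experiment.

    Assumes testers find real bugs at the same rate as planted ones, so
    real_total is roughly real_found * seeded_total / seeded_found.
    """
    if seeded_found == 0:
        raise ValueError("no planted bugs were found; the estimate is undefined")
    return real_found * seeded_total / seeded_found


def estimate_remaining_bugs(seeded_total, seeded_found, real_found):
    """Estimated real bugs still hiding after the experiment."""
    return estimate_real_bugs(seeded_total, seeded_found, real_found) - real_found


# Hypothetical run: 10 bugs planted, 8 of them found, 4 real bugs found too.
total = estimate_real_bugs(10, 8, 4)            # 5.0
remaining = estimate_remaining_bugs(10, 8, 4)   # 1.0
```

    With only one planted bug the recovered fraction is either 0 or 1, which is exactly why the estimate says so little when there are only a handful of bugs to begin with.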

    1. Re:Bug free?? by zantispam · · Score: 1

      "So, my question is - how can they be sure it is bug free?"

      Keep in mind that it was Fast Company that said the software is bug-free. They then turn around and say it had bugs in it :-)

      Anyway, the article mentions that the Shuttle Group uses a database to keep track of bugs. Whenever one bug is found, it is eliminated and then placed into the bug database. The testers then look over every other part of the codebase that has similar {constructs|calls|whatever} and analyze those parts for bugs. Considering how many entries are in the bug database, I would suspect that all of the code has been thoroughly gone over...

      ...which still doesn't ensure 100% that it's bug-free. Being human, we cannot, by definition, be 100% sure that a particular codebase is 100% bug-free. However, I think it's safe to say that the code is 99.995% bug-free.

      I'd ride the Shuttle on those odds...

      Here's my copy of DeCSS. Where's yours?

      --

      censorship is a form of noise, which actively seeks to drown out content with silence - Crash Culligan
  110. I can understand why they want no hacks by Ikari+Gendou · · Score: 3

    So people don't see lines like this in the code:

    #Shuttle Waste Dump
    #
    #I dunno WHY this works, but it does!

    --

    Call on God, but row AWAY from the rocks!

  111. Re:Seems almost like ISO... by VonKruel · · Score: 1
    Exactly.

    I have witnessed the same thing myself - the process becomes an end in itself. People forget that what we're paid for is software, the process is only useful to the extent that it helps us to create better software.

  112. What's fun in software development? by Harald74 · · Score: 4

    I can almost hear the moans from the pizza-and-coke crowd when they read this: "Where's the fun? Where's the creativity?". But they're under the mistaken assumption that putting lines of code into the editor is the only fun thing about developing software.

    IMHO, software development is full of fun activities. What about analysis and design? In my experience, that's where the creativity really comes into play. Just talking to the customer, understanding the problem and making a working design is really difficult, and hence rewarding when you pull it off.

    And what about the process itself? Software development is a young discipline, where individuals and small groups really can make an impact. Nobody really knows how to make good software. Maybe you'll be the one to find out? As the man says, in the shuttle software group, people use their creativity on improving the process.

    And last, but not least, I bet those guys have a really good feeling when they talk to the customer after delivery. Not like some people I know, who just hide. ;)

    If you can't see the fun of these other activities, maybe you shouldn't be working in this field...

    --
    A)bort, R)etry or S)elf-destruct?
    1. Re:What's fun in software development? by Adam+Selene · · Score: 1

      Having written hardware control code that was NO WHERE NEAR AS LIFE CRITICAL as the shuttle, but that would toast 500k worth of hardware if you had a bug, I'll tell you one of the MOST fun parts of this kind of work.

      When you sit back and watch some totally amazing piece of hardware do its stuff, with everyone going "Ooh" and "Aaah", and all you do is smile.

  113. Re:Safety is cool... by jhines · · Score: 1

    From a programmer's perspective, there are two reasons to pull an all-nighter.

    1) you are in the "groove", where ideas and code flow naturally. This is very common among creative people, artists and musicians and such.

    2) forced OT to meet a deadline.

    The first is a very good thing, and the latter is where the bad reputation of all nighters comes from.

  114. This is what fault-tolerant systems are for by Paul+Townend · · Score: 2

    One way of increasing the reliability of software is to use n-version programming, whereby you implement several versions of the software, written by different people, and then create a voter system that constantly compares the data of each program and forwards the consensus one. Even if none of the programs agree, the voter 'knows' that something is amiss and can alert the pilot/engineer/whatever. I'm doing my PhD on this, and I know that NASA has implemented quite a few n-version systems, as well as the more tried and trusted multiple-redundant hardware. I heard somewhere that the space shuttle code costs the equivalent of $100,000 a line (feel free to tell me I'm wrong if you know the 'true' figure) so it might be worth considering. Certainly a number of prominent academics reckon that you can get a 45:1 improvement in a software system by implementing 3 channels as opposed to a single good system. Blah, anyway, that's my $.02 worth.
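
    For the curious, a toy sketch of the voter idea. The three "versions" are trivial stand-ins I made up (one deliberately buggy); a real voter would also have to cope with timing and inexact floating-point agreement, which this ignores.

```python
from collections import Counter


def vote(outputs):
    """Return (value, True) for a strict majority, else (None, False)."""
    if not outputs:
        return None, False
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 > len(outputs):
        return value, True
    return None, False  # no consensus: alert the pilot/engineer


# Three independently written versions of the same hypothetical routine.
def version_a(xs):
    return sum(x * x for x in xs)


def version_b(xs):
    total = 0
    for x in xs:
        total += x ** 2
    return total


def version_c(xs):
    return sum(xs) ** 2  # deliberately wrong, to show the voter masking it


def run_n_version(xs):
    """Run every version on the same input and forward the consensus."""
    return vote([version(xs) for version in (version_a, version_b, version_c)])
```

    Here run_n_version([1, 2, 3]) forwards 14 because two of the three channels agree; the voter only helps as long as the versions do not share the same mistake.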

    1. Re:This is what fault-tolerant systems are for by Harald74 · · Score: 1

      As you say, Paul, this is one way of increasing the reliability of software. And I personally think n-version programming is way cool, but there are some problems with this approach:

      • Specification - It has been suggested that most software faults stem from inadequate specification. N-v.p. will not address this.
      • Independence of design effort - It is questionable if different design efforts indeed will produce different errors. Different studies show different things.
      • Budget - It is unclear whether an N-version system, which is several times more expensive than a regular system, will perform better than a single system made with the same budget. Think also of the maintenance costs.


      What I really want to say is this: N-version programming is not a magic bullet. It's just one more tool to use to produce reliable systems.

      Check out "Real-Time Systems and Programming Languages" by Burns and Wellings for a short treatment of this subject, and some pointers for further reading.

      --
      A)bort, R)etry or S)elf-destruct?
  115. this only works for few projects by jilles · · Score: 3

    Let's see,
    - half a million LOC (that's small)
    - under development for 20 years
    - new requirements are avoided at all cost

    So it is a small, long lived project with a nearly unlimited budget. No wonder they can afford to have such a process in place. But now realistically, how long does it take to set up such a project from scratch? How about having a customer who does not know what he wants? How about deadlines of less than 10 years from now?

    I honestly believe that this way of delivering software is optimal for nothing else but long lived, multi billion dollar projects. In any other case you'll end up with something that is delivered years too late, matches the requirements of 10 years ago, and is close to useless.

    Unfortunately many software companies are in a situation where they can't afford to wait for perfect software. Take mobile phones as an example. Typically these things become obsolete within half a year after introduction. The software process is what determines time to market. Speed is everything. If you can deliver the software one month earlier, you can sell the phone one month longer.

    Of course testing, requirement specs and software designs are useful for any project, but it's usually not feasible to do them properly.

    --

    Jilles
  116. Re: Amateurs built the Ark... by beamin · · Score: 1

    They had a better project leader.

  117. just like where I work by MooseMunch · · Score: 1

    The company that I work for has a very similar environment. We have over 60 developers, and probably over 300 people working on the project. Not as critical as a space shuttle, but we manage wireless phone networks. Processing over 1 billion calls a month for nearly 8 million subscribers requires the same sort of accuracy. If we screw up, our client loses money and customers.
    The big difference is that we're organized, follow an excellent software development process, and we pipeline well. The design process involves several reworks of design documents, and code goes through several reviews before it is even tested.
    moral of the story...organize up front, fewer problems later

  118. Challenger Investigation by Ken+Hall · · Score: 1

    If you want a pretty good (if a bit outdated) overview of NASA and the shuttle program (warts and all), see if you can find a copy of "What Do You Care What Other People Think?" by Richard Feynman. The second half of the book is about his involvement with the Challenger Accident Investigation. He also had very high praise for the shuttle software group, and adds a few comments/items that weren't covered in this article. Highly recommended!

  119. Re:Flight Software by FarHat · · Score: 1

    How primitive! So they don't want an 'ice' or 'peach' flavored powermac to do their numbercrunching.

    --
    At the intersection of computation and biology.
  120. The joy of PLCs by ballestra · · Score: 1
    This is why most manufacturing and heavy machinery systems are controlled by programmable logic controllers (PLCs). You can't risk someone getting hurt because of software bugs, so either you go through the rigorous and expensive process that the FSW team uses, or you keep the system so simple that rigorous testing can be achieved at low cost. PLCs are very simple, reliable, cheap and flexible. They're just NOT general purpose. You can't play Quake on them, but if you want to program a sequence of control switches, they can't be beat. It doesn't hurt that programming a PLC is about as easy as drawing a ladder diagram.
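
    For anyone who hasn't seen ladder logic, here is a rough Python sketch of the classic start/stop "seal-in" rung and of the scan loop a PLC repeats (read inputs, evaluate rungs, write outputs). The names scan, run, start_pb, stop_pb and motor are made up for illustration; a real PLC would be programmed in ladder diagram or a similar language, not Python.

```python
def scan(start_pb: bool, stop_pb: bool, motor: bool) -> bool:
    """Evaluate one rung: motor = (start OR motor) AND NOT stop.

    Ladder equivalent:
        --[ start_pb ]--+--[/ stop_pb ]--( motor )
        --[ motor    ]--+
    """
    return (start_pb or motor) and not stop_pb


def run(inputs):
    """Run one scan per (start_pb, stop_pb) sample and record the motor state."""
    motor = False
    history = []
    for start_pb, stop_pb in inputs:
        motor = scan(start_pb, stop_pb, motor)
        history.append(motor)
    return history


if __name__ == "__main__":
    # press start, release it, idle, press stop, idle
    samples = [(True, False), (False, False), (False, False),
               (False, True), (False, False)]
    print(run(samples))
```

    The whole program is one boolean expression evaluated over and over, which is why exhaustive testing is affordable: three boolean inputs give only eight cases to check.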

    "What I cannot create, I do not understand."

  121. Re:Seems almost like ISO... by Mr.+Slippery · · Score: 4
    everything revolves around the 'process'. The result is determined by the process.
    The problem is that often the process becomes primary, and the reasons behind it get lost.

    I'm working on a large NASA project now. I have determined that the purpose of this project is not to produce a working software system, but rather to produce a wall full of loose-leaf binders of incomprehensible documentation that no one will ever refer to again.

    The process says we must have code reviews - great! But instead of being an analysis of the logic of my code, it turns into a check against the local code formatting standards - "You can't declare two variables with one declaration, use int a; int b; instead of int a,b;" (yes, that's an actual standard around here) instead of "Hey, if foo is true and bar is negative, you're going to dereference a garbage pointer here!"

    The forms are observed, but the meaning is forgotten, like Christians going to church on Sunday then cutting people off and flipping them the bird on the drive home.

    "Process" won't save us. Which doesn't mean that a certain amount of it can't help, but there is no silver bullet.

    --
    Tom Swiss | the infamous tms | my blog
    You cannot wash away blood with blood
  122. Flight Software by kzinti · · Score: 5

    I happen to work just down the hall from the guys who maintain and upgrade the shuttle Flight Software (FSW), and I can tell you they have a rigorous design, inspection, and test sequence that they go through before they fly new or modified code. The story around here (which I have no reason to doubt) is that the FSW team was one of the first SEI level-5 certified shops in the nation.

    I can also tell you that NASA avoids having to make unnecessary changes to the FSW. For example, the new "glass cockpit" recently discussed here on Slashdot: when these upgrades were designed, they chose to design the interface to the new display modules to exactly mimic the interface to the old instruments. In other words, they are true plug-and-play replacements; one significant reason for this was so the flight software didn't have to be modified.

    Likewise, people often ask why the shuttle continues to use such antiquated General Purpose Computers: slow, 16-bit machines designed back in the seventies. There are many reasons, but a big reason is that new hardware would almost certainly require massive changes to the flight software. And rewriting and recertifying all that software would be a huge task. The current FSW works reliably; if it ain't broke...

    Huzzah! As I type, we just launched Atlantis. Go, baby, go!

    --Jim

    1. Re:Flight Software by maetrix · · Score: 2

      Congratz,

      Something I've always wondered about is at what point do you figure you have done enough planning and start to work on the actual project? What is their timeline dependent on: 100% error-free pseudocode before anything actually gets implemented, or a set number of readings?

      I think a lot of the reasons most companies don't go through this extensive pre-stage process is because they fear the project will get lost in a black hole of redesign and doublechecks.

      Also where can one find the Software Engineering Institute ( SEI ) specs?

      --
      Dum spiro, spero --While I Breathe, I hope.
  123. Formal Methods are the key. by Anonymous Coward · · Score: 2
    Time and again we hear about the requirements for 100% reliability. But most of us are simply paying lip-service to this idea. Formal Methods and techniques which can PROVE a program is BUG FREE have been around since the late 60s, but hardly anyone is using them.

    If everyone would simply use VDM/Z or Larch/CLU for all their development work, it would be much easier for us to prove our software is correct, and then all bugs would be a thing of the past.

    It really is that simple. Don't these people remember what they were taught at college?

    1. Re:Formal Methods are the key. by gargle · · Score: 2

      Well my professor at college asked this question:

      "Would you rather fly on an airplane with software that has been proven to be correct, or on an airplane with software that has been rigorously tested through actual flight time?"

      I think the answer is clear.

    2. Re:Formal Methods are the key. by El+Cabri · · Score: 3

      After the Ariane 5 maiden flight failure in '96, the software was tested by an academic lab in France with heavily mathematical formal methods. The arithmetic exception that caused the $1b explosion was proved to be possible, along with several other 'dangerous' operations. Formal methods are now taken much more seriously and the incident is invariably told as an incentive to students in majors that relate to the mathematical aspects of programming.

      Another formal system that originated in France is the Methode B (the B method), which consists of progressively refining logical statements about the desired behaviour of your program (like the assert() calls you put before and after the body of a function) into the implementation of that behaviour:

      http://estas1.inrets.fr:8001/ESTAS/BUG/WWW/BUGhome/
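
      To make the assert() analogy concrete, here is a loose Python sketch: the pre- and postconditions stay fixed while an abstract, obviously-correct version is replaced by a more efficient one. This only illustrates the idea of refinement using runtime checks; the B method discharges such obligations by proof rather than by running asserts, and the function names are invented for the example.

```python
def isqrt_spec(n: int) -> int:
    """Abstract version: search for the r that satisfies the postcondition."""
    assert n >= 0                              # precondition
    r = next(k for k in range(n + 1) if (k + 1) * (k + 1) > n)
    assert r * r <= n < (r + 1) * (r + 1)      # postcondition
    return r


def isqrt_refined(n: int) -> int:
    """Refinement: same contract, but computed by binary search."""
    assert n >= 0                              # precondition (unchanged)
    lo, hi = 0, n + 1                          # invariant: lo*lo <= n < hi*hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    assert lo * lo <= n < (lo + 1) * (lo + 1)  # postcondition (unchanged)
    return lo


if __name__ == "__main__":
    # The refinement must be indistinguishable from the spec on every input.
    print(all(isqrt_spec(n) == isqrt_refined(n) for n in range(200)))
```

      In a formal setting the loop invariant in the comment is what you would prove is preserved by each branch; here it is merely checked by testing a finite range.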

      An academic formal methods team that checks the Ariane 5 software:

      http://pauillac.inria.fr/para/eng.htm

    3. Re:Formal Methods are the key. by 97jaz · · Score: 1
      I think the answer is clear.
      Yes. Both.
  124. Seems almost like ISO... by vchoy · · Score: 2

    Notice the article is similar to ISO (International Standards Org): everything revolves around the 'process'. The result is determined by the process. I used to work for a company that had a documented process for everything...from software development right through to filling out your wage timesheet! I think the important thing to note is that it all depends on the 'culture' and type of the organization. If people accept this style of operation then it's great. For an organization that has to program software that directly deals with lives at stake, there must be a 'process' to ensure the s/w is written perfectly (and tested).

    I have come across fellow workers who absolutely hate this type of practice... well, they're probably best suited for development of systems that are not life-critical.

    1. Re:Seems almost like ISO... by vchoy · · Score: 1

      hehe, well looks like you are sick of the "process system".

      "The forms are observed, but the meaning is forgotten"

      The forms are observed, and if implemented properly (hopefully) the desired output is produced...that's the whole meaning -ONE objective-THE process- correct output! Now it seems to me that either the "process" you are working on is not correctly designed leaving you unsatisfied with the output or as stated above, you simply dislike the "process". If you feel the latter case is your opinion, (as written in my previous post) your personal work culture is not harmonized with your organisation's culture.

      It makes sense because if all staff in your organization feel the same way as you, the process will simply not exist...because no one would be implementing it! The point is your organization has processes in place to ensure STANDARDS are met and the final product is fit for mission critical systems.

      "like Christians going to church on Sunday then cutting people off and flipping them the bird on the drive home."

      Religions have a process too, you know. It calls for the 10 Commandments to be followed.

      One commandment states "Do not accuse anyone falsely". Thus, if that person resisted the cut'n'bird technique on the way home, the desired output of the process would be - "LOVE"

    2. Re:Seems almost like ISO... by vchoy · · Score: 1

      Well, as I said before, it seems to me that the process is not producing the right output to your satisfaction. (You've seen better ones before.) In your organization there should be a 'process' to ensure that you and everyone involved in your project is satisfied with the process. If your organization did this, more focus would be placed on the output than on complaining about 'forms' and this and that.

      Judging from this, your organization does not have this type of quality process review where everyone has a say. Your manager(s) are the be-all and end-all...etc. It appears that the 'design' of the process is not effective, rendering the process you are implementing ineffective.

      A meta-process is still a 'process' that encourages everyone involved to design the process. Your organization does not have this...DESIGN OF THE PROCESS goes hand in hand with the OUTPUT OF THE PROCESS.

      The proper design of the process will have you focus on the output instead of the process itself, and thereby 'enlighten' both your staff and yourself.

  125. If you read past this... by pwhysall · · Score: 2

    "C is a great, if complicated language. It's simple, yet can get complicated very easily..."

    It's complicated.

    It's simple.

    It's complicated again.

    The article gets worse from there.
    --

    --
    Peter
  126. THAT is how to write code by dnnrly · · Score: 5

    Some of my most successful programs (read: they actually worked, or thereabouts) came about because I was in a funny mood and decided to actually plan them out. From what I hear about the real world, some (but by no means all or even most) programmers look down on clients just because they don't know much about programming. They assume that just because they have a certain expertise over others, they somehow know more than them in general.
    The good thing about the way software is written here is that the requirements are written down and sorted out before they even do the planning. How many programmers, groups, firms etc. can say that? I will admit, though, that a major problem is changing requirements, something that just doesn't happen in the same way for NASA. It might just be better if people decided to wait a bit before jumping into the programming. They'll save themselves more time and money in the long run.

    1. Re:THAT is how to write code by ahg · · Score: 2

      All too often it's a fickle client that causes a program to become a mess... Every couple of weeks they want a new feature. Then they get the first revision in their hands and they want something completely different. It's not that the programmer never gave them the time of day to figure out what they want, it's just that they are not engineers such as those at NASA who can write a tight spec on what is _needed_ as opposed to what their own whimsical mind thinks would be cool - err I meant "productive".

      --

      --Aaron Greenberg

  127. Author missed the point. by magic · · Score: 1
    I was eagerly looking for the process secret and was disappointed to find that the things the author thought were the key were bug-tracking, revision control, and code reviews. These are standard practices. They seemed to impress the author, but I think the impressive part is the management process that actually requires a rigid design, where everything is clearly specified ahead of time and goes back to square 1 when something changes.

    I blame the captain when the ship goes down, and I blame the managers (starting with the CEO) when coders are staying up late and projects are over budget. It's fun to pull an all-nighter and be a hero every now and then, but if you have good programmers, it's not their fault when things go south. In my experience, projects fall apart because managers didn't allow enough time for specification and didn't stick to the original specs.

    On a separate point, the error rate given (99.9% of all errors are caught) is statistically bogus. First off, they mean 99.9% of the errors that were caught or manifested were found. They don't know how many errors that haven't manifested themselves are still there. It is possible to predict such numbers given bug rates and other data, but that is a prediction, not an actual accuracy rate. The real issue I have with the number is that they are saying that the first 80% are caught before QA gets the code. I call that "making my program work." If I counted every problem I fixed before I handed over my code, then my rate would be pretty good too.
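
    The arithmetic behind that objection can be sketched in a few lines of Python. All the counts below are invented for illustration and are not figures from the article; the point is only that the "caught" percentage depends on what you put in the numerator and denominator, and that errors nobody has seen yet appear in neither.

```python
def caught_rate(found_before_release: int, found_after_release: int) -> float:
    """Fraction of *known* errors that were found before release."""
    known = found_before_release + found_after_release
    if known == 0:
        raise ValueError("no known errors, so the rate is undefined")
    return found_before_release / known


# Hypothetical counts for one release.
fixed_by_developer = 800   # found while "making the program work"
found_by_qa = 199          # found by reviews and testing after hand-off
found_in_flight = 1        # manifested after release

# Counting everything the developer fixed along the way:
with_dev_fixes = caught_rate(fixed_by_developer + found_by_qa, found_in_flight)

# Counting only what was found after the code was handed over:
qa_only = caught_rate(found_by_qa, found_in_flight)

if __name__ == "__main__":
    print(f"including developer fixes: {with_dev_fixes:.1%}")
    print(f"QA onward only:            {qa_only:.1%}")
```

    Same project, same single escaped error, and the headline number moves just by deciding which fixes count. Latent errors that have not manifested would only lower either figure.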

    That said, the shuttle group is awesome. Special thanks to them for enabling the most important endeavor of this century.

    magic

  128. Ummmm... no? by -ryan · · Score: 1
    That's way off. A close friend of mine happens to be a programmer for the mission control systems, and from what he has described, their software (mission control, et al.) is a patchwork of C, Fortran, and an alphabet soup of extinct languages. The systems are scary, he says.

    -ryan

    "Any way you look at it, all the information that a person accumulates in a lifetime is just a drop in the bucket."

  129. Safety is cool... by MosesJones · · Score: 4


    I never quite understand why it is an act of macho bravado to work all night and live off pizza. It indicates two things: 1) a badly run project and 2) poor maintainability in the code.

    In one of my previous incarnations I worked on display systems for Air Traffic Control, where the quality level was also very high, where the performance requirements were exacting and the specifications precise.

    Some would think that this means simple and boring... Of course not. Having to display a track from reception at the radar to the display in 1/10th of a second isn't easy by any stretch of the imagination, and to do it so it works 100% of the time means you have to understand the problem properly rather than coding and patching.

    If only more projects worked like that, there would be a lot fewer bugs in the world.

    --
    An Eye for an Eye will make the whole world blind - Gandhi