Slashdot Mirror


Space Shuttle Software: Not For Hacks

Jeff Evarts writes: "This article in Fast Company talks about the process the Shuttle Group uses to make software. At first it seems too predictable: a very cool project but no hacks, no pizza-and-coke all-nighters, etc. Then, however, it goes on to talk about why: They have an informed customer, they talk to that customer until they have a very clear idea of what is wanted, they have a budget focused on prevention, and they focus on fixing the process and not blaming the individual.

As someone who's done more than his share of late-nighters, it was an interesting view into the mission-critical environment. Maybe there are a few software firms out there that would rather spend some of their money on better processes than on technical support engineers. Maybe a little more market research and a little less marketing, too. A good read."

These guys are "pretty thorough" the way Vlad the Impaler was "a little unbalanced." Still, you have to wonder how they can claim single-digit errors among thousands of lines of code, but I guess the proof is in the rocket-powered pudding. And lucky for them, their target platform was recently upgraded.

6 of 178 comments

  1. The importance of documentation by Glimmer_Man · · Score: 5

    I worked on some mission-critical/life-critical stuff about 2 years ago. It was aircraft related, and since it was basically carrying the data which made the plane fly, it was critical by any definition. The process we followed was absolutely document driven. User specs were examined, questions were asked, and the user was asked to add definition and clarification over several iterations of the document. Then the software requirements etc. followed, each document with quite a bit of iteration. Eventually we found that documentation and design would typically take 50% of the project. Testing would take about 30 to 35%, and the actual implementation hardly took any time at all.

    Now in the commercial world, I find that the process is VASTLY different. Implementation starts shortly after the user specs have hit the desk, before design or documentation has begun. As a result, the system we currently have is very patchy in places. Its mission is a lot less critical, but the bugs slow us down tremendously. The bugs are due to the process. The process is requirements driven, not documentation driven. Yet the system I'm working on now has about the same complexity as the one I used to work on. Even though we are supposed to be pushing it all out the door faster, the bugs are slowing us to the point where we have approximately the same rate of progress as the mission-critical project!!

    Lesson: If you do it by the documentation, you will push it out faster and cleaner (and more bugfree!!!)

    1. Re:The importance of documentation by TomV · · Score: 5
      I worked on some mission-critical/life-critical stuff about 2 years ago. It was aircraft related, [...] The process we followed was absolutely document driven.

      Likewise, I worked for a while on the signalling system for the Jubilee Line Extension for the London Underground.

      Totally documentation driven. First there was the CRS (Customer Requirements Spec). This was then transformed via an SRS (Systems Requirement Spec) into the FRS (Functional...) and the NFRS (Non-functional...). From these we had Software Design Specs, Module Design Specs, Object Design Specs, and Boundary Layer Design Specs. In all there were around 4000 specification documents for the project, often at issue numbers well into the teens.

      What really made the difference, though, was not so much the existence of documentation as the absolute insistence on traceability: every member function of every class in the whole system could be traced back to the Customer Requirements Spec, and every requirement could be traced to its implementation. This meant no chrome: everything in the spec was provided, and nothing was provided that wasn't in the spec.
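
      The two-way check described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual tooling: the requirement IDs, the function names, and the `check_traceability` helper are all invented for the example.

```python
# Hypothetical data: the requirement IDs and function names below are
# invented for illustration and do not come from the real project.
requirements = {"CRS-001", "CRS-002", "CRS-003"}

# Each implemented function lists the requirement IDs it traces back to.
implementation = {
    "Signal.set_aspect": {"CRS-001"},
    "Track.occupied": {"CRS-002"},
    "Signal.flash_logo": set(),  # "chrome": traces back to nothing
}


def check_traceability(requirements, implementation):
    """Return (unimplemented requirements, untraceable functions, unknown IDs)."""
    # Every requirement ID that at least one function claims to implement.
    covered = set().union(*implementation.values())
    unimplemented = sorted(requirements - covered)
    untraceable = sorted(name for name, reqs in implementation.items() if not reqs)
    unknown = sorted(covered - requirements)
    return unimplemented, untraceable, unknown


unimplemented, untraceable, unknown = check_traceability(requirements, implementation)
print("Requirements with no implementation:", unimplemented)
print("Functions with no requirement:", untraceable)
print("References to unknown requirements:", unknown)
```

      A clean project in this sense is one where all three lists come back empty: nothing in the spec is missing, and nothing exists that the spec did not ask for.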

      Also worth noting: the whole thing was in Ada 95. The compiler was very carefully chosen. Coding standards were tight, and tightly enforced. Function point analysis was king: anything with more than 7 function points was OUT, simple as that. Every change to anything, however small, required an inspection meeting before and after implementation, with specialists from every part of the system which could be impacted, plus one of the two people with a general overview. Then there were the two independent test teams and the validation team.

      Ye Gods it got tedious, no denying that. But in a situation where lives depended on good software...

      Now I probably apply only a tiny fraction of what I learned, but when I decide to ignore part of the methodology, at least I know I'm ignoring it. And I'm aware of what I'm missing.

      In short - learn about the safety-critical approach. Ditch most of it as excess baggage by all means - it's often simply not justifiable. But be aware of the choices you're making.

      TomV

  2. Somewhat bogus article by Animats · · Score: 5
    That article has been around for a while. It paints an excessively rosy picture of the Space Shuttle flight control software.

    Here's NASA's own history on bugs in that software:

    • So, despite the well-planned and well-manned verification effort, software bugs exist. Part of the reason is the complexity of the real-time system, and part is because, as one IBM manager said, "we didn't do it up front enough," the "it" being thinking through the program logic and verification schemes. Aware that effort expended at the early part of a project on quality would be much cheaper and simpler than trying to put quality in toward the end, IBM and NASA tried to do much more at the beginning of the Shuttle software development than in any previous effort, but it still was not enough to ensure perfection.
    Read the NASA history. They had a 200-page known-bug list in 1983, although they did fix most of them during the long downtime after the Challenger explosion.

    The Shuttle's user interface is awful. The thing has hex keyboards! Some astronaut comments include:

    • "What we have in the Shuttle is a disaster. We are not making computers do what we want" -- John Young, Chief Astronaut, 1980s
    • "We end up working for the computer, rather than the computer working for us." -- Frank Hughes, NASA flight trainer
    • "crew interfaces...more confusing and complex than I thought they would be" -- John Aaron, NASA interface designer
    • "(the) 13,000 keystrokes used in a week-long lunar mission are matched by a Shuttle crew in a 58-hour flight" -- NASA history

    This project should not be held up as a great example of software engineering. Even NASA doesn't think it is.

  3. Re:Interesting stuff by cheeto · · Score: 5

    I work in the Flight Software (FSW) Verification group in Houston.

    The shuttle FSW code is written in something called HAL/S. This stands for High-level Assembly Language / Shuttle. The language was designed to read like mathematics is written. Superscripts like vector bars are actually displayed on the line above, subscripts like indices are displayed on the line below. Vectors and matrices can be operated on naturally, without looping.

    We are the only ones with a compiler, because we wrote it ourselves.

    Here's a sample:

    EXAMPLE:
    PROGRAM;
       DECLARE A(12) SCALAR;
       DECLARE B ARRAY(12) INTEGER INITIAL(0);
       DECLARE SCALE ARRAY(3) CONSTANT(0.013, 0.026, 0.013);
       DECLARE BIAS SCALAR INITIAL(57.296);
       DO FOR TEMPORARY I = 0 TO 9 BY 3;
          DO FOR TEMPORARY J = 1 TO 3;
             A    = B    SCALE  + BIAS;
              I+J    I+J      J
          END;
       END;
    CLOSE EXAMPLE;

    The subscripts may not line up perfectly here, but you get the idea.
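
    For readers who don't know HAL/S, the loop in the sample can be read roughly as the Python below. This is a sketch under two assumptions that the comment does not state: that HAL/S arrays are indexed from 1, and that writing B and SCALE side by side means multiplication. The lowercase names simply mirror the sample's identifiers.

```python
# Rough Python reading of the HAL/S sample above.
# Assumption: HAL/S arrays are 1-based, so slot 0 is padding and unused.
N = 12
a = [0.0] * (N + 1)                  # A: 12 scalars
b = [0] * (N + 1)                    # B: 12 integers, INITIAL(0)
scale = [None, 0.013, 0.026, 0.013]  # SCALE: 3 constants at indices 1..3
bias = 57.296                        # BIAS

# DO FOR TEMPORARY I = 0 TO 9 BY 3  ->  I takes the values 0, 3, 6, 9
for i in range(0, 10, 3):
    # DO FOR TEMPORARY J = 1 TO 3   ->  J takes the values 1, 2, 3
    for j in range(1, 4):
        # Subscripts on the line below: A and B use I+J, SCALE uses J.
        # Assumption: juxtaposed operands are multiplied.
        a[i + j] = b[i + j] * scale[j] + bias

print(a[1:])
```

    Between them the two loops visit each index from 1 to 12 exactly once, so every element of A is assigned.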

    --
    - "Sweet merciful crap!" Homer J. Simpson
  4. Flight Software by kzinti · · Score: 5

    I happen to work just down the hall from the guys who maintain and upgrade the shuttle Flight Software (FSW), and I can tell you they have a rigorous design, inspection, and test sequence that they go through before they fly new or modified code. The story around here (which I have no reason to doubt) is that the FSW team was one of the first SEI level-5 certified shops in the nation.

    I can also tell you that NASA avoids having to make unnecessary changes to the FSW. For example, the new "glass cockpit" recently discussed here on Slashdot: when these upgrades were designed, they chose to design the interface to the new display modules to exactly mimic the interface to the old instruments. In other words, they are true plug-and-play replacements; one significant reason for this was so the flight software didn't have to be modified.
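
    The plug-and-play idea above is the familiar drop-in replacement pattern, and a toy Python sketch makes the point. None of the names here (`AttitudeDisplay`, `MechanicalGauge`, `GlassPanel`, `flight_software_step`) come from the real system; they are invented purely to show why matching the old interface leaves the calling code untouched.

```python
from typing import Protocol


class AttitudeDisplay(Protocol):
    """Hypothetical interface that the flight software talks to."""

    def show_pitch(self, degrees: float) -> str: ...


class MechanicalGauge:
    """Stand-in for an original cockpit instrument."""

    def show_pitch(self, degrees: float) -> str:
        return f"needle at {degrees:.1f}"


class GlassPanel:
    """Stand-in for a new display module that mimics the old interface."""

    def show_pitch(self, degrees: float) -> str:
        return f"LCD pitch {degrees:.1f}"


def flight_software_step(display: AttitudeDisplay, pitch: float) -> str:
    # Plays the role of the certified code: it is identical for both
    # displays and never needs to know which one is installed.
    return display.show_pitch(pitch)


print(flight_software_step(MechanicalGauge(), 2.5))
print(flight_software_step(GlassPanel(), 2.5))
```

    Because `flight_software_step` depends only on the shared interface, swapping the hardware behind it requires no change to that function, which in this analogy is the part that would otherwise need recertifying.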

    Likewise, people often ask why the shuttle continues to use such antiquated General Purpose Computers: slow, 16-bit machines designed back in the seventies. There are many reasons, but a big reason is that new hardware would almost certainly require massive changes to the flight software. And rewriting and recertifying all that software would be a huge task. The current FSW works reliably; if it ain't broke...

    Huzzah! As I type, we just launched Atlantis. Go, baby, go!

    --Jim

  5. THAT is how to write code by dnnrly · · Score: 5

    Some of my most successful programs (read: they actually worked, or thereabouts) came about because I was in a funny mood and decided to actually plan them out. From what I hear about the real world, some (but by no means all or even most) programmers look down on clients just because they don't know much about programming. They assume that because they have a certain expertise over others, they somehow know more than them in general.
    The good thing about the way software is written here is that the requirements are written down and sorted out before they even do the planning. How many programmers, groups, firms etc. can say that? I will admit, though, that a major problem is changing requirements, something that just doesn't happen in the same way for NASA. It might just be better if people decided to wait a bit before jumping into the programming. They'll save themselves more time and money in the long run.