When Bugs Aren't Allowed
Coryoth writes "When you're writing software for an air traffic control system, military avionics software, or an authentication system for the NSA, the delivered code can't afford to have bugs. Praxis High Integrity Systems, who were the feature of a recent IEEE article, write exactly that kind of software. In "Correctness by Construction: A Manifesto for High-Integrity Software" developers from Praxis discuss their development method, explaining how they manage such a low defect rate, and how they can still maintain very high developer productivity rates using a more agile development method than the rigid processes usually associated with high-integrity software development."
probably helps too :P
Uh... it's going to be kind of hard for the NSA to do its job without bugs, isn't it?
*rimshot*
When you're writing software for an air traffic control system, military avionics software, or an authentication system for the NSA, the delivered code can't afford to have bugs
I've been in this industry for quite some time and let me be the first to say that I wish I could repeat this sentence with a straight face.
There are a huge number of yeast infections in this county. Probably because we're downriver from the bread factory.
The only method I have seen with almost perfect reliability is where the inputs and outputs are overloaded to handle any datatype and can be proven mathematically not to crash. I guess a CS degree is still useful.
The problem is that to obtain it you need to write your own libraries and not use ANSI or Microsoft or any other products, as you cannot see or trust the source code.
If you can prove through solid design and input and output types that the program won't lose control then you're set. It's buffer overflows and flawed design that has not been tested with every conceivable input/output that cause most serious bugs in medical and aerospace applications.
However, in practice this challenge is a little impractical when deadlines and interoperability with closed source software get in the way.
http://saveie6.com/
The authors contend that there are two kinds of barriers to the adoption of best practices... First, there is often a cultural mindset or awareness barrier... Second, where the need for improvement is acknowledged and considered achievable, there are usually practical barriers to overcome such as how to acquire the necessary capability or expertise, and how to introduce the changes necessary to make the improvements.
No, the reason so much software is buggy is economics. Proprietary software vendors have to compete against other proprietary software vendors. The winners in this Darwinian struggle are the ones who release buggy software, and keep their customers on the upgrade treadmill. Users don't typically make their decisions about what software to buy based on how buggy it is, and often they can't tell how buggy it is, because they can't try it out without buying it. Some small fraction of users may go out of their way to buy less buggy software, but it's more profitable to ignore those customers.
Find free books.
Luckily, bugs are just fine if you happen to run a company that builds voting machines, such as Diebold. And if you think that elections aren't in the same category as air traffic control, I suggest you take a tour of Iraq. Elections are very important for your continued existence upon the earth.
Electric Monkey Pants
When you're writing software for an air traffic control system, military avionics software, or an authentication system for the NSA, the delivered code can't afford to have bugs
I've been in this industry for quite some time and let me be the first to say that I wish I could repeat this sentence with a straight face.
That was my first thought, particularly with military avionics. A few years ago they put out a hardware/software update for the ENS system (Enhanced Navigation System) which led to frequent crashing... and it took over a year for them to come out with a message saying that it was a bug and not to waste countless man hours trying to repair it.
It's sort of a new concept, though, as I'd never really seen such problems with traditional avionics systems (non glass-cockpit stuff). I've always attributed it to people being used to the behavior of MS Windows. And I'm not saying that to start a flamewar. I'm serious. Unreliable avionics systems should be unacceptable, but these days, that doesn't seem to be the case.
Usually when the software and the phrases "life support" or "nuclear weapons" are together in the same sentence.
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
The Master Money server done by Praxis was done Fixed Price, and with a warranty that says Praxis would fix any bug discovered over the next 10 years -for free-.
How many of you would be willing to place that kind of warranty on YOUR CODE?
dave (who's tried SPARK and liked it a lot, although proofs are much harder than they should be...)
In the world of software development, there have come to be two de facto models.
1. Get the software out the door ASAP - quite simply, bang out code as fast as possible that meets a loosely defined specification. Then once the product is adopted, parachute help in like no tomorrow to steadily improve the product.
2. Engineer the software - not as simple as it sounds. This requires that a specification be drawn. A plan be prepared. A team of solid engineers formed and led by a competent manager. Then, throughout the entire development cycle, test and debug code.
My company does the latter and to date we have retained 100% of our customers. I'm shocked by the number of developers that approach our company for jobs that don't have the first clue about how to even write a test harness, let alone do any real debugging. Then again, they don't teach much of that stuff in school and it seems that unless your role was specifically in testing at a previous job, you're not going to have too much experience in that area. It's economics and marketing that put the bugs in software, not computer science.
Linux, Firefox, and OpenOffice are some of the best software on the planet. I think is a good practical testament to the OSS philosophy.
And yet they all still suffer from a metric crapload of bugs. Praxis produces software with so few bugs that they are willing to provide a warranty that says they'll fix any bug found within the first 10 years, for free. If their software had the defect rate of Firefox or OpenOffice they'd be bankrupt in short order.
control structures != Turing complete. You can have loops as long as they have constant maximum bounds. Whatever it happens to be that you mean when you say "Nand is Turing complete" it makes no sense when you actually typed it. "turing-complete (for arithmetic)." makes no sense at all. WTF? Someone failed CS 315.
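To make the bounded-loop point concrete, here is a minimal Python sketch (not SPARK; the names `MAX_STEPS` and `bounded_gcd` and the cap of 64 are made up for illustration). The loop can run at most a constant number of times, so termination is obvious by inspection and never depends on the inputs.

```python
MAX_STEPS = 64  # hypothetical constant bound, chosen only for illustration


def bounded_gcd(a: int, b: int) -> int:
    """Euclid's algorithm with a hard cap on the number of iterations.

    The for-loop executes at most MAX_STEPS times regardless of input.
    If the cap is reached before an answer is found, the function fails
    loudly instead of continuing to loop.
    """
    a, b = abs(a), abs(b)
    for _ in range(MAX_STEPS):
        if b == 0:
            return a
        a, b = b, a % b
    if b == 0:  # the final iteration may have just finished the job
        return a
    raise ValueError("iteration bound exceeded")


print(bounded_gcd(48, 18))
```

The price is that some inputs are rejected rather than computed, which is the trade being described: you give up generality to get a termination argument that fits on one line.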
autopr0n is like, down and stuff.
unlimited risk can be an incentive too.
Professor Middlebrook at Caltech was an innovator in an unusual field: satellite electronics. Since no repairman was coming, they wanted robust electronics. He designed circuits in which any component could fail as an open or a short and it would remain in spec. You know that's a remarkable achievement if you've ever designed a circuit before. Notably you can't really do this using SPICE. SPICE will tell you what something does but not how to design it. To do that you need a really good sense of approximations of the mathematical formula a circuit represents to see which components are coupled in which terms. And you need one more trick: the ability to put in a new element bridging any two points and quickly see how it affects the circuit in the presence of feedback. To do that he invented the "extra element theorem" which allows you to compute this in analytic form from just a couple simple calculations. They still don't teach this in standard courses yet. You can find it in Vorperian's textbook, but that's it. If you want to learn it you gotta either go to the original research articles from the 70s.
Some drink at the fountain of knowledge. Others just gargle.
I was at an X windows technical conference many years ago when someone gave a presentation on X with Ada. When the speaker mentioned that it was for an air traffic control application, there was a sharp intake of breath all around the audience, most of whom had flown in for the meeting.
30 LOC is net. You spend the first 45% of a high-reliability project doing the design work, and the last 45% doing the verification. The 10% in the middle is code generation.
These guys seem to be claiming they can reduce redundancy in the design work, and rework in the verification work. They're doing it by using a design-description method that prevents ambiguity (and therefore using a team that is TRAINED to write unambiguous requirements, so their magic language may not be the key), a coding method that avoids unprovable structure (and probably eliminates a lot of other sorts of flexibility), and a verification method that first validates the design and then verifies the code as it's produced (no new value there as everything has to be touched at least once anyway, and if a big bug turns up that causes a lot of code to be redone you have to redo formal verification on those units again; something that's less likely if formal verification is delayed until full-alpha code is demonstrated, having been informally verified along the way).
Their claims of massive error reduction are, at best, anecdotal. Let's see them do this after taking over a half-coded project with minimal design requirements, a hard deadline, and a budget that can be cut by governmental forces at will.
TFA cites a particular NSA biometric identification program which has "0.00" errors per KSLOC.
Now, this got me thinking. It is completely possible for a biometric identification program to identify two different individuals as the same person (like identical twins), or for it to give a false negative identification (dirt on a lens, etc). Is this a bug? The code is perfect: no memory leaks, the thing never halts or crashes or segfaults, all the functions return what they should given what they are.
I think the popular definition of "bug" tends to catch too many fish, in that it seems to include all the behaviors a computer has when the user "didn't expect that output," what a more technical person might call a "misfeature." TFA outlines a working pattern to avoid coding errors, not user interface burps -- like for example, giving a yes/no result for a biometric scan, when in fact it's a question of probabilities and the operator might need to know the probabilities. Such omissions (the end user would call this a 'bug') are solved through good QA and beta-testing, but TFA makes no mention of either of these things, and seems to think that good coding is the art of making sure you never dereference a pointer after free()'ing it. It does mention formal specification, but that is only half the job, and a lot of problems only become clear when you have the running app in front of you.
Discussion of TFA has its place, but it promises zero-defect programming, which is impossible without working with the users.
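As a rough illustration of the yes/no versus probabilities point, here is a small Python sketch. Everything in it is hypothetical: `MatchResult`, `report`, the score scale and the threshold are invented for the example and have nothing to do with the NSA system in TFA. The idea is only that the result carries the score and threshold along with the verdict, so the operator sees how close the call was.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchResult:
    """Outcome of a (hypothetical) biometric comparison."""
    score: float      # similarity in [0, 1] from some matcher, assumed given
    threshold: float  # acceptance cut-off, assumed to be set by policy

    @property
    def accepted(self) -> bool:
        # The binary decision is derived from the score, not stored alone.
        return self.score >= self.threshold


def report(result: MatchResult) -> str:
    """Format the verdict together with the numbers behind it."""
    verdict = "ACCEPT" if result.accepted else "REJECT"
    return f"{verdict} (score {result.score:.2f}, threshold {result.threshold:.2f})"


print(report(MatchResult(score=0.91, threshold=0.90)))
print(report(MatchResult(score=0.42, threshold=0.90)))
```

Both versions can be equally "bug free" in the coding sense; which one the operator actually needs is a requirements question.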
Don't blame me, I voted for Baltar.
The site is slashdotted at the moment, so I can't read the article.
A good example of people writing complex but bug-free software under time pressure is the annual ICFP Programming Contest. This contest runs over three days, the tasks are complex enough that you usually need to write 2000 - 3000 lines of code to tackle them, and the very first thing the judges do is to throw corner-cases at the programs in an effort to find bugs. Any incorrect result or crash and you're out of the contest instantly. After that, the winner is generally the highest-performing of the correct programs.
Each year, up to 90% of the entries are eliminated in the first round due to bugs, usually including almost all the programs written in C and C++ and Java. Occasionally, a C++ program will get through and may do well -- even win, as in 2003 when you didn't actually submit your program but ran it yourself (so it never saw data you didn't have a chance to fix it for). But most of the prize getters year after year seem to use one of three not-yet-mainstream languages:
- Dylan
- Haskell
- OCaml
You can argue about why, and about which of these three is the best, or which of them is more usable by mortals (I pick Dylan), but all of them are very expressive languages with uncluttered code (compared to C++ or Java), completely type-safe, produce fast compiled code, and use garbage collection.
I have some beef to pick with the article:
1. It alleges that CMM5 organizations have about 1 defect/KLOC. Having worked in and knowing such organizations, I can anecdotally confirm numbers like these are fiction. CMM5 certification has more to do with greasing palms than with any absolute defect measurement.
2. A defect rate of 0.04 bugs/KLOC is not zero bugs/KLOC. The difference is infinite in magnitude if that single bug is one that kills the user.
3. Low defect rates are more often a product of poor testing, not superior development.
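For scale, here is the arithmetic behind the two rates mentioned above, as a tiny Python sketch. The 100 KLOC system size is an assumption picked for the example, not a figure from the article, and `expected_defects` is just a made-up helper name.

```python
def expected_defects(defects_per_kloc: float, kloc: float) -> float:
    """Expected number of delivered defects for a given density and size."""
    return defects_per_kloc * kloc


SYSTEM_KLOC = 100  # assumed system size, for illustration only

for label, rate in [("1 defect/KLOC", 1.0), ("0.04 defects/KLOC", 0.04)]:
    count = expected_defects(rate, SYSTEM_KLOC)
    print(f"{label}: about {count:.0f} defects in {SYSTEM_KLOC} KLOC")
```

Under that assumed size, the lower rate still leaves a handful of defects in the delivered system, which is the "not zero" point being made.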
Their claims of massive error reduction are, at best, anecdotal. Let's see them do this after taking over a half-coded project with minimal design requirements, a hard deadline, and a budget that can be cut by governmental forces at will.
Their claims of error reduction are based on the development method, and a lot of the important stuff happens very early on; taking over a half-finished project that failed to follow such a method is of course not going to work. They can't make existing code bug free, but they can write new code that has vastly fewer errors than most software. As to hard deadlines and budgets - as far as I am aware Praxis already works with deadlines, and apparently their project for Mastercard was done on a fixed flat fee, so working with fixed or limited budgets doesn't appear to be an issue either.
Jedidiah.
Craft Beer Programming T-shirts
The end result - In a year, no one will remember that you were 6 months late - make a buggy release and in a year EVERYONE will remember the buggy release.
Why I always have time to do it over, and never the time to do it right in the first place
I have mod points and I am not afraid to use them
If operating systems ran airlines:
UNIX Airways: Everyone brings one piece of the plane along when they
come to the airport. They all go out on the runway and put the plane
together piece by piece, arguing non-stop about what kind of plane they
are supposed to be building.
Mac Airlines: All the airline personnel look and act exactly the same.
Every time you ask questions about details you are gently but firmly told
that you don't need to know, don't want to know, and everything will be
done for you without your ever having to know, so just shut up.
Windows Air: The terminal is pretty and colorful, with friendly stewards,
easy baggage check and boarding and a smooth take off. After about 10
minutes in the air the plane explodes with no warning whatsoever.
Windows NT Air: Just like Windows Air, but costs more, and uses much
bigger planes, and takes out all other planes in a 40 mile radius when it
explodes.
Linux Air: Disgruntled employees of all other OS Airlines, (with UNIX
geeks who finally figured out what kind of plane they were supposed to be
building) decide to start their own airline. They build the planes,
ticket counters, and pave the runways themselves. They charge a small fee
to cover the cost of printing the ticket, but you can also download the
ticket and print it yourself. When you board the plane you are given a
seat, four bolts, a wrench, and a copy of the Seat-HOWTO.html. Once
settled, the fully adjustable seat is very comfortable, the plane leaves
and arrives on time without problems, and the in-flight meal is
wonderful. You try to tell the customers of the other airlines about the
great trip, but all they can say is, "You had to what with the seat?"
with apologies to Doc Searls and Linux Journal.
Professional Politicians are not the solution, they ARE the problem.
I've been a controller for 13 years and have worked in the automation end of things for almost 4 years now. There is NO SUCH THING as bug-free Air Traffic Control software. The best we can hope for is heterogeneous redundancy and non-simultaneous failures. Some engineers seriously think they could replace all those controllers with an intelligent algorithm. What really scares me is that the more they try, the less engaged the people become and the harder it is for them to fall back to manual procedures when the worst happens.
Everyone used to laugh at how Windows NT could only run for 34 days before it needed a reboot. Some of our systems can't run HALF that long without needing a server switch-over or complete cold-start.
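One common way to get value out of heterogeneous redundancy is majority voting across independently built channels. A minimal Python sketch follows; `vote` and the sample readings are hypothetical and not taken from any real ATC system. It only helps when the channels do not fail the same way at the same time, which is the non-simultaneous-failure hope described above.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value a strict majority of channels agree on, else None.

    None means "no consensus", which a real system would treat as a signal
    to fall back to another procedure rather than guess.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count * 2 > len(outputs) else None


# Three independently implemented channels reporting an altitude (made-up data).
print(vote([350, 350, 340]))  # two of three agree
print(vote([350, 340, 330]))  # no majority
```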
So, if this toolset and methodology are so good, I have to wonder why it does not get more widespread use? According to their info, it was developed in the 70's and 80's, so it's not new. And why is software so buggy, with such a lousy reputation, anyway? Not to start a flamewar, but let's just list a few possible "reasons" here:
.... so, are software vendors a bunch of irresponsible kids that need constant monitoring?
1. Why aren't schools teaching this methodology thoroughly? Why aren't this toolset and programming language taught in school by default? I learned a bit of Ada at school, but that was only part of a comparison of programming language designs. So, are schools to be blamed? Or do those profs not know better? Why aren't proper engineering methodologies emphasized?
2. Someone developed a nice methodology, with a nice toolset and programming language, and got greedy and made it too expensive to acquire. Other tools are good enough, and the resulting softwares are acceptable to the market, so, this nice thing never got widespread use.
3. Programmers are asked to do the impossible. We (I include myself here) had to work with customers who don't know what they want, only give very fuzzy requirements (Praxis's customers, from their list, are different kind of animals, and they probably know better than most of the customers we had to work with), and even if we lay out the whole detailed plan in front of them, they still don't know what they want. They will agree to the plan, sign and approve it, and until you have completed the whole system according to the plan, they would ask to redo the whole thing. If a customer dares to ask a civil engineer to add 2 more stories between the 3rd and 4th floor after the custom-built building is finished, guess what would the civil engineer say? Programmers are asked to do this all the time (I know I have been asked to), so are customers to blame? You can't get the system done properly if requirements are shifting all the time.
4. Programmers are a bunch of bozos who know shit about proper engineering. Yeah, I can take the blame, I've been programming for over a decade, and I know how programmers work: methodologies are for pimps! If a bridge engineer can't tell or prove how much load the bridge can take, I'm sure people would tell him/her that s/he has no business in building bridge.
5. Customers of packaged software would buy a buggy product to save one buck anyway, so why would vendors put in extra effort and cost to make it better? Look at the market: a lot of good software didn't survive, and sometimes the worst of the line prospered (no naming here!). So people get what they asked for.
6. Customers (even custom-built project customers) are a bunch of cheap folks, they would go with the lowest price, no matter what. Praxis's customers are willing to pay 50% more for quality work, how many of your customers are willing to? We are willing to fix our bugs, free of charge, for the first 10 years too, if our customers are willing to pay 50% more than the market rate for quality work. But so far, I've never met one such customer yet. Granted, I don't work in the defense industry. So, don't blame us for lousy work, if customers try to squeeze every single buck out of it. And in China (and some other countries too), you have to pay a huge amount in kickbacks too, sometimes as high as 80% of the project's budget.
7. Software vendors are a bunch of greedy bastards, they put buggy software on the market, without having to accept any responsibility (just read your EULA!). Industry problem or government problem? Not enough regulations (for safety, for usability, etc)? Other industries seem to do ok, e.g. medical, civil,
8. The industry is developing too fast, people are chasing the coolest, hippest, most buzzword-sounding technologies. No one gives a shit about "real engineering". And there is simply too much to learn too, in how many industries can you say people are required to master that much technologi
"If you want to learn it you gotta either go to the original research articles from the 70s."
http://www.edn.com/archives/1995/080395/16df4.htm
"The extra element theorem is used for analog circuitry. The gist of it is that you remove the reactive elements ( or dependant sources ) from a circuit and then put them back in through a process of correction factors."
[n-extra element theorem]
http://ece-www.colorado.edu/~ecen5807/course_material/nEET.pdf
[Middlebrook's extra element theorem]
http://ece-www.colorado.edu/~ecen5807/course_material/slidesAppC.pdf
In the end, software companies are in it for the profits. They have no lemon laws to respect, they have no trades description act to obey, no ombudsmen to answer to, no consumer rights groups to speak of, no Government-imposed standards certification and virtually no significant competition. Customers are often infinitely patient and completely ignorant of what they should be getting - the machines are like Gods and the software salesmen are their High Priests. To question is to be smote.
Were standards to be mandated - perhaps formal methods for design, OR quality certification of the end result, you would see no real impact on net software costs. Starting costs would go up, but long-term costs would go down.
Nor would you see any serious impact on variety - if anything, there is a greater range of car manufacturer and design today than there was in the 50s and 60s when cars had the unnerving habit of exploding for no apparent reason.
What you'd see is a decline in stupid bugs, a decline in bloat, an increase in modularity, possibly a reduction in latency and a move from upgrades to fix things that SHOULD have worked in the first place to enhancing things that can be relied upon to CONTINUE working after the patches.
Money would not be made by selling the same product with a different set of defects to the same market, money would be made by always going beyond last year's horizons. The same way most manufacturers, from cars to camping gear to remote control aircraft to air conditioning units to microwave ovens to home stereo manufacturers have all been doing - very successfully - for a very long time.
The IT industry isn't going to change in the foreseeable future, the only way we'll see change in our lifetimes is if it is imposed on the Pointy Haired Bosses. We could easily see 99.9% reliable software, with no additional cost, in our homes in a year, with the lack of constant fixes actually saving money. And that's why it won't happen. Not because the IT corporations are mean, thuggish and ogreish - they are, it just isn't why it won't happen.
It won't happen because they're geared both towards the profit motive and towards the outdated notion that the market is tiny. (That last part was true - in the 1950s, when entire countries might have three or four computers in total, operating in two, maybe three different capacities. You can understand a desire to go after the after-sales service, when there simply isn't anything else left to do.)
Today, Microsoft's Windows resides on 98% of the desktop computers, but because of the support system needed to run the damn things, 98% of the world's population didn't have significant access to one. Ok, putrid green is a lousy colour, but the idea of clockwork near-indestructible laptops that - in theory - could be built to weigh 5 lbs or less and run high-end, intensive applications is beginning to filter through to the brain-dead we call politicians.
You think someone in the middle of Ethiopia who is fluent only in their native tongue is going to want to pay for telephone technical support from someone in India, in order to figure out why their machine keeps locking up?
When computing is truly available to the masses (ie: when even a long-forgotten South American tribe can reasonably gain access to one), the ONLY way it can be remotely practical is if said South American can look forward to a reliable, usable, practical experience where all usage can be inferred from first principles, and where NO software service calls are required to get things to work, ONLY required to get more things for working with.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
http://en.wikipedia.org/wiki/IEFBR14
IEFBR14 is sort of an executable version of /dev/null. It returns immediately when called. It had four or five bug reports filed against it.
IBM could of course write a defect-free return statement. All the bugs were requirements drift that Praxis could not have prevented.
This article couldn't have been more coincidental to my current project: I've been re-reading James Martin's books, "Application Development without Programmers" and "System Design from Provably Correct Constructs", with the goal of selecting a method to program mechanical devices.
Martin's thesis, and remember this was back in the 70's and early 80's, was that the program should be generated from a specification of WHAT the program was to do, rather than trying to translate faulty specifications into code telling the computer HOW to do it. (Trust me, that poor sentence does not come close to describing the clarity of purpose in Martin's books.) Martin proposes that a specification language can be rigid enough to generate provably correct programs by combining a few provably correct structures into provably correct libraries from which to derive provably correct systems.
The definition of the time, HOS (for Higher-Order Software) was actually used by a company called HOS, Inc.(!), and apparently worked pretty well. Many of the constructive ideas were included in OOP and UML, but ideally, if I understand the concept properly, it would be equivalent to generating software mostly from Use-Case analysis. There are similar approaches in MDD and MDA methodologies. I wonder what ever became of the HOS,Inc. and the HOS methods? It looks like they had a handle on round-trip software engineering in the 80's.
OK, why would this be a good thing? Well, for one thing, computational/programmable devices are proliferating at a tremendous rate, and while we can engineer a new device in 3 to 6 months, the programs for the device take 18 months to 3 years (if they are finished at all). Hardware development has greatly outpaced software development, by some estimations a 100x difference in capacity...yet they are built on the same fundamental logic!
The best argument, IMO, is that since larger systems are logarithmically more complex, and since it is impossible to completely test even intermediately complex systems, it will require provably correct modules or objects to produce dependable systems. If the code is generated from provably correct components, then the system only has to be tested for correct outputs.
Furthermore, code generated from provably correct components can allow machinery and devices to adapt in a provably correct way by rigorously specifying the outputs and changes to the outputs.
Praxis is on a roll. The methodology employed is probably more important than the genius of the programmers. It should get better, though. The most mediocre Engineer today can produce better devices than the brilliant Engineers of 30 years ago using tested off-the-shelf components. IMO, this is the direction programming should be taking.
"The mind works quicker than you think!"
Then their ability to produce bug-free code depends, as usual, on control factors, not on real-world engineering.
In as much as a civil engineer depends on control factors via refusing customers who demand that the building have 6 stories not 4 just one month before construction is due to finish, yes. Real world engineering makes certain demands of the client. Someone who wants to build a treehouse for their kids doesn't consult an architect and a civil engineer, and civil engineers don't take contracts from people who refuse to set out some limits on what they want built, and what they expect of it.
Praxis uses solid engineering. Their "Correct by Construction" approach is solidly grounded in axiomatic mathematics and uses similar sorts of formal calculations and logical and mathematical proofs as you might expect to see from civil, electrical, aerospace, or any other kind of engineers. Take the time to read sample chapters from the SPARK book to get an idea of exactly what they are doing. There is very definitely quite solid engineering going on.
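For a rough feel of what contract-style code looks like, here is a minimal Python sketch. It is not SPARK and not Praxis's method: `saturating_add` and its limit are invented for the example, and the assertions only check the contract at run time, whereas the whole point of the proof-based approach is to discharge such conditions before the program ever runs.

```python
def saturating_add(x: int, y: int, limit: int = 255) -> int:
    """Add two values in [0, limit], clamping the result at limit."""
    # Precondition: both inputs are within the declared range.
    assert 0 <= x <= limit and 0 <= y <= limit, "precondition violated"

    result = min(x + y, limit)

    # Postconditions: the result stays in range, and it is either the
    # exact sum or the clamp value.
    assert 0 <= result <= limit
    assert result == x + y or result == limit
    return result


print(saturating_add(1, 2))
print(saturating_add(200, 100))
```

Writing the pre- and postconditions down is the cheap part; proving them for every possible input is where the real engineering effort goes.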
Screw funding. It's irrelevant.
Screw specifications. Nobody has them anyways.
Give me a clear, predefined spec, and I'll meet it. I'll guarantee bug fixes, too.
But that's not how software evolves.
Despite careful attention, despite voluminous meetings, emails, and specifications, I never get a clear idea what the client needs me to develop until AFTER a prototype has been built.
In fact, I'd wager that there's a quasi-quantum principle at work: You can either work towards the customer's actual needs, or the predefined, agreed upon specification/costs/specifications. Answering either means ignoring the other.
Consider this the Heisenberg Uncertainty principle. The software is half-dead, half-alive. Either it meets the needs of the customer (and associated scope creep, bugs, etc.) or the originally defined specification. Releasing the software defines whether the cat is dead or alive.
It seems that:
1) People will commit, in aggressive fashion, that they need something until they get it, at which point, they'll angrily point out all the flaws in it.
2) People don't actually know what they need until they see that what they have isn't it.
3) When you take anything produced because of (1), and then compare that to the feedback produced by (2), you end up with cases where the code is producing a result unexpected in the original design.
These are called bugs.
4) The only intelligent way to proceed with (1) and (2) is to consider software an iterative process, where (1) and (2) combine with (3) and lots of debugging to result in a usable product.
I have no problem with your religion until you decide it's reason to deprive others of the truth.
At the age of 17 I spent a week in the Praxis offices on "Work Experience" (Americans may think of this as a very short internship), to find out what developing software would be like as a career. This was a major formative event of my life: I thought that developing software sounded good, I really liked using Real Computers (multiuser, multiprocessing systems with powerful operating software, like VMS and SunOS), and the people impressed me greatly. It definitely set me on the path to the career in systems development and administration that I have today.
The person who made the biggest impression on me was the sysadmin. He got his own office-cube instead of having to share, he wore much more casual clothes and had a lot more hair and beard than most of the staff, he got to have big toys (several workstations, a LaserJet IIIsi big enough for an entire office that seemed to be his alone, etc) and he didn't seem to get much hassle from anyone else. This was obviously the job for me.
The sysadmin was obviously rather a BOFH. When I was sat at the UNIX workstation for the first time, and had poked around with basic file-handling commands, I asked "What's the text editor on this system?". He answered "emacs - e m a c s - here's a manual" and picked about 300 sheets of paper off the Laserjet and handed it to me.
I got to play with UNIX (SunOS), VMS, Oracle development environments. I still have the Emacs manual printout somewhere at home - it served me well when I went to University where printing anything out was charged by the sheet!
I'm very glad they're still around.
"For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled"