When Bugs Aren't Allowed
Coryoth writes "When you're writing software for an air traffic control system, military avionics software, or an authentication system for the NSA, the delivered code can't afford to have bugs. Praxis High Integrity Systems, who were the feature of a recent IEEE article, write exactly that kind of software. In "Correctness by Construction: A Manifesto for High-Integrity Software" developers from Praxis discuss their development method, explaining how they manage such a low defect rate, and how they can still maintain very high developer productivity rates using a more agile development method than the rigid processes usually associated with high-integrity software development."
And yet the reports I've seen on Praxis claim costs and schedules the same or less than the development of software of similar complexity...
When you're writing software for an air traffic control system, military avionics software, or an authentication system for the NSA, the delivered code can't afford to have bugs
I've been in this industry for quite some time and let me be the first to say that I wish I could repeat this sentence with a straight face.
That was my first thought, particularly with military avionics. A few years ago they put out a hardware/software update for the ENS system (Enhanced Navigation System) which led to frequent crashing... and it took over a year for them to come out with a message saying that it was a bug and not to waste countless man hours trying to repair it.
It's sort of a new concept, though, as I'd never really seen such problems with traditional avionics systems (non glass-cockpit stuff). I've always attributed it to people being used to the behavior of MS Windows. And I'm not saying that to start a flamewar. I'm serious. Unreliable avionics systems should be unacceptable, but these days, that doesn't seem to be the case.
The Master Money server done by Praxis was done Fixed Price, and with a warranty that says Praxis would fix any bug discovered over the net 10 years -for free-.
How many of you would be willing to place that kind of warranty on YOUR CODE?
dave (who's tried SPARK and liked it a lot, although proofs are much harder than they should be...)
No one can ever make something that is completely "bug-free" - even in traditional, non-software disciplines. All you can do is make the probability that the system will work as high as possible. Praxis has some techniques that can help developers create software with a much higher probability of working correctly than it would otherwise have. That's a good thing, even if it doesn't result in perfection.
Its buffer overflows and flawed design that has not been tested with every concievable input/output that causes most serious bugs in medical and aerospace applications.
It's the fact that Praxis relies on static checking far beyond anything you've ever seen (using a carefully designed subset of Ada that can be statically checked in all sorts of interesting ways) that helps to ameliorate this problem, since the static check is effectively equivalent to an exhaustive input/output test.
unlimited risk can be an incentive too.
Professor Middlebrook at caltech was an innovator in an unusual field. Sattelite electronics. Since no repairman was coming they wanted robust electronics. He desigined circuits in which any component could fail as an open or a short and it would remain in spec. You know that's a remarkable achievement if you've ever desinged a circuit before. Notably you can't really do this using SPICE. Speice will tell you what comething does but not how to design it. To do that you need a really good sense of approximations of the mathematical formula a circuit represents to see which components are coupled in which terms. And you need one more trick. The ability to put in a new element bridging any two points and quickly see how it affects the cicuit in the presence of feedback. To do that he invented the "extra element theorem" which allows you to compute this in analytic form from just a couple simple calculations. They still don't teach this in stardard courses yet. You can find it in Vorperians text book, but that's it. If you want to learn it you gotta either go to the original research articles from the 70s.
Some drink at the fountain of knowledge. Others just gargle.
The site is slashdotted at the moment, so I can't read the article.
A good example of people writing complex but bug-free software under time pressure is the annual ICFP Programming Contest. This contest runs over three days, the tasks are complex enough that you usually need to write 2000 - 3000 lines of code to tackle them, and the very first thing the judges do is to throw corner-cases at the programs in an effort to find bugs. Any incorrect result or crash and you're out of the contest instantly. After that, the winner is generally the highest-performing of the correct programs.
Each year, up to 90% of the entries are eliminated in the first round due to bugs, usually including almost all the programs written in C and C++ and Java. Ocassionally, a C++ program will get through and may do well -- even win, as in 2003 when you didn't actually submit your program but ran it yourself (so it never saw data you didn't have a chance to fix it for). But most of the prize getters year after year seem to use one of three not-yet-mainstream languages:
- Dylan
- Haskell
- OCaml
You can argue about why, and about which of these three is the best, or which of them is more usable by mortals (I pick Dylan), but all of them are very expressive languages with uncluttered code (compared to C++ or Java), completely type-safe, produce fast compiled code, and use garbage collection.
The end result - In a year, no one will remember that you were 6 months late - make a buggy release and in a year EVERYONE will remember the buggy release.
Why I always have time to do it over, and never the time to do it right in the first place
I have mod points and I am not afraid to use them
Easier actually.
http://www.rockwellcollins.com/news/page6237.html
Rockwell collins designed a secure processor for the NSA that PROVABLY can run multiple threads where each thread can get no information about the others. That way they can run processes with different clearance levels on the same CPU.
Model checking can make you big bucks if you can find the customers to pay for it.
bash-2.04$
bash-2.04$yes "Don't you hate dialup connections?"| write USERNAME
You can't staff a whole large application development project with the best gurus: there aren't enough out there in the world.
And why aren't there? Geeks lament the fall of IT and Computer Science programs at institutes of higher learning, and wonder why people don't want to go into these fields. If there was demand for programmers of the caliber you mention and companies willing to pay salaries deserving of such abilities there would be more people studying towards such a position.
But if companies are going to just throw up their hands and say "we can't hire competent people, there aren't enough of them in the world, they only doom themselves to a continued shortfall in talent, and an increase in buggy software.
Nursing is a profession that is always looking for new people and folks who didn't make it through the four-year grind back when they were young are flocking to it because the jobs are waiting. If companies held their coders to a higher standard, they would trim some of the fat in their projects and have jobs open for people willing to do the work. The result would be more productive teams and better applications, a win-win situation.
I've been a controller for 13 years and have worked in the automation end of things for almost 4 years now. There is NO SUCH THING as bug-free Air Traffic Control software. The best we can hope for is heterogenous redundancy and non-simultaneous failures. Some engineers seriously think they could replace all those controllers with an intelligent algorhythm. What really scares me is that the more they try, the less engaged the people become and the harder it is for them to fall back to manual procedures when the worst happens.
Everyone used to laugh at how Windows NT could only run for 34 days before it needed a reboot. Some of our systems can't run HALF that long without needing a server switch-over or complete cold-start.
http://en.wikipedia.org/wiki/IEFBR14
/dev/null. It returns immediately when called. It had four or five bug reports filed against it.
IEFBR14 is sort of an executable version of
IBM could of course write a defect-free return statement. All the bugs were requirements drift that Praxis could not have prevented.