Are You Using Z-Notation to Validate Your Software?
ghopson asks: "Many years ago in the early nineties I became aware of a software definition language called Z, which was based on set theory. The idea was that a software system could be defined in 'Z', and then that system could be mathematically proven to be 100% correct (in other words, zero defects).What is the current state of play with 'zero defect' software development? I know Z still exists, that there is a software development method (the B-method) associated with it. Have techniques and advances such as MDA advanced zero-defect software development? Or would a set of robust unit tests suffice?"
For the practical world, unittests seem to be the way to go. A large number of people have been experiencing very great and real success, without having to themselves be absolute geniuses to use.
Provable programs, on the other hand, are typically so complicated for a real-world program that your provably bug-free program still has bugs, because it turns out your model is wrong, because it was written by a human, not a god.
I'm very unimpressed with the whole "provable program" crowd for any but the most trivial proofs; decades of claims, little to show for it, and all based on the rather wrong idea that if we just push the need for perfection back one level, somehow the computer can magically take up the slack and prevent the bugs from ever existing. The unittesting philosphy, that bugs will happen and it's more important to test and fix them, since complete prevention is impossible anyhow, is much more based in reality, and correspondingly works much better in the real world.
(Provably correct computing is one of those branches of technology that may produce something useful, but is unlikely to ever acheive its stated "ultimate goals"; I class pure q-bit based quantum computing and pure biologically-grown computing in here as well.)
``Beware of bugs in the above code; I have only proved it correct, not tried it.''
Donald E. Knuth
Felipe Hoffa
No, you're wrong. Please see my post on this topic. It is impossible to prove all programs correct for any useful definition of "correct".
There exist proof systems which you seem to be referring to that function by examining the complete state space of a given algorithm, and validate it against some criteria. This only works on algorithms that have a finite state space. Since Turing Machines are allowed to have arbitrarily long tapes, the state space for some algorithms can also be arbitrarily large, which to put it colloquially would require an infinite amount of time to validate.
This is still useful; communications protocols are useful programs that have a very easily characterizable finite state space, and have been validated to prove there are no deadlocks in them, but that's because communications protocols are typically very simple things with tight loops and a small handful of state variables; they are very much the exception, not the rule, and even these simple, ideal cases for state space examination spin out of control exponentially quickly as more state is added to them.
Ironically, both my undergrad and grad level Software Engineering classes spent a disproportionate amount of time on these system, which are only useful in such a small handful of cases, where our coverage of "testing" consisted mainly of regurgitating the traditional, and damn near useless outside of a classroom, definitions of black- and white-box testing. (Nearly all real testing is some shade or other of "gray"; this grudgingly admitted and immediately glossed over.) Pity the time was spent that way, it would have been much more useful to practice writing tests, not using those state-space provers; there's an art to test-writing and it would be much more useful for just about everybody except someone who is 100% academic. (I'm not; I'm as close to 50-50 as I've ever seen anyone, so I have virtually no use for the state-space provers; elegant in their niche, but I can't help but see it as a very, very small niche.)
uhm, how will that function ever exit?
If this is true, call our function again.
If not, call our function again.
Ah crap, infinite recursion.
In IBM mainframe land, there's a program called IEFBR14 designed to do nothing (don't ask, you DON'T want to know).
The usual story is that the first version had a bug report filed because, although it did indeed return, it didn't set the exit code.
Next, according to the conventional version, the linkage editor gagged because the source ended with an END directive instead of END IEFBR14.
With that fixed, the next bug report was that it didn't interoperate properly with the dump tools, in that it didn't stick the program name into a predictable area of the code.
Then came the bug report about chaining register save areas (again, you don't want to know -- it's related to the fact that the machines didn't have a stack).
Nor was that the end of it, a later bug report was that it wasn't being flagged as reentrant when it got linked.
That's five bug reports against a null program. All of them were integration problems and/or omissions in the specification.
There's a class of bugs formal methods can help with, but that class is a minority in most commercial work.
Just to restate some of your points in a different way...
:)
Z is an abstraction layer in the same way any specification is an abstraction layer.
You don't really want to ever go from what the user wants to C. Instead you have a design and specification. Then you validate that specification against what the user wants, code the C, and validate the C and specification.
This is done because it is easier to validate C against a written specification, rather than against some idea in the users head.
In the same way, in some cases it is easier validate the specification against Z, then validate Z against the C code, then validating directly.
I can't help but paraphrase Knuth's most famous phrase: Beware of this code for I have only prooved it to be correct, not tested it.
or something like that
I studied for my CS degree at the University of York in the early 90's, who - at the time - were really pushing Z and Ada as the stuff of the future. One thing that stood out from the formal specification course, at least in my mind, was the example Z specification for a function to swap the values of two variables. Basically :
:
swap(a,b)
tmp = a
a = b
b = tmp
The pseudocode is simple enough, but the Z spec took up a whole page. This put people off - if the smallest function in a minor subsystem of an application takes a whole page and much head-scratching to figure out, how long will the rest of the software take? Surely it's so time consuming as to be infeasible for real world use. Hardly any on the course really "got it", and I was one of the many who didn't. A friend of mine went on to use Z in his final year project, and at some point *ting* - he got it and found that it was good. I've never seen or heard of Z being used anywhere since, but some day I want to have another look at my notes and text books to see if it makes any more sense yet, and if so, whether it can help my work.
I don't know if they still advocate Z, but they started teaching Eiffel instead of Ada the year after I left..
Great I just passed an exam in this, but lets see if I still remember it..
First you we have to get the question right - are you talking about validation or verification of software? They are quite different from each other:
The above is a quote of Boehm from 1979 I think... can't remember the exact title.
As you can propably guess there is quite a difference betweem the two. Unit testing is a method for verification, but from your description Z sounds like validation to me.
--- Martin Skøtt aka. Qoumran
Unless there's been some radical change in the state of the art that I've missed hearing about in the last few years, there's a big problem with Z notation and other methods of forming provably correct programs.
Said killer problem is that all the popular programming languages have features which make it impossible to model them using denotational semantics--that is, to produce mathematical functions which represent the program behavior. For instance, no C-based language can be modeled.
That's not to say that denotational semantics is completely useless. It's used in certain narrow domains, such as hardware--there are numerous chip designs where the circuitry and microcode is proven correct.
However, there's no way you're going to see --check-correctness flags on gcc or javac any time soon.
Of course, the bigger problem with mathematical techniques for proving correctness is that there's no measurable market demand for bug-free software (outside of the few narrow fields alluded to above), and hence no money available to spend on paying highly skilled computer scientists to spend a long time developing provably correct software using obscure programming languages. Instead, everyone just hacks together some alpha-quality code using C++ and calls it version 1.0.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
The idea was that a software system could be defined in 'Z', and then that system could be mathematically proven to be 100% correct (in other words, zero defects).
I've heard of a different idea. Once you define the problem in this language called "C", the software is automatically written by this device called a "compiler", and it is already proven to be 100% correct.
I don't mean to be glib by making such a statement as "software is still voodoo", but after reading this thread, it is quite obvious I think that clear understanding of our art is achieved by far too few practitioners, and that is the crux of the problem of software quality.
While provably correct solutions are presently unrealistic, we have myriads of tools to assist in the process of creating high quality software. Specification tools like Z, modelling languages, coverage analysis and profiling software, unit testing and regression prevention methodologies, runtime assertions---all are proven to increase and control quality. Moreover, language constructs such as taxonomic class hierarchies with crisp declarations can prevent many logical or semantic flaws at compile time, but only if used by class designers who understand good architecture and the importance of minimizing conceptual baggage and side-effects.
Yet, I live in a world where I regularly encounter professional developers who do not use version control and move from idea to implementation quickly without adequate (or sometimes without any) specificiations at all. These are not amateur hackers, but well-paid professionals who happen to be working on websites which accept credit cards and banking details, among others.
While many of us can understand and use quality-assurance tools and processes, there are just as many (perhaps more) that rely solely on their intuitive ability to "sling code", and worse, are arrogant in their belief that time spent in formal specification and analysis is a waste of time.
Arguing that particular specification tools are useful is fine, but don't entertain for a moment the belief that the larger problem of software quality is even vaguely related to a lack of theoretical foundation for quality. The theoretical foundation is alive, well, and rich with solutions.
The problem is people, education, and professional standards.
Until there are employment and "best practice" requirements for software engineers which are as rigorous and well-accepted as those for other engineering disciplines, we will continue to unacceptably low quality in the software we use, with an incredible corresponding waste in time and human effort.