Are You Using Z-Notation to Validate Your Software?
ghopson asks: "Many years ago in the early nineties I became aware of a software definition language called Z, which was based on set theory. The idea was that a software system could be defined in 'Z', and then that system could be mathematically proven to be 100% correct (in other words, zero defects).What is the current state of play with 'zero defect' software development? I know Z still exists, that there is a software development method (the B-method) associated with it. Have techniques and advances such as MDA advanced zero-defect software development? Or would a set of robust unit tests suffice?"
Your question is about Formal Methods. This field in computer science seems more popular in Europe than in the US theese days.
Two good articles you should read :
If your question was about their use in practical projects in industry, I've heard recently of teams doing serious work for Airbus (safety-critical embedded software using either B or Esterel) or for Gemplus (entirely proven JVM for javacards using B).
Some people here don't seem to understand the usefulness of Z.
...
...
:-)= ),
When developing software the process might look something like this:
Specification (created with or by client)
Design
Implementation
Testing
Typically a specification could be a great big document written in english. It could have ambiguities, it could have subtle contradictions, it could be missing vital information.
If the specification is written in Z, then there are tools that let you:
- Find ambiguities or contradictions.
- Prove that some fact of the system necessarily follows from some other fact.
- Prove whether assumptions are correct.
- Test what results are possible given certain inputs and sequences of operations.
There are processes for, step by step, turning a specification into something more like real code, while proving that the two forms mean the same thing (apart from whatever limits are introduced). So, for example the transformed specification might change a variable from being an integer to being an integer in the range 0..65535, and you can test what effects this has.
All of this can be done before writing a single line of actual program.
Yes mistakes are still possible, but if done right, they are significantly less likely; and unlikely to be due to a misunderstanding about the capabilities of some part of the system.
Z can also help in the process of testing an implementation. From the specification you can see that, for example, the number 50 is the input value at which the behaviour of a function significantly changes, so you know to produce test sets checking the input of 50 and values on either side. This doesn't _have_ to be done by hand; my thesis supervisor is part of a project to automatically generate test sets from B and Z specifications.
So creating a Z specification:
-mean more work before coding starts (I think this is what makes most people uncomfortable with it
-forces you to be very specific about what correct behaviour is,
-allows you to test properties of the system before it is created,
-can help to make more useful test sets.
(Disclaimer: My Master's thesis involves making tools for Z, so I could be biased; but it also means I probably know more than average about Z).
bits and peace
Nicholas Daley
Actually, I think the real kicker is the inverse of the problem you describe. It may very well be possible to prove that your C program properly conforms to the Z model. But is your Z model correct?
For many real-world problems, figuring out *exactly* what you want your program to do in all situations (and figuring out what all the situations are in the first place) is the hard part. You can then go implement that in C almost as easily as you can in Z.
I'm not saying this (Z and B) is useless. I just want to focus on the real problem. Using a modeling language just introduces one additional layer of abstraction into the process. While this may make the system easier to discuss and easier to "prove correct", it also introduces another layer at which errors may creep into the system. For some cases, the extra layer introduces fewer errors than it prevents. For other cases, the opposite is true.
I think to some extent it comes down to "how much time are we going to spend making sure this software is darn near perfect." The law of diminishing returns says that the number of flaws fixed per unit of programmer time will go down as more programmer time is applied, providing a "cost to flaw" curve. Customers also have a curve of how much they are willing to pay for additional reliability. The curves meet at some price point/quality level (keeping features constant), and that is where the product will ship. If the customer's need for extra reliability is large enough that the modeling phase can be integrated properly into the product schedule, it becomes worthwhile to spend time with formal models; otherwise, these will only decrease reliability since the modeling phase eats into the time needed for other phases to properly implement the product.
In other words, I think that a formal modeling phase is overkill for most requirements ("doesn't crash too often") since it has less bang for the buck as far as generating features and fixing glaring bugs. But it continues to have at least some bang for the buck long after other processes drop off in effectiveness. So it is worthwhile to add formal modeling when unit testing isn't going to get you to your desired flaw rate.
Time flies like an arrow. Fruit flies like a banana.
There are points at which you have to help the software along, tell it which strategy to use, or give it a half-way point to prove first.
If the assumptions being tested are provable, then it is valid to make those assumptions. If you can't prove them, then you don't really know that they are correct, and you shouldn't be making them.
Some pieces of code it is possible to prove whether it halts or not. It is possible to prove that:
while(true);
does not halt.
It is possible to prove that:
for(int i=0;i<10;i++)cout <<i<<endl;
does halt.
Similarly, as you say, there are other facts it is not possible to prove about a system. However if these facts need to be true for your system to be working properly, and they can't be proved, then there is something wrong with the system. e.g. If it is not possible to prove that your hello world program outputs "Hello World!", then it should be rewritten/fixed so that this can be proved. Otherwise the program is faulty, and may not do its job.
bits and peace
Nicholas Daley
So what happens when someone is programming in some language/dev-system that won't permit non-correct program ta be wrotted? Eh?
Ironically, it's provable that no Turing Complete language can be limited to create only correct programs for any non-trivial definition of correctness. Computer science is full of such fun things.
Proof: In your supposed language, there is a set of "correct" programs for some criterion, and a set of incorrect programs.
(Sub-proof: The program "return 1" is only correct for the problem consisting of "Always return true" and isomorphic problems, and "return 0" is only correct for the problem consisting of "Always return false" and isomorphic problems. Thus, for any given correct behavior there exists at least one incorrect program in your language, OR your language contains only a trivial number of programs and as such is hardly deserving to be called a "language".)
If there does not exist a Turing Machine (read: 'program') that can decide in finite time whether a given program meets a correctness criterion, then we can never decide whether a program is correct, because Turing Machines are the best computational tools we have. Therefore, suppose some Turing Machine M exists that accepts as input some suitable correctness criterion, and a program, and outputs whether that program meets the correctness criterion (in finite time).
By Rice's theorem, no such TM machine can exist. (Indeed, it's fairly trivial to show that any non-trivial definition of "correctness" must certainly include asserting that the program executes in finite time, and therefore a program that could check such a correctness criterion must be able to solve the Halting Problem, if such a machine existed.)
Since any language that only allowed one to express correct concepts is isomorphic to a Turing Machine that correctly validates the existance in that language, and no such TM can exist, no such language can exist, unless it is trivial and the correctness criterion is trivial. In which case as I alluded to earlier, it's hardly worth calling a "language" in the computer science sense.
In fact this sort of proof is one of the reasons I seriously wonder why so many people still seem to be pursuing this. (Fortunately, I see many signs that the field is waking up to this fact and good, dare I say useful research has been starting to get done in the past few years; perhaps the empirical successes of unit testing in software engineering has helped prompt this.)
BTW, no offense but I've been on Slashdot for a while, and I direct this at any replier; if you don't even know what Rice's Theorem is, please don't try to "debunk" this post. Many exceedingly smart people have dedicated their lives to computer science. I only wish I were smart enough to craft these tools, rather then simply use them. Rice's theorum has withstood their attack since 1953 . The odds of a Slashdot yahoo "debunking" this theorem are exceedingly low. (On the other hand, if you do understand what I was saying I of course welcome correction and comment; still, this is a fairly straightforward application of Rice's theorem and I can't see what could possibly go wrong. It's a pretty simple theorem to apply, it's just the proof that's kinda hairy.)
Also, your determination that grown biological computing is unlikely to ever achieve its ultimate goal, yeah, we prove that, don't we...
I do not believe biological computing is impossible; we are an obvious counterexample. What I don't think is going to happen is that at any given point, the most powerful (man-manufactured) computing device that exists is biological, because biological systems by definition require a life support system to support them, which must consume space, power, and resources better spent directly on the actual computation by a non-livi
I used Z to design the control software for a radiation therapy machine.
I expressed the safety requirements in Z, and also expressed the detailed design in Z, and argued (I wouldn't say "proved") that the design didn't violate the safety requirements.
We coded the control program in C from the detailed design in Z. Of course we tested the program thoroughly too - you still have to do all that, of course.
The machine has been in use treating patients since July 1999. We haven't killed anyone yet (unlike some other products in this category).
There are some papers and reports about the project at
http://staff.washington.edu/~jon/cnts/index.html
There is a short (3-page) report about our use of Z in this project at
http://staff.washington.edu/~jon/cnts/iccr.html
There is a longer report about the whole project at
http://staff.washington.edu/~jon/cnts/cnts-report. html
The Z for the radiation therapy machine itself is at
http://staff.washington.edu/~jon/z/machine.html