Is Code Verification Finally Good Enough?
Jonathan asks: "As someone who has been following the development of software verification technology, a recent trend has intrigued me. It seems that the formal method people have finally come down off their high horse and are offering code verification as just another tool. This approach shows up in recent Java and C# based code verification tools that aren't aimed at 'proving correctness' so much as finding potential errors. Now it seems that such an approach is beginning to find its way into XP methods[pdf] as another verification tool to supplement unit tests. Given the current speed and effectiveness of tools like the Spec# verifier, is code verification via automated theorem proving finally going to make its way into the mainstream?"
The main reason program verification didn't catch on was that it was hopeless for C and C++. The semantics of those languages were so messy that formalizing them was nearly hopeless.
Java and C#, however, are good enough. (So were Pascal, Modula, and Ada.)
Here's the manual for the Pascal-F verifier, a system written by a team I headed back in the early 1980s. This was a proprietary system done internally for Ford Motor Company. Take a look at the example real time engine control program beginning on page 155. It was painfully slow back then; it took 45 minutes of VAX 11/780 time (1 MIPS) to verify that program from a cold start. Today, it would take about a second.
What's being proved in that example? First, that there are no subscripts out of range or arithmetic overflows. Second, that all loops terminate. (Yes, you can prove that for most useful programs; the halting problem applies only to pathological programs.) Third, that the following constraints hold:
Useful stuff, the conditions needed to keep the engine running.
This is "design by contract" with teeth. Each function is checked to insure that it always satisfies its exit conditions if its entry conditions are satisfied by the caller, and that the function doesn't overflow, subscript out of range, or fail to terminate. Each call is checked to insure that its entry conditions are always satisfied. The end result is a guarantee that those properties hold for the whole program.
This is a very valuable check. It insures that caller and callee are in agreement on how to call each function. That's the cause of a huge number of software bugs - the caller made some incorrect assumption about the function being called, or the function didn't check for something which it needed to check. Both of those can be statically machine checked.
It's not easy to get a program through formal verification with a verifier like that one. The verifier does almost all the work on easy sections of code, but where correctness depends on anything non-trivial, you have to work with the theorem prover to get the proofs through. This isn't easy. The DEC Java checker and Microsoft's Spec# checker aren't as hard-line.