How Do You Know Your Code is Secure?
bvc writes "Marucs Ranum notes that 'It's really hard to tell the difference between a program that works and one that just appears to work.' He explains that he just recently found a buffer overflow in Firewall Toolkit (FWTK), code that he wrote back in 1994. How do you go about making sure your code is secure? Especially if you have to write in a language like C or C++?"
If you program using strictly functional programming, you can not only verify that your code is 100% secure, but you can even automate the process. (Preferably in a functional programming language such as Scheme, caml, Haskel, LISP, or Erlang; imperative languages make it very difficult/slow to do with functions what functional languages do very naturally and easily.) Purely functional code can be subjected to automated code auditing easily, whereas code auditing imperative code cannot be guaranteed to catch every bug and unintentionally available abuse.
Here's why, and why just about any computational problem can be solved using FP (functional programming):
Functional languages conform to lambda calculus, which has been shown to be Turing equivalent, which means that any program that can be computed on a Turing machine can be solved using Lambda calculus. So long as you program using strictly functions, your program can be verified according to the rules of lambda calculus, and the verification would be as sure as a mathematical proof. This is the only sure way I know of really knowing with mathematical certainty that your application is secure.
Pure functional programming has no assignment statements; there are no state changes for you to keep track of in your program, and in many cases abuses resulting unintended changes of state are the root of security problems. This is not to say that there is no state in functional programming; the state is maintained through function call parameters. (For example, in an imperative programming language, iteration loops keep track of a state variable that guides the running of the loop, whereas a functional program never actually keeps track of state with a variable that changes value; a functional program would carry out iteration by recursion, and the state is simply kept as a parameter passed to each call of the function. No variable with changing state is ever coded.)
Since functional programs lack assignment statements, and assignment statements make up a large fraction of the code in imperative programs, functional programs tend to be a lot shorter for the same problem solved. (I can't give you a hard ratio, but depending on the problem, your code can be up to 90% shorter when described functionally.) Shorter code is easier to debug, which helps in securing code. The reason functional code is so much shorter is that functional programing describes the problem in terms of functions and composition of functions, whereas imperative code describes a step by step solution to the problem. Descriptions of problems in terms of functions tend to be far shorter than algorithmic descriptions of solving them, which is required in imperative code.
Here's the biggest benefit of managing complexity with functional programming: as a coder, you NEVER have to worry about state being messed with. The outcome of each function is always the same so long as the function is called with the same parameters. In imperative programming as done in OOP, you can't depend on that. Unit testing each part doesn't guarantee that your code is bug free and secure because bugs can arise from the interaction of the parts even if every part is tested and passed. In functional programming, however, you never have to deal with that kind of problem because if you test that the range of each function is correct given the proper domain, and pre-screen the parameters being passed to each function to reject any out-of-domain parameters, you can know with certainty where your bugs come from by unit testing each function.
If you need to guarantee the order of evaluation (something that critics of FP advocates sometimes use to dismiss FP advocacy), you can still use FP and benefit: in functional programming, order of evaluation can be enforced using monads. Explaining how is beyond the scope of a mere comment though, but in any case, if you need really reliable code, consider using a functional programming style.
I can't do justice to the matter here; for more information, see th
To summarize, here's how you verify with mathematical certainty that a functional program is secure:
That's the gist of it. Anything more on this topic, such as automatic code auditing with the certainty of mathematical proofs (by means of lambda calculus proofs) is beyond my expertise. I just know that it's possible to truly secure functional code with mathematical certainty, whereas with imperative code, you can only be sure that your code has not yet failed or exposed a rare bug or failure condition.
The best advice I read was from the Erlang documentation. It suggested that you program defensively on a system level, but not on a module level. If a module receives input it can't understand, or thinks it is in an invalid state, the correct behaviour is for it to crash. A system of monitors should deal with failures of components, because they can determine how the failure will affect other components. There has only been one remote root hole in OpenBSD in the last ten years, and it would have been avoided if the OpenSSH developers had used this principle.
I am TheRaven on Soylent News
Once you go outside of a container, you already have a fatal error and the appropriate response is to crash (albeit gracefully if possible). The problem isn't so much that the program crashes, but rather that the program may consider data outside of bounds as valid memory, thus allowing buffer overflows and undefined behavior to occur.
The difference between pure C/C++ and the STL is that something like strcmp can create a rather subtle sort of buffer overflow error, whereas buffer overflows involving STL containers are generally easier to avoid and detect. For that matter, if you use the STL algorithms library to its full potential, you may find that you hardly ever need to use explicit indexing or iterators other than begin() and end().