How Do You Know Your Code is Secure?
bvc writes "Marucs Ranum notes that 'It's really hard to tell the difference between a program that works and one that just appears to work.' He explains that he just recently found a buffer overflow in Firewall Toolkit (FWTK), code that he wrote back in 1994. How do you go about making sure your code is secure? Especially if you have to write in a language like C or C++?"
Modern C++ provides a very nice and functional Standard Library which provides a lot of functionality and data structures such as strings, vectors, lists, maps, sets. While using these available classes does not completely rule out making programming mistakes related to buffer overflows and such, it at least minimizes the risk of producing stupid buffer overflow through badly done string handling. At least that's what my experience is.
Actually, the best thing would be not to use C or C++ at all, but that's where reality comes into play. Most developers don't even have the choice which language they should use, but that is predetermined by the employer and/or supervisor.
A monkey is doing the real work for me.
You introduce buffer overflows when you deal with buffers directly. In conventional C with its standard library you're encouraged to do this rather a lot, for example many of the string functions expect you to allocate a char buffer of big enough size and pass it in. The language's arrays are just syntactic sugar for accessing raw memory, with no bounds checks.
However you don't have to do it like this, especially not in C++ which has a safe string class (for example) as part of its standard library. Unfortunately C++'s vector type still doesn't do bounds checking with the usual [] dereferencing - you have to call the at() method if you want to be safe. But the general principle is: don't do memory management yourself, use some higher-level library (which exist for C too) and let someone else do the memory management for you.
You can write a C++ program and be pretty confident it doesn't have buffer overruns simply because it doesn't use pointers or fixed-size buffers, but relies on the resizable standard library containers.
-- Ed Avis ed@membled.com
Anyone who develops software knows the axiom - the number of bugs discovered in any piece of software is directly proportional to the amount of testing you perform on that software. From this, it follows that you can keep testing forever and at best only asymptotically approach bug-free code. Sounds hyperbolic, but I've observed it to be true in my experience. And as long as there are bugs, there are bound to be security bugs.
You can only minimize the risk that security issues will be found with any software. The best way to do this is to perform a rigorous code audit, preferably by security professionals. And if you can, make the software open source. You get a lot more eyes staring at it for free that way.
It's not that C/C++ is so insecure by itself, the problem is that programmers may not have used the best programming practices. There are plenty of libraries for handling strings and memory allocation in C, in C++ there are string and storage classes that do as much or as little checking as you need.
When you are an expert programmer there are places where you need more efficiency than the super-safe string routines can give you. It's the job of the expert to determine exactly how to balance efficiency against security, and only C/C++ can give you this balance.
Every function should be designed with the assumption that its input is faulty, and should have safe failure modes for every possible value and all possible content. Any unsafe external libraries must be wrapped in handlers which verify the data being passed to them with a similar mindset. Do not EVER presume data will be of a certain form, or that a function will be used a certain way. If sequential routines are becoming long such that you cannot verify the accurate function or the absence of a buffer overflow immediately in your head, then stop and look for a way to break it down into smaller abstract pieces.
Combine this mentality with the usage of safe classes as datatypes whenever possible, so that you can wrap your input verification into the functionality of the classes. If prudent, wrap external library routines in classes which manage the interaction with them, and which verify the data content being passed.
Use test suites to test every component of your program, and be sure to include invalid and pathologically insane input in your test suites.
Do not trade security for efficiency. And don't forget to cross your fingers.
You would sacrifice the flexibility and usefulness of the STL to get a class of bugs that are old and well-known? Hardly seems like a fair trade-off to me.
We all know the answer if we've studied computer science. The problem is that the answer is boring, lots of work and totally non-hip.
It's specifications, pre- and post-conditions, all that "theoretical bullshit" we learned in university. It's just that writing code that way is very un-exciting, and that's a vast understatement.
Assorted stuff I do sometimes: Lemuria.org
For high-integrity stuff, we use SPARK (http://www.sparkada.com/) - a design-by-contract subset of Ada95 that is entirely designed-from-scratch for verification purposes. :-) )
The verification system implements Hoare-logic and is supported by a theorem prover. Buffer Overflow is only one of many basic correctness properties that can be verified. Properties that can be verified are only limited to what can be expressed as an assertion in first-order logic.
SPARK is a small language (compared to C++ or Java...) but the depth and soundness of verification is unmatched by anything like FindBugs, SPLINT, ESC/Java or any of the other tools for the "popular" languages.
(If you don't know or care what soundness is in the context of static analysis, then you've probably missed the point of this post...
- Rod Chapman, Praxis
There is also the question of what the proof actually says. You can't prove, for example, whether a lambda program will terminate (Halting Problem), and in fact you can prove that you can't prove this. If you have a sufficiently well expressed specification for your program, you can verify that the program and the specification match. Unfortunately, if you have a specification that concrete, you can just compile it and run it.
By the way, Scheme is not a functional language. It has a number of properties that make it possible to write functional code, but saying Scheme is a functional language is like saying C++ is an object oriented language.
I am TheRaven on Soylent News
Modded as funny? This is as real as it gets. At least in the private sector.
Let's not forget their wonderful documentation! Complete and accurate API documentation is absolutely necessary for writing secure and reliable software. And of course the programmers should actually read the documentation and check all the details of the API calls they are using (return values, etc...)!