How Do You Know Your Code is Secure?
bvc writes "Marcus Ranum notes that 'It's really hard to tell the difference between a program that works and one that just appears to work.' He explains that he just recently found a buffer overflow in Firewall Toolkit (FWTK), code that he wrote back in 1994. How do you go about making sure your code is secure? Especially if you have to write in a language like C or C++?"
Secure?? What does it mean?
Modern C++ has a capable Standard Library that provides data structures such as strings, vectors, lists, maps, and sets. Using these classes does not completely rule out programming mistakes related to buffer overflows and the like, but it at least minimizes the risk of producing a stupid buffer overflow through badly done string handling. At least that has been my experience.
Actually, the best thing would be not to use C or C++ at all, but that's where reality comes into play. Most developers don't even get to choose which language to use; that is predetermined by the employer and/or supervisor.
A monkey is doing the real work for me.
You introduce buffer overflows when you deal with buffers directly. In conventional C with its standard library you're encouraged to do this rather a lot. For example, many of the string functions expect you to allocate a char buffer that is big enough and pass it in. The language's arrays are just syntactic sugar for accessing raw memory, with no bounds checks.
However, you don't have to do it like this, especially not in C++, which has a safe string class (for example) as part of its standard library. Unfortunately, C++'s vector type still doesn't do bounds checking with the usual [] indexing; you have to call the at() method if you want to be safe. But the general principle is: don't do memory management yourself. Use some higher-level library (such libraries exist for C too) and let someone else do the memory management for you.
You can write a C++ program and be pretty confident it doesn't have buffer overruns simply because it doesn't use pointers or fixed-size buffers, but relies on the resizable standard library containers.
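As a rough illustration of the point about at() and resizable containers, here is a minimal sketch. The helper names checked_get and make_greeting are made up for this example and are not from any library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: copies values[index] into `out` and returns true, or
// returns false if the index is out of range. at() throws std::out_of_range
// instead of silently reading past the end, which operator[] would do.
bool checked_get(const std::vector<int>& values, std::size_t index, int& out) {
    try {
        out = values.at(index);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Hypothetical helper: builds a greeting for a name of any length.
// std::string grows as needed, so there is no fixed-size char buffer to overflow.
std::string make_greeting(const std::string& name) {
    std::string greeting = "Hello, ";
    greeting += name;
    return greeting;
}
```

The same lookup written with values[index] would compile just as happily and read whatever lies past the end of the vector.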
-- Ed Avis ed@membled.com
Writing in C/C++ doesn't make the whole thing any better. A strongly typed language like Delphi or a managed language like C# is less likely to have buffer-overflow-type bugs, but you never know. Writing code is not pizza baking.
Anyone who develops software knows the axiom - the number of bugs discovered in any piece of software is directly proportional to the amount of testing you perform on that software. From this, it follows that you can keep testing forever and at best only asymptotically approach bug-free code. Sounds hyperbolic, but I've observed it to be true in my experience. And as long as there are bugs, there are bound to be security bugs.
You can only minimize the risk that security issues will be found with any software. The best way to do this is to perform a rigorous code audit, preferably by security professionals. And if you can, make the software open source. You get a lot more eyes staring at it for free that way.
It's not that C/C++ is so insecure by itself; the problem is that programmers may not have used the best programming practices. There are plenty of libraries for handling strings and memory allocation in C, and in C++ there are string and storage classes that do as much or as little checking as you need.
When you are an expert programmer there are places where you need more efficiency than the super-safe string routines can give you. It's the job of the expert to determine exactly how to balance efficiency against security, and only C/C++ can give you this balance.
You cannot know for sure (unless you want to develop code by mathematical proof, which requires a considerable amount of effort). However, you can do some things to help prevent buffer overflows and security problems in general:
- Encapsulate all buffer access, and make the interface overflow-safe. Then you need only ensure your encapsulation is secure.
- Use a static code analysis tool that detects buffer overflows. I do not know of any open source ones off the top of my head, but I remember seeing an article on Slashdot a few months ago about a new open source static analysis tool.
- Avoid unsafe functions. Nearly all standard C functions that deal with buffers are unsafe (that is, a typo or oversight can give you a difficult-to-detect buffer overflow). sprintf and strcpy are common culprits off the top of my head. If you're writing for Windows, the Microsoft extensions to the standard library have equivalent 'secure' functions (usually suffixed with _s). I do not know if there is an open source equivalent.
- Use your compiler's buffer overrun detection. I think this is -fmudflap for gcc.
That's all I can think of for now.
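To make the first and third suggestions concrete, here is a minimal sketch of an encapsulated buffer with an overflow-safe interface, plus a bounded alternative to sprintf. BoundedText, its capacity of 16, and format_port are invented for this example; only std::snprintf, std::strlen and std::memcpy are real library calls.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical fixed-capacity text buffer. Every write goes through append(),
// which refuses input that would not fit, so callers never touch the raw array.
class BoundedText {
public:
    static constexpr std::size_t kCapacity = 16;  // illustrative size, includes the '\0'

    BoundedText() : length_(0) { data_[0] = '\0'; }

    // Appends `text` only if it fits entirely; otherwise returns false and
    // leaves the buffer unchanged.
    bool append(const char* text) {
        if (text == nullptr) return false;
        const std::size_t extra = std::strlen(text);
        if (extra >= kCapacity - length_) return false;  // no room for text plus '\0'
        std::memcpy(data_ + length_, text, extra + 1);   // copies the terminator too
        length_ += extra;
        return true;
    }

    const char* c_str() const { return data_; }
    std::size_t size() const { return length_; }

private:
    char data_[kCapacity];
    std::size_t length_;
};

// Hypothetical helper showing snprintf in place of sprintf. snprintf never
// writes more than `size` bytes; a return value >= size means the output
// was truncated, which is reported here as failure.
bool format_port(char* out, std::size_t size, int port) {
    const int written = std::snprintf(out, size, "port=%d", port);
    return written >= 0 && static_cast<std::size_t>(written) < size;
}
```

With this shape, the only code that has to be audited for overflows is append() itself.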
Every function should be designed with the assumption that its input is faulty, and should have safe failure modes for every possible value and all possible content. Any unsafe external libraries must be wrapped in handlers which verify the data being passed to them with a similar mindset. Do not EVER presume data will be of a certain form, or that a function will be used a certain way. If sequential routines are becoming long such that you cannot verify the accurate function or the absence of a buffer overflow immediately in your head, then stop and look for a way to break it down into smaller abstract pieces.
Combine this mentality with the usage of safe classes as datatypes whenever possible, so that you can wrap your input verification into the functionality of the classes. If prudent, wrap external library routines in classes which manage the interaction with them, and which verify the data content being passed.
Use test suites to test every component of your program, and be sure to include invalid and pathologically insane input in your test suites.
Do not trade security for efficiency. And don't forget to cross your fingers.
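One way to read the advice above about assuming faulty input and wrapping verification into a datatype is sketched below. The Port class and its parse() function are hypothetical, chosen only because a port number has an easily stated valid range.

```cpp
#include <cctype>
#include <string>

// Hypothetical value type: a TCP port that can only be set from verified input.
class Port {
public:
    Port() : value_(1) {}

    // Parses `text` as a decimal port in 1..65535. Returns false and leaves
    // `out` untouched for empty, over-long, non-numeric, or out-of-range input.
    static bool parse(const std::string& text, Port& out) {
        if (text.empty() || text.size() > 5) return false;
        unsigned long value = 0;
        for (char c : text) {
            if (!std::isdigit(static_cast<unsigned char>(c))) return false;
            value = value * 10 + static_cast<unsigned long>(c - '0');
        }
        if (value < 1 || value > 65535) return false;
        out.value_ = static_cast<unsigned short>(value);
        return true;
    }

    unsigned short value() const { return value_; }

private:
    unsigned short value_;
};
```

A test suite in the spirit of the post would feed parse() the empty string, "0", "-1", "80a", "70000" and "999999" alongside the valid cases, and expect every one of them to be rejected.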
The STL mostly gets rid of the old problems such as buffer overflows, but introduces new ones that can be a lot more subtle and harder to track down, such as deep/shallow copy issues. Personally (and I'm probably in the minority) I prefer to deal with the old-fashioned bugs, since you can usually guess where they're happening, whereas in a highly abstracted C++ program using the STL, with lots of objects being copied and references flying around, it can be a LOT harder to figure out what's really going on, especially since different compilers do different things under the hood.
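A small sketch of the kind of deep/shallow copy surprise meant here, assuming the simplest possible case: copying a container of raw pointers copies the pointers, not the objects they point to. The function names are made up for the example.

```cpp
#include <vector>

// Copies a vector of raw pointers, writes through the copy, and reports
// whether the original sees the change. It does: both vectors hold the same
// pointer, so the copy is shallow with respect to the int.
bool pointer_vector_copy_aliases() {
    int shared = 1;
    std::vector<int*> original{&shared};
    std::vector<int*> copy = original;  // copies the pointer, not the int
    *copy[0] = 99;
    return *original[0] == 99;
}

// Same experiment with a vector of values. The copy owns its own elements,
// so the original is unaffected.
bool value_vector_copy_aliases() {
    std::vector<int> original{1};
    std::vector<int> copy = original;  // copies the elements themselves
    copy[0] = 99;
    return original[0] == 99;
}
```

Both copies look identical at the call site, which is what makes this class of bug hard to spot in a large program.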
You would sacrifice the flexibility and usefulness of the STL to get a class of bugs that are old and well-known? Hardly seems like a fair trade-off to me.
How do you validate code for correctness? Well, either you use some cool formal specification language, such as Z, and then spend a great deal of time and effort validating (which is actually very advisable for critical code in, say, device controls for medical equipment) or you use blind luck and "proven" techniques, collectively known as Good Programming Practice.
:)
Traditionally it has been important to "specify and validate" requirements meticulously, in the belief that this was the way to write good code. This is partly true, but that way can quickly turn your process into a dinosaur, stifling change and preventing improvement because of non-compliance with "The Requirements".
You can try "defensive coding", which treats all messages with great suspicion, messages being an old term for parameters. This is a cool technique, but it can lead to slower code than necessary, and can lead to some bugs being buried if code attempts to heuristically correct for "bad" messages (there is rarely any way to formally specify what is "bad"). You can use lint tools (and there are very many, very sophisticated tools) which will catch a whole lot of stuff before it leaves the developer's screen. You can try practices such as pair programming and independent code inspection. On the coding side, you can even try (gasp) such methods as test-driven development and contract-based development.
On the testing side, there is nothing quite like having an experienced, qualified, motivated and _empowered_ testing team: a testing team which knows how to find bugs, knows how to communicate with coders, and has the power to stop defects going into production. A technique I particularly like is defect insertion: secretly insert 10 bugs into the code base and see how many get squashed. This will give you an estimate of how many defects your process doesn't find. There are other cool techniques too, some based on mathematical analysis of the code's attributes: the more complex the code, the costlier it is to maintain.
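The arithmetic behind defect insertion can be sketched as follows. This assumes the planted bugs are about as hard to find as the real ones, which is a strong assumption, and the function name is invented for this example.

```cpp
// Hypothetical estimator for defect insertion (also called defect seeding).
//   seeded       - number of bugs deliberately planted
//   seeded_found - how many of the planted bugs the process caught
//   real_found   - how many genuine bugs the process caught in the same period
// If the process catches seeded_found / seeded of the planted bugs, assume it
// catches the same fraction of real ones. The real bugs still hiding are then
//   real_found * (seeded - seeded_found) / seeded_found.
// Returns a negative value when no estimate is possible.
double estimate_remaining_defects(int seeded, int seeded_found, int real_found) {
    if (seeded <= 0 || seeded_found <= 0 || seeded_found > seeded || real_found < 0) {
        return -1.0;
    }
    return static_cast<double>(real_found) * (seeded - seeded_found) / seeded_found;
}
```

For example, if 8 of 10 planted bugs are caught along with 20 real ones, the estimate is 20 * 2 / 8 = 5 real defects still undetected.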
Opening up the codebase to many people might well increase the chance that someone will find the line which causes an error - but IMHO no one goes around looking for bugs unless they are looking for weaknesses. And there we have another (unethical) method - pay some hacker doodz to 'sploit your code. Hopefully they will not find a higher bidder LOL.
All of these methods are likely to increase development effort and cost, decrease the number of defects, increase user satisfaction, decrease maintenance costs and increase well-being and harmony. So it is a trade-off: perfect code is incredibly difficult to create, and the question is what level of perfection you (and your customers) are willing to pay for. Problems mostly arise when expectation does not meet reality. Some flakiness in an F/OSS application suite is more acceptable to me than random crashes in software which cost me hundreds, or tens of thousands, or millions of dollars.
In order to increase some quality aspect of code (security, performance, robustness, correctness...) one can therefore focus on one or several categories - the people, the process, the culture, the tools, the technique, the time&cost etc. The choice of what to focus on is dictated by reality: no one has unlimited resources (except, almost, Google).
There is no silver bullet - but there are golden rules. Finding people who know the difference is crucial I believe.
(Full disclosure: Yeah, I'm looking for heavy duty PM work)
They who would give up an essential liberty for temporary security, deserve neither liberty or security - Ben Franklin
Ok, What language is your Ada compiler written in? There are very few self-hosted languages that do not rely on "C" at some level. Also, the OS and the system libraries were written in C. At some level you need to deal with the stated problem. All that being said many people are probably better off with Ada unless they actually "study" software security on a daily basis.
Not necessarily. All I'm saying is that the STL can introduce bugs of its own that can be a lot harder to find than old-style buffer overruns, so it's not a solution that will get rid of obscure coding (as opposed to logic) bugs.
You can write code that can be as secure as you want, but what about libraries, compilers and hardware?
I think the question itself makes little sense without a deeper investigation in the system!
Intelligence has limits. Stupidity doesn't.
To be honest, I have got _no_ idea what you are trying to say.
We all know the answer if we've studied computer science. The problem is that the answer is boring, lots of work and totally non-hip.
It's specifications, pre- and post-conditions, all that "theoretical bullshit" we learned in university. It's just that writing code that way is very un-exciting, and that's a vast understatement.
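For anyone who skipped that lecture, here is a minimal sketch of what pre- and post-conditions look like when written down as executable assertions. The function find_sorted is a made-up example, and assert() only checks at run time; it is no substitute for an actual proof.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: binary search with its contract spelled out.
// Precondition:  `values` is sorted in ascending order.
// Postcondition: the result is either values.size() (target absent) or an
//                index whose element equals `target`.
std::size_t find_sorted(const std::vector<int>& values, int target) {
    assert(std::is_sorted(values.begin(), values.end()));  // precondition

    std::size_t low = 0;
    std::size_t high = values.size();
    while (low < high) {
        // Invariant: if target is present, its first occurrence is in [low, high].
        const std::size_t mid = low + (high - low) / 2;
        if (values[mid] < target) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }

    const std::size_t result =
        (low < values.size() && values[low] == target) ? low : values.size();
    assert(result == values.size() || values[result] == target);  // postcondition
    return result;
}
```

Writing the contract first is the boring part; the loop more or less follows from the invariant.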
Assorted stuff I do sometimes: Lemuria.org
For high-integrity stuff, we use SPARK (http://www.sparkada.com/), a design-by-contract subset of Ada95 that is entirely designed from scratch for verification purposes. :-)
The verification system implements Hoare logic and is supported by a theorem prover. Buffer overflow is only one of many basic correctness properties that can be verified. The properties that can be verified are limited only to what can be expressed as an assertion in first-order logic.
SPARK is a small language (compared to C++ or Java...) but the depth and soundness of verification is unmatched by anything like FindBugs, SPLINT, ESC/Java or any of the other tools for the "popular" languages.
(If you don't know or care what soundness is in the context of static analysis, then you've probably missed the point of this post...)
- Rod Chapman, Praxis
There is also the question of what the proof actually says. You can't prove, for example, whether a lambda program will terminate (Halting Problem), and in fact you can prove that you can't prove this. If you have a sufficiently well expressed specification for your program, you can verify that the program and the specification match. Unfortunately, if you have a specification that concrete, you can just compile it and run it.
By the way, Scheme is not a functional language. It has a number of properties that make it possible to write functional code, but saying Scheme is a functional language is like saying C++ is an object oriented language.
I am TheRaven on Soylent News
Modded as funny? This is as real as it gets. At least in the private sector.
TFA: "Especially if you have to write in a language like C or C++?"
Why would you HAVE to use C or C+ or C*+**+++? I don't mean to be a troll, but if you are writing in an inherently insecure language (i.e., any compiled language) you aren't going to get secure code.
OTOH if you write in, say, assembly, you are setting yourself up for the complexity. You have to make sure your buffers won't overflow, as opposed to leaving it to the compiler writers.
As to overflows, if you KNOW your language is prone to overflowed buffers, it seems wise to check for overflows with your own code. After this long, there really is no excuse for buffers that overflow. It isn't hard to check for the length of a string, after all.
If bridge engineers were as lazy as programmers, bridges would be falling down by the hundreds. My 1992 car is full of hundreds of thousands of little bitty moving parts and fluids, but as long as I keep clean oil and filters in it, it doesn't break. My last car was a 1988; it lasted until last year. But I have to replace my 2002 Microsoft operating system because it's not secure? Somebody is making a lot of money off of poorly designed and poorly built software. There is no reason why I should have to replace an OS.
There are reasons for program errors, but no excuses. If your code is shit, it's shit because you wrote shit. Either you're incompetent or lazy. "You can have cheap, secure, or fast. Pick two."
I think it was Knuth who said, "In theory, theory and practice are the same. In practice, they are not."
In theory, for any nontrivial program, you cannot know absolutely that it is secure. You cannot even know that it will terminate. Turing showed that there is no algorithm which will decide if a program will halt. Most other problems of program behavior can be reduced to halting. (Just place a call to exit() immediately after the code that outputs the behavior in question.) In general, there is no way to prove that a program has any particular property that can be reduced to a termination property.
The choice of language does not matter, either. Turing used a language that was very primitive, even compared with the simplest assembly languages. But Turing's language is equivalent in computing power to every modern general-purpose programming language. The Church-Turing thesis is widely accepted as valid, though a proof in the strict sense cannot be written. So, Turing's mathematical proof of the halting theorem is valid for every modern programming language.
There are some programs for which we do know that the program is correct. Such programs are all very small, solve well-defined mathematical problems, and are written in well defined functional programming languages. These proofs depend on very careful, mathematical definitions of the programming language, and of the function to be computed. The programming language is, strictly, an algebra. The proofs simply show that the algebraic formula (the program) transforms the algebraic input to the correct algebraic output. In every case, such proofs are quite difficult and tedious. And, as noted above, they are not possible in the general case.
In practice, we can apply methods that are known as "engineering". That is, we can apply logic, design, inspection, review, and testing to develop some amount of confidence that the program will behave as expected. But engineering methods do not provide certainty. They only provide high confidence. The choice of language and tools has some effect on the ease or difficulty of doing the engineering work, but does not change the boundaries of what is possible.
How do we "know" that a bridge will not fall down? There are no proofs of bridges. There is only engineering. Engineers apply logic, experience, design, inspection, reviews, and tests, so that they can have confidence in the design. The confidence is based on statistics. For a given shape of steel or concrete, we can measure loads that cause the steel to fail, and we can measure the variance in those loads due to the manufacturing tolerances of the material. When we use that shape and material to build the bridge, we can have statistics about how much load the bridge can support without failing. But even with all that engineering, sometimes bridges do fall down. The load measurements are only statistics, not proofs. There is always a confidence interval around every measurement, and the confidence can never be 100 percent.
We can never have absolute proof of any property of any real, nontrivial program. We can have confidence as close to 100 percent as we want, if we spend enough effort on the engineering.
Just get others to formally review it so if anything is found, there's collective responsibility.
Yes, that is funny, but there is truth to it as well (which is why it's funny).
Security, software development, and everything else is a process, not an event. It gets better over time, and basically, the way that issues come out is for them to be found "in the wild". And as these issues are found, better tools and techniques make the process better over time, but I don't envision a world where people just think of bugfree, usable, featureful software that just appears, but all in all it keeps getting better.
The error in question:
    while(lp != (struct listelem *)0) {
        free(lp);
        lp = lp->next;
    }
is pretty silly, and I don't know how it took over a decade to find that. In my experience, code like that crashes pretty regularly, and debugging it will point to the error.
Today, what some programmers do is write FREE(lp); where FREE() is a macro or something that does if (a) { free(a); a=NULL; }. This prevents double frees, and ensures that any future use of the pointer will predictably die on a null pointer dereference. In 2006, bugs like this should not find their way into C code. We now check our stuff, or use languages or tools that check for crap like this for us, or whatever. In 1994, I guess it was OK for such a bug to be introduced into code, but not in 2006.
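For reference, a corrected version of that loop might look like the sketch below, which reads lp->next before freeing the node and uses a FREE()-style macro. The int payload in listelem, the helper push_front, and the exact macro body are assumptions made for this example; the real FWTK definitions are not shown in the post.

```cpp
#include <cstddef>
#include <cstdlib>

// Assumed node layout; only the `next` link appears in the quoted code.
struct listelem {
    int value;
    struct listelem* next;
};

// Frees the pointer and then nulls it, so a later use fails on a null pointer
// instead of touching freed memory. free(NULL) is a no-op, so no `if` is needed.
#define FREE(p) do { std::free(p); (p) = nullptr; } while (0)

// Frees every node in the list and returns how many were freed.
// The link is saved BEFORE the node is freed; the original read it afterwards.
std::size_t free_list(listelem*& head) {
    std::size_t freed = 0;
    listelem* lp = head;
    while (lp != nullptr) {
        listelem* next = lp->next;  // save the link first
        FREE(lp);
        lp = next;
        ++freed;
    }
    head = nullptr;  // the caller's pointer no longer dangles
    return freed;
}

// Hypothetical helper used only to build a list for the example.
listelem* push_front(listelem* head, int value) {
    listelem* node = static_cast<listelem*>(std::malloc(sizeof(listelem)));
    if (node == nullptr) return head;  // allocation failed: list unchanged
    node->value = value;
    node->next = head;
    return node;
}
```

Because free_list() takes the head by reference and nulls it, calling it twice on the same list is harmless.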
If you think that the flexibility and usefulness provided by C++ is present in Java or C#, then you are only using it as a nicer C.
Let's not forget their wonderful documentation! Complete and accurate API documentation is absolutely necessary for writing secure and reliable software. And of course the programmers should actually read the documentation and check all the details of the API calls they are using (return values, etc...)!
As a C/C++ developer I am a little offended by the article summary. Certainly C/C++ has a lot of flexibilities that allow bad developers to write bad code. However, many other languages, e.g. Java, allow bad programmers to write code that looks good because of stronger type checking, reduced use of pointers and the like. However, nothing stops a bad developer from writing insecure code in any language. Maybe you don't manage your resources correctly. Maybe you do a bad job of implementing encryption/protected storage. Maybe your authentication scheme is weak, your site is vulnerable to cross-site scripting vulnerabilities, or your session data can be easily spoofed.
Secure code is not a product of language, it's a product of developers who take the time to fully understand the tools that they are using to build the product, including the ins and outs of their language of choice and its key risk elements, and who research risk elements for all other parts of the system.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Ada is comparable to C and C++ in the area of "efficiency/performance." The misconception which you propagate disappears when fair comparisons are made. Remember that Ada is used in many embedded real-time applications; indeed, that is much of the reason that it came into existence in the first place.
Seems the process follows the laws of construction instead, as aggregated below:
:-).
1 - measure with a micrometer
2 - mark with chalk
3 - cut with an axe
4 - if it doesn't fit, use a larger hammer