Secure Programming Cookbook for C and C++
The Target Audience of the Book
In the foreword to this book Gene Spafford observes that there really are four types of programmers:
- Those who are constantly writing buggy code, no matter what,
- Those who can write reasonable code, given coaching and examples,
- Those who write good code most of the time, but who don't fully realize their limitations,
- Those who really understand the language, the machine architecture, software engineering, and the application area, and who can write textbook code on a regular basis.
There are, as Spafford claims, too many people in category 3 who think they belong to the category 4, and that's the primary target audience of the book. John Viega and Matt Messier co-wrote Secure Programming Cookbook for C and C++ not with the intent of proving the necessity of application security, as they mention in the foreword, but to illustrate its application. If you're reading this book, you are probably well aware of the security needs at your workplace or in your projects, and you would like to have a large library of sample code for various operations.
The book has yet another Web site, and since John Viega didn't mind a little slashdotting during the launch, he probably won't mind another link to SecureProgramming.com.
The Book Itself
The structure of the book will be familiar to anyone who has read an O'Reilly Cookbook before. The "cookbook" part of the text is nothing more than a collection of solutions to common problems. The code is generally of high quality and written by an expert in the field. What's more important is the discussion section following the code, which explains why things are done in a certain way, what alternatives exist, and what the best practices in the field are. Viega and Messier have expanded the discussion section, basically doubling the content, by introducing separate Windows and Unix sections where applicable. The reader has a chance to peruse the code for both platforms as well as read separate discussion sections, which helps in navigating the content of the book.
Microsoft platform developers, though, will only be introduced to the native Win32 API -- the authors chose to ignore the STL/ATL/COM/DCOM/.NET solutions on the assumption that those could be derived by someone closely familiar with the lowest-level API available from Microsoft. Even though the discussion section is quite detailed and informative for both Unix and Windows developers, the authors do not discuss the design and architecture issues behind secure programming in C and C++. That falls outside the scope of this book; besides, John Viega co-authored Building Secure Software, where a lot of attention is paid to the philosophy of secure programming as well as initial application design with security in mind.
The Contents
You can view the table of contents on the O'Reilly Publishing Web site, and with the cookbook format, it's pretty much WISYWIG -- whatever the title of the subchapter is, you will be introduced to the nature of the problem, followed by a C/C++ solution, followed by a discussion of the subject with occasional URLs to relevant information on the Web.
Just to sum it up, usage of encryption, message integrity checks, symmetric and public-key cryptography and secure programming get a lot of attention. With 41 recipes (Chapters 4 and 5) on symmetric encryption and 29 (Chapters 7 and 10) on PKI-related code snippets, you can get your yearly supply of Unix and MS CryptoAPI examples.
But this book is not entirely about encryption, since current security problems are rarely caused by failures of the encryption algorithms themselves. Networking and Internet-related programming issues are covered in Chapter 8 (Authentication) and Chapter 9 (Networking). In Chapter 3, those designing Web interfaces will find some useful examples of validating an input URL and checking an SQL string against injection attacks. Admittedly, such examples would serve a better purpose in Perl/PHP/ASP; however, anyone familiar with C should be able to derive their own variations of the algorithm. Chapters 1 and 2 provide a great deal of insight into operating system specifics with regard to such system security issues as environment variables, spawning child processes, revealing memory dumps, and using temp files on Windows and Unix.
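To make the input-validation point above a little more concrete, here is a minimal sketch of allow-list validation in C. It is not taken from the book: the helper name `is_safe_identifier` and the chosen character set are assumptions made for illustration. The idea is to accept only input that matches a narrow, known-good pattern rather than trying to strip out "bad" characters.

```c
#include <assert.h>   /* used for the precondition below and in the usage checks */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the book): returns 1 if `s` is a non-empty
 * string of at most `max_len` characters drawn only from [A-Za-z0-9_],
 * and 0 otherwise. */
int is_safe_identifier(const char *s, size_t max_len)
{
    size_t i, len;

    assert(max_len > 0);          /* a zero limit is a programming error */

    if (s == NULL)
        return 0;
    len = strlen(s);
    if (len == 0 || len > max_len)
        return 0;
    for (i = 0; i < len; i++) {
        char c = s[i];
        int ok = (c >= 'a' && c <= 'z') ||
                 (c >= 'A' && c <= 'Z') ||
                 (c >= '0' && c <= '9') ||
                 c == '_';
        if (!ok)
            return 0;             /* reject on the first unexpected character */
    }
    return 1;
}
```

With this sketch, `is_safe_identifier("user_42", 32)` is accepted while a string such as `"x'; DROP TABLE users; --"` is rejected. A check like this complements, but does not replace, passing values to the database as bound parameters.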
Off-the-beaten-path chapters include information on random numbers (the chapter is available online for free) and preventing tampering with applications. The random number chapter would be interesting to both professional programmers with good math skills and beginners in the computer programming field writing their first number-guessing C++ game. Recipes on gathering entropy and access to standard Windows/Unix APIs for random number generation are of great practical use. The application tampering chapter was probably the most informative thing for me - great collection of information, rarely found in other application or network security publications. How do you protect against software piracy by using checksums? How much time should you dedicate to software protection? What is the theory behind code obfuscation? How do you hide ASCII strings in data segment? How do you detect modern debuggers? The answers to such questions are usually fragmentary and are usually considered either intellectual property of the company or belong to a 'warez' site, where the quality of sources is questionable.
Is the Book Useful?
This book is a great resource for quick look-up of readily available solutions (I read it online on Safari, so I cannot vouch for the usability of the paper edition when searching for information). I've written a Master's thesis on this topic (although my actual topic was much narrower than the scope of this book) and still found a lot of great information. If you've never seen C/C++ code or feel uncomfortable with Unix/Windows API programming, you will probably find the Cookbook overly technical. A higher-level application security text is available for those new to the subject (besides the Building Secure Software title mentioned above, there's a great title called Writing Secure Code from Microsoft), while this book gets into dirty, nitty-gritty details.
Yeah, everyone and his brother knows how to implement a symmetric encryption algorithm, but how do you actually do it without compromising the system and introducing new possible loopholes? The cookbook answers questions like that, and, as mentioned above, provides a detailed overview of programming strategies for the two most popular platforms. Taking the cookbook concept further, this book teaches you how to make a basic ham-and-cheese sandwich as well as fine cuisine. Too often the code measures for basic security and preventing buffer overflows are summarized in higher-level concepts, thus allowing developers to make errors even in the most trivial applications. If you're a professional programmer and do not get tired of looking at sometimes profuse code examples, this book would probably be a good read from beginning to end. If C/C++ is not your preferred area, the usefulness of this title decreases severely; however, it might still serve as a good reference.
You can purchase Secure Programming Cookbook for C and C++ from bn.com.
Ref: Amazon has this book for $5 less than bn and with free shipping
I would send a free copy to openssl staff
Don't use system(), and be careful with buffers.
Don't.
Save your wrists today - switch to Dvorak
Headline: Previous Knowledge Required Before Reading Technical Book
Film at 11.
#include <stdlib.h>

int main(void)
{
    system("python mainapp.py");
    return 0;
}
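For contrast with the `system()` call above, here is a hedged sketch of launching a child process on a POSIX system without going through a shell. The helper name `run_program` is hypothetical and this is not the book's recipe; it only illustrates the general fork/exec pattern with an absolute path and an explicit environment, so that neither shell metacharacters nor an inherited `PATH` influence what gets run.

```c
#include <assert.h>   /* used for the precondition below and in the usage checks */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at the absolute `path` with argument
 * vector `argv` and environment `envp`, without invoking a shell.
 * Returns the child's exit status, or -1 on failure or abnormal exit. */
int run_program(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid;
    int status;

    assert(path != NULL && path[0] == '/');   /* insist on an absolute path */

    pid = fork();
    if (pid == -1)
        return -1;                            /* fork failed */
    if (pid == 0) {
        execve(path, argv, envp);             /* replaces the child on success */
        _exit(127);                           /* only reached if execve failed */
    }
    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass something like an argument vector of `{"python", "mainapp.py", NULL}` together with the interpreter's full path; the actual path is an assumption that depends on the system.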
there are too many people in catagory 1 that think they are in catagory 4.
funny enough, if there code was a hurricane, it would be at least a catagory 4.
The Dunning-Kruger effect explains most posts on
- Those who are constantly writing buggy code, no matter what,
- Those who can write reasonable code, given coaching and examples,
- Those who write good code most of the time, but who don't fully realize their limitations,
- Those who really understand the language, the machine architecture, software engineering, and the application area, and who can write textbook code on a regular basis.
There are, as Spafford claims, too many people in category 3 who think they belong to the category 4, and that's the primary target audience of the book.
Well, I don't need this, since I'm in category 4. Instead of reading this nonsense, I'll go finish my Visual Basic project.
I'm just curious...
Too damned many of them.
-bofh
What we really need is a simple security lib that can easily be retro-fitted to the majority of applications. The application's code wouldn't have to change, as it would be "wrapped" or protected by the "security" library. ie:
#include <sb.h>
(void) init_secure();
// do stuff
(void) done_secure();
I mean, we've put a man on the moon, but we can't make a software library with a simple interface that solves all of our problems? Puh-leeze! I think the many-eyeballed monster that is the open-source community should turn its attention to this important issue.
There is a fifth type of programmer, not covered by the categorization mentioned above: Those who really understand the language, the machine architecture, software engineering, and the application area, and who write code which is absolutely antithetical to anything you'd find in a textbook.
I, for example, severely abuse short-circuit evaluation -- I'll often put five or more function calls into an if() conditional, ||ing their error conditions together -- but there's nothing wrong with that; you'll never find it in a textbook, but once you're used to reading that sort of code, it is more compact, easy to understand, and easy to maintain than the alternatives.
Tarsnap: Online backups for the truly paranoid
I'm with you 99%.
There are, as Spafford claims, too many people in category 3 who think they belong to the category 4, and that's the primary target audience of the book
:-)
People who think they are in category 4 won't buy the book, because they believe they don't need it. So who is the target audience anyway?
Is this supposed to be some kind of joke?
"Those who would sacrifice liberty for security deserve neither!"
There are 4 types of programming book authors:
...
1. Those who categorize programmers artificially for the sake of a point.
2. Those who categorize programmers incorrectly because they don't know better, but for good reason.
3. Those who categorize programmers because they figure that, by doing so, they will establish themselves as an authority on ranges and types of programming skill.
4. Those who avoid categorizing programmers because they realize that it's kind of goofy to do so.
Everyone knows that there are folks out there that can do their job better than others. But do those categories really exist? It may seem like I'm picking nits, but is there really a class of programmers that writes buggy code almost all of the time? I mean, I suppose there is, but it doesn't seem to me like they'll have a long career in software
Chr0m0Dr0m!C
...there's a great title called Writing Secure Code from Microsoft
Would you ask a starving person for opinions on food?
if there code was a hurricane,
i hope you dont code the way you write --
their code
maybe you're the reason why I keep on getting those damn out of memory errors...
calling the wrong memory reference....
We're like rats, in some experiment! -- George Costanza
Password sniffing, spoofing, buffer overflows, and denial of service: these are only a few of the attacks on today's computer systems and networks. At the root of this epidemic is poorly written, poorly tested, and insecure code that puts everyone at risk. Clearly, today's developers need help figuring out how to write code that attackers won't be able to exploit. But writing such code is surprisingly difficult.
Secure Programming Cookbook for C and C++ is an important new resource for developers serious about writing secure code. It contains a wealth of solutions to problems faced by those who care about the security of their applications. It covers a wide range of topics, including safe initialization, access control, input validation, symmetric and public key cryptography, cryptographic hashes and MACs, authentication and key exchange, PKI, random numbers, and anti-tampering. The rich set of code samples provided in the book's more than 200 recipes will help programmers secure the C and C++ programs they write for both Unix(R) (including Linux(R)) and Windows(R) environments. Readers will learn:
How to avoid common programming errors, such as buffer overflows, race conditions, and format string problems
How to properly SSL-enable applications
How to create secure channels for client-server communication without SSL
How to integrate Public Key Infrastructure (PKI) into applications
Best practices for using cryptography properly
Techniques and strategies for properly validating input to programs
How to launch programs securely
How to use file access mechanisms properly
Techniques for protecting applications from reverse engineering
The book's web site supplements the book by providing a place to post new recipes, including those written in additional languages like Perl, Java, and Python. Monthly prizes will reward the best recipes submitted by readers.
Secure Programming Cookbook for C and C++ is destined to become an essential part of any developer's library, a code companion developers will turn to again and again as they seek to protect their systems from attackers and reduce the risks they face in today's dangerous world.
I have over 70 freaks, do you?
I read the sample chapter and the table of contents, and this certainly looks like a very useful book for developers.
/dev/urandom. ... I wonder if this book is out on Safari yet.
The random number generation chapter is excellent. Many people overlook this necessity in cryptographic applications, unaware that flaws introduced by insufficiently random (i.e. predictable) entropy sources can cripple the best PKI, ciphers and authentication mechanisms.
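As a rough illustration of drawing key material from the operating system instead of a predictable generator such as `rand()`, here is a minimal sketch that reads from `/dev/urandom`. It assumes a Unix-like system that exposes that device; the helper name `get_random_bytes` is hypothetical and the code is not taken from the book.

```c
#include <assert.h>   /* used for the precondition below and in the usage checks */
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: fill `buf` with `len` bytes read from /dev/urandom.
 * Returns 0 on success and -1 if the device cannot be opened or read. */
int get_random_bytes(unsigned char *buf, size_t len)
{
    FILE *f;
    size_t got;

    assert(buf != NULL);

    f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    /* Disable stdio buffering so no extra random bytes are pulled into
     * (and left lying around in) a library buffer. */
    setvbuf(f, NULL, _IONBF, 0);
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

A caller would typically fill a fixed-size key buffer, for example `unsigned char key[16]; get_random_bytes(key, sizeof key);`, and treat a return value of -1 as fatal rather than falling back to a weaker source.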
One of the reasons I tend to drool over VIA hardware is that their MiniITX EPIA systems have support for hardware entropy on the CPU via the Nehemiah core, which is also available for a wide variety of OEM/embedded applications.
This means you can use a highly secure entropy source (/dev/hw_random in linux for example) for all of your cryptographic applications, from GPG to ssh to the linux kernel itself (IPSEC). And best of all, you never have to worry about the entropy pool blocking, or reverting to less secure PRNG like
"a book which he says is useful -- but only if you have the background to use it"
So, uhh...it's useful, but only if you can use it...
"You tried your best and failed miserably. The lesson is...never try. Heh!" -Homer
And it's the perfect holiday gift for that special Microsoft Programmer you know and love!
I'll say it flatly: the reason is that, from what it seems, it's not a "my system is better than your system" kind of book. Those are the books that annoy me the most: "Well, you see, you could be using ASP, but then your app would be WAAAAAY more insecure."
On top of that, actually seeing equivalents of the same code on both system families will be a nice intro to some, including me, for equivalent APIs that we didn't know existed in other systems.
Btw, the Secure Coding book by Microsoft is really good too (very few actual API references, so it's not really Microsoft-platform targeted).
OPEN Skull, and poke around with stick.
Calm down, think of Hilary Rosen and tell us a) who is Bryanna, b) what does she look like and c) why would it bed to get reamed by your boss?
BOO! TERRO
....can be found here. My favorite:
"You're proposing to build a box with a light on top of it. The light is supposed to go off when you carry the box into a room that has a Unicorn in it. How do you show that it works?"
The Army reading list
You'll be surprised. Guess what, the guy who wrote the book really knows how to write secure code, and the book really teaches you a lot without offering many pre-cooked examples. This is a good thing. Helps you code with security in mind.
AC comments get piped to
to use the Whitespace language!
How much does the programming language matter?
Posted by John Viega on Mon, Sep 15, 2003 (07:59 AM) GMT
We've now been slashdotted. After lowering the idle connection timeout from hours to minutes, we're doing fine (famous last words). The comments are full of "C sucks" rants. I thought I'd summarize a few of my thoughts on this issue.
Yes, C and C++ have special "features" that make adding security problems easy, even for a fairly informed and careful developer. That's impossible to deny, though the book and this site do cover mitigation strategies that can make a big difference. However, people are miscalculating by assuming that just switching to another programming language is going to make a big difference. It can make a difference, but not as big of one as people are expecting. Defensive practices can offset the problem.
We've done a few case studies on number of defects per line of code when performing code audits. C and C++ programs have averaged 4-5 security-critical defects per thousand lines of code. Java programs still average 1-2 security-critical defects per thousand lines.
There are plenty of problems that programming languages themselves haven't fixed. And, honestly, most of those problems should be fixed at the API level. For example, it's stupid that neither OpenSSL nor Microsoft supports full certificate validation by default. The programmer has to know what security checks to perform and write the code to do them manually, instead of getting "secure by default" behavior. As a result, most applications that use SSL/TLS are vulnerable to man-in-the-middle attacks. Sure, this is a problem in some common C-based libraries, but it's just as common in the SSL implementations for other languages. Other problems such as cross-site-scripting and SQL injection affect other languages far more commonly than C and C++, since those languages aren't often used in web apps.
In C and C++, the common security problems are relatively easy to understand, and if you are diligent and take the right preventative measures, they're not so hard to avoid. In other languages, the easy/obvious problems don't apply, but as people use high-level primitives to build complex applications, they tend to introduce complex security problems (race conditions in servlets can be quite tough to identify, and still have security implications).
In short, you aren't likely to accidentally end up with a "secure" program, no matter which programming language you use. We're currently working on a Java Secure Programming Cookbook, and are assembling a team for a PHP Secure Programming Cookbook. There's plenty of material for both books, without question. Expect both to be at least 400 pages, without even covering all of the low-level cryptographic stuff we cover in the C/C++ version.
At the end of the day, if you're going to be diligent, then security can be reduced to a fairly minor consideration in programming language choice.
One final note: C++ is often perceived as being more secure than C, because it has an abstracted string type. That's not really true, even ignoring the few cases where you can still overflow using C++ strings. Basically, heap overflows are far more dangerous in C++, because lots of function pointers tend to be stored on the heap, due to the way classes and exception handling are implemented (the GOT is stored on the heap even in C programs, but C++ programs tend to have function pointers coming out of the wazoo). If an attacker can overwrite one of those pointers, then it's often possible for him to replace it with a pointer to some sort of malicious payload.
What does that stand for? What I See you What I get? I think you'll find the acronym is WYSIWYG.
Doesn't claiming to be a transcendent #5 type actually make you a Number 3 (almost by definition)?
Damn it!
I really need a book on how to write buffer-overflow-free Java code!
Thanks for the good review. A few minor points:
1) All of the book's code is available on our web site. The web site is probably the right place to go to to get the code, just because we can update it when there are errata (and you don't have to copy it out manually if you want to use it).
2) This is an implementation-focused book. You're right to refer to other texts for architecture, and besides my other book, the Microsoft press book you recommend and David Wheeler's free online HOWTO are both excellent (though I personally think the O'Reilly entry into that space is poor). At the same time, we do end up covering many aspects of good architecture in the discussion. Providing the context for a good implementation requires an understanding of the architectural issues, at least to some degree.
3) We have had several people tell us that they find the book very useful for other languages as well. I think it covers a lot of low-level implementation stuff that isn't available in other books, and is useful as long as you can READ C code. If there's anything people want to see for other languages, etc., feel free to send us email suggesting it. We will have frequent updates to this web site with new content (at least monthly). Much of the content will be for other languages.
if I don't have the class headers?
And don't argue that C is a subset of C++ - it isn't. It never really was (look at the output of "printf("%d", sizeof('x'));" for a start, never mind "#include <string.h>" vs. "#include <cstring>"), and they have evolved independently since. Today, C contains lots of stuff not in C++, like variable-size arrays, some fixed-size integer types (like "32-bit integer" or "integer as large as a pointer"), the boolean types are incompatible etc.
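The `sizeof('x')` point can be checked directly. The following is a small sketch with illustrative function names (they are assumptions, not anything from the thread): in C a character constant has type `int`, whereas in C++ it has type `char`, so the same expression yields different values in the two languages. Note also that `sizeof` yields a `size_t`, so a portable `printf` call would use `%zu` rather than `%d`.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stddef.h>

/* In C, the character constant 'x' has type int, so this returns
 * sizeof(int). The same expression compiled as C++ would yield 1. */
size_t char_constant_size(void)
{
    return sizeof('x');
}

/* An object of type char always has size 1, in both languages. */
size_t char_object_size(void)
{
    char c = 'x';
    return sizeof c;
}
```

Compiled as C, `assert(char_constant_size() == sizeof(int))` and `assert(char_object_size() == 1)` both hold.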
One would expect that people who talk about programming languages, or even write a book about them, would know what language they are talking about.
Programming can be fun again. Film at 11.
Marketing.
People are more likely to buy a product if they think it's specifically designed for them. Those four categories serve that purpose.
Please observe how the description of the third category has been made as broad as it can be. Basically the author is saying that the book is not targeted at you if you are the worst programmer in the world, not a programmer, or Donald Knuth. Such an asymmetric categorization can only be for marketing purposes.
-- Repeat with me: "There is no right to profits".
to help me buy a copy for Microsoft?
Do you have ESP?
for those of you with MCSE certification, please be advised that does not count as "Previous Knowledge."
The same goes for "learning facilitators" employed in the public school system.
You can't judge a book by the way it wears its hair.
Ever take a look at the original code for the Bourne shell, for example? It looks like... a big shell script. Why? Well, due to a *lot* of #define abuse. It's quite scary.
pb Reply or e-mail; don't vaguely moderate.
Somebody send a copy to Microsoft...
Is this the same Matt Messier that was/is the MudOS maintainer?
here
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
Post pics of the hot mommas with big sweater meat plz. :)
This is mostly me talking out of my ass. I never really formally studied any of these topics, just some stream of consciousness stuff that I thought of last time our workplace went down from Blaster.
...).
I was at lunch recently with a friend who happens to work with me. We talked about the Von Neumann programming model. Any code that runs on a current processor ends up being put into a finite state Von Neumann machine. Machines have gotten so complex now, with so many services exposed, that though it's in theory still a finite deterministic state machine, it's essentially become an infinite state machine. There are too many places for things to interact, too many places for "gee, I never thought of that". Some of these are marketing related - "so how about we have IE and Visual Basic integrated into Outlook. That will be a cool feature we can sell, and there's no way that can bite us in the ass...", some are people who fit categories 1, 2, and 3.
Language does matter, but only insofar as it acts as a filter. A lot of the damage I've seen comes from Outlook running VBS. Language choice reduces some (possibly many) of the states. But it is only a filter; at some point code has to meet silicon. C/C++ gets the worst reviews because it does the least amount of filtering. Hell, not being a filter is C's core design principle. In an old non-memory-protected OS (DOS, Win16, MacOS <= 9) you could essentially hit anywhere that wasn't kernel memory. Modern OSes reduce the possible states somewhat with protected memory spaces; you can't trash anyone else's code because you're in your own VM space. They also reduce some damage by shutting down the app when they detect a bad access (outside program space), but they can't eliminate all of it because they have only a very coarse idea of the app going to an unintended state (e.g. SIGSEGV, SIGBUS, SIGFPE
Other languages are more effective filters, because they were written with that in mind. Java is a much more effective filter, eliminating many states, but not all error states. Java can't prevent you from having all your code being world writeable.
I guess one thing I'm asking is what research is there in non-Von Neumann architectures? And since we're probably not going to a post Von Neumann world any time soon, what lessons can we take from these to help secure our machines? In a more and more connected world where my fridge will eventually tell me when I'm out of milk, what can we do?
I also think tools suck. We still live in a C and C++ dominated source world. And while the tools to help reduce that complexity and those unintended states have barely advanced (how many people think machine-aided code analysis means adding -W -Wall to their compiles?), the complexity (think how many more lines of code are in by default in your basic Qt app vs. hello world; how many more lines of code in your OS?) and the threat (EVERYTHING is connected to the Internet, everything is under 24/7 threat from some bored kid in Mozambique) have kept growing. Where are the tools? Why doesn't splint do C++ code? Everyone and their mom has a combo IRC client/Text editor, but some of our basic infrastructure tools are aging very ungracefully.
Hmm, rant mode off. =)
Your fifth type of programmer fits in either the third or fourth category.
3. Those who write good code most of the time, but who don't fully realize their limitations
If you write code that is difficult to maintain for the fun of having written it, you belong in the category of those who should be limited. Let another programmer review your code before it is committed. If any line takes more than a minute to understand, he either rewrites it or passes it back.
4. Those who really understand the language, the machine architecture, software engineering, and the application area, and who can write textbook code on a regular basis.
Part of software engineering is knowing that code will require maintenance someday. The code is easy to understand. The code itself could be complicated, but comments make certain the next programmer knows exactly what is happening.
severely abuse short-circuit evaluation
It is the difference between:
for(i=1;((x=getEmployeeNumber(i))>0)||((y=getDaysSinceRaise(x))>0)||((z=calculateRaise(y))>0);i++) notifyManager(x,z);
And
//Give raises until someone does not deserve one or an error occurs
for(record=1;
((empnum=getEmployeeNumber(record))>0) //get next employee
||((days=getDaysSinceRaise(empnum))>0) //get number of days, errors return 0
||((amount=calculateRaise(empnum, days))>0); //get amount of raise, verify it is positive number
record++) //Next record
notifyManager(empnum, amount); //tell the managers so they can verify before telling the employee.
This is the same code, but the latter is more understandable.
It is not the code that matters, it is how you use it.
I spend my life entertaining my brain.
Mark Messier. Only he has a bit more hair and is unlikely to be forced into retirement anytime soon (unlike Mark).
I bought a copy of the Building Secure Software book, and it wasn't a bad read -- but the one flaw I did see was the code examples, which had too many correctness issues to be considered secure. The first rule of writing secure code in C or C++ is to avoid reliance on undocumented behaviour. "void main" is a nonstandard extension, casting from a char* to an int* without knowing if the alignment is suitable is undefined behaviour, and so on. Unfortunately I don't have my review notes from this book with me, so I can't give all of the specific defects I found in the code snippets.
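To illustrate the alignment complaint, here is a minimal sketch of reading a 32-bit value from an arbitrary byte offset without casting a `char*` to an `int*`. The helper name `load_u32` is hypothetical; the technique is simply to copy the bytes with `memcpy`, which is well defined for any source address, and let the compiler pick the best instructions.

```c
#include <assert.h>   /* used for the precondition below and in the usage checks */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint32_t (in host byte order) from any byte
 * address. Dereferencing (uint32_t *)p instead would be undefined behaviour
 * whenever p is not suitably aligned for uint32_t. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* byte-wise copy, valid for any alignment */
    return v;
}
```

Writing a value into a buffer at an odd offset with `memcpy` and reading it back through `load_u32` returns the same value on any conforming implementation, whereas the cast-and-dereference version may fault on strict-alignment hardware.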
Could this code not be reviewed by a C or C++ "expert" before publication? At least someone who can spot things that could cause a compilation failure or a core dump?
C and C++ inherently increase the risk of security problems because they don't check buffer bounds, so the program ends up with all of the security problems that would've existed in a buffer-safe language, plus buffer overflow vulnerabilities.
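Since the language will not check bounds for you, the check has to live in the code. The following is a minimal sketch of a length-checked string copy; the helper name `copy_string_checked` and its error convention are assumptions for illustration, not an API from the book or from any standard library.

```c
#include <assert.h>   /* used for the precondition below and in the usage checks */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the NUL-terminated string `src` into `dst`,
 * which has room for `dst_size` bytes including the terminator.
 * Returns 0 on success. If `src` does not fit, nothing is copied, `dst`
 * is set to the empty string, and -1 is returned. */
int copy_string_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dst_size > 0);

    len = strlen(src);
    if (len >= dst_size) {      /* no room for the string plus its NUL */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);  /* len + 1 copies the terminator as well */
    return 0;
}
```

Refusing the copy outright, instead of silently truncating, forces the caller to decide what an oversized input means; that is a design choice of this sketch, not the only reasonable one.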
I want to know why the language should check for buffer bounds. I think a class system should do this. I think a class system knows (or can know) what the memory is for, or at least the pattern of its usage, and these are assumptions you do not want to make at the language level.
Why is your argument to use a different language, when you could easily just adopt the policy of using a particular memory management strategy, in a class you write or buy?
As for being happy they found a buffer overflow, imagine how happy I was to find a memory leak in a Java program! Well, not too happy actually.
-pyrrho
To the ACs who noticed that they should have been &&s, not ||s, you are correct.
About the lousy comments, you are also correct. Some of my code does look like this, because I would start by writing an English version, comment it out, and then insert code. The redundant comments get removed as time passes and the code is revised. My comments tend to focus on the business reasons, with a few to explain more complicated code or to warn about possible failures. I agree that "Code Complete" is a good book for programmers who want to move to category 4.
I wrote the example quickly to point out that using chained "||"s in an "if" was not necessarily bad, but if it made the code difficult to understand, then it was bad. So there was no fifth category of programmer, since someone who coded this way would still fit into the first 4 categories. (A case could be made that many who program this way would be category 1.)
About the lousy code, sorry. My first example had functions with names like "action1()", and could have done anything. There were 4 examples of code, with the last one attempting to get cute by finding a possible business use for the code. I redid the other examples to match the last version, then deleted the middle ones to save space. I had not noticed that I needed to switch from ||s to &&s to make sense. I almost never code that way, since it does make the code less understandable. Try working on code you wrote 10 years ago. You will wish it was very easy to understand. Then imagine someone else trying to work on it. I do not want the negative karma from having that other programmer curse at me.
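For reference, here is a hedged sketch of the loop with the `&&` correction applied, so that it continues only while every step succeeds. The function names follow the example in the thread, but their bodies are stubs invented purely so the sketch compiles and runs; the employee numbers, day counts, and raise amount are made-up data.

```c
#include <assert.h>   /* used by the check in the usage note */

static int notified = 0;   /* counts calls to notifyManager (stub bookkeeping) */

/* Stub: three employees, numbered 101..103; 0 signals "no more records". */
static int getEmployeeNumber(int record)
{
    return (record >= 1 && record <= 3) ? 100 + record : 0;
}

/* Stub: days since the last raise; 0 signals an error or missing data. */
static int getDaysSinceRaise(int empnum)
{
    switch (empnum) {
    case 101: return 400;
    case 102: return 380;
    default:  return 0;
    }
}

/* Stub: a flat raise for anyone who has waited more than a year. */
static int calculateRaise(int empnum, int days)
{
    (void)empnum;
    return days > 365 ? 1000 : 0;
}

static void notifyManager(int empnum, int amount)
{
    (void)empnum;
    (void)amount;
    notified++;
}

/* Give raises until a record is missing, an error occurs, or no raise is
 * due. Each && operand runs only if the previous one succeeded. */
int give_raises(void)
{
    int record, empnum, days, amount;

    notified = 0;
    for (record = 1;
         (empnum = getEmployeeNumber(record)) > 0          /* next employee */
         && (days = getDaysSinceRaise(empnum)) > 0         /* 0 means error */
         && (amount = calculateRaise(empnum, days)) > 0;   /* positive raise */
         record++)
        notifyManager(empnum, amount);
    return notified;
}
```

With the stub data, the loop notifies the manager for employees 101 and 102 and stops at 103, whose day count is 0, so `assert(give_raises() == 2)` holds.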
I spend my life entertaining my brain.
what research is there in non-Von Neumann architectures?
Take a look at TTA. Probably the coolest computer architecture ever. Processors designed to be so simple that they don't even have an instruction set. I read about this concept years ago, in Byte, but the idea has never really made it out of academia.
IIRC, it was originally intended for massively parallel computing (possibly as a backend for Lisp programs), which it was suitable for because it increases the granularity of operations.
Someone should write a similar book, but with recipes for what *not* to do. "Don't Do What Donny Don't Does". :)
...
Oh, wait. Just go buy any book from Microsoft Press
...you will realize that writing beautiful, readable code takes more prowess, discipline, and skill, than writing "cleverly convoluted" code.
1) Those that are too lazy to write good code
2) Those that are truly incompetent for the code complexity they currently have to develop
3) Those that think they are in category 4 instead of category 1.
4) Those that would be in category 2 if given the proper coaching
RogerWilco the Adventurous Janitor