Fix the Bugs, Secure the System
LiquidPC writes: "OpenBSD's Louis Bertrand has put his MUSESS 2002 presentation online, entitled
Fix the Bugs, Secure the System. Does an overview of OpenBSD, then explains Format String Ugliness, Buffer Overflows, The Wrong Way to Fix Overflows, along with numerous other things."
that doesn't mean there are 20,500 unique bugs.
It was a bit tedious flicking through all those slides but the final one did bring a smile to my face.
Sure, the kiddies can still twiddle with system calls, but if they can't put _their_ code somewhere where _they_ can execute it, it raises the difficulty level of creating an exploit by an order of magnitude. Sure, false sense of security, blah blah blah, but really, shouldn't this (non-exec stack) be a standard feature of any OS that purports to be secure?
Damn it's tough to code in C these days, keeping track of all the stuff that one needs to to be reasonably secure.
Not to mention the added overhead of making the system secure from semantic errors. Yeesh, it's a good thing I get paid a lot for my C work.
But that's all okay, because (finally) there is technology, like Java, C# (okay this one sucks but whatever), etc., that will help out and provide a truly _secure_ development platform.
I just hope they still pay me as much when this stuff finally gets easy, like it should be.
But then I guess producing a high quality operating system keeps them busy enough...
Programming can be fun again. Film at 11.
Just because something does something in error doesn't mean it's exploitable. If, say, the newest OBSD distrib forgot to provide a copy of disklabel, that's a pretty serious bug. You can't do a fresh install. A denial of service? Hardly. If the /etc/services file was missing an entry for httpd, it's an inconvenience, but still a bug.
Maybe I've been trolled, but thought I'd clear that up. A bug is an error in that a piece of functionality isn't right. An exploitable program or process can be a subset of it... that is, if being exploitable isn't part of the original plan.
-
ping -f 255.255.255.255 # if only
Just searching for 'OpenBSD Bug' on Google Groups retrieves over 20,500 queries.
Searching for "Brian bug" on Google shows 441,000 hits. Clearly you're 20 times buggier than OpenBSD, so I wouldn't be slinging implied accusations around.
What's the point of a rock-solid operating system if very few are actually using it (and of course, that happens because of lacking features)? For a server security is always the second issue - the first being the service provided.
(I'm definitely exaggerating here, so flame me as you like)
The Raven.
One of the problems with secure programming is the inertia in the computer industry; most of the operating systems in widespread use today (the *nix clones and DOS derivatives, these days) were developed in a time when security did not matter; *nix has a crude root-or-not security model and MS-DOS has no conception of security at all.
Personally, I think the solution is a system which has a real security model, such as EROS. The "audit the code so that it is perfect code without bugs" approach to security does not always work, even with OpenBSD.
- Sam
The secret to enjoying Slashdot is to realize that it should not be taken too seriously.
The skeleton in front just left of the middle? The one with a beak and wings?
:-)
That was a penguin.
with the same technique, searching for '"OpenBSD bug"' (note the quotes) returns only 93 results.
but this is only using the same yard stick.
beat yourself which ever way you want.
Note that this was google groups, by the way, not generic google search.
on the generic google search, with quotes, the total results are 352 for "openBSD bug"
"It is a greater offense to steal men's labor, than their clothes"
ok, you're just full of it now. Most businesses look at FreeBSD as a sane unix OS. Linux on the other hand is almost communistic. FreeBSD has always been the better server OS over linux. Every single benchmark I've ever seen proves that. Sad thing is though, newbie sysadmins have this strange notion, due to posts like yours, that linux is easier to use. FreeBSD is simply server-orientated. Just because it's not the most popular doesn't mean it's not better. Let me further my point: I mean heck, windows is more 'popular' than linux. But who gives a hoot? (hint: what's this new vm for the linux kernel modeled after?) And, this is especially critical in proving BSD isn't dead: Mac OS X uses BSD. Hello! Apple chose BSD for its core, and now Apple Computer sells more copies of a unix-based OS than ANY OTHER COMPANY. More than RED HAT.
While we're on this topic, this Secure Programming HOWTO for Linux and UNIX might be of interest. It's a pretty comprehensive book. And best of all, it's free! :-)
Linux is for the windows convert. FreeBSD is for the unix convert.
Linux continues to copy off FreeBSD - just look at the latest VM work being done to the kernels.
I don't care what's popular - if we went by popularity, we would be saying linux was dead.
SCREW THE NUMBERS, BSD FOR EVER!
If this had been converted from presentation-style to an actual webpage, it would have been deemed a big waste of time. Where is all the information? There isn't even anything new here, I already knew everything there, and I've only been using OpenBSD for a couple weeks.
The only thing there was a long list of titles with no information, old or new.
Lack of eloquence does not denote lack of intelligence, though they often coincide.
Why is it that when MSFT does something like stopping to fix bugs and secure systems, we make fun of them, but if it's BSD we look at it as something we can learn from?
I'm a CS major, and we just got some sample code from the professor to help us on our first project. The very first thing it does in main is have a buffer overflow.
#define SZ 100
char buf[SZ];
cout << "Enter courses filename: ";
cin >> buf; // BAM!! (unbounded read into a 100-byte buffer)
This is C++! We have the string datatype for this! There's absolutely no excuse for this--especially in code that will be referenced as "good" code by everyone else in the class.
So anyway, the point of this rant is that security will remain horrible until we start teaching people to write securely in the first place.
~~~LXT~~~
Life is like a computer program: anything that can't happen, will.
strcpy (dest, input); /***WHAM!***/
Here the code copies the input string to the destination, regardless of what size the input string is.
if (strlen(dest) >= MAXLEN) {}
Here the code checks to see if the input data is larger than the buffer that it is being copied to, which is great and all except that it is being done AFTER the cpy took place. It's like drinking a bottle of clear liquid in a chemistry lab and THEN checking the label to see if it's sulfuric acid.
I'm no C expert either, so I may have missed something.
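For what it's worth, here is a minimal sketch of doing the check BEFORE the copy instead of after it. The helper name copy_if_fits() is made up for illustration and is not from the slides; it just refuses the copy when the input (plus its terminator) would not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, NUL included.
 * Returns 0 on success, -1 (dst left untouched) when src is too long. */
static int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    /* strlen() excludes the terminator, so we need len + 1 <= dst_size. */
    if (strlen(src) >= dst_size)
        return -1;          /* rejected before any byte is written */

    strcpy(dst, src);       /* safe here: the bound was verified above */
    return 0;
}
```

The caller still has to decide what to do on -1 (report an error, truncate, allocate more), but at least nothing has been trashed by the time it finds out.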
Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
That said, yeah, he should use cin.getline().
Hey, at least he used #define to set the array size. Wait until you get hit with a 100,000 line program to modify where the author didn't use #define...
Best Slashdot Co
I can't believe there is not one mention of using a language other than C. Is it the systems community? Is it because of BSD's history?
I don't know why this idea fails to even come up. Network servers are bandwidth-limited, not cpu limited, and writing them in a safe high level language is not only easier, but makes buffer overflows impossible. Being easier to write also of course allows more time for optimization and for other security fixes. (For those that need really high-performance for their gigabit links, maybe a C version and very careful maintenance is possible. For home users, this prospect is ridiculous.)
C seems almost *designed* to allow for buffer overflow exploits. If we want secure programs, we should be starting from more secure foundations!
For more detail, check my previous rant, "C lang remains inappropriate for network daemons": http://slashdot.org/comments.pl?sid=24271&cid=2629013
For x86 with standard stackframe setup, there is an answer: length _MUST_ be less than (EBP - *ptr) if the stack isn't to be trashed. Note that other local data may well get trashed. But at least the pgm doesn't lose control.
The wrapper could drop early chars or trailing chars, but should signal an error in the unlikely event the code has been made with error trapping. Of course, this wouldn't work if the code was compiled with -fomit-frame-pointer [or equivalent], but there is a price for security.
I don't agree with your assessment that safe high-level languages necessarily perform badly. (What is the difference between speed and performance?) But, let's forget about that.
What is "OS-level" about an ftp daemon? BIND? Mozilla? Gnutella? All sorts of network (and other) applications are written in C, even though there certainly isn't any need for performance or device-level bit manipulation. (At least, I would place security way above performance!)
Cyclone is actually from Cornell, by the way. It's a good project for moving systemsy people away from C, but there are already mature programming languages that are not slow, and yet are secure by default. (Try SML or O'Caml, for instance.)
If you want to make sure people don't make a particular mistake, make it impossible for them to do so. That means you either 1) fix C to eliminate all buffer overflow issues (impossible, IMO), 2) enforce proper coding technique, possibly through a special string library and/or macros (very difficult on a project as large as an OS), or 3) ditch C completely (virtually impossible given the size of the Linux code base).
What? I don't think you know what you're saying. In any modern operating system, it's not possible for one process to write over the memory of another. Furthermore, saying that this is a Java exploit when it necessitates another process in another language is totally missing the point.
If a hacker exploited one process this way, then why would he bother to exploit the java program rather than just execute whatever code he plans on executing?
You are still totally wrong and I WILL be surprised if something like that happens.
The reason for all this buffer overflow crap is that in C, and thus also in C++, people tend to use arrays or blocks of allocated memory to represent strings. What's needed is a string datatype IN the language, like int and char. Then, the compiler can do as the CLR does: allocate the strings, even local scope ones, on the heap. This way, no buffer overflows can happen, since the type is in fact a black box, so the overflow will cause some kind of error, plus the overflow can't be used to modify the stack frame and thus the return address, since the string variable isn't allocated on the stack.
In C++, there is the string class in the std lib, but it's not native to the language. (almost native ok, but not totally like in C#).
C is a language where the respect for the borders of a block of memory is in the hands of the developer. Clearly, that's too old fashioned today, since language elements can prevent mistakes C allows developers to make.
Never underestimate the relief of true separation of Religion and State.
Of course this is a hit on a newspost containing the quote "I did some OpenBSD bug research, and found that there are none". One reply states that "OpenBSD bugs are dying" and the other 91 results are AOL "me too" replies to the first post.
karma capped
Since strncpy() does exactly the same thing, just without bothering to always NUL-terminate the resulting string.
Data discarding can be detected by checking return values; you can't do much against people not checking the result of their call. The question is, which API is the less troubling? strncpy() or strlcpy()?
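To make the "check the return value" point concrete, here is a rough sketch with strlcpy()-like semantics. It is a hypothetical stand-in called sketch_strlcpy(), written for systems whose libc has no strlcpy(); it is not the OpenBSD source. It always terminates the destination and returns the length of the source, so a result greater than or equal to the buffer size means data was discarded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy()-like semantics. Always
 * NUL-terminates when dst_size > 0 and returns strlen(src), so a
 * result >= dst_size signals that the copy was truncated. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dst_size)
{
    size_t src_len;

    assert(dst != NULL && src != NULL);
    src_len = strlen(src);

    if (dst_size > 0) {
        /* Copy what fits, leaving one byte for the terminator. */
        size_t n = (src_len < dst_size) ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A caller that cares about lost data compares the result against the buffer size; a caller that doesn't care still ends up with a terminated string, which is the difference from plain strncpy().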
buffer overflow.
A function should always throw out data that doesn't match its parameters. If a function expects an int and the user passes a double, it gets changed back to an int. The user's data gets lost, but that's his fault for using the program incorrectly. Every C compiler known to man behaves this way. Why should strings be any different?
No, Thursday's out. How about never - is never good for you?
Learn it in Python. Really. Python 2.2 offers a whole host of lovely functional-programming features. Continuances, even. :)
I prefer to write functional code in LISP or Scheme, but I won't sneer at someone who uses Python functionally. It might lessen the learning curve for you, let you experiment around with functional programming, and then use what you learn there in Scheme, LISP or Ocaml.
The problem is that most universities out there still only have a CS program, not a SE program. I've been ranting on this topic for at least a dozen years or so.
The head of the CS department of my old college is a friend of my Father-in-law, and they don't see the problem - which is why they keep producing people with CS degrees who can't work in the real world
-- 73 de KG2V For the Children - RKBA! "You are what you do when it counts" - the Masso
Not to flame, but "Four years without a remote hole in the default install!" is nothing compared to MS-DOS's twenty year safety track record. That, and thousands of "potential" buffer overflows in realistically safe code like this:
int SomeFunc ()
{
    char foo[] = "Hello";
    OtherFunc(foo);
}
OtherFunc(char * foo)
{
    /* this is only ever called from SomeFunc(),
     * which passes a string literal. This is, of
     * course, completely undocumented. You never
     * read this comment.
     */
    char * bar = malloc(strlen(foo)+1);
    strcpy(bar, foo);
}
Yes, OpenBSD is a very nice OS, but no, it isn't a magic bullet.
It -is- relevant that it is used inappropriately. That it is so easy to do is exactly the problem. Scissors are a bad example. Try the Chevy Corvair. Sure, if you were really careful it wouldn't blow up... The problem is just how careful you had to be.
Like any computer operation, strcpy() is safe given a certain set of invariants. In this case, the invariants are that both src and dest are non-null buffers, and that the src is of at most equal size to the destination buffer. However, the only way to know this is to know the size of the src (either at compile time, or strlen()), and the size of the dest.
But since you already have to know the size of the dest, why not just include that as a parameter to the copy? You've eliminated the problematic invariant, and replaced it instead with the invariant that the length parameter you pass has to be correct. Since you have to know that anyway, this should clearly be better.
The only time strcpy() ever made sense was on machines so small that it was advantageous to -not- have to check the size. As soon as this was no longer the case (which I'd argue was as early as the C64), strcpy() should have become deprecated in favor of strncpy().
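As one illustration of "pass the dest size as a parameter", here is a small sketch that swaps in snprintf() rather than strncpy() (my substitution, not something proposed above). The helper name bounded_copy() is hypothetical. snprintf() never writes more than the size it is given, terminates the result, and reports how much room the full string would have needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: bounded copy via snprintf(), which writes at most
 * dst_size bytes and always terminates when dst_size > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated or on error. */
static int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    int needed;

    /* The invariants from the post: non-null buffers, real dest size. */
    assert(dst != NULL && src != NULL && dst_size > 0);

    needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

The only invariant left for the caller is that dst_size really is the size of dst, which is the one piece of information it had to know anyway.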
The enemies of Democracy are
Not with java. Exceptions are a normal part of program flow. Not of necessity, but enough of the standard APIs and documentation relies on them to make it fairly standard.
ahde said: Not with java. Exceptions are a normal part of program flow. Not of necessity, but enough of the standard APIs and documentation relies on them to make it fairly standard.
I don't buy that. Yes, just about any function that can signal an error condition does so by an exception. But if your code is correct, that will not happen many times in an execution. I.e., if you've got an inner loop that throws/catches an exception at every iteration, you're doing something wrong. Exceptions are, by definition, not regular program flow.
you need to do:
char dest[MAXLEN];
/* strlen() does not count the terminating null */
if (strlen(input) >= MAXLEN -1)
{
    /* handle error */
}
else
{
    /* you have to check this before doing your use strcpy() -- or else the damage will already be done */
    strcpy (dest,input);
}
this still leaves two possible errors, if
input is less than or equal to MAXLEN but
not guaranteed to have a terminating null
character. You will either lose a character, or end up with an unbounded string.
you need an additional condition:
else if (strlen(input) == MAXLEN -1)
{
    /* check for null byte at last place */
    if (input[MAXLEN] == '\0')
    {
        /*ok */
    }
    else
    {
        /* optionally add it, or handle error */
    }
}
or else, do the same as strncpy and call bzero() or memset() to fill the whole dest[] array with zero bits before copying. This is a little more expensive.
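The zero-fill variant might look roughly like the sketch below. MAXLEN is given an arbitrary value here and copy_zero_filled() is a made-up name, both purely for illustration. Copying at most MAXLEN - 1 bytes into an already zeroed buffer means the last byte is always a terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXLEN 16   /* assumed buffer size, for illustration only */

/* Hypothetical helper: zero the whole buffer, then copy at most
 * MAXLEN - 1 bytes, so dest[MAXLEN - 1] is guaranteed to stay '\0'.
 * dest must point to at least MAXLEN bytes. */
static void copy_zero_filled(char *dest, const char *input)
{
    assert(dest != NULL && input != NULL);

    memset(dest, 0, MAXLEN);
    strncpy(dest, input, MAXLEN - 1);
}
```

Strictly speaking strncpy() already pads the first MAXLEN - 1 bytes with zeros when the input is short, so the memset() only matters for the final byte; that is the "little more expensive" part.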
Actually, in a long-running system (such as a network server), a garbage collector is an advantage, not a liability:
1. Memory leaks are not possible.
2. Heap compaction IS possible (the garbage collector can move around data rather like DOS defrag). That means that the heap loses its fragmented nature when necessary. It's true that a C program does less allocation, but the malloc model doesn't allow for the heap to be defragmented! So for a long running program, you are typically stuck with fragmented memory that can't be reused...
So I say garbage collected languages win on this point!
strlcpy(dest, input, MAXSIZE);
or:
strncpy(dest, input, MAXSIZE);
dest[MAXSIZE - 1] = 0x00;
No need to go to all that trouble.
Bah, if your string is not null terminated, you are introducing a bug. Neither strcpy nor strncpy ensures this, so it is still left to the programmer to do it. If it is done to the source string (and as you said, you know the destination size) before it is ever fed to either function, which one you use is irrelevant.
If you force a null in your source at a location determined by the size of the destination prior to copying, then you've effectively changed the semantics of copying strings, which normally leave the source unchanged. Copying long strings into various smaller-sized buffers would have different behavior depending on the order in which you did the copying.
Forcing null-termination of sources is not the answer, because if you have a source that is not already null-terminated, then you have a bug elsewhere in your code. We're talking about strcpy()-related bugs here. That means you already have a string in a buffer of sufficient size to hold the string.
So by fixing the bug in the program (by ensuring all strings are null terminated, not by replacing every strcpy call with strncpy) you are also ensuring the program is secure (it is no longer vulnerable to buffer overflows), wasn't that one of the points of the original article?
No, because the problem isn't strings that aren't null-terminated. The problem is strings that are too big for the space they are going to go in. It might seem like these are the same thing, but they aren't.
This doesn't mean strncpy is useless, for example, you may have to use a string you don't want to (or can't) vary by null terminating it (a constant perhaps), so you'll need strncpy to safely make a copy of it that you can play with.
Having a copy to play with is why strcpy() is used. If it was acceptable to modify the source, then you wouldn't make a copy.
The enemies of Democracy are
The bug being that you didn't ensure the termination...
:)
:)
No, the bug is that your input reading routines are unsafe. But regardless, the point is to fix that bug at the source, not fix it every time you're about to call strcpy().
Making assumptions about the input is the strcpy bug isn't it?
No, that would be a gets() or scanf() bug, in most cases (which, btw, are worse than strcpy()). The strcpy() bug is copying a string too big for the dest buffer, and that doesn't have to have anything to do with input. Not to mention that ensuring that your very large input is null-terminated doesn't stop you from having a buffer overflow when you strcpy() that string into a smaller buffer. I told you they weren't the same thing, but you didn't believe me.
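Fixing it "at the source" could be sketched like this, using fgets() in place of gets(). The read_line() helper and its return convention are my own invention for the example. It never writes past the buffer and tells the caller when a line was too long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf without overrunning it.
 * Returns 1 for a complete line (newline stripped), 0 if the line was
 * longer than the buffer (the excess is discarded), -1 on EOF or error. */
static int read_line(FILE *in, char *buf, int buf_size)
{
    size_t len;
    int c;
    int discarded = 0;

    assert(in != NULL && buf != NULL && buf_size > 1);

    if (fgets(buf, buf_size, in) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline seen: drain the rest of the line, if there is any. */
    while ((c = getc(in)) != EOF && c != '\n')
        discarded = 1;

    return discarded ? 0 : 1;
}
```

Note this only bounds the input. A later strcpy() of that buffer into something smaller still overflows, which is exactly the distinction being argued above.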
Well you may want to store a copy of a string, the contents of which changes, there are plenty of reasons for duplicating a string, modification is only one of them.
If your intent is to copy source so you can change it later, well that's not really any different. In fact there is very little reason to make a copy of a string unless you intend to (or intend to allow) changes to one or the other that you don't want reflected in both. The only exception would be copying into a specific buffer for something like an RPC call. In any case, strcpy() is unsafe unless you modify the source, which as I said changes the semantics of copy.
BTW, most of those cases where you are making a copy of a string whose contents change -- such as getenv() or other functions that return a string pointer -- are bad because they aren't thread safe.
I just feel strcpy is offered as a scapegoat, for the causes of insecure programs (when I'm sure we both know it is poor programming practices). So I stand by my original statement, strcpy does what it claims to do, if you use it to do something when you actually wanted to do something else, more fool you. Forcing people to use strncpy instead of strcpy will not ensure safer programs.
If you think I'm saying not using strcpy() will make all programs secure, then you've misread everything I've said. With that understanding, realize that I'm saying strcpy() is a poor programming practice. It "claims" to do something that is fundamentally unsafe to do without even the basic check of having a length field.
The enemies of Democracy are
So if I assume that my source string will fit into my destination string, this is not making an assumption on the input to the function? It seems the same to me. :)
:P
Oh...Input to the function, in this case strcpy(). I was talking about input to the program. Input you read from a file (which includes stdin, of course) can be of any length, and assuming otherwise is a common error.
It doesn't however enforce that you can't check the field length before hand, the same way there is nothing stopping you from checking the value of a pointer before you try to dereference it.
And what do you do if the dest field is too small? You can't use strcpy() anymore. You either 1) modify the src to make it smaller (bad, modifies copy semantics), 2) exit on a failed assert() (bad, makes a condition unnecessarily fatal), 3) use an alternate code path in this case (hacky, ugly, and would probably just be a call to strncpy()), or 4) just use strncpy() in the first place.
Note that I'm assuming that if you are using a dynamically allocated buffer, you would just be checking the size and (re-)allocating a big enough buffer in the first place. That isn't a general solution, so I don't include it.
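For completeness, the dynamically allocated case is roughly the following sketch. copy_alloc() is a hypothetical name; where available, POSIX strdup() does the same job. The destination is sized from the source, so there is nothing to overflow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: size the destination from the source, so the copy
 * cannot overflow. Returns NULL if malloc() fails; the caller must free()
 * the result. */
static char *copy_alloc(const char *src)
{
    size_t size;
    char *copy;

    assert(src != NULL);

    size = strlen(src) + 1;      /* +1 for the terminating NUL */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, src, size);
    return copy;
}
```

This only helps when the code doing the copy owns the destination buffer, which is why it isn't a general answer.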
If I've already verified my assumption that the source string is smaller than the destination string (validating the input), I can safely use strcpy(). I could just as easily use strncpy(), and then check that the destination string is null-terminated (validating the output). Both of these methods still require validation, so I don't see the gain in using one over the other. Without validation, strncpy() is safer, the same way it is safer to be in a car with airbags than a car without if you never wear a seatbelt.
What do you mean, "verified my assumption that the source string is smaller than the destination string"? Look, if you had that assumption, then you should have stated it. Because that's not an assumption I'm making, nor requiring. It is a bug to use strcpy() when the dest is smaller than the source. You are getting around the bug by assuming that the dest is big enough, and checking that assumption. So above where you talk about making an assumption about input... That's an assumption you're making, not some theoretical bad programmer. Well, no wonder you don't see the gain.
The way to understand what you gain is to realize that there are two prerequisites for the bug: 1) use strcpy() 2) src bigger than dest. You try to eliminate the second condition by assuming it is true and checking (with unspecified error recovery), not realizing that all you need to do is eliminate the first condition! You've added an unnecessary invariant to every string copying operation. An invariant, I might add, that it isn't all that uncommon to have be untrue, which is the worst kind of invariant to have. The reason why strncpy() is better is because it doesn't require this invariant.
strncpy( dest, src, dest_size); dest[dest_size-1] = '\0'; always works, no matter the size of dest and src, and without modifying the src, calling exit(), or any other such sloppiness. Well, obviously, dest and src can't be NULL and whatnot, but those certainly aren't caveats specific to string copying operations.
I hope it is clear why code that always works with minimal assumptions is better than code that only works with additional bad assumptions and an unclear and almost certainly undesirable recovery path should the assumption prove false. I hope it is clear why strncpy() is the former, and strcpy() the latter.
All that being said, I do wish that strncpy() would stick the null in itself. It only saves a line of code, but it's a line of code you always need and thus it only makes sense to roll it into the function you need it with. I usually end up making an inline function that just does exactly that. Actually, I just wish more systems supported the OpenBSD strlcpy().
The enemies of Democracy are
Not only is it an assumption I'm making, but (as I stated) also an assumption I'm verifying in the hope that I don't get mistaken for that theoretical bad programmer.
:)
But you haven't indicated what happens when that assumption proves false. What do you do? No offense, but making this assumption when unnecessary I think constitutes bad programming.
What happens when you want everything that is in src, not just what fits in dest? I guess that would be another unstated assumption.
That's not an assumption (that you want everything in src to be copied into the dest), that's a goal. You're then dealing with a special case of copying. But, taking that as a goal (which may be common, but is not an aspect of general string copying), you're in either 1 of 2 cases:
1) You can do dynamic (re-)allocation, because you have control over the dest buffer. If you absolutely have to have the entire string fit in the dest, this had better be the case. I already addressed this case, last post.
2) You can't dynamically create the dest at that point (the function in which you are doing the copying did not create the buffer, and can't re-allocate it). In this case, well, you've pretty well painted yourself in a corner by requiring something you can't control. What do you do?
Bottom line for this: If you need that extra constraint on operation, then you must provide it. Regardless, my 2 lines of code will always be safe, with the minimal amount of constraints.
Of course this gets messy because you then make the assumption that src has a reasonable finite size.
Finite is a good assumption. Reasonable is not. If you are requiring that you can copy the whole src into the dest, then you allocate your buffer right there and let malloc() decide if it is "reasonable".
I worry something like this will appear: "*Note: We use strlcpy() here because it is safer than strcpy() and strncpy().", forgetting of course to add the checks.
No doubt. Shitty books abound; no need to wait for them to be written. Yet here's something curious: strncpy() without the check is still more secure than strcpy(). So long as your dest size is always correct, then even if you have non-null terminated strings floating around, you'll never write more than the target buffer can hold. Thus you won't have buffer overflow. Though you'll probably segfault first time you call strlen() or printf()
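A tiny sketch of that last point: after a bare strncpy() the buffer is never overrun, but it may have no terminator at all. The is_terminated() helper is a made-up name for this example; it checks for a NUL inside the buffer before anything like strlen() gets a chance to run off the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: report whether buf holds a NUL within its first
 * buf_size bytes, i.e. whether it is safe to hand to strlen() or printf(). */
static int is_terminated(const char *buf, size_t buf_size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', buf_size) != NULL;
}
```

For a 4-byte dest, strncpy(dest, "abcdefgh", 4) leaves is_terminated(dest, 4) false; adding dest[3] = '\0' afterwards makes it true, at the cost of one line.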
In the end I feel that the copying of strings is not a trivial exercise in C, and if people don't validate (ie check) their assumptions (or even realise they are making them), then trouble will ensue, no matter what standard function they use.
That is true, but also meaningless. It's the same as the "all OSs have had security issues, so you can't compare security" argument. C does require care, but that doesn't mean there aren't good and bad functions, or good and bad assumptions/invariants. Part of being careful is picking the right tools. "C was not designed to be safe" isn't an excuse to go sprinkling your code with setjmp()/longjmp() or gets(). And it doesn't get around the fact that strncpy() is preferable to strcpy() in nearly every way.
Though you keep talking about "checking" assumptions. Note: dest[dest_size-1] = '\0' isn't a check, it's a guarantee. It doesn't check if the assumption holds true (so you can take corrective action), it causes the assumption to hold true. That is good code, and a good assumption.
Also, there is nothing stopping people from creating their own wrappers (as you suggested with your inline function), this could be done once (for a program, or a project, or all projects), and it is never a concern again.
Show me the version of the inline function that uses strcpy(). I can't come up with one that is both safe and free of undesirable behavior off the top of my head, but then again I'm not spending much effort because I already have a strncpy() version in 2 lines.
The enemies of Democracy are
he was asking about his code. strlcpy and strncpy will both get the job done (if available) but I didn't think that was what he was asking. Besides, you could clean up my code, put it in a library and call strahdecpy() just as easily.