Domain: and.org
Stories and comments across the archive that link to and.org.
Comments · 76
-
Re:Very interesting comment about GNU libc
So, did his Hello World support multibyte character sets, or, in fact, any sort of internationalization?
Without being nice about it, dietlibc is a piece of shit. If you just want a syscall list and the obvious functions (memcpy() etc.) use klibc. If you need more, then dietlibc is almost certainly broken IMO; every time I've looked at something "big" in it the implementation was worthless. In fact the printf()-like function is unique only in how terrible the implementation is, and that's probably the most widely used function in libc.
As for multibyte, that's no problem because Felix is using a broken string library, which is one of the few things that tries to forgo the use of printf() making i18n almost impossible.
-
Re:BIND crap
Link should have been... http://www.and.org/vstr/security.html
-
Re:Huh?
Actually, you're wrong. The exception is "checked" for each operation that can generate the exception. Once an exception is thrown, control immediately passes to the catch(){} block. So in order for the second op to run, the first needs to succeed.
No, the second op is only run if the first fails, and it's run outside of the try, as is the close(). And according to Sun's documentation both calls are allowed to throw exceptions. This was hard to read due to /. eating all the formatting. I've put it here so you can easily see it.
-
Re:Imagine that you are an alcoholic...
C has no formal definition for exceptions (signals can't really count)
And exceptions are good, because...? Ahh yes, it's such a joy when I run a random python app. with input it didn't expect and get a screen full of exception traceback.
No, really. If I'd said to people 10 years ago, ok so there's this great idea that everyone should use called the "invisible return value". Basically what happens is that after you write to an API I can happily add extra return values you have to handle, however there'll be no possible way for you to find out what they are from looking at the code (and with most APIs I won't even declare them, so it'll basically be impossible to find out). Even better, because everyone will do this, even changing what my API calls might change what I can return, and even I won't know ... hahahha.
There are ways to declare side effects of functions in C, so that you can do a group of operations and only check for errors in a single place. For string APIs in C, SafeStr, Vstr and glib all do this in different ways. However it is often much more readable/secure to not turn explicit return values into invisible ones (or at least to provide both; Vstr does this).
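A minimal sketch of the "check in a single place" idea, assuming a hypothetical struct sbuf builder (the type and the sbuf_* names are made up for illustration; this is not the SafeStr, Vstr or glib API). Each append still returns an explicit success value, but a failure also sets a sticky flag, so a group of appends can be checked once at the end:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical builder type, for illustration only. */
struct sbuf {
    char  *data;   /* heap storage, NUL-terminated once allocated */
    size_t len;    /* bytes in use, excluding the terminator */
    size_t cap;    /* bytes allocated */
    int    failed; /* sticky flag: set on the first failure */
};

static void sbuf_init(struct sbuf *s) {
    s->data = NULL;
    s->len = 0;
    s->cap = 0;
    s->failed = 0;
}

/* Explicit return value: 1 on success, 0 on failure.
 * A failure is also recorded in s->failed, and later calls become no-ops. */
static int sbuf_add(struct sbuf *s, const char *text) {
    size_t n = strlen(text);

    if (s->failed)
        return 0;
    if (n >= SIZE_MAX - s->len) {       /* len + n + 1 would overflow */
        s->failed = 1;
        return 0;
    }
    size_t need = s->len + n + 1;
    if (need > s->cap) {
        size_t cap = s->cap ? s->cap : 16;
        while (cap < need)
            cap = (cap > SIZE_MAX / 2) ? need : cap * 2;
        char *p = realloc(s->data, cap);
        if (p == NULL) {
            s->failed = 1;
            return 0;
        }
        s->data = p;
        s->cap = cap;
    }
    memcpy(s->data + s->len, text, n + 1);  /* copies the terminator too */
    s->len += n;
    assert(s->len < s->cap);
    return 1;
}

/* Three appends, one check. Returns a heap string (caller frees) or NULL. */
static char *greeting(const char *name) {
    struct sbuf s;

    sbuf_init(&s);
    sbuf_add(&s, "Hello, ");
    sbuf_add(&s, name);
    sbuf_add(&s, "!");
    if (s.failed) {
        free(s.data);
        return NULL;
    }
    return s.data;
}
```

The caller can ignore the individual return values as greeting() does, or test each one; providing both is the design choice described above.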
-
Re:Java : C :: Emacs : vi
The reason is simple. While I can build a system that doesn't suffer from buffer overflows in C, the only way I can guarantee this is by looking through the code - a code audit, in security parlance.
Again, this isn't true. You don't need to audit all of the code ... if all of the code uses a dynamic string API, then you only need to make sure the API can't cause problems. And as with Java, where you only need to know that the overflow checks work, a good implementation will come with tests to more or less prove that this is the case.
-
Re:Java : C :: Emacs : vi
Now, as far as errors go: it's true that experienced programmers (in whatever language) will make fewer mistakes than less experienced programmers. But they're still human, and even if you're Donald Knuth himself, you're still going to make mistakes. The fact is that mistakes in C are far more costly than mistakes in Java. You can have off-by-one errors in both languages. In Java, however, your program will raise an out of bounds exception and, at worst, halt. In C, such a mistake could easily lead to a buffer overflow security flaw that can be exploited for elevated privilege. The same error in C and in Java is far more costly in C than in Java.
This isn't true, you can pick any of a number of string ADTs in C that make these errors exactly the same cost as in Java ... and there are probably cases where you'd pick different implementations from the above. In fact I'd argue that the mental brain fart that caused Gosling to put threads into Java as a primitive operation is a far bigger problem, and a bigger source of bugs. Good experienced C programmers use a dynamic string API designed for the job and no threads; that combination happens even less often in the Java world than the C one.
-
Re:The problem with C...
I've been programming in C for years. Pointer problems aside, the main problem I've always seen with this little language is that there is no fundamental "string" data type. C++ solves this by creating the big fat bloated "string" object class. But in my opinion, there should have always been an extremely small and efficient fundamental "string" type, as in...
What is a "string" ... no really. In C++ you have std::string, std::strstream, std::streambuf and now std::stringstream ... which isn't even counting calling basic_string<> to create a slightly different one or using QString from Qt. I guess it's possible that if there was a simple dynamic string API in ISO C people might use it, but it's not like safe dynamic string APIs don't exist. So I wouldn't put a lot of money on people using them more than they do now.
-
Re:Static vs. dynamic strings
Dynamic strings are fine--until you run out of memory.
Whether static or dynamic, there is, eventually, a limit you'll run into, and if you don't code with that limit in mind then, eventually, you'll be screwed. In some cases, static allocation can be better because you know ahead of time what the limit is.
I'm guessing you didn't read my links, so let me spell it out...
It's non-trivial to run out of memory, and even if you do it's "only" a DoS attack.
As I said before, there are a lot of mistakes that you can make using a limited string API which can be much worse than a DoS attack (buffer overflows, privilege escalation, information leakage).
You almost always need to use more memory when using a limited string API than when using a dynamic string API, as you need to allocate the maximum amount of space (see this article I wrote in comp.lang.c).
As well as more memory, you often need more CPU because you have to do more copies of the data (esp. given that more than a few dynamic string APIs let you share data between strings).
Sometimes you don't know the maximum amount needed, and so if you are using a limited string API then you'll now have to use two sets of string APIs.
Of course there's also the real life test: assuming you leave out the "C library with extensions" model that apache, squid, openssh, sendmail, nfs, etc. have all tried to use (and all had buffer overflows with), the only daemon I can think of using a limited string API is samba ... and surprise, surprise, that's had buffer overflows too. So can you name one application that uses a limited string API and hasn't had a buffer overflow?
-
Re:Sendmail's future
I'm not sure that "insecure by design" is quite fair to the hard-working folks who developed this near-ubiquitous MTA.
So are you saying it is designed with security in mind?
A fairer assessment is that, when sendmail was designed, security was not as big an issue as it has become today.
So you're saying (agreeing) it was designed without security in mind.
It's been years since the internet operated where everyone allowed relaying to help everyone else out. And go look at the code, they still use NUL-terminated char *'s all over the place. Mostly with limited length APIs like strlcpy(), but even a few strcpy()s.
Now go look at postfix or qmail, both have fully dynamic string APIs and use them everywhere. And surprise, surprise, neither has had a buffer overflow.
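To illustrate what "limited length" means in practice: a bounded copy cannot overflow the destination, but it silently truncates unless the caller checks. A minimal sketch using the standard snprintf() (strlcpy() is not in ISO C); the helper name copy_limited is hypothetical and this is not code from sendmail:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy src into a fixed-size buffer.
 * Returns 1 if the whole string fit, 0 if it was truncated or on error.
 * dst is NUL-terminated whenever snprintf() succeeds and dstsz > 0. */
static int copy_limited(char *dst, size_t dstsz, const char *src) {
    assert(dstsz > 0);
    /* snprintf() returns the length the full output would have had. */
    int need = snprintf(dst, dstsz, "%s", src);
    return need >= 0 && (size_t)need < dstsz;
}
```

Every call site has to remember to test that return value and decide what a cut-off address or path should mean; a dynamic string API removes the fixed limit instead.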
-
Re:Interepreted languages help, but aren't a cure-
Buffer overflows are arguably the most common vulnerabilities that occur in the wild, which in turn indicates that most of the network services attacked are being written in C.
No, it's not arguable, here's some stats. However it doesn't require a new language to solve this problem.
We need more developers to be putting on their black-hats, and looking at their code and wondering "what if I tried this? Could I break this code?".
This is very true, if only universities offered something like "CS 104 - Thinking like an adversary", which is a bit different from how programmers normally think about their code. Ah well, someday.
-
Re:I blame colleges
Well, according to measurements we have done, the bounds checking overhead is less than 0.1%. But in the case described by you, the compiler can do even better.
You mean in Dylan? I find that hard to believe. What kind of workload was that?
Anyway, that requires rewriting your application in a language that very few people have ever written anything in. The stats I've seen for adding bounds checking into C put the performance hit at about 100x that (10-15%), IIRC. Hell, the performance hit of doing StackGuard/ProPolice-like checks was in the 1-2% region.
As for the compiler being magic nowadays, well maybe your compiler ... but the ones I've used, for C, still have a long way to go just doing constant propagation/CSE/etc. to get inlining to be as good as a programmer could.
In some analysis I've done on C string APIs you can get very close to "raw" C with good/secure dynamic string APIs ... and even better than the obvious approach in some cases. And I think it'll be somewhat easier to get C programmers to learn a C API and take a very small performance hit in some cases than it will to get them all to write everything in Dylan, or Java, or Python, etc.
-
Re:Scripting languages have problems, too ...
Sure. And there are languages that avoid buffer overruns (and double-frees, something which C can't really protect against without doing garbage collection, and format string bugs, and integer overflows), too, and people stick with C due to lack of awareness.
Getting people to use a string library is much easier than getting them to rewrite all their code in a new language, or even create new code in a language they haven't used much before.
And to reiterate, there has been a single double free security errata for Red Hat this year. There is also one free while in use security errata (but that was in a printf() function that apache has no business implementing IMO, so the fact it's broken in more than just std. conformance doesn't surprise me).
IIRC there was a single double free from last year too, in zlib, however I don't have stats for that yet.
-
Re:Yet languages make a big difference
In applications, buffer overflows, along with (recently) integer overflows, double-free bugs, and printf formatting bugs, are the most common source of exploitable holes by far. (Case in point: the MySQL buffer overflow currently on slashdot's main page, the several recent RPC vulnerabilities in XP, the recent OpenBSD hole, etc.) All of these errors would be impossible to make in a safe language.
Of all the Remote Security Vulnerabilities that Red Hat has released over the last year, over 50% are impossible to make with dynamic string APIs in C. The rest are almost all cross site scripting, various DoS attacks and temporary file vulnerabilities, which affect python/perl/etc. programs just as much.
Really, you don't need to buy a new car because the one you have has a tape deck and not a CD player. Language doesn't matter in this way.
-
Re:gets()
*sigh* that link should be... this one
-
Re:gets()
In contrast, dietlibc warns of many other functions, e.g. unportable functions like sendfile, security risks like system and {tmp,temp}nam,
Great, so if you autoconf for sendfile() then you still get a warning, how intelligent. system() is debatable, I'd say it was rarely used and rarely used badly ... however tmpnam/tempnam both produce link warnings with glibc.
functions introducing bloat into your programs like all stdio stuff
*laughs* ... yeh, because almost no C programs use stdio. Mind you, due to the terrible printf in dietlibc a link time warning for printf/sprintf/fprintf/etc. is probably a good thing.
-
Re:No Deadlines does not mean No Pressure
But a simple look at any bugtrack should back me up. Most errors are due to memory leakage, buffer overflows and other artifacts of C programming.
Bugtrack will almost certainly tell you that buffer overflows are a large source of security bugs but this doesn't back you up at all, as there are a lot of good C string APIs and using one that stops buffer overflows is trivial.
Literate programming means writing well in code. It means much more then comments and I urge you to some research into it.
I'm well aware of what literate programming is, and it is little more than writing everything twice. It makes the actual code very hard to read IMO, due to copious amounts of "comments".
While you could argue that TeX is "perfect", the debian network interface code is much less useful than the Red Hat version and changing it was much harder.
In that case then open source programs will continue to suffer from too many bugs.
In your opinion. In my opinion, having the people who implement the code design the code makes for a much better result.
-
Re:No Deadlines does not mean No Pressure
1) Stop using C. Use object oriented languages and languages that offer garbage collection. You will immediately reduce bugs by 80%
That number looks like you pulled it from somewhere the Sun don't shine. I guess it's possible that you remove 80% of the simple bugs in your code, but even then I'd find that number hard to swallow. And if you introduce simple things like a string API I'm not going to believe a number anywhere near 80%.
2) Make code more literate. Use pre and post conditions, demand that all contributors use lots of asserts. Make liberal use of interfaces
"literate" code is often the wrong approach, when I want to say things well in English I don't write the same thing in Japanese next to it ... I just spend time writing it well in English.
assert()s are nice, as are the assert()s called pre/post conditions. Interfaces in the Java sense I doubt buy you much from a lower bug count POV.
3) Designate a few people as architects of the project. These people should do nothing but write and design interfaces and maybe write class stubs with pre and post conditions and have the rest of the team complete the classes.
Even if you assume that people can easily take these roles (i.e. someone not doing any implementation knows what the interfaces should be ... err yeh right, not), it's just not going to happen in the OpenSource world IMO. Implementing someone's very good spec. isn't much fun ... implementing one less than that gets much less so.
4) Unit testing, unit testing, unit testing.
Duh, testing helps reduce bugs ... I'm shocked. A bunch of projects do this now, some have a large "make check" pass ... and some don't, sure. But for instance I know a bunch of the samba people have private win32 labs to do regression testing, I'd be surprised if that wasn't similar for the apache/etc. people.
-
Re:Great Quote
It still boggles my mind that C is used to do any high-level programming (ie, anything besides api's to system calls, and writing drivers and kernels).
Depends, I don't see any real server apps. written in perl/python/php (i.e. smtp, nntp, ntp, ftp, http, etc.) ... and although we're starting to see some small GUI apps. written in python, they are almost uniformly terrible and fail to run on anything but the developer's box.
Some of us just don't mind thinking about the materials as we build the bridge, because let's face it ... you aren't going to be happy if the bridge is made of glass.
And ten times as long to find all the strange bugs and buffer overflows that eventually show up as exploits.
This really isn't a problem with C, any half-decent C programmer can make a simple string API that is safe (and I think you'll find all the secure apps. do this). Or you can even just use one someone else already made. However a lot of people "learn to paint" with C, and too often those early rough sketches end up being used by a lot of people.
-
Re:Security?
Well, how else do you propose to test for buffer overflows?
Don't have statically sized buffers.
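A minimal sketch of reading input without a statically sized buffer, in plain ISO C. The function name read_line is hypothetical (it is not taken from nsd or BIND); the buffer grows as needed, so there is no fixed limit to overflow:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from fp.
 * Returns a heap string without the newline (caller frees),
 * or NULL at end of input or if memory runs out. */
static char *read_line(FILE *fp) {
    size_t cap = 64;
    size_t len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {           /* keep room for the terminator */
            if (cap > SIZE_MAX / 2) {
                free(buf);
                return NULL;
            }
            char *p = realloc(buf, cap * 2);
            if (p == NULL) {
                free(buf);
                return NULL;
            }
            buf = p;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {         /* nothing left to read */
        free(buf);
        return NULL;
    }
    assert(len < cap);
    buf[len] = '\0';
    return buf;
}
```

Running out of memory is still possible, but it shows up as a NULL return the caller can handle rather than as a write past the end of an array.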
There are more than a few string libraries for C, and amazingly there is a direct correlation between those servers that use one and those that are secure. Then again nsd doesn't seem to have that many static buffers, and UDP servers can get away with this a lot more than most ... and it only has one call to strncpy() (which looks dodgy -- but is in the config parser) ... well, it's not like it can be much worse than BIND.
-
Re:Marketing
There's no reason why 99% of open source projects actually need to market themselves - they don't need to make money,
While the dict definition of "Marketing" is all about making money, this isn't what most people are using the term to mean in the OSS world, where the problem is having no users and/or help. I'd say there are four things you want to market for...
- Get users, it's much easier to work on something that people are using than on something that no one else is using.
- Get developers, getting help for some problem you want/need solved is a big thing
- Getting feedback, it's very helpful to have external auditing of your code, or "I compiled your program on blahSystem and this is what happened", or even just getting comments from people who've done similar things.
- Getting recognition, after all that work it's nice to get the warm fuzzies :)
...you admit to 2 (and I know I've given you 3) and I'll bet a lot of money you want 1 and 4 as well. As for other people, just look at http://freshmeat.net or http://www.gnu.org/directory ... these are basically simple marketing tools, and a lot of people are using them (I know I am for http://www.and.org/vstr :).
-
Just another C string library
Some of the ideas aren't bad (and those have been done before), but mostly it's just another simple dynamic string library in C.
As for efficiency...
t_strconcat() is one function that I also copied from GLIB. It's a bit dangerous though, the terminating NULL is too easy to forget. I've been thinking about removing it entirely, but it's much more efficient than t_strdup_printf() so I haven't yet had the heart :)
...this pretty much speaks for itself. Why is strconcat() so efficient compared to just doing strcat() multiple times? Because you've got a model for representing the data that has ZERO metadata, and a model for storing the data that requires you to reallocate bits of memory all the time.
Assuming you can just discount all this overhead by using memory pools is a simplistic outlook (for instance, even if you waste gobs of memory so you don't have to call malloc that much, you'll still need to do copies all the time).
There are more than a few much better string libraries out there for C. The best for an IMAP server is probably Vstr, as that was designed to work well in an I/O context (for instance it doesn't need strconcat()-like calls in the API because doing repeat adds is just as fast).
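A minimal sketch of why a strconcat()-style call beats repeated strcat() on plain NUL-terminated strings: with no stored length, strcat() has to rescan the destination to find its end on every call, while a variadic concat can measure every piece once, allocate once, and copy to a running end pointer. The name concat_all is hypothetical; this is not the GLIB or t_strconcat() implementation. As the quoted text notes, the argument list must end with a NULL pointer, and forgetting it is undefined behaviour:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join all arguments into one new heap string.
 * The list must end with (char *)NULL. Returns NULL on failure. */
static char *concat_all(const char *first, ...) {
    va_list ap;
    const char *s;
    size_t total = 0;

    /* Pass 1: measure every piece so a single allocation is enough. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t n = strlen(s);
        if (n >= SIZE_MAX - total) {    /* total + n + 1 would overflow */
            va_end(ap);
            return NULL;
        }
        total += n;
    }
    va_end(ap);

    char *out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    /* Pass 2: copy each piece to a running end pointer, so the
     * output is never rescanned the way strcat() would rescan it. */
    char *end = out;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t n = strlen(s);
        memcpy(end, s, n);
        end += n;
    }
    va_end(ap);

    assert(end == out + total);
    *end = '\0';
    return out;
}
```

A string type that stores its own length and spare capacity gets the same effect for ordinary repeated appends, which is the point made above about not needing strconcat()-like calls at all.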