Bounds Checking for Open Source Code?
roarl asks: "Is anyone working on an Open Source bounds checking system? (A system that checks a program at runtime for array out of bounds access, reading uninitialized memory, memory leaks and so on). I've been using BoundsChecker for some time and believe me, there are situations where you know you are going to spend hours debugging unless you let BoundsChecker sort it out for you. But it annoys me that I have to transfer (and sometimes port) the buggy program to Windows each time. I'd much rather stay in Linux.
Insure works on Linux. I haven't tried Insure for some time, but last time I tried I wasn't especially impressed. Purify seems still not to support Linux, but on other Unix platforms it works great. The problem with all of these products is that they are so da*n expensive. So it makes me wonder, are all Open Source programmers doing without them? If so, what can we expect of the quality of Open Source developed programs? If not, is there a free alternative?"
Need bounds checking for Linux? May I suggest the CMU Common Lisp interpreter and compiler (to machine code) or perhaps Smalltalk. :)
Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
The ever-resourceful Bruce Perens wrote a cool gizmo called "Electric Fence", which I have used on many occasions. It doesn't actually do bounds checking as such; what it does is provide a replacement "malloc" that places an inaccessible page either just above or just below every memory allocation. Your application will then segfault when it misbehaves, and you can then use conventional debugging tools to track down the
It's very "non-invasive" -- all you have to do to use it is link against it, and maybe set a few environment variables.
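To make that concrete, here is a minimal sketch of the kind of bug Electric Fence turns into an immediate segfault. The helper name make_squares is made up for illustration. The build line in the comment assumes the library is installed as libefence.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: allocates room for n ints, then writes count of them.
 * If count > n, the loop runs off the end of the heap block. With the
 * ordinary malloc this usually goes unnoticed; with Electric Fence linked in,
 * the first write past the block should hit the guard page and fault.
 *
 * Assumed build line:  gcc -g prog.c -lefence
 */
int *make_squares(size_t n, size_t count)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        a[i] = (int)(i * i);   /* out of bounds once i >= n */

    return a;
}
```

Calling make_squares(16, 17) is the buggy case: run it under gdb with Electric Fence linked and the fault should land on the offending store. Setting EF_PROTECT_BELOW=1 moves the guard page in front of the block to catch underruns instead. Very small overruns on odd-sized allocations can slip through because of alignment padding, which is what EF_ALIGNMENT is for.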
2*3*3*3*3*11*251
Off the top of my head, and with the help of my bookmarks:
I personally had high hopes for the GCC BP project. If you feel like doing something that will earn you the admiration of millions, finish that code up. :-)
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Isn't bounds checking just a specialized case for checking any type of access to uninitialized memory? There are several tools that provide replacements for malloc() that can track *all* memory allocation, and some, like Valgrind, provide almost a virtual machine that tracks basically everything your program does. Any time you read, write, or allocate memory, Valgrind will track it, and tell you if it is in error. Like I said, array bounds checking is just a special case of this.
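For what it's worth, here is a small sketch of two errors a tool like Valgrind is meant to catch. The function names are made up for illustration. Note that Valgrind's Memcheck tracks heap blocks most closely, so overruns of stack or global arrays can still slip past it.

```c
#include <stdlib.h>

/* Sums the first n elements of a. Correct on its own. */
long sum_first(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/*
 * Deliberately buggy, for demonstration only:
 *   1. a[3] is never initialised, so the returned sum depends on garbage;
 *   2. the block is never freed, so it leaks.
 */
long buggy_sum(void)
{
    int *a = malloc(4 * sizeof *a);
    if (a == NULL)
        return 0;

    a[0] = 1;
    a[1] = 2;
    a[2] = 3;                  /* a[3] left uninitialised */

    return sum_first(a, 4);    /* result tainted by a[3]; a is leaked */
}
```

If a program prints or branches on the result of buggy_sum() and is run as "valgrind --leak-check=full ./prog", Memcheck should complain about the use of an uninitialised value and list the 16-byte block (assuming 4-byte int) as lost at exit. It complains when the bad value is used, not at the moment it is read.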
I like to use the bounds checking patches to gcc to check code. You recompile your code and it checks every array access, memory access, etc. http://web.inter.nl.net/hcc/Haj.Ten.Brugge/
I've found that ccmalloc helped me to find a lot of problems in C code. The output is more verbose than Purify, but it showed me where some real problems lay with my code.
Check out this site by Ben Zorn on free and other tools for this.
"Provided by the management for your protection."
Insure++ is heavenly. I don't know how long it's been since you've used it, but it detects almost all errors. I think most open source people who use it have their company buy it for them, though; it is very expensive. It does very good bounds checking for both reading and writing, but where it really helps is in tracking down bad or dangling pointers.
It also does very detailed tracking of memory leaks, but can get a little confused when you store the last referencing pointer in a hashtable.
I think, other than its somewhat clunky UI, price is the big killer. It takes a pretty fast machine to be able to use it much, and it has a large up-front cost plus a maintenance (upgrades and support) fee. It's really too bad they don't have a program in place with someone like SourceForge to let people use Insure++ on the test machines, because that would not only be great advertising for them but could also really help open source developers.
Warning: I'm a language zealot, so be warned that I'm utterly irrational and unamenable to the Sweet Voice of Reason. That said... :)
Use a different language. There are some things which C is appropriate for, but one of the things it's categorically not called for is when you have concerns about buffer-overflow conditions [*]. If this is a purely open-source, noncommercial project, do yourself and your career a favor: learn another language (one which doesn't have these sorts of problems) and write your app in that instead. You'll learn more, and you won't have to spend a dime on Purify or whatnot. If you go this route, I'd suggest Scheme; it's a beautiful LISP derivative.
If this is a commercial project, ask Management how married they are to C. In the overwhelming majority of cases, you can quietly substitute C++ without affecting the APIs one bit. Just wrap the external APIs in extern "C" and, inside the code, use C++'s beautiful vector instead of C-style arrays. Sure, you'll take a minor performance hit, but the increase in reliability will be well worth it.
Anyway, to try and give a (weak) answer to your question--instead of slapping a Band-Aid on the festering wound that is C memory management, you might want to think about doing away with the festering wound altogether. Use the right tool for the job--if C really is the right tool for the job, then fine, may God have mercy on your code. But if there are other, better, tools available... use them instead.
[*] OpenBSD manages to do pretty well with a C kernel, but that's because they're certifiably insane. It also impacts their dev cycle; they spend a great deal of time avoiding the pitfalls of C, so much so that it affects how much time they can devote to new development.
An excellent general solution I've found for problems of this nature can be found at "file:///usr/include/assert.h". Seriously, preconditions, postconditions, and invariants are the best approach to avoiding such errors. Will a bounds-checker detect if you access an element that is out-of-bounds in a view (subarray) of a larger array? Also, if you are developing a library, using assertions will greatly assist any end-users who are not using a bounds-checking tool.
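A rough sketch of the subarray case, to show what I mean. The type int_view and both functions are invented for this example, not taken from any library.

```c
#include <assert.h>
#include <stddef.h>

/* A view onto len consecutive ints inside some larger array. */
struct int_view {
    const int *data;
    size_t     len;
};

/* Precondition: [start, start + len) must lie inside the base array. */
struct int_view make_view(const int *base, size_t base_len,
                          size_t start, size_t len)
{
    assert(base != NULL);
    assert(start <= base_len);
    assert(len <= base_len - start);   /* written this way to avoid overflow */

    struct int_view v = { base + start, len };
    return v;
}

/* Precondition: i must be inside the view, not merely inside the base array. */
int view_get(struct int_view v, size_t i)
{
    assert(i < v.len);
    return v.data[i];
}
```

With a six-element array and a view made by make_view(a, 6, 2, 3), view_get(v, 3) still points at valid memory in the underlying array, so an allocation-level checker has nothing to complain about, but the assertion fails right at the faulty call. Keep in mind that compiling with -DNDEBUG turns assert() into a no-op, so the checks vanish from release builds.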
IMHO, you should do a mix of C and C++ and use the Standard Template Library's vector, deque, or list classes instead of an array. Hell, even if you use an array, the STL functions and algorithms still work on it. You can even use the queue and stack wrappers if that's what you're doing... That's just my opinion though....
Of course with PERL you could have the best of both worlds:
Develop in PERL with the flexibility of the interpreter and all the garbage collection and neato stuff built in.
When you hit a "stable" release version, use the O module to compile the code, either to Perl byte code for faster loading or to one of two versions of C code. One just spits out calls to the perl/system libraries; the other is standard C code.
Article X: The powers not delegated... by the Constitution...are reserved...to the people