Tools for Debugging Stack Corruption?
blackcoot asks: "I know that there are tools which exist on hardened Linux distros and OpenBSD (and probably $your_os_of_choice too), which are designed to help track down stack corruption (which is often symptomatic of buffer overruns). Unfortunately, that's about all I know about those tools. What options are there? How effective are they? What does it take to get access to those tools? Are they really useful enough to make the effort justified?"
"My goal here is to increase my effectiveness at hunting down memory bugs, not necessarily to produce bullet proof, secure production quality code — the bugs I'm dealing with are, I believe, largely in software delivered by a subcontractor who swears they test their code and can't reproduce my bugs. What I really want is to a) demonstrate to them that a problem does, in fact, exist; b) demonstrate that the problem exists inside their code; and c) give them the tools they need to find, repair, and verify that the bug is no longer an issue.
First prize in my mind would be a Valgrind like tool which only requires trivial changes to the build process, but I'm pretty open to suggestions. If I have to run a hardened Linux to make this all possible, suggestions on pretty leading edge distros with reasonable automagic self configuration and hardware detection + laptop support would be greatly appreciated. Thanks much!"
See Electric Fence and your documentation for debugging malloc.
Also, you can use the Boehm garbage collector as a leak detector.
First of all, there's libsafe, which is just a simple compile and install. Since we don't know much about your specific problem, I'm not sure whether it will help.
The second thing to try is compiling your own custom kernel with the GrSecurity kernel patch. It includes the PaX kernel patch, which is very effective at protecting against overflows. You could install just the PaX patch by itself, but I believe the version in GrSecurity is kept more up to date. You can compile it with protection turned off by default, then use the PaX tools to turn protection on for just the binaries you wish to check.
Installing either (or both) of these could well help you, without the need to blow away your current install and start fresh.
Using a complete hardened Linux distro is not necessary for "normal" development work, but it may be a good idea to verify that your application actually works there too.
In addition to the run-time checkers, you can also look at static checkers like Splint. They can give you extra information about potential problems that only occur under certain conditions, conditions that may never be met in your test runs.
Is the effort of tracking down stack corruption worth it? YES! Absolutely! Catching bugs at an early stage is essential to keeping down the lifecycle cost of a system. Also consider the lost goodwill if your product is prone to strange behavior and crashes.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
I feel your pain about bugs in libraries that you must use without the source code. I had an arrangement like that for nearly two years with extremely buggy code. Just relinking the static library after changes to my own code, which changed where in memory the library would reside, would often cause huge problems. Let's just say I got really good at debugging in assembly with gdb. I got to where I could call them up and say something like, "you have some code at the end of function foo that looks like 'a[2] = b', but a was never allocated." They'd always reply with something like, "Yes it is ... oh wait..."
This space intentionally left blank.
When politicians are suspected of corruption it is usually traced back to a stack of money.
Excuse me! I don't have $ in front of my variables!
I don't like yourOsOfChoice style naming, so I'll give you that...
Reality is nothing but a collective hunch.
If you don't mind spending money, get Rational Purify from IBM. It seems to properly support Linux now, and it is the gold standard. You can download a two-week evaluation license to give it a try.
As far as your build process, you just take whatever your link command is, and tag "purify" on the beginning of it. The tool will instrument all compiled objects that you are linking in, and produce a binary that does all sorts of really useful checking (read/write of freed memory, buffer overruns, memory leaks, file descriptor leaks, ...).
If you are developing software for money, there is nothing better.
If you're doing open source work for no pay, obviously it's hard to justify it.
Don't use C!
"I don't know, therefore Aliens" Wafflebox1
If you're on Windows, the first line of defense is running a debug build of the app with /RTC1 turned on (/GZ is the older, deprecated spelling of the same checks). I ran into a stack buffer overrun just the other day and a message box came up immediately telling me exactly which variable got corrupted.
You may find DieHard useful, if heap errors are also a problem -- it's a plug-in replacement for malloc/free that provides probabilistic guarantees of correct execution in the face of errors.
http://www.cs.umass.edu/~emery/diehard
-- emery
gcc -fbounds-check could find some memory problems (and some non-problems). I used this feature over 6 years ago before it was integrated into gcc. YMMV.