Bounds Checking for Open Source Code?

A simple answer to a simple question... by RevAaron · 2002-06-12 09:15 · Score: 3, Funny

Need bounds checking for Linux? May I suggest the CMU Common Lisp interpreter and compiler (to machine code) or perhaps Smalltalk. :)

--

Working toward a usable PDA environment in the spirit of Newton OS: Dynapad

Re:A simple answer to a simple question... by RevAaron · 2002-06-12 12:50 · Score: 2

Flamebait my rear end. He'd get exactly what he was asking for, and a helluva lot more. (in a good way) :)

Re:A simple answer to a simple question... by bsartist · 2002-06-12 13:57 · Score: 2, Interesting

I don't really see how the parent post is flamebait. It's a fair answer to the question that was asked. The write-up asks whether OSS developers are simply going without automated bounds checking. The answer is yes, many people are doing exactly that, by relying on interpreted languages that don't require such in-depth memory management.

Of course, doing so doesn't do away with the problem entirely, it simply moves the problem up a level - how does one handle bounds checking when debugging a language interpreter?

--
Lost: Sig, white with black letters. No collar. Reward if found!
Re:A simple answer to a simple question... by __past__ · 2002-06-12 22:20 · Score: 2

...by relying on interpreted languages that don't require such in-depth memory management.
CMUCL includes a native-code compiler.

--
Programming can be fun again. Film at 11.
Re:A simple answer to a simple question... by RevAaron · 2002-06-13 02:25 · Score: 2

As someone pointed out, CMUCL includes a real-live compiler. Other Common Lisp implementations do as well. While my other example, Smalltalk, isn't compiled to native machine code, it is compiled to bytecode where it is then JITed. More or less compiled, at least, it's a lot closer to compilation than interpretation. :)

I can't speak for any other implementation of Smalltalk or Common Lisp, but Squeak Smalltalk does bounds checking at runtime. It does it within the Smalltalk object-space via regular Smalltalk methods and not in the VM.

Re:A simple answer to a simple question... by bsartist · 2002-06-13 04:05 · Score: 1

You can skip over the word "interpreted" in my post if you'd like... it's incidental to the point I was trying to make anyway. My point was more about the fact that many languages don't require such rigorous hands-on memory management.

Re:A simple answer to a simple question... by broody · 2002-06-14 06:57 · Score: 1

I don't really see how the parent post is flamebait.

While I don't agree with the logic, I understand it.

BoundsChecker works with C++ & Delphi, Insure++ with C/C++, Purify is yet again C/C++. Translated in a roundabout kind of way, the original poster is asking "How can I get bounds checking for C++ with free software?". The response marked flamebait could be translated as 'use LISP', which could easily be considered flamebait for a C++ developer with a bloodlust for language holy wars.

--
~~ What's stopping you?
Re:A simple answer to a simple question... by nirvanafreek · 2002-06-14 09:12 · Score: 1

We can't leave Java out of this discussion, did someone say "Array out of bounds exception"? The VM does the memory monitoring automatically and most will tell you exactly where it happened. "I'll check my own bounds thank you very much!"
Re:A simple answer to a simple question... by RevAaron · 2002-06-17 08:25 · Score: 1, Flamebait

That's all fine and dandy, but Java sucks- everyone knows that. If you want C/C++, use them; if you want something better, you step up to Smalltalk or Lisp. :P I mean, no one uses Java unless they're forced to. Or brainwashed.

<duck>


Electric Fence by Urban+Garlic · 2002-06-12 09:16 · Score: 5, Informative

The ever-resourceful Bruce Perens wrote a cool gizmo called "electric fence", which I have used on many occasions. It doesn't actually do bounds checking as such, what it does is provide a replacement "malloc" that allocates unwritable pages either above or below every memory allocation. Your application will then segfault when it misbehaves, and you can then use conventional debugging tools to track down the

It's very "non-invasive" -- all you have to do to use it is link against it, and maybe set a few environment variables.
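
To make the guard-page idea concrete, here is a minimal sketch of how a malloc replacement can park an inaccessible page directly after an allocation. It is not Electric Fence's source: guarded_alloc is a hypothetical name, the code assumes a POSIX system with mmap and mprotect, and it never gives the memory back.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1      /* asks glibc to expose MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not Electric Fence's code): returns `size` writable
 * bytes placed so that the first byte past the buffer lies in a page with
 * no access rights. Returns NULL on failure. The result may be unaligned
 * for types wider than char, and it is never unmapped in this sketch. */
void *guarded_alloc(size_t size)
{
    long page_l = sysconf(_SC_PAGESIZE);
    assert(page_l > 0);
    size_t page = (size_t)page_l;

    if (size == 0)
        return NULL;

    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page;   /* data plus one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Revoke all access to the last page; touching it raises SIGSEGV. */
    if (mprotect(base + data_pages * page, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }

    /* Slide the buffer up against the guard page. */
    return base + (data_pages * page - size);
}
```

With a buffer from guarded_alloc(size), a write to p[size] should fault on the spot instead of silently corrupting a neighbour, which is the behaviour described above. The real tool needs no code changes at all, since you only link against it.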

--
2*3*3*3*3*11*251

Re:Electric Fence by Hard_Code · 2002-06-12 13:47 · Score: 2

..."and you can then use conventional debugging tools to track down the"

Looks like you should have used it on that sentence ;)

--

It's 10 PM. Do you know if you're un-American?
Re:Electric Fence by morcheeba · 2002-06-12 16:41 · Score: 2

It's nice software (source) and a beautiful hack (a compliment), but it has a fundamental flaw that limits its use in some applications (like mine at the time)...

Something in my program was modifying free'd memory. To detect this, efence doesn't really free the memory (EF_PROTECT_FREE), which makes it consume a huge amount of space. Especially if your program does a lot of memory allocating and freeing, like mine did, the system runs out of memory and swaps until the cows come home.

I finally found my problem by changing my frees to clear the memory before actually freeing it - then my usual sanity checks would find the bad pointer.
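
A rough sketch of that "clear before free" trick, under some assumptions of mine: the names poison, poisoned_free and looks_poisoned are made up, the 0xDD fill pattern is arbitrary (the post only says the memory was cleared), and the caller has to know the block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POISON_BYTE 0xDDu   /* arbitrary pattern chosen for this sketch */

/* Overwrite a block through a volatile pointer so that the compiler
 * should not drop the stores as dead code right before free(). */
void poison(void *p, size_t size)
{
    volatile unsigned char *bytes = p;
    for (size_t i = 0; i < size; i++)
        bytes[i] = POISON_BYTE;
}

/* Stand-in for free() that scrubs the block first; needs the block size. */
void poisoned_free(void *p, size_t size)
{
    if (p != NULL)
        poison(p, size);
    free(p);
}

/* Sanity check: returns 1 if every byte of the object carries the
 * poison pattern, 0 otherwise. */
int looks_poisoned(const void *p, size_t size)
{
    const unsigned char *bytes = p;
    assert(p != NULL);
    for (size_t i = 0; i < size; i++)
        if (bytes[i] != POISON_BYTE)
            return 0;
    return 1;
}
```

Data reached through a stale pointer then tends to show the pattern and trip ordinary sanity checks. That is only a tendency: reading freed memory is still undefined behaviour, and the allocator may reuse or overwrite part of the block.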

from the manpage:

WHAT'S BETTER -- PURIFY, from Purify Systems, does a much better job than Electric Fence, and does much more. It's available at this writing on SPARC and HP. I'm not affiliated with Purify, I just think it's a wonderful product and you should check it out.

--
HIV Crosses Species Barrier... into Muppets
Re:Electric Fence by joto · 2002-06-13 02:11 · Score: 2

Well, compiling a 50MB source tree with purify doesn't exactly make the resulting program a speed demon either. But I agree it works better than ElectricFence.
On the other hand, ElectricFence is very unintrusive. For small to medium-sized programs and/or libraries, you can use it in every compile, saving you the trouble of fixing those bugs later.
I would recommend them both. Always use ElectricFence, then use Purify to find those additional nasty bugs...

Lots of almost-complete solutions by devphil · 2002-06-12 09:23 · Score: 5, Informative

Off the top of my head, and with the help of my bookmarks:

Bell Labs had a "libsafe" that provided versions of malloc et al: http://www.lucent.com/press/0400/000420.bla.html Unfortunately the link given in that press release no longer works.
A quick scan through sourceforge and other open source project sites yields about 1.87E3 projects to replace Checker and StackGuard with kew1 Linux-only alternatives. (Why? Who knows.) Most of these projects seem not to have gotten any further than the project web page saying how 'leet they were going to be.
One of the side branches in the GCC repository was the bounded pointers project, which was way cool. It was mostly working, too, until the author had to go work on something else.
I personally had high hopes for the GCC BP project. If you feel like doing something that will earn you the admiration of millions, finish that code up. :-)

--
You cannot apply a technological solution to a sociological problem. (Edwards' Law)

Re:Lots of almost-complete solutions by norwoodites · 2002-06-12 10:28 · Score: 1

GCJ, the GNU Java compiler, has switches to turn off bounds checking when compiling to native code.

A general case by big_hairy_mama · 2002-06-12 09:24 · Score: 5, Interesting

Isn't bounds checking just a specialized case for checking any type of access to uninitialized memory? There are several tools that provide replacements for malloc() that can track *all* memory allocation, and some, like Valgrind, provide almost a virtual machine that tracks basically everything your program does. Any time you read, write, or allocate memory, Valgrind will track it, and tell you if it is in error. Like I said, array bounds checking is just a special case of this.

Re:A general case by morcheeba · 2002-06-12 17:07 · Score: 3, Informative

Isn't bounds checking just a specialized case for checking any type of access to uninitialized memory?

That's a good chunk of it, but there are several cases of access to initialized (and alloc'ed) memory that should also be detected:

- pointers to stale memory. Example: malloc, initialize, free, malloc again and get some reused memory that was left initialized before the free, and then attempted use of that data. (calloc zeros the memory, malloc doesn't)

- pointers exceeding array bounds. Example: int x[100][100], reading array element x[10][150] doesn't cause a segfault (but x[150][10] does)

- unexpected pointer alias (devious and borders between just a regular bug and a bounds-checkable bug). The function doesn't expect the two pointers passed to it to point to the same area of memory (for example, memcpy pointers can't overlap, memmove pointers can). Incidentally, this assumption is usually a toggleable (& dangerous if you're not careful) optimization and can cause the compiler to generate 'bad' code - more limited languages (eg. Fortran 77, but not Fortran 90) that don't have pointers can be more aggressively optimized! When I've got a function that could choke on this kind of thing, I usually code in a bunch of asserts to check for this case and raise a flag.
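
The alias asserts from the last item could look something like the sketch below. ranges_overlap and copy_no_alias are hypothetical names, and comparing addresses through uintptr_t assumes a flat address space, which holds on common platforms but is not promised by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if [a, a+na) and [b, b+nb) share at least one byte. */
int ranges_overlap(const void *a, size_t na, const void *b, size_t nb)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return na != 0 && nb != 0 && pa < pb + nb && pb < pa + na;
}

/* Hypothetical wrapper: memcpy with its no-overlap rule asserted. */
void *copy_no_alias(void *dst, const void *src, size_t n)
{
    assert(!ranges_overlap(dst, n, src, n));
    return memcpy(dst, src, n);
}
```

A caller that accidentally passes overlapping buffers then aborts at the assert in a debug build instead of getting quietly mangled data.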

Re:A general case by muleboy · 2002-06-14 10:03 · Score: 0

Isn't bounds checking just a specialized case for checking any type of access to uninitialized memory?

No. Try this code and see if it raises an error with Valgrind (spoiler: I just did and it doesn't):
int main(void) { int a[10]; int b[10]; int loop; for (loop = 0; loop
Re:A general case by muleboy · 2002-06-14 10:07 · Score: 2, Informative

Isn't bounds checking just a specialized case for checking any type of access to uninitialized memory?

No. Try this code and see if it raises an error with Valgrind (spoiler: I just did and it doesn't):
int main(void)
{
    int a[10];
    int b[10];
    int loop;

    for (loop = 0; loop < 10; loop++) {
        a[loop] = 100;
        b[loop] = 100;
    }
    a[10] = 1;  /* writes one element past the end of a */
}
P.S. The comment before this looks like crap because slashdot doesn't use the <pre> HTML tag. Ooops.
Re:A general case by big_hairy_mama · 2002-06-14 10:48 · Score: 1

I guess it's unfortunate that I got all those mod points, me being wrong and all :)

Bounds checking gcc compiler by geog33k · 2002-06-12 09:26 · Score: 5, Informative

I like to use the bounds checking patches to gcc to check code. You recompile your code and it checks every array access, memory access, etc. http://web.inter.nl.net/hcc/Haj.Ten.Brugge/

Re:Bounds checking gcc compiler by Anonymous Coward · 2002-06-19 00:33 · Score: 0

I'll agree with the benefits of the bounds-checking patches and gcc. The downside is that you have to apply them yourself against a slightly older version of gcc. The upside... well, let's just say that I once recompiled 4 million lines of code across 600 programs. Fish in a Barrel! I spent the next few weeks feeding bug reports to the development staff. In a matter of hours I found a bug that was 7 years old.

Valgrind by Anonymous Coward · 2002-06-12 09:27 · Score: 2, Informative

Valgrind

memprof? by grek · 2002-06-12 09:58 · Score: 2, Interesting

How about memprof?

Have you looked into Immunix and StackGuard? by mikehoskins · 2002-06-12 10:00 · Score: 2, Interesting

See http://www.immunix.org/

While it may not be EXACTLY what you want, it may be MORE....

use a better language... by larry+bagina · 2002-06-12 10:07 · Score: 1, Flamebait

like pascal. The ISO Pascal standard requires bounds checking (most implementations allow you to turn it off).

With the 80188, Intel actually introduced the bound instruction, which compares a register against a pair of upper/lower bounds and raises interrupt 5 if the register is too high or low. Motorola's 680x0 CHK instruction does the same.

It would be useful if gcc produced debugging code to do static array bounds checking.

--
Do you even lift?

These aren't the 'roids you're looking for.

Re:use a better language... by Anonymous Coward · 2002-06-13 04:42 · Score: 0

That's avoiding the problem. You can try and ignore it all you want. Besides Pascal doesn't have templates.
Re:use a better language... by Anonymous Coward · 2002-06-14 20:15 · Score: 0

Besides Pascal doesn't have templates.
Ada does. They are called "generics". More powerful and type-safe than templates, IMO.

Good Question - Some Answers by 4of12 · 2002-06-12 10:10 · Score: 4, Informative

I've found that ccmalloc helped me to find a lot of problems in C code. The output is more verbose than Purify, but it showed me where some real problems lay with my code.

Check out this site by Ben Zorn on free and other tools for this.

--
"Provided by the management for your protection."

Insure++ by vipw · 2002-06-12 10:21 · Score: 3, Interesting

Insure++ is heavenly. I don't know how long it's been since you've used it, but it detects almost all errors. I think most open source people who use it have their company buy it for them though; it is very expensive. It does very good bounds checking for both reading and writing, but its really amazing help is in tracking down bad or dangling pointers.

It also does very detailed tracking of memory leaks, but can get a little confused when you store the last referencing pointer in a hashtable.

I think other than its somewhat clunky UI, price is the big killer. It takes a pretty fast machine to be able to use it much and it has a large up front cost, plus a maintenance (upgrades and support) fee. It's really too bad they don't have a program in place with someone like sourceforge to let people use Insure++ on the test machines because that would not only be great advertising for them, but also could really help the open source developers too.

Change languages. by rjh · 2002-06-12 10:48 · Score: 3, Insightful

Warning: I'm a language zealot, so be warned that I'm utterly irrational and unamenable to the Sweet Voice of Reason. That said... :)

Use a different language. There are some things which C is appropriate for, but it's categorically not called for when you have concerns about buffer-overflow conditions [*]. If this is a purely open-source, noncommercial project, do yourself and your career a favor: learn another language (one which doesn't have these sorts of problems) and write your app in that instead. You'll learn more, and you won't have to spend a dime on Purify or whatnot. If you go this route, I'd suggest Scheme; it's a beautiful LISP derivative.

If this is a commercial project, ask Management how married they are to C. In the overwhelming majority of cases, you can quietly substitute C++ without affecting the APIs one bit. Just wrap the external APIs in extern "C" and, inside the code, use C++'s beautiful vector instead of C-style arrays. Sure, you'll take a minor performance hit, but the increase in reliability will be well worth it.

Anyway, to try and give a (weak) answer to your question--instead of slapping a Band-Aid on the festering wound that is C memory management, you might want to think about doing away with the festering wound altogether. Use the right tool for the job--if C really is the right tool for the job, then fine, may God have mercy on your code. But if there are other, better, tools available... use them instead.

[*] OpenBSD manages to do pretty well with a C kernel, but that's because they're certifiably insane. It also impacts their dev cycle; they spend a great deal of time avoiding the pitfalls of C, so much so that it affects how much time they can devote to new development.

Re:Change languages. by jtdubs · 2002-06-12 11:27 · Score: 2, Offtopic

I'm also a language zealot. :-)

I agree with everything you said. C has its place. It's small and dark and should be avoided by most.

Scheme is a sexy, sexy language. However, why not just use straight ANSI Common LISP. That's my preference for one main reason, CLOS. Scheme has only several mediocre implementations of "CLOS-like" systems. Nothing really on par with CLOS that I can find.

For the uninitiated (you poor, poor people :-)) CLOS is the Common LISP Object System (iirc). It's a fabulous polymorphic, multiple-dispatch OO system written in LISP. It has features that will make C++ and Java programmers' heads swim. Namely, :before, :after and :around methods. Plus the whole multiple-dispatch thing.

Anyway, just my suggestion. Unless, of course, someone can suggest a good CLOS system for Scheme.

Regardless, have a great day guys,

Justin Dubs
Re:Change languages. by rjh · 2002-06-12 11:59 · Score: 2

Okay, you said Scheme was sexy, so I guess I won't flame you. :) (Hey, I warned you I'm a language zealot.) That said...

C has its place. It's small and dark and should be avoided by most.

I don't know if I'd go this far. On one of my pages (here, or http://soli.inav.net/~rjhansen/c_relevance.html for the goatse.cx averse) I've got an essay--which I originally wrote intending it to be a response to a Web editorial I saw blasting every language that wasn't Java--which may be germane to the discussion here.

Short version: C has its place. Yeah, the place is small and dark and should be avoided whenever possible. But sometimes we don't get a choice of whether or not to avoid it, and when we're trapped in that small, dark place, C is your salvation. So I'm not going to knock C--but I will say that I generally avoid it whenever possible. :)

However, why not just use straight ANSI Common LISP

Didn't recommend it because I don't know Common LISP. :) I had a hard enough time wrapping my head around the C++ Standard; given the ANSI Common LISP standard is about the same size, I'm very reluctant to give it the focus that I'd need to in order to be a credible Common LISP programmer.

Scheme, on the other hand, has a very lightweight standard. It's easy to read, easy to understand. Sure, it probably misses out on some cool things that are in Common LISP, but I've yet to find an instance where Scheme has let me down. :)

Really--the only reason why I didn't recommend Common LISP was because I have a moral aversion to recommending languages I don't understand. If you like it, though, by all means, get down with your bad functional-programming self. :)
Re:Change languages. by blueroo · 2002-06-12 12:48 · Score: 0

The only festering wound here is inane programmers who don't know how to code. If you experience memory management difficulties with your C code, then you can count yourself as one of the above. In C, the programmer is the memory management. If it's festering... well, you get the picture. Don't blame your deficiencies on your tools.

Oh, and poster. What can we expect of the quality of open source programs who don't rely on an expensive crutch to hold their insufficient code together? Well, we can expect them to segfault, fail silently, or work well. I don't get too many segfaults or silent failures from oss code, so you put the puzzle pieces together.
Re:Change languages. by 0x0d0a · 2002-06-12 13:34 · Score: 2

I agree that C++ is a good replacement for C in many cases. However:

I can still generally compile and run five year old (since last revision) C programs without too much trouble. Frankly, five year old C++ programs have a habit of failing compilation on the first file -- too much change in the compilers.

I don't think that simply moving to a functional language is an option for most people. I and others dislike using functional languages for larger programs.

As for lisp: first-class functions feel *right*, yes. They also end up causing code that is absolute hell to debug. Trying to find where the code for the function is that's being called in the current function can get to be really aggravating when you're working over someone else's code. I was puzzling my way through OpenLDAP code today, and function pointers alone make it frustrating to see what's going on in a program. When a language has good first-class functions (meaning the programmers use them all over), and particularly if we throw in continuations, it's rough on the poor maintainers. This is one thing that C++ did right -- templated code is a Good Thing for maintainers, much easier to read than code that uses function pointers or first-class functions.

OpenBSD manages to do pretty well with a C kernel

*snicker* Okay, find me a high performance Common LISP kernel.

--
May we never see th
Re:Change languages. by jtdubs · 2002-06-12 14:11 · Score: 2

Yeah, I completely agree with your sentiments on C. As well as pretty much everything else you said. :-)

About ANSI Common LISP: In terms of syntax and functionality, there really isn't that much to it. Its beauty is its simplicity. However, in terms of the available methods and packages, it is a bit of a beast. I'm still wading through it myself.

Honestly, to a certain extent I prefer Scheme. It's a bit more consistent and definitely simpler. The only thing really holding me back is, like I said in my last post, the lack of an Object system of the same quality as CLOS.

If I can find an OO system for scheme that has method-combinations and supports functionality like :initarg and :accessor for slots I'd be thrilled.

Do you have any recommendations?

Anyway. Thanks for the reply. Have a good night,

Justin Dubs
Re:Change languages. by rjh · 2002-06-12 15:09 · Score: 2

I can still generally compile and run five year old (since last revision) C programs without too much trouble. Frankly, five year old C++ programs have a habit of failing compilation on the first file -- too much change in the compilers.

Hardly surprising given that five years ago there wasn't even an ANSI/ISO C++ standard. If you're using code that doesn't conform to the standard, it's not the language's fault if it fails to compile.

*snicker* Okay, find me a high performance Common LISP kernel.

I'm not a Common LISP hacker, sorry. But for non-C kernels, try BeOS (C++), try Oberon (Modula), try Plan 9... they vary from acceptable to excellent, and don't use C.

But if you really want a high-performance LISP kernel, I'd suggest looking at an (old) LISP Machine. Those babies were pretty sweet for the day. :)
Re:Change languages. by Anonymous Coward · 2002-06-12 18:54 · Score: 0

http://rhn.redhat.com/errata/rh72-errata.html
Re:Change languages. by flynn_nrg · 2002-06-12 23:12 · Score: 1

"But for non-C kernels, try BeOS (C++)"

Sorry, but there's not a single line of C++ in the BeOS kernel. Their motto was always: "No C++ in kernel code".
Re:Change languages. by joto · 2002-06-13 02:22 · Score: 2

This is one thing that C++ did right -- templated code is a Good Thing for maintainers, much easier to read than code that uses function pointers or first-class functions.
Funny, I would tend to say exactly the opposite. Aside from syntax issues and verbose compiler errors, template'd code is hard to step through in a debugger, it doesn't work too well with separate compilation, and most C++ compilers are quite buggy, biting your ass if you try something too fancy.
That doesn't mean I don't find C++ templates useful or interesting. Eventually these issues will get sorted out, maybe with a new language, or maybe just with new and better tools. But so far, they have rightfully proven themselves as a nightmare for the maintenance programmer (at least in my book).
Re:Change languages. by PissingInTheWind · 2002-06-13 07:29 · Score: 1

"But for non-C kernels, try BeOS (C++)"

Sorry, but there's not a single line of C++ in the BeOS kernel. Their motto was always: "No C++ in kernel code".
Well, that's too bad. Did you notice they went bankrupt too?
Maybe they should have. Or maybe not, since C++ isn't such a radical change from C, you still get the sticky stuff while only adding hairy stuff...

--

A message from the system administrator: 'I've upped my priority. Now up yours.'
Re:Change languages. by Anonymous Coward · 2002-06-13 12:37 · Score: 0

"But for non-C kernels, try BeOS (C++)"
Sorry, but there's not a single line of C++ in the BeOS kernel. Their motto was always: "No C++ in kernel code".
Which is probably a good thing, considering the lack of ABI (hell, the lack of name mangling standards) for C++ could have made it a one-compiler platform - not the best way to attract tool chain vendors.
Re:Change languages. by hding · 2002-06-14 03:42 · Score: 2

While the ANSI Common Lisp standard is comparably sized to the C++ standard, there is one important difference - it's actually really easy to read. :-) I think most Lisp programmers begin referring to and using the standard on a regular basis very early in their development. I'm less sure about C++ programmers.
Re:Change languages. by spitzak · 2002-06-16 10:31 · Score: 2

vector does not do bounds checking.
Re:Change languages. by rjh · 2002-06-16 11:06 · Score: 3, Informative

Oh? Look at the at() method.

vector does do bounds checking, but since it results in a (minor) performance penalty, operator[] (the normal method of vector access) is unchecked. You want bounds checking, use at().

95% of the time, it is simply bad software engineering practice to use operator[] within a vector. The only time it's really acceptable practice is when (a) you're operating under severe performance limitations and (b) you have some other guarantee you won't hit an out-of-bounds condition.
Re:Change languages. by spitzak · 2002-06-16 11:27 · Score: 2

I'm surprised that anybody is reading this old of an article. But anyway I see no difference than somebody saying "C is safe, just use this special at(pointer,index) function instead of [] and you will be fine". The normal syntax everybody uses is not bounds checked and there is no way around that.
Re:Change languages. by rjh · 2002-06-16 11:43 · Score: 2

The normal syntax everybody uses is not bounds checked and there is no way around that.

The syntax you use isn't. The syntax which I commonly use--and which is commonly used by professional C++ coders--is. Bounds-checking is one of the biggest wins of the vector; discarding it, just because you can't be bothered to learn how to use the STL properly, seems exceptionally rash to me.

But anyway I see no difference than somebody saying "C is safe, just use this special at(pointer,index) function instead of [] and you will be fine"

There is no difference, save for this: a bounds-checked vector is part of the C++ standard library (via the STL) and is available on every C++ platform that's worth coding for. Even MSVC++'s shoddy STL implementation supports it.

Bounds-checked array access is not part of the C89/C90 spec (dunno about C99), and thus, if you want it, you have to do what the original poster does--bleed for it, via many different vendors.

The original advice I gave is still the advice I'm giving now. Use a different language. If bounds-checking is what you need, then use a language with support for bounds-checking built into the language.

C++ has this. C doesn't.
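
For comparison, here is roughly what a hand-rolled "special at() function" would have to look like in C. This is only a sketch: struct int_array and int_array_at are hypothetical names, nothing like them ships with the C library, and returning NULL stands in for the exception that vector's at() would throw.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical array type that carries its own length. */
struct int_array {
    int   *data;
    size_t len;
};

/* Checked access in the spirit of vector's at(): returns a pointer to
 * element i, or NULL when i is out of range (C has no exceptions). */
int *int_array_at(struct int_array *a, size_t i)
{
    assert(a != NULL && a->data != NULL);
    return i < a->len ? &a->data[i] : NULL;
}
```

The catch raised in the thread still applies: nothing stops a caller from writing a.data[i] directly, so the check only helps where people actually use it.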
Re:Change languages. by Jonner · 2002-06-17 09:26 · Score: 1

I don't really have any experience with Scheme yet. I have been going through a tutorial and have been very impressed with the language's simplicity and mathematical nature compared to the languages I know: '(Pascal C C++ Java Python). I have seen references to CLOS-like object systems for Scheme, such as tiny CLOS and GOOPS, though I haven't tried them yet.

g77 (GNU Fortran) has it built-in... by PaulBu · 2002-06-12 10:50 · Score: 1

I guess the flag is -C and it does what you would expect: program checks bounds on any array access. (Used it a couple of months ago to track a really nasty bug in some ancient code).

I doubt this would be easily portable to the C/C++ side of GCC, because in C you have myriad ways to access the same memory location (via different pointers).

Of course, the already mentioned Electric Fence is a really nice tool to debug malloc() problems (but not other types of memory overruns, like overrunning a static array).

Linker can put a 0xDEADBEEF after all arrays and verify that it is the same on the program exit, might help some...
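
Done by hand rather than by the linker, the 0xDEADBEEF idea might look like the sketch below. struct guarded_buf, guarded_init and guarded_intact are hypothetical names, and the array is wrapped in a struct only so that the marker is known to sit after it in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CANARY 0xDEADBEEFu

/* Hypothetical layout: the array and its trailing marker live in one
 * struct, so the marker follows the array in memory. */
struct guarded_buf {
    char     data[16];
    uint32_t canary;
};

/* Plant the marker before the buffer is used. */
void guarded_init(struct guarded_buf *g)
{
    assert(g != NULL);
    g->canary = CANARY;
}

/* Returns 1 while the marker is untouched, 0 once it has been clobbered. */
int guarded_intact(const struct guarded_buf *g)
{
    assert(g != NULL);
    return g->canary == CANARY;
}
```

Calling guarded_intact at program exit (or more often) reports an overrun after the fact. It only catches writes that actually run over the marker, and it does not say where they came from.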

Paul B.

The solution to most of your debugging needs! by ken_mcneil · 2002-06-12 10:56 · Score: 5, Informative

An excellent general solution I've found for problems of this nature can be found at "file:///usr/include/assert.h". Seriously, preconditions, postconditions, and invariants are the best approach to avoiding such errors. Will a bounds-checker detect if you access an element that is out-of-bounds in a view (subarray) of a larger array? Also, if you are developing a library, using assertions will also greatly assist any end-users who are not using a bounds-checking tool.
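
As an illustration of the view case, here is a small sketch with assert() guarding both the view's invariant and the index. struct int_view and view_get are made-up names for this example, not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view: `len` elements of `base`, starting at `offset`. */
struct int_view {
    const int *base;
    size_t base_len, offset, len;
};

int view_get(const struct int_view *v, size_t i)
{
    assert(v != NULL && v->base != NULL);           /* precondition */
    assert(v->offset <= v->base_len &&
           v->len <= v->base_len - v->offset);      /* invariant: view fits */
    assert(i < v->len);                             /* precondition: in view */
    return v->base[v->offset + i];
}
```

An index that is past the end of the view but still inside the parent array touches perfectly valid memory, so an allocation-level checker has nothing to complain about. The assert on i is what catches it.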

Re:The solution to most of your debugging needs! by 0x0d0a · 2002-06-12 13:40 · Score: 4, Informative

Amen. I remember going through classes and having to churn out preconditions and postconditions and hating it...but then realizing that assert() is your best friend in code. If a data structure should or shouldn't have something true at a given point, put in an assert() to ensure that it's the way you think it is. It helps you understand your code, and most importantly, if you change something else that has a ripple effect through the program that causes a crash hours later, a good set of assert() calls will finger the initial cause in one minute instead of one week.

This is particularly important in open source projects. Bob writes code that produces and uses a data structure and makes some assumptions about it. Now John makes a few improvements to the program, has no idea what assumptions Bob (who lives a continent away) has made, and modifies the data structures in a way that breaks Bob's code. John doesn't know what single change broke Bob's code, and Bob doesn't know all the things that John did that might affect his data structures. Liberal use of assert() will cost you nothing at runtime (compile with -DNDEBUG), takes only a tiny bit of extra typing, and is one of the very best weapons against program-spanning nasty errors.


STL by j_kenpo · 2002-06-12 13:56 · Score: 3, Insightful

IMHO, you should do a mix of C and C++ and use the Standard Template Library's vector, deque, or list classes instead of an array. Hell, even if you use an array, the STL functions and algorithms still work on them. You can even use the Queue and Stack wrappers if that's what you're doing... That's just my opinion though....

But is assert() portable? by astroboscope · 2002-06-12 15:41 · Score: 1

Its info page on my GNU/Linux box says assert is a GNU extension. I suppose I could still keep a debugging copy with asserts, and then sed them all out for a shipping copy, or better make configure do it if necessary, but that's work.

--
If we were ants living on a Rubik's cube, differential geometry would be a little more confusing.

Re:But is assert() portable? by ken_mcneil · 2002-06-12 16:31 · Score: 1

assert and NDEBUG are part of the ISO C standard, and if assert is not defined on a platform it is trivial to implement yourself.
Re:But is assert() portable? by cant_get_a_good_nick · 2002-06-19 20:13 · Score: 2

Its info page on my GNU/Linux box says assert is a GNU extension.

The assert that the info page is talking about is a command line argument, a weird gcc-ism. This can be safely ignored. If you're really interested, it's kind of a -D__OSType__, but umm, different. Try gcc -v -E - and look for all the -A..s if you're really interested to see what's set. Ignore this, any code that uses this should be shredded and the coder shot. It doesn't give any advantage, and it locks you to gcc. And while you're shooting the coder, shoot the gcc guy who called it assert, just adds to confusion unnecessarily.

The assert() that's your best coding friend is a debugging thing. It allows you to check important conditions that could lead to bugs. It's a macro that's turned off by -DNDEBUG, so on your release version, one compile switch and no check or runtime penalty.

Quickie example:
#include <assert.h>

int getSomethingFromArray(struct array *ptr, int elem)
{
    assert(ptr != NULL); /* ptr should NEVER be NULL */
    assert(elem >= 0 && elem < ptr->arraySize); /* check bound */
    return ptr->elems[elem];
}
Contrived example, but you get the point. ptr should never be NULL; if it is, something's wrong. If you violate the array bounds, something else is wrong. So in either case, the associated assert() blows up, core dumps, and you see from the core file where it crashed. Once you debug it and get ready to ship, compile with -DNDEBUG and the assert()s become empty statements, and the compiler eliminates them. It's pretty cool.

assert() has its rules on how and where to use it, and any good C book will tell you these rules. You should also be comfortable with looking at core dumps with your debugger, at least stack traces to see where it crashed. If you get a core, try:
gdb progname core
and once you're in there, type where.

And as a general rule, if it seems that something is a lot of work, chances are someone else thought that, and either has written the stuff for you, or there's a better way. Us programmers is lazy. In this case, the NDEBUG define strips everything, no need for sed.
May you code in interesting times.

use malloc... by zenyu · 2002-06-12 15:45 · Score: 2

Since libc5.4.23 the standard malloc has included rudimentary bounds checking. Just set MALLOC_CHECK_ to 1 or 2. At 1 it prints debugging output, at 2 it calls abort() so you can look at the core and see what happened. The best part is you can even do this on code you don't have the source to. Of course the other suggestions here are good, but I've tracked down a lot of bugs without having to link to one of the special range checkers.

You can also set MALLOC_CHECK_ to 0 to get a malloc like Windows and BSD that's safe against double frees and most off-by-one errors. Not useful for debugging, but it can sometimes make a buggy closed source program run without dumping core. It's slower of course, but...
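For anyone who wants to try this, here is a small sketch of the kind of bug MALLOC_CHECK_ is meant to catch. The function name and the buggy flag are hypothetical; the off-by-one is deliberate and only happens when buggy is non-zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies a string to the heap.  With buggy != 0 it
 * forgets the byte for the terminating NUL, a classic off-by-one. */
char *copy_string(const char *s, int buggy)
{
    size_t len;
    char *p;

    assert(s != NULL);
    len = strlen(s);
    p = malloc(buggy ? len : len + 1);
    if (p != NULL)
        memcpy(p, s, len + 1);  /* writes one byte too many if buggy */
    return p;
}
```

Call copy_string(some_text, 1) from a test program, free() the result, and run it as MALLOC_CHECK_=2 ./prog. Assuming a glibc that still honours MALLOC_CHECK_ directly, the free() should abort and leave a core to inspect; without the variable set, the same overrun will often go unnoticed.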

Re:use malloc... by spitzak · 2002-06-16 10:36 · Score: 2

Yes, MALLOC_CHECK_ has worked for me to find bugs as well. In one portable program we tried to locate a bug on Windows using Purify and gave up; MALLOC_CHECK_ found it right away (the bug was in an automatic constructor). Though I doubt this is normal, it sold me on using the simple tools.

Setting MALLOC_CHECK_ to zero makes it act the same but not abort or print messages. The weird end result is that malloc is safer, because a side effect of the checking is that multiple frees and writes off the end of a buffer are tolerated. I think it may actually do the tests and then decide not to report the results, so it certainly is slower.

Re:Wide page! by Anonymous Coward · 2002-06-12 16:04 · Score: 0

Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.0rc2) Gecko/20020510

Wide page my dick. Yeah, I use windows.

Creatign a Bounds checker like tool by doryzi · 2002-06-12 16:44 · Score: 1

Hi guys, Yes it's true that most of those tools are really really expensive. I think I would like to try and write something like that. I have a lot of C++ and C experience, and I also have experience with memory management. I would like to start writing something like that which will be GPL or shareware so that people can use it. I'm just sure it will be a very big project. Anyone want to help me out? Anyone want to join in? Anyone know where I can find people that would be able to help me and want to join me and take an active part in it? Let's try and start rolling out something like that !!! ;-) Cheers Dory.

Re:Creatign a Bounds checker like tool by James+Youngman · 2002-06-18 23:18 · Score: 1

It already exists - it's called Valgrind - an AC post provided a link to it. In case you can't find it, it's here.

Time to change your .sig by Anonymous Coward · 2002-06-12 17:09 · Score: 0

...unless you're really determined to let everyone know what a whiny loser you are...

Re:Time to change your .sig by Anonymous Coward · 2002-06-14 03:51 · Score: 0

You also look really determined to let everyone know what a whiny loser you are...

splint.org by Kunta+Kinte · 2002-06-12 17:43 · Score: 1

wow,

I've had splint.org in my sig for a while now. I think it's one of those projects that needs more attention. This project used to be called lclint but got renamed to splint.

There are lots of papers out there on static checkers. One good intro paper is at http://www.research.ibm.com/people/h/hind/paste01.ps. This would give you a nice intro on pointer analysis, a subtopic in static analysis.

--
Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW

Re:splint.org by joto · 2002-06-13 02:59 · Score: 2

Well, I found lclint too much hassle for what it was worth.
But that might be because I only use C for nasty low-level code. E.g. to implement a reference-counted pointer scheme, I would have to fight with lint on all kinds of "who owns this pointer" issues. And this would appear anywhere I used them in the program. I think this corresponds mostly to the third paragraph in section 4.5 of the linked to paper.
I don't think static checking is useless, in fact I am very interested in the issue, and I'd love to be proven wrong!
Do you know of some examples where lclint/splint has been used with reasonable effort to find interesting bugs (bugs that wouldn't easily show up in non-static checkers) in complex pointer-handling code (i.e. something akin to the Boehm GC, or equally awful)?

Try YAMD by grundy · 2002-06-12 21:32 · Score: 1

YAMD by Nate Eldredge has much of the functionality you're looking for. Plus, you don't have to recompile your code to use it!

Extensive list by isj · 2002-06-12 23:42 · Score: 1

The author of MPatrol has compiled a list of heap debugging and related tools: MPatrol: Related software

Some are commercial and some are freeware/public domain/whatever.

GNAT by Detritus · 2002-06-13 00:46 · Score: 1

GNAT, an open source Ada-95 compiler, supports those checks.

--
Mea navis aericumbens anguillis abundat

Re: GNAT by Black+Parrot · 2002-06-13 08:26 · Score: 2

> GNAT [gnat.com], an open source Ada-95 compiler, support those checks.

The language also supports bug-resistant programming, e.g. -
for I in Big_Array'Range loop...
When you change the size of the array later, everything stays Cooper City.

--
Sheesh, evil *and* a jerk. -- Jade
Re: GNAT by Anonymous Coward · 2002-06-17 07:03 · Score: 0

But watch out if you don't do that:

type Atype is array (1 .. 10) of Integer;
Arr : Atype;

for I in 1 .. 3000 loop
   Arr (I) := I;
end loop;

With run-time checks enabled this will raise Constraint_Error once I reaches 11, rather than segfault.

Rear End! by jishcat · 2002-06-13 01:49 · Score: 0

Flamebait my rear end.

Sounds painful...

Use scripting and VM languages, where possible. by mikehoskins · 2002-06-13 04:44 · Score: 2, Insightful

In the pursuit of creating a language holy war, I'd say to use scripting and virtual machine languages, where possible.

JSP & PHP are great for web sites. Perl & Python are great replacements for shell scripting, as well as most general-purpose stuff. LISP is great, if you're a purist. Java has its uses, to be sure.

The point of all of this is that built-in memory allocation, built-in garbage collection, and a lack of pointers is A Very Good Thing(tm). You basically don't have bounds-checking problems. In general, scripting and VM code won't break due to memory leaks and the like.

Interpreted code, in particular, is highly reliable. As an example, Perl code, if well written, which means it traps all errors, etc., is rock solid. Python, I am told, is even more solid. C, on the other hand, is highly unstable. C++ is almost as bad, and VB code on M$ boxen, breaks all the time, as well.

These days, hardware, memory, and disk are SO cheap and fast that you *should* recoup almost all program performance costs associated with interpreted/scripting/VM languages in four ways: 1) faster, easier coding; 2) easier debugging; 3) more portability; 4) more reliable software. Of course, in the case of Perl, you've got to force good style upon yourself, so items 1 and 2 may not apply, sometimes....

I'd also avoid stuff that puts too much faith in the stability of dynamically linkable code. DLLs and COM objects in M$ land are a huge problem. It goes without saying that Linux's .so mess isn't nearly as great as M$'s.

C, C++, etc., have their uses, to be sure. People use C where it doesn't belong. It belongs in writing operating systems, interfaces, drivers, etc., but it isn't, for most intents and purposes, a good business language. C++ is better, but Java and *modern* scripting languages are even better, most of the time.

If we're going after "The Best Tool for the Job," I see that you need to balance among several different tensions: a semi-popular language (so you can get help, when needed), one that's well-documented (good books at your local book store and many web sites that cover it, for example), is highly portable (the larger, older, more successful, and more mature a project gets, the chances it'll get ported increase), does the job with a minimum amount of effort (planning, coding, testing, debugging, and documentation all go into this), won't crash unexpectedly (like C/C++/VB/assembly), runs quickly enough (with modern hardware and preemptive multitasking/multiprocessing operating systems, this isn't a big issue, most of the time), is easy to fix/alter (most scripting languages don't have a compile step, so the code is the executable, ergo, it's usually easier to fix), and is general purpose (not specialized).

Just as important, you need to avoid the tensions of "too many" or "too few" languages for a project. Having 1 language that tries to force the big square peg in the small round hole is just as bad as 10 languages in a small to medium-sized project. Working on a team illustrates this even more. While SQL, OS shells, XML, HTML, and JavaScript are all exceptions to the rule (they're usually the only/main way to accomplish a specific task), having one person writing in C, another in C++, one in Perl, one in Python, another in VB, and still another in Java is usually a ticket to disaster, for most projects.

My personal rule of thumb: Perl for batch processing, utilities, command-line scripts, and most data massaging; PHP for small to medium web apps; JSP for larger web apps, or those created on teams of about 4 or more people; Java for most apps, especially GUIs; C/C++/VB for really specialized stuff; and whatever else, if you've got to support old code (new code from the above list).

Re:Use scripting and VM languages, where possible. by Anonymous Coward · 2002-06-13 10:05 · Score: 0

>> As an example, Perl code, if well written, which means it traps all errors, etc., is rock solid

oxymoron.

or should i be more PC and say: writing a stream of 1's and 0's in the machine's native language, if well written, is rock solid.

i'm tired of people blaming their tools for their own lack of understanding of how to use those tools.
Re:Use scripting and VM languages, where possible. by Anonymous Coward · 2002-06-13 14:52 · Score: 0

Great, But how does that relate to the topic of the post ?
Re:Use scripting and VM languages, where possible. by Anonymous Coward · 2002-06-14 07:10 · Score: 0

Or you could avoid using 1E6 different languages and environments, and simply use a portable *compiled* language that has the monopoly on military and commercial airplane, missile and satellite control systems.

It's the safest language allowed by the laws of physics, but not restricted in any way, with *elegant* and safe ways of doing low level access, and a massive list of tools and libraries.

Oh, yeah, and the main-stream compiler is based off of GCC - it's called GNAT.

GNU Ada Translator.

Do something useful for yourself, and learn it.
Re:Use scripting and VM languages, where possible. by gerardrj · 2002-06-14 07:37 · Score: 3, Interesting

Of course with PERL you could have the best of both worlds:

Develop in PERL with the flexibility of the interpreter and all the garbage collection and neato stuff built-in.

When you hit a "stable" release version, use the O module to compile the code, either to Perl bytecode for faster loading or to one of two versions of C code. One just spits out calls to the perl/system libraries, the other is standard C code.

--
Article X: The powers not delegated... by the Constitution...are reserved...to the people

glibc is your friend. by Karellen · 2002-06-13 11:05 · Score: 2

glibc 2.2.x has a number of really nice little quirks that you can use to help debug memory problems. Among my favourites are:

MALLOC_CHECK_

If you set the environment variable MALLOC_CHECK_ before running a program, glibc uses a slow but thorough variant of malloc to do some checking on buffer overruns, double-frees, etc... Setting MALLOC_CHECK_ to 0 makes it ignore problems, 1 causes it to print a diagnostic to stderr, and 2 causes it to print a diagnostic and abort(). All of this is in the glibc malloc(3) man page.

MALLOC_TRACE and mtrace()

If you "#include <mcheck.h>" in your source, you can call mtrace(3) at some point in your code. This function looks for the environment variable MALLOC_TRACE, which names a file to which it then logs all malloc(3)s, free(3)s, realloc(3)s and calloc(3)s. When your program is finished, you can run the mtrace(1) perl script (also supplied with glibc) to run through this log, and print out a list of all unfreed memory, all freed, unallocated memory, all double-freed memory and probably a bit more besides. It's really handy.

I tend to put the "#include <mcheck.h>" and "mtrace()" calls inside "#ifdef HAVE_MTRACE" guards, and then add "-DHAVE_MTRACE" to my CFLAGS when compiling debug builds.
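The guard arrangement described above might look roughly like this. HAVE_MTRACE is the macro named in the post; the function names and the deliberately leaky routine are hypothetical, and muntrace() is the matching glibc call that stops tracing.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef HAVE_MTRACE
#include <mcheck.h>     /* glibc: declares mtrace() and muntrace() */
#endif

/* Start logging allocations if the build asked for it.  mtrace() only
 * writes a log when the MALLOC_TRACE environment variable names a file. */
void tracing_begin(void)
{
#ifdef HAVE_MTRACE
    mtrace();
#endif
}

void tracing_end(void)
{
#ifdef HAVE_MTRACE
    muntrace();
#endif
}

/* Hypothetical leaky routine: allocates `count` blocks and frees all but
 * the last one, which is returned and never freed here. */
void *leak_one_of(int count)
{
    void *last = NULL;
    int i;

    assert(count > 0);
    for (i = 0; i < count; i++) {
        free(last);             /* free(NULL) is a harmless no-op */
        last = malloc(32);
    }
    return last;
}
```

A typical session would be to build with gcc -g -DHAVE_MTRACE, run the program as MALLOC_TRACE=trace.log ./prog, and then run mtrace ./prog trace.log to list the block that leak_one_of() left behind.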

The documentation for this can be found at http://www.gnu.org/manual/glibc-2.2.3/html_chapter/libc_3.html#SEC37

malloc() and free() are weak symbols.

glibc's copies of malloc(3) and free(3) are `weak' symbols in the library. What this means is that you can write your own functions called malloc() and free() in your program, and those will be called all the time, instead of the proper ones. You can call the originals with __libc_malloc() and __libc_free() and do little extra checks and things yourself (such as filling memory with bogus data before returning, to make sure you're not forgetting to zero some bytes here and there, for example).
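The "fill memory with bogus data" trick can be sketched without replacing malloc() at all. The version below is a plain wrapper under a made-up name (poisoned_malloc), which is a simpler stand-in for the symbol-override approach described above; the 0xAA pattern is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0xAA    /* arbitrary, easy to spot in a debugger */

/* Hypothetical wrapper: allocate and fill with a recognisable pattern so
 * code that forgets to initialise memory misbehaves visibly. */
void *poisoned_malloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        memset(p, POISON_BYTE, size);
    return p;
}

/* Returns 1 if every byte of the block still holds the poison pattern. */
int is_poisoned(const void *p, size_t size)
{
    const unsigned char *bytes = p;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < size; i++)
        if (bytes[i] != POISON_BYTE)
            return 0;
    return 1;
}
```

Code that reads a field it never wrote will then see 0xAAAAAAAA-style values instead of a conveniently zeroed block.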

gdb is also really great too and has loads of stuff that I've not found in other debuggers. Check out the manual sections on `ignore' (to ignore a breakpoint x times to catch the (x + 1)th malloc), and `commands' (to automatically print out variable values and continue for example) w.r.t. breakpoints.

http://www.gnu.org/manual/gdb-5.1.1/html_chapter/gdb_6.html#SEC34
http://www.gnu.org/manual/gdb-5.1.1/html_chapter/gdb_6.html#SEC35

--
Why doesn't the gene pool have a life guard?

dmalloc by Anonymous Coward · 2002-06-13 12:00 · Score: 0

dmalloc is by far the best memory checker I've tried

Scheme - - by Anonymous Coward · 2002-06-13 12:11 · Score: 0

I'm not sure if it meets your requirements for a *good* object system, but certainly an existing Scheme object system is Scheme--, which is the obvious corollary to C++. I used this several years ago in an algorithms course. I didn't have the understanding of and respect for OO that I currently possess, so I can't say if it was a crock or I simply didn't appreciate it.

Tried to google up some Scheme-- links, but alas, no luck. Sorry.

Re:Scheme - - by jtdubs · 2002-06-13 14:31 · Score: 2

Thanks a lot. I'll definitely do some looking into Scheme--. I appreciate the reply. Have a good day,

Justin Dubs

Try Bell Labs vmalloc by BuildMonkey · 2002-06-13 15:59 · Score: 1

Bell Labs released vmalloc() for public use and gave a white paper on it at 1996 USENIX. I recently investigated it for memory leak problems in an embedded real time system. http://www.research.att.com/sw/tools/vmalloc/

FunctionCheck - gprof replacement. by Anonymous Coward · 2002-06-13 21:09 · Score: 0

Development is currently stopped but it looks super powerful and tracks memory allocation/deallocation by function etc...

http://www710.univ-lyon1.fr/~yperret/fnccheck/profiler.html

Scheme, Lisp not good for commercial programs by Anonymous Coward · 2002-06-14 00:30 · Score: 0

I evaluated Scheme, Lisp for writing programs that we can distribute.

They don't seem to have many of the facilities that we take for granted in a widely used language. Like standard interfaces to TCP/IP networks, standard interfaces to databases.

What I would have liked to have done is - here is the API for sockets, here is the API for databases and spend time writing code. Agreed, not chasing mallocs saves time but so does not having to write foreign function interfaces to standard functionality. Maybe I was wrong in my evaluation. If so I would be glad to be corrected and begin using these languages

checker-gcc or a better language by Anonymous Coward · 2002-06-14 20:09 · Score: 0

There is an older version of gcc, "checker-gcc" (based on gcc 2.8) which is the most powerful memory checker available under Free Software.

Without a doubt, the most industrial strength language available for Free Software use is GNU Ada. Ada won't let you f*ck up. It is truly an awesome language. Check out the GNU Visual Debugger - gvd - for one of the coolest examples of what Free Software Ada technology can do.

Good places to start:

There is a wealth of Ada learning resources on the web, perhaps more online instruction than for any other programming language. Ada is at or above the same level of abstraction as C++. C++ programmers should not have too much trouble learning Ada. One other nice aspect of Ada is that since it was the first ISO standard OOP language, and since the way it interacts with other programming languages is codified as part of that standard, it is very easy to use Ada for the "mission critical" parts of a software project. There is no need to re-write a whole project to start taking advantage of Ada; it can be done piece by piece.

memory usage tracking by Anonymous Coward · 2002-06-15 00:15 · Score: 0

I've found that this works quite nicely for tracking memory usage. It doesn't check for overruns etc. but is pretty flexible and makes spotting leaks, double frees etc. very easy (which is pretty much all I use BoundsChecker for in win32...). It's GPL too and nice and simple to look at... cheers.

bcc is a patch to gcc which gives gcc bounds check by Anonymous Coward · 2002-06-17 09:03 · Score: 0

http://web.inter.nl.net/hcc/Haj.Ten.Brugge/

I have used it, and it works very well.
Given the requirement of a using a
brain-damaged language, it is the best open source
tool I am aware of for finding bounds checking errors.

jeff.deifik@jpl.nasa.gov

Simple GCC macro makes bounds checking a snap by Alice+Terry · 2002-06-26 08:59 · Score: 1

With gcc, you can add bounds checking easily with extended inline assembler. It only adds one assembly instruction, and very little overhead. The C macro Bound(), defined below, makes it very simple. Here is a demonstration:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

struct _bounds {
    uint32_t lower;
    uint32_t upper;
} __attribute__ ((aligned (4)));

#define Bound(X,Y) __asm__ ( "bound %0,%1\n\t" : : "r" (X), "m" (Y) )
#define UPPER_BOUND(X) (sizeof(X)-1)

/* create a plain vanilla array for our test */
#define LENGTH 15
static char test_array [LENGTH];

/* store the lower and upper limits of your test array */
struct _bounds limits = { 0, UPPER_BOUND(test_array) };

void bound_test (int index)
{
    /* if the index is out of range, create a core dump */
    Bound (index, limits);
    test_array[index] = 'a';
}

/*
 * We can invoke our test procedure bound_test() by entering
 * an array index on the command line. If the index is out
 * of range for the bound_test() procedure, the x86 "bound"
 * instruction will trigger a core dump.
 */
int main(int argc, char *argv[])
{
    if (argc > 1) {
        bound_test (atoi(argv[1]));
    }
    return 0;
}

Re:Simple GCC macro makes bounds checking a snap by Alice+Terry · 2002-06-26 11:02 · Score: 1

Oops. The UPPER_BOUND() macro which I gave only works with
character arrays. Here is a more general version:
#define UPPER_BOUND(X) ((sizeof(X)/sizeof(X[0]))-1)
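To show why the corrected macro matters, here is a small sketch with an int array, where sizeof(X)-1 would give the wrong answer (59 rather than 14 on a platform with 4-byte int). It uses a plain assert() as a portable stand-in for the bound instruction, which is only available in 32-bit x86 code; ARRAY_LEN, samples, set_sample and get_sample are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Element count and highest valid index of a true array (not a pointer). */
#define ARRAY_LEN(X)   (sizeof(X) / sizeof((X)[0]))
#define UPPER_BOUND(X) (ARRAY_LEN(X) - 1)

static int samples[15];

/* Portable stand-in for the bound instruction: abort on a bad index.
 * size_t is unsigned, so a negative index also fails the check. */
void set_sample(size_t index, int value)
{
    assert(index <= UPPER_BOUND(samples));
    samples[index] = value;
}

int get_sample(size_t index)
{
    assert(index <= UPPER_BOUND(samples));
    return samples[index];
}
```

Both macros only work on an actual array; applied to a pointer they silently compute the wrong length.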
