Memory Checker Tools For C++?

Most tools I've tried are useless by DrXym · 2007-06-05 21:59 · Score: 3, Insightful

I've played with Boundschecker, Purify (& Quantify) and Fortify. My experience of these tools is that they either take a painfully long time to run, throw up too many spurious warnings or crash outright after eating all available memory / disk space.

They might be useful for small apps but if you have a massive app they are almost more trouble than they are worth.

It's hard to say what you can do except foster safe coding practice and highlight the common pitfalls such as memory leaks, buffer overflows etc. Many compilers can help detect heap / memory overruns because the debug libs put guard bytes on the stack & heap that trigger exceptions when something bad happens. There are also 3rd party libs such as Boehm which help with memory leak / garbage collection issues and dump stats. I'd say using STL & Boost is also a very good way of minimizing errors, simply because doing so avoids having to write your own implementations of arrays, strings etc. which are bound to be less stable.

Re:Most tools I've tried are useless by peterpi · 2007-06-06 02:12 · Score: 4, Insightful

I didn't explain it all that well. What I mean is: I love destructors.

A good example of what I'm talking about is a std::ifstream versus a java.io.FileInputStream. If you make an ifstream on the stack, you can be absolutely certain that when it goes out of scope, the destructor will be called and the file closed. You can be certain that it will happen, and you can also be certain when it happens; at the very point it goes out of scope.

With a heap based FileInputStream, you have no such guarantee. You leak it, and you just hope that the finaliser gets called soon (if at all). I've had more than one occasion where I've been leaking FileInputStreams quicker than the garbage collector cares to clean them up, and sooner or later the OS says 'no' and you get an exception. And it's very difficult to reproduce, because it's all down to the whim of the garbage collector, and you always go slower when you're looking for a bug.

Of course the answer to this is to say "Well you should close() your input stream beforehand". But that's just as bad as saying "You should delete your heap based objects" in C++. It's that situation of having to manually shut down objects that seems old fashioned to me.

Maybe there's a better way these days, I've been away from Java for a couple of years now.

(I do enjoy coding in either language though!)

Re:Boost? Ugh by Viol8 · 2007-06-05 22:28 · Score: 4, Insightful

"you'd better get used to the "weird syntax" of templates and especially the boost libraries"

I'm used to template syntax (though I think it's ugly and Stroustrup could have done a lot better) but Boost makes it worse by overloading operators and then using them in ways never intended, producing syntax that a plain C++ programmer wouldn't even recognise, never mind understand what it's doing, e.g. the gratuitous overload of () for matrix ops where a simple function call would have been much cleaner and easier to follow.

A second vote for Valgrind by RatCommander · 2007-06-05 23:39 · Score: 3, Insightful

Valgrind's default memcheck tool is an excellent way of finding memory errors - ranging from extremely subtle to obvious. In addition, Valgrind can be used as a code profiler, cache simulator and many other things. It really is an excellent tool - I recommend it to anybody writing C++.

--
"It is better to die for an idea that will live than to live for an idea that will die" - Steve Biko

Re:Boost? Ugh by maxwell demon · 2007-06-06 00:23 · Score: 3, Insightful

"I suppose you like adding vector components manually, instead of doing v1 + v2?"

"No, something like vectorAdd(v1,v2) would be a lot more readable and a damn sight easier to grep for. Idiot."

Then probably we should remove the operators for built-in types as well. After all, you could use functions like doubleAdd(a, b). As a bonus, you'd not get nasty surprises when mixing unsigned and signed integers. intGreater(a, n) would always give you the expected answer, even if a is negative and n is unsigned. If you'd want to compare in unsigned arithmetic, you'd use uintGreater(a, b) instead. And what dereferencePointer(p) does is self-evident, unlike *p. Also, removing all operators would greatly simplify the parser, because the only types of expressions it would have to parse would be constants, variables and function calls.

But I just see you signed your post with "Idiot." Thus I guess I shouldn't have taken it seriously anyway. :-)

--
The Tao of math: The numbers you can count are not the real numbers.

Re:um by Shadowlion · 2007-06-06 00:23 · Score: 3, Insightful

> and works fine in legacy apps

Regarding legacy applications, I think the point was that he can't go back through the app and rewrite everything to use smart_ptr.

Most people can't understand Purify's output by Anonymous Coward · 2007-06-06 00:51 · Score: 5, Insightful

Most people can't understand Purify's output, and I've actually run across coders who actually believe their code can't be as bad as Purify says it is.

For example, this code has serious issues:

extern string method_that_returns_string_object();
char *ptr;
...
ptr = method_that_returns_string_object();
...

That actually will compile, and seem to "work". But it's horribly wrong, and Purify will find the problems.

And FWIW, I've used Purify on massive apps, and found huge problems that the developers didn't even know were there. On one project, they couldn't explain why their "perfect" app kept crashing, either. Worse for them, I had been hired as a consultant to fix their problems that they couldn't seem to believe existed (HINT: your boss hired someone from the outside...), and after watching the team flail and spend literally almost a man-year trying to find one memory bug, I finally had enough of "advice giving" being ignored and got on their system, linked their app under Purify, ran it, and found the bug - a double delete of an object from two different threads. It all took me about fifteen minutes. I did that in front of their management. I made my point.

Purify (and like tools) are a great help. Not using them is like trying to build a house without power tools. Yeah, it can be done. But what would you think if you hired a builder to make your house and his team showed up carrying hand saws? Oh, and you are paying that team to hand-saw all the lumber...

What would you think of that builder?

Yet, when a developer asks for tools like Purify, management often balks. Because 1) they're shortsighted, and 2) developers don't know how to use such tools.

Like I said - what would you think of a construction company where the workers don't know how to use modern power tools to help their productivity?

Well, you just put yourself in that category.

Yes, Purify is somewhat slower than running without Purify. But it's a lot faster than most other full-memory checking methods. If you're worried about speed, link against the Win32 debug libraries - they'll at least show problems with double free() calls, access of free()'d and deleted objects, etc. And without too much of a performance penalty.

Re:um by Evanisincontrol · 2007-06-06 01:14 · Score: 3, Insightful

I hope that was supposed to be sarcastic, otherwise you are the worst developer I've ever heard of. Rewriting an entire legacy application just to use shared pointers is downright stupid. He might as well just redesign the entire software and build it from the ground up... but then you're not "maintaining" anymore. You've completely redefined your job description.

Don't allocate or free = no leaks = need no tools by seniorcoder · 2007-06-06 01:19 · Score: 5, Insightful

We are running many high speed financial message processing applications. A crash for any reason (including a leak) would be very costly for us.

We pre-allocate pools of objects at startup and then re-use them. No other memory is allocated or freed while the process is running. Our pools of reusable objects are monitored very carefully, as an object that isn't released back to its pool when the job is done is akin to a memory leak. Use of sentries to automatically release objects back to the pools when they fall out of scope is mandatory.

So my answer to the problem is:
1. Use sentries (or some other mechanism) to guarantee memory is released.
2. Don't allocate except at startup.
3. No need for elaborate tools due to the above.

I'm sure that not all applications' data usage would fit into this model, but it is surprising how many can.

We have seen some leaks in our applications. These were tracked down to STL internally leaking. They weren't generally very large and therefore we continue to live with them.

On the subject of garbage collectors, some of our colleagues use Java and .NET. Both sets of colleagues have had major performance problems caused directly by the garbage collectors kicking in and consuming vast CPU power while they did their thing. The result was a failure to process messages in a timely manner in our high speed environment. The solution in both languages was to use pools of reusable objects and never cause their reference counts to drop to 0. Thus they implemented the very same mechanism that we use in C++ and avoided the garbage collectors.

So don't think that a garbage collector is the solution. Perhaps in less demanding applications it is a potential answer.

Lastly, I strongly dislike anything from Rational. I find them overpriced unreliable bloatware (YMMV). Purify used to be good some time ago, but those days are long gone.

I echo what others have said above. You are a developer. You know your requirements. Build a simple tool to monitor and check your usage. For us it was managed pools of re-usable objects.

Re:um by bhsurfer · 2007-06-06 01:47 · Score: 4, Insightful

The first thing I was told by my boss when I got hired was "You're going to look at this app and want to rewrite it from scratch. Don't do it, that's not what we want you for." Software doesn't need to be pretty, you just make improvements as you can and leave the ugly but solid code alone until necessary. It's an extremely rare situation to have the luxury of a complete redesign/rewrite.

I guess that's a long way of saying "I agree completely with what you just said."

--
Those are my principles, and if you don't like them... well, I have others.
Groucho Marx

Re:um by bhsurfer · 2007-06-06 03:42 · Score: 3, Insightful

I meant "ugly" in the sense of "Not the way I would have done it" rather than in a "Holy shit, what a freakin' mess! This guy should be bagging groceries, not writing software!" kind of way. I certainly do not think that clever tricks and mounds of complex spaghetti code that were designed by avalanche are maintainable, believe me.

I also have (unfortunately) written enough ugly stuff that when I go back later I say "I can't believe I actually did something that stupid."

You live, ideally you learn, and when you look at code you wrote 5 years ago you likely slap your forehead in embarrassment - that's how you know you're getting better. That, and when your coworkers aren't trying to slash their wrists when they get handed something you wrote...

--
Those are my principles, and if you don't like them... well, I have others.
Groucho Marx

Re:um by joto · 2007-06-06 03:44 · Score: 4, Insightful

> If it's ugly it's not solid. Ugly code is hard to understand at first glance, and it's easy to introduce an error. Or do you consider code that's easy to make a mistake with as actually being "maintainable"?

You are confusing two aspects here. Ugliness does limit maintainability. But it does not limit "solidness". "Solidness" would mean that the code actually works, and has a proven track record, such as being used in production for over 20 years. Code that has been in production for over 20 years is usually both solid and ugly.

> That ugly code is usually a monument to the "there's not enough time to do it right, but there's always enough time to do it over ... and over ... and over" and "ship it now - fix it later."

Or it could be a monument to "the world is a complex place, and if you change anything here, and it causes the program to fail in some weird special case, your company is going to lose umpteen zillion dollars". While the reality is probably somewhere in between, rewrites should still be avoided like the plague. However, if you really have taken the time to understand what some nasty bit of code does, there's nothing wrong with cleaning it up. But most of the time, the ugly code is there for a reason.

Home brew tool for memory leaks with glibc by cant_get_a_good_nick · 2007-06-06 04:11 · Score: 3, Insightful

GLIBC allows you to create hooks for the standard mem functions (malloc/realloc/free). Remember that g++ still calls these under new/delete so it works for C++ also.

One of our guys coded up a simple shared lib that can be loaded with LD_PRELOAD that sets simple hooks of printing memory locations for new/realloc/delete. He then wrote a perl script that kept track of these things and spit out anything that was malloc'ed and not realloc'ed or free'd.

I can't post it, because technically it's not my code, it's my company's. But his shared lib code is just 300 lines long, and shouldn't be hard to duplicate. The perl log filter is even more straightforward. Each malloc gets saved. Each free removes the malloc. Each realloc removes the old malloc and adds a new one. Anything left over is a leak.

Override __malloc_initialize_hook with a pointer to your init_function. In your init_function, save the old functions at __malloc_hook, __free_hook, __memalign_hook and __realloc_hook and substitute your own. Now write your replacement functions: in each one, do your logging, temporarily restore the old hooks and call the original function, then reinstall your hook on the way out to catch the next call. All of the hooks should be wrapped in a mutex to avoid re-entrancy problems.

It's not a full memory detector, it just does leaks, but it's non-intrusive, requires no recompiles, and is the best way we have to leak detect our huge, long-running server code.

Re:two points by Anonymous Brave Guy · 2007-06-06 05:15 · Score: 4, Insightful

It's not automatically bad, but using semi-automated memory management like this tends to reduce the emphasis on constructing things only when they're needed and destroying them immediately when you're done with them. This concern, known as "Java bloat syndrome" in honour of the language that first popularised it, can lead to major performance problems in applications that manipulate a lot of data, and is a favourite mistake made by the cult of "hardware is cheap, so optimisation doesn't matter".

The thing is, this sort of care-free programming philosophy is natural in languages like Java, so languages like Java have had to learn from their early mistakes and adapt. There have been dramatic improvements in GC technology since those early days, and today there isn't the same degree of performance penalty associated with relying on GC to clear everything up.

However, this sort of behind-the-scenes magic isn't really the "C++ way". You can do it, but tools like shared_ptr don't have the same level of sophistication as full-blown GC. Using them requires some care from programmers, and as the grandparent post said, this can lead to problems if the programmers come to rely on them more than they ought.

FWIW, I'm not sure I'd have described things in quite such black-and-white terms as the GP, but I can see the underlying point and I think it's a valid one.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:two points by shutdown -p now · 2007-06-06 17:45 · Score: 3, Insightful

In other words, programmers shouldn't use shared_ptr as if it were a replacement for GC. When it is worded thus, I can fully agree with that (and indeed, anyone who understands how reference counting works, will agree as well). The nice thing about shared_ptr is that, unlike GC, it is still fully deterministic, and so it properly preserves the "C++ spirit".

Slashdot Mirror
