Memory Checker Tools For C++?
An anonymous reader writes "These newfangled memory-managed languages like Java and C# leave an old C++ dev like me feeling like I am missing the love. Are there any good C++ tools out there that do really good memory validation and heap checking? I have used BoundsChecker but I was looking for something a little faster. For my problem I happen to need something that will work on Windows XP 64. It's a legacy app so I can't just use Boosts' uber nifty shared_ptr. Thanks for any ideas."
They might be useful for small apps but if you have a massive app they are almost more trouble than they are worth.
It's hard to say what you can do except foster safe coding practice and highlight the common pitfalls such as memory leaks, buffer overflows etc. Many compilers can help detect heap / memory overruns because the debug libs put guard bytes on the stack & heap that trigger exceptions when something bad happens. There are also 3rd party libs such as Boehm which help with memory leeak / garbage collection issues and dump stats. I'd say using STL & Boost is also a very good way of minimizing errors too simply because doing so avoids having to write your own implementations of arrays, strings etc. which are bound to be less stable.
"you'd better get used to the "weird syntax" of templates and especially the boost libraries"
I'm used to templates syntax (though I think its ugly and Stroustrup could have done a lot better) but Boost makes it worse by overloading operators and then using them in ways never intended that produce syntax that a plain C++ wouldn't even recognise, never mind understand what its doing.eg the gratiutous overload of () for matrix ops where a simple function call would have been much cleaner and easier to follow.
Valgrind's default memcheck tool is an excellent way of finding memory errors - ranging from extremely subtle to obvious. In addition, Valgrind can be used as a code profiler, cache simulator and many other things. It really is an excellent tool - I recommend it to anybody writing C++.
"It is better to die for an idea that will live than to live for an idea that will die" - Steve Biko
No , something like vectorAdd(v1,v2) would be a lot more readable and a damn site easier to grep for. Idiot. Then probably we should remove the operators for built-in types as well. After all, you could use functions like doubleAdd(a, b). As a bonus, you'd not get nasty surprises when mixing unsigned and signed integers. intGreater(a, n) would always give you the expected answer, even if a is negative and n is unsigned. If you'd want to compare in unsigned arithmetics, you'd use uintGreater(a, b) instead. And what dereferenePointer(p) does is self-evident, unlike *p. Also, removing all operators would greatly simplify the parser, because the only types of expressions it would have to parse would be constants, variables and function calls.
But I just see you signed your post with "Idiot." Thus I guess I shouldn't have taken it seriously anyway.
The Tao of math: The numbers you can count are not the real numbers.
> and works fine in legacy apps
Regarding legacy applications, I think the point was that he can't go back through the app and rewrite everything to use smart_ptr.
For example, this code has serious issues: extern string method_that_returns_string_object();
char *ptr;
.
.
.
ptr = method_that_returns_string_object();
.
.
And FWIW, I've used Purify on massive apps, and found huge problems that the developers didn't even know were there. On one project, they couldn't explain why their "perfect" app kept crashing, either. Worse for them, I had been hired as a consultant to fix their problems that they couldn't seem believe existed (HINT: your boss hired someone from the outside...), and after watching the team flail and spend literally almost a man-year trying to find one memory bug, I finally had enough of "advice giving" being ignored and got on their system, linked their app under Purify, ran it, and found the bug - a double delete of an object from two different threads. It all took me about fifteen minutes. I did that in front of their management. I made my point.
Purify (and like tools) are a great help. Not using them is like trying to build a house without power tools. Yeah, it can be done. But what would you think if hired a builder to make your house and his team showed up carrying hand saws? Oh, and you are paying that team to hand-saw all the lumber...
What would you think of that builder?
Yet, when a developer asks for tools like Purify, management often balks. Because 1) they're shortsighted, and 2) developers don't know how to use such tools.
Like I said - what would you think of a construction company where the workers don't know how to use modern power tools to help their productivity?
Well, you just put yourself in that category.
Yes, Purify is somewhat slower than running without Purify. But it's a lot faster than most other full-memory checking methods. If you're worried about speed, link against the Win32 debug libraries - they'll at least show problems with double free() calls, access of free()'d and deleted objects, etc. And without too much performance problems.
I hope that was supposed to be sarcastic, otherwise you are the worst developer I've ever heard of. Rewriting an entire legacy application just to use shared pointers is downright stupid. He might as well just redesign the entire software and build it from the ground up... but then you're not "maintaining" anymore. You've completely redefined your job description.
We are running many high speed financial message processing applications. A crash for any reason (including a leak) would be very costly for us.
.NET. Both sets of colleagues have had major performance problems caused directly by the garbage collectors kicking in and consuming vast CPU power while they did their thing. The result was a failure to process messages in a timely manner in our high speed environment. The solution in both languages was to use pools of reusable objects and never cause their reference counts to drop to 0. Thus they implemented the very same mechanism that we use in C++ and avoided the garbage collectors.
We pre-allocate pools of objects at startup and then re-use them. No other memory is allocated or freed while the process is running. Our pools of reusable objects are monitored very carefully as an object that isn't release back to its pool when the job is done is akin to a memory leak. Use of sentries to automatically release objects back to the pools when they fall out of scope is mandatory.
So my answer is to the problem is:
1. Use sentries (or some other mechanism) to guarantee memory is released.
2. Don't allocate except at startup.
3. No need for elaborate tools due to the above.
I'm sure that not all applications data usage would fit into this model, but it is surprising how many can.
We have seen some leaks in our applications. These were tracked down to STL internally leaking. They weren't generally very large and therefore we continue to live with them.
On the subject of garbage collectors, some of our colleagues use Java and
So don't think that a garbage collector is the solution. Perhaps in less demanding applications it is a potential answer.
Lastly, I strongly dislike anything from Rational. I find them overpriced unreliable bloatware (YMMV). Purify used to be good some time ago, but those days are long gone.
I echo what others have said above. You are a developer. You know your requirements. Build a simple tool to monitor and check your usage. For us it was managed pools of re-usable objects.
I guess that's a long way of saying "I agree completely with what you just said."
Those are my principles, and if you don't like them... well, I have others.
Groucho Marx
I also have (unfortunately) written enough ugly stuff that when I go back later I say "I can't believe I actually did something that stupid."
You live, ideally you learn, and when you look at code you wrote 5 years ago you likely slap your forehead in embarrassment - that's how you know you're getting better. That, and when your coworkers aren't trying to slash their wrists when they get handed something you wrote...
Those are my principles, and if you don't like them... well, I have others.
Groucho Marx
You are confusing two aspects here. Ugliness does limit maintainability. But it does not limit "solidness". "Solidness" would mean that the code actually works, and has a proven track record, such as being used in production for over 20 years. Code that has been in production for over 20 years is usually both solid and ugly.
Or it could be a monument over "the world is a complex place, and if you change anything here, and it causes the program to fail in some weird special case, your company is going to loose umpteen zillion dollars". While the reality is probably somewhere in between, rewrites should still be avoided like the plague. However, if you really have taken the time to understand what some nasty bit of code does, there's nothing wrong about cleaning it up. But most of the time, the ugly code is there for a reason.
GLIBC allows you to create hooks for the standard mem functions (malloc/realloc/free). Remember that g++ still calls these under new/delete so it works for C++ also.
One of our guys coded up a simple shared lib that can be loaded with LD_PRELOAD that sets simple hooks of printing memory locations for new/realloc/delete. He then wrote a perl script that kept track of these things and spit out anything that was malloc'ed and not realloc'ed or free'd.
I can't post it, because technically it's not my code it's my company's. But his shared lib code is just 300 lines long, and shouldn't be hard to duplicate. The perl log filter is even more straighforward. Each malloc gets saved. Each free removes the malloc. Each realloc removes the old malloc and adds a new one. Anything left over is a leak.
Override __malloc_initialize_hook with a pointer to your init_function. In your init_function, save the old functions at __malloc_hook __free_hook __memalign_hook and __realloc_hook and substitute your own. Now write your replacement functions, in it, do your logging and temporarily replce the old hooks and call the original functions, replace with your hook on the way out to get the next call. All of the hooks should be wrapped in a mutex to help re-entrancy problems.
It's not a full memory detector, just does leaks, but it's non-intrusive, requires no recompiles, and is the best way we have to leak detect our huge server long running code.
It's not automatically bad, but using semi-automated memory management like this tends to reduce the emphasis on constructing things only when they're needed and destroying them immediately when you're done with them. This concern, known as "Java bloat syndrome" in honour of the language that first popularised it, can lead to major performance problems in applications that manipulate a lot of data, and is a favourite mistake made by the cult of "hardware is cheap, so optimisation doesn't matter".
The thing is, this sort of care-free programming philosophy is natural in languages like Java, so languages like Java have had to learn from their early mistakes and adapt. There have been dramatic improvements in GC technology since those early days, and today there isn't the same degree of performance penalty associated with relying on GC to clear everything up.
However, this sort of behind-the-scenes magic isn't really the "C++ way". You can do it, but tools like shared_ptr don't have the same level of sophistication as full-blown GC. Using them requires some care from programmers, and as the grandparent post said, this can lead to problems if the programmers come to rely on them more than they ought.
FWIW, I'm not sure I'd have described things in quite such black-and-white terms as the GP, but I can see the underlying point and I think it's a valid one.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
In other words, programmers shouldn't use shared_ptr as if it were a replacement for GC. When it is worded thus, I can fully agree with that (and indeed, anyone who understands how reference counting works, will agree as well). The nice thing about shared_ptr is that, unlike GC, it is still fully deterministic, and so it properly preserves the "C++ spirit".