Memory Leaks
G3ck0G33k writes: "Is there any free software version/clone of Rational's programs PureCoverage and/or Purify? I have worked with both of them on fairly large projects (>150,000 lines of code) and they were great to work with. When the first runs of Purify found nearly fifty instances of minor memory leaks, I was deeply frustrated/impressed. A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget. Of course, the more kinds of leaks it may detect, the better. GeckoGeek" We had a similar question last year but there's no harm in seeing what the current answers are.
For the other kinds of stuff Purify does (aside from memory leaks), look at Greg McGary's bounded pointer work.
Bad news: You'll have to build your own gcc (Greg's changes haven't yet been accepted into the gcc trunk), and rebuild all your libraries with it (just as Purify re-writes all your libraries).
Good news: The resulting code is much faster than Purify'ed code, and finds some problems Purify doesn't. I know of a major software development effort (hundreds of developers, millions of lines of code; sorry, can't give details) that uses bounded pointers to great advantage.
Other tools: GNU Checker, dbmalloc, Bruce Perens' Electric Fence, MemProf, mpatrol, and Mprof; Google searches will turn them all up.
The Boehm-Weiser garbage-collecting malloc() can be built in a leak-detection mode. Every time an object is leaked, it prints out the address of the memory in question. Do that. Then it's 15 lines of python to correlate that back with the malloc() calls; I wrapped malloc/realloc to print out the line number and filename, e.g.
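#include <stdio.h>
#include <gc.h> /* Boehm collector header; may be <gc/gc.h> on some installs */
void *our_malloc(size_t howbig, int line, char * file)
{
void *p;
p=GC_malloc(howbig);
/* The second %s was meant for a function name; pass "" since we have none. */
fprintf(stderr, "Line %d of %s/%s(): %p\n", line, file, "", p);
return p;
}
#define malloc(x) our_malloc(x, __LINE__, __FILE__)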
with similar for realloc (and make free do GC_free).
Then run the proggy, redirecting stderr through a simple python script: (leading spaces have been replaced with underscores since slashdot doesn't do PRE)
import sys
a={}
for line in sys.stdin.readlines():
__line=line.strip()
__if "0x" not in line:
____continue
__num=line[line.find("0x"):]
__try:
____num=num[0: num.index(" ")]
__except ValueError:
____pass
__if line.startswith("Line"):
____a[num]=line
__elif num in a:
____print("Leaked object: "+a[num])
When I run my program this way I get the following output:
Leaked object: Line 43 of leak_stuff.c/(): 0x806efe0
Leaked object: Line 43 of leak_stuff.c/(): 0x806eff0
Leaked object: Line 55 of leak_stuff.c/(): 0x806dfd8
That tells me which lines made the initial allocations of the leaked objects.
The garbage-collecting malloc is really cool; it's at:
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
for now, but rumor has it that gcc will become the official source for it at some point (it's needed for the Java compiler).
Sumner
rage, rage against the dying of the light
I went through this phase of trying to fix up the memory handling of all the code I'd ever written. I found ccmalloc to be the best. It's the easiest: instead of gcc -o prog prog.o you just prefix the link command with ccmalloc, e.g. ccmalloc gcc -o prog prog.o. It provides a nicely formatted output log file, with configurable filtering, showing the stack trace of each unfreed block, and it also catches over/underflows and lots of other stuff. Hint: if you are using the C++ standard library, get g++-3 (with libstdc++-3) and #define __USE_MALLOC to disable malloc pooling. RPMs here
What Memory Leak Detector Do People Use?
What is pirate software? Software for inventory of stolen treasure?
Your kung fu is no good, Anonymous Coward.
Write a malloc wrapper and #define it in place of the real thing. With #define you can easily log the location in the code, the amount of RAM, and the address in memory to a file, then write a script in the language of your choice to see which addresses weren't deallocated, and match them with the appropriate malloc call, whose log entry also contains the location in code. It took me about an hour to implement this in a multi-thousand line program and it works very well. The only thing it doesn't catch is when a library call mallocs something and expects you to deallocate it, but I solved this by including a fake malloc call that just logs but doesn't actually malloc, so you'd call it right after the library call that actually does the malloc.
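For what it's worth, here is a minimal sketch of the kind of wrapper described above. It is not the poster's code: the names log_malloc, log_free, log_foreign, NOTE_LIBRARY_ALLOC and the log file name alloc.log are all made up for illustration. Each allocation writes an "A" record and each free an "F" record, so a script can pair them up by address afterwards.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* All names below are hypothetical; "alloc.log" is an assumed file name. */
static FILE *alloc_log;

/* Open the log on first use; fall back to stderr rather than lose records. */
FILE *log_file(void)
{
    if (alloc_log == NULL) {
        alloc_log = fopen("alloc.log", "w");
        if (alloc_log == NULL)
            alloc_log = stderr;
    }
    return alloc_log;
}

/* Allocate, and record the address, size and source location. */
void *log_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    fprintf(log_file(), "A %p %lu %s:%d\n", p, (unsigned long)size, file, line);
    return p;
}

/* The "fake malloc": record a block that a library allocated for us. */
void log_foreign(void *p, const char *file, int line)
{
    assert(p != NULL);
    fprintf(log_file(), "A %p 0 %s:%d\n", p, file, line);
}

/* Record the release, then really free the block. */
void log_free(void *p, const char *file, int line)
{
    fprintf(log_file(), "F %p 0 %s:%d\n", p, file, line);
    free(p);
}

/* Defined after the functions above so they still reach the real malloc/free. */
#define malloc(n) log_malloc((n), __FILE__, __LINE__)
#define free(p) log_free((p), __FILE__, __LINE__)
#define NOTE_LIBRARY_ALLOC(p) log_foreign((p), __FILE__, __LINE__)
```

Any address that has an "A" line with no later "F" line is a candidate leak. Since addresses get reused, the script should process the log in order rather than just comparing two sets.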
- "Ford, you're turning into a penguin. Stop it." Go Prefect!
For anyone interested, an excellent book that covers resource management in C++ is C++ In Action: Industrial-strength Programming Techniques by Bartosz Milewski. I'm relatively new to C++, and this is the book that really sold me on the language...he presents a methodology that practically guarantees you won't have leaks. He's also practical enough to tell you how to retrofit the technique to existing projects. Milewski is a former physicist, and a very clear writer and thinker. And to top it off, the full text is available on the web.
I like dmalloc for memory debugging. It even found a memory bug in a program that Purify choked on. It doesn't have a GUI.
Looks like libc doesn't have heap-walking APIs anymore... With them you could solve this pretty easily by putting a "tag" in front of the allocation and passing back a pointer past the tag space - free just moves the pointer it was passed "backwards" to the tag, uses normal free, and voila. A heapwalk is then done at the end to find all remaining blocks, and you just print the tags that are left.
However, in lieu of that, you can do your own pseudo-tagging in the form of allocating lists of allocations - more or less what a garbage collector would do. You can then look at your list during process shutdown to see who allocated what where. As an added bonus, you can capture some stack backtraces so you know the context of the allocation. Far too often I find myself writing an object factory, and I know my heap tagging procedures will be useless without stack context - having 2000 allocations all at "factory.cpp:3042" isn't my idea of fun.
The code below can be fixed up to use stack backtracing if it's available on the platform. (I know on Win32, you can capture stack traces by faking an SEH exception, capturing the stack frames, and then using imagehlp.dll to map the eip's back to their associated functions... I've been too long away from *nix C to know if something similar is available.)
A far better implementation would use a hashtable on various groups of bits in the allocated pointer over a small number of preallocated pages, but that's got other overhead associated with expanding buckets, etc. that this brute-force implementation doesn't have. At least this grows "hot spots" of pages that are getting hit, on the basis that alloc/free of a block happens in a "nested" pattern more often than not - so you get "alloc a, alloc b, free b, alloc c, free c, free a" more than you get "alloc a, alloc b, alloc c, free a, free b, free c."
(Of course, I haven't tested the code below, nor do I care to - my own tracking library is heavily Win32 based and uses a private heap to boot.)
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PAGE_SIZE (4096) /* Intel */
/* #define PAGE_SIZE (8192) */ /* Alpha? */
typedef struct _malloc_block_tag {
int line;
char* file;
void* pblockpointer;
int size;
} malloc_block_tag;
typedef struct _malloc_block_list {
struct _malloc_block_list* next;
int opencount, nextopen;
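/* Number of tags that fit in roughly one page: divide the leftover bytes by the tag size. */
malloc_block_tag tags[(PAGE_SIZE - (sizeof(int) * 2 + sizeof(struct _malloc_block_list*))) / sizeof(malloc_block_tag)];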
} malloc_block_list;
malloc_block_list *malloc_tag_list_root, *malloc_tag_list_last;
lock_t malloc_tag_list_lock;
#define NUMBER_OF(x) (sizeof(x)/sizeof(*x))
void
tagging_malloc_add_block()
{
malloc_block_list *pblock = NULL;
pblock = malloc(sizeof(malloc_block_list));
memset( pblock, 0, sizeof(malloc_block_list));
pblock->opencount = NUMBER_OF(pblock->tags);
enter_lock(&malloc_tag_list_lock); // assume reentrancy!
if ( malloc_tag_list_root == NULL )
malloc_tag_list_root = malloc_tag_list_last = pblock;
else { // Head insertion, speed up future lookups
pblock->next = malloc_tag_list_root;
malloc_tag_list_root = pblock;
}
leave_lock(&malloc_tag_list_lock);
}
void
cleanup_tagging_malloc() {
enter_lock(&malloc_tag_list_lock);
while ( malloc_tag_list_root ) {
malloc_block_list *phere = malloc_tag_list_root;
malloc_tag_list_root = phere->next;
free( phere );
}
leave_lock(&malloc_tag_list_lock);
}
void
find_leaked_tags()
{
malloc_block_list *phere = NULL;
size_t idx;
enter_lock(&malloc_tag_list_lock);
for ( phere = malloc_tag_list_root; phere; phere = phere->next ) {
if (phere->opencount == NUMBER_OF(phere->tags)) continue;
for ( idx = 0; idx < NUMBER_OF(phere->tags); idx++ ) {
if ( phere->tags[idx].pblockpointer != NULL ) {
fprintf(stderr, "Leaked %d bytes from %s:%d : @%p\n",
phere->tags[idx].size,
phere->tags[idx].file,
phere->tags[idx].line,
phere->tags[idx].pblockpointer);
}
}
}
leave_lock(&malloc_tag_list_lock);
}
void
tagging_free( void* pv )
{
malloc_block_list *phere;
int idx;
int found = 0;
assert(pv != NULL);
enter_lock(&malloc_tag_list_lock);
for ( phere = malloc_tag_list_root; phere && !found; phere = phere->next ) {
for ( idx = 0; idx < (int)NUMBER_OF(phere->tags); idx++ )
if ( phere->tags[idx].pblockpointer == pv ) {
memset(phere->tags + idx, 0, sizeof(malloc_block_tag));
phere->opencount++;
phere->nextopen = idx; // reuse this slot first
found = 1;
break;
}
}
leave_lock(&malloc_tag_list_lock);
if ( !found ) // only complain when no tag matched
fprintf( stderr, "Error: Block @%p wasn't tagged.\n", pv );
free( pv ); // release the block itself
}
void*
tagging_malloc( size_t cb, int line, char* file ) {
void* pv = malloc(cb);
malloc_block_list* here;
malloc_block_tag* tag;
if ( !pv )
return NULL;
enter_lock(&malloc_tag_list_lock);
if ( malloc_tag_list_root == NULL )
tagging_malloc_add_block();
here = malloc_tag_list_root;
while ( here ) {
if ( here->opencount )
break;
else
here = here->next;
}
if ( here == NULL ) {
tagging_malloc_add_block();
here = malloc_tag_list_root; // new pages are inserted at the head
}
here->opencount--;
tag = here->tags + here->nextopen++;
tag->line = line;
tag->file = file;
tag->size = cb;
tag->pblockpointer = pv;
// Fallen off the open slots, or candidate next open isn't?
// Test the bound first so tags[] is never indexed past its end.
if ( here->nextopen >= (int)NUMBER_OF(here->tags) ||
( here->tags[here->nextopen].pblockpointer != NULL ) )
{ // Cycle through, looking for an open slot
for (
here->nextopen = 0;
( ( here->nextopen < (int)NUMBER_OF(here->tags) ) &&
( here->tags[here->nextopen].pblockpointer != NULL ) );
here->nextopen++ )
;
}
leave_lock(&malloc_tag_list_lock);
return pv;
}
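For comparison, the first idea in the comment above (a tag stored in front of each allocation, with free stepping the pointer back to it) can be sketched roughly as follows. Since there is no heapwalk to rely on, this sketch also threads the tags onto a doubly linked list so the leftovers can be printed at exit. The names tag_header, tag_malloc, tag_free and tag_report are invented for the example, it assumes a C11 compiler for max_align_t, and it takes no locks, so it is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Header stored in front of every user block (hypothetical layout). */
typedef union tag_header {
    struct {
        union tag_header *prev, *next; /* list of live blocks */
        const char *file;
        int line;
        size_t size;
    } info;
    max_align_t align; /* C11: keeps the user pointer suitably aligned */
} tag_header;

/* Sentinel node; an empty list points at itself. */
static tag_header live = { { &live, &live, NULL, 0, 0 } };

void *tag_malloc(size_t size, const char *file, int line)
{
    tag_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->info.file = file;
    h->info.line = line;
    h->info.size = size;
    h->info.next = live.info.next; /* link in right after the sentinel */
    h->info.prev = &live;
    live.info.next->info.prev = h;
    live.info.next = h;
    return h + 1; /* hand back the memory just past the tag */
}

void tag_free(void *p)
{
    tag_header *h;
    assert(p != NULL);
    h = (tag_header *)p - 1; /* step "backwards" to the tag */
    h->info.prev->info.next = h->info.next;
    h->info.next->info.prev = h->info.prev;
    free(h);
}

/* Print every block still on the list; returns how many there are. */
int tag_report(FILE *out)
{
    int count = 0;
    tag_header *h;
    for (h = live.info.next; h != &live; h = h->info.next) {
        fprintf(out, "Leaked %lu bytes from %s:%d : @%p\n",
                (unsigned long)h->info.size, h->info.file, h->info.line,
                (void *)(h + 1));
        count++;
    }
    return count;
}
```

Compared with the page-of-tags approach, free is constant time here, at the cost of one header per block and of crashing badly if tag_free is handed a pointer that did not come from tag_malloc.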
I switched from C to C++ basically because I couldn't get Purify for Linux. C++ has allowed me to adopt clear, well-defined memory management strategies and automate various pointer checks. I hardly ever get memory leaks or pointer errors in my C++ code anymore.
But no matter what you do in your own code, if you are using C or C++, you will always be exposed to numerous pointer bugs and leaks in library code. Most real-world C++ code commits the same memory allocation sins and has the same pointer bugs as real-world C code--people aren't taking sufficient advantage of C++'s smart pointer facilities (even STL is flawed in that way). Therefore, for multiprogrammer projects, I wouldn't use anything but Java or another safe language anymore.
ElectricFence detects overruns of malloc()d buffers (hence its name). Unless this changed recently I am fairly sure it has nothing to do with leak detection?
Just because a few of us can read write and do a little math, doesn't mean we deserve to conquer the universe
get_mem(ptr, size, "widget hash table")
When debugging, get_mem keeps track of all allocs. At the end, just before the program shuts down the heap dump routine is called which lists all outstanding memory blocks along with the debug string so you can see where they were allocated.
It's also often practical to call the dump routine at various points within the program and give the output a quick look-over or diff - it's amusing how often you can nip these problems in the bud this way.
Also, if you get really desperate, change the get_mem routine to increment a global counter and tag that to the end of each allocation info block. If you keep a program debug log and log each allocation it makes it easy to see where a loose block was allocated - grab the unique ID from the dump and search the log file for it.
A handy feature about this trick is that you use #define to define get_mem, so when you go to production you simply define it to malloc and throw the debug string away - no speed or size cost in the running program. In addition, it basically costs nothing except an hour or so to set it up in the first place. The catch is you have to use it religiously from the start of your project.
A really simple trick, but it has saved me so much work!
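A rough sketch of how this could look, assuming debug builds are the ones compiled without NDEBUG. Only get_mem and the idea of a heap dump come from the description above; dbg_alloc, dbg_free, free_mem, heap_dump and the record layout are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifndef NDEBUG /* assumption: debug builds are those without NDEBUG */

typedef struct mem_rec {
    struct mem_rec *next;
    void *block;
    size_t size;
    const char *what; /* the debug string passed to get_mem */
    unsigned long id; /* unique ID taken from a global counter */
} mem_rec;

static mem_rec *mem_list;
static unsigned long mem_counter;

void *dbg_alloc(size_t size, const char *what)
{
    void *block = malloc(size);
    mem_rec *r = malloc(sizeof *r);
    if (block == NULL || r == NULL) {
        free(block);
        free(r);
        return NULL;
    }
    r->block = block;
    r->size = size;
    r->what = what;
    r->id = ++mem_counter;
    r->next = mem_list;
    mem_list = r;
    return block;
}

void dbg_free(void *block)
{
    mem_rec **pp;
    /* Unlink the matching record, if any, then free the block itself. */
    for (pp = &mem_list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->block == block) {
            mem_rec *dead = *pp;
            *pp = dead->next;
            free(dead);
            break;
        }
    }
    free(block);
}

/* List every outstanding block; returns how many there are. */
int heap_dump(FILE *out)
{
    int n = 0;
    const mem_rec *r;
    assert(out != NULL);
    for (r = mem_list; r != NULL; r = r->next, n++)
        fprintf(out, "#%lu: %lu bytes, \"%s\"\n",
                r->id, (unsigned long)r->size, r->what);
    return n;
}

#define get_mem(ptr, size, what) ((ptr) = dbg_alloc((size), (what)))
#define free_mem(ptr) dbg_free(ptr)

#else /* production: the debug string is thrown away */

#define get_mem(ptr, size, what) ((ptr) = malloc(size))
#define free_mem(ptr) free(ptr)
#define heap_dump(out) 0

#endif
```

Calling heap_dump(stderr) just before exit, or at checkpoints in between, gives the list of outstanding blocks with their debug strings and IDs.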
I looked at this recently. It doesn't work with the latest gcc, and it appears to have been abandoned by the author. It looks promising.
In college, I rolled my own wrapper for malloc(), free(), and array/pointer dereferences. A couple hours of coding that wrapper caught most of my memory leaks and seg faults. If I could do it when I was half-drunk and didn't know what I was doing, you've probably got a developer on staff who can handle it.
A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget.
Maybe you should put a developer or two on that project and see how long it takes them to build something similar. I think Purify runs about $1,500 now (could be wrong). That's what, two Aeron chairs? That shouldn't kill any real company's budget. NuMega's BoundsChecker is a viable cheaper alternative though. Or just rip off the free trial versions.
When I've seen Purify bought, a developer downloaded the trial and built a list of all the problems he found and fixed using it. When he showed his manager how much pain and suffering the product could save it was an easy sell. (The hardest part was countering the "so everything's fixed already?" mentality.)
It's by one of the RHAD labs kids. It's basically just a GUI around Boehm's garbage collector in leak-detector mode.
It's not Purify (it really aims for leak detection, not all the other errors Purify finds), but the efence + memprof combination gets you about 85% of Purify's functionality.
It seems to handle threaded apps reasonably well, and C++ doesn't faze it. The only downside is that it's hard to get running on non-x86 platforms.
find a collection of different memory usage problems, and is reasonably easy to use even on large projects
But to answer the question, are there any out there? NO, not with pretty GUIs and all.
Only 'flamers' flame!
Try Checker. I think AX.25 pointed to some relevant information, but was modded as redundant for some odd reason.
I'm a mouse.
mpatrol is another tool to help with this.
It can:
- log your memory usage
- report on improper memory usage
- profile your memory usage
- work with your applications *without* re-linking (assuming your OS allows this)
The web page is at:
http://www.cbmamiga.demon.co.uk/mpatrol/
In addition, the author has excellent documentation. The pdf manual actually has a section that lists competing products and what they do.
http://www.cbmamiga.demon.co.uk/mpatrol/files/m