Experiences w/ Garbage Collection and C/C++?
dberger queries: "Java has helped garbage collection enter the mainstream programmer's life - but it's certainly not new or unique to java. There have been (and are) several ways to add garbage collection to C/C++ - the most active seeming to be Hans Boehm's free libgc. I'm curious if any of the Slashdot crowd has used this (or any other) C++ garbage collector in non-trivial commercial applications. If so - what were your experiences? If not, why not? (Before you ask, yes - I know that GC isn't the only difference between C++ and Java, but 'automagic memory management' is certainly part of Java's marketing luster)"
Bjarne Stroustrup, the creator of C++, has this to say on garbage collection:
Clearly, if your code has new operations, delete operations, and pointer arithmetic all over the place, you are going to mess up somewhere and get leaks, stray pointers, etc. This is true independently of how conscientious you are with your allocations: eventually the complexity of the code will overcome the time and effort you can afford. It follows that successful techniques rely on hiding allocation and deallocation inside more manageable types.
He goes on to give detailed examples and recommendations on how to avoid using garbage collection.
void* operator new(size_t n) {
return GC_malloc(n);
}
void operator delete(void* p) {
}
You can also mix collected memory with uncollected memory, but we really don't see the point. This way we can still have descructors which do useful things but the actual memory clean up is left to the garbage collector. Of course, as we write more and more new code we leave our deletes and our destructors out, and eventually we'll go through and remove them all. Until then, we can disable the garbage collector just by #if 0ing these lines out.
How we know is more important than what we know.
The Boehm-Weiser (BW) collector is not as portable as we had hoped. There are a number of platforms we wanted to run on where it just doesn't run at all. Relatively small changes to the target runtime can create a need to port it all over again. OpenBSD, in particular, was an ongoing hassle until we abandoned BW. Hans, I hasten to add, was quite encouraging, but he simply doesn't have time to adequately support the collector.
The BW collector doesn't work in our application. OpenCM has a few very large objects. For reasons we don't really understand, this tends to cause a great deal of garbage retention when running the BW collector. Enough so that the OpenCM server crashed a lot when using it. Please note that this was NOT a bug involving falsely retained pointers, as later experience showed.
Conservative collectors are actually too conservative. If you are willing to make very modest changes in your source code as you design the app, there prove to be very natural places in the code for collection, and the resulting collector is quite efficient.
Independent of the collector, we also hacked together an exceptions package. This was also the right thing to do, but it's easy to trip over it in certain ways. The point of mentioning this is that once you do exceptions the pointer tracking becomes damned near hopeless and you essentially have to go to GC.
I think the way to say this is: exceptions + GC reduces your error handling code by a lot. Instead of three lines of error check on every procedure call, the error checking is confined to logical recovery points in the program, and you don't have to mess around simulating multiple return values in order to return a result code in parallel with the actually intended return value.
To provide malloc pluggability, we implemented an explicit free operation. This lets us interoperate compatibly with other libraries and do leak detection. Turns out to be very handy in lots of ways.
Hybrid storage management works very well. For example, our diff routine explicitly frees some of its local storage (example) [Sorry -- this link will go stale within the next few weeks because the OpenCM web interface will change in a way that makes it obsolete. If the link doesn't work for you, try looking for the same file in .../DEV/opencm/...] This is actually quite wonderful, as it lets us build certain libraries to be GC compatible without being GC dependent. One of the challenges in using a GC'd runtime in a library is compatibility with an enclosing application that doesn't use GC. We haven't tried it yet, but it looks like our gcmalloc code will handle this.
Eventually, we gave up on the BW collector and wrote our own. Our collector is conceptually very similar to the collector that Keith Packard built for Nickle, though we've since built from there. A variant of the Nickle collector is also used as a debugging leak tracer for X11.
The OpenCM GC system is reasonably well standalone. We need to document it, but others might want to look at it when we cut our next release.
On the whole, I'ld say that GC for this app was definitely the right thing to do. Once you get into object caches it becomes very hard to locate all of the objects and decide when to free them. We were able to use a conservative approach with no real hassle, and heap size is fairly well bounded by the assisted GC approach we took.
On the other hand, I would not recommend a pure conservative collector for a pro
Jonathan S. Shapiro (The EROS Guy)
us janitors prefer the term "sanitation engineer".
Do you even lift?
These aren't the 'roids you're looking for.
Garbage collection has costs:
- The obvious: CPU & memory overhead for the checking and tracking. I can't comment on the amount here, but it is a generalized solution, so you forego the optimization opportunities that you'd otherwise have.
- The subtle: Memory allocation can become a major bottleneck in multithreaded systems. Garbage collection has similar issues.
- The irritating: you don't know when your destructors are called.
Another way: Smart Pointers. They're simple wrappers around the types that act like pointers, but they can make sure your objects live as long as you need and no longer. The big trick is knowing which kind of smart pointer you want.
- Reference Counting Smart Pointer (RCSP for short): this type of smart pointer will keep of how many RCSPs are pointing to the same object. It'll delete the object when the last RCSP is destroyed. A good one is the boost shared_ptr. Available for free from www.boost.org. This type is great for general use.
- Owning Smart Pointer (OSP): this type is specialized for those cases when the refcnt is never more than 1. When you assign one OSP (a) to another (b), the new OSP (a) gets ownership of the referred object, and the old one (b) is automatically set to null. When an OSP that isn't set to null is destroyed, it deletes the object it owns. It's great for parameter passing, return values, and objects you want dead at the end of the current scope, even if there's an exception. The STL comes with auto_ptr, which works this way.
You can use an RCSP wherever you can use an OSP, but not the other way around. The STL containers are a great example.
Sure it's not as easy as 'allocate and forget,' but you won't have the (sometimes very costly) expense of full-blown garbage collection.
Also, you can optimize your smart pointers for individual types (through template specialization). A great example is to give the no-longer-needed object back to a pool for later reuse.
This is really a quick, quick overview. For the meat & potatoes, go read Effective STL by Scott Meyers.
I've tried really hard to be fair & polite. There's probably still a bias, but I'm really trying!!
Care about electronic freedom? Consider donating to the EFF!
So please tell me what
typedef vector::const_iterator Iter;
(or rather vector::const_iterator) is supposed to mean. I suppose vector is a templated class, but how does ::const_iterator come up with a type name -- I thought :: either references a static field or a class member function?
And what is the deal with the sort(,) as a free-standing function? Following OO principles, shouldn't the vector object v know how to sort itself with a call to v.sort()? And what the heck is this const_iterator type anyway that you can do ++ and * on it -- looks an awful lot like a pointer -- oops, I forgot, you can overload ++ and * to make "safe" operations on what are really objects look like "dangerous" pointer operations which the C/C++ community is in custom of using.
In principle, all the stuff done in Java and perhaps in scripting languages could all be done elegantly and expressively in C++ if us mere mortals ever figure out how to use the darned thing. But there is a kind of uniformity to Java (all object variables being GC'd heap references, collection and iterator types working with generic Object's that we cast to what the object is and rely on runtime type checking, don't worry-be happy allocation of these objects where we grab towels from the rack and leave them on the bathroom floor for the hotel maid to pick up) that simply feels more comfortable.
C++ is the music of Bach: elegant, mathematical, intricate, and expressive, but most musicians performing in front of audiences don't understand it and it is played as a dull jumble and mishmash and audiences gaze at Bach stuck at the beginning of a recital as a chore to get through. Java is the music of Mozart: simplified, standardized, predictable, and economical, but musicians of this era understand it and play it with gusto, and audiences love it because it sound so happy and makes them feel uplifted.
IMO, if you can't organize you code such that
/. bemoans the
you prevent leaks, you probably can't organize
your code to accomplish anything simpler than
bubblesorting a text file.
It is attention to detail and the creation of
highly patterned structures of code that allows
software-in-the-large to be successfully
constructed. Not that I've ever tried
designing a production system that utilized
GC, but I can't imagine that forgetting about
when/where you are creating new memory structures
will improve your total comprehension of the
information flow within the software.
While every 30th article on
outsourcing of programming jobs to India, ask
yourselves this: will they do what it takes
to make sure there are no memory leaks? I'm
fairly certain that they will, and, moreover,
they just consider it a part of creating
quality software.
Here's what software does:
1. reads in data from storage to memory
2. allows the user to:
a. change the memory
b. add new structures to memory
c. delete structures from memory
3. writes the memory back to storage
Now, in implementing functions that accomplish
both 2a and 2b, there will be memory structures
created that are used in a transient/temporary
fashion to accomplish the creation of data
structures that get attached to the tree of
permanent data structures.
Anything not attached to the permanent set of
data structures gets freed! Whoopee, no memory
leaks! And, guess what, when the user deletes
something, all that shtuff gets freed.
I'd charge you a fee for tutoring you in basic
computer science, but I don't know your billing
information.
Peace & Blessings,
bmac
For true peace & happiness: www.mihr.com
A program variable is either a global variable, a stack variable, a class variable or an instance variable. Global and stack variables are held in lists. Class and instance variables are kept inside objects.
Every class object has a global variable that always refers to it.
Any object that is not, and that can not become referenced (directly or indirectly) by a global or stack program variable is garbage.
Each object has a 'not-garbage' flag.
For each global and stack variable, if the referenced object is not marked not-garbage, mark the referenced object as not-garbage, and recurse for that objects contained variables.
Delete all objects that are not marked not-garbage.
There are a few more twists, like handling return values on the stack, but this algorithm correctly handles self-referencing objects no matter the complexity.
Does everything include nothing?
There is another indirect cost pointed out by Linus Torvalds in a lengthy post to the gcc mailining list. The executive summary is that (he thinks that) memory that is not to be used anymore should be freed immediately. Otherwise, the data in there will keep lying around in the data cache. Also, he claims that explicit ref-counting gives you advantages for optimization: Assume you have to make some modifications to a data structure, but you don't want other parts of the program to see the modifications. Without ref-counting, you have to copy all the data structure before modifying it. With ref-couting, you can omit the copying if you are the only one with access to the data structure.
And finally, he thinks that GC makes it too easy to write pointer-chasing-heavy code---as that kind of code is bad for cache behaviour all the time.
It is an ongoing discussion whether GC really has that bad effects on performance of GCC. But Linus Torvalds seems to have very good points. (And some of them certainly cannot be taken into account in a "GC cost is less than hand-written memory management"-paper.)
I've used a garbage collection system in a C project before and it works surprisingly well. The problem with GC in C though is that it is possible and legal to,
o allocate memory
o write the pointer to a disk
o lose the pointer in memory
o read the pointer back off the disk,
o make use of the pointer
With all GC strategies I'm aware of, by the time you read the pointer from the disk the memory may well have been freed.
I'm not saying that this style of programming is a generally good idea but it is used in certain, specialised situations and therefore not suitable for a garbage collecting language.
The salient points:
Destructors are not Called
If an object is allocated in collectible memory, then its destructor will not be called when the object is collected. Therefore, destructors are pretty much useless and your code must be designed to work without them.
Actually, if your object derives from class gc_cleanup, then its destructor will be called. However, due to the handling of cleanup functions in the BDW collector, cycles of such objects will never be collected. For this reason, I don't use gc_cleanup much.
Allocating Collectible Memory
By default, C++ allocates objects in the "malloc" heap. The BDW collector maintains a separate heap. In effect, there are four types of memory:
"Scannable" refers to the property that objects in the heap are scanned for pointers. "Collectible" refers to the property that objects in the heap will be deallocated if no further references are found.
These four memory types are an issue when you interact with STL and third-party class libraries. By default, STL uses the malloc heap. If you want, say, a std::vector in collectible memory, then you need to write an allocator to get it. The most recent versions of the collector come with such a beast; the version I started using did not.
Similarly, std::string is reference-counted, and in the malloc heap. Here, rather than using an allocator to force it into the collectible heap, I wrote my own lightweight GCString class, which stores the string as an immutable object, and relies on the collector for cleanup.
Third-party class libraries such as ANTLR may use reference-counted objects; you need to bridge between GC and non-GC applications carefully.
It turns out, however, that there are natural places to do GC, and a little help from the application can go a very long ways. In the OpenCM collector, we mark procedures that return pointers using a special GC_RETURN macro. This works because at the return from a procedure all of its local variables are known to be unreachable. The only surviving objects are the ones that are reachable from the returned pointer (idea for this is due to Keith Packard).
By using this discipline, we actually blur the distinction between managed and unmanaged collection. The results look very good from a performance perspective.
However, I should acknowledge that this is partly due to the structure of our application. Servers are "in and out". They generate a lot of garbage during a given query and then release essentially all of it. Procedure return is therefore a natural collection point. The same experience might not apply in systems that hold large amounts of memory in the heap during long-running computations, with lots of temporary allocation.
Jonathan S. Shapiro (The EROS Guy)
I pointed out that Stroustrup's example shows the expressive power of C++, but there is a big "huh?" factor of reading the code on account that many of us mere mortals are not rehearsed in the use of templates and STL, and there was something to be said for Java and GC, not just for safety but for simplicity of expression and code reading by maintenance programmers.
Yours is one of three comments to my remarks, answering questions I had raised, disagreeing with some points, agreeing with others, but otherwise engaging in a reasoned discussion of the merits of C++ and its advanced features (templates and STL). But I get moderated down and flagged "troll" -- oh well.
I looked at Stroustrup's two examples. It looks like his first example does not involve freeing any memory at all. Am I right? His second example seems to use auto_ptr to assure that an object is freed when the function where it's allocated returns. Is that all it's doing? I would expect the situations where people get memory leaks to be more complex than auto_ptr could handle.
Anyway, he never mentions garbage collection; just easier "explicit" management. (I put "explicit" in quotes, because malloc/free has to manage free blocks, so it's not as manual a process as you might think. For some applications, Boehm's collector is actually faster.)
How is incrementing/decrementing a reference count "*far* more compute-intensive" than scanning memory?
A few years ago, I used the Boehme GC when writing a pair of compilers (Verilog/VHDL) in C++. I was very happy with the result, since it was rare for GC even to get called at all. It was also surprising how much simpler code gets when you don't have to worry about deleting objects.
I once wrote some classes to work with Memory Mapped Files (under Windows) in an almost transparent manner. It works great for making complex C++ object hierarchies persistant.
ILOG Solver & Scheduler are mainstream commercial thrid party libraries in C++ based on the constraint programming paradigm. One of the major features is ILOG's automatic garbage collection heap, which is automatically deallocates memory (based on assumptions on program flow). To make this efficient, they skip all deallocations (using a longjump, rather than a return).
At first this may look like an elegant way to get rid of complicated memory management & garbage collection without loosing efficiency. However, in my personal experience, it is completely horrible when combined with pats that use the normal system heap. Specifically, when writing your own constraints, goals or deamons, it is practically impossible to use anything but ILOG's solver heap.
I gues this is one of the mayor reasons why they recently made a technology change and launched JSolver, a Java based counterpart.
--
Ninety-Ninety Rule of Project Schedules: The first ninety percent of the task takes ninety percent of the time, and the last ten percent takes the other ninety percent.
if you need garbage collector your code must
be filled with garbage. i, on the other hand,
need only malloc() and free() and still my
code manages to work well without memory leaks
(thanks to valgrind).
The Qt toolkit (on which KDE is based) has a nice garbage collection facility. All of the widgets derived from the base class QWidget take care of deleting child widgets that are also derived from QWidget, including user defined types. This means you can add, remove or move widgets in your user interface without having to worry about the corresponding delete.
Tom.
There's a really good book about everything you ever needed to know about garbage collection. Although most of the book deals with garbage collection techniques in general, it has two complete chapters devoted to implementing and using garbage collectors in C and C++ and which ones you should use depending on your application needs.
...whenever you find yourself writing an overly-complicated means to overcome issues of object/memory 'ownership'.
(Granted, one could say that this would apply to the GC itself, but not necessarily so)
The trick is, memory is a 'resource' and as such is subject to acquisition and release steps in order to maintain it properly. If the notion of ownership of memory is ambiguous, you need to normalize your data somehow so you get back to a 1:n relationship between owners and acquired resources. This happens frequently in situations where objects have an n:n relationship, usually a network of one homogenous type (a network).
The easiest way to achive such a normalization, short of drastically changing your system design or coding practice, is to plug a GC to do object/memory management for you.
As a side note:
Reference counting can help with this style of problem too, but it utterly fails to cope with cycles (mutually pointing objects). See: COM. It can be used *very* effectively if you code with this shortcoming in mind. Some GC's even use ref-counting internally to clear out the bulk of objects, leaving just abandoned cycles to be garbage-collected.
Many people are not aware that GCC itself uses garbage collection as it runs. You can actually select which algorithm gets used at configure time, and tweak the GC parameters during runtime (via a growing set of command-line options that users never think to use).
That aside: I've corresponded with Linus a couple times (on other subjects), and while he is the brilliant guy that /. thinks he is, he is a kernel expert, not a compiler expert. Entirely different problem domain, very differnt approaches to solutions. Not every compiler is alike, not every GC strategy is alike, and most of the GC strategies out there are not appropriate for use within GCC. (Note: within GCC, not by a program compiled with GCC.)
What the GCC maintainers have known for a long time -- due to actual analysis of the compiler, not "this tends to work elsewhere and in other programs" -- is that the current GC strategy is suboptimal. There's even a design for a good replacement. None of the volunteers have had time to write it yet. And on that note, I'll leave you with a quote from Torvalds: "What we need is less people running around and telling everyone else what to do and more people actually writing code."
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
You can get the same effect using Smart Pointers and not give up the control that using a garbage collection system entails. See Boost and Alexandrescu, Andrei. Modern C++ Design. There is also a nice article on CUJ.
Sorry my bullshit sensor overloaded.
If a project is complex enough, you're either going to have to use an existing garbage collector or invent your own. These people who claim that there's no need for GC just haven't had a complex enough project (compilers come to mind).
You can do some things to simplify things: don't share -- make copies instead, use smart pointers (which is really just reference counting GC).
In the end though, you're going to have to start doing pointer scans to get rid of the cycles, and at that point you might as well import an already debugged GC rather than roll your own.
is not great to use in a hybrid C and C++ application. Why? C++'s new operator does not generally take a memory pool as a parameter and the lifetimes of the objects allocated by Apache (on a connection, let's say) do not match the lifetime of the C++ heap allocated objects you may be using. Something could be rigged to have Apache fire callbacks to dispose of C++ resources when a memory pool dies - but this is way too complicated.
If someone was able to get the Apache Portable Runtime working with C++ in a sensible, I tip my hat to you.
Instead, we developed our own automated object management facility based on reference objects, that is, "smart pointer" objects with these new capabilities: Reference objects in C++ are completely synonymous with object references in Java. Unlike traditional smart pointers, reference objects support inheritance, both single-implementation inheritance and multiple-interface inheritance. These inheritance characteristics also apply to reference objects of arrays. They can also be built to be rigorously threadsafe and secure.
Reference objects have allowed us to port Java source code directly to C++ with virtually no changes. They have also allowed us to take advantage of productive Java language features in new C++ projects without the overhead or installation of any JRE or VM. delete has disappeared from our C++ code entirely. At the same time, our C++ is still characteristically lean and interoperates naturally with other C/C++ libraries and APIs, such as ANSI C++ STL, Win32 API, and MFC.
We have used these concepts to implement the core Java API in C++, for C++, although reference objects themselves are not specific to this API. They are implemented in our low-level Pie Library and usable in any C++ application.
Info and features of NewJ and Pie Library
The NewJ C++ Developer's Guide explains reference objects more fully. It is available as a free download (registration required).
throughout your post I was think, "yes but...", "well put, but..." and then I reached this, which is what I agree with and the only problem I have with GCs.
>The reason is, of course, that you build the pooling based on your knowledge of the actual usage characteristics of the objects; knowledge that no general-purpose memory manager can possibly have.
I find this in general... Garbage Collectors, not unlike VMs, do well when they know their problem domain well. So, it's common to use a Garbage Collector to manage a special heap, say of fixed size objects that are often allocated/deallocated.
The secret to memory management is "don't let things get out of hand", understand how you are using memory. It's not like having to type in a laborious syntax, like say, trying to do OOP in Fortran, which just doesn't have a friendly syntax for that... iow, it's really worthwhile to think about memory management issues, who owns memory, etc, and not just for the free() call, but for the design itself. It pays back a lot to think of these things, and use GC for particular cases that can benefit.
For this reason I don't like GC based languages (or at least, I don't like that aspect of them), while I have nothing against GC in general.
-pyrrho
... well thanks, you made my head explode by saying that!....
Luckilly I don't need my head to type. I can no longer read the articles on slashdot... but that doesn't matter, I can still post.
-pyrrho
OK, now I understand. Basically he's saying that if you follow a certain discipline, allocated memory will be freed when the function returns. But the same could be said of explicitly freeing at the end of the function, and having goto's at places where you would otherwise return early. The second example bumps it up exactly one level. However, for objects that get passed around between more than two functions, you still need to keep track of what must be freed.
Regarding Scheme performance, check out how bigloo and Stalin are doing in the Bagley language shootout.
I've dealt with garbage collection in Java and now in Python. When you get into the mindset of a language that does this natively, I have found that your code naturally flows into that paradigm. I can't imagine trying to use garbage collection in C/C++ -- it just doesn't fit into the scheme of things for me. True, the STL has auto_ptr, and I have used that in the past -- works rather nicely, IMHO -- however the way I learned how to write clean, efficient C code was to make sure you write the code to deallocate memory when you write the code to allocate it, whenever possible. This works for me, mainly 'cause I've done it for so long. Granted, there are times when doing so is difficult -- there are times when you allocate in one function, but don't deallocate until a much later time in another function. I've written a couple of 100K+-line apps that do that (and have to juggle CORBA calls as well) and it is -very- difficult to debug. That's when you wish you had garbage collection. At any rate ... I have found that a number of programmers prefer GC simply because they don't want the hassle of worrying about cleaning up after themselves. For really good programmers ... yeah, it's nice and it allows them to focus more on the task at hand. However, for not-so-good or poor programmers (and there are a number out there) it allows them to write poor code that could allocate memory without considering the consequences -- before I get flamed for this comment, yes I have seen such code, and from some people who were supposedly "really good programmers". When push comes to shove, nothing beats writing good, clean code in the first place, having already designed into the code where memory is allocated and where it is deallocated. Once you've done that, grab the profiler and look for ways to optimize. Chances are, the code will be cleaner and more elegant than if you were to use a GC-type solution. If not, maybe C/C++ isn't the right language for the task. Time to take a look at Python. :)
I don't think you are thinking about what actually happens at CPU level.
If a mutex is active and a thread blocks it will get put into the wait queue and not use CPU. The only ready processes will be new IO, then you are back to the GC code again. The only overhead is the actual scanning on a single CPU system.
You are completely wrong. If a mutex is active and a thread blocks while it waits for a memory allocation it may use a less (but not zero) CPU, but more importantly - its progress has halted. So basically, that database transaction or whatever you have scheduled on this memory blocked thread will wait. Imagine a dozen such threads all blocked waiting for new memory while you have garbage being collected - you have very poor utilization of CPU resources - especially if you have expensive SMP hardware with most of the CPUs sitting IDLE when they could otherwise be doing useful work. We're not talking a few percentage slower here - we're talking several times slower on multi CPU hardware in heavy load.
Do yourself a favor and read about Hoard. Then you'll get a clue about how modern memory management works.
I work in a system that has (mostly soft) real time deadlines. The article states that their
mark-and-sweep algorithm can handle 10MB/s on a 177Mhz machine. It is not unusual for a system to have 500M+ of live data. That's 50 seconds during which the system can do nothing "useful" from an application standpoint, which is 43 seconds beyond our most common deadline. Throwing a faster processor at it doesn't really help all that much.
I don't think I'm misunderstanding. I understand you have a mechanism that destroys the object a variable is bound to when that variable goes out of scope. That's only useful if you aren't putting the object in a data structure or otherwise planning on using the object outside of the block where that variable is in scope. Basically, it's only useful in trivial cases. I'm not saying it's worthless; I can see it has some advantage over explicitly freeing all the objects. It's just not anything near as powerful as garbage collection. To be fair, Bjarne Stroustrup doesn't present it that way; only aster_ken did.