Pros and Cons of Garbage Collection?

Posted by Cliff on Monday November 28, 2005 @02:55PM from the how-would-you-prefer-to-allocate-YOUR-memory dept.

ers asks: "Most new programming languages are using garbage collection, rather than programmer-controlled memory management. The advantages are obvious: programmers no longer have to worry about forgetting to delete allocated memory, leading to far fewer memory leaks. The disadvantages are often glossed over by programming language designers - aside from the performance issues, predictable memory management can be used for controlling access to files and similar resources, creating safer thread locking code and even providing better error messages. Some programming languages, which usually use predictable memory management, can also be made to behave like they are garbage collected - for example, Boost provides various C++ smart pointer classes. So, given the choice between garbage collection and manual memory management, which would you choose and why? When using a manual memory management language, when do you consider the performance and syntactic overhead of faked garbage collection to be worthwhile?"

Mainly GC but sometimes... by 2starr · 2005-11-28 15:09 · Score: 2, Interesting

In general I prefer having a GC because most of the time I don't want to have to worry about memory management... there's no need. However, sometimes it would really be nice to have more direct control. Not being a VM expert myself, it seems like it should be possible (though I can imagine the types of problems that would arise) to allow specifying that you're assuming manual memory control either over certain objects or while inside a particular context.

--
"Let your heart soar as high as it will. Refuse to be average." - A. W. Tozer
Garbage Collection vs. Manual Memory Management by Dual_View · 2005-11-28 15:26 · Score: 2, Interesting
Manual memory management is similar to assembly language in a certain way: everybody should know how to use it, but they should strongly avoid actually having to use it in most cases. Even though I like to write code in C, I still understand the value of garbage collection. This goes back to the old adage: "Programmer time is more valuable than processor time." On the other hand, there are still a few instances where the manual method is the best tool for the job.

But then, the question is rather ambiguous. Is the writer asking:
- "How much does garbage collection affect what language you choose to code in?"
- "If you were designing a programming language, would you implement garbage collection? If so, what kind? If not, what memory allocation strategies would you use?"
- "In what situations would you find garbage collection to be the most useful? When is manual memory management better?"
I'll leave further interpretation of the writer's words to other posters, as well as more thorough responses.
It all depends by unr_stuart · 2005-11-28 15:27 · Score: 3, Interesting

I've always had the philosophy, "use what makes the job easiest." Typically, this involves garbage-collection. However, one of the biggest problems I have with garbage collection is that you can't have your cake and eat it too. Meaning, you can get all the memory you want, but you can only access it at a high level (think Objects in Java). In C/C++ however, you can call malloc/new, create a big pool of memory (or just a single object), and then do whatever the heck you want with it. But again, as the subject says, it all depends on which method helps get the job done, and so far neither has been perfect for everything.
C++ and others.... by try_anything · 2005-11-28 15:48 · Score: 5, Interesting

C++'s constructor/destructor paradigm with predictable object destruction has the benefit of enabling the RAII (Resource Acquisition Is Initialization) idiom. RAII and exceptions greatly simplify resource management in the presence of error handling. Still, even as someone who knows C++ better than I know any other language, I have to admit that for many applications a garbage collected language puts the least mental burden on programmers and produces the fewest memory errors. The burden of arranging all the extra try/catch blocks in Java (because it lacks RAII) has to be weighed against the burden of investigating and fixing memory management errors in C++, and for people using new/delete, Java wins, IMHO.
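A minimal sketch of the RAII idiom described above, assuming a hypothetical `ScopedResource` class and helper functions (none of these names come from the post). It only records "acquire" and "release" events in a log, to show that the destructor runs during stack unwinding without any try/finally-style cleanup at the call site.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resource wrapper: "acquires" in the constructor and
// "releases" in the destructor, recording both events in a log.
class ScopedResource {
public:
    ScopedResource(std::vector<std::string>& log, const std::string& name)
        : log_(log), name_(name) {
        log_.push_back("acquire " + name_);
    }
    ~ScopedResource() { log_.push_back("release " + name_); }

    // Non-copyable, so the resource has exactly one owner.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    std::vector<std::string>& log_;
    std::string name_;
};

// May throw part-way through; the destructor of `r` still runs.
void useResource(std::vector<std::string>& log, bool fail) {
    ScopedResource r(log, "file");
    if (fail) throw std::runtime_error("simulated failure");
    log.push_back("work done");
}

// Runs useResource and returns the recorded sequence of events.
std::vector<std::string> runDemo(bool fail) {
    std::vector<std::string> log;
    try {
        useResource(log, fail);
    } catch (const std::runtime_error&) {
        log.push_back("caught");
    }
    return log;
}
```

Calling `runDemo(true)` should record the release before the handler runs, which is the property that makes RAII and exceptions work well together.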

C++ programmers should be making very little use of new and delete, though; they should be using smart pointers. I think the article poster misunderstands smart pointers. boost::shared_ptr is a reference counted pointer, but std::auto_ptr and boost::scoped_ptr have nothing to do with garbage collection - they certainly aren't "faked garbage collection" and they certainly aren't unpredictable. They use C++'s object scoping and copying mechanisms to manage memory in a way completely unlike garbage collection. scoped_ptr is the simplest and most predictable memory management tool of all. Taking programmer error into account, it's more predictable than using delete. Even shared_ptr is predictable; when the reference count falls to zero, the object is immediately destroyed, not just marked for destruction.
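A rough sketch of the predictability point, with two assumptions: it uses `std::shared_ptr` and `std::unique_ptr` as stand-ins for `boost::shared_ptr` and `boost::scoped_ptr` (the standard versions arrived in C++11, after this post), and `Tracked`, `Observation`, and both functions are hypothetical names made up for illustration.

```cpp
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Live-instance counts sampled with two owners, one owner, and no owners.
struct Observation {
    int twoOwners;
    int oneOwner;
    int noOwners;
};

Observation observeSharedLifetime() {
    Observation o{};
    std::shared_ptr<Tracked> a = std::make_shared<Tracked>();
    {
        std::shared_ptr<Tracked> b = a;  // reference count is now 2
        o.twoOwners = Tracked::alive;
    }                                    // b is gone, count drops to 1
    o.oneOwner = Tracked::alive;
    a.reset();                           // count reaches 0: destructor runs here
    o.noOwners = Tracked::alive;
    return o;
}

// Scoped ownership: the object is destroyed at the closing brace,
// with no reference counting involved.
int observeScopedLifetime() {
    {
        std::unique_ptr<Tracked> p(new Tracked);
    }
    return Tracked::alive;
}
```

The object stays alive while any owner exists and is destroyed at the exact statement where the last owner lets go, rather than at some later collection point.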

Sadly, although C++ is a very powerful language and can be used to write code with few errors, the language as used by beginners is as dangerous as C, perhaps even more dangerous. It takes programmers years to become proficient in all the methods and idioms that make C++ a usable language.

(I would love to see a language that allows programmers to choose scoped allocation, smart pointer heap allocation, or garbage-collected heap allocation, and uses types to avoid dangerous combinations such as garbage-collected objects pointing to scoped objects or an object pointing to an object in an unrelated scope. Every object would have two types - the object type (int, file, circle, etc.) and the memory management type (scoped with scope S1, scoped with scope S2, garbage-collected, etc.))
1. Re:C++ and others.... by cryptoluddite · 2005-11-28 19:10 · Score: 2, Interesting
  
  Why do C++ people use the acronym RAII "resource acquisition is initialization" to talk about when the object is uninitialized? The acronym is just completely wrong, because languages like Java are far more "RAII" than C++ (in C++ you can actually allocate resources without initializing them). It really should be something more like RAIS, "resource acquisition is scope", or LSILS "lexical scope is logical scope", or ODOL "object destroyed on leaving", or RROL "resource released on leaving", or something that actually makes sense. To me it just sounds like more arbitrary complexity and nonsense, so basically like everything else in C++.
Cocoa and Objective-C by mccoma · 2005-11-28 16:21 · Score: 2, Interesting

Apple / NeXT takes a reference counting approach. It is not automatic, but it works well once you understand the rules.
Re:If C++ Memory Management by try_anything · 2005-11-28 17:15 · Score: 2, Interesting

I agree with a comment posted elsewhere that unchecked memory access and manual memory management, while both unsafe and fertile sources of errors, are different beasts.

Manual memory management is a control issue. Unchecked memory access is a matter of asceticism.

Buffer overruns happen because the devotion to performance and minimalism among a certain crowd is religious. Because of this, the C++ standards guys were terrified of encouraging the use of anything slower and safer than what a C programmer would do. In std::vector and boost::array the default (and readable) access operator [] is unchecked, and checked access is relegated to the ugly at() method. This is a political decision that turns on its head the principle that quiet syntax should be used for routine, safe operations and ugly syntax should be used for rare, dangerous operations.

Using template metaprogramming, it's easy to produce your own array classes with bounds-checking chosen at compile time, but because the survival of C++ depended on the "just as fast as C!" slogan, unchecked access was officially encouraged and has unsurprisingly become the norm among C++ developers.
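One way this could look, as a sketch only: a hypothetical `PolicyArray` whose bounds checking is selected by a `bool` template parameter. The class name, the `Array` alias, and the choice of `NDEBUG` as the build switch are all assumptions for illustration, not anything the standard library or Boost provides.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical fixed-size array; `Checked` picks the access policy
// at compile time, so the unchecked variant carries no runtime test.
template <typename T, std::size_t N, bool Checked>
class PolicyArray {
public:
    T& operator[](std::size_t i) {
        if (Checked && i >= N) throw std::out_of_range("PolicyArray index out of range");
        return data_[i];
    }
    const T& operator[](std::size_t i) const {
        if (Checked && i >= N) throw std::out_of_range("PolicyArray index out of range");
        return data_[i];
    }
    static constexpr std::size_t size() { return N; }

private:
    T data_[N] = {};  // value-initialized elements
};

// Assumed build switch: checked in debug builds, unchecked with NDEBUG.
#ifdef NDEBUG
constexpr bool kCheckBounds = false;
#else
constexpr bool kCheckBounds = true;
#endif

template <typename T, std::size_t N>
using Array = PolicyArray<T, N, kCheckBounds>;

// Demonstrates the checked policy: index 4 is one past the end.
bool outOfRangeIsCaught() {
    PolicyArray<int, 4, true> a;
    try {
        a[4] = 1;
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Demonstrates the unchecked policy with an in-range index.
int readBackUnchecked() {
    PolicyArray<int, 3, false> b;
    b[1] = 7;
    return b[1];
}
```

With this arrangement the quiet `[]` syntax is the checked one by default, and turning checks off is a single build-level decision rather than a per-call-site choice.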

Rationally, even a developer who expects that bounds checking will always be too expensive in production should insist that bounds checking be easily enabled at compile time. Electric Fence helps with that, but template metaprogramming is a more straightforward solution.
GC is DRY by PBPanther · 2005-11-28 17:35 · Score: 2, Interesting

Not using GC requires you to write code to free those resources repeatedly. That goes against the DRY (Don't Repeat Yourself) principle.

I wonder how many of the people who use the "C++ model" bother to unit test that they have freed all their resources.
Re:Getting it backwards by Profound · 2005-11-28 19:05 · Score: 2, Interesting

>> you can write "destructor's" in a garbage collected language

You mean like Java's Object.finalize()?

The same one that causes significant performance problems fundamental to how GCs work, and is not guaranteed to execute in any specific order, or even at all?
False dichotomies by Eric+Smith · 2005-11-28 20:03 · Score: 3, Interesting

Some of the cited advantages of not using garbage collection are red herrings. For instance, the "controlling access to files and similar resources" by RAII works fine with garbage collection. In most cases, the compiler can determine by static analysis that a particular object is allocated within a scope and no references are propagated upward out of scope, and can remove the reference so the garbage collector will deallocate it (possibly calling a destructor). Depending on the type of GC and its implementation, the compiler may generate code that forces the object to be deallocated immediately.
For cases where static analysis can't do this automatically, it isn't that hard to use a design methodology that achieves the same result; it's certainly still much easier than doing manual allocation and deallocation and ensuring that the deallocation is done (or not done) correctly in all cases.
And if you are using a reference-counting GC, or a hybrid GC that includes reference-counting, you don't have to do anything special at all.
The same applies to the claimed mutex and error message disadvantages, since those are just specific uses of RAII.
Garbage collection efficiency overstated by butlerm · 2005-11-28 21:08 · Score: 3, Interesting

This is a common claim, but it is an apples to oranges comparison. No one (including the compiler) dynamically allocates objects in C/C++ when they can place them on the stack instead. Garbage collected languages like Java, on the other hand, require practically everything to be managed on the heap.

In addition, an array of objects on the heap requires only a single memory allocation in C or C++, where Java has to allocate and track each separately. As one luminary once said, "C++ is better because there is less garbage to collect."
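A small sketch of the allocation-count difference, assuming a hypothetical `Point` type that counts calls to its own class-level `operator new`. The second function only imitates an array-of-references layout (roughly what a naive Java `Foo[]` looks like); it is not a measurement of any real JVM.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Hypothetical type whose class-level allocation functions count calls.
struct Point {
    int x = 0;
    int y = 0;
    static int allocations;

    static void* operator new(std::size_t size) {
        ++allocations;
        return ::operator new(size);
    }
    static void* operator new[](std::size_t size) {
        ++allocations;
        return ::operator new[](size);
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }
    static void operator delete[](void* p) noexcept { ::operator delete[](p); }
};
int Point::allocations = 0;

// C++ style: n objects stored inline in one heap block.
int allocationsForOneBlock(std::size_t n) {
    Point::allocations = 0;
    std::unique_ptr<Point[]> block(new Point[n]);
    return Point::allocations;
}

// Array-of-references style: every element is its own heap object.
// The vector's buffer of pointers goes through the global operator new,
// so it is not included in this count.
int allocationsForSeparateObjects(std::size_t n) {
    Point::allocations = 0;
    std::vector<std::unique_ptr<Point>> refs;
    for (std::size_t i = 0; i < n; ++i) {
        refs.push_back(std::unique_ptr<Point>(new Point));
    }
    return Point::allocations;
}
```

For n elements the first function makes one `Point` allocation and the second makes n, which is the gap the comment is pointing at.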

That might be acceptable, but the worst part is random application pauses of arbitrary duration for garbage collection. Unless that problem can be resolved, garbage collected languages will always be a poor match for latency sensitive applications, even where the net throughput is otherwise adequate.
1. Re:Garbage collection efficiency overstated by swillden · 2005-11-28 22:51 · Score: 4, Interesting
  
  No one (including the compiler) dynamically allocates objects in C/C++ when they can place them on the stack instead.
  Are you certain of that? Here:
  void foo() {
      // ...
      auto_ptr<Foo> f(new Foo);
      // ...
  }
  
  What would the compiler do? What *could* it do, if it were smarter? And have you really never seen any code that does this? Or written it?
  Lots of C and C++ programs dynamically allocate many objects that could be stack allocated. In particular, many C++ objects that are placed on the stack immediately allocate storage on the heap. Think std::string. Many programmers do make an attempt to allocate as much on the stack as possible, but I think most don't really consider it. And keep in mind when I say this that I've been writing C and C++ (mostly C++) professionally for nearly 15 years -- I've seen more than a little code.
  Garbage collected languages like Java, on the other hand, require practically everything to be managed on the heap.
  Interestingly, Java does *not* require that at all... it's just the most obvious way to implement it. In fact, I read a while back that the next generation of Java compilers will perform escape analysis, looking for objects whose lifetime is associated with a stack frame. Here's a link. When they find such an object, it will be allocated on the stack. If such an object creates other objects, as long as the analysis can prove that their lifetimes are also frame-associated, they will also be allocated on the stack.
  The same analysis will often allow Java objects and their sub-objects to be allocated as a single block. Since the compiler can see that the constructor of class Foo always allocates objects of Bar and Baz, all of fixed size, it can allocate a single block, just like a C++ compiler would be able to for a class like:
  class Foo {
      // ...
      Bar bar;
      Baz baz;
  };
  
  The same sort of analysis should also allow your other point to be addressed: An array of objects can be allocated as a single block. The compiler can recognize code like:
  Foo[] f = new Foo[n];
  for (int i = 0; i < n; ++i)
      f[i] = new Foo();
  
  And allocate a single block that is n*(sizeof(Foo)+sizeof(Bar)+sizeof(Baz)) in size, and if 'f' has a stack-associated lifetime, allocate the whole pile on the stack.
  All of the above is still theoretical, of course, but it's coming quickly.
  That might be acceptable, but the worst part is random application pauses of arbitrary duration for garbage collection. Unless that problem can be resolved, garbage collected languages will always be a poor match for latency sensitive applications, even where the net throughput is otherwise adequate.
  As I pointed out in my previous post, whether or not that problem exists depends on the GC implementation. Incremental GCs keep the pauses small, and there are GCs designed for real-time usage that further guarantee maximum latencies. It's worth pointing out also that normal malloc() and free() implementations don't provide any run-time guarantees. Real-time code that uses a heap uses special versions that do provide guaranteed latencies, at the expense of worse average performance.
  
  --
  Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:Depends by archeopterix · 2005-11-28 22:01 · Score: 2, Interesting

Actually, it seems to me that if you want reliability, maintainability, and perhaps most important, debugability, you want to manage your memory yourself.
And try to pinpoint which of the hundred thousand totally unrelated functions has modified my data because it happens to use a bad pointer?
I had to debug a C program that started crashing after an unused variable declaration had been removed. The reason? - a dangling pointer.
The program was compiled without any optimization, so the memory for the variable had still been allocated (in spite of the var being unused), which shifted the other variables so that the dangling pointer had missed them. After deleting the unused var, the pointer (used totally elsewhere) damaged the data.
Managed memory rids you of this kind of problem. Or, at least, confines it to external libs written in non-managed languages.
VM aware GC by renoX · 2005-11-29 01:34 · Score: 2, Interesting

A paper I've found interesting describes a GC which communicates with the VM system to avoid putting too much load on it.
It needs a modification of the VM, but IMHO this is better than having to handtune the memory used by the GC. (Note: I'm not an expert in GC)

http://www.cs.umass.edu/~emery/pubs/04-16.pdf
Re:Depends by Mr.+Slippery · 2005-11-29 03:07 · Score: 2, Interesting

When you create objects in one module and give them to someone else, you create bugs.

No, the caller of your module creates a bug when they fail to free the object that you have clearly defined in the interface to be their responsibility. It's no different than any other violation of an interface condition. (If you don't clearly define your interfaces, then yes, you have of course created a bug.)
That's not a leak, it's sloppy programming.

Are you saying that leaks are not a form of sloppy programming? You've either failed to free things you knew needed to be freed (your sloppiness) or you've failed to know which things needed to be freed (which could be your sloppiness or someone else's).

--
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood
Lisp idiom by Nicolay77 · 2005-11-29 05:33 · Score: 2, Interesting

In lisp that would be using the unwind-protect macro.

Notice that this is totally unrelated to memory management.

So yes, it IS possible to have our cake and eat it too.
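For readers who don't know Lisp, here is a rough C++ analogue of what unwind-protect does, sketched under the assumption that a scope-guard object is a fair stand-in. The `unwindProtect` helper and `traceOf` are hypothetical names, not a standard facility. Nothing in it allocates or frees memory, which is the point being made.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: runs `body`, then always runs `cleanup`,
// whether `body` returned normally or exited by throwing.
// Note: `cleanup` itself must not throw, since it runs in a destructor.
template <typename Body, typename Cleanup>
void unwindProtect(Body body, Cleanup cleanup) {
    struct Guard {
        Cleanup& fn;
        ~Guard() { fn(); }
    } guard{cleanup};
    body();
}

// Records the order of events for a normal and a failing body.
std::string traceOf(bool fail) {
    std::string trace;
    try {
        unwindProtect(
            [&] {
                trace += "body;";
                if (fail) throw std::runtime_error("simulated failure");
            },
            [&] { trace += "cleanup;"; });
    } catch (const std::runtime_error&) {
        trace += "caught;";
    }
    return trace;
}
```

In both the normal and the failing case the cleanup step runs right after the body, before control reaches any handler further up.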

--
We are Turing O-Machines. The Oracle is out there.