Experiences w/ Garbage Collection and C/C++?

Use More Manageable Types by aster_ken · 2003-09-15 13:28 · Score: 3, Informative

Bjarne Stroustrup, the creator of C++, has this to say on garbage collection:

Clearly, if your code has new operations, delete operations, and pointer arithmetic all over the place, you are going to mess up somewhere and get leaks, stray pointers, etc. This is true independently of how conscientious you are with your allocations: eventually the complexity of the code will overcome the time and effort you can afford. It follows that successful techniques rely on hiding allocation and deallocation inside more manageable types.

He goes on to give detailed examples and recommendations on how to avoid using garbage collection.

Re:Use More Manageable Types by Javagator · 2003-09-16 03:36 · Score: 5, Interesting

I was part of a project to write an image display system of about 100k lines of C++ code. Our coding standards required that we allocate resources in constructors, release them in destructors, and put objects on the stack when possible. When putting objects on the stack was not possible, we used smart pointers.

Towards the end of the project, we finally got our company to spring for Purify. We ran Purify on the code and found a few places where we forgot to release a graphics context, and one place where we didn't follow our coding standards, but no other memory leaks. We then ran Purify on a comparable C system and found hundreds of memory leaks.

Proper use of constructors and destructors can make resource management virtually automatic.
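A minimal sketch of the constructor/destructor discipline described above, not the project's actual code. GraphicsContext and the live_contexts counter are hypothetical names standing in for a real resource.

```cpp
#include <stdexcept>

// Hypothetical counter standing in for a real resource table.
static int live_contexts = 0;

// Acquires the resource in the constructor and releases it in the destructor.
class GraphicsContext {
public:
    GraphicsContext() { ++live_contexts; }
    ~GraphicsContext() { --live_contexts; }
    // Copying would release the resource twice, so forbid it.
    GraphicsContext(const GraphicsContext&) = delete;
    GraphicsContext& operator=(const GraphicsContext&) = delete;
};

// The stack object is released on every exit path, including the throw.
void draw(bool fail) {
    GraphicsContext ctx;
    if (fail) throw std::runtime_error("draw failed");
}

// Returns the number of contexts still live after a normal and a failing call.
int leaked_after_draws() {
    draw(false);
    try { draw(true); } catch (const std::runtime_error&) {}
    return live_contexts;
}
```

Calling leaked_after_draws() should report no live contexts, because stack unwinding runs the destructor even when draw() throws.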
Re:Use More Manageable Types by WatertonMan · 2003-09-16 06:37 · Score: 2, Interesting

Depending upon how you are using Java, you end up having to do a lot of things that end up being managed garbage collection anyway. i.e. not that different from C/C++ except that when you screw up you don't end up with a memory leak.
If you use a tool like boundschecker and run it a couple of times a day, you'll avoid most of these sorts of problems in C++. It's an amazing tool I couldn't live without. One advantage to having these sorts of things flag as a "memory leak" is that you can often find subtle bugs that way.
Admittedly because of the issue of pointers vis a vis Java, that's less of an issue in Java. But I'm not sure merely adding garbage collection is ideal in most cases.
If you are writing code that is high-level enough that you really want garbage collection, the advantages of writing in C++ over Java are probably slight. (The speed advantage isn't that big a deal anymore, and many programs don't really need that slight speed advantage anyway.) If you are writing code that benefits from the fine control C++ offers, you'd probably not want to trust garbage collection, so the point is moot.
I'm not saying there aren't cases where garbage collection is useful. However, I halfway wonder if these situations wouldn't be better handled by a combination of languages, i.e. Java and C or Python and C. Code your important functions in C or C++ and put the main interface/io functions in the higher level language.
Re:Use More Manageable Types by angel'o'sphere · 2003-09-16 22:58 · Score: 1

Bjarne is only partly right.

First of all, he is against GC, and that's why he finds ways to avoid it and arguments that it is unnecessary.
However, there are two points which make his argumentation weak:
a) you need to know a great deal about "how the standard library works" and about C++ in general to apply his hints. If you had GC ... programmers could focus on their algorithms instead

b) all the arguments *against* GC are in fact arguments *for* GC. All the work and the burden the ordinary programmer is freed from, some library designer has put into the standard library!!!

The sample you referred to under this link, http://www.research.att.com/~bs/bs_faq2.html#memory-leaks, only shows that you can deal with standard library classes just as if you had GC enabled ... so: GC is good for the ordinary programmer.

Beyond that, Bertrand Meyer, the inventor of the Eiffel programming language, observed that in the majority of the cases he analyzed, a program without GC:
1) spends up to 30% of its runtime in memory management code
2) still has memory errors, dangling pointers and leaks
3) could be 29% faster if a GC with global program optimization were used (as such a GC in general uses far less than 1% of the CPU power for collection)

Background note: I have spent more than 12 years programming in C++, have about 800,000 LOC of C++ programming experience, and finally managed to develop a strict habit about how to allocate and deallocate. I also wrote my own libraries and memory analyzers to check my programs. So I *know* that you can live without GC ... but I also know that I spent a good deal of time becoming a C++ geek instead of becoming an "application domain geek" :-)

angel'o'sphere

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

Certainly.. by QuantumG · 2003-09-15 13:36 · Score: 4, Interesting

Have a look at our project Boomerang. We're over 230k lines of code and we garbage collect everything. It's as easy as linking to Hans Boehm's libgc and adding the following lines to one of your files (probably best is the one which contains "main").

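#include <cstddef>  // std::size_t
#include <gc.h>     // Boehm collector; may be installed as <gc/gc.h>

// Route every global allocation through the collector.
void* operator new(std::size_t n) {
    return GC_malloc(n);
}

// Deliberately empty: the collector reclaims unreachable memory itself.
void operator delete(void* p) {
}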

You can also mix collected memory with uncollected memory, but we really don't see the point. This way we can still have destructors which do useful things, but the actual memory cleanup is left to the garbage collector. Of course, as we write more and more new code we leave our deletes and our destructors out, and eventually we'll go through and remove them all. Until then, we can disable the garbage collector just by #if 0ing these lines out.

--
How we know is more important than what we know.

Re:Certainly.. by AJWM · 2003-09-15 14:09 · Score: 2, Interesting

leave our deletes and our destructors out,

If you have destructors, do they ever get called? Destructors aren't just for freeing memory, they're also used for freeing other system resources (depending on the object). File descriptors, database connections, that kind of stuff. (Granted, you can have explicit methods to take care of that cleanup, but then you can have explicit methods to free memory too. Seems to me you want all that as automatic as possible.)

--
-- Alastair
Re:Certainly.. by QuantumG · 2003-09-15 14:12 · Score: 1

You can call the destructors explicitly (with a delete) or you can configure gc to call them when the object is garbage collected. We don't do that because the only thing our destructors do is clean up memory, which is what the garbage collector does.

--
How we know is more important than what we know.
Re:Certainly.. by cpeterso · 2003-09-15 14:14 · Score: 1

Like the other guy, I want to know when your destructors get called? Plus not every system allows you to override global new and delete operators (e.g. the Symbian cell phone OS).

--
cpeterso
Re:Certainly.. by QuantumG · 2003-09-15 14:29 · Score: 1

Well, for a start, you can read the reply I gave to the other guy. Secondly, I'd be surprised if the Boehm garbage collector even runs on that platform (it is very platform specific and supports a number of popular platforms).

--
How we know is more important than what we know.
Re:Certainly.. by Hard_Code · 2003-09-16 00:12 · Score: 1

System resources you need to free should ALWAYS BE FREED EXPLICITLY regardless of GC. Relying on a destructor to close a file or database connection is bad.

--

It's 10 PM. Do you know if you're un-American?
Re:Certainly.. by AJWM · 2003-09-16 03:02 · Score: 1

Shrug. Memory is a system resource.

This is what destructors are for, so you can explicitly (in the destructor) free resources that haven't yet been freed. Simplifies dealing with exceptions, especially where an exception may take control out of the scope of the object.

Sure, explicitly free your descriptors and whatnot -- that makes the code clearer -- but also do it in the destructor (wrapped in a suitable check so you don't do it twice, of course) as a back up.
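A minimal sketch of the "explicit close plus destructor as backup" pattern described above. The File class is hypothetical; it wraps the standard fopen/fclose calls.

```cpp
#include <cstdio>

// Hypothetical wrapper: close() may be called explicitly, and the destructor
// repeats it as a backup, guarded so the handle is never closed twice.
class File {
public:
    File(const char* path, const char* mode) : fp_(std::fopen(path, mode)) {}
    ~File() { close(); }
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    bool is_open() const { return fp_ != nullptr; }

    void close() {
        if (fp_ != nullptr) {   // the "suitable check"
            std::fclose(fp_);
            fp_ = nullptr;
        }
    }

private:
    std::FILE* fp_;
};

// Explicit close followed by scope exit: the destructor's call is a no-op.
bool close_twice_is_safe(const char* path) {
    File f(path, "w");
    if (!f.is_open()) return false;
    f.close();
    return !f.is_open();
}
```

If an exception leaves the scope before the explicit close() is reached, the destructor still closes the handle.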

--
-- Alastair
Re:Certainly.. by arkanes · 2003-09-16 05:33 · Score: 1

Ignoring the whole "Allocation is initialization" thing is a good way to mess up when you're dealing with exceptions. C++ lacks finally blocks for a reason, after all.
You can just rely on the GC to clean up other system resources as well as memory, but most of them are a lot finickier than memory - if you open a file, you want to close it when you're done, not when GC runs.
Re:Certainly.. by jmccay · 2003-09-16 10:15 · Score: 1

Actually, you may not write any destructors, but they do get created for you by most compilers. Destructors are one of a few functions that will get created for you if you don't write them yourself.
Personally, I think GC is overrated. GC should be left to languages like Java where it is built in, and a lot of design consideration was put into adding it to the language. At the bare minimum I think everyone should have to manage their own memory for a while in order to learn what's going on and why it's going on. Once you've got the basics and fundamentals down, then you can move on to GC type stuff.

--
At the next eco-hypocrisy-meeting, count the private jets used to get to the meeting. Should be interesting to see that
Re:Certainly.. by Anonymous+Brave+Guy · 2003-09-17 11:33 · Score: 1

Sure, explicitly free your descriptors and whatnot -- that makes the code clearer -- but also do it in the destructor (wrapped in a suitable check so you don't do it twice, of course) as a back up.

Actually, I'd argue that most of the time, you shouldn't be releasing anything directly. If you find yourself writing my_file.close(), consider whether you've got the my_file object at the right scope in the first place.

What does it mean to refer to my_file after the close() call anyway? In most cases, the answer is "not a lot", in which case you'd do better to build your code so that my_file is only in scope when it's useful, and thus automatically closed by the destructor as soon as you've finished with it.
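A small sketch of the scoping idea, assuming standard file streams; the function name and file contents are made up for illustration. Neither stream is closed by hand.

```cpp
#include <fstream>
#include <string>

// Writes a line, then reads it back. Each stream lives in its own scope,
// so its destructor closes the file without any explicit close() call.
std::string write_then_read(const std::string& path) {
    {
        std::ofstream out(path);
        out << "hello\n";
    }   // out is destroyed here; the file is flushed and closed

    std::string line;
    {
        std::ifstream in(path);
        std::getline(in, line);
    }   // in is destroyed here
    return line;
}
```

Because out no longer exists when the read starts, there is no way to refer to a closed stream by accident.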

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:Certainly.. by Anonymous+Brave+Guy · 2003-09-17 11:39 · Score: 1

Why? The whole point of deterministic destruction is that you can rely on destructors to clean things up, and in C++, the accepted idiom for resource management does just that. If the destruction is happening at the "wrong time", it's probably a symptom of a design flaw (see my other post in this subthread).

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:Certainly.. by AJWM · 2003-09-17 14:58 · Score: 1

Valid point. It depends what exactly the object is and what you're doing with it, of course, but if you're encapsulating properly then usually what you argue is correct.

--
-- Alastair
Re:Certainly.. by Ed+Avis · 2003-09-19 19:36 · Score: 1

Hmm, but wouldn't it be better to have the delete operator do something? I know you don't _need_ it to do anything, because the garbage collector will eventually free the memory, but performance might be better if you can insert explicit 'delete' at certain places in your program where you know that an object is no longer referenced. operator delete() would inform the GC that the memory had been freed manually.

--
-- Ed Avis ed@membled.com

GC in OpenCM by Jonathan+S.+Shapiro · 2003-09-15 14:33 · Score: 5, Informative

We made a decision early to use GC and exceptions in OpenCM, even though the application is written in C. Conceptually, it was a big success, but there were a number of hurdles along the way. Here are some things we learned:

The Boehm-Weiser (BW) collector is not as portable as we had hoped. There are a number of platforms we wanted to run on where it just doesn't run at all. Relatively small changes to the target runtime can create a need to port it all over again. OpenBSD, in particular, was an ongoing hassle until we abandoned BW. Hans, I hasten to add, was quite encouraging, but he simply doesn't have time to adequately support the collector.
The BW collector doesn't work in our application. OpenCM has a few very large objects. For reasons we don't really understand, this tends to cause a great deal of garbage retention when running the BW collector. Enough so that the OpenCM server crashed a lot when using it. Please note that this was NOT a bug involving falsely retained pointers, as later experience showed.
Conservative collectors are actually too conservative. If you are willing to make very modest changes in your source code as you design the app, there prove to be very natural places in the code for collection, and the resulting collector is quite efficient.
Independent of the collector, we also hacked together an exceptions package. This was also the right thing to do, but it's easy to trip over it in certain ways. The point of mentioning this is that once you do exceptions the pointer tracking becomes damned near hopeless and you essentially have to go to GC.
I think the way to say this is: exceptions + GC reduces your error handling code by a lot. Instead of three lines of error check on every procedure call, the error checking is confined to logical recovery points in the program, and you don't have to mess around simulating multiple return values in order to return a result code in parallel with the actually intended return value.
To provide malloc pluggability, we implemented an explicit free operation. This lets us interoperate compatibly with other libraries and do leak detection. Turns out to be very handy in lots of ways.
Hybrid storage management works very well. For example, our diff routine explicitly frees some of its local storage (example) [Sorry -- this link will go stale within the next few weeks because the OpenCM web interface will change in a way that makes it obsolete. If the link doesn't work for you, try looking for the same file in .../DEV/opencm/...] This is actually quite wonderful, as it lets us build certain libraries to be GC compatible without being GC dependent. One of the challenges in using a GC'd runtime in a library is compatibility with an enclosing application that doesn't use GC. We haven't tried it yet, but it looks like our gcmalloc code will handle this.
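The point above about error-handling code can be sketched as follows. OpenCM is written in C with its own exceptions package; this sketch uses standard C++ exceptions only to show the shape of the difference, and all function names are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Result-code style: every call is followed by a check, and the real result
// comes back through an out-parameter.
int parse_digit_rc(char c, int* out) {
    if (c < '0' || c > '9') return -1;
    *out = c - '0';
    return 0;
}

int sum_digits_rc(const std::string& s, int* out) {
    int total = 0;
    for (char c : s) {
        int d = 0;
        int rc = parse_digit_rc(c, &d);
        if (rc != 0) return rc;   // check repeated at every call site
        total += d;
    }
    *out = total;
    return 0;
}

// Exception style: functions return their value directly.
int parse_digit(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
    return c - '0';
}

int sum_digits(const std::string& s) {
    int total = 0;
    for (char c : s) total += parse_digit(c);
    return total;
}

// A single recovery point handles any failure below it.
int sum_digits_or(const std::string& s, int fallback) {
    try {
        return sum_digits(s);
    } catch (const std::invalid_argument&) {
        return fallback;
    }
}
```

The exception version has no per-call checks and no out-parameters; the trade-off is that cleanup along the unwinding path must happen automatically, which is where GC (or destructors) comes in.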

Eventually, we gave up on the BW collector and wrote our own. Our collector is conceptually very similar to the collector that Keith Packard built for Nickle, though we've since built from there. A variant of the Nickle collector is also used as a debugging leak tracer for X11.

The OpenCM GC system is reasonably well standalone. We need to document it, but others might want to look at it when we cut our next release.

On the whole, I'd say that GC for this app was definitely the right thing to do. Once you get into object caches it becomes very hard to locate all of the objects and decide when to free them. We were able to use a conservative approach with no real hassle, and heap size is fairly well bounded by the assisted GC approach we took.

On the other hand, I would not recommend a pure conservative collector for a pro

--
Jonathan S. Shapiro (The EROS Guy)

for your information... by larry+bagina · 2003-09-15 15:07 · Score: 0, Offtopic

us janitors prefer the term "sanitation engineer".

--
Do you even lift?

These aren't the 'roids you're looking for.

Re:for your information... by Anonymous Coward · 2003-09-16 01:15 · Score: 0

I'm with you 99%.

There's another way. by Lally+Singh · 2003-09-15 15:34 · Score: 4, Informative

Garbage collection has costs:
- The obvious: CPU & memory overhead for the checking and tracking. I can't comment on the amount here, but it is a generalized solution, so you forego the optimization opportunities that you'd otherwise have.
- The subtle: Memory allocation can become a major bottleneck in multithreaded systems. Garbage collection has similar issues.
- The irritating: you don't know when your destructors are called.

Another way: Smart Pointers. They're simple wrappers around the types that act like pointers, but they can make sure your objects live as long as you need and no longer. The big trick is knowing which kind of smart pointer you want.
- Reference Counting Smart Pointer (RCSP for short): this type of smart pointer will keep track of how many RCSPs are pointing to the same object. It'll delete the object when the last RCSP is destroyed. A good one is the boost shared_ptr. Available for free from www.boost.org. This type is great for general use.

- Owning Smart Pointer (OSP): this type is specialized for those cases when the refcnt is never more than 1. When you assign one OSP (a) to another (b), the destination (b) takes ownership of the referred object, and the source (a) is automatically set to null. When an OSP that isn't set to null is destroyed, it deletes the object it owns. It's great for parameter passing, return values, and objects you want dead at the end of the current scope, even if there's an exception. The STL comes with auto_ptr, which works this way.

You can use an RCSP wherever you can use an OSP, but not the other way around. The STL containers are a great example.
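A short sketch of the two kinds of smart pointer. It assumes a C++11 compiler and uses std::shared_ptr and std::unique_ptr as stand-ins for boost::shared_ptr and auto_ptr; unlike auto_ptr, unique_ptr requires an explicit std::move to transfer ownership. Widget is a hypothetical type that counts its live instances.

```cpp
#include <memory>
#include <utility>

// Hypothetical payload type that counts live instances.
struct Widget {
    static int alive;
    Widget() { ++alive; }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Reference counting: the object dies with the last shared_ptr.
long shared_count_demo() {
    std::shared_ptr<Widget> a = std::make_shared<Widget>();
    std::shared_ptr<Widget> b = a;      // two owners now
    return a.use_count();
}

// Single ownership: moving transfers the object and nulls the source.
bool ownership_transfer_demo() {
    std::unique_ptr<Widget> a(new Widget);
    std::unique_ptr<Widget> b = std::move(a);
    return a == nullptr && b != nullptr;
}
```

In both functions the Widget is destroyed when the last owner goes out of scope, so Widget::alive drops back to zero without any explicit delete.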

Sure it's not as easy as 'allocate and forget,' but you won't have the (sometimes very costly) expense of full-blown garbage collection.

Also, you can optimize your smart pointers for individual types (through template specialization). A great example is to give the no-longer-needed object back to a pool for later reuse.

This is really a quick, quick overview. For the meat & potatoes, go read Effective STL by Scott Meyers.

I've tried really hard to be fair & polite. There's probably still a bias, but I'm really trying!!

--
Care about electronic freedom? Consider donating to the EFF!

Re:There's another way. by profet · 2003-09-15 18:08 · Score: 1

QUOTE:

Another way: Smart Pointers. They're simple wrappers around the types that act like pointers, but they can make sure your objects live as long as you need and no longer. The big trick is knowing which kind of smart pointer you want. - Reference Counting Smart Pointer (RCSP for short): this type of smart pointer will keep track of how many RCSPs are pointing to the same object. It'll delete the object when the last RCSP is destroyed. A good one is the boost shared_ptr. Available for free from www.boost.org. This type is great for general use.

You may also want to check out the Objective-C language...
Re:There's another way. by Lally+Singh · 2003-09-15 18:42 · Score: 1

Objective-C's got a builtin reference-counting mechanism in NSObject, but you've gotta call [obj retain] and [obj release] yourself. The RCSP will do all of the work itself. You just assign it and use it. It'll even work properly through an exception.

--
Care about electronic freedom? Consider donating to the EFF!
Re:There's another way. by swillden · 2003-09-15 18:50 · Score: 5, Insightful

You repeat some common myths about GC; allow me to counter them.
The obvious: CPU & memory overhead for the checking and tracking. I can't comment on the amount here, but it is a generalized solution, so you forego the optimization opportunities that you'd otherwise have.
Malloc/free and new/delete (without pooling) are also generalized solutions, and they also consume CPU and memory overhead for checking and tracking. There is good reason to believe that in the right type of language (which C and C++ are not) that GC can actually be much more efficient than manual deallocation, mainly because it can do its work in larger batches, and because it can reorganize objects in memory to make allocating more efficient. Contrast a simple single-heap malloc implementation, which has to scan a free list looking for a sufficiently large block against a copying garbage-collected system where the allocation pool is simply a large contiguous block from which you just grab the first 'n' bytes.
If you look on Boehm's web site, you can find a few papers comparing the performance of conservative GC for C with optimized malloc/free implementations. malloc/free wins, but not by as much as you'd expect.
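To illustrate the "just grab the first 'n' bytes" allocation path mentioned above, here is a hypothetical bump allocator over a fixed buffer. It is only the allocation fast path, not a collector: a real copying GC would also move live objects and reset the pointer when the region fills.

```cpp
#include <cstddef>

// Hypothetical bump allocator; BumpRegion and kSize are illustrative names.
class BumpRegion {
public:
    BumpRegion() : next_(0) {}

    // Grab the next n bytes, rounded up so later blocks stay aligned.
    void* allocate(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n > kSize) return nullptr;                 // can never fit
        const std::size_t rounded = (n + a - 1) / a * a;
        if (rounded > kSize - next_) return nullptr;   // region exhausted
        void* p = buffer_ + next_;
        next_ += rounded;                              // the whole "search"
        return p;
    }

    std::size_t used() const { return next_; }

private:
    static constexpr std::size_t kSize = 4096;
    alignas(std::max_align_t) unsigned char buffer_[kSize];
    std::size_t next_;
};
```

Allocation here is a bounds check and an addition, with no free list to scan; the cost moves to whatever reclaims the region later.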
The subtle: Memory allocation can become a major bottleneck in multithreaded systems. Garbage collection has similar issues.
Actually, GC *eases* the issues associated with recovering memory in multithreaded systems. Why? In a multi-threaded program with manual deallocation, both allocation and deallocation occur in every thread context. In a GC system, all deallocation is typically concentrated in a single thread, the GC thread. Allocation is still spread across threads but the required interlocking is hugely reduced since the GC thread can do all of the reclaiming, block coalescing and free list construction (if that's the technique used) without any interference from the other threads. It will have to acquire a mutex to place the recovered blocks back where the active threads can get them, of course.
Generational, copying GCs can do even better, but not for C or C++.
The irritating: you don't know when your destructors are called.
As experience with finalize methods in Java has shown, you should really treat GC as a way of having infinite memory. The problem with finalizers/destructors is that not only do you not know when they'll be called, you have no way of knowing that they'll *ever* be called. That means that they're effectively useless and add significant complexity and overhead for little or no return.
IMO, if you want to use C++ with GC, you should make sure that objects that have non-trivial destructors (those that do something besides memory management) get destructed normally, and just let GC handle the memory.
Reference Counting Smart Pointer (RCSP for short)
They're useful, but I'd hardly call them great. Reference counting is *far* more compute-intensive than scanning-type garbage collection. And then there's the problem of circular references, which will never be reclaimed. Of course, it's not that hard to avoid those situations most of the time, but with GC you don't have to care.
Owning Smart Pointer (OSP) ... auto_ptr
These are very useful, and conscientious use of them will eliminate 95% of memory leaks and dangling pointers. OTOH, they don't work when things get complex enough that ownership isn't simple and clear.
Also, you can optimize your smart pointers for individual types (through template specialization). A great example is to give the no-longer-needed object back to a pool for later reuse.
Generally, I would do this through specialized new/delete, rather than specialized smart pointers. Regardless of the mechanism, though, pooled allocation is the absolute best thing you can do to minimize the cost of memory management in your application. The reason is, of course, that you build the pooling based on your knowledge of the actual usage characteristics of the objects; knowledge that no general-purpose memory manager can possibly have.
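A minimal sketch of pooling through class-specific new/delete, as suggested above. Node is a hypothetical type; the pool is single-threaded and never returns memory to the system, which a real implementation would have to reconsider.

```cpp
#include <cstddef>
#include <new>

// Hypothetical list node that recycles its own storage through a free list.
class Node {
public:
    int value;
    Node* next;
    explicit Node(int v) : value(v), next(nullptr) {}

    static void* operator new(std::size_t size) {
        if (size == sizeof(Node) && free_list_ != nullptr) {
            FreeBlock* block = free_list_;     // reuse a recycled block
            free_list_ = block->next;
            ++reused_;
            return block;
        }
        return ::operator new(size);           // pool empty: fall back
    }

    static void operator delete(void* p, std::size_t size) {
        if (p == nullptr) return;
        if (size != sizeof(Node)) { ::operator delete(p); return; }
        FreeBlock* block = static_cast<FreeBlock*>(p);
        block->next = free_list_;              // push onto the free list
        free_list_ = block;
    }

    static int reused() { return reused_; }

private:
    struct FreeBlock { FreeBlock* next; };
    static FreeBlock* free_list_;
    static int reused_;
};

Node::FreeBlock* Node::free_list_ = nullptr;
int Node::reused_ = 0;

// A freed block must be large enough to hold the free-list link.
static_assert(sizeof(Node) >= sizeof(void*), "Node too small to pool");
```

Callers keep writing plain new Node(...) and delete; only the class knows that freed blocks are recycled rather than handed back to the general allocator.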

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:There's another way. by jdfekete · 2003-09-15 23:15 · Score: 2, Insightful

I used BW Garbage Collection on an Information Visualization system here available under the GPL.
It works, but with some problems, mainly due to the fact that it doesn't know enough about the OS, in particular large pages allocated by libraries that it has to scan for pointers. There is nothing that cannot be fixed in theory, but systems are not designed for it right now.
On my visualization application, it spends seconds scanning some memory mapped zone opened by NVidia OpenGL implementation (this is a guess from looking at the stack trace in gdb).

For the performance talk, there are interesting figures in the book Garbage Collection
and at Boehm's site showing that reference counting costs more than garbage collection.
However, reference counting is predictable and does not interrupt interactive applications at random moments.
In particular for multithreaded applications, reference counts should be guarded by a mutex that is very expensive and can be avoided using a GC.

One possible alternative to C++ with BW is using gcj, the GNU Compiler for Java. It uses the same back-end as G++ for producing optimized code but is closely tied to the BW garbage collector, providing information about how objects are organized in memory, which improves the marking time of the GC.
gcj produces code that can be linked with C and C++ without having to resort to JNI.
Re:There's another way. by cronie · 2003-09-15 23:47 · Score: 1

And then there's the problem of circular references, which will never be reclaimed. Of course, it's not that hard to avoid those situations most of the time, but with GC you don't have to care.
Ref-counting can be safe if used in conjunction with other methods. Its flaw is not only in that circular references between objects can't be tracked down but also in the general assumption that you should stay alive in memory while at least someone holds a reference to you. Semantically that's not always fair, since you may have done your job already and there's no need for someone to refer to you any longer.
A mechanism that allows to invalidate a reference when an object dies of its own will complements reference counting. Afaik such thing is present in Gtk, in Delphi VCL and some other generic libs.
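The Gtk and VCL mechanisms mentioned above are notification schemes built into a base class. As a different but related technique, the C++11 standard library offers std::weak_ptr, a reference that does not keep its target alive and can be checked once the target is gone. A minimal sketch, with hypothetical Document and Viewer types:

```cpp
#include <memory>

struct Document { int pages = 3; };

// A viewer observes a document without keeping it alive.
struct Viewer {
    std::weak_ptr<Document> doc;

    // Returns the page count, or -1 once the document has been destroyed.
    int pages() const {
        if (std::shared_ptr<Document> d = doc.lock()) return d->pages;
        return -1;
    }
};
```

Once the last owning shared_ptr is reset, lock() returns an empty pointer and the viewer sees that its reference has been invalidated.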
Re:There's another way. by Hard_Code · 2003-09-16 00:04 · Score: 1

What about circular references? If they are not themselves referenced by an external object, do they still get cleaned up, or do they stick around because they each have a refcount of 1?

--

It's 10 PM. Do you know if you're un-American?
Re:There's another way. by profet · 2003-09-16 01:46 · Score: 1

While you still have to do some work... NSAutoreleasePool does help.
Re:There's another way. by swillden · 2003-09-16 02:02 · Score: 1

A mechanism that allows to invalidate a reference when an object dies of its own will complements reference counting.
Yep, manually breaking the circularity solves the problem, but that's not always possible.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:There's another way. by Repugnant_Shit · 2003-09-16 02:29 · Score: 1

I read some good articles about this at Relisoft.com and it was very helpful.

--
Vote for global prefs bug
Re:There's another way. by cronie · 2003-09-16 04:46 · Score: 1

When exactly is it not possible? You program the destruction notification yourself, so why shouldn't it be possible? The only difficulty here is extra coding in both objects, the reference holder and the target. They should be somehow `aware' of this functionality, and most likely it means that it should be built into the very basic class of your hierarchy, like Gtk and VCL do.
From my experience, this technique combined with 'smart' pointers does quite well even in huge and complex applications.
Re:There's another way. by AJWM · 2003-09-16 08:12 · Score: 1

Reference Counting Smart Pointer (RCSP for short): this type of smart pointer will keep track of how many RCSPs are pointing to the same object. It'll delete the object when the last RCSP is destroyed.

So, if you have two RCSPs pointing at each other (or a whole daisy chain of them), and nothing else pointing to any of them, when do they get deleted?

(They don't. That's the weakness of reference counting. You're fine so long as you never create any circular lists. (That's one reason you cannot create hard links to directories on 'nix.) It's something that mark and sweep can catch, though. But reference counting is usually faster than mark'n'sweep.)
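The cycle problem can be demonstrated with std::shared_ptr (C++11), used here as a stand-in for any reference-counting smart pointer. Link is a hypothetical type that counts live instances; the weak_ptr exists only so the sketch can clean up after itself, which a real leaked cycle would have no way to do.

```cpp
#include <memory>

struct Link {
    static int alive;
    std::shared_ptr<Link> next;
    Link() { ++alive; }
    ~Link() { --alive; }
};
int Link::alive = 0;

// Builds a two-node cycle, drops both outside references, and reports how
// many nodes are still alive at that point.
int stranded_links() {
    std::weak_ptr<Link> handle;        // non-owning, kept only for cleanup
    {
        std::shared_ptr<Link> a = std::make_shared<Link>();
        std::shared_ptr<Link> b = std::make_shared<Link>();
        a->next = b;
        b->next = a;
        handle = a;
    }   // both owners are gone, but each node still has a refcount of 1

    int stranded = Link::alive;

    // Break the cycle by hand so both nodes are finally destroyed.
    if (std::shared_ptr<Link> a = handle.lock()) a->next.reset();
    return stranded;
}
```

Both nodes survive the end of the inner scope because each holds the other; they are reclaimed only after one link is cleared manually.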

--
-- Alastair
Re:There's another way. by Magius_AR · 2003-09-16 10:40 · Score: 1

Another way: Smart Pointers. They're simple wrappers around the types that act like pointers, but they can make sure your objects live as long as you need and no longer. The big trick is knowing which kind of smart pointer you want.
I've read from multiple resources that smart pointers don't work with STL containers (due to the way the internal container handles memory)
Re:There's another way. by Anonymous Coward · 2003-09-16 12:18 · Score: 1, Interesting

Actually, GC *eases* the issues associated with recovering memory in multithreaded systems. Why? In a multi-threaded program with manual deallocation, both allocation and deallocation occur in every thread context. In a GC system, all deallocation is typically concentrated in a single thread, the GC thread. Allocation is still spread across threads but the required interlocking is hugely reduced since the GC thread can do all of the reclaiming, block coalescing and free list construction (if that's the technique used) without any interference from the other threads. It will have to acquire a mutex to place the recovered blocks back where the active threads can get them, of course.

It's amazing how moderators will usually mod up a post because it's really long and they seem to know what they are talking about.

You're 100% wrong. Garbage collection from a single thread locked by a single mutex stalls all threads wishing to allocate memory. Allocation and deallocation from a given memory arena must make use of the same mutex - i.e., memory operations across all threads effectively block. Making use of more than one memory arena is one (naive) way to work around this problem. You also have to consider cache lines to avoid stalling different threads. There's a dozen other factors as well. In the end even your best generational garbage collector is no match for a modern SMP malloc/free implementation like Hoard. Google on it.
Re:There's another way. by swillden · 2003-09-16 12:41 · Score: 1

You also have to consider cache lines to avoid stalling different threads. There's a dozen other factors as well. In the end even your best generational garbage collector is no match for a modern SMP malloc/free implementation like Hoard. Google on it.
Thanks, I will.
I'll freely admit that my knowledge of memory management schemes was state of the art circa 1998, but that changes in processor architecture (heavy memory caching, deep pipelines) and machine architecture (widespread SMP) could very well have dated that knowledge.
Apparently I need to read up again; I'll do so. I'm not completely convinced that you're correct, though. GC architectures that I'm familiar with do use a separate pool for allocations during the time that the GC is running. I can see that cache management is an issue, but it would also seem that a compacting collector would help to make sure that more of your actual data fits in the cache (as opposed to the mixture of data and garbage). Dunno how that washes out.
I'll definitely have to look into it.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:There's another way. by j3110 · 2003-09-16 16:26 · Score: 1

I don't think you are thinking about what actually happens at CPU level.

If a mutex is active and a thread blocks, it will get put into the wait queue and not use CPU. The only ready processes will be new IO, then you are back to the GC code again. The only overhead is the actual scanning on a single CPU system. I agree that GC and SMP can get a bit hairy (it would probably be best to divide the memory space N ways for an N processor system and separate thread memory as much as possible).

It's also true that GC will make your program nearly impossible to be real-time. Who are we kidding though, no one makes real time programs really anymore. They make near-real-time IO driven that you hope the CPU is fast enough that people don't see the lag of the controls to the graphics (games) or you buffer (multimedia).

I really don't care if SSH was a little slower... if it had memory bounds checking and garbage collection such that I didn't have to upgrade a half dozen machines today, I would be happy.

You arguing that malloc and free are faster than GC is like arguing that a space shuttle is faster than a car or a plane. We don't need to go to the moon for most tasks, so it's just not worth the expense and effort to hitch a ride on the next orbital craft to cross the Pacific. If you want to waste the time and effort to make ssh run on a palm top, that's your motivation. The rest of us would prefer to not spend 10h/week patching computers that max out at using 10% of the CPU/Memory bandwidth. In fact, if the numbers for the damage caused by worms are even remotely accurate, I think we could afford just spending the extra $100 on CPU+RAM per system instead of the 1-2h of labor per system per month, and it would be much more economical.

--
Karma Clown
Re:There's another way. by Lally+Singh · 2003-09-17 04:38 · Score: 1

The auto_ptr doesn't. The reference-counting boost::shared_ptr does.

--
Care about electronic freedom? Consider donating to the EFF!

C++ very expressive indeed by Latent+Heat · 2003-09-15 16:03 · Score: 2, Troll

C++ is very expressive indeed, and the example shows that C++ is good at something for which many would consider a scripting language to be required. But like some scripting languages, this C++ code fragment descends into a mix of Egyptian hieroglyphics and Hittite cuneiform rather quickly.

So please tell me what

typedef vector::const_iterator Iter;

(or rather vector::const_iterator) is supposed to mean. I suppose vector is a templated class, but how does ::const_iterator come up with a type name -- I thought :: either references a static field or a class member function?

And what is the deal with the sort(,) as a free-standing function? Following OO principles, shouldn't the vector object v know how to sort itself with a call to v.sort()? And what the heck is this const_iterator type anyway that you can do ++ and * on it -- looks an awful lot like a pointer -- oops, I forgot, you can overload ++ and * to make "safe" operations on what are really objects look like "dangerous" pointer operations which the C/C++ community is in the custom of using.

In principle, all the stuff done in Java and perhaps in scripting languages could all be done elegantly and expressively in C++ if us mere mortals ever figure out how to use the darned thing. But there is a kind of uniformity to Java (all object variables being GC'd heap references, collection and iterator types working with generic Object's that we cast to what the object is and rely on runtime type checking, don't worry-be happy allocation of these objects where we grab towels from the rack and leave them on the bathroom floor for the hotel maid to pick up) that simply feels more comfortable.

C++ is the music of Bach: elegant, mathematical, intricate, and expressive, but most musicians performing in front of audiences don't understand it and it is played as a dull jumble and mishmash and audiences gaze at Bach stuck at the beginning of a recital as a chore to get through. Java is the music of Mozart: simplified, standardized, predictable, and economical, but musicians of this era understand it and play it with gusto, and audiences love it because it sounds so happy and makes them feel uplifted.

Re:C++ very expressive indeed by ville · 2003-09-15 16:26 · Score: 3, Informative

And what is the deal with the sort(,) as a free-standing function? Following OO principles, shouldn't the vector object v know how to sort itself with a call to v.sort()?

Not necessarily. If you have multiple types of containers and you can write a single sort that can sort all those types then why implement it in all of them instead of just once.

Here is an article that deals with the question of which functions should be members and which shouldn't. It uses std::string as an example, which has a lot of member functions that, it turns out, don't need to be members.

// ville
Re:C++ very expressive indeed by cookd · 2003-09-15 17:05 · Score: 3, Insightful

Classes can have inner classes as well as typedefs. Those are in the namespace of the class, so the namespace operator :: is used to access them.

And the sort question was answered by somebody else, but here is a bit more on the subject: if a bunch of classes share portions of their interfaces, and the shared subset is enough to perform a useful operation, why not share the implementation of the operation? While you could certainly "tell a vector to sort itself", it makes just as much sense to "apply the sort operation to a vector". Sort is not a primitive operation on a vector, it doesn't require access to the internals of the vector for efficient implementation, so there is no reason to make it a member.
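
A minimal sketch of that idea, assuming a C++11 or later compiler. The helper name sort_descending is made up for illustration; the point is only that one free function, written once, works for any container exposing the shared begin()/end() interface with random-access iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper, written once as a free function template. It relies only
// on what the containers share: begin(), end() and a nested value_type.
template <typename Container>
void sort_descending(Container& c) {
    typedef typename Container::value_type T;
    std::sort(c.begin(), c.end(), std::greater<T>());
}

// Example use on two unrelated container types.
void sort_demo() {
    std::vector<int> v;
    v.push_back(3); v.push_back(9); v.push_back(1);
    sort_descending(v);
    assert(v.front() == 9 && v.back() == 1);

    std::deque<std::string> d;
    d.push_back("apple"); d.push_back("pear");
    sort_descending(d);
    assert(d.front() == "pear");
}
```

Neither std::vector nor std::deque needed a sort member for this to work, which is the trade-off being described: the operation lives outside the class and the implementation is shared.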

The rest I agree with. A C++ master can do wondrous things, but there are few C++ masters out there. Very simply put, it is really tough to come up with the BEST way to do something in C++ -- there is always more than one way to do it, and doing it perfectly can take an impossible amount of time.

Also, since many useful concepts are possible to implement optimally, yet not built-in to the language, there are way too many libraries. How many different ways are strings expressed? Too many.

--
Time flies like an arrow. Fruit flies like a banana.
Re:C++ very expressive indeed by lovelace · 2003-09-15 17:20 · Score: 3, Interesting

So please tell me what

typedef vector::const_iterator Iter;

(or rather vector::const_iterator) is supposed to mean. I suppose vector is a templated class, but how does ::const_iterator come up with a type name -- I thought :: either references a static field or a class member function?
No, classes can have types too. In this case, a vector::const_iterator is an iterator over the vector type that can point to anything in the container (the vector) but cannot change anything in it. A read-only pointer, if you will.
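
To make the "classes can have types" point concrete, here is a small sketch. The Inventory class and the join function are invented for illustration; only std::vector, std::string and their const_iterator come from the standard library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class with a nested type name, reached as Inventory::const_iterator.
class Inventory {
public:
    typedef std::vector<std::string>::const_iterator const_iterator;

    void add(const std::string& item) { items_.push_back(item); }
    const_iterator begin() const { return items_.begin(); }
    const_iterator end() const { return items_.end(); }

private:
    std::vector<std::string> items_;
};

// Joins all items with '+', only reading through the const_iterator.
std::string join(const Inventory& inv) {
    std::string out;
    for (Inventory::const_iterator p = inv.begin(); p != inv.end(); ++p) {
        out += *p;     // reading through the iterator is allowed
        out += '+';
        // *p = "x";   // would not compile: a const_iterator cannot modify the element
    }
    return out;
}

void nested_type_demo() {
    Inventory inv;
    inv.add("a");
    inv.add("b");
    assert(join(inv) == "a+b+");
}
```

So :: here names a type that lives in the class's scope, the same way it would name a static member or a member function.
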
And what is the deal with the sort(,) as a free-standing function? Following OO principles, shouldn't the vector object v know how to sort itself with a call to v.sort()? And what the heck is this const_iterator type anyway that you can do ++ and * on it -- looks an awful lot like a pointer -- oops, I forgot, you can overload ++ and * to make "safe" operations on what are really objects look like "dangerous" pointer operations which the C/C++ community is in the custom of using.
C++ is not just an OO language. It's a multi-paradigm language. In addition to OO, you can also do procedural, functional or generic programming. The key is to use the right tool for the right job, not to force everything into the OO model.

As far as iterators looking like pointers, there's a reason for that. They're modeled after pointers! This way a regular pointer can be used with any generic algorithm (like sort).
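
A short sketch of why that modeling matters. The sum template below is a made-up example; it needs nothing from its argument except !=, ++ and unary *, so a plain int* and a vector iterator both qualify.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical generic algorithm: works with any type that supports
// !=, ++ and unary *, which includes ordinary pointers.
template <typename Iter>
int sum(Iter first, Iter last) {
    int total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

void pointer_iterator_demo() {
    int raw[] = {4, 1, 3};
    const std::size_t n = sizeof raw / sizeof raw[0];

    std::sort(raw, raw + n);                 // plain int* used as the iterator
    assert(raw[0] == 1 && raw[2] == 4);

    std::vector<int> v(raw, raw + n);
    assert(sum(raw, raw + n) == sum(v.begin(), v.end()));  // one template, two iterator types
}
```
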
C++ is not just Bach. It's the entire classical genre. Java, while nice for some things, just falls short at most things. All things do not fit into the object oriented model and trying to put them there when they don't belong is a recipe for disaster. While I disagree with your characterization of Java as Mozart, I will say this. I learned Mozart when I was a beginning music student. After too long, though, it became predictable and downright boring. There was so much more I wanted to do with music. So, I branched out to Beethoven or Strauss. From there I went to even more exotic things like Bartok or Shostakovich. I felt much more fulfilled as a musician and could express things I never could while just playing Mozart. C++ is like this. It's not just OO or procedural or functional or generic. It can be whatever you want it to be and after using it for a while, other languages just don't match up.
Re:C++ very expressive indeed by aled · 2003-09-16 01:52 · Score: 1

C++ is not just an OO language. It's a multi-paradigm language.

I'm tired of people confusing any constraint in a language with being handcuffed. They are so used to being loose that anything else looks too tight to them. I know nothing about music, but when you learned, you accepted staff notation as the paradigm and language for describing music, didn't you? Do you need pointers and templates for music? Seems like a single language, single paradigm to me.
In the end, if you don't like it, just don't use it.

--

"I think this line is mostly filler"
Re:C++ very expressive indeed by be-fan · 2003-09-16 03:32 · Score: 1

I agree with this assessment somewhat, in that C++ is a generally more nifty language (in the hands of an expert) than most people give it credit for. However, it's got some limitations I wish were addressed ---

It's awfully verbose. Namespaces don't help, because of irritating practical restrictions on the use of "using namespace" in header files.

Its support for functional constructs is very limited. In particular, the lack of type inference and proper lambdas makes functional code painful to write.

C++ is great on the OO and procedural fronts, but it's got some major issues with functional and generic programming. And it needs macros :)

--
A deep unwavering belief is a sure sign you're missing something...
Re:C++ very expressive indeed by angel'o'sphere · 2003-09-16 23:05 · Score: 1

To write functional code you need to stick to template meta programming.

But oh well, that code is really ugly :-)

A good book about it (despite the misleading title) is: Generative Programming from Ulrich Eisenecker (sorry, and another guy who has a too long/strange name to memorize, but that one is also a real coryphaeus)

angel'o'sphere

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
Re:C++ very expressive indeed by lovelace · 2003-09-17 01:56 · Score: 1

Its support for functional constructs is very limited. In particular, the lack of type inference and proper lambdas makes functional code painful to write.
That is actually being addressed by the Boost library. Boost is basically a testing ground for future additions to the language, so if it works out there, there's a good chance it will get added to the standard in a few years as an add-on library, similar to the STL. For lambdas, take a look at The Boost Lambda Library, especially the examples. Other stuff about function objects and higher-order programming can be found here.
Re:C++ very expressive indeed by lovelace · 2003-09-17 02:04 · Score: 1

I'm tired of people confusing any constraint in a language with being handcuffed.
I didn't say anything about being handcuffed. Sure, Java and other OO languages are Turing complete. You can write anything you want in them. Whether you should or not is another matter. Basically, you should always, as much as possible, use the right tool for the job. Sometimes, heck, even a lot of times, OO works well. But there are times when it doesn't, and for those times you're much better off using a language better suited to what you're trying to do.
As an example of using the right tool for the job, have you ever tried to write a real-time 3D application in Java? Yes, I know all about Java3D. But take a look at their simple example of a spinning 3D cube. Every few seconds, it will pause, just for an instant, and then restart. Why? Because the garbage collector is running in the background, and, iirc, the GC Java uses is mark and sweep, which is most certainly not real-time. If you could change out the garbage collector implementation with something like a train GC, it might be better, but, at least with current implementations, that's not possible. So, for something like that, I'll stick to a language where I can control what happens when, because it's the right tool for the job.
Re:C++ very expressive indeed by PhilHibbs · 2003-09-17 02:43 · Score: 1

To write functional code you need to stick to template meta programming.
You can write functional code in templates, but it executes at compile time, which isn't always what you want. Unless you meant lambda functions, which I call 'expression objects'?
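
For readers who haven't seen it, the classic illustration of "executes at compile time" is a factorial written as a recursive template. This is a sketch assuming C++11 (for static_assert); the run-time factorial function is included only for contrast.

```cpp
#include <cassert>

// Compile-time version: the recursion is template instantiation, and the
// "base case" is an explicit specialization.
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};

// Checked by the compiler; nothing is computed when the program runs.
static_assert(Factorial<5>::value == 120, "evaluated during compilation");

// Run-time counterpart, for comparison.
unsigned long factorial(unsigned n) {
    return n == 0 ? 1UL : n * factorial(n - 1);
}
```

The template version only accepts values known to the compiler, which is the limitation being pointed out: Factorial<n>::value with a run-time n does not compile.
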
Re:C++ very expressive indeed by aled · 2003-09-17 10:00 · Score: 1

I remember reading an article about a quake-like shooter game made in java. They say it was fast but I never saw it myself. Perhaps someone could point us to some demo?

--

"I think this line is mostly filler"
Re:C++ very expressive indeed by Anonymous Coward · 2003-09-18 09:39 · Score: 0

Because the garbage collector is running in the background, and, iirc, the GC Java uses is mark and sweep
No. You happen to have used one of the bad implementations in this respect.

Re:your lesson for today by gangien · 2003-09-15 18:32 · Score: 1

IMO, if you can't organize your code such that
you prevent leaks, you probably can't organize
your code to accomplish anything simpler than
bubblesorting a text file.

First, I'm not sure this post isn't a troll.. but whatever. While code organization could have a huge impact on the complexity of a program, this is simply not the reason for most memory leaks in my experience. Most are because of stupid human mistakes, e.g. creating an array and only deleting the first node. These are hard to catch because the syntax and everything appears correct at first glance.

It is attention to detail and the creation of
highly patterned structures of code that allows
software-in-the-large to be successfully
constructed. Not that I've ever tried
designing a production system that utilized
GC, but I can't imagine that forgetting about
when/where you are creating new memory structures
will improve your total comprehension of the
information flow within the software.

Erm then why not do everything in assembly? It's a waste of time to constantly pay attention to details in large projects. Why pay attention to deleting stuff at the proper time, when something like GC can take care of it, at a trivial cost. You increase the speed of development.

While every 30th article on /. bemoans the
outsourcing of programming jobs to India, ask
yourselves this: will they do what it takes
to make sure there are no memory leaks? I'm
fairly certain that they will, and, moreover,
they just consider it a part of creating
quality software.

This is pure speculation, but my opinion is that at first they will be careful about such things, but I think eventually they will realize they can be rather sloppy, because given how little they cost, the people getting the code will simply not care so much. I mean, if I can have one program for 80k and a similar one for 10k, but the 10k one has some slightly annoying bugs.. so what?

Now, in implementing functions that accomplish
both 2a and 2b, there will be memory structures
created that are used in a transient/temporary
fashion to accomplish the creation of data
structures that get attached to the tree of
permanent data structures.

This is where I think you are trolling. What you've said is like no shit, Sherlock, but understanding these things is not the problem, well, in my experience. IMO it is due to human error. It's like asking if a developer has ever made a program that crashed.. mistakes happen. You seem to think that mistakes are for inferior programmers, which is a very arrogant view, and once again, in my experience, is held by extremely ignorant people.

A bit off-topic, perhaps. by bo0ork · 2003-09-15 18:47 · Score: 3, Informative

I wrote an OO language back in 1993 that's being used by two medium-sized companies. It's garbage collected, and its kernel is written in C. The language is not interpreted; it gets translated into C and then compiled. The applications written with the language are fairly large. The source code of one is 28MB uncompressed. I'll skip the general implementation details, and just go over the garbage collection approach I used. These definitions are true for that language; they're not meant to be general.

A program variable is either a global variable, a stack variable, a class variable or an instance variable. Global and stack variables are held in lists. Class and instance variables are kept inside objects.

Every class object has a global variable that always refers to it.

Any object that is not, and cannot become, referenced (directly or indirectly) by a global or stack program variable is garbage.

Each object has a 'not-garbage' flag.

For each global and stack variable, if the referenced object is not marked not-garbage, mark the referenced object as not-garbage, and recurse for that object's contained variables.

Delete all objects that are not marked not-garbage.

There are a few more twists, like handling return values on the stack, but this algorithm correctly handles self-referencing objects no matter the complexity.
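
The mark and delete steps described above can be sketched roughly as follows. This is not the poster's implementation (that kernel is in C and is not shown); Object, Heap, roots and collect are names assumed here purely for illustration, and the "twists" such as return values on the stack are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object model: each object holds its contained variables
// (references to other objects) plus the 'not-garbage' flag.
struct Object {
    std::vector<Object*> fields;
    bool not_garbage;
    Object() : not_garbage(false) {}
};

struct Heap {
    std::vector<Object*> objects;  // every object ever allocated and still alive
    std::vector<Object*> roots;    // stands in for the global and stack variable lists

    Object* allocate() {
        Object* o = new Object();
        objects.push_back(o);
        return o;
    }

    // Mark step: the flag check is what stops the recursion on cycles
    // and self-referencing objects.
    static void mark(Object* obj) {
        if (obj == nullptr || obj->not_garbage) return;
        obj->not_garbage = true;
        for (std::size_t i = 0; i < obj->fields.size(); ++i) mark(obj->fields[i]);
    }

    // Marks from every root, deletes whatever stayed unmarked, and
    // returns how many objects were deleted.
    std::size_t collect() {
        for (std::size_t i = 0; i < roots.size(); ++i) mark(roots[i]);

        std::vector<Object*> survivors;
        std::size_t freed = 0;
        for (std::size_t i = 0; i < objects.size(); ++i) {
            Object* o = objects[i];
            if (o->not_garbage) {
                o->not_garbage = false;   // reset for the next collection
                survivors.push_back(o);
            } else {
                delete o;
                ++freed;
            }
        }
        objects.swap(survivors);
        return freed;
    }

    ~Heap() {
        for (std::size_t i = 0; i < objects.size(); ++i) delete objects[i];
    }
};
```

A recursive mark like this can overflow the call stack on very deep structures; a production collector would more likely use an explicit worklist.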

--
Does everything include nothing?

Re:your lesson for today by bmac · 2003-09-15 19:34 · Score: 1

Erm then why not do everything in assembly? It's a waste of time to constantly pay attention to details in large projects. Why pay attention to deleting stuff at the proper time, when something like GC can take care of it, at a trivial cost. You increase the speed of development.

All a computer program is *is details*. Every
single non-comment line is a crucial detail.
Using C instead of assembly is because C
takes care of details that I have no need of
taking care of. My overall point here is that
if you don't structure your data structures
in reference to the three types of changes
they incur (addition, modification and deletion)
you aren't thinking clearly about your design.

This is where I think you are trolling. What you've said is like no shit, Sherlock, but understanding these things is not the problem, well, in my experience. IMO it is due to human error. It's like asking if a developer has ever made a program that crashed.. mistakes happen. You seem to think that mistakes are for inferior programmers, which is a very arrogant view, and once again, in my experience, is held by extremely ignorant people.

To paraphrase a book on Go from a 7-dan,
he said "the difference between the amateur
and the professional is that the professional
*never* strays from the fundamentals".
Furthermore, *all* his actions are
manifestations of the fundamentals.

And, *absolutely not*, are mistakes for
inferior programmers. Inferior programmers
don't *catch* their mistakes. Being able
to catch your bugs is what separates the
amateur from the professional. Furthermore,
no professional in *any* discipline has
gotten to that level without both making
mistakes and correcting them. The lifelong
amateur just doesn't care enough to correct
his mistakes.

the people getting the code will simply not care so much. I mean, if I can have one program for 80k and a similar one for 10k, but the 10k one has some slightly annoying bugs.. so what?

Hmmm. I better not see you bashing Micro$oft for their bug-ridden, insecure os's :-)

Peace & Blessings,
bmac
www.mihr.com: for true peace & happiness

Re:your lesson for today by gangien · 2003-09-15 20:17 · Score: 1

All a computer program is *is details*. Every
single non-comment line is a crucial detail.
Using C instead of assembly is because C
takes care of details that I have no need of
taking care of. My overall point here is that
if you don't structure your data structures
in reference to the three types of changes
they incur (addition, modification and deletion)
you aren't thinking clearly about your design.

Programs are details; this does not mean programming has to be. Something like GC allows you to abstract a bit more, which increases efficiency, which includes speed of development and lack of bugs, among other things. But your overall point is invalid. If you can't structure your program halfway decently you shouldn't be doing it, well, for money anyhow. Memory leaks that I have seen are mostly silly little things that are inconspicuous. Design and all the attention to detail will not prevent this from happening. To err is human, right? However, when you can automate this, it takes away a possible source of error, which is a good thing, right?

To paraphrase a book on Go from a 7-dan,
he said "the difference between the amateur
and the professional is that the professional
*never* strays from the fundamentals".
Furthermore, *all* his actions are
manifestations of the fundamentals.

GC is straying away from fundamentals? Fundamentals are a relative thing. fundamental in C is vastly different from LISP or java. C you have to manage memory, it's a fundamental. Java, that is not part of the game. And if you use a GC for C lib, it wouldn't be part of C either.

And, *absolutely not*, are mistakes for
inferior programmers. Inferior programmers
don't *catch* their mistakes. Being able
to catch your bugs is what separates the
amateur from the professional.

I agree, but you still cannot catch all your bugs. Debian stable is software from how many years ago? Most people don't even go that far to correct their mistakes. But you seemed to imply that any memory leaks were due to bad programming design. My point is that good design, while it makes these things less frequent, does not eliminate them, and that most of the leaks I've seen were not from bad design. They were from human error.

Hmmm. I better not see you bashing Micro$oft for their bug-ridden, insecure os's :-)

I don't bash them for that, i bash them for being a monopoly and such. Tho i do use that as a point when trying to convince people to switch from windows.

Re:your lesson for today by bmac · 2003-09-15 20:34 · Score: 1

GC is straying away from fundamentals? Fundamentals are a relative thing. fundamental in C is vastly different from LISP or java. C you have to manage memory, it's a fundamental. Java, that is not part of the game. And if you use a GC for C lib, it wouldn't be part of C either.

Yes, GC is just shifting the responsibility,
but the freeing-unused-memory pied-piper must
be paid by something, no matter the language,
no matter the os. And, when it comes right
down to it, we are already developing more
programmers now that *cannot* work on os-level
code (straight C or asm, always) because
they've been indoctrinated into the "just let
another process handle that bit o' trivia".

Also, when it comes time to write an app that
is as fast as possible with the fastest
possible reaction time to the user, a solid
minute of unscheduled, but necessary, GC will
make such an endeavor impossible.

But you seemed to imply that any memory leaks were due to bad programming design. My point is that good design, while it makes these things less frequent, does not eliminate them, and that most of the leaks I've seen were not from bad design. They were from human error.

No, good design *could* eliminate bugs, but we
don't yet know how to make such a design. What
I'm referring to is the *process* of creating
software, which involves both good design and
pattern usage in order to *allow* a
process that can find the bugs that will
inevitably appear.

And, yes, to err is human. But to seek to
find *all* the errors and fix them, ahhh,
that makes a damn good coder.

Peace & Blessings,
bmac
real programmers *never* put a space in
a filename.

Re:your lesson for today by kruntiform · 2003-09-15 21:09 · Score: 1

Yes, GC is just shifting the responsibility,
but the freeing-unused-memory pied-piper must
be paid by something, no matter the language,
no matter the os.

Paid for in what sense? I know from experience that I pay in programming time when doing manual memory management. If you mean paid for in CPU time, you should understand that malloc and free take a significant amount of time already. They are not free. GC does not add a great deal to that cost.

And, when it comes right
down to it, we are already developing more
programmers now that *cannot* work on os-level
code (straight C or asm, always) because
they've been indoctrinated into the "just let
another process handle that bit o' trivia".

A master programmer can write low-level and high-level code. If you are writing os-level code, you use os-level tools, and manual memory management is appropriate; if you are writing high-level code, you use high-level tools, and GC is appropriate. I really don't see your point. There are many programmers out there who can only write low-level code (poorly). Too much focus on the low level means that, for instance, they don't know how to use any data structures except for arrays.

Also, when it comes time to write an app that
is as fast as possible with the fastest
possible reaction time to the user, a solid
minute of unscheduled, but necessary, GC will
make such an endeavor impossible.

That hasn't been true of garbage collectors for decades. We are not in the 70s any more. Garbage collectors these days are incremental. The biggest optimization gains in software (except in some low-level software) are usually won through algorithmic optimizations rather than low-level optimizations. Garbage collection tends to make such optimizations easier or at least gain you some programming time in which to implement them.

Re:your lesson for today by gangien · 2003-09-15 21:13 · Score: 1

Yes, GC is just shifting the responsibility,
but the freeing-unused-memory pied-piper must
be paid by something, no matter the language,
no matter the os. And, when it comes right
down to it, we are already developing more
programmers now that *cannot* work on os-level
code (straight C or asm, always) because
they've been indoctrinated into the "just let
another process handle that bit o' trivia".

I know it's a very popular opinion here that details must be paid attention to. But when you have a language that, though perhaps a bit slower, allows you to abstract more, you gain a lot more than you lose. And oftentimes you will have a program that is faster than one you would have developed without that abstraction, because you're more worried about solving the problem than about all those details. And so what if there are more programmers who don't know these things? Hell, this is good, less competition for me :) Programming is changing, as everything else does. In 100 years, I would be surprised if anyone but the highly elite paid any attention to things like memory management, assuming we still have similar computers.

Also, when it comes time to write an app that
is as fast as possible with the fastest
possible reaction time to the user, a solid
minute of unscheduled, but necessary, GC will
make such an endeavor impossible.

This happens when? You're going to get leaps and bounds more improvement from hardware and better algorithms than from optimizing your code. And I believe the military uses mostly Ada for their stuff. I've never done Ada, but I thought Ada had automatic memory management; I could be wrong on this. But still, all these slowness things people love to bring up are overrated; they are largely irrelevant and trivial. Yes, there are cases where they are not. But as I said before..

No, good design *could* eliminate bugs, but we
don't yet know how to make such a design. What
I'm referring to is the *process* of creating
software, which involves both good design and
pattern usage in order to *allow* a
process that can find the bugs that will
inevitably appear.

I dunno what you're trying to say.. we will never reach perfection, we will just get closer and closer, perhaps we'll reach a point when it's trivial. But we're lightyears from that. So memory leaks will happen.. if you use GC they won't happen except if there's an error in the GC itself.

GC costs by greppling · 2003-09-15 21:14 · Score: 5, Informative

One thing you didn't mention is that GC is deemed to have pretty high processor cache-miss costs. The obvious part is that the GC run itself is basically pointer chasing, i.e. pretty much the worst thing you can do cache-wise. And after the GC run, the cache is clobbered with stuff useless for continuing the work.

There is another indirect cost pointed out by Linus Torvalds in a lengthy post to the gcc mailing list. The executive summary is that (he thinks that) memory that is not to be used anymore should be freed immediately. Otherwise, the data in there will keep lying around in the data cache. Also, he claims that explicit ref-counting gives you advantages for optimization: Assume you have to make some modifications to a data structure, but you don't want other parts of the program to see the modifications. Without ref-counting, you have to copy the whole data structure before modifying it. With ref-counting, you can omit the copying if you are the only one with access to the data structure.
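
The copy-avoidance argument can be illustrated with a reference-counted handle. The sketch below uses std::shared_ptr (C++11) as a stand-in for whatever hand-rolled counting the post has in mind; ensure_unique and Data are names assumed for this example, and the use_count() check is only meaningful if no other thread can take a reference concurrently.

```cpp
#include <cassert>
#include <memory>
#include <vector>

typedef std::vector<int> Data;

// Hypothetical helper: makes p the sole owner of its data before a private
// modification. Copies only when the count says someone else can still see it.
// Returns true when a copy had to be made.
bool ensure_unique(std::shared_ptr<Data>& p) {
    if (p.use_count() == 1) return false;   // only we hold it: modify in place
    p = std::make_shared<Data>(*p);         // shared: detach a private copy first
    return true;
}

void copy_on_write_demo() {
    std::shared_ptr<Data> a = std::make_shared<Data>(3, 7);  // three elements, all 7
    assert(!ensure_unique(a));              // unshared, so no copy

    std::shared_ptr<Data> b = a;            // now two owners
    assert(ensure_unique(b));               // b gets its own copy
    (*b)[0] = 99;
    assert((*a)[0] == 7 && (*b)[0] == 99);  // a never sees the modification
}
```

A tracing collector keeps no per-object count, so code running under it cannot ask "am I the only one?" and has to copy unconditionally to get the same isolation.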

And finally, he thinks that GC makes it too easy to write pointer-chasing-heavy code---as that kind of code is bad for cache behaviour all the time.

It is an ongoing discussion whether GC really has that bad effects on performance of GCC. But Linus Torvalds seems to have very good points. (And some of them certainly cannot be taken into account in a "GC cost is less than hand-written memory management"-paper.)

Re:GC costs by Hard_Code · 2003-09-16 00:09 · Score: 3, Interesting

Can anybody informed tell me whether we have not ALREADY lost the war against pointer-chasing and cache clobbering? Any OOP or interpreted language (the vast majority of mainstream code) is doing this already, true?

--

It's 10 PM. Do you know if you're un-American?
Re:GC costs by be-fan · 2003-09-16 03:25 · Score: 2, Insightful

Your description is slightly inaccurate. He said that explicitly freeing allows the next alloc to reuse a given chunk of cache-hot memory, while the GC will ignore that memory and allocate a cache-cold chunk instead.

--
A deep unwavering belief is a sure sign you're missing something...
Re:GC costs by dvdeug · 2003-09-18 21:25 · Score: 1

But Linus Torvalds seems to have very good points.

After reading many of Linus Torvalds' posts, I think it's useful to remember where he's coming from. He's spent ten years writing a kernel, with some of the best programmers, where every line of code has been rewritten several times. Yes, in that environment, garbage collection won't be a huge win. But you don't always have forever to work on one project; you don't always have a team of crack programmers on the job; and a lot of times, efficiency is not of primary importance.

Re:your lesson for today by bmac · 2003-09-15 22:37 · Score: 1

I dunno what you're trying to say.. we will never reach perfection, we will jsut get closer and closer, perhaps we'll reach a point when it's trivial. But we're lightyears from that. So memmory leaks will happen.. if you use GC they won't happen except if there's an error in the GC itself.

"Seek and ye shall find." If you do *not*
seek perfection, you will get just that, and
when you say "we" you are really saying
"I". And to the legions of people who
don't try harder, I say "cheers". Keep up
the mediocre work. In the meantime, what I
seek shall keep getting the same response:

I don't see your point.

At least I can say I tried. Just remember,
it may just be that not "everyone is lightyears
away from that".

Peace & Blessings,
bmac

www.mihr.com: where is that dark matter?

It's okay by Anonymous Coward · 2003-09-16 00:41 · Score: 4, Interesting

I've used a garbage collection system in a C project before and it works surprisingly well. The problem with GC in C though is that it is possible and legal to,

o allocate memory
o write the pointer to a disk
o lose the pointer in memory
o read the pointer back off the disk,
o make use of the pointer

With all GC strategies I'm aware of, by the time you read the pointer from the disk the memory may well have been freed.

I'm not saying that this style of programming is a generally good idea but it is used in certain, specialised situations and therefore not suitable for a garbage collecting language.
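
The sequence listed above looks roughly like this in code. It is a sketch assuming C++11; a std::stringstream stands in for the disk file so the example stays self-contained, and round_trip is a made-up name. No collector is involved here, so nothing is actually freed; the comment marks the moment at which one could act.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>

// Writes the pointer's numeric value out as text, drops the in-memory copy,
// then reads the value back and turns it into a pointer again.
int* round_trip(int*& p) {
    std::stringstream disk;                        // stand-in for a file on disk
    disk << reinterpret_cast<std::uintptr_t>(p);   // the address now exists only as digits
    p = nullptr;                                   // the pointer is "lost" in memory

    // A collector scanning memory here finds nothing that looks like a
    // reference to the block and would be free to reclaim it.

    std::uintptr_t bits = 0;
    disk >> bits;
    return reinterpret_cast<int*>(bits);           // valid without GC, dangling with it
}

void pointer_hiding_demo() {
    int* p = new int(42);
    int* q = round_trip(p);
    assert(p == nullptr);
    assert(q != nullptr && *q == 42);
    delete q;
}
```
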

Re:It's okay by LWATCDR · 2003-09-16 03:26 · Score: 1

Well, yes, you could do that, but... don't. Just, for goodness' sake, don't do that! While it could work, I cannot imagine any reason to do such a thing.

--
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Re:It's okay by You're+All+Wrong · 2003-09-16 11:27 · Score: 1

Maybe that's a problem with your imagination. You'll be telling me next that Knuth never propagated the concept of the XOR DLList, and dancing pointers?

If Boehm says to Knuth "but you shouldn't be doing that", then I think it's perfectly acceptable for Knuth to respond "but _you_ shouldn't be doing _that_".

YAW.

--
Your head of state is a corrupt weasel, I hope you're happy.

Boehm collector on large application by Inode+Jones · 2003-09-16 01:06 · Score: 2, Informative

I am using the BDW collector in an EDA tool. EDA tools store large databases of circuit connectivity, and for various reasons we don't want to be bothered with explicit memory management.

The salient points:

Destructors are not Called

If an object is allocated in collectible memory, then its destructor will not be called when the object is collected. Therefore, destructors are pretty much useless and your code must be designed to work without them.

Actually, if your object derives from class gc_cleanup, then its destructor will be called. However, due to the handling of cleanup functions in the BDW collector, cycles of such objects will never be collected. For this reason, I don't use gc_cleanup much.

Allocating Collectible Memory

By default, C++ allocates objects in the "malloc" heap. The BDW collector maintains a separate heap. In effect, there are four types of memory:

scannable and collectible (GC)
scannable, but uncollectible (NoGC)
non-scannable, but collectible (GC_atomic)
non-scannable, non-collectible (malloc)

"Scannable" refers to the property that objects in the heap are scanned for pointers. "Collectible" refers to the property that objects in the heap will be deallocated if no further references are found.

These four memory types are an issue when you interact with STL and third-party class libraries. By default, STL uses the malloc heap. If you want, say, a std::vector in collectible memory, then you need to write an allocator to get it. The most recent versions of the collector come with such a beast; the version I started using did not.
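
As a rough sketch of what "write an allocator" means here: the container is parameterised on a small class that supplies raw storage. The code below does not use the collector's API. heap_alloc and heap_free are hypothetical stand-ins (backed by malloc/free so the example runs anywhere), and a real port would call the collector's own allocation functions in their place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Bytes requested so far; present only so the effect can be observed.
static std::size_t g_bytes_requested = 0;

// Hypothetical stand-ins for a collector's allocation entry points.
inline void* heap_alloc(std::size_t n) {
    g_bytes_requested += n;
    return std::malloc(n);
}
inline void heap_free(void* p) { std::free(p); }

// Minimal C++11 allocator: containers ask it for raw storage instead of
// going to the default heap.
template <typename T>
struct HeapAllocator {
    using value_type = T;

    HeapAllocator() = default;
    template <typename U>
    HeapAllocator(const HeapAllocator<U>&) {}

    T* allocate(std::size_t n) {
        void* p = heap_alloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) { heap_free(p); }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const HeapAllocator<T>&, const HeapAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const HeapAllocator<T>&, const HeapAllocator<U>&) { return false; }

// A vector whose buffer comes from heap_alloc rather than operator new.
std::size_t fill_vector() {
    std::vector<int, HeapAllocator<int>> v;
    for (int i = 0; i < 100; ++i) v.push_back(i);
    assert(v.size() == 100);
    return g_bytes_requested;
}
```

Note that the allocator type becomes part of the container's type, which is one reason mixing such containers with third-party code that expects plain std::vector takes care.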

Similarly, std::string is reference-counted, and in the malloc heap. Here, rather than using an allocator to force it into the collectible heap, I wrote my own lightweight GCString class, which stores the string as an immutable object, and relies on the collector for cleanup.

Third-party class libraries such as ANTLR may use reference-counted objects; you need to bridge between GC and non-GC applications carefully.

Re:your lesson for today by mbrezu79 · 2003-09-16 01:13 · Score: 1

bmac,

have you read Structure And Interpretation Of Computer Programs (SICP) by Abelson, Sussman and Sussman (it's available online, as an MIT *introductory* CS course)?

They use Scheme as their programming language and, in the space of ~600 pages they go from "let's write an expression in Lisp/Scheme" to "let's write a Scheme compiler". Lisp/Scheme uses GC. When they write the compiler, they provide a simple implementation of a GC in the pseudo-assembly they compile to. It's possible to use a language with GC and know *extremely* well what's going on under the hood. And they use a language with a GC to be able to teach sophisticated topics, like logic programming and symbolic differentiation.

I also read Kageyama's book on Go, and ever since I wondered what are the fundamentals of CS. I think many of them are described in SICP, and the most important ones are procedure abstraction and data abstraction.

Also, computers are about automating automatable tasks in a manner that can be "proven" to be "correct". GC is an automation of memory reclamation which works well (although not "correctly" in some cases because of the environment -- programs with GC have to interact with libs that may fool the GC). It has drawbacks, yes, but then so does everything else.

Attention to detail? Yes, but remember that you have to automate/abstract/detect patterns of data/behaviour, otherwise we'd be stuck with machine code -- no one would have invented a programming language in the first place.

Just my 2 cents.

Costs in the OpenCM collector by Jonathan+S.+Shapiro · 2003-09-16 01:40 · Score: 1

The memory management cost for GC is basically the same as for malloc/free. It's just amortized in a different place.

It turns out, however, that there are natural places to do GC, and a little help from the application can go a very long way. In the OpenCM collector, we mark procedures that return pointers using a special GC_RETURN macro. This works because at the return from a procedure all of its local variables are known to be unreachable. The only surviving objects are the ones that are reachable from the returned pointer (idea for this is due to Keith Packard).

By using this discipline, we actually blur the distinction between managed and unmanaged collection. The results look very good from a performance perspective.

However, I should acknowledge that this is partly due to the structure of our application. Servers are "in and out". They generate a lot of garbage during a given query and then release essentially all of it. Procedure return is therefore a natural collection point. The same experience might not apply in systems that hold large amounts of memory in the heap during long-running computations, with lots of temporary allocation.

--
Jonathan S. Shapiro (The EROS Guy)

The wonders of moderation by Latent+Heat · 2003-09-16 01:40 · Score: 2, Interesting

The original discussion was whether anyone used/benefited from a C++ implementation of GC. A poster responded with a link to Stroustrup pointing out that C++ is so expressive that you don't need GC -- you can embed the memory management in stack-frame objects which take care of it for you.

I pointed out that Stroustrup's example shows the expressive power of C++, but there is a big "huh?" factor of reading the code on account that many of us mere mortals are not rehearsed in the use of templates and STL, and there was something to be said for Java and GC, not just for safety but for simplicity of expression and code reading by maintenance programmers.

Yours is one of three comments to my remarks, answering questions I had raised, disagreeing with some points, agreeing with others, but otherwise engaging in a reasoned discussion of the merits of C++ and its advanced features (templates and STL). But I get moderated down and flagged "troll" -- oh well.

Re:The wonders of moderation by Anonymous Coward · 2003-09-16 07:16 · Score: 0

Sometimes people running IE will moderate a post positively, then try to use the mouse wheel to move up or down the page while the drop-down list control still has the focus. The result is the well-known MODS ON CRACK!!!!1!! effect. I've done it before, myself, but always caught it in time.

The correct solution would be for the Slashdot guys to use a different control (radio-button list?) for moderation. But I wouldn't hold my breath if I were you.
Re:The wonders of moderation by PhilHibbs · 2003-09-17 02:26 · Score: 1

Maybe the moderator mistook your question for socratic irony.
Re:The wonders of moderation by Anonymous+Brave+Guy · 2003-09-17 11:30 · Score: 1

I'm afraid you just mentioned a few standard issue rants about C++ (all of which, as you can tell from the replies, are actually pretty groundless). The tone of your post wasn't clear -- I thought you were trolling initially as well -- so just put it down to bad luck, I guess.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Don't Free Memory Unless You Have To by brlewis · 2003-09-16 02:31 · Score: 1

I looked at Stroustrup's two examples. It looks like his first example does not involve freeing any memory at all. Am I right? His second example seems to use auto_ptr to assure that an object is freed when the function where it's allocated returns. Is that all it's doing? I would expect the situations where people get memory leaks to be more complex than auto_ptr could handle.

Anyway, he never mentions garbage collection; just easier "explicit" management. (I put "explicit" in quotes, because malloc/free has to manage free blocks, so it's not as manual a process as you might think. For some applications, Boehm's collector is actually faster.)

Re:Don't Free Memory Unless You Have To by haystd · 2003-09-16 03:47 · Score: 2, Informative

With the first example I believe the point is that the string and vector classes will clean up after themselves when they go out of scope (when their destructors are called). The STL is very helpful, especially when supplemented with the Boost libraries.
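
A small sketch of that point, not Stroustrup's actual example: Tracked and g_live are names invented here to make the destructor calls observable. No delete or free appears anywhere, yet everything is released when the vector goes out of scope.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

static int g_live = 0;   // number of Tracked objects currently alive

// Element type that counts its own constructions and destructions.
struct Tracked {
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) { ++g_live; }
    Tracked(const Tracked& other) : name(other.name) { ++g_live; }
    ~Tracked() { --g_live; }
};

// Returns how many elements were alive just before the scope ended.
int live_inside_scope() {
    std::vector<Tracked> items;
    items.push_back(Tracked("a"));
    items.push_back(Tracked("b"));
    items.push_back(Tracked("c"));
    return g_live;   // temporaries are already gone; the vector holds 3
}                    // items, its elements and their strings are destroyed here
```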
Re:Don't Free Memory Unless You Have To by You're+All+Wrong · 2003-09-16 11:00 · Score: 1

I can see plenty of freeing of memory in the first example.
In case you missed it, here it is again:
}

I remember racing C code with (compiled) Scheme once, and the two were pretty close for the kind of task I was doing. However, when my problem size reached a certain point, GC would have to kick in and Scheme was suddenly relegated to the vastly-slower-than-C camp, with so many other languages that otherwise would have plenty of merits. So I admit that gave me an instant anti-GC bias. (That and the _disaster_ that happens when your doofus boss employs Java programmers for a C++ project, and suddenly everything leaks like Mr. goatse's rear end.)

YAW.

--
Your head of state is a corrupt weasel, I hope you're happy.

Umm... by Anonymous Coward · 2003-09-16 02:58 · Score: 0

How is incrementing/decrementing a reference count "*far* more compute-intensive" than scanning memory?

Re:Umm... by swillden · 2003-09-16 09:49 · Score: 1

Because you do it much more often. Google a little; plenty of studies have shown that refcounting is much slower than more sophisticated collection methods.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:Umm... by Anonymous Coward · 2003-09-16 10:56 · Score: 0

For instance, a compacting, generational mark-sweep GC is much better than reference counting:
*Compact: it compacts memory
*Mark-Sweep: it marks all the objects in use and later sweeps the heap
*Generational: it divides the heap into a few sub-heaps by object age
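
To make the mark-sweep part concrete, here is a toy sketch. It is neither compacting nor generational, and Object, Heap and demo_mark_sweep are names invented for the illustration. It assumes the caller can list the roots explicitly, which a real collector has to discover by scanning stacks and globals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Object {
    bool marked = false;
    std::vector<Object*> refs;   // outgoing pointers to other objects
};

class Heap {
public:
    Heap() = default;
    Heap(const Heap&) = delete;
    Heap& operator=(const Heap&) = delete;
    ~Heap() { for (Object* o : objects_) delete o; }

    Object* allocate() {
        objects_.push_back(new Object());
        return objects_.back();
    }

    // Frees every object not reachable from roots; returns how many were freed.
    std::size_t collect(const std::vector<Object*>& roots) {
        // Mark phase: walk the graph from the roots.
        std::vector<Object*> worklist(roots);
        while (!worklist.empty()) {
            Object* o = worklist.back();
            worklist.pop_back();
            if (o->marked) continue;
            o->marked = true;
            for (Object* r : o->refs) worklist.push_back(r);
        }
        // Sweep phase: delete unmarked objects, clear marks on survivors.
        std::size_t freed = 0;
        std::vector<Object*> survivors;
        for (Object* o : objects_) {
            if (o->marked) {
                o->marked = false;
                survivors.push_back(o);
            } else {
                delete o;
                ++freed;
            }
        }
        objects_.swap(survivors);
        return freed;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::vector<Object*> objects_;   // every object ever allocated and still alive
};

// One reachable pair plus one unreachable cycle; the cycle is reclaimed.
std::size_t demo_mark_sweep() {
    Heap heap;
    Object* root = heap.allocate();
    root->refs.push_back(heap.allocate());
    Object* a = heap.allocate();
    Object* b = heap.allocate();
    a->refs.push_back(b);            // a and b point at each other,
    b->refs.push_back(a);            // but nothing reachable points at them
    std::size_t freed = heap.collect({root});
    assert(heap.size() == 2);
    return freed;
}
```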
open4free

Very happy... by DrCode · 2003-09-16 04:07 · Score: 2, Insightful

A few years ago, I used the Boehm GC when writing a pair of compilers (Verilog/VHDL) in C++. I was very happy with the result, since it was rare for GC even to get called at all. It was also surprising how much simpler code gets when you don't have to worry about deleting objects.

Memory Mapped Files by Frans+Faase · 2003-09-16 04:45 · Score: 1

A perfect solution for when you want to write data to a disk is to use a Memory Mapped File. You can write data to a file and still keep it in memory. GC will just work correctly, although using GC with a Memory Mapped File may cause the data to be read in again every time a GC occurs.

I once wrote some classes to work with Memory Mapped Files (under Windows) in an almost transparent manner. It works great for making complex C++ object hierarchies persistent.

ILOG Solver / Scheduler by Koos+Baster · 2003-09-16 04:59 · Score: 1

ILOG Solver & Scheduler are mainstream commercial third-party C++ libraries based on the constraint programming paradigm. One of the major features is ILOG's automatic garbage collection heap, which automatically deallocates memory (based on assumptions about program flow). To make this efficient, they skip all deallocations (using a longjump, rather than a return).

At first this may look like an elegant way to get rid of complicated memory management & garbage collection without losing efficiency. However, in my personal experience, it is completely horrible when combined with parts that use the normal system heap. Specifically, when writing your own constraints, goals or demons, it is practically impossible to use anything but ILOG's solver heap.

I guess this is one of the major reasons why they recently made a technology change and launched JSolver, a Java-based counterpart.

--
Ninety-Ninety Rule of Project Schedules: The first ninety percent of the task takes ninety percent of the time, and the last ten percent takes the other ninety percent.

garbage collector for garbage code by Anonymous Coward · 2003-09-16 06:10 · Score: 0

if you need garbage collector your code must
be filled with garbage. i, on the other hand,
need only malloc() and free() and still my
code manages to work well without memory leaks
(thanks to valgrind).

Qt has nice garbage collection by tvm662 · 2003-09-16 06:12 · Score: 1

The Qt toolkit (on which KDE is based) has a nice garbage collection facility. All of the widgets derived from the base class QWidget take care of deleting child widgets that are also derived from QWidget, including user defined types. This means you can add, remove or move widgets in your user interface without having to worry about the corresponding delete.
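
The ownership rule can be sketched in plain C++. This is not Qt code and does not use the Qt API; Node and g_destroyed are made-up names that only mimic the idea of a parent deleting its children.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

static int g_destroyed = 0;   // counts destructor calls for the demo

class Node {
public:
    // Passing a parent registers this node as one of the parent's children.
    explicit Node(Node* parent = nullptr) : parent_(parent) {
        if (parent_) parent_->children_.push_back(this);
    }
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;

    virtual ~Node() {
        // If the parent is still alive, detach so it will not delete us again.
        if (parent_) {
            auto& siblings = parent_->children_;
            siblings.erase(std::remove(siblings.begin(), siblings.end(), this),
                           siblings.end());
        }
        // Destroy every child; clearing parent_ first skips their detach step.
        for (Node* child : children_) {
            child->parent_ = nullptr;
            delete child;
        }
        ++g_destroyed;
    }

private:
    Node* parent_;
    std::vector<Node*> children_;
};

// Builds window -> panel -> {button, label} and deletes only the window.
int demo_tree() {
    g_destroyed = 0;
    Node* window = new Node();
    Node* panel = new Node(window);
    new Node(panel);   // "button", owned by panel
    new Node(panel);   // "label", owned by panel
    delete window;     // takes the whole subtree with it
    return g_destroyed;
}
```

This is ownership by tree structure rather than garbage collection in the tracing sense: an object that is never given a parent still has to be deleted by hand.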

Tom.

Re:Qt has nice garbage collection by cant_get_a_good_nick · 2003-09-16 10:54 · Score: 1

Apache isn't bad either. You can create things based off a pool. Pools have different lifetimes (per server, per connection, per request). Upon deletion of the pool, all resources associated with that pool are destroyed.
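
A minimal sketch of the pool idea in C++, assuming nothing about Apache's actual implementation: Pool, make, Buffer and demo_request are hypothetical names. Each allocation registers a cleanup with the pool, and destroying the pool runs all of them.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Owns everything created through it; releases it all when destroyed.
class Pool {
public:
    Pool() = default;
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    ~Pool() {
        // Run cleanups in reverse order of registration.
        for (auto it = cleanups_.rbegin(); it != cleanups_.rend(); ++it) (*it)();
    }

    // Constructs a T whose lifetime is tied to this pool.
    template <typename T, typename... Args>
    T* make(Args&&... args) {
        std::unique_ptr<T> holder(new T(std::forward<Args>(args)...));
        T* raw = holder.get();
        cleanups_.push_back([raw] { delete raw; });
        holder.release();   // the pool's cleanup list owns it from here on
        return raw;
    }

private:
    std::vector<std::function<void()>> cleanups_;
};

static int g_open_buffers = 0;
struct Buffer {
    Buffer() { ++g_open_buffers; }
    ~Buffer() { --g_open_buffers; }
};

// Simulates one request: everything allocated for it dies with its pool.
int demo_request() {
    int during = 0;
    {
        Pool request_pool;               // lifetime: one request
        request_pool.make<Buffer>();
        request_pool.make<Buffer>();
        during = g_open_buffers;
    }                                    // pool destroyed, both buffers freed
    assert(g_open_buffers == 0);
    return during;
}
```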

Good Book on Garbage Collection by umofomia · 2003-09-16 08:51 · Score: 2, Insightful

There's a really good book about everything you ever needed to know about garbage collection. Although most of the book deals with garbage collection techniques in general, it has two complete chapters devoted to implementing and using garbage collectors in C and C++ and which ones you should use depending on your application needs.

You need GC... by pragma_x · 2003-09-16 09:22 · Score: 1

...whenever you find yourself writing an overly-complicated means to overcome issues of object/memory 'ownership'.

(Granted, one could say that this would apply to the GC itself, but not necessarily so)

The trick is, memory is a 'resource' and as such is subject to acquisition and release steps in order to maintain it properly. If the notion of ownership of memory is ambiguous, you need to normalize your data somehow so you get back to a 1:n relationship between owners and acquired resources. This happens frequently in situations where objects have an n:n relationship, usually a network of objects of one homogeneous type.

The easiest way to achieve such a normalization, short of drastically changing your system design or coding practice, is to plug in a GC to do object/memory management for you.

As a side note:
Reference counting can help with this style of problem too, but it utterly fails to cope with cycles (mutually pointing objects). See: COM. It can be used *very* effectively if you code with this shortcoming in mind. Some GC's even use ref-counting internally to clear out the bulk of objects, leaving just abandoned cycles to be garbage-collected.
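
The cycle problem is easy to reproduce with std::shared_ptr, used here as a convenient stand-in for any reference-counting scheme (it was not part of the standard library when this thread was written). StrongNode, WeakNode and g_live_nodes are names made up for the sketch. Two nodes that hold strong references to each other survive after the last outside reference is gone; making the back-reference weak is the usual way to code with the shortcoming in mind.

```cpp
#include <cassert>
#include <memory>

static int g_live_nodes = 0;   // nodes currently alive

struct StrongNode {
    std::shared_ptr<StrongNode> other;   // owning reference
    StrongNode() { ++g_live_nodes; }
    ~StrongNode() { --g_live_nodes; }
};

struct WeakNode {
    std::weak_ptr<WeakNode> other;       // non-owning reference
    WeakNode() { ++g_live_nodes; }
    ~WeakNode() { --g_live_nodes; }
};

// Returns how many nodes are still alive after all outside owners are gone.
int strong_cycle_survivors() {
    g_live_nodes = 0;
    std::weak_ptr<StrongNode> watch;     // observes without owning
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;                    // cycle: each keeps the other alive
        watch = a;
    }
    int survivors = g_live_nodes;        // both nodes outlived their owners

    // Break the cycle by hand so the demo itself does not leak.
    if (auto a = watch.lock()) a->other.reset();
    assert(g_live_nodes == 0);
    return survivors;
}

int weak_cycle_survivors() {
    g_live_nodes = 0;
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->other = b;
        b->other = a;                    // no ownership in either direction
    }
    return g_live_nodes;
}
```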

GC and GCC by devphil · 2003-09-16 09:52 · Score: 1

Many people are not aware that GCC itself uses garbage collection as it runs. You can actually select which algorithm gets used at configure time, and tweak the GC parameters during runtime (via a growing set of command-line options that users never think to use).

That aside: I've corresponded with Linus a couple times (on other subjects), and while he is the brilliant guy that /. thinks he is, he is a kernel expert, not a compiler expert. Entirely different problem domain, very different approaches to solutions. Not every compiler is alike, not every GC strategy is alike, and most of the GC strategies out there are not appropriate for use within GCC. (Note: within GCC, not by a program compiled with GCC.)

What the GCC maintainers have known for a long time -- due to actual analysis of the compiler, not "this tends to work elsewhere and in other programs" -- is that the current GC strategy is suboptimal. There's even a design for a good replacement. None of the volunteers have had time to write it yet. And on that note, I'll leave you with a quote from Torvalds: "What we need is less people running around and telling everyone else what to do and more people actually writing code."

--
You cannot apply a technological solution to a sociological problem. (Edwards' Law)

Smart Pointers by stormcoder · 2003-09-16 10:06 · Score: 1

You can get the same effect using Smart Pointers and not give up the control that using a garbage collection system entails. See Boost and Alexandrescu, Andrei. Modern C++ Design. There is also a nice article on CUJ.

--
Sorry my bullshit sensor overloaded.

Those that do not use GC are bound to reinvent it by Anonymous Coward · 2003-09-16 10:26 · Score: 0

If a project is complex enough, you're either going to have to use an existing garbage collector or invent your own. These people who claim that there's no need for GC just haven't had a complex enough project (compilers come to mind).

You can do some things to simplify things: don't share -- make copies instead, use smart pointers (which is really just reference counting GC).

In the end though, you're going to have to start doing pointer scans to get rid of the cycles, and at that point you might as well import an already debugged GC rather than roll your own.

Apache pool collector by Anonymous Coward · 2003-09-16 12:27 · Score: 0

is not great to use in a hybrid C and C++ application. Why? C++'s new operator does not generally take a memory pool as a parameter and the lifetimes of the objects allocated by Apache (on a connection, let's say) do not match the lifetime of the C++ heap allocated objects you may be using. Something could be rigged to have Apache fire callbacks to dispose of C++ resources when a memory pool dies - but this is way too complicated.
If someone was able to get the Apache Portable Runtime working with C++ in a sensible way, I tip my hat to you.

Re:your lesson for today by bmac · 2003-09-16 13:41 · Score: 0, Troll

Wow! You even realized it was Kageyama's
book. Man, that is strange, but cool.
BTW, I didn't read the whole book because
I realized that really learning Go was a
full-time occupation and I already have a
few of those :-) But Kage was the best
writer though and I got some good info out
of the few tens of pages I did read.

I hate to keep beating a dead horse, but
my whole point here is the programmer
should know exactly what data is no longer
connected and suitably delete it immediately.
And I know this is lost on the small-focused
but this is indicative of our society -- we
just assume someone else will take care of
"it", whatever it is, and no matter that the
long-term costs are far greater than the
short-term cost of taking care of the detail
yourself. We package individual pieces of
candy, and I'm nearly the only person who
finds that completely ridiculous. Sure, it
serves its purpose, but with a more broad-
minded approach to life (and respect for
future generations), we could design systems
for food delivery that used 10% of the current
resources. But, like I said, no one wants
to inconvenience themselves to improve the
greater efficiency.

And I'm not gonna flame the Scheme/lisp folks,
as a matter of fact I have recently begun
reading said book because there are no more
interesting CS books (at least at B&N, except
for the DirectX & OpenGL books (and I have
no time for all that fun)), so I figured that
I'd trip down the Lisp lane for awhile.
Anyhow, I don't see any os's written in lisp,
and scant few actual programs. (Autocad, one
of the finest pieces of software in the world,
uses lisp for its scripting, but was written
completely in C). Sure, lisp/scheme is neat
for learning and "concepting" and tail-recursion
is neat, but what gets the job done on a
register-based processor is iterative loops,
and Haskell and Lisp just don't map down
better than C or even C++ (without the STL)
will.

Most programmers do not even realize the power
under the hood of even a Pentium II 400 (which
I use). When you write straight C direct to
win32 api calls, the program response is
*immediate*. The program displays *instantly*
and all functions *fly*. And then I load
the SunOne IDE and it takes friggin one
minute. Someone's missing the point here, and
the person who figures out how to write bullet-
proof C code for straight win32 api is going
to rock some dollars.

Peace & Blessings,
bmac
For the fundamentals of life: www.mihr.com

Automated Object Management in NewJ for C++ by domKing · 2003-09-16 14:23 · Score: 1

We considered using the Boehm collector for our commercial product, NewJ Library for C++. But we opted not to due to Boehm's lack of predictable object destruction and its maintenance of separate heaps, complicating integration with existing libraries. These issues are covered in detail in a recent C/C++ Users Journal (CUJ) article on the Boehm collector.

Instead, we developed our own automated object management facility based on reference objects, that is, "smart pointer" objects with these new capabilities: Reference objects in C++ are completely synonymous with object references in Java. Unlike traditional smart pointers, reference objects support inheritance, both single-implementation inheritance and multiple-interface inheritance. These inheritance characteristics also apply to reference objects of arrays. They can also be built to be rigorously threadsafe and secure.

Reference objects have allowed us to port Java source code directly to C++ with virtually no changes. They have also allowed us to take advantage of productive Java language features in new C++ projects without the overhead or installation of any JRE or VM. delete has disappeared from our C++ code entirely. At the same time, our C++ is still characteristically lean and interoperates naturally with other C/C++ libraries and APIs, such as ANSI C++ STL, Win32 API, and MFC.

We have used these concepts to implement the core Java API in C++, for C++, although reference objects themselves are not specific to this API. They are implemented in our low-level Pie Library and usable in any C++ application.

Info and features of NewJ and Pie Library

The NewJ C++ Developer's Guide explains reference objects more fully. It is available as a free download (registration required).

the key to it all by pyrrho · 2003-09-16 15:26 · Score: 1

throughout your post I was thinking, "yes but...", "well put, but..." and then I reached this, which is what I agree with and the only problem I have with GCs.

>The reason is, of course, that you build the pooling based on your knowledge of the actual usage characteristics of the objects; knowledge that no general-purpose memory manager can possibly have.

I find this in general... Garbage Collectors, not unlike VMs, do well when they know their problem domain well. So, it's common to use a Garbage Collector to manage a special heap, say of fixed size objects that are often allocated/deallocated.

The secret to memory management is "don't let things get out of hand", understand how you are using memory. It's not like having to type in a laborious syntax, like say, trying to do OOP in Fortran, which just doesn't have a friendly syntax for that... iow, it's really worthwhile to think about memory management issues, who owns memory, etc, and not just for the free() call, but for the design itself. It pays back a lot to think of these things, and use GC for particular cases that can benefit.

For this reason I don't like GC based languages (or at least, I don't like that aspect of them), while I have nothing against GC in general.

--

-pyrrho

Re:the key to it all by swillden · 2003-09-17 05:21 · Score: 2, Insightful

>iow, it's really worthwhile to think about memory management issues, who owns memory, etc, and not just for the free() call, but for the design itself. It pays back a lot to think of these things, and use GC for particular cases that can benefit.
I agree with this, actually. I've seen many cases where having to think about object lifetimes has given me clearer insights into the problem domain and into the design, and resulted in better, tighter, cleaner and more maintainable code than would have been the case otherwise.
However, I've also seen some systems where the cleanest, most flexible and most maintainable design made keeping track of object lifetimes a real bitch. So while I wholeheartedly agree that being forced to think about memory management can improve the overall quality of most designs, there are circumstances where having to manage memory manually forces you to choose an inferior application architecture. In those cases, GC is a *huge* win, since choosing the right structure is the single most important thing you can do to facilitate both implementation and maintenance.
In addition, there are those cases where the code just doesn't matter that much; and GC is certainly an aid to getting the job done quickly, because it removes a large class of concerns from the programmer.
I'm not necessarily a huge fan of GC, sometimes it's great, sometimes it sucks -- it's just another tool. But I had to respond in this thread because there are a lot of misconceptions about GC (and the alternatives, like refcounting smart pointers).

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.

writing pointers to disk by pyrrho · 2003-09-16 15:39 · Score: 1

... well thanks, you made my head explode by saying that!....

Luckily I don't need my head to type. I can no longer read the articles on slashdot... but that doesn't matter, I can still post.

--

-pyrrho

Re:your lesson for today by angel'o'sphere · 2003-09-16 23:40 · Score: 0

You have a nice view as a programmer. Nothing wrong with that. But as long as you keep the programmer's view you won't see that your view of the whole topic of "computers, computer science, applying computer science to real world problems, variations of solutions depending on programming paradigm, hardware, OS, or simply the fashion of the current era" is only the limited view of a programmer :-)
There is far more to computer science than programming.
Suppose you wanted to write a Go program .... you can be assured that memory management and flying functions and instant display of windows ... are completely irrelevant.

The "X-price" for writing a Go program that beats a Sho Dan Go player at least *once* is about 4 million dollars. A person playing Go 3 times a week, having a teacher, is likely to reach Sho Dan level in about 3 or 4 years. Just a reference, for how skilled you are as a Sho Dan and how "easy" your program only needs to be :-)

Regarding your assembler loops ... it makes a difference whether your loop is written in C, Java or assembler. Of course it does. And if it's well written, the assembler version is likely the fastest.

However, far more interesting is how often your loop gets called. So your local optimization might be completely useless if your loop is called once a year.
And it also is completely useless if it is called very often but runs only over 3 or 5 elements.
In both cases, *I* as *your* boss want you to spend your time optimizing the *usage* of said loop, and not the loop.

angel'o'sphere

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

Objects freed at end of function by brlewis · 2003-09-17 01:57 · Score: 1

OK, now I understand. Basically he's saying that if you follow a certain discipline, allocated memory will be freed when the function returns. But the same could be said of explicitly freeing at the end of the function, and having goto's at places where you would otherwise return early. The second example bumps it up exactly one level. However, for objects that get passed around between more than two functions, you still need to keep track of what must be freed.

Regarding Scheme performance, check out how bigloo and Stalin are doing in the Bagley language shootout.

Re:Objects freed at end of function by Anonymous+Brave+Guy · 2003-09-17 11:23 · Score: 1

I think perhaps you're still misunderstanding slightly.

The idea is that you wrap the resources in an automatic variable -- something that *will* be destroyed automatically when it goes out of scope. You cannot forget this, because the language does it for you. Now, have the destructor for that automatic variable release the resource it manages and bingo, you can never forget to release the resource.

The idiom is known as "resource acquisition is initialisation", BTW, if you want to look it up.
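
A minimal sketch of the idiom, with made-up names: acquire_handle and release_handle stand in for whatever resource API is being wrapped (a file, a lock, a graphics context), and HandleGuard is the automatic variable that owns it. The function below has three exits and no explicit release on any of them.

```cpp
#include <cassert>

static int g_open_handles = 0;   // handles currently held

// Hypothetical resource API used only for this illustration.
int acquire_handle() { ++g_open_handles; return 7; }
void release_handle(int) { --g_open_handles; }

// Acquires in the constructor, releases in the destructor.
class HandleGuard {
public:
    HandleGuard() : handle_(acquire_handle()) {}
    ~HandleGuard() { release_handle(handle_); }
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;
    int get() const { return handle_; }

private:
    int handle_;
};

int process(int input) {
    HandleGuard guard;               // resource acquired here
    assert(g_open_handles == 1);
    if (input < 0) return -1;        // early exit: destructor still runs
    if (input == 0) return 0;        // another early exit
    return input * guard.get();      // normal exit
}
```

The same destructor also runs if an exception propagates out of process, which is the case that explicit frees and gotos at the end of the function do not cover.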

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:Objects freed at end of function by You're+All+Wrong · 2003-09-18 03:32 · Score: 1

Bjarne's emphasising the fact that if one has the need to do it explicitly, then one has the opportunity to muck up. No paradigm is an unchinked silver bullet, but some are less chinked than others.

Language shootout - if that's the one I'm thinking of it promotes the use of unidiomatic, deliberately perverse code to make Perl look 2 times slower than it really is. However, it might not be the same shootout, so I'll google as soon as I click send. (But take that as a caveat that any language can be made slow|unsafe|whatever in the wrong programmer's hands, to bring it closer to on-topic :-) )

YAW.

--
Your head of state is a corrupt weasel, I hope you're happy.

Re:your lesson for today by dr2chase · 2003-09-17 03:33 · Score: 1

The piper always charges, whether for manual or automatic memory management. The costs have been measured, and for decently large systems, the gains in reliability, maintainability, and time to market are well worth the additional costs of garbage collection.

As far as your remarks about OS-level code go, people have experimented, successfully, with writing large portions of what might normally be regarded as "the kernel" in a garbage collected language. One example is the SPIN project at the U of Washington during the 1990s. Because they used a safe language (which generally implies garbage collection) they were able to maintain the usual security guarantees while allowing greater freedom in loading code into the kernel. Avoiding the kernel-user boundary at busy interfaces more than made up for the in-the-small overhead of using a safe language. There will always be people capable of writing such code, and compared with the other start-up expenses involved in computing, the costs of motivating those people to be interested in working on the OS will not be excessive. I'm able, and with any luck I'll be alive and mentally sharp (like all my ancestors) for another 40 years.

Another example comes when you do concurrent programming. Reference counting becomes much less attractive when it requires frequent use of bus-locking instructions to keep the counts consistent.
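
To show where that cost comes from, here is a sketch of a thread-safe intrusive reference count. Counted and demo_refcount are names invented for the illustration, and std::atomic is a modern convenience that postdates this thread. Every retain and release is an atomic read-modify-write, which on x86 typically compiles to a lock-prefixed instruction, and it is paid on every pointer copy rather than once per collection.

```cpp
#include <atomic>
#include <cassert>

static int g_destroyed = 0;   // counts destructions for the demo

class Counted {
public:
    Counted() : count_(1) {}   // the creator holds the first reference
    Counted(const Counted&) = delete;
    Counted& operator=(const Counted&) = delete;

    // Each call is an atomic read-modify-write on shared memory.
    void retain() { count_.fetch_add(1, std::memory_order_relaxed); }

    void release() {
        // acq_rel orders earlier writes to the object before its destruction.
        if (count_.fetch_sub(1, std::memory_order_acq_rel) == 1) delete this;
    }

    int use_count() const { return count_.load(std::memory_order_relaxed); }

private:
    ~Counted() { ++g_destroyed; }   // private: only release() may destroy
    std::atomic<int> count_;
};

// Returns how many objects the final release destroyed.
int demo_refcount() {
    g_destroyed = 0;
    Counted* c = new Counted();   // count is 1
    c->retain();                  // count is 2
    assert(c->use_count() == 2);
    c->release();                 // count is 1, object still alive
    int before = g_destroyed;
    c->release();                 // count reaches 0, object deleted
    return g_destroyed - before;
}
```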

I have also done the experiment of writing code in Java, and then working to insert typed storage pools to avoid activating the GC. The end result does run faster, if you take care to recycle complex data structures without shredding them into parts (by analogy, recycling glass bottles by refilling instead of melting and reblowing), but it does require interface changes, and it does require peculiar distribution of responsibility for reclamation. You might say this is "good design", but it doesn't look good to me -- it looks strictly more complex and more fragile. There is a large quantity of design, but the quality goes down.

And, because this is an economic exercise, it is sometimes worth making the effort to avoid heap allocation, but in practice, like all optimization, this should be done after measurement, not before measurement. Since this will require changes to interfaces, it's not fun, but to simply design the entire system as if it were all one big critical inner loop is wasteful (crazy, actually).

And yes, I AM an Expert. I have done GC research, studied its interactions with optimizing compilers, written interpreters and compilers for garbage-collected languages, written optimizing phases for compilers for those garbage-collected languages, and measured the performance of different garbage collectors and GC-using applications as I optimized them.

Regarding your "solid minute" remark: the last time I played games with rate benchmarking, I saw a rate of 10MB/second on a 200MHz Pentium Pro with 66MHz memory, where the size of the "dense" portion of the live set is what matters (this was with two crude collectors, one of them someone's adaptation of the Boehm-Weiser collector). A solid minute of collection would require, on that now-slow processor, 600MB of dense live set. This is a worst-case conservative estimate of GC performance -- the collector is crude (not generational), the processor is slow, and I am assuming a large dataset of pointerful objects. In practice, even "crude" copying collections on modern machines run in under a second, because most applications don't have that much live data.

In addition, one can write a provably real-time collector. It's not done often because there are associated overheads (humans are fine with milliseconds of pause, so this would trade off performance losses for no perceived latency gains), but it can be done. Henry Baker wrote a paper on this long ago, and there have been improvements on his work since then (e.g., the "treadmill collector"). Real-time memory management with reference counting always requires careful work, because releas

GC has its place ... just not something for me by ResidentLinuxLunatic · 2003-09-17 04:35 · Score: 1

I've dealt with garbage collection in Java and now in Python. When you get into the mindset of a language that does this natively, I have found that your code naturally flows into that paradigm. I can't imagine trying to use garbage collection in C/C++ -- it just doesn't fit into the scheme of things for me.

True, the STL has auto_ptr, and I have used that in the past -- works rather nicely, IMHO -- however the way I learned how to write clean, efficient C code was to make sure you write the code to deallocate memory when you write the code to allocate it, whenever possible. This works for me, mainly 'cause I've done it for so long. Granted, there are times when doing so is difficult -- there are times when you allocate in one function, but don't deallocate until a much later time in another function. I've written a couple of 100K+-line apps that do that (and have to juggle CORBA calls as well) and it is -very- difficult to debug. That's when you wish you had garbage collection.

At any rate ... I have found that a number of programmers prefer GC simply because they don't want the hassle of worrying about cleaning up after themselves. For really good programmers ... yeah, it's nice and it allows them to focus more on the task at hand. However, for not-so-good or poor programmers (and there are a number out there) it allows them to write poor code that could allocate memory without considering the consequences -- before I get flamed for this comment, yes I have seen such code, and from some people who were supposedly "really good programmers".

When push comes to shove, nothing beats writing good, clean code in the first place, having already designed into the code where memory is allocated and where it is deallocated. Once you've done that, grab the profiler and look for ways to optimize. Chances are, the code will be cleaner and more elegant than if you were to use a GC-type solution. If not, maybe C/C++ isn't the right language for the task.
Time to take a look at Python. :)

Re:your lesson for today by bmac · 2003-09-17 06:26 · Score: 1

"computers, computer science, applying computer science to real world problems, variations of solutions depending on programming paradigm, hardware, os, or simply fashion of the current era" is only the limited view of a programmer :-)

I actually disagree because we are really
the only people on the planet who deal with
iterations in the millions and billions. Being
able to deal with those numbers (and, more
importantly the Butterfly Effect) gives us a
better understanding of the impact of billions
of human beings being senseless "in the small".
They just don't realize/care that all those
little wrappers add up to a big mess, and,
seeing as I care for the state of the earth
for my children, I'd rather proactively curb
the excess from a systemic point of view
*now* while it's not insoluble. That's why
I turn off the light when I leave the room
and why I free my heap memory as soon as it
isn't needed :-)

Suppose you wanted to write a Go program .... you can be assured that memory management and flying functions and instant display of windows ... are completely irrelevant.

Ah, but the performance of the meat of the
algorithm *will* be *extremely* performance
intensive, being the determining factor in
how much look-ahead can be performed (for a
brute-force method) or how many tactics
can be evaluated. GC in that situation is
no different than using GC in your TCP/IP
stack, IMO. You've got to *squeeze* cycles
out of the computer, and GC just doesn't
allow that.

And, in case you don't know, I have no
love for coding the same patterns over and
over again, but I realize that we computer
scientists are about the only craftsmen who
make our own tools. So, for me, the question
is "what's the best tool for the job". The
answer to that question probably involves
many different tools, each to a specific realm,
but working closely together. And my version
of this toolset will generate straight machine
code, God Willing :-)

Peace & Blessings,
bmac
www.mihr.com: if you want to know where all
that dark matter & energy are

Sorry by Anonymous Coward · 2003-09-17 10:37 · Score: 0

I don't think you are thinking about what actually happens at CPU level.

If a mutex is active and a thread blocks it will get put into the wait queue and not use CPU. The only ready processes will be new IO, then you are back to the GC code again. The only overhead is the actual scanning on a single CPU system.

You are completely wrong. If a mutex is active and a thread blocks while it waits for a memory allocation it may use less (but not zero) CPU, but more importantly - its progress has halted. So basically, that database transaction or whatever you have scheduled on this memory blocked thread will wait. Imagine a dozen such threads all blocked waiting for new memory while you have garbage being collected - you have very poor utilization of CPU resources - especially if you have expensive SMP hardware with most of the CPUs sitting IDLE when they could otherwise be doing useful work. We're not talking a few percent slower here - we're talking several times slower on multi CPU hardware in heavy load.

Do yourself a favor and read about Hoard. Then you'll get a clue about how modern memory management works.

Re:Sorry by j3110 · 2003-09-17 15:33 · Score: 1

Like I said, its only significant waste is on an SMP machine, and that's only with outdated knowledge of the garbage collector. The new incremental garbage collectors in Java only clean certain portions of the heap and only for a pre-specified amount of time. This makes all your points outdated at best. The only true overhead is scanning the pointers and heap reorganization (which actually increases the speed of the entire process after it has been in memory long enough that long-lived objects make it to a stable portion of the heap).

We can argue forever, but I don't see any real loss in any Java programs, and neither do most other people.

If I can come up with ideas like locking per object to move an object at a time in a heap off the top of my head, I'm sure it or something better has been implemented. Assuming the worst case scenario for something you don't like doesn't help your argument.

--
Karma Clown
Re:Sorry by Anonymous Coward · 2003-09-17 17:01 · Score: 0

Like I said, its only significant waste is on an SMP machine

I'm glad you agree that multithreaded Java processes run very inefficiently on high end SMP hardware. As far as Java garbage collection on uniprocessors, look up the term "context switching overhead".
I would be happy to resolve any other misconceptions you have about garbage collection and multithreaded programs.
Re:Sorry by j3110 · 2003-09-17 19:35 · Score: 1

I know what context switching overhead is. It's pretty negligible on newer processors, or HyperThreading technology wouldn't be possible, would it? :) That's a very old debate from the "monolithic kernels are better than micro kernels" flame wars, and you bringing it up makes me suspect you've been programming for 15+ years and think that hardware hasn't changed as much as it has.

The SUN JRE is terrible on SMP. I've tried it and know how horrible it is. The newer JVMs from SUN are better, but anyone with half a brain and half a project usually ends up using one of the alternatives (IBM or BEA).

I don't get how context switching has much to do with garbage collection though. If you use a control object to allocate memory the memory could be allocated in the same thread safely.

Also consider that you could very easily just lock a single object at a time. If all your threads depend on a single object, you need to rethink the problem because it's not going to be fast using any algorithm.

Garbage collection, if done properly, will also be much slower for non-JIT'd languages than for JIT'd languages. Java knows if a function only reads values from objects, whereas a library like libGC doesn't. Perhaps you could achieve good GC results with C if you implemented GC at the kernel level like everything C expects. You can play with the scheduler and you have no worries about the garbage collection ending in a state you don't like because you only put preempt points where you know everything is good. Basically there is no real locking or context switching at all. Then you are just arguing whose malloc is faster for real world benchmarks.

What I don't understand about compiled languages and GC is how the collector knows that an object is no longer needed. Is there some function you have to call in order to actually get a pointer to the memory? Surely you can't just replace new and malloc and a GC work. You should at least need some kind of way of copying the pointer that involves a call to the GC so that it knows where all the pointers are so it can update them when it shuffles memory around.

--
Karma Clown
Re:Sorry by Anonymous Coward · 2003-09-21 05:58 · Score: 0

Don't be fooled - unnecessary context switching is still a killer even on modern hardware and operating systems. Say you have 5 compute-bound tasks running on a uniprocessor. If those tasks are performed serially on a single thread it will be typically 20% faster than if those tasks were done in parallel on 5 different threads. The context switching overhead is very large. Think about it - you have to save and restore the state of all the registers and the data cache is often cleared between thread context switches. This is not a free operation.

Surely you can't just replace new and malloc and a GC work.
Actually, it can be that simple. There are a couple of conservative garbage collectors for C and C++ on the market. The free one is Boehm's, and there's at least one commercial one as well. The problem that these collectors have is that they hold on to too much memory due to false positives. They are also not terribly stable, in my experience. I find that participative schemes in which the C/C++ programmer co-operates with the collector are much more reliable.

I don't get how context switching has much to do with garbage collection though.
Java's finalizers were really ill conceived and they account for the majority of the slowdown of Java on SMP hardware. The finalizers have to be called from a separate thread or deadlock or reentrancy problems will occur. As far as I know, Sun's JVMs still fire off all finalizers from a single thread. If you have code that uses finalizers and acquires locks then you have serious thread stalling problems. If you program in Java - avoid finalizers at all costs. The irony is, of course, that the Java programmer has to track down and code all the resource closing functionality in their code by hand at every use, while C++ programmers get this functionality for almost free due to wise use of resource releasing in destructors.

Not for real time programming by Anonymous Coward · 2003-09-17 11:58 · Score: 0

I work in a system that has (mostly soft) real time deadlines. The article states that their mark-and-sweep algorithm can handle 10MB/s on a 177MHz machine. It is not unusual for a system to have 500MB+ of live data. That's 50 seconds during which the system can do nothing "useful" from an application standpoint, which is 43 seconds beyond our most common deadline. Throwing a faster processor at it doesn't really help all that much.

Re:your lesson for today by angel'o'sphere · 2003-09-17 12:46 · Score: 1

LOL

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

not as powerful as garbage collection by brlewis · 2003-09-18 01:00 · Score: 1

I don't think I'm misunderstanding. I understand you have a mechanism that destroys the object a variable is bound to when that variable goes out of scope. That's only useful if you aren't putting the object in a data structure or otherwise planning on using the object outside of the block where that variable is in scope. Basically, it's only useful in trivial cases. I'm not saying it's worthless; I can see it has some advantage over explicitly freeing all the objects. It's just not anything near as powerful as garbage collection. To be fair, Bjarne Stroustrup doesn't present it that way; only aster_ken did.

Re:not as powerful as garbage collection by Anonymous+Brave+Guy · 2003-09-18 05:52 · Score: 1

I still think you're misunderstanding the significance of this. :-)

In well-written C++, almost all objects are automatic variables (or ultimately contained within objects that are automatic variables). It's also far more common to pass by value or pass a reference than it is to start throwing ownership around using pointers. You don't write:

SomeType *st = new SomeType();

all over the place as in something like Java. You just write:

SomeType st;

under most circumstances.

Typically, a data structure type will own any objects within it, and that data structure will be an automatic variable somewhere. When the data structure object goes, so does everything in it, cleaned up by the container type's destructor.

Yes, of course, it does happen that you have complex data structures relying heavily on indirections. Even then, you'd typically just have a data structure type manage a pool of objects within the structure, and have the objects within the structure linked via that pool. Again, when the data structure goes, it just empties its pool in its destructor.

Of course, the last idea is starting to become a simple garbage collector, but it's far less complicated than a full-blown GC, and rarely needed in practice.

In other words, the C++ approach is far more powerful than just handling trivial cases. It handles pretty much all cases, and when it doesn't, just a little thought when designing the complicated data structures allows the idea to generalise. (I write code in C++ that deals with very complicated graph structures in a performance-sensitive application for a living, BTW.)

The only time the C++ approach really needs significant help is if you don't care about ownership and just start passing references to any old object all over your code. Of course having everything garbage collected makes that easier. OTOH, that approach is systematic spaghetti anyway, and completely contrary to modular design, so I can't say that I've ever missed the ability to do it. In any reasonable design, almost every object is either directly owned by another object or function, or naturally belongs in a pool that can have a single manager.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:your lesson for today by Anonymous Coward · 2003-09-18 09:21 · Score: 0

>Keep up the mediocre work.

Have you ever worked on a large interactive project with dynamic objects shared among multiple users? If so, please tell us how you were able to "know exactly what data is no longer connected and suitably delete it immediately" ?

p.s. Your "Peace & Blessings" signature is bordering on insult when you use it while calling someone an idiot.
