Pros and Cons of Garbage Collection?
ers asks: "Most new programming languages are using garbage collection, rather than programmer-controlled memory management. The advantages are obvious: programmers no longer have to worry about forgetting to delete allocated memory, leading to far fewer memory leaks. The disadvantages are often glossed over by programming language designers - aside from the performance issues, predictable memory management can be used for controlling access to files and similar resources, creating safer thread locking code and even providing better error messages. Some programming languages, which usually use predictable memory management, can also be made to behave like they are garbage collected - for example, Boost provides various C++ smart pointer classes. So, given the choice between garbage collection or manual memory management, which would you choose and why? When using a manual memory management language, when do you consider the performance and syntactic overhead of faked garbage collection to be worthwhile?"
The C++ model is basically correct. It doesn't treat the programmer like an idiot (which admittedly may be a problem if you have idiot programmers), and it gives you the choice of how to handle memory allocation. The lack of a reference-counted pointer in the standard library is a bit of a bitch, but the Boost shared pointer templates will likely make it into C++0x, and it's only a hundred lines of code or so to make your own in the meantime.
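To give a feel for the "make your own" remark, here is a minimal sketch of a reference-counted pointer. The names RefPtr, Tracked and refptr_demo are made up for illustration, the code assumes a C++11 compiler, and it is deliberately not thread-safe, so treat it as a toy rather than a stand-in for the Boost class.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical minimal reference-counted pointer (illustration only).
// Not thread-safe and without custom deleters or weak references.
template <typename T>
class RefPtr {
public:
    explicit RefPtr(T* raw = nullptr)
        : ptr_(raw), count_(raw ? new std::size_t(1) : nullptr) {}

    RefPtr(const RefPtr& other) : ptr_(other.ptr_), count_(other.count_) {
        if (count_) ++*count_;           // one more owner
    }

    RefPtr& operator=(RefPtr other) {    // copy-and-swap, safe for self-assignment
        swap(other);
        return *this;
    }

    ~RefPtr() { release(); }

    void swap(RefPtr& other) {
        std::swap(ptr_, other.ptr_);
        std::swap(count_, other.count_);
    }

    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
    std::size_t use_count() const { return count_ ? *count_ : 0; }

private:
    void release() {
        if (count_ && --*count_ == 0) {  // last owner frees object and counter
            delete ptr_;
            delete count_;
        }
    }

    T* ptr_;
    std::size_t* count_;
};

// Small type that records how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live Tracked objects once every owner is gone.
int refptr_demo() {
    {
        RefPtr<Tracked> a(new Tracked);
        RefPtr<Tracked> b = a;           // both point at the same object
        assert(a.use_count() == 2);
        assert(Tracked::alive == 1);
    }                                    // a and b are destroyed here
    return Tracked::alive;
}
```

Calling refptr_demo() after the block leaves no live Tracked objects, because the object is deleted the moment the last RefPtr goes away.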
Of course, the C++ model is not perfect either. Lack of virtual and const constructors can be a nuisance (the workaround being the pimpl idiom and a shared pointer), and not being able to use shared pointers to functions without nasty syntactic hackery occasionally breaks the "stuff pretending to be a pointer" illusion. Still, the power it gives over the Java model is definitely worth the occasional bit of extra effort.
Then again, if you're coding some quick scripting hack rather than a proper program, who cares about memory allocation?
It depends on what you are trying to make, duh.
If you are trying to make something where performance is important, like a 3D game, then manage memory yourself. If you are making a simple business application where reliability and security are important, use garbage collection. If your program uses lots of RAM and you need every last drop, either find an expert at RAM management to get every last bit, or use garbage collection if your programmers are not so awesome.
And so on and so on...
In general I prefer having a GC because most of the time I don't want to have to worry about memory management... there's no need. However, sometimes it would really be nice to have more direct control. Not being a VM expert myself, it seems like it should be possible (though I can imagine the types of problems that would arise) to allow specifying that you're assuming manual memory control either over certain objects or while inside a particular context.
They collect ours every Wednesday, blue boxes every second Wednesday.
Personally, I think C++ would be the best overall choice, but it really depends on the situation. If security is in question, I might opt for manual memory management; if performance is not a big issue, garbage collection is certainly an option. As a student, I always used garbage collection, because I was lazy and there were no real security issues. Even as a researcher, I was more worried about sheer functionality than security. But now in the real world, where your code is going to be right there with people you don't trust, I am careful with what I do and paranoid about security, and therefore won't let a machine take care of it for me. It's all situational...
Garbage collection does not equal poor performance. In some instances, it actually speeds things up--when done properly. Take, for example, the D programming language. It's just as fast as C (faster in some cases) yet it has a garbage collector. The reason is that most programmers tend not to realize that the free() operation actually takes up a decent amount of CPU cycles, and when you're freeing a bunch of little things all over the place, the overhead tends to add up. With a well-designed garbage collector, however, memory is freed all in one big chunk in a single go, thereby decreasing that overhead. The myth that garbage collection = poor performance is just that, a myth, and most likely started by people who associate Java's performance issues with garbage collection.
As someone who works on long-lived projects with a mid-sized team (a dozen or so developers), I prefer a GC-based language. The biggest pro is the great reduction in memory leaks, closely followed by the productivity increase by not having to think about allocation/deallocation (very much). The biggest con is that far too many "young whippersnappers" seem to think memory allocation/deallocation is therefore "free" in a GC-based language and will take absolutely no care at all about when they allocate (e.g. will allocate a largish object inside a very tight loop instead of allocating it outside and reusing it...). And the 2nd biggest con is that a lot of developers can't believe you can have memory leaks in a GC-based language, won't look for them until you rub their nose in them, and don't really know how to find them when they look.
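The "allocate outside the loop and reuse it" point carries over to any language. Below is a rough C++ sketch of the two styles; the function names are invented for the example, and the buffer contents are only there to give the loop something to do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Allocates a fresh buffer on every iteration of the loop.
long sum_with_fresh_buffers(int rounds, std::size_t size) {
    assert(size > 0);
    long total = 0;
    for (int i = 0; i < rounds; ++i) {
        std::vector<int> buffer(size, i);   // new heap allocation each time
        total += buffer[0];
    }                                       // and a matching free each time
    return total;
}

// Allocates once outside the loop and reuses the same storage.
long sum_with_reused_buffer(int rounds, std::size_t size) {
    assert(size > 0);
    long total = 0;
    std::vector<int> buffer(size);          // single heap allocation
    for (int i = 0; i < rounds; ++i) {
        buffer.assign(size, i);             // refills in place, capacity is sufficient
        total += buffer[0];
    }
    return total;
}
```

Both functions compute the same result; the second simply gives the allocator (or, in a GC language, the collector) far less to do.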
After a few days, the garbage really starts to stink and all sorts of animals come around.
But then, the question is rather ambiguous. Is the writer asking:
I'll leave further interpretation of the writer's words to other posters, as well as more thorough responses.
I've always had the philosophy, "use what makes the job easiest." Typically, this involves garbage-collection. However, one of the biggest problems I have with garbage collection is that you can't have your cake and eat it too. Meaning, you can get all the memory you want, but you can only access it at a high level (think Objects in Java). In C/C++ however, you can call malloc/new, create a big pool of memory (or just a single object), and then do whatever the heck you want with it. But again, as the subject says, it all depends on which method helps get the job done, and so far neither has been perfect for everything.
If the C++ memory management approach worked, we wouldn't have spent the last two decades being terrified of buffer overruns. Just like any complex problem, the more details a programmer has to track, the more likely an error will exist. It's the reason we adopted subroutines, modules, and eventually objects. Memory management is almost never core to a project -- and the best way to manage it can vary due to OS, CPU, or underlying language. Focus on the core competencies, and don't reinvent the wheel.
As a programmer-cum-sysadmin, I vote both. Which has its issues all its own...
When I programmed professionally, I craved the control of memory management. Objects did _exactly_ what they were _explicitly_ told to do.
Now I'm a ruby junkie, and love the OO, GC, etc.
Still, yes, for performance reasons, there are good reasons to do it yourself.
For programming reasons, there are reasons to go GC.
All in all, GC tends to be great. Wouldn't work without it. But there are times I'm mystified as to why an object left scope, got destroyed, etc.
So I would (as a programmer), like a compromise [and yes, ruby/rails provides this in its own way, but...] All my objects should be GC'd by default. But I want the ability to hook the destructor, and only have it react the way I expect.
If I want a big block of memory to manage myself, I find an appropriate object for the language (char *, ruby C bindings to a mempool object, unsafe C# (or even safe C#, if you're good), or whatever idiom matches your lang)
Then again, you don't get pointers in ruby, so I s'pose I'm just whining...
So from my perspective:
- Scripters want GC
- RAM intensive code needs at least somewhat programmer-managed memory management
- Embedded devices need hard kernel memory management
- Short run applications generally want GC
- Long running, RAM intensive, frequent paging, or frequently shifted data process generally should go with kernel malloc.
Cheers.
It's definitely worth checking out before people go spouting off the traditional rants against garbage collection.
Of course, determining which one is best always depends on your application and your available resources, among other things. There are good arguments for both in various situations. I code C++ for embedded devices for a living, which means that I am working with the new/delete/malloc/free model, but for school projects I really like to work with Java, because it lets me focus entirely on implementing an algorithm without having to spend any time thinking about memory allocation or the underlying hardware.
I actually use both GC'd and native code in my apps. Generally speaking, when some component is time critical I tend to use native code, and when it's not, the benefits of a GC'd environment outweigh the cons.
aside from the performance issues, predictable memory management can be used for controlling access to files and similar resources, creating safer thread locking code and even providing better error messages.
This is silly. None of these have any connection to garbage collection; you can write "destructors" in a garbage collected language, and do everything in them just as you would have in a non-GC language.
The advantage comes from the RAII style of coding, not from the absence of a garbage collector. In fact, most modern GC languages provide better RAII support, in that there is no way to get an uninitialized object.
In a way, it reminds me of the old bounds-checking arguments. A fair number of people used to resent/resist built-in bounds checking, with very similar arguments (performance, trusting the coder, illogical correlations between manual bounds checking and various Good Things(TM), etc.) and thus we still to this day struggle with buffer overruns and related problems.
--MarkusQ
C++'s constructor/destructor paradigm with predictable object destruction has the benefit of enabling the RAII (Resource Acquisition Is Initialization) idiom. RAII and exceptions greatly simplify resource management in the presence of error handling. Still, even as someone who knows C++ better than I know any other language, I have to admit that for many applications a garbage collected language puts the least mental burden on programmers and produces the fewest memory errors. The burden of arranging all the extra try/catch blocks in Java (because it lacks RAII) has to be weighed against the burden of investigating and fixing memory management errors in C++, and for people using new/delete, Java wins, IMHO.
C++ programmers should be making very little use of new and delete, though; they should be using smart pointers. I think the article poster misunderstands smart pointers. boost::shared_ptr is a reference counted pointer, but std::auto_ptr and boost::scoped_ptr have nothing to do with garbage collection - they certainly aren't "faked garbage collection" and they certainly aren't unpredictable. They use C++'s object scoping and copying mechanisms to manage memory in a way completely unlike garbage collection. scoped_ptr is the simplest and most predictable memory management tool of all. Taking programmer error into account, it's more predictable than using delete. Even shared_ptr is predictable; when the reference count falls to zero, the object is immediately destroyed, not just marked for destruction.
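The predictability claim is easy to check with a small experiment. The sketch below uses std::unique_ptr and std::shared_ptr from C++11 as stand-ins for boost::scoped_ptr and boost::shared_ptr (an assumption for the sake of a self-contained example); Noisy and destruction_order are made-up names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Appends a line to the log when it is destroyed.
struct Noisy {
    std::vector<std::string>& log;
    std::string name;
    Noisy(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {}
    ~Noisy() { log.push_back("destroy " + name); }
};

// Returns the sequence of events, showing exactly when each object dies.
std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        std::unique_ptr<Noisy> scoped(new Noisy(log, "scoped"));
        std::shared_ptr<Noisy> first(new Noisy(log, "shared"));
        {
            std::shared_ptr<Noisy> second = first;
            assert(first.use_count() == 2);   // two owners inside this scope
        }                                     // one owner left, nothing destroyed
        log.push_back("inner scope left");
        first.reset();                        // count hits zero: destroyed right here
        log.push_back("after reset");
    }                                         // scoped object destroyed at scope exit
    return log;
}
```

The log comes back as "inner scope left", "destroy shared", "after reset", "destroy scoped": the shared object dies at the exact statement where its count reaches zero, and the scoped one at the closing brace.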
Sadly, although C++ is a very powerful language and can be used to write code with few errors, the language as used by beginners is as dangerous as C, perhaps even more dangerous. It takes programmers years to become proficient in all the methods and idioms that make C++ a usable language.
(I would love to see a language that allows programmers to choose scoped allocation, smart pointer heap allocation, or garbage-collected heap allocation, and uses types to avoid dangerous combinations such as garbage-collected objects pointing to scoped objects or an object pointing to an object in an unrelated scope. Every object would have two types - the object type (int, file, circle, etc.) and the memory management type (scoped with scope S1, scoped with scope S2, garbage-collected, etc.))
Most new programming languages are using garbage collection
You mean like Lisp and Smalltalk? ;-)
The advantages are obvious: programmers no longer have to worry about forgetting to delete allocated memory, leading to far fewer memory leaks.
In other words: the computer is perfectly capable of figuring out what to do, so let it! This is almost always the best thing.
When using a manual memory management language, when do you consider the performance and syntactic overhead of faked garbage collection to be worthwhile?
I haven't used manual memory management in years, except for a couple old C programs that I still maintain (and those are written to be as easy to "visually inspect" as possible.. allocate, use, de-allocate in three separate lines).
There's rarely good reason to be tinkering around with pointers and other "implementation details". I prefer using very high-level languages (like Haskell for instance) that allow me to express problems and solutions as directly as possible, rather than having to deal with implementation details that only resemble the problem domain from a distance.
I suppose there are times when you need to override the memory management, so there should definitely be "hooks and hints" where appropriate. And even the highest-level language shouldn't free you from the burden of understanding how computers work. However, I personally haven't needed to know about low-level details in a *long* time, not since the Pentium era began at least.
Pros and cons of garbage collection?
If you don't CONS, you never need to collect garbage. *rimshot*
More seriously, GC isn't so much about pros and cons, as it is about tradeoffs between the various GC algorithms: time vs. space, low-latency vs. high-throughput, parallelism, etc.
If you're designing a new language, it should include garbage collection, or nobody will use it (i.e., your target audience can already program in C). You may wish to have multiple GC implementations available for different purposes, perhaps to be selected at compile-time.
For a good overview of what's available, see http://www.memorymanagement.org/
My personal favorite is the good old Cheney semi-space collector (and Ephemeral/Generational Garbage Collectors, which are more advanced versions designed to generally have low latency), as it is very straightforward (both to understand and to implement), compacting (it defragments memory, and can perhaps improve cache locality by grouping related objects), and it has high throughput (work is proportional to the amount of live data, not total data).
If memory usage is of more concern than fragmentation and throughput, a mark-sweep collector may be more your style.
There are also "real-time" (and "soft-real-time", i.e. bounded latency [see Henry Baker's Treadmill]) collectors, parallel collectors [including an interesting case for reference counting, usually considered a dog performance-wise, as a viable parallel/remote GC method], "conservative" collectors for C/C++ (see Hans-J Boehm's libgc), collectors for real and hypothetical computers with special hardware and/or OS support for GC features, and some collectors that are just plain weird.
Note also that garbage collection algorithms are considered hard to measure for performance, especially with regard to wall-time latency, so just because a paper(*) claims that a certain GC has certain performance characteristics, be sure to benchmark if it really matters.
(*) Did I mention papers? If you're serious about implementing GC, getting comfortable reading CS research papers is a must. The book "Garbage Collection" is your best friend here, as it provides a very good overview/survey of said papers and algorithms, and it discusses a lot of pros and cons between various algorithms, and useful variants or adaptations that have been applied to previously-published work.
Also check out Henry Baker's papers, because he is a memory management demigod: http://home.pipeline.com/~hbaker1/home.html.
The article throws out a bunch of links covering concepts that the writer believes support the writer's statements. The "controlling access to files and similar resources" link goes to "Resource Acquisition Is Initialization", which superficially appears similar to BASIC's initializing variables on each entry into scope, and BASIC's a GC language. (I have QuickBASIC in mind, to be more specific.) I was confused when writing a TurboPascal program to find garbage in variables until someone explained that TurboPascal doesn't init its variables. I don't know if TurboPascal is GC or not, but I don't remember explicitly allocating and freeing variables, so I guess so.
Under "safer thread-locking code" we find another RAII article. Again with the next. I wrote the first part of this post, thinking that each link would cover topics that explicitly described the situation involved.
To the best of my knowledge, there is nothing intrinsic to either allocation method that would make those tasks easier or harder in either of them.
I call "Shenanigans!"
People seem to have confused Memory Access with Memory Allocation. Neither GC nor PC (programmer collected) should allow memory accesses on out-of-scope data. GC just delays when that out-of-scope data gets freed for reallocation. Those unfamiliar with GC and used to the unreallocated = still-accessible-data situation of improper PC coding think that in the GC world unreallocated means still-accessible, which is not necessarily nor usually the case.
Apple / NeXT takes a reference counting approach. It is not automatic, but it works well once you understand the rules.
All of the reasons given for manual memory management seem to boil down to a desire to have support for the Resource Acquisition Is Initialization (RAII) idiom, which is hard to pull off in GC languages. But, the alternative idiom Resource Acquisition Is Invocation provides the desired capability in GC languages. Same capability, no chance of memory leaks. So tell me again why manual memory management might be a good idea?
Of course, this will be hard to use unless your language supports closures. Sadly, most imperative languages (e.g. Java) do not. But, hey, I'm a functional language person so I'm not going to try to defend them.
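For readers who haven't met the "Resource Acquisition Is Invocation" idiom: the resource is handed to a closure, and the function that invokes the closure is responsible for cleanup. Here is a hedged sketch using C++11 lambdas (C++ is used only to keep the thread's examples in one language); Connection and with_connection are hypothetical names, with a boolean flag standing in for a real file or socket.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical resource: a flag stands in for a real file or socket.
struct Connection {
    bool open = false;
    int queries = 0;
};

// Acquires the resource, invokes the caller's closure, and releases the
// resource on both the normal and the exceptional path.
void with_connection(Connection& conn,
                     const std::function<void(Connection&)>& body) {
    conn.open = true;                 // acquire
    try {
        body(conn);
    } catch (...) {
        conn.open = false;            // release on the error path
        throw;
    }
    conn.open = false;                // release on the normal path
}

// Returns true if the connection is closed after a normal and a failing use.
bool closed_after_normal_and_failed_use() {
    Connection conn;
    with_connection(conn, [](Connection& c) {
        assert(c.open);               // the resource is valid inside the closure
        ++c.queries;
    });
    bool closed_after_ok = !conn.open;
    try {
        with_connection(conn, [](Connection&) {
            throw std::runtime_error("boom");
        });
    } catch (const std::runtime_error&) {
        // the helper has already released the resource
    }
    return closed_after_ok && !conn.open && conn.queries == 1;
}
```

The caller never writes cleanup code, which is the same guarantee RAII gives, only expressed through a higher-order function instead of a destructor.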
Some programming languages, which usually use predictable memory management, can also be made to behave like they are garbage collected - for example, Boost provides various C++ smart pointer classes.
Aside from the obvious problem with cycles, note that reference counting is slow. Very, very slow. Since you need to use an atomic counter, every copy of a refcounted pointer requires inter-processor communication (on a multiprocessor system). Do not underestimate how slow this is. Yes, it is much slower than real GC.
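The cycle problem mentioned here can be shown in a few lines. This sketch uses std::shared_ptr and std::weak_ptr from C++11 (assumed equivalent to the Boost classes for this purpose); Node and the two function names are invented for illustration.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> strong_peer;  // owning link
    std::weak_ptr<Node> weak_peer;      // non-owning link
};

// Two nodes that own each other survive the scope that created them.
bool strong_cycle_outlives_scope() {
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;             // cycle: neither count can reach zero
        assert(a.use_count() == 2);
        watch = a;
    }
    bool leaked = !watch.expired();     // still alive although unreachable by name
    if (auto a = watch.lock()) {
        a->strong_peer.reset();         // break the cycle by hand to clean up
    }
    return leaked;
}

// With a weak back link both nodes are freed at scope exit.
bool weak_back_link_is_freed() {
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;               // observes a without owning it
        watch = a;
    }
    return watch.expired();
}
```

A tracing collector would reclaim the first pair without help; with plain reference counting someone has to decide which link is the weak one.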
No, you can have a memory leak of the PC (programmer collection) type in a GC system. You clearly haven't had an experience similar to the QuickBASIC "String space corrupted" error. QuickBASIC is a GC system.
The answer, as always, is "it depends". I'm firmly inside the "right tool for the job" camp.
Manual memory management is not free. In some circumstances, it can be quite expensive. There is a group of programmers who are best described as "rabidly anti-GC". These people are almost all completely unaware of the costs that manual memory management can impose on your code.
A multi-threaded program, for example, can allocate memory from any arena, but it MUST return a block to the arena from whence it came, which can cause all sorts of difficult lock contention problems, making free() much more expensive than malloc(). (Ask anyone who has written high-performance memory-intensive multi-threaded programs.)
In some languages, like C, the situation is even worse. In structure-hungry programs, you can end up structuring your code around data lifetimes, which precludes you from using the most natural, maintainable and efficient algorithms. Garbage collection frees you from this, as the GCC people have discovered.
I do recommend reading Paul Wilson's excellent survey paper on the topic. It answers a lot of your questions, though it's by no means the final word.
The original post provided 3 examples of the supposed utility of programmer-controlled memory management and treated them as opposed to automatic garbage collection, but none of them served the original poster's argument: They were all examples of automatic variables being used with RAII, not to manage heap memory, but to manage non-memory resources by lexical scope. Cluestick sez: Languages with GC still have stacks. *Whack*.
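As a concrete instance of an automatic variable managing a non-memory resource, here is a small sketch with std::lock_guard. TrackedLock is a made-up lock type that only records whether it is held, so the behaviour can be checked without threads; a real program would use std::mutex.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical lock that only records whether it is currently held.
struct TrackedLock {
    bool held = false;
    void lock() { held = true; }
    void unlock() { held = false; }
};

// The guard is an ordinary stack variable; its destructor unlocks.
int guarded_update(TrackedLock& lock, int value) {
    std::lock_guard<TrackedLock> guard(lock);   // lock() is called here
    assert(lock.held);
    if (value < 0) throw std::invalid_argument("negative value");
    if (value == 0) return 0;                   // early return still unlocks
    return value * 2;
}                                               // unlock() runs on every exit path

// Returns true if the lock is free after normal, early and exceptional exits.
bool lock_released_on_all_paths() {
    TrackedLock lock;
    bool ok = true;
    guarded_update(lock, 21);
    ok = ok && !lock.held;
    guarded_update(lock, 0);
    ok = ok && !lock.held;
    try {
        guarded_update(lock, -1);
    } catch (const std::invalid_argument&) {
        // the guard was destroyed during stack unwinding
    }
    return ok && !lock.held;
}
```

No heap memory is involved anywhere: the lifetime being managed is the lock's, and the stack does the bookkeeping.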
Not using GC requires you to write code to free those resources repeatedly. That goes against the DRY (Don't Repeat Yourself) principle.
I wonder how many of the people who use the "C++ model" bother to unit test that they have freed all their resources.
C memory management is not completely deterministic either, since a fragmented heap will not always take the same amount of time to allocate in. To make it completely deterministic you would have to pre-allocate objects. But if you're going to do that, you could do it in a GC language and turn the GC off.
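Pre-allocation usually means some kind of fixed pool. The following is a minimal sketch, assuming C++11 and a type that can be default-constructed; FixedPool and Particle are hypothetical names, and the pool is not thread-safe.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity pool: all storage is reserved up front, so
// acquire() and release() are constant-time and never touch the heap.
template <typename T, std::size_t N>
class FixedPool {
public:
    FixedPool() : free_count_(N) {
        for (std::size_t i = 0; i < N; ++i) free_[i] = &slots_[i];
    }
    FixedPool(const FixedPool&) = delete;            // slots must not move
    FixedPool& operator=(const FixedPool&) = delete;

    T* acquire() {                                   // nullptr when exhausted
        return free_count_ ? free_[--free_count_] : nullptr;
    }
    void release(T* p) {                             // p must come from this pool
        assert(free_count_ < N);
        free_[free_count_++] = p;
    }
    std::size_t available() const { return free_count_; }

private:
    std::array<T, N> slots_{};    // the pre-allocated objects
    std::array<T*, N> free_{};    // stack of currently unused slots
    std::size_t free_count_;
};

struct Particle { float x, y; };

// Exercises the pool: two slots, a third request fails, a released slot is reused.
bool pool_demo() {
    FixedPool<Particle, 2> pool;
    Particle* a = pool.acquire();
    Particle* b = pool.acquire();
    Particle* c = pool.acquire();                    // pool is empty now
    bool exhausted = (a && b && !c);
    pool.release(a);
    return exhausted && pool.available() == 1 && pool.acquire() == a;
}
```

The cost of each operation is fixed regardless of heap fragmentation, which is the determinism the comment is after; the price is deciding the capacity in advance.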
I had this job interview where I was asked if I started a new project right now what programming language would I write it in.
I said "That would of course depend heavily on the project"
This got me the job, because I was the only person who didn't answer "Java" or "PHP" - a clear indicator that the prospective employee was either feeding you the line they'd gotten in CSCI 102 at the local university, or reacting against that line.
The same thing about garbage collection. Come on, if I'm writing a web application with 10000 concurrent users, I'm going to write in Perl or PHP, because I can throw hardware at the performance problems, and the hell with stupid memory bugs in an app difficult to run in a debugger or profiler.
And if I'm writing a device driver? C. Duh. I don't want the compiler doing anything stupid with my memory mapped registers.
Having used languages with and without garbage collection, my view is that garbage collection is often very nice... but I don't really mind the "lack" of garbage collection in C and especially don't miss it in C++.
My opinion is that it takes some effort on the programmer's part to learn to use C safely. I'm not sure why, but this answer seems to surprise some people. Do they seriously expect that in the real world-- of software or of anything else-- they should be able to pick up any tool they want and understand how to use it well right away? Apparently so, but it's pretty silly, isn't it?
C++ and C are like very sharp carving gouges: It's up to you to learn how to use them properly, to hone them periodically, and to build safe habits. You need to do this in any language that you use, but the sharper the tool the more important it is to bear this principle in mind. A lot of people try to pick up these sharp tools, and then blame the gouge when they've cut themselves. It's no surprise, because they did not take the time to learn to be careful.
In my view your best bet for using a programming language safely is to develop certain habits and adopt idioms that cause you to simply not write code with some types of errors. In the case of resource-freeing errors, when you open a door, you must remember to close it. It's not complicated, but you have got to turn it into a real habit, something you just do-- and it follows that there will also be some patterns that you should train yourself to just never write. Give it a go (and give it some time).
That will go a long way, but I'm afraid that there's some bad news: even that is not enough. I've been writing software in C for 17 years now, and although without a doubt I write fewer errors than I used to, I still do not produce error-free programs. Eventually, I make plenty of mistakes. Complexity does that.
Some very good news: Memory management in C++ is so nice that very frankly I do not miss /either/ garbage collection or C-style memory management when I am using it. (And of course, you can use either of those with C++ just fine.) There are all kinds of tools and patterns to help you out, although again it's going to be up to you to use them and make a habit of it. There's "resource acquisition is initialization", smart pointers (built into the standard library, provided by Boost, and roll-your-own), new and delete, malloc() and free(), statically allocated memory, and garbage collectors for you to link in. Take your pick! Use the right one for the job.
Bjarne Stroustrup explains all of this extremely well. Here's a link or two to his FAQ and what it has to say about C++ and garbage collection:
http://www.research.att.com/~bs/bs_faq.html#garba
http://www.research.att.com/~bs/bs_faq2.html#memo
http://www.research.att.com/~bs/bs_faq.html#advan
Think about it for a while. Remember that memory is only one kind of resource that we need to manage. It's up to you to ferret out and apply the tools available to you.
If you want a garbage collector, why not use one? But it's not the only way to achieve the ultimate goal of producing reliable, efficient software.
I prefer garbage collection. At most, I take the cans to the edge of the driveway and some guy in a noisy truck with a cool robotic arm just hauls it away. Yeah, there is a landfill somewhere that isn't good for the overall environment but I accept that tradeoff. I also don't throw old car batteries into the trash.
Sure the hell beats me keeping the trash around, remembering where it is, and putting it in my truck and hauling it to the heaping landfill myself. I'm not here to manage trash, I'm here to get something done.
Is this post about programming?
For cases where static analysis can't do this automatically, it isn't that hard to use a design methodology that achieves the same result; it's certainly still much easier than doing manual allocation and deallocation and ensuring that the deallocation is done (or not done) correctly in all cases.
And if you are using a reference-counting GC, or a hybrid GC that includes reference-counting, you don't have to do anything special at all.
The same applies to the claimed mutex and error message disadvantages, since those are just specific uses of RAII.
Right now someone I know is trying to track down a Java memory leak.
No doubt some reference is left in a persistent collection of some sort (hash, list, array, etc.).
Just as C/C++ programmers must remember to free when done, so Java programmers must remember to undo such "life maintaining" references when they are done.
Sam
blog.sam.liddicott.com
Not inherently. It is perfectly possible to write a GC implementation that stores data that can only be accessed in a certain scope into the stack, and frees it automatically and immediately when the scope exits. From what I've understood, Sun's upcoming JVM does just this.
Garbage collection versus manual collection really only applies to objects that get passed out of the scope they were created in.
This is a common claim, but it is an apples to oranges comparison. No one (including the compiler) dynamically allocates objects in C/C++ when they can place them on the stack instead. Garbage collected languages like Java, on the other hand, require practically everything to be managed on the heap.
In addition, an array of objects on the heap requires only a single memory allocation in C or C++, where Java has to allocate and track each separately. As one luminary once said, "C++ is better because there is less garbage to collect."
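The single-allocation point can be made measurable with a counting allocator. This is a sketch under assumptions: CountingAllocator, g_allocations and Point are invented for the example, and the exact counts reflect how common standard library implementations (such as those shipped with GCC and Clang) behave, not something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

int g_allocations = 0;  // counts requests made through CountingAllocator

// Hypothetical minimal allocator that counts how often memory is requested.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

struct Point { int x, y; };

// An array of values: the objects live inside one contiguous block.
int allocations_for_value_array(std::size_t n) {
    assert(n > 0);
    g_allocations = 0;
    std::vector<Point, CountingAllocator<Point>> points(n);
    return g_allocations;
}

// An array of references to separately allocated objects, roughly the
// layout a Java-style object array has.
int allocations_for_pointer_array(std::size_t n) {
    assert(n > 0);
    g_allocations = 0;
    CountingAllocator<Point> alloc;
    std::vector<std::shared_ptr<Point>,
                CountingAllocator<std::shared_ptr<Point>>> points;
    points.reserve(n);                                        // block for the pointers
    for (std::size_t i = 0; i < n; ++i) {
        points.push_back(std::allocate_shared<Point>(alloc)); // one block per object
    }
    return g_allocations;
}
```

With the implementations assumed above, the value array costs one allocation and the pointer array costs one per element plus one for the array itself.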
That might be acceptable, but the worst part is random application pauses of arbitrary duration for garbage collection. Unless that problem can be resolved, garbage collected languages will always be a poor match for latency sensitive applications, even where the net throughput is otherwise adequate.
I don't know if TurboPascal is GC or not, but I don't remember explicitly allocating and freeing variables, so I guess so.
Pascal doesn't do automatic garbage collection; it provides New and Dispose (and, in Turbo Pascal, GetMem and FreeMem) for dynamic memory management. Apparently you only bothered to write code using global and local (stack allocated) variables. But rest assured that any trivial piece of linked list code in Pascal has you doing the deallocation manually.
I shouldn't have to change languages, change IDEs, and change the rules of disambiguating overloaded operators in order to change coding paradigms.
Granted, you may not want to mix paradigms in the same thread, but having a multi-threaded app with some threads use malloc/free or C++ new-delete and other threads use GC seems entirely reasonable.
Even hard real-time systems often have portions of the code that can live with the occasional GC pause.
I shouldn't have to change languages to change paradigms, just declare a thread's memory type and be consistent. Inter-thread communication will have to accommodate this, of course.
Java could have made much more of a contribution if it had done this. And C++ could have supported a standard interface to GC earlier on, to support GC code in C++. MS C++ now allows coding for managed code, but it's a little late in the game for C++. And the managed environment generates IL, not machine code, so "Managed code" is not exactly equal to "Uses GC".
Okay, so in C / C++ you might have to malloc() and free(). But the C libraries still manage the memory for you (or the operating system does, depending on what/how you're coding).
Memory management is about much more than specifying when you grab a chunk of memory and when you release it. It's also about managing fragmentation of memory, for example. GC handles this for you and therefore can optimize it in a way that would be harder to do in C.
Worrying about these things does not make a good programmer. A good programmer delivers software that works to a reasonable extent within a reasonable timescale. That may only sometimes mean having to worry about these things. 'Works' depends on what you're trying to achieve. If it takes you twice as long because you unnecessarily used C++ instead of Java, then that's not good programming practice.
Successful programs evolve over time.
So they get refactored. Classes get reused in unexpected places. References to objects are kept in places where it was not anticipated. Calling delete at the old point is now inappropriate, as it can't take the new references and the changed lifetime into account.
So the memory management needs to get refactored just because you "reuse" a class?
Simple example (controversial, because it also shows where GC leads to problems):
For some reason you implement a cache for a certain type of object.
In a non-GCed language, everywhere you delete the objects you need to tell the cache to drop them first.
Now that means you have to change existing code and write new test cases. And it means you cannot assign different programmers to different tasks. You also cannot reuse the original class without the cache now.
However, in GCed languages the cache influences the lifetime of the objects, so you need WeakReferences as wrappers around the objects in the cache.
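The weak-reference cache idea has a close C++ analogue in std::weak_ptr, which makes for a compact illustration. This is only a sketch: Image and ImageCache are hypothetical names, and reference-counted pointers are assumed in place of a tracing collector.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Image { std::string name; };

// Hypothetical cache that observes objects without extending their lifetime.
class ImageCache {
public:
    std::shared_ptr<Image> get(const std::string& name) {
        assert(!name.empty());
        auto it = entries_.find(name);
        if (it != entries_.end()) {
            if (auto cached = it->second.lock()) return cached;  // still alive
        }
        auto fresh = std::make_shared<Image>(Image{name});
        entries_[name] = fresh;          // stored as a weak reference only
        return fresh;
    }
    bool holds_live(const std::string& name) const {
        auto it = entries_.find(name);
        return it != entries_.end() && !it->second.expired();
    }
private:
    std::map<std::string, std::weak_ptr<Image>> entries_;
};

// Returns true if the cache shares objects while they are in use and lets
// them die once the last real user is gone.
bool cache_demo() {
    ImageCache cache;
    bool same = false;
    bool live_while_used = false;
    {
        auto a = cache.get("logo");
        auto b = cache.get("logo");      // served from the cache
        same = (a == b);
        live_while_used = cache.holds_live("logo");
    }                                    // last users gone, object destroyed
    return same && live_while_used && !cache.holds_live("logo");
}
```

Code that uses Image objects does not need to know the cache exists, which is the decoupling the comment is arguing for.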
Nevertheless, bottomline GCed languages give more flexibility in program evolution, or simply refactoring during ordinary development. Classes are easyer to reuse as you can change to a new object lifecycle without changing the original classes sourcecode.
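To make the cache point concrete, here is a minimal C++ sketch of a cache that observes objects without keeping them alive. It is only an analogy: it uses reference counting (std::shared_ptr and std::weak_ptr) rather than a tracing GC, and the names Document and DocumentCache are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical cached type, for illustration only.
struct Document {
    explicit Document(std::string n) : name(std::move(n)) {}
    std::string name;
};

// A cache that holds weak references, so it never extends an object's lifetime.
class DocumentCache {
public:
    // Returns the cached object if it is still alive, otherwise creates a new one.
    std::shared_ptr<Document> get(const std::string& name) {
        auto it = entries_.find(name);
        if (it != entries_.end()) {
            if (auto alive = it->second.lock()) {
                return alive;
            }
        }
        auto fresh = std::make_shared<Document>(name);
        entries_[name] = fresh;  // stored as a weak_ptr, not an owner
        return fresh;
    }

    // True if an entry exists and its object has not been destroyed yet.
    bool holds_live(const std::string& name) const {
        auto it = entries_.find(name);
        return it != entries_.end() && !it->second.expired();
    }

private:
    std::map<std::string, std::weak_ptr<Document>> entries_;
};

// Small demonstration: the cache entry dies together with its last real owner.
inline bool cache_entry_expires_with_owner() {
    DocumentCache cache;
    auto owner = cache.get("report");
    assert(cache.holds_live("report"));
    owner.reset();  // last owning reference goes away
    return !cache.holds_live("report");
}
```

The code that owns a Document never has to notify the cache before dropping it, which is the refactoring benefit described above. The cost is that every user of the class must hold it through shared_ptr.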
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
For short-lifetime, stateless programs, like queries, scripts, macros, and such, GC is fine because it has a finite end, at which everything can be cleaned up together.
But for longer running programs which launch other programs like root processes, server processes, and such; they might hang around long enough to run out of memory.
To me the crux is, how does a garbage collector itself allocate memory? Somewhere down the line something has to keep solid track of resources, GC is an option for many subsystems, but regular allocation is a requirement at some point.
You might use a car instead of walking to get to a lot of places, but you have to walk to the car first.
A paper I've found interesting is on a GC which communicates with the VM systems to avoid putting too much load on the VM system.
It needs a modification of the VM, but IMHO this is better than having to handtune the memory used by the GC. (Note: I'm not an expert in GC)
http://www.cs.umass.edu/~emery/pubs/04-16.pdf
Well yes it does depend on the project, but even more so on the programmers involved. Memory leaks breaking applications are a visual indication of stupidity, carelessness or timescales that are too tight. GC allows total idiots to masquerade as anything they like from systems programmer to xyz expert. The problem then is that although their code works (with the help of GC) it probably does not perform the correct function. I cannot begin to say how many financial reports I have found careless mistakes in; did the programmers just get lazy from things like GC? Even at the VS2005 launch I am amazed at the questions that developers are asking: at least 50% do not understand how a forms or web application works: proof of that was the reaction to the security session at DDD in Reading. Managed code, GC & everything else from weed to cola all has friggin advantages & disadvantages. In short: don't climb into the driving seat unless you know what you're doing.
In the mainframe-based online transaction application environment where I work, each program is given a fixed block of memory to play with. Period. After the program terminates, that memory is usually freed, but often a transaction has either its code or its data memory area(s) locked into core to speed up program load times (it remains resident but idle until the next time the program is activated).
Transaction programs are event-driven entities, though, and they have very short lifetimes -- they are loaded in response to a specific query, they perform their task (usually in a fraction of a second), and they terminate, returning control to the transaction monitor.
Performance (as in "response time") is one of the key design criteria for such an environment. The whole idea is to break an application into a series of discrete transactions, each performing its predefined task and then terminating. Most of the time a task consists of reading a series of files and creating a display screen, or parsing a data entry screen and saving the result to a file or files.
All in all it's quite slick, and it completely avoids the kinds of traps that one sees in the world of typical PC or Unix applications (where a program remains resident for many minutes or hours and has to do dynamic memory allocation/deallocation on its own).
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.
People seem to have confused Memory Access with Memory Allocation. Neither GC nor PC (programmer collected) should allow memory accesses on out-of-scope data.
This is why I mentioned I was talking real-world, not theory. I can conceive of a "safe" (out-of-scope data not accessible by any in-language construct) language that uses manual allocation. But I am aware of no such beast, which doesn't prove it doesn't exist but is pretty strong evidence that it's not very popular if it does.
And going back to the original post, even if it did, it would not be the manual allocation that would be the source of the security, it would be the safety of the language, so OP was still off.
A lot of posts in this discussion almost imply that there is 100% manual memory management, or some sort of super-generational-buzzwordy-GC, and nothing in between. That simply isn't the case.
I write C++ for a living. I work with intricate, graph-like data structures, using performance-sensitive algorithms, with pointers all over the place. And yet I can't remember the last time I had to use the delete operator, nor any sort of super-ref-counted-smart-pointer. We just have a simple, effective ownership scheme, and deterministic destruction of the owning objects guarantees everything is always cleaned up completely and promptly.
There is no magic. There is only knowing which tool to use for which job.
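The post does not say what its ownership scheme looks like, so the following is only one plausible shape for it, not the poster's actual code: a single container owns every node, and the links between nodes are plain non-owning pointers. Graph, Node and live_nodes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

static int live_nodes = 0;  // instrumentation for the demonstration only

struct Node {
    explicit Node(int v) : value(v) { ++live_nodes; }
    ~Node() { --live_nodes; }
    int value;
    std::vector<Node*> neighbours;  // non-owning links; cycles are harmless
};

// The graph is the sole owner of its nodes. Destroying the graph destroys
// every node, no matter how tangled the links between them are.
class Graph {
public:
    Node* add(int value) {
        nodes_.push_back(std::make_unique<Node>(value));
        return nodes_.back().get();
    }
    static void connect(Node* a, Node* b) {
        a->neighbours.push_back(b);
        b->neighbours.push_back(a);
    }
    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<std::unique_ptr<Node>> nodes_;
};

// Builds a small cyclic graph in an inner scope and reports how many nodes
// are still alive after that scope has ended.
inline int nodes_alive_after_scope() {
    {
        Graph g;
        Node* a = g.add(1);
        Node* b = g.add(2);
        Graph::connect(a, b);  // a <-> b forms a cycle
        assert(live_nodes == 2);
    }  // g is destroyed here, and all nodes with it
    return live_nodes;
}
```

No delete appears anywhere and nothing is reference counted. The trade-off is that a node pointer must never be used after its owning graph is gone, and the language does not check that for you.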
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
It's not just you. A lot of people in this discussion are confusing fundamentally different concepts whose implementations often happen to coincide.
In particular, whether or not something is ever cleaned up is different from whether or not it is cleaned up promptly. Also, releasing memory is not the same as destroying/finalising an object that happens to be stored in that memory.
Garbage collection addresses exactly one of the four possible combinations: making sure that memory is always released.
The main advantage of C++'s RAII idiom covers the opposite combination: making sure that objects are destroyed promptly. (As it happens, in C++ the underlying memory is also guaranteed to be released after the destruction of the object when using automatic variables, so we're actually covering two of the combinations here.)
The fact that the latter can be used to guarantee the former, but the converse does not hold, is why RAII is a much more powerful tool than a straightforward GC. The fact that the former works automatically while the latter requires some coding skill is why a simple GC is a more reliable tool for average programmers than RAII.
As always, you have to look at what you've got to do and who's going to be doing it, and then pick an appropriate point on the power-safety spectrum.
Because the concept is more fundamental than merely "resource release is destruction", although the latter is arguably the most important aspect of it.
What you're doing in C++ is tying the period between allocation and release of a resource to the lifetime of an object. If you like, the resource-owning class's invariant conditions include the fact that the resource is allocated correctly.
That means that allocating the resource in the constructor is quite fundamental. If the allocation fails, the constructor should fail. If it does so in the idiomatic way -- throwing an exception -- then the resource-owning object never exists, and can't inadvertently be misused because its name will be inaccessible.
Similarly, as long as the resource-owning class is instantiated as an automatic variable, the deterministic destruction when it goes out of scope ensures that the allocation of the resource can't outlast the owning object either. Again, it also ensures that once the resource is no longer allocated, the name of the owning object is no longer accessible.
In between, you have a valid object, which can present whatever interface the programmer deems useful in order to access the resource it owns, and which can hide any handle necessary to refer to that resource.
All of this is possible because of the interaction between C++'s scoping rules, automatic variables, deterministic destruction and exceptions. Since Java lacks at least two of those, it's not possible to get the main benefits of RAII in the same way.
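As a rough illustration of the points above, here is a small RAII wrapper around a C file handle. It is a sketch under stated assumptions: File and open_fails are names invented for this example, and the wrapper is deliberately minimal (no move support, no error checking on writes).

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Owns a FILE*. Class invariant: handle_ is always a valid, open file.
class File {
public:
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        // If acquisition fails, the constructor fails, so no File object
        // ever exists without an open handle behind it.
        if (!handle_) {
            throw std::runtime_error("cannot open " + path);
        }
    }

    // Deterministic release: runs when the object goes out of scope,
    // on both normal exit and exception unwinding.
    ~File() { std::fclose(handle_); }

    // Copying would lead to a double close, so it is disabled.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    void write(const std::string& text) {
        assert(handle_ != nullptr);  // the invariant described above
        std::fputs(text.c_str(), handle_);
    }

private:
    std::FILE* handle_;  // hidden handle; callers only see the interface
};

// Returns true if constructing a File for `path` throws. In that case the
// name `f` never becomes usable, so the failed resource cannot be misused.
inline bool open_fails(const std::string& path) {
    try {
        File f(path, "r");
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

When a File is declared as an automatic variable, the handle cannot outlive the object and the object cannot exist without the handle, which is the two-way guarantee the post describes.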
If you're a serious programmer, learn how to manage your own memory. With the exception of C++, most other development languages are intended for rapid application development scenarios. Typically people involved in this level of development are not native programmers, but people from other career paths who got into programming out of necessity for the job rather than a real desire to develop software for a living. I am NOT saying that these people are not skilled in what they do, just that programming wasn't their first love or forte. They haven't spent years learning the ins and outs of software development, or worked for years honing their skills. Memory management is one of those things that requires discipline. You need to do things in a specific way, and if you don't, you leak memory and cause program/OS instability. Only after years of dedicated programming does one learn how to write code that does not cause memory leaks (and even then it can easily trip you up); you learn the skills and tricks to control the allocation and deallocation of memory. People using RAD tools or languages with garbage collection are focused on the end product rather than skilled design. The limitations of disciplined programming eat into their ability to get the job done quickly and efficiently; they didn't spend years learning to program, they need to get the job done in a few weeks or months. Garbage collection in these RAD tools means they don't have to worry about highly disciplined programming. From what I have seen, ANY language with garbage collection is typically NOT used for high performance applications. They are used for web or database front ends, intended to be a UI layer between data and the user. I have never seen Visual Basic, Java, or C# used for scientific calculations (nothing serious that is), or high performance software where speed is a concern.
In any regard, if all you ever intend to do is to make UI front ends for web services or databases, then any language with garbage collection will suit you fine; you don't have to worry about the finer points of developing software and can simply focus on the final product rather than the details of how to get there. The extra overhead garbage collection imposes on these higher level languages will not be noticed or undesirable in the final product. But, if there is a chance you may need to develop a high performance application, or if you think you may be developing in C++ or another non-managed language, learning how to effectively allocate and deallocate memory is an invaluable skill to learn.
I haven't thought of anything clever to put here, but then again most of you haven't either.
Thirty or forty years ago, the opposite was true.
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
In lisp that would be using the unwind-protect macro.
Notice that this is totally unrelated with memory management.
So yes, it IS possible to have our cake and eat it too.
We are Turing O-Machines. The Oracle is out there.
As I told in another post, in Lisp you can use the unwind-protect macro to perform this "pattern".
This is used to perform file access, database access and so on.
But in the languages you use, you need a compiler-interpreter update from the vendor/implementor to fix that.
Raw speed. Malloc a huge chunk of memory. Custom cut one or more items out of the huge chunk, manually managing byte alignment of the objects to ensure portability between different chipset architectures. This is what most people refer to as a "custom allocator" (possibly also a pool or slab allocator) and its performance is vastly higher than an equal number of new's in a GC language. But wait! You don't have to actually free anything once you're done (no memory leaks), you just do a bulk free of the original memory slab. Or not, you can just reuse the memory slab for a new group of objects. Manual memory management gives flexibility that no GC language ever will. If you don't need the flexibility then fine, you can use a GC language and reap the benefits. Just don't complain when you're shopping around for a product and find one that is 5x the speed of its GC'd competitors. Then you the customer are forced to pay the price for programmer laziness and/or convenience.
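For readers who have not seen one, the scheme described above can be sketched as a simple bump ("arena") allocator. This is an illustrative minimum, not a production allocator: Arena and Point are hypothetical names, alignments are assumed to be powers of two, and create() is restricted to trivially destructible types so that a bulk reset is safe without running destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <type_traits>

// One big malloc'ed slab; objects are carved out of it front to back.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : base_(static_cast<unsigned char*>(std::malloc(capacity))),
          capacity_(capacity),
          used_(0) {
        if (!base_) throw std::bad_alloc();
    }
    ~Arena() { std::free(base_); }  // bulk free of the whole slab

    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Returns `size` bytes aligned to `alignment`, or nullptr if the slab is full.
    void* allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(base_);
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (start + used_ + mask) & ~mask;
        const std::size_t offset = static_cast<std::size_t>(aligned - start);
        if (offset > capacity_ || size > capacity_ - offset) return nullptr;
        used_ = offset + size;
        return base_ + offset;
    }

    // Constructs a T inside the slab. Limited to trivially destructible
    // types because reset() never calls destructors.
    template <typename T>
    T* create() {
        static_assert(std::is_trivially_destructible<T>::value,
                      "arena objects are never individually destroyed");
        void* p = allocate(sizeof(T), alignof(T));
        return p ? new (p) T() : nullptr;
    }

    void reset() { used_ = 0; }  // reuse the slab for a new group of objects
    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_;
};

// Hypothetical payload type for the demonstration.
struct Point {
    double x;
    double y;
};
```

Allocation is a few arithmetic operations and there is no per-object free, which is where the speed comes from. The price is that individual objects cannot be released early.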
As I said in another post, GC and the automatic release of a resource can be managed independently (when the resource is not allocated memory). It is just a language design issue.
In Lisp you have GC, and you have the unwind-protect macro that does the "RAII" idiom, and they are totally independent one from the other.
Having said that, I like the objects-in-the-stack feature of C++.
Closures and objects are quite different, but they can be used (almost) the same way.
However, I think that you don't need an anonymous class to implement an algorithm that is originally conceived using closures. A normal class is enough and easier to use.
The good thing about closures is that you write less code, and that code is usually more elegant.
I say this because I generally try new algorithms in Lisp, and when they are working just perfectly, translate the algorithm to C++ or php or whatever I'm using.
Sure. But the original question was about reasons for manual management other than performance. Plus, as others have pointed out in this thread, the performance of GC-based systems is not necessarily any worse than that of manually managed memory (even using custom allocators), depending on how the GC is used and how complex the things you're doing with allocated memory are. More importantly, for many applications GC is not the performance bottleneck, even when it does have lower performance than a manually managed solution. You have to look at the complete system, not just individual features of it.
But what if you need/want the initialization idiom?
A Government Is a Body of People, Usually Notably Ungoverned
Use GC unless you know precisely why you're not doing so. And even then, double check to make sure you're right.
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
Hi,
Have you ever heard of Ada? No, then read:
Easy (at wikibooks):
http://en.wikibooks.org/wiki/Ada_Programming/Types/access
Detailed (ISO/IEC 8652:1995(E)):
http://www.adaic.com/standards/rm-amend/html/RM-13-11.html
All you wanted - especially the safety part. Still, you can also use an "access all" if you need more flexibility.
The thing I like about Ada is that it comes with a lot of safety features, all of which I can switch off if I need to.
Martin
The invocation idiom does the same thing as the initialization idiom. It just implements it in a different way. A way that doesn't depend on whether or not you are using GC. So why would you need/want to use the initialization idiom?
So, I see the poster has recently picked up RAII. ;-) Yup, it's lots of fun.
However, what RAII provides is a way to do lexically scoped resource lifecycles. It's really handy for common cases, but it turns out that for complex situations it gets quite ugly, because it's hard to be certain that you are really "done" with a resource simply based on lexical analysis (hence why people end up doing things like reference counting).
Arguably, a better solution would be for the runtime to have knowledge of how to manage all types of resources, not just memory but file descriptors, disk space, database connections, etc. Indeed, a lot of the code in complex systems is all about implementing algorithms for generically managing such resources and then providing an abstracted interface to said algorithms.
However, in the imperfect, real world of programming, it's tough enough to find a runtime that can intelligently manage memory, let alone other resources. In systems that do have automatic memory management, it is still possible to have lexically scoped resource management (C#'s "using" clause as an example) for other resources. So, the resources get freed up when you drop out of your lexical context, but the memory associated with those resources gets managed separately by the runtime's automatic memory manager.
So, going with automatic memory management is really orthogonal to using something like the RAII paradigm for managing resources.
sigs are a waste of space
An idiom is a way of thinking. Initializing resources versus invoking resources are two different ways of thinking about resources. Neither one of them is wrong. What is wrong is to limit yourself to only one way of thinking.
The RAII link used files as an example. In most such cases invoking makes sense, because that's how you use a file. Rarely do you actually "initialize" a file. But you do it with data all the time. Getting rid of initialization is to get rid of object constructors. Even in vanilla C your data is conceptually composed of objects that need initialization. Since there is a high correspondence between data and memory in most software, it makes sense to allocate memory with the initialization idiom. (But not always, which is why you should never limit your way of thinking).
As some of the folks at C2 pointed out in the originally linked references, RAIInitialization is a misnomer anyway (as is RAIInvocation - but that was created in response to the other RAII). Most of what we're talking about here is (despite the name) finalization. That's the part that conflicts with GC. No one's saying that we should eliminate initialization or constructors. The point is that using the RAIInit idiom to handle resource finalization is not a good argument for manual memory management, because the RAIInvoc can produce the same finalization effects without producing any conflict with garbage collection.
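To show what the invocation style looks like in code, here is a small sketch written in C++ for consistency with the rest of the thread, although the idiom is normally associated with Lisp's unwind-protect or a try/finally block. The point is that the release happens in the invoking function, not in a destructor, so it does not depend on when (or whether) objects are destroyed. Connection and with_connection are hypothetical names, and since C++ has no finally clause, a catch-all with rethrow stands in for it.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource for illustration: it only tracks whether it is open.
struct Connection {
    bool open = false;
};

// Acquires the resource, invokes `body` with it, and releases it again on
// both the normal path and the exception path.
template <typename Body>
void with_connection(Connection& c, Body body) {
    c.open = true;  // acquire
    try {
        body(c);
    } catch (...) {
        c.open = false;  // release when the body fails
        throw;
    }
    c.open = false;  // release when the body succeeds
}

// Demonstration: the connection is closed again even though the body throws.
inline bool released_after_throw() {
    Connection c;
    try {
        with_connection(c, [](Connection& conn) {
            assert(conn.open);  // the resource is valid inside the body
            throw std::runtime_error("work failed");
        });
    } catch (const std::runtime_error&) {
        return !c.open;
    }
    return false;
}
```

Because nothing here relies on a destructor, the same structure carries over to a garbage-collected language unchanged.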
Except that RAIInvoc isn't a part of most languages, meaning I have to write it myself. Don't forget, RAIInvoc isn't just for memory, it's used for all resources, including files, threads, etc. The invocation idiom is a part of some languages, but not all. For those myriad languages for which it is not a part, are we supposed to do all the extra work of implementing it?
p.s. So where's the big push for file handle collection? If deallocating memory automatically is necessary, then where is the concern over closing files automatically? What makes calling free() so much more evil than calling close()?
I don't think anyone would be too happy using an OS kernel that relies on garbage collection. Seems like a lot of "pray and hope it works".
Anons need not reply. Questions end with a question mark.
I much prefer the current model...where one just prays and hopes it works.
If those languages are garbage collected, and you want some kind of RAII-style resource management, then yes, I suppose so - RAIInit won't work. Otherwise, by all means use RAIInit. My original point stands: RAIInit is not a good argument for abandoning GC.
So where's the big push for file handle collection?
Performing "file handle collection" (or "lock collection") would leave you in the same boat that attempting to use finalizers for closing files in a GC language would land you: files (or locks) would not be guaranteed to be closed (or released) at a predictable time. For the majority of applications it isn't necessary to have completely predictable object destruction (which is why GC can be used), but predictable file and lock release is a necessity.
My original point stands: RAIInit is not a good argument for abandoning GC.
Ummm, that wasn't your original point. You may have meant it to be, but it wasn't. Besides, you can't abandon GC if you've never adopted it...
For the majority of applications it isn't necessary to have completely predictable object destruction (which is why GC can be used), but predictable file and lock release is a necessity
Sometimes predictable object destruction IS necessary. But if your language doesn't support it because it considers memory to be a second class resource taking a hind seat to files and locks, then you're screwed.
One size does not fit all. That's all I'm saying. Right now as I type I should instead be finishing coding on this hardware device driver I'm getting paid to write. I'm trying to wrack my brains figuring out how total control over memory allocation and deallocation isn't necessary for this project, and I can't figure it out. The problem is not that I'm an exception to the rule, the problem is that you're trying to fit a rule to everyone.
Malloc a huge chunk of memory. Custom cut one or more items out of the huge chunk, manually managing byte alignment of the objects to ensure portability between different chipset architectures. This is what most people refer to as a "custom allocator" (possibly also a pool or slab allocator) and its performance is vastly higher than an equal number of new's in a GC language.
What about that scheme can't be done in a GC language when needed in critical sections and abandoned when not? Sure it's ugly to do in those languages, but so's writing your own GC in a non-GC language (e.g. SmartPointers).
Incidentally, this is the sort of thing that your OS should be doing for you. No one practically needs this level of control unless they're flying without a well-featured OS (i.e. working in an embedded context). People pulling this sort of nonsense in a normal application are just setting themselves up for a fall.
Just don't complain when you're shopping around for a product and find one that is 5x the speed of its GC'd competitors. Then you the customer are forced to pay the price for programmer laziness and/or convenience.
Just don't complain when you're shopping around for a product and find one that has 5x the number of crashes and exploits of its GC'd competitors. Then you the customer are forced to pay the price for programmer obsession with trying to optimize non-critical sections.
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
"Little does he know, but there is no 'I' in 'Idiot'!"
The original subject line: RAII is a bad reason for manual memory management (it's still the subject line now). Is manual memory management always a bad idea? No. Nor did I ever claim that it is a bad idea. What I said was that a desire to use the RAIInit idiom is not a good justification for manual memory management (specifically, because there are alternative idioms that work just as well in GC languages). I also asked for some other (i.e. non RAIInit-based) justification for manual memory management. Predictable object destruction is a perfectly good reason for manual memory management, and I'm certainly aware that there are applications that may require it. But that isn't the reason the original article gave.
Besides, you can't abandon GC if you've never adopted it...
Which would explain why I said
The second sentence is about languages that are not GC (and thus would not need to abandon GC). Frankly, I'm a little confused about what you're trying to argue about here. You started by asking about wanting/needing the initialization idiom, moved to the need for constructors, and eventually ended up at manual memory management being necessary for some tasks (because of reasons other than a need for RAIInit). So, what exactly are you trying to say that contradicts my original (and continuing) point, that "RAII is a bad reason for manual memory management"?
Never seen it done in any language other than C or C++. Perhaps it is possible in C# but you'd have to be in unsafe mode the whole time. I'd love to hear how it's done. I've had the good fortune of building an incremental GC system in C integrated into a memory manager, and it was tricky to get right, but with time it became rock solid, so it can be done.
Incidentally, this is the sort of thing that your OS should be doing for you. No one practically needs this level of control unless they're flying without a well-featured OS (i.e. working in an embedded context).
There are things the OS can do and there are things it can't. It would be nice for the OS to provide all relevant services, but sometimes they just aren't there, and other times for the sake of portability you have to manage it at the application level. This is more true for enterprise-class software than one-off hacks.
People pulling this sort of nonsense in a normal application are just setting themselves up for a fall.
I will agree that this is an arena for experts. Inexperienced or incompetent people will hang themselves when they wade in this deep. Look, it's all just memory anyway. What you do with it is up to you. If you consider C as a "high level assembly language" (which it is) then this kind of "nonsense" is par for the course and even expected. Would you rail against an assembly language programmer for doing this? Don't fault the language for allowing it (or the expert for doing it) simply because it looks like your idea of a high level language.
Just don't complain when you're shopping around for a product and find one that has 5x the number of crashes and exploits of its GC'd competitors. Then you the customer are forced to pay the price for programmer obsession with trying to optimize non-critical sections.
I disagree. You are describing a product with poor QA testing, not a defect inherent in manually managed memory. If you dislike unstable products blame the company foisting this piece of shite on the market. It's not a huge stretch to say software that is tested thoroughly tends to become more stable regardless of its memory management scheme. And I would hardly call memory alloc/free optimization non-critical. It is the single biggest (non-algorithm based) drain on processor time, beyond even synchronization.
You're listing things that can be made easier with the C++ RAII idiom. This has nothing to do with garbage collection. C#'s "using (...) { ... }" construct does what you're asking for. The only difference is that the scope is more explicit (which was a conscious design decision, not a constraint imposed by garbage collection).
There's a rule that 80 % of the cpu time is spent in 20% of the code. http://en.wikipedia.org/wiki/Pareto_principle So why not use both GC and non-GC languages together? Write the app first in a user friendly language like Python and then optimize the slow parts in C++. Then you can focus on your design and less on your heap.
The difference between the two is that if you give a non-GC tool to a beginner, it will blow up in his face and make it clear that the person needs more practice and experience in programming. Now give a GC tool to a beginner and he will think he is a professional. Only much later will the program collapse under its own weight, but to the beginner, it worked for a while, and can you really pin the bug on him? This discussion is void. If you're arguing FOR GC, you're a beginner, plain and simple. Advanced programmers use it when appropriate and not when it's disadvantageous. A language like Java that forces you to use GC, and where you can't tell the GC that an object should be deleted, is not a real language. When I program, I don't like having chains around my ankles. This is not bias. Java is a subset of C++ and always will be. It's just a fact, and if you use Java, then you must be aware of its limitations is all.
Pros:
Keeps house and yard cleaner
More efficient bulk processing of waste
Leaves room for technological advancements in disposal methods
Cons:
Encourages people to discard more stuff
Reduces public awareness of waste and pollution issues
Some garbage trucks are bad polluters
You see, I don't know what the rest of you are talking about, but I do want to contribute.
Language students: Don't try to learn English here. This ain't it.
In most cases, the total run-time cost of garbage collection is lower than that of malloc/free memory management, at the cost of higher on-average memory usage (which can obviously destroy performance if you end up having to swap). On the other hand, application-tuned manual memory management using pooled allocation is generally faster than GC. Whether or not pooled allocation increases memory usage as much as or more than GC depends on many things. Another consideration is that although GC often consumes fewer total CPU cycles than malloc/free, non-incremental collectors tend to use those cycles in big batches, which can produce GC 'pauses'. That's bad for some applications. Incremental collectors can minimize this effect, but only at some cost in CPU cycles.
At the company I used to work for, we handled our own memory management (to the point of overriding new and delete, and creating our own memory pool for certain tasks).
The three main problems this addresses in a game are: memory fragmentation, control of the AMOUNT of memory usage, and control of the TIMING of de/allocation.
In a game, especially a PS/2 game, memory is at a premium. You want to keep the code as streamlined as possible so you have enough memory to load levels that are a reasonable size.
Memory allocation is a CPU hit as well, especially deletion, as many others mentioned already. A GC usually cannot be relied upon to cleanup at opportune times in a game. For example, I might want to free objects when the player is in a menu but never when he's in combat.
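For anyone wondering what "overriding new and delete" with a pool looks like, here is a minimal sketch. It is not the code from the company mentioned above: FixedPool, Particle, particle_pool and the capacity of 64 are all invented for illustration, and a real game allocator would add thread safety, statistics and debugging hooks.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// A fixed block of equally sized slots with a stack of free slot indices.
// No heap calls after construction, so there is no fragmentation and the
// total memory used is known up front.
template <std::size_t SlotSize, std::size_t SlotAlign, std::size_t Count>
class FixedPool {
public:
    FixedPool() : free_count_(Count) {
        for (std::size_t i = 0; i < Count; ++i) free_[i] = i;
    }
    void* take() {
        if (free_count_ == 0) throw std::bad_alloc();  // pool exhausted
        return storage_ + free_[--free_count_] * SlotSize;
    }
    void give(void* p) noexcept {
        if (!p) return;
        const std::size_t index =
            static_cast<std::size_t>(static_cast<unsigned char*>(p) - storage_) / SlotSize;
        assert(index < Count && free_count_ < Count);
        free_[free_count_++] = index;
    }
    std::size_t available() const { return free_count_; }

private:
    alignas(SlotAlign) unsigned char storage_[SlotSize * Count];
    std::size_t free_[Count];
    std::size_t free_count_;
};

// Hypothetical game object whose allocations are routed to the pool.
struct Particle {
    float x = 0.0f;
    float y = 0.0f;

    static void* operator new(std::size_t size);
    static void operator delete(void* p) noexcept;
    static std::size_t free_slots();
};

// The pool is created on first use and holds at most 64 particles.
inline FixedPool<sizeof(Particle), alignof(Particle), 64>& particle_pool() {
    static FixedPool<sizeof(Particle), alignof(Particle), 64> instance;
    return instance;
}

inline void* Particle::operator new(std::size_t size) {
    assert(size == sizeof(Particle));  // derived classes are not supported here
    (void)size;
    return particle_pool().take();
}

inline void Particle::operator delete(void* p) noexcept {
    particle_pool().give(p);
}

inline std::size_t Particle::free_slots() {
    return particle_pool().available();
}
```

Calling code still writes `new Particle` and `delete p` as usual; only the source of the memory changes. Because the program decides when delete runs, it can defer cleanup to a menu screen and avoid it during combat.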
But am I against GC in general? Hell no. It is very helpful in avoiding programmer error and making applications more secure. It also provides a better C++ environment. Try writing code that uses exceptions w/o a GC, and you'll soon be pulling your hair out.
Anyhow just my $.02
SEAL
Good points, just one quibble:
A GC usually cannot be relied upon to cleanup at opportune times in a game.
That isn't necessarily true. It depends on your choice of GC implementation, and the way that you use it.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Look up the FFI sections on languages like haskell, a few common lisp implementations, etc, and use your imagination. Opaque data within FFI objects isn't subject to collection in most such systems.
Orbitz is a concrete system that uses this technique in its backend. The backend is primarily common lisp, with some data structures that aren't well suited for lisp land being accessed by FFI.
The data structure clients include access paths where "do this quickly" is important.
OS kernels are entirely too performance-critical to use garbage collection anyway.
hexactly. however, SUN thinks differently.