A Glance At Garbage Collection In OO Languages
JigSaw writes "Garbage collection (GC) is a technology that frees programmers from the hassle of explicitly managing memory allocation for every object they create. Traditionally, the benefit of this automation has come at the cost of significant overhead. However, more efficient algorithms and techniques, coupled with the increased computational power of computers, have made the overhead negligible for all but the most extreme situations. Zachary Pinter wrote an excellent article about all this."
An obvious fault that seems to go without notice about garbage collectors, particularly stop-and-copy collectors, is that whenever they do the full-blown stop and copy, they have to touch all of those memory pages and fault all of your virtual memory back into RAM.
But like compiling a compiler, can you *really* trust that it is not doing something nefarious?
I have been pwned because my
...it's required by them.
Stack-based languages like the C family (including Java) don't need GC to operate correctly, but can use it if it's available. (Java just has it all turned on by default.)
By "correctly," I'm specifically leaving out memory leaks. Your program may leak, but it will still run correctly, give the right answers to computations, not suddenly lose track of variables, etc. (Right up until you run out of swap.)
Those "other languages" the author dumps a list of don't use GC just to free the poor programmer from the burden of thinking, or whatever. Nearly every one of those languages either has support for functional programming, or is centered around it. And in functional programming, you're creating functions on the fly.
Which means returning functions as data. Possibly involving local variables in the creating function. Which means that locally-declared variables have to keep existing after the creating function returns, even if the coder can't get to them anymore. And the only way to do that is to have the runtime system manage its own heap, which means a garbage collector.
So for all those languages, it's not an "ease of use" thing. It's a "there's no way for a programmer to even do it manually at all" thing. GC is the only option.
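To make the "returning functions as data" point concrete, here is a minimal sketch in Python (picked purely for illustration; `make_adder`, `add5` and `add10` are made-up names). The local `x` has to outlive the call that created it, so it cannot simply sit in a stack frame that is popped on return:

```python
def make_adder(x):
    # x is a local of make_adder, but the inner function keeps using it
    # after make_adder has returned.
    def add(y):
        return x + y
    return add

add5 = make_adder(5)    # make_adder has already returned here...
add10 = make_adder(10)  # ...yet each closure still holds its own x
result = (add5(1), add10(1))
```

Nobody ever frees the two captured `x` values by hand here; the runtime reclaims them once `add5` and `add10` become unreachable.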
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Is reference counting really that bad? I use it all the time with a special smart pointer class, and I can't convince myself that it costs very much. Granted, the number of objects I create is generally low, but is it really a big deal to increment an integer, or provide 4 bytes of extra storage per pointer for the count? I suppose I can imagine cases of millions of object pointers to count, but it seems that it was dismissed a little too off-handedly. It's a really simple solution that may be applied to a wide variety of apps.
The article provided a fairly clear description of the various techniques for garbage collection. Having compared .NET 1.1 to JDK 1.4.2, I found both about the same in terms of GC performance. The primary benefit .NET has over Java is that the code is compiled to native and it does release memory, unlike Java. This shows the bias of each design. .NET is targeted at GUIs and clients, so it's important to release memory when the window is minimized. Since Java is geared towards servers, you wouldn't want to release memory because it could have unintended effects.
A previous poster noted that most GC algorithms are distinctly unfriendly to virtual memory systems. They usually have similar problems with cache locality, which can result in an enormous slowdown, regardless of the time actually spent in the GC itself. A practical problem is that GC regimes are notoriously non-portable, so that each new language implementation needs to have the (increasingly complex) GC redone.
A more fundamental problem is that memory is only one of many resources a typical industrial program must manage. GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually, as in C. (Java has this problem, for instance.) "Finalization" simply cannot provide the necessary guarantees.
Given a resource management regime that can handle all these other important resources, as is commonly practiced in C++, memory becomes just another resource. Management is encapsulated the same way for all. A language that lacks the tools necessary to implement such a regime needs GC, so the presence of GC may actually (as in the case of Java) indicate a fundamental weakness in the language.
(Anybody who thinks languages like Haskell or ML are fundamentally more powerful than C++ must be unaware of the Boost Lambda library, and of FC++, a set of header files that implements Haskell language semantics for C++ programs. They get along fine without GC, as well.)
Another flaw of ref counting is that if you have two objects which are no longer referenced by anything in the active application, but which hold references to each other, they will not get GC'ed, leading to memory leaks. Ref counts alone are just not good enough for any serious application, unless you force the programmer to look after cleaning up circular references, which kinda defeats a lot of the benefit of using a GC'ed language.
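The cycle problem is easy to reproduce. The sketch below assumes CPython specifically, which uses reference counting plus a separate cycle-detecting collector; other Python implementations may behave differently. `Node`, `make_cycle` and `probe` are illustrative names. With the cycle collector switched off, two objects that only reference each other stay alive; an explicit tracing pass reclaims them:

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # a -> b -> a
    return weakref.ref(a)        # weak reference: does not keep a alive

was_enabled = gc.isenabled()
gc.disable()                     # leave only plain reference counting active
try:
    probe = make_cycle()         # no outside references to a or b remain
    alive_with_refcounts_only = probe() is not None
    gc.collect()                 # run the tracing cycle collector by hand
    alive_after_tracing = probe() is not None
finally:
    if was_enabled:
        gc.enable()
```

On CPython the first flag should come out true (the counts never reach zero) and the second false.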
God this is crap.
How we know is more important than what we know.
> By "correctly," I'm specifically leaving out memory leaks.
What a thing to leave out. Memory leaks are one of the hardest-to-track-down
and most annoying kinds of bugs that we perpetually see in application after
application. Okay, crashes are more annoying and pervasive, sure. And
buffer overruns (which are not a problem in most languages that have GC,
albeit GC is not the reason they're not a problem). But memory leaks are
high on the list.
> And in functional programming, you're creating functions on the fly.
I'm trying to imagine a programming language that doesn't let you create
functions on the fly but is powerful enough for writing real applications.
The only thing I can come up with is that you could write what basically
amounts to an interpreter so that you wouldn't have to write "functions"
in the implementation language but could write them in the interpreted
language instead. But that seems like a really ugly hack, just to avoid
including real memory management in the compiler/interpreter/vm/whatever.
It is possible to get around the need for closures (i.e., anonymous routines
that hold references to otherwise-out-of-scope lexicals), if you have a
sufficiently powerful object system. But again, it seems like a questionable
goal; sometimes closures are really the most convenient way to accomplish
something. (Sometimes they're not, of course... that's why I favour
multiparadigmatic languages.)
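As a rough illustration of getting by without closures when the object system is up to it, here is a Python sketch (the class name `Adder` is invented for this example). The state a closure would capture implicitly becomes an explicit field, and the object is made callable:

```python
class Adder:
    """Plays the role of a closure: the captured value is an explicit field."""

    def __init__(self, x):
        self.x = x

    def __call__(self, y):
        # Calling the object behaves like calling the closure would.
        return self.x + y

add5 = Adder(5)
values = [add5(1), add5(10)]
```

It works, but the programmer has to decide by hand which variables to carry along, which is exactly the bookkeeping a closure does automatically.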
> So for all those languages, it's not an "ease of use" thing. It's a
> "there's no way for a programmer to even do it manually at all" thing.
> GC is the only option.
Strictly *theoretically*, the programmer can do all that stuff in any
Turing-complete language; it's possible to do functional programming in
8086 assembly language, for example, if you're willing to go far out of
your way to do it. But in practice, neither assembly language nor C
really makes that easy or practical, no. But then, there are actually
quite a lot of things that those languages don't make easy or practical.
Cut that out, or I will ship you to Norilsk in a box.
Dammit - I can't tell whether you are truly retarded or just trolling. But for anyone else who might be confused, the grandparent is referring to RAII (Resource Acquisition Is Initialization), which ties the lifetime of a resource to the lifetime of the managing object. This is the part that requires deterministic finalization, i.e. destructors. (The assumption is that the managing object's lifetime is controlled via something like auto_ptr or some ref-counted smart pointer.)
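For readers who have not met the idiom, here is a loose analogue in Python. This is not C++ RAII itself: Python has no scope-exit destructors, so the sketch swaps in a context manager, which gives the same "release happens at a known point, even on error" guarantee. `Connection` is a hypothetical stand-in for a scarce resource:

```python
class Connection:
    """Stand-in for a scarce resource such as a socket or DB connection."""

    def __init__(self, log):
        self.log = log
        self.open = False

    def __enter__(self):
        self.open = True
        self.log.append("acquired")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs when the block is left, whether normally or via an exception.
        self.open = False
        self.log.append("released")
        return False  # do not swallow the exception

log = []
try:
    with Connection(log) as conn:
        log.append("working")
        raise RuntimeError("something failed mid-way")
except RuntimeError:
    log.append("handled")
```

The release step is recorded before the exception reaches the handler, which is the deterministic behaviour that finalizers in a GC'ed runtime cannot promise.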
This is my kind of garbage collection!
It might be useful if some languages had an optional method of hinting that an object should be garbage collected soon. This would help in languages like Java where you get a huge amount of data stored and then all at once the disk thrashes as it GCs everything. For some algorithms, it would be nice to tell Java ahead of time that you're done with the object and you're not going to reference it anymore. The nice thing, though, is that it wouldn't be a requirement, so you wouldn't have to worry about deleting an object still in use by mistake. I wonder how efficient this would be.
There hasn't been a "discrepancy in efficiency". Good garbage collectors have been comparable to, or better than, manual storage allocators for decades.
The perception of a "discrepancy in efficiency" has several causes:
- Garbage collection allows programmers to get sloppy about storage management: if a non-GC program gets sloppy about storage management, it crashes; if a GC program gets sloppy about storage management, it just runs slowly. Unfortunately, as a result, many core libraries in garbage collected languages are pretty sloppily written and slow--the fault is with the libraries, not with garbage collection.
- Garbage collection allows language implementors to make different design decisions. Many garbage collected languages will do memory allocation every time you use a floating point number. Imagine how slow C would be if you called "malloc" for every floating point number.
- Garbage collection often bundles memory management overhead into single chunks of time, while manual storage allocators don't. Furthermore, garbage collector implementations really rub your nose in it, printing messages like "[starting garbage collection... done]". But doing a lot of storage management at once is usually more efficient overall--in aggregate, manual storage managers spend more time, they just diffuse it out. However, both kinds of behaviors exist with both storage managers, and you can pick and choose.
The article is right that garbage collection is a good choice today. It is wrong in implying that this is new: GC has pretty much always been a good choice. Garbage collection could have been widely adopted in the 1970s or 1980s, and we would have saved ourselves a lot of headaches and troubles without any loss in efficiency.
I feel like I just read a small section in the memory management chapter of an operating systems or programming languages textbook. I'm not sure what to discuss here; no new ideas were expressed or presented. Perhaps the author could have postulated new ideas for memory management or suggested how current ideas could be improved. Interesting read if you're a programmer who never really got into the mechanics of a programming language and what certain runtime systems do to make your program work. Then again, I would probably call you a strict-scripter, and when scripting you're generally more concerned with expressions than with mechanics.
Although, the point the author made about CPUs being cheaper and faster, and how this allows the programmer to care less and less about mechanics so they can use this extra power to make programming a more expressive rather than mechanical practice, is interesting.
Personally, I see no problem with one day having high level application programmers who know nothing of hex, memory management or physical hardware but rather algorithms, computability and productions, etc. Of course, there will always be a place for the "computer programmer", but also a place for the "analytical abstractionist engineer".
"If you are a dreamer, a wisher, a liar, A hope-er, a pray-er, a magic bean buyer
The extreme situations are the only ones that are valuable. If you are not coding an "extreme situation," your job is outsourced. Any application that can tolerate garbage collection is trivial. Thanks anyway -- I'll stick to C and assembly. At least I will have a job tomorrow.
Good article, though very limited in scope (basically just a list of GC methods, wrapping up with the methods used by recent Java and .NET interpreters). I was a little disappointed that they didn't get into the implications of using languages with GC.
One pitfall that I've noticed basically comes along with the benefit of avoiding "micro-managed" explicit memory management -- there are a lot of Java coders who don't think at *all* about memory management, because they think it's all handled for them. Mix that in with an over-excitement about OO, and you get some impressively slow and non-scalable code.
You DO need to understand, at least on a basic level, what's going onto the heap, and what the garbage collector has to do to keep up with your "garbage". Carefully nulling out objects that are going to be out of scope in a millisecond is just wasting space, but you should definitely keep an eye on what objects you're allocating within that loop that runs a million times. They're all going on the heap; are they all going to be on there at the same time? When are they going to be eligible for collection? Are they just Strings, or larger objects (which possibly create other objects when they are created)?
If you have to optimize a section of code, consider sticking to primitives and Strings (obviously you're balancing this against the cost of possibly less-maintainable code!), and don't forget that when you instantiate com.foo.Bar, the constructors of all of its superclasses run too, creating any member objects they hold. And don't make a variable static for no reason -- it won't get collected with the object instance....
Two useful things to think about -- heap size (the objects you're actively using at a given moment, so they can't be collected), and churn rate (how fast you're creating and trashing objects). Object creation/destruction isn't as costly as it was with the early versions of Java (no, you probably don't need that Thread pool!). But any application that needs to scale requires some thought on memory usage and churn before you start coding.
There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
So if we start using GC for everything will this yield programmers who produce sloppy code and don't know about memory management? If so, will it matter?
Admittedly I haven't used a language with native GC for a few years, but I don't have fond memories of it.
For small programs, i.e. less than 2000 or so LOC, I would prefer no garbage collection, so compiler directives would be nice; some object property that allows me to choose whether to place the object into the list of objects that may need GC would also be nice.
I knew some kid was going to start bitching and moaning about the memory leak comment. I'm not saying they're not important. I'm saying that one has nothing to do with the other.
C, C++, Java. None of these support closures or lambdas. C++'s Boost makes a good try, but none of them allow me to construct new functions using nothing more than standard language features.
This is insightful?
If you have something to say, pull your head out of your ass and say it.
It was mentioned earlier that reference counting was pretty good, but had a few drawbacks when it came to cycles and multi-threading.
...
I took a bit of time to go and read Wikipedia's page
In the description they give, they mention that reference counting GC can represent managed objects by directed graphs.
I know there exist algorithms to find cycles in such graphs, so I suppose these could be applied to this problem. Another proposal is to use a tracing GC to detect them. To which it was replied that this would be able to reclaim the memory but not to properly finalize the objects. I don't see why that would be true. I mean, if you have found a member of the cycle to be collected, can't you just finalize that one and let the whole cycle unravel itself? If there are cycles inside that cycle, just do it again on these, etc.
As I said, another common objection was the cost of updating the counters in multithreaded environments. Multiple solutions have been proposed, some more portable than others (using processor/platform specific atomic increments, or deferring the update until it is really necessary and using the standard mutex protection).
All this said, I try to understand a couple of things.
-I am no genius, thus these ideas must not be new, what is the problem which can't be solved with these?
-Reference counting seems to integrate better in the runtime of the program. All the other techniques proposed seem to imply some monolithic operation on the memory, summing up all the overheads at one time and doing the cleaning once in a while, with the possibility of becoming a bottleneck in heavily loaded systems. Reference counting OTOH seems to allow the cleanup to continually add a little bit of overhead to the system but nothing which will bring the whole thing to a grinding halt before allowing it to go on. What have I missed?
The reason quicksort is great is that its innermost loop is very tight: two incs, a cmp and maybe a data swap. All other sort algorithms have more complicated inner loops. So, even though its complexity is O(n log n) in the average case and O(n^2) in the worst case, quicksort wins because the constant in its k*(n log n) is smaller than, e.g., mergesort's j*(n log n).
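To show which loop is meant, here is a sketch of an in-place quicksort with a Hoare-style partition, written in Python for readability (so the "two incs and a cmp" are Python statements rather than machine instructions; the middle-element pivot is just one common choice, not something the post specifies):

```python
def quicksort(a, lo=0, hi=None):
    """Sort list a in place between indices lo and hi (inclusive)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:
        # The tight inner loops: one compare and one increment/decrement each.
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]   # the occasional swap
            i += 1
            j -= 1
    quicksort(a, lo, j)
    quicksort(a, i, hi)

data = [5, 2, 9, 1, 5, 6]
quicksort(data)
```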
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
Wish I had mod points for you C++ boys. It's really lame how slashdot is populated with Java-advocating idiots. I mean, I'm no Java lover, but I gotta feel sorry for the language when its proponents are retards...
GC has the problem of non-deterministic finalization. With reference counting, every time you give up a lock on an object (decrement the reference count), you check whether the count went to zero, so not only can you release the object, you can also invoke the object's destructor to close file handles and stuff like that.
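The decrement-then-check step can be sketched by hand. This is a toy in Python with invented names (`RefCounted`, `acquire`, `release`, `on_destroy`), not how any particular runtime implements it; the point is only that the cleanup runs at the exact moment the last reference is given up:

```python
class RefCounted:
    """Toy manually counted handle; on_destroy plays the role of a destructor."""

    def __init__(self, name, on_destroy):
        self.name = name
        self.count = 1                 # the creator holds the first reference
        self._on_destroy = on_destroy

    def acquire(self):
        self.count += 1
        return self

    def release(self):
        self.count -= 1
        if self.count == 0:            # last reference gone: clean up right now
            self._on_destroy(self.name)

events = []
f = RefCounted("logfile", on_destroy=lambda name: events.append("closed " + name))
alias = f.acquire()                    # count is now 2
f.release()                            # count drops to 1, nothing happens
events_before_last_release = list(events)
alias.release()                        # count drops to 0, destructor runs here
```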
A couple of relevant references for garbage collection are the following website (which unfortunately hasn't been updated for a while - still, it's useful):
The Memory Management Reference
and of course Jones and Lins book, Garbage Collection: Algorithms for Automatic Dynamic Memory Management
> > I'm trying to imagine a programming language that doesn't let you
> > create functions on the fly but is powerful enough for writing real
> > applications.
>
> C, C++, Java.
[Scratches Java off list of languages to learn.]
I know C and C++ have been traditionally used for writing applications, but I
have long been of the opinion that they're not really powerful enough for the
job. It takes several times as many programmer-hours as it ought to to do
anything, from prototyping to new feature work to debugging, which IMO means
that "powerful enough" is a real stretch. These languages get by and continue
to be used at this point mostly because a lot of people know them.
In the past, these languages were selected because programmer time was cheaper
than computer resources (with which they're more miserly than a higher-level
language), but that's no longer anywhere near true, as the article points out;
the *average* computer has enough RAM to run three horribly-inefficient
extreme memory-hog applications at the *same time* without needing any swap,
and newer models are coming with more and more. You talk about GC screwing
up virtual RAM algorithms, but it's really not an issue on most systems; if
a process grows to three or four *times* the size it needs to be, it doesn't
actually have any user-noticeable impact on performance. Memory leaks are
actually much worse, because in that case the wasted memory doesn't ever get
collected and eventually it becomes a problem, after a couple of hours of
use. (Actually, a very small memory leak can go for days without being a
problem, but those aren't the ones we notice so much.) In 1996, when most
consumer-grade operating systems were so stable that you had to reboot every
few hours, memory leaks weren't such a big deal (provided you had lots of
swap space), but now that almost any modern OS (and most applications) can
run for weeks and weeks if not months or even years without being restarted,
memory leaks are now a big deal. It's okay to continually use five times as
much RAM as you technically need; it's not okay for your memory requirements
to keep growing as a function of how long you've been running, because that
can get to be *way* more than five times what you need.
Back to creating functions on the fly, I'm just a little bit surprised to
learn that Java doesn't have such an important feature; I had been led to
believe it was a relatively high-level language with fairly high-level
features. It runs on a virtual machine, for crying out loud; I had imagined
it would be fairly modern and flexible in its design. Are you sure it can't
create functions on the fly, or is that just something you don't know how to
do in Java? That's a pretty serious accusation to level at a language,
almost as bad as saying it can't allocate extra memory on the fly.
"I'm trying to imagine a programming language that doesn't let you create functions on the fly but is powerful enough for writing real applications."
In most functional languages you can write something like this:
In this OCaml code, plusx is created "on the fly", and it is a different function depending on the value of x that is read at runtime. How do you do this in C?
> Memory leaks are one of the hardest-to-track-down and most annoying kinds of bugs that we perpetually see in application after application.
Well, there are plenty of applications that never need to (or should for optimal performance) release any memory, only shuffle it around, and have well defined points for releasing other resources (such as closing files and sockets) that can't be left to be done in the background. Such applications have no fear of memory leaks (well, mostly anyway, you can always screw up with data structures and pointers).
> I'm trying to imagine a programming language that doesn't let you create functions on the fly but is powerful enough for writing real applications.
This I don't quite understand. Any compiled language by definition can't create functions on the fly; every function needs to be compiled before the program is run. So what do you mean by "creating functions on the fly", actually?
It has different collectors, which you can select according to the needs of your application. Currently there are two, the default collector (generational) and an incremental collector which is slower but less likely to pause.
Also, the default collector is a 3-generation one, not 2, at least as of Java 1.4.1. More details here.
> Any compiled language by definition can't create functions on the fly
This is flat-out false. There are various compiled languages (compiled as
in compiled to native machine code, yes) that not only allow creating functions
on the fly but actively encourage it. Common Lisp is just one example. Yes,
garbage collection gets compiled in. (This is no weirder than compiling a
memory-management library into a C program, and actually being standardized
is an advantage.)
Besides that, the whole compiled-versus-interpreted-languages argument is
getting fairly blurry these days. It's no longer as simple as C and C++ on
the one extreme, which take hours to compile and then run on systems that
don't even have a compiler, and BASIC on the other extreme where you can stop
the program while it's running, change some variables and maybe some lines of
code, and set it running again (possibly at a different line) in-progress
with the state intact. There are all kinds of in-between cases now, Perl
and Java and Python and so on, which technically are both compiled and
interpreted or neither or somewhere in-between. Java runs on a virtual
machine, okay, and Perl6 will, but what do you do with Perl5 and others like
it, which don't really run on a vm per se but have separate compile-time and
run-time phases yet allow more code to be compiled later at run time (through
eval and things like it), ... and then there's JIT compilation... and then
you have compilers that take languages designed to compile to a virtual
machine and instead compile them to native machine code for a specific
platform...
i wish people would learn LISP before claiming LISP is impossible.
the trick is to have a runtime system that includes a parser and compiler for your language; it can then compile any newly created functions on the fly for you. it's not as grotty as it sounds - we've got several decades of experience with it already, most of the bugs got ironed out back fifteen years ago or so.
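The same trick exists outside Lisp. As a small sketch, Python ships its compiler in the runtime, so a function can be built from text that only exists at run time. The source string, `scale`, `factor` and the `"<generated>"` label are all made up for this example:

```python
# Source text that could just as well have come from user input or a file.
source = """
def scale(value):
    return value * factor
"""

namespace = {"factor": 3}                    # globals the new function will see
code = compile(source, "<generated>", "exec")
exec(code, namespace)                        # defines scale inside namespace

scale = namespace["scale"]
result = scale(14)
```

Here `compile` and `exec` are the built-in entry points to the parser and bytecode compiler carried by the runtime. Running text this way is only safe when the source is trusted.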
of course, an alternative method is to not have your language be compiled, or be at most bytecode-compiled; interpreters and byte-code compilers are often a bit lighter-weight than a "full" native-language compiler, so don't burden you with quite as large a runtime library. whether the gain is worth it is a bit debatable, though.
Are you sure it can't create functions on the fly, or is that just something you don't know how to do in Java? That's a pretty serious accusation to level at a language, almost as bad as saying it can't allocate extra memory on the fly.
Java can't create functions on the fly as LISP or Scheme can do. It does have runtime reflection and class loading. This means that classes (and therefore the methods in those classes) can be loaded at runtime. But it would be quite a bit of work to use this facility to generate new functions from inside a program.
I'm more curious as to why you are so adamant that generating functions at runtime is such an important capability. Many people avoid runtime code generation because they find it harder to reason about. Could you give me some examples where you have used runtime function-generation to good effect?
char a[]="lbiitgt l e \n\n\0";main(){for(char*c=a; *(short*)c;c+=2){putchar(*(short*)c);}}
The implementation of higher order functions as closures does not require garbage collection, if you are willing to leak memory. The same exact issue comes up whenever someone returns an allocated object from a function in Java or C. The creation of a closure is an allocation of an object (which may copy values from the current stack frame into it), but the stack frame still goes away when you leave it, as do the local variables.
You seem to have a mistaken impression of the way functional languages are implemented.
On top of that, I don't see why we'd even bother talking about an implementation where we leak memory indefinitely.
Stack-based languages like the C family (including Java) don't need GC to operate correctly
Correct memory management in the presence of heap-allocated mutable objects requires garbage collection, even in C.
There simply is no way around it: you can allocate an arbitrary graph of objects, and what can be freed depends on what is reachable from the roots.
And if garbage collection isn't built into the language, then the language has to be unsafe, like C is.
So, if you want a safe language with heap allocation of mutable objects, you need garbage collection. Lexical closures have nothing to do with it.
Which means that locally-declared variables have to keep existing after the creating function returns, even if the coder can't get to them anymore. And the only way to do that is to have the runtime system manage its own heap, which means a garbage collector.
There is no such requirement: you can deallocate closures manually just like any other data structure. But once people get to that point, they generally realize that it's a bad idea.
Hi. I'm not the grandparent poster, but I'd like to address your question about when runtime function generation works well.
First of all, remember that a runtime generated function can be as simple as a string of bytecode -- then you just need an interpreter (either as a library or written in the language in question). More advanced languages can emit native machine code and copy it into executable memory pages, but technically you could even spawn a new process to invoke the compiler and then use dlopen (hint: you'd need a consistent way of defining symbol names, and you might need to invoke ld and/or nm)...
Runtime generated functions are usually combined with lexical closures or blocks, and that's what most people actually mean when they refer to runtime generated functions. By allowing a 'child' function to access its parents' local variables, you can produce some really nifty effects. But is this really any different from OO programming? Barely. You can simulate it in C++ by allocating a new object with a "virtual obj operator()(obj* arg0, ...)" member function, then you can call the result like a function. Combine this with the runtime code generation method mentioned above (i.e. invoke g++ and use dlopen), and you're good to go.
*cough* ok that's a lot of work, and I pity whoever is tasked with maintaining such a system. ;) It's a lot easier to let the language decide which variables are needed for closures and silently do the necessary memory allocations in the background.
Now to answer your original question: when are runtime generated functions useful? Answer: Any application where the user (or administrator) is allowed to write code that affects the running program: Word processors (emacs uses a lisp dialect), spreadsheets, database access programs, CAD programs (autocad uses/used lisp), MUDs/MMORPGs, etc. (In a strict technical/hair-splitting sense, this also includes dynamically loadable kernel modules, and if you want to split even more hairs, it also includes every program you've ever used from the shell/gui.)
I'm ready to believe you're simplifying this deliberately to illustrate functional programming techniques, but I think the simplification here is confusing.
It's important to keep in mind the difference between code routines and closures. The term "function" as is commonly used doesn't respect this difference. C's "functions" are code routines, while ocaml's are closures, i.e. a pair of a routine and an invocation frame.
What's being "created on the fly" are closures, which are like stack frames (storage for local values of identifiers in an invocation), but which:
- are allocated in the heap,
- have a pointer to a "parent" frame (the bindings in the enclosing environment), and
- have unlimited extent (since the invocation of a closure A might return a closure B whose "parent" is A, requiring that A be kept around indefinitely after the call to it returns).
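The "routine plus captured environment" pairing can be poked at directly in some runtimes. The sketch below uses CPython's introspection attributes; note that CPython stores captured variables in flat "cell" objects rather than a literal chain of parent frames, so treat it as an analogy to the description above, not a picture of how OCaml does it. `outer`, `middle` and `inner` are illustrative names:

```python
def outer(a):
    def middle(b):
        def inner(c):
            # inner uses bindings from both enclosing invocations
            return a + b + c
        return inner
    return middle

inner = outer(1)(2)   # both enclosing calls have returned by now

# Pair each free variable name with the heap cell that keeps it alive.
captured = {
    name: cell.cell_contents
    for name, cell in zip(inner.__code__.co_freevars, inner.__closure__)
}
total = inner(3)
```

The bindings of `a` and `b` live on for as long as `inner` is reachable, which is the unlimited extent described above.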
I think the point the original poster is making can be expressed in another way, but one that's more revealing of what's at stake: stack allocation is a form of automatic memory management.
In any modern language, there is some form of automatic storage management behind the scenes for function-local storage. Imagine if in C, you had to manually allocate the stack frame of any function you called, and every function had to deallocate its frame before returning. This would be tedious and repetitive. Automatic management of a stack of limited-extent frames provides the programmer a simple (but restricted) way of doing this.
It would be possible to have a functional language where the storage for closures was managed by hand. Imagine a language like C except that it allowed you, when you called a function, to specify a heap-allocated binding to be used in the invocation, instead of a stack frame.
This would be similar to the hypothetical C variant from above, where the programmer was responsible for creating and destroying stack frames. But much harder, since closures have unlimited extent. In the stack allocation case, it's clear when the allocations and deallocations need to happen (before a function is called, and before one returns). In the manually allocated closure language, the programmer would have to figure out on his own the extent of every closure, and when and where it's safe to free them. This is not simply tedious like in stack allocation, but rather devilishly complex in general.
So, garbage collection solves it.
Are you adequate?
> Memory leaks are actually much worse
GC doesn't protect against memory leaks, so I fail to see how it is relevant to the GC vs no-GC discussion.
Also, considering that nearly all the languages currently used do not allow you to create functions at runtime, you could have a serious problem finding a job if you refuse to learn them.
Why not employ a processor with its own RAM, separate from the main system, to do nothing but GC? Expensive? Not really. Consider a machine with 8 processors and 8GB of RAM for an app server. What's 1 more of each to the cost? This solves the paging problems as memory can be paged into the separate memory space. It does not eat up processor because it is a dedicated task. It only hurts disk performance when the disk has to be hit for a page fault. A dedicated "device" (term used loosely here) would create an almost impact-free way to manage memory. Would this work? I'm still thinking it out myself.
I tried for 5 years to come up with a clever sig...only to realize that I am not clever.
I guess that is a good example of what I am talking about. Gcc does not make anyone any money, directly. And it is not an especially great compiler either. Gcc is a wonderful commodity. Try benchmarking low level code with the (also recently free beer) MS compiler (under W2000) against gcc (under W2000 or Linux) and you will find out what I am talking about.
I did not intend the comment as a troll, though I suppose it was overly terse. People are constantly worried about outsourcing and the devaluation of engineering jobs. I'm not saying easier jobs have no value or are not worth doing, just that harder problems have higher relative value.
Basically, I think everything will become a commodity except what is still hard. And the only way to attack what is still hard is at the lower levels, because the main problem with hard problems is consistently time. The "extreme situations" are the ones we should be attacking; in the beginning all we worked on were extreme situations, and I and other software engineers were far more professional and respected than we are today. By trying to abstract and simplify so much we have denigrated the value of the computer profession as a whole, usually for the benefit of the proponents of the specific "solution" or abstraction. It's great that it is so easy to put together an online transaction system these days, but whatever happened to natural language recognition?
Also, there is a big difference between using task specific garbage collection in the context of a proprietary data structure and trying to develop a general abstract collector. It is even worse to disallow programmatic memory management like Java does.
I should try being more eloquent. I was obviously in a bad mood when I wrote that comment.
To an extent, it is like being a cabinet maker. There are different kinds of people who make cabinets. Some are in love with wood and form. Some build cabinets to make money. Some people are actually more interested in the tools than the piece they are building.
It is an axiom of the artistic cabinetry world that the best work is done with the fewest and simplest tools. A band saw, a couple hand saws, some chisels, a couple of hand planes, a bit and brace. That's pretty much it. The tools force you to work directly with the thing that matters -- the wood and the construction. With these tools you have complete expressiveness in the material and complete control. You keep track of exactly what the grain of the wood is doing, and how the joints are holding up.
Many of the people who work as custom cabinet makers make a lot of money. Their pieces are worth hundreds of thousands of dollars. Other people work as laborers in furniture making factories. Of course here the managers apply the "different tools for different jobs" notion, though the employees probably don't care that much -- they get paid by the hour. It is the carefully chosen "different tool" that is most likely to take their fingers off anyway.
One of the programmers I have the most respect for, the one who wrote my favorite programming books, is Donald Knuth. His books contain some of the most advanced constructs I have seen in book form, including a lengthy discussion on garbage collection. But he chose to present his ideas using an assembly language -- MIX. Garbage collection is not the tool, it is a product of the tool. I think the reason I responded so strongly to the assertions of the parent article is because it is like (for instance Java is like) Large Woodworking Corp telling me I cannot use hand saws and chisels anymore. LWC says I have to buy their premolded modular furniture components and join them together with LWC fasteners. Which was not really the intent of the article. The article was just talking about improvements in GC techniques. But the statement about GC being a panacea was absurd -- unless you are working on a nearly trivial problem.
There is a disease amongst Computer Scientists that makes us get lost in the "tools". It is as if our job is to tell other people how to solve problems, instead of pursuing the solution of real problems ourselves. What is the "wood" of computer science? I think the substance is the problem, especially the hard problem. How are we doing with computer vision, with natural language, with common sense and reasoning? Not too good. A couple of decades ago it got hard, despite our initial optimism. So everyone gave up and started selling "tools" instead of promising solutions. I'm sure that, if anything, all these tools just get in the way.
Actually the number has increased for the SERVER VM (YES, THERE IS A DIFFERENCE BETWEEN CLIENT AND SERVER VM FOR THOSE OF YOU DOING BENCHMARKS!)
It's now more like 4 or 5 garbage collectors to choose from rather than 2.
Java provides a parallel garbage collector for the Server VM as an option. Not the same, but it takes advantage of multi-processing environments.
Now to answer your original question: when are runtime generated functions useful? Answer: Any application where the user (or administrator) is allowed to write code that affects the running program: Word processors (emacs uses a lisp dialect), spreadsheets, database access programs, CAD programs (autocad uses/used lisp), MUDs/MMORPGs, etc.
I can't speak for DB access programs or CAD programs, but I've played around with the scripting features of office suites and MUDs enough to state that what these programs implement is probably not what most programmers think of when they hear the term "runtime function generation". What you're describing is more like parsing. MUDs, for example, might be written in C, but then invent their own specialized scripting language that simply is an API for calling the C functions that were written by (human) programmers.
If you accept "parsing" as a form of "runtime function generation", you could stretch the analogy so that you could consider a file deletion program to do runtime function generation. The program asks you if you're sure you want to delete a file. Based on the user input, it either generates the "deleteFile" function (if it receives 'y'), or the "doNothing" function (if it receives 'n').
When I hear "runtime function generation", I picture perhaps an operating system that detects security holes and patches itself without the need for a human developer to tell it what the security flaw was in the first place (of course, merely connecting to the internet to download patches off a server is cheating). Or perhaps an OS for which you'd use its paint application to draw rough sketches of screenshots, and the OS figures out what you want to do from those screenshots. If it already has that feature, great, it runs it for you. Otherwise, it writes a new program that does it for you. Perhaps I could describe to the computer that I want a game like Diablo II, except set in the future, and it'd generate the program for me.
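For what it's worth, the two things being conflated can be put side by side in a small Python sketch. The names (`delete_file`, `do_nothing`, `choose_action`, `make_scaler`) are invented for illustration, and the "delete" routine is a stub that only returns a string.

```python
def delete_file(path):
    # Stub: a real program would remove the file here.
    return f"deleted {path}"


def do_nothing(path):
    return f"kept {path}"


def choose_action(answer):
    # Selection: both routines were written ahead of time by a programmer.
    # User input only picks between them; nothing new is created.
    return delete_file if answer == "y" else do_nothing


def make_scaler(factor):
    # Creation: each call builds a new closure whose behaviour depends on
    # a value that was not known when the program was written.
    def scale(x):
        return x * factor
    return scale


picked = choose_action("n")
kept = picked("notes.txt")

double = make_scaler(2)
triple = make_scaler(3)
scaled = (double(4), triple(4))

# Even so, only the closure is new; the code routine is compiled once.
shared_code = double.__code__ is triple.__code__
```

So what a functional language creates "on the fly" is closures over existing routines. It sits somewhere between the file-deletion prompt and a system that writes brand-new code.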
Actually, the last real data I saw (but note that this was some time ago) was that 7 of the 10 most visited web sites in the world ran on C++ back-ends. Sorry, no link off the top of my head, but IIRC there was some discussion of that statistic in these parts, so a search will probably turn it up. Do you have any more up-to-date information?
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
the *average* computer has enough RAM to run three horribly-inefficient extreme memory-hog applications at the *same time* without needing any swap
You know my computer runs at least 50 apps at the moment, not 3 or 4. RAM is still not a resource conscious developers prefer to waste.
You talk about GC screwing up virtual RAM algorithms, but it's really not an issue on most systems; if a process grows to three or four *times* the size it needs to be, it doesn't actually have any user-noticeable impact on performance.
Depends. It's not noticeable only if that memory is not actually accessed, i.e. no frequent cache misses and, God forbid, swaps.
Memory leaks are actually much worse, because in that case the wasted memory doesn't ever get collected and eventually it becomes a problem, after a couple of hours of use.
I'm sick of the "memory leak" argument. Are we comparing well-written apps here, or sloppily written ones? If all GC does is let you afford sloppiness in software development and still have a "decently" performing app, then that is how much it's worth.
I think you mean "mark and sweep" collectors. "Stop and copy" collectors just trace the working set from whatever your heap root is. Add in the copy step, and you only touch twice the size of your working set. If your collector is well-written and the OS provides the hooks, it will ask for the new space to be allocated in core, and the old space to be discarded, wherever it is.
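A toy model of that claim, sketched in Python. It is not any real collector: `Obj` and `copy_collect` are hypothetical names, and plain Python objects and lists stand in for the two semispaces. The copy is done breadth-first in the style of Cheney's algorithm, with the to-space list doubling as the work queue.

```python
class Obj:
    """Toy heap object with a name and outgoing references."""

    def __init__(self, name, refs=None):
        self.name = name
        self.refs = refs or []


def copy_collect(roots):
    """Copy everything reachable from roots; return (new_roots, to_space)."""
    forwarding = {}   # old object -> its copy (objects hash by identity)
    to_space = []     # copies in allocation order; also the scan queue

    def forward(obj):
        copy = forwarding.get(obj)
        if copy is None:
            # Copy the object; its refs still point into from-space for now.
            copy = Obj(obj.name, list(obj.refs))
            forwarding[obj] = copy
            to_space.append(copy)
        return copy

    new_roots = [forward(r) for r in roots]
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        # Redirect each reference to the to-space copy, copying on demand.
        obj.refs = [forward(child) for child in obj.refs]
        scan += 1
    return new_roots, to_space


a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b)
b.refs.append(a)                      # a <-> b form a live cycle
garbage = [Obj(f"dead{i}") for i in range(1000)]
from_space = [a, b, c] + garbage      # 1003 objects in the old space

new_roots, to_space = copy_collect([a])
live_names = sorted(o.name for o in to_space)
```

Only `a` and `b` are ever visited: each is read once in the old space and written once in the new one. The 1001 unreachable objects are never touched, which is where "twice the size of your working set" comes from.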
In the great CONS chain of life, you can either be the CAR or be in the CDR.
When ref goes out of scope, the object it references becomes available for collection. It's that simple.
In fact, your misconceived example negates the whole point of GC-- your ref = 0, in terms of programmer logic, amounts to free(ref), which is exactly what you don't have to do if you have GC!
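A quick way to watch this happen, sketched in Python rather than Java. `Buffer` and `work` are made-up names, and the weak reference is only a probe that does not keep the object alive. CPython reclaims the object as soon as the function returns; the explicit `gc.collect()` is there for runtimes that collect later.

```python
import gc
import weakref


class Buffer:
    """Placeholder for some object the function allocates."""


def work():
    ref = Buffer()               # the only strong reference is this local
    probe = weakref.ref(ref)     # a weak reference does not keep it alive
    alive_inside = probe() is not None
    # No 'ref = None' here: returning ends the scope of 'ref'.
    return probe, alive_inside


probe, alive_inside = work()
gc.collect()
collected = probe() is None
```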
Are you adequate?
If there is something bad programmers should be forbidden from authoring, libraries are it.
Are you adequate?
In good programming style, a function should be short and simple: it should do just one thing, and return. Which means that any references in local scope should cease to exist quickly. If your function allocates objects A, B, and C, does something with them, then allocates D and E, does something else with those two, and then returns, you should rewrite that as two functions.
During this time, your program is using more memory than it needs. This time may be the whole duration of the program if your object is referenced by a global variable --> memory leak.
Good programming style also avoids the use of global variables. And if one does need such variables, one certainly doesn't stick ephemeral values in them.
Essentially, the scope of a variable is a way of managing the lifetime of the objects it refers to. If you want an object to live forever, you reference it from a variable with a very wide scope. If you want it to live for a very short time, you confine it to a narrow scope. Given these reasonable programming practices, GC can identify objects as soon as they are available for collection.
If you have unneeded objects that can't be collected, you're mis-scoping references.
Are you adequate?