Slashdot Mirror


A Glance At Garbage Collection In OO Languages

JigSaw writes "Garbage collection (GC) is a technology that frees programmers from the hassle of explicitly managing memory allocation for every object they create. Traditionally, the benefit of this automation has come at the cost of significant overhead. However, more efficient algorithms and techniques, coupled with the increased computational power of computers, have made the overhead negligible for all but the most extreme situations. Zachary Pinter wrote an excellent article about all this."

216 comments

  1. An Obvious Fault by vandel405 · · Score: 2, Interesting

    An obvious fault of garbage collectors that seems to go without notice, particularly with stop-and-copy collectors, is that whenever they do the full-blown stop and copy, they have to touch all of those memory pages and fault all of your virtual memory back into RAM.

    1. Re:An Obvious Fault by p2sam · · Score: 3, Insightful

      An obvious fault of sorting algorithms that seems to go without notice, particularly with bubble sort, is that it takes O(n^2) time to complete.

    2. Re:An Obvious Fault by dvdeug · · Score: 1

      What?!? Bubble sort is O(n^2), but sorting algorithms can be O(n log n), like quicksort.

    3. Re:An Obvious Fault by ComputerSlicer23 · · Score: 1
      I'm not sure if I'm being trolled, but as a technicality, quicksort is actually O(n^2) in the worst case. It just so happens that if you pick a good pivot point, its average case is n * log(n). Heapsort and merge sort are O(n * log(n)) even in the worst case. Quicksort is also handy because, if I remember correctly, it is easy to write a stable quicksort, and that's harder to do with merge and heap sorts.
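
To make the worst-case versus average-case distinction concrete, here is a minimal sketch that counts comparisons in a quicksort that always picks the first element as its pivot. The function name and the pivot rule are assumptions chosen for illustration, not anything the poster specified.

```python
import random


def quicksort_comparisons(items):
    """Sort a copy of items with a first-element-pivot quicksort.

    Returns the number of element comparisons performed.
    """
    data = list(items)
    comparisons = 0
    # An explicit stack of (lo, hi) ranges avoids hitting Python's
    # recursion limit on the degenerate (already sorted) input.
    stack = [(0, len(data) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = data[lo]
        i = lo  # data[lo+1 .. i] holds the elements smaller than the pivot
        for j in range(lo + 1, hi + 1):
            comparisons += 1
            if data[j] < pivot:
                i += 1
                data[i], data[j] = data[j], data[i]
        data[lo], data[i] = data[i], data[lo]  # move the pivot into place
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return comparisons


if __name__ == "__main__":
    n = 1000
    shuffled = list(range(n))
    random.Random(0).shuffle(shuffled)
    print("sorted input:  ", quicksort_comparisons(range(n)))
    print("shuffled input:", quicksort_comparisons(shuffled))
```

On already sorted input this pivot rule peels off one element per partition, so the count is exactly n(n-1)/2; on shuffled input it stays close to n log n.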

      Kirby

    4. Re:An Obvious Fault by Anonymous Coward · · Score: 0

      The other advantage of quicksort is that the n in that O(n log n) is a much smaller n than the n in the n of the merge/heap sorts' O(n log n)s. So it's normally an order of magnitude faster. Of course, if you hit an unfortunate case giving you the O(n^2) complexity things get slower.

    5. Re:An Obvious Fault by Anonymous Coward · · Score: 1, Informative

      An obvious fault of garbage collectors that seems to go without notice, particularly with stop-and-copy collectors, is that whenever they do the full-blown stop and copy, they have to touch all of those memory pages and fault all of your virtual memory back into RAM.

      That's why people use things like generational garbage collectors. Newer objects are generally shorter-lived, so newer objects get collected more often.

      What that means is that a generational collector will usually be operating only in main memory, and it'll head into swapped-out virtual memory only once in a while. End result? The problem you describe becomes much rarer, and often doesn't occur at all in normal operation.
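
The effect can be sketched with a toy model. Everything below (the class, its method names, and the object counts) is hypothetical, and it deliberately leaves out what a real generational collector needs, such as tracking pointers from old objects to young ones. It only counts how many objects each kind of collection has to examine.

```python
class ToyGenerationalHeap:
    """Toy model: objects are plain ids, and `live` is the set still referenced."""

    def __init__(self):
        self.young = set()
        self.old = set()
        self.scanned = 0  # total objects examined by all collections so far

    def allocate(self, obj_id):
        self.young.add(obj_id)

    def minor_collect(self, live):
        # Only the young generation is examined; survivors are promoted.
        self.scanned += len(self.young)
        self.old |= self.young & live
        self.young.clear()

    def major_collect(self, live):
        # A full collection examines every object in both generations.
        self.scanned += len(self.young) + len(self.old)
        self.old = (self.old | self.young) & live
        self.young.clear()


def simulate(use_minor):
    """Allocate 1000 long-lived objects, then 10 bursts of 100 short-lived ones."""
    heap = ToyGenerationalHeap()
    long_lived = set(range(1000))
    for obj_id in long_lived:
        heap.allocate(obj_id)
    heap.major_collect(long_lived)
    next_id = 1000
    for _ in range(10):
        for _ in range(100):
            heap.allocate(next_id)
            next_id += 1
        if use_minor:
            heap.minor_collect(long_lived)
        else:
            heap.major_collect(long_lived)
    return heap.scanned


if __name__ == "__main__":
    print("minor collections scanned:", simulate(True))
    print("full collections scanned: ", simulate(False))
```

In this model the minor collections never touch the 1000 old objects again, which is the property that keeps a generational collector away from paged-out memory most of the time.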

    6. Re:An Obvious Fault by 10101001+10101001 · · Score: 2, Interesting

      That's not how big O notation works. It's actually defined as:

      If f is O(n log n), then

      f(n) <= c * (n log n) + k

      for all sufficiently large n, where c and k are constants and f is the running time of the algorithm being measured (here, its worst case). n is always the number of elements being sorted (for sort algorithms). So, the issue is what k and c are in each of the various algorithms. It might be that c and k are huge for heap/merge sort, but with quicksort as O(n^2), it'd take either a hugely massive c and/or k or a really small n for quicksort to always be faster than heap/merge sort.

      Of course, as was pointed out, quicksort is more trivial to write, generally gives n log n performance, and apparently has a lower c value (IIRC, heapsort has a c of 2). So, people use quicksort even though quicksort isn't guaranteed to always have the best time. Heck, people still use bubblesort (which is fine for really small n's, as bubblesort's k is really small). Personally, I'd rather sort algorithms (and gc algorithms) be stuffed into system-wide libraries, possibly letting an outside function choose which to use. gmp, which uses several different methods for doing bignum integer math, is a good example of a library choosing the right algorithm for the job. It'd seem smart to have an equivalent sort routine that picks an algorithm based on n and on either worst or average case, as chosen by the programmer.
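
A minimal sketch of that "let the library pick" idea follows. The names `library_sort` and `SMALL_N` and the cutoff value are assumptions, and insertion sort stands in for the bubble sort mentioned above as the low-overhead choice for small inputs.

```python
import heapq

SMALL_N = 16  # hypothetical cutoff; a real library would tune this by measurement


def insertion_sort(items):
    """O(n^2) in the worst case, but with very little overhead per step."""
    out = list(items)
    for i in range(1, len(out)):
        value = out[i]
        j = i - 1
        while j >= 0 and out[j] > value:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = value
    return out


def heap_sort(items):
    """O(n log n) even in the worst case, using the standard-library binary heap."""
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def library_sort(items):
    """Choose an algorithm from the input size, so the caller does not have to."""
    items = list(items)
    if len(items) <= SMALL_N:
        return insertion_sort(items)
    return heap_sort(items)


if __name__ == "__main__":
    print(library_sort([5, 3, 8, 1]))
    print(library_sort(range(40, 0, -1))[:5])
```

The caller only ever sees `library_sort`; the choice between the quadratic and the n log n routine is an internal detail that can change without touching calling code.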

      Big O Notation

      --
      Eurohacker European paranoia, gun rights, and h
    7. Re:An Obvious Fault by hding · · Score: 2, Insightful

      Obviously the poster's point was that there are better garbage collection algorithms that do not generally suffer from the original poster's problems, as there are sorts that generally perform better than bubble sort. It was intended to be a tad sarcastic, I'd think.

    8. Re:An Obvious Fault by Anonymous Coward · · Score: 0

      Well, you remember somewhat incorrectly: mergesort is naturally stable, given implementation details, of course, whereas making quicksort stable makes the "quick" part of the name a misnomer. Heapsort you remember correctly.
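
The "implementation details" amount to one comparison in the merge step. This is a small sketch under assumed names (`merge_sort_by` and the sample records are made up for illustration): taking from the left half on ties is what keeps equal keys in their input order.

```python
def merge_sort_by(items, key):
    """Merge sort that orders items by key(item) and preserves input order on ties."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_by(items[:mid], key)
    right = merge_sort_by(items[mid:], key)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left half on ties; using "<" here would
        # still sort correctly but would no longer be stable.
        if key(left[i]) <= key(right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


if __name__ == "__main__":
    records = [("pear", 2), ("apple", 1), ("fig", 2), ("kiwi", 1)]
    print(merge_sort_by(records, key=lambda r: r[1]))
```

With the sample records, "apple" stays ahead of "kiwi" and "pear" stays ahead of "fig", because each pair shares a key and appeared in that order in the input.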

      One reference

    9. Re:An Obvious Fault by p2sam · · Score: 1

      I seem to recall 3 fancy capital Greek letters:

      Big-Omega, Big-Oh, Big-Omega

      If I recall from algorithms class ...

      1) lower bound
      2) upper bound
      3) average case (or something similar)

      So isn't qsort Big-Omega( n log n ), and Big-Oh( n ^ 2 ) ?

      PS: "Oh" *is* a greek letter, right?

    10. Re:An Obvious Fault by p2sam · · Score: 1

      gah, second omega == theta ... I'm a dumbass ...
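
For reference, the three notations are usually defined as bounds on a function rather than as best, worst, and average cases. The symbols f, g, c, and n_0 below are the conventional textbook ones, not taken from the thread.

```latex
\begin{align*}
f(n) \in O(g(n))      &\iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \le c\, g(n) \ \text{for all}\ n \ge n_0 \\
f(n) \in \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \ge c\, g(n) \ \text{for all}\ n \ge n_0 \\
f(n) \in \Theta(g(n)) &\iff f(n) \in O(g(n)) \ \text{and}\ f(n) \in \Omega(g(n))
\end{align*}
```

Each notation can be applied to any case separately: quicksort's worst-case running time is \Theta(n^2) and its average-case running time is \Theta(n \log n).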

    11. Re:An Obvious Fault by UnknownSoldier · · Score: 1

      > but sorting algorithms can be O(n log n), like quicksort.

      You forgot to say "comparison-based". Sorting can be done in O(n), à la Radix Sort, which can be adapted to sort floating-point numbers in O(n).
      i.e.
      Radix Sort Revisited by Pierre Terdiman
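
As a rough illustration of the idea (not Terdiman's implementation), here is a least-significant-digit radix sort for non-negative integers. The function name and the 8-bit digit size are assumptions; handling floats or negative numbers needs the extra key transformations the linked article describes.

```python
def radix_sort(values, bits_per_pass=8):
    """LSD radix sort for non-negative integers; returns a new sorted list."""
    values = list(values)
    if not values:
        return values
    if min(values) < 0:
        raise ValueError("this sketch only handles non-negative integers")
    mask = (1 << bits_per_pass) - 1
    largest = max(values)
    shift = 0
    # One pass per digit; no element is ever compared with another element.
    while (largest >> shift) > 0:
        buckets = [[] for _ in range(mask + 1)]
        for v in values:
            buckets[(v >> shift) & mask].append(v)
        # Appending in order keeps each pass stable, which LSD radix sort relies on.
        values = [v for bucket in buckets for v in bucket]
        shift += bits_per_pass
    return values


if __name__ == "__main__":
    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

The number of passes depends on the key width, not on the number of elements, which is where the O(n) claim for fixed-width keys comes from.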

      --
      "I'd rather be idealistic, so people are inspired at what might be,
      Then be realisic and not have any hope of what could be."

    12. Re:An Obvious Fault by Chipaca · · Score: 1
      PS: "Oh" *is* a greek letter, right?

      Omicron is its name.

    13. Re:An Obvious Fault by blamanj · · Score: 1

      That's one of the reasons that generational collectors were created. The "young" generation is where most of the activity takes place and it tends to be in your working set anyway.

    14. Re:An Obvious Fault by angel'o'sphere · · Score: 1

      Radix sort is based on the length of the numbers in bits.

      So if you sort 8-bit values, it is O(k * N), where k = 8 and N = the number of values.

      So for reasonably small values of N, k is log N. And for bigger values of N, k is min(length(X), log N).

      While it is "faster" than n log n for really huge amounts of numbers and reasonably small bit lengths, it is in general only O(n log n) and not O(n).

      OTOH, there are similar sorting algorithms like "sort by counting" (that's the translation from German ... it's up to you to google for it :D). Those algorithms are O(n).
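
"Sort by counting" is usually called counting sort in English. Here is a minimal sketch, assuming the keys are integers in the range 0..max_value (the function and parameter names are made up for illustration):

```python
def counting_sort(values, max_value):
    """Sort integers known to lie in 0..max_value; returns a new list."""
    counts = [0] * (max_value + 1)
    for v in values:
        if not 0 <= v <= max_value:
            raise ValueError("value outside the assumed key range")
        counts[v] += 1
    out = []
    # Walk the key range once and emit each key as often as it was seen.
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out


if __name__ == "__main__":
    print(counting_sort([3, 0, 2, 3, 1, 0], max_value=3))
```

The work is one pass over the n inputs plus one pass over the key range, so it is linear in n only while the key range stays small relative to n.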

      angel'o'sphere

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    15. Re:An Obvious Fault by aled · · Score: 1

      My sorting algorithm is O(1). Its only constraint is that it requires an ordered array as the input.

      --

      "I think this line is mostly filler"
    16. Re:An Obvious Fault by CableModemSniper · · Score: 1

      I think that was his point. Just because one particular method of garbage collection does bad things, doesn't mean that there isn't a better method.

      --
      Why not fork?
  2. GC always looks great by ObviousGuy · · Score: 0

    But like compiling a compiler, can you *really* trust that it is not doing something nefarious?

    --
    I have been pwned because my /. password was too easy to guess.
  3. Sigh. It's not a "feature" of other languages... by devphil · · Score: 3, Insightful


    ...it's required by them.

    Stack-based languages like the C family (including Java) don't need GC to operate correctly, but can use it if it's available. (Java just has it all turned on by default.)

    By "correctly," I'm specifically leaving out memory leaks. Your program may leak, but it will still run correctly, give the right answers to computations, not suddenly lose track of variables, etc. (Right up until you run out of swap.)

    Those "other languages" the author dumps a list of don't use GC just to free the poor programmer from the burden of thinking, or whatever. Nearly every one of those languages either has support for functional programming, or is centered around it. And in functional programming, you're creating functions on the fly.

    Which means returning functions as data. Possibly involving local variables in the creating function. Which means that locally-declared variables have to keep existing after the creating function returns, even if the coder can't get to them anymore. And the only way to do that is to have the runtime system manage its own heap, which means a garbage collector.
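
A small illustration of a local variable outliving the function that declared it. Python is used here only as a convenient garbage-collected language, and `make_counter` is a made-up name.

```python
def make_counter(start):
    count = start  # a local variable of make_counter

    def increment():
        # The returned function keeps `count` alive after make_counter returns;
        # the runtime has to store it on the heap, not on make_counter's stack frame.
        nonlocal count
        count += 1
        return count

    return increment


if __name__ == "__main__":
    counter = make_counter(10)
    print(counter(), counter())  # 11 12
```

Each call to `make_counter` creates a separate `count`, and the caller has no way to free it explicitly; it can only be reclaimed once the returned function itself becomes unreachable.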

    So for all those languages, it's not an "ease of use" thing. It's a "there's no way for a programmer to even do it manually at all" thing. GC is the only option.

    --
    You cannot apply a technological solution to a sociological problem. (Edwards' Law)
  4. Reference Counting... by wanerious · · Score: 1

    Is reference counting really that bad? I use it all the time with a special smart pointer class, and I can't convince myself that it costs very much. Granted, the number of objects I create is generally low, but is it really a big deal to increment an integer, or provide 4 bytes of extra storage per pointer for the count? I suppose I can imagine cases of millions of object pointers to count, but it seems that it was dismissed a little too off-handedly. It's a really simple solution that may be applied to a wide variety of apps.

    1. Re:Reference Counting... by complete+loony · · Score: 1

      When the reference counting is automatic, it happens everywhere. Consider a string class, and a "simple" operation like s = s + "foo". The old s goes out of scope and a new s is created, so if you're doing lots of these operations, reference counting does add a lot. But you're right, manual reference counting on objects that don't get created and destroyed a lot shouldn't cost too much.

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    2. Re:Reference Counting... by I_Love_Pocky! · · Score: 1

      If you really aren't creating that many objects, why even bother freeing memory? Just because you were always told to? If you aren't going to use much memory, just use new at will. Shoot, if you can get away with it, just use the stack for all of your memory.

      I would guess that in actuality you probably do need quite a few objects, or reference counting wouldn't even be necessary.

      By the way reference counting becomes a pain when you have circular references.

    3. Re:Reference Counting... by bluGill · · Score: 2
      One day a novice came to the master and said "I have this great idea for GC, we just count references." To which the master replied one day a novice came to the master and said "I have this great idea for GC, we just count references."

      One of my favorite little sayings that applies here. I suggest you look for the original if you haven't seen it, I can't paraphrase it as good.

    4. Re:Reference Counting... by jonadab · · Score: 2, Informative

      > Is reference counting really that bad?

      Refcounting can't collect everything if you have any circular references. It's
      what Perl5 has, and we live with it, but Perl6 is getting real garbage
      collection (mark-and-sweep I think, or at any rate something more advanced
      than refcounting).
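
The same limitation can be observed in CPython, which the thread also describes as reference counted with a backup collector. This sketch assumes CPython specifically (other Python implementations manage memory differently), and the `Node` class and helper names are made up for illustration.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that point at each other; return a weak reference to one."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


def demo():
    was_enabled = gc.isenabled()
    gc.disable()  # leave only reference counting active
    try:
        lone = Node()
        lone_ref = weakref.ref(lone)
        del lone  # the count drops to zero, so the object is freed at once
        acyclic_freed = lone_ref() is None

        cycle_ref = make_cycle()
        # Both nodes are unreachable now, but each still holds a count on the other.
        cycle_survives_refcounting = cycle_ref() is not None

        gc.collect()  # the cycle detector finds and frees the pair
        cycle_freed_by_gc = cycle_ref() is None
    finally:
        if was_enabled:
            gc.enable()
    return acyclic_freed, cycle_survives_refcounting, cycle_freed_by_gc


if __name__ == "__main__":
    print(demo())
```

On CPython all three flags come back true: the lone object dies immediately, the cycle survives reference counting alone, and an explicit collection reclaims it.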

      --
      Cut that out, or I will ship you to Norilsk in a box.
    5. Re:Reference Counting... by Pseudonym · · Score: 4, Insightful

      It can be.

      Let's ignore circular references for a moment. To be honest, cycles don't turn up as often as people claim in programs where reference counting is done manually (or through smart pointers) because people are smart enough to know the issues and avoid them (e.g. by using weak references or other non-owning pointers to break cycles).

      For a start, reference counting interacts badly with multithreading. The reference count has to be protected against concurrent updates, and that can cost a lot, especially if the count is already effectively protected in some other way (e.g. by only being used single-threadedly). This is such a problem that many C++ library vendors are doing away with reference counting in their std::basic_strings.

      Secondly, every time you copy a pointer, you modify the reference count. Every single time. Sometimes (e.g. if you take a temporary local copy) that will be in cache, but not always. If there's contention between CPUs (see previous point), for example, the count will bounce between them. Sometimes it's an almost guaranteed cache miss.

      Admittedly, this isn't such a big problem in C++-implemented reference counting, because the programmer is usually far more aware of what's going on with pointer copying and will go to some lengths to avoid copying, but it can cost if reference counting is automatic. Have a look at the Python source code some time and see just how much trouble it goes to to avoid manipulating reference counts.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    6. Re:Reference Counting... by Anonymous Coward · · Score: 0

      Is reference counting really that bad?

      No. GC purists often turn up their noses at it, but reference counting has its place. The important thing is to know the ramifications.

      Ref Counting is simple to implement, and thus very portable. It is said by some to be easier on the CPU cache than other methods (object counts are kept with the object, and objects most frequently used are already in the cache). It can't handle cycles, but a cycle collector can be made that periodically breaks unreachable cycles (with the important caveat of not being able to "finalize" the object, however). Object destruction is easily handled with ref counting, whereas other collectors have more problems (which order to finalize objects in, for example, and when exactly it should occur)

      One big disadvantage, besides cycles, comes when you must manually adjust the count of objects (such as when you are writing a language extension). Managing the counts can be tricky at first, before you get the hang of it.

      These days, if you can use a more advanced garbage collector, you're life will be easier. But ref counting is a very simple solution, is often "good enough", and is thus here to stay. Ideally, a language design and implementation would be GC neutral, allowing for a variety of possible schemes behind the scenes.

    7. Re:Reference Counting... by Circuit+Breaker · · Score: 3, Interesting

      Reference counting can interact nicely with multithreading on modern (post `96) hardware - most modern CPUs have this nice "compare-and-swap" atomic operation, which can be used to manage refcounts without any form of locking. Yes, it is a little less efficient and a little more intricate, but it's doable. In Windows, for example, the operations are called "InterlockedIncrement()" and "InterlockedDecrement()".

      Also, in many environments you DON'T modify the reference count every time you copy a pointer; there's a concept called "borrowed references" which is used in Python, COM, and many other ref count schemes to avoid some useless refcounts.

      Python (pre 2.0) used to do only refcount, and did it much better than Java (using GC) in all respects except thread friendliness. Modern python (2.0 and beyond) does both -- but it's extremely rare for the gc to be needed at all.

    8. Re:Reference Counting... by Anonymous Coward · · Score: 0

      Oh, come on, it's in the Jargon file...

    9. Re:Reference Counting... by arkanes · · Score: 1

      This is one reason why I like C++ so much - because you aren't locked into a refcounting model. Especially with role-based templates, you can use any number of different styles of smart pointer (weak refs, copy on write, threadsafe, simple auto_pointers...) with very little trouble. A language that handles ref counting for you must do it in a more generic fashion and will lack either functionality or performance (or both) at one time or another. JIT and very smart compilers can help with some of this (eliminating resource locking if they can tell that a resource is never accessed from multiple threads), but they aren't a panacea, and I don't think any current JIT implementations can do this anyway.

    10. Re:Reference Counting... by Anonymous Coward · · Score: 0

      Reference counting does not necessarily interact badly with multithreading. The count may need protection against concurrent updates or may not. The programmer can decide. I like the reference counted smart pointer in the Loki-lib (http://sourceforge.net/projects/loki-lib/). Alexandrescu's policy-based design principles allow great freedom to the application implementor to choose what capabilities each object needs.

    11. Re:Reference Counting... by hak1du · · Score: 1

      Python (pre 2.0) used to do only refcount, and did it much better than Java (using GC) in all respects except thread friendliness. Modern python (2.0 and beyond) does both -- but it's extremely rare for the gc to be needed at all.

      That's a myth. Python reference counting doesn't even get close to the performance of a reasonable garbage collector. Even the Python implementation as it is can be sped up by replacing its reference counting scheme with the Boehm garbage collector (if you do it correctly).

    12. Re:Reference Counting... by Pseudonym · · Score: 1
      The programmer can decide.

      I believe I said precisely that.

      OTOH, there are many circumstances where the programmer has no choice, such as with most built-in C++ string implementations. This is a bad design decision. Library implementors, especially standard library implementors, should not dictate a specific performance model for you.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    13. Re:Reference Counting... by Pseudonym · · Score: 1
      Reference counting can interact nicely with multithreading on modern (post `96) hardware - most modern CPUs have this nice "compare-and-swap" atomic operation, which can be used to manage refcounts without any form of locking.

      On most platforms (IA32 being an exception, for reasons to be outlined below), no, it doesn't require explicit OS-level locking, but it does require a load/store barrier and cache synchronisation. On some architectures, this can be even more expensive than a cache miss.

      Intel IA32 chips don't require these because a) they don't use load/store buffers, and b) caches are automagically synchronised. The downside is that IA32 chips don't vertically scale very far. You can't put more than about 4 Pentia on a single motherboard before the cache synchronisation overhead swamps the bus completely. (Thankfully, IA32 systems are fairly cheap to scale horizontally, but that's another story.)

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    14. Re:Reference Counting... by Anonymous Coward · · Score: 0
      Python (pre 2.0) used to do only refcount, and did it much better than Java (using GC) in all respects except thread friendliness. Modern python (2.0 and beyond) does both -- but it's extremely rare for the gc to be needed at all.

      Is this some special usage of the term "better" that I'm not familiar with? Because hilariously, grotesquely broken comparisons aside, my experience is that Python clocks in at almost exactly one-tenth the speed of Java. Not that there's anything wrong with that -- you use the tools appropriate to the job.
    15. Re:Reference Counting... by Anonymous Coward · · Score: 0

      I can't paraphrase it as good.

      "as well."

    16. Re:Reference Counting... by Anonymous Coward · · Score: 0

      you're life will be easier.

      "your".

    17. Re:Reference Counting... by Animats · · Score: 1
      Yes. If automated memory management is to be retrofitted to C++, it should be reference-counted, not garbage collected. Calling destructors from the garbage collector means that they're called at more or less random times. This makes them unsuitable for controlling resources like files, windows, and such.

      Perl has good reference-counting semantics, with strong and weak pointers. In structures with back-pointers, the back-pointers should be weak, which avoids cycles.
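
The weak back-pointer pattern looks like this in Python's `weakref` module. This is a sketch of the same idea, not of Perl's semantics; the `Parent` and `Child` classes are hypothetical, and the immediate reclamation shown relies on CPython's reference counting.

```python
import gc
import weakref


class Child:
    parent = None  # will hold a weak reference to the owning Parent


class Parent:
    def __init__(self):
        self.children = []

    def add(self, child):
        child.parent = weakref.ref(self)  # weak back-pointer: does not own the parent
        self.children.append(child)       # strong forward pointer: parent owns child


def demo():
    was_enabled = gc.isenabled()
    gc.disable()  # show that no cycle collector is needed here
    try:
        parent = Parent()
        child = Child()
        parent.add(child)
        linked = child.parent() is parent
        del parent  # the last strong reference goes away, so the parent is freed
        back_pointer_cleared = child.parent() is None
    finally:
        if was_enabled:
            gc.enable()
    return linked, back_pointer_cleared


if __name__ == "__main__":
    print(demo())
```

Because the back-pointer never contributes to the parent's count, the structure contains no strong cycle and plain reference counting is enough to reclaim it.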

      I've made some proposals on this. But adding strictness to C++ isn't popular, which means another decade of buffer overflow exploits.

    18. Re:Reference Counting... by Anonymous Coward · · Score: 0
      Well, that's why you combine them.

      Python uses reference counting, and occasionally runs a GC to weed out any leftover circular references.

    19. Re:Reference Counting... by oliveira · · Score: 1

      There is no such thing as a built-in string implementation. If you don't like one vendor's standard library, you take another one's, simple as that. There's a lot of choice when it comes to C++ standard libraries: Dinkumware, STLport, and SGI, to name a few.

  5. informative by Anonymous Coward · · Score: 0

    The article provided a fairly clear description of all the various techniques for garbage collection. Having compared .NET 1.1 to JDK 1.4.2, I found both are about the same in terms of GC performance. The primary benefit .NET has over Java is that the code is compiled to native code and it does release memory, unlike Java. This shows the bias of each design. .NET is targeted at GUIs and clients, so it's important to release memory when the window is minimized. Since Java is geared towards servers, you wouldn't want to release memory because it could have unintended effects.

    1. Re:informative by Anonymous Coward · · Score: 0

      .NET is as suitable for server applications as it is for client ones. Have you heard of ASP.NET?

      On the contrary, Java is primarily used on the server. With .NET being only 2 years old, I already have a couple of apps on my desktop that use it. Java is, what, 8-9 years old? No Java app is in sight for me in the near future!

      I don't buy your comment of releasing memory when a window is minimized. This just doesn't happen. You're seriously confused here.

  6. Under the Rug by Markus+Registrada · · Score: 4, Informative
    As with most such presentations, this article sweeps under the rug most of the reasons why languages dependent on garbage collection have always failed to find much deployment in industrial settings.

    A previous poster noted that most GC algorithms are distinctly unfriendly to virtual memory systems. They usually have similar problems with cache locality, which can result in an enormous slowdown, regardless of the time actually spent in the GC itself. A practical problem is that GC regimes are notoriously non-portable, so that each new language implementation needs to have the (increasingly complex) GC redone.

    A more fundamental problem is that memory is only one of many resources a typical industrial program must manage. GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually, as in C. (Java has this problem, for instance.) "Finalization" simply cannot provide the necessary guarantees.

    Given a resource management regime that can handle all these other important resources, as is commonly practiced in C++, memory becomes just another resource. Management is encapsulated the same way for all. A language that lacks the tools necessary to implement such a regime needs GC, so the presence of GC may actually (as in the case of Java) indicate a fundamental weakness in the language.

    (Anybody who thinks languages like Haskell or ML are fundamentally more powerful than C++ must be unaware of the Boost Lambda library, and of FC++, a set of header files that implements Haskell language semantics for C++ programs. They get along fine without GC, as well.)

    1. Re:Under the Rug by sfjoe · · Score: 2, Interesting

      ...languages dependent on garbage collection have always failed to find much deployment in industrial settings.

      Huh? The world's busiest e-commerce websites are largely written in Java. Just what is your definition of "industrial settings"? If you mean that Java isn't used much in a foundry, then I guess you're right.

      --
      It's simple: I demand prosecution for torture.
    2. Re:Under the Rug by Dr.+Bent · · Score: 1

      A language that lacks the tools necessary to implement such a regime needs GC, so the presence of GC may actually (as in the case of Java) indicate a fundamental weakness in the language.

      Please explain to me exactly how you can implement a resource management system (or regime as you call it) in C++ for, let's say, managing socket connections, that has no equivalent in Java. You are aware of this method, right?

      Then, please, further explain to me how a task performed by a software platform, in this case the Java Virtual Machine, can show a weakness in a programming language. Do you mean to imply that If I use Jython or Groovy that I won't have this problem? Or are you just making the rookie mistake of confusing the Java 2 platform with the Java language?

    3. Re:Under the Rug by WayneConrad · · Score: 4, Informative

      GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually, as in C.

      Ruby has an interesting approach using closures to handle manual resource allocation. One calls the function that allocates a resource, passing it a closure. The function allocates the resource, calls the closure, and then deallocates the resource (even if an exception occurs). Here's how you might write to a file the manual way:

      file = File.new("foo", "w")
      file.puts "My mistress's eyes are nothing like the sun"
      file.close

      That's the usual way, easy to get wrong: What if an exception occurs? What if I forget to call close? Here's the better way, calling File.open and passing it a closure:

      File.open("foo", "w") do |file|
        file.puts "My mistress's eyes are nothing like the sun"
      end

      File.open might use this common idiom:

      def File.open(filename, mode = "r")
        file = File.new(filename, mode)
        begin
          yield(file)
        ensure
          file.close
        end
      end

      The "yield" calls the closure that was passed in, passing it the file object. The "begin...ensure" is like Java's "try...finally" construct, used here to make sure that the file gets closed whether the closure terminates normally or raises an exception.

      This idiom doesn't solve all manual resource allocation/cleanup problems, but it's a pretty way to solve some of them.
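
For comparison, the same acquire, use, and always-release shape can be sketched in Python with a generator-based context manager. The helper name `opened` is made up; Python's built-in `open` already works with `with` directly, so this only shows how such a wrapper could be built.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def opened(path, mode="r"):
    """Open a file, hand it to the `with` body, and close it no matter what."""
    f = open(path, mode)
    try:
        yield f  # plays the role of Ruby's yield(file)
    finally:
        f.close()  # plays the role of Ruby's ensure


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "foo")
        with opened(path, "w") as f:
            f.write("My mistress's eyes are nothing like the sun\n")
        print(f.closed)
```

The body of the `with` statement stands in for the Ruby block, and the `finally` clause runs whether the body finishes normally or raises.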

      I don't think Ruby invented this idiom, but I don't know where it came from. Perhaps Lisp: Everything seems to have come from Lisp.

    4. Re:Under the Rug by Pseudonym · · Score: 4, Insightful
      A previous poster noted that most GC algorithms are distinctly unfriendly to virtual memory systems.

      It depends on the language. Haskell, for example, has very different memory access patterns than Java. Being lazy, a value is produced only when it's time to be first consumed, at which point it often becomes garbage immediately. It follows that most of the garbage that a decent generational GC will be collecting will probably be in cache.

      Anybody who thinks languages like Haskell or ML are fundamentally more powerful than C++ must be unaware of the Boost Lambda library [...]

      I'm one of those rarest of beasts, a programmer who regularly uses (and likes) both Haskell and C++. (Disclaimer: I'm not familiar with FC++, though from what I've read it doesn't really support lazy evaluation, which is one of Haskell's most important distinguishing features.)

      From a reductionist point of view, of course, neither is more powerful than the other. However, even with Boost.Lambda and the like, I still find Haskell almost always allows for far more rapid development than C++ does, all other things being equal. Naturally, all other things are rarely equal, and speed of development is not always the greatest concern, and I won't be drawn into ranking one of my two favourite languages over the other.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    5. Re:Under the Rug by Brandybuck · · Score: 1

      One could say that whipped cream fails to find much deployment in industrial settings. And one would be correct in saying so, regardless of the volume output of the twinkie factory.

      --
      Don't blame me, I didn't vote for either of them!
    6. Re:Under the Rug by Brandybuck · · Score: 1

      You bring up a good point. Whenever people talk about the glories of garbage collection, I always wonder why they think memory is the only resource. And if I'm supposed to manually deal with those other resources, why am I not supposed to likewise manually deal with memory?

      --
      Don't blame me, I didn't vote for either of them!
    7. Re:Under the Rug by BCoates · · Score: 4, Informative
      That sounds like the way a C++ destructor is used with the "Resource Acquisition is Initialization" model. You'd open a file by creating an object on the stack, and the destructor would close the file handle once control leaves the scope (or when the object is deleted, if it is on the heap).

      // some_file_object is a hypothetical file i/o object with manual open(), close(), write(), etc. functions

      class File : public some_file_object {
      public:
          File( const std::string & fname ) : m_handle( open(fname) ) {}
          ~File() { close(m_handle); }
          // Copying is disabled so that two objects never close the same handle.
          File( const File & ) = delete;
          File & operator=( const File & ) = delete;
      private:
          const handle m_handle;
      };

      It's sort of inside-out relative to the Ruby version because it doesn't use a closure, but the usage is near-identical:

      {
          File file( "foo" );
          file.write( "There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy." );
      } // close happens here, or at the throw/return/break/continue site, if any

      new/delete is just another open/close pair to be avoided or contained in a small object when practical, which reduces the benefit gained from using GC.
    8. Re:Under the Rug by Fat+Cow · · Score: 1

      In fact, C# can make handling of those other resources less reliable and more complex as compared with C++ for example. I miss smart pointers!

      --
      stay frosty and alert
    9. Re:Under the Rug by DavidTurner · · Score: 3, Informative

      Well put!

      Another important consideration is that where the programmer has the expectation that his garbage will be cleaned up for him, he will tend to assume that all of his resources will be cleaned up. This is clearly not the case. The seminal example for me was the use of database query result sets in C# - if you don't explicitly close them, they tend to hang around, and the next time you try to perform a query on the same connection, you'll as likely as not get an exception. Surprise!
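
The same discipline applies in other garbage-collected languages. As a sketch only (this uses Python's standard `sqlite3` module rather than the C# API described above, and the `users` table and function name are invented for the example), a result cursor can be closed deterministically instead of waiting for the collector:

```python
import sqlite3
from contextlib import closing


def fetch_names(conn):
    """Run a query and release the cursor as soon as the rows have been read."""
    with closing(conn.cursor()) as cur:
        cur.execute("SELECT name FROM users ORDER BY name")
        return [row[0] for row in cur.fetchall()]
    # The cursor is closed when the with-block exits, even on an exception.


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("bob",), ("alice",)])
    print(fetch_names(conn))
    conn.close()
```

The point is that the lifetime of the cursor is tied to a block of code, not to whenever the garbage collector happens to notice it.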

      Also, as some other posters have pointed out, not all garbage collection is automatically bad. It works pretty well in Lisp, Scheme, Haskell, and friends. However in the cases of Java and C# it is certainly detrimental, as it disables the only really effective mechanism for managing resources in those languages - destructors.

      Perhaps the thing to do is introduce an analogue for Lisp unwind-protect mechanisms. I suggested this a while back on the Java community forums, in the form of having sentry objects with the lifetime semantics of C++ automatic objects. Someone made the suggestion that the volatile keyword could be twisted to serve this purpose.

    10. Re:Under the Rug by DavidTurner · · Score: 2, Interesting
      Please explain to me exactly how you can implement a resource management system (or regime as you call it) in C++ for, let's say, managing socket connections, that has no equivalent in Java. You are aware of this method, right?

      Okie dokie (pardon the bad formatting):

      class TCPSocket
      {
      int handle_;
      public:
      TCPSocket()
      {
      handle_ = socket (AF_INET, SOCK_STREAM, 0);
      }

      ~TCPSocket()
      {
      close(handle_);
      }
      };

      I think you'll find that there's no equivalent to the above in Java.

      This may prove to be good reading, if you are interested in learning more.

    11. Re:Under the Rug by tyrecius · · Score: 1

      This is a very useful idiom, and I am glad to hear that it is implemented in Ruby. There is a way to make this idiom even more powerful. In C++, an object defines the ownership. Object copying is also allowed. Which means that transfer of ownership is also allowed.

      For instance, suppose we have an object called foo which uses a particular resource. foo might have a method which an outsider can use to change which resource class foo uses. Because ownership is defined by an object, that object can be passed into the method, and ownership changes hands.
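
      A minimal sketch of that hand-off, assuming a C++11 compiler and std::unique_ptr as the owning object. Resource, Foo and set_resource are hypothetical names made up for illustration, not part of any real library:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical resource; a real one would wrap a file handle, socket, etc.
struct Resource {
    std::string name;
    explicit Resource(std::string n) : name(std::move(n)) {}
};

// Hypothetical user of the resource, standing in for "foo" above.
class Foo {
    std::unique_ptr<Resource> res_;
public:
    // Taking the unique_ptr by value forces the caller to give up ownership.
    // Any resource Foo held before is destroyed by the assignment.
    void set_resource(std::unique_ptr<Resource> r) { res_ = std::move(r); }

    bool has_resource() const { return static_cast<bool>(res_); }

    const std::string& resource_name() const {
        assert(res_ && "no resource set");
        return res_->name;
    }
};

// Returns true if ownership moved out of the local pointer and into foo.
bool transfer_demo() {
    Foo foo;
    std::unique_ptr<Resource> r(new Resource("log.txt"));
    foo.set_resource(std::move(r));   // ownership changes hands here
    return r == nullptr && foo.has_resource()
        && foo.resource_name() == "log.txt";
}
```

      After the call the caller's pointer is empty, so there is exactly one owner at any time and the resource is released when that owner goes away.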

      In Ruby, is there an analogous operation? How do you use it?

      --
      char a[]="lbiitgt l e \n\n\0";main(){for(char*c=a; *(short*)c;c+=2){putchar(*(short*)c);}}
    12. Re:Under the Rug by Anonymous Coward · · Score: 0

      Whenever people talk about the glories of garbage collection, I always wonder why they think memory is the only resource. And if I'm supposed to manually deal with those other resources, why am I not supposed to likewise manually deal with memory?

      Are you allocating hundreds of files and sockets every second?

      No?

      That's why.

    13. Re:Under the Rug by EsbenMoseHansen · · Score: 1
      Ah, here is one who has seen the REAL problem with the popular garbage collectors. This is not a problem inherent to garbage collectors: Try looking up "conservative garbage collectors". I have only seen this for C++, though.


      A more fundamental problem is that memory is only one of many resources a typical industrial program must manage. GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually, as in C. (Java has this problem, for instance.) "Finalization" simply cannot provide the necessary guarantees.

      --
      Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
    14. Re:Under the Rug by Antity-H · · Score: 1

      I am not sure what you mean here. As far as I know I can write the same thing in Java, using the constructor to allocate the resource and the destructor to close it.

      Or do you mean that in Java I can't create and close a socket without checking for errors (by catching possible exceptions) ? Do you really think this is bad ?

      I'll read the pdf tonight, but would you care to tell me how it relates to your example ?

    15. Re:Under the Rug by TheSunborn · · Score: 1

      You don't have destructors in java. And if you use finalize as a destructor, then that is your problem.

      If your program has an awt/swing gui, your only way to stop the program is calling System.exit(). And if you call System.exit() your objects are not finalized (that is, the finalize method is not called).

    16. Re:Under the Rug by cxvx · · Score: 1
      If your program has an awt/swing gui, your only way to stop the program is calling System.exit().

      That is not true, you just have to ensure that the last (J)Frame is disposed of by calling dispose() on it.

      Although there was a bug in a certain JRE version that didn't allow for GUI programs to finish correctly, even when all the frames were disposed, IIRC.

      --
      If only I could come up with a good sig ...
    17. Re:Under the Rug by DavidTurner · · Score: 2, Informative

      Java doesn't have destructors. It has finalizers, but they are called at unpredictable and occasionally awkward times (if at all). This presents a whole new challenge in terms of establishing invariants.

      Consider, for example, the case of a mutex. In C++ I can encapsulate the acquisition and release of the mutex in an object (for an example of this, see boost::mutex::scoped_lock at www.boost.org), and know for certain that I will own the mutex object for the duration of the scope of the locking object. This invariant is maintained even in the presence of exceptions, break or continue statements, early returns, or other surprises.

      The net effect is that the code is (a) robust, (b) maintainable, and (c) easy to read.

      Contrast:

      mutex::scoped_lock lock(name_list_update);
      if (!should_insert) break;
      name_list.insert("Bjarne");

      with:

      name_list_update.lock();
      if (!should_insert) {
      name_list_update.unlock();
      break;
      }
      try { name_list.insert("Bjarne"); }
      finally { name_list_update.unlock(); }
    18. Re:Under the Rug by Antity-H · · Score: 1

      For the destructor remark, the documentation of Object.finalize() is pretty clear that it can be overridden for cleanup needs. And it specifically states:
      For example, the finalize method for an object that represents an input/output connection might perform explicit I/O transactions to break the connection before the object is permanently discarded.

      From http://java.sun.com/j2se/1.4.2/docs/api/ System.exit:
      This method calls the exit method in class Runtime. This method never returns normally.

      And from the Runtime.exit documentation :
      The virtual machine's shutdown sequence consists of two phases. In the first phase all registered shutdown hooks, if any, are started in some unspecified order and allowed to run concurrently until they finish. In the second phase all uninvoked finalizers are run if finalization-on-exit has been enabled. Once this is done the virtual machine halts.
      you _can_ enable finalization on exit if you need it. Of course you have to be careful with what you do in finalizers, just as you have to be careful with the destructors.

    19. Re:Under the Rug by Spy+Hunter · · Score: 3, Insightful
      So we should ignore GC because it doesn't solve all the world's resource problems at once? Your post doesn't provide a convincing argument against the use of GC. Non-portability is a non-issue; only language writers have to worry about that, and it's already their job. The cache-thrashing issue is the only real problem you mention. Generational GC significantly reduces this problem, to the point where the small runtime performance hit (if there is one at all; malloc and free take time too you know) is balanced out by increased programmer productivity (giving more time to optimize if you so desire, or add features if that's what you value more).

      Side note: Boost Lambda and FC++ are impressive but ugly hacks with horrible syntax, lots of "gotchas" that make code not work (often related to operator precedence and order of evaluation), and compiler errors from hell. Probably not the best examples of the power of C++. (OTOH, maybe that makes them the perfect examples of the "power" of C++ ;-)

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
    20. Re:Under the Rug by arkanes · · Score: 1

      Finalizers in Java are equivalent, in C++ terms, to not doing any cleanup and relying on the OS to do it. Except it's even worse, because finalizers are not guaranteed to be called. They do not provide the functionality of destructors, which are guaranteed to be called when the object goes out of scope. Using finalizers the same way a C++ programmer uses destructors is a HUGE programming error and will result in resource leaks.

    21. Re:Under the Rug by arkanes · · Score: 2, Insightful
      That's pretty cool, but I suspect that the code gets really tangled if you've got a need to lock more than one resource - you'd need chains of closures. It's functionally (hehe!) equivalent to the C++ RAII model, but I think the C++ one is more concise and clear.

      On the other hand, it can probably deal with transaction-style locks more easily than RAII can - although I've seen a system to handle that which uses RAII objects combined with functors (instead of closures). It works almost identically to the Ruby model.

    22. Re:Under the Rug by arkanes · · Score: 1

      D (see digitalmars.com/d, or the recent Slashdot article) is the holy grail here as far as I'm concerned - it has GC but also retains automatic objects you can use for RAII. On top of that, the GC is much more amenable to bypassing, unlike Java's or C#'s.

    23. Re:Under the Rug by hding · · Score: 2, Informative

      In Lisp a pretty common way to handle this sort of thing is with a with-some-resource macro (with-open-file is a commonly used built-in). This sort of macro will typically take information for dealing with the resource (e.g. in the file example the pathname of the file, options for opening it, etc.) and the body of code to be executed with the resource; it will then expand to code that acquires the resource and then uses it and releases it (the latter two typically wrapped in unwind-protect to assure the release should there be an exit via a condition or whatnot).
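
      For comparison, here is a rough C++ sketch of the same acquire / run body / always release shape, using a local guard object where Lisp would use unwind-protect. This is not how with-open-file is implemented; with_resource and the events log are hypothetical names, and the "resource" is just a string so the order of operations can be recorded:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log so the example can record the order of operations.
static std::vector<std::string> events;

// Acquire the named resource, run the body with it, and release it
// no matter how the body exits.
template <typename Body>
void with_resource(const std::string& name, Body body) {
    assert(!name.empty());
    events.push_back("acquire " + name);
    // The guard's destructor plays the role of unwind-protect's cleanup form.
    struct Release {
        const std::string& n;
        ~Release() { events.push_back("release " + n); }
    } guard{name};
    body(name);
}

// Runs a body that throws and returns the recorded sequence of events.
std::vector<std::string> with_resource_demo() {
    events.clear();
    try {
        with_resource("foo.txt", [](const std::string&) {
            events.push_back("body");
            throw std::runtime_error("boom");
        });
    } catch (const std::runtime_error&) {
        events.push_back("caught");
    }
    return events;
}
```

      The release is recorded before the handler runs, because the guard is destroyed while the stack unwinds.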

    24. Re:Under the Rug by Antity-H · · Score: 1
      SUN's documentation seems to disagree. I am talking about the part where SUN explicitly states:
      A subclass overrides the finalize method to dispose of system resources or to perform other cleanup.
      But let's stop it here.
    25. Re:Under the Rug by Anonymous Coward · · Score: 0

      Your post is so close to zen it hurts. The preview button is your friend, although in your case, I think you should have another human being nearby to ensure that what you are trying to say makes sense outside of your head.

      Of course, this could be a plain old fashioned troll...

    26. Re:Under the Rug by arkanes · · Score: 2, Informative
      Sun is being misleading. You can do that, but it's a lousy idea. Read up on when finalization occurs (the relevant docs were posted in part elsewhere in this thread). Basically, it's run when the object is GCed, not when it goes out of scope (which makes it useless for RAII), and it's not guaranteed to be run at all. Finalization is not the same as destruction, and can't be used for the same purposes. Simple as that.

      I'll try to clear it up a little more with a better example. Say you're locking a file. In C++, you'd create an object to control the lock.

      class locker {
      public:
      locker() {lockfile();}
      ~locker() {unlockfile();}
      };
      void WriteFile() {
      locker l;
      ....
      }
      The file is locked when the function is entered, and is unlocked when the function is left, no matter how it's left - early return, exception, anything. There is no way to leave that function scope with the file locked. You can't do anything like that with Java, because it has no automatic objects and its finalization is non-deterministic.
    27. Re:Under the Rug by arkanes · · Score: 1

      It's not so much a problem with GC as that the implementation of GC in the language precludes the constructs that let you efficiently manage other resources. I'm only aware of one language that has both GC and automatic objects. Java is probably the most popular GCed language/environment, and it certainly brought GC into the mainstream, so its particular resource management flaws are often seen to be part of GC. (Functional languages that use closures to manage resources are a different issue, and I have no idea how GC interacts with that if at all).

    28. Re:Under the Rug by tcopeland · · Score: 1

      > Finalizers in Java are equivalent,
      > in C++ terms, to not doing any cleanup
      > and relying on the OS to do it.

      Yup. We've got an entire ruleset in PMD dedicated to tracking down devious finalizer bugs.

    29. Re:Under the Rug by tcopeland · · Score: 1
      > trick /. into indenting
      File.open("file.txt", "w") {|f|
      f << "Hello world!\n"
      }
      Use the ecode tag and put spaces in there.

      I agree that this is an excellent Ruby idiom... very handy.
    30. Re:Under the Rug by WayneConrad · · Score: 1

      In Ruby, is there an analogous operation? How do you use it?

      I don't remember seeing the transfer-of-ownership idiom used in Ruby. We make an object to handle a resource, and then just let everyone who needs the resource have a reference to the object. Garbage collection takes care of the rest (I suppose the garbage collector owns the resource; everyone else is just leasing it). That solves a big OO problem: Keeping track of who owns what. It gives us a different problem, which is losing control of when an object deallocates its resource, but that problem rears its head less often than the ownership problem so it seems to be a good tradeoff.

    31. Re:Under the Rug by Chris_Jefferson · · Score: 1

      If you think that Boost Lambda and FC++ have anywhere near the power of haskall, then you haven't used it yet. One of the major problems of constructing things via templates in C++ is that you have to define everything statically at compile time. As soon as you start generating any kind of complex system or want to construct functions based upon user input or computation you do at run-time, you hit a wall with these libraries.

      Now don't get me wrong, I like boost::lambda and use it a lot, but comparing it to a functional language is like comparing c++ to a turing machine. (Before anyone posts a smart reply, yes I know c++ and a turing machine have equivalent power, but I'll write a heapsort in c++, you on a turing machine, and we'll see who finishes first ^_^ )

      --
      Combination - fun iPhone puzzling
    32. Re:Under the Rug by Euphonious+Coward · · Score: 1
      I presume you meant Haskell.

      But I think you're confusing two concepts here. Compile-time vs. runtime semantics is a whole other axis for comparing languages. We use C++ in circumstances where a "method not implemented" (in Smalltalk parlance) message thrown in the customer's face at random intervals is considered evidence of incompetence. We like compile-time verification and only wish we had more of it.

      Of course it can be fun to have a compiler in your runtime system, and generate code for it on the fly.

    33. Re:Under the Rug by Gaijin42 · · Score: 2, Informative

      the using keyword in c# handles exactly this situation.

      This code :

      using(DisposableObj x = new DisposableObj())
      {
      x.DoSomething();
      }

      is equivalent to

      DisposableObj x = new DisposableObj();
      try
      {
      x.DoSomething();
      }
      finally
      {
      x.Dispose();
      }

      So as long as the object's author put any "close/release" code into the dispose method, it will get handled automatically when you are done with it.
      (Even if an exception occurs!)

      Most c# objects that have dispose() also have a close()

      For example, databases. You can close() them and then reopen them multiple times. If you call dispose() and the object is open, it closes for you.

      (So what, I still have to call dispose manually!)

      And that's the purpose of the using(){} syntax. You don't have to remember anymore.

    34. Re:Under the Rug by Gaijin42 · · Score: 1

      well, the average (hell, even small) program makes thousands and thousands of memory allocations. Every string, every bit of math etc.

      However, database or file allocations are much less frequent.(and in a well written program, very localized and isolated)

      Just because you can't solve every problem, doesn't mean you can't solve the big one.

    35. Re:Under the Rug by Dr.+Bent · · Score: 1
      And here's the Java equivalent:
      import java.io.IOException;
      import java.net.Socket;

      public class MySocket
      {
      private Socket socket;

      public MySocket() throws IOException
      {
      socket = new Socket("myhost.com", 8080);
      }

      public void closeSocket() throws IOException
      {
      socket.close();
      }
      }
      Ah, but you say....you have to manually call closeSocket() to close the socket! True, but that's no different than your implementation. The only difference is that instead of calling 'mySocket.closeSocket()', I have to call 'delete tcpSocket'. Either way, the programmer is responsible for managing the resource.

      Destructors aren't "automatic". You have to manually free the memory, and that triggers the call to the destructor. While the syntax is a little different, there really is no functional difference between the C++ and Java way.
    36. Re:Under the Rug by TheSunborn · · Score: 1

      And the Runtime.runFinalizersOnExit() which enables this says:

      Deprecated. This method is inherently unsafe. It may result in finalizers being called on live objects while other threads are concurrently manipulating those objects, resulting in erratic behavior or deadlock.

      Enable or disable finalization on exit; doing so specifies that the finalizers of all objects that have finalizers that have not yet been automatically invoked are to be run before the Java runtime exits. By default, finalization on exit is disabled.

    37. Re:Under the Rug by TheSunborn · · Score: 1

      Thanks. I did not know they fixed that bug in 1.4, but it still does not solve the problem that finalize is not running. Just see this program. It will leave the JavaTestFile with a size of 4 bytes, not containing the text "Hello world", so finalize is not running :-(

      import java.io.*;

      public class Test {
      public static void main(String args[]) {
      try {
      File f=new File("JavaTestFile.txt");
      if(f.exists()) {
      System.out.println("Testfile already exists");
      }
      else {
      ObjectOutputStream o=new ObjectOutputStream(new FileOutputStream(f));
      o.writeUTF("Hello world");
      }
      }
      catch(Exception ex) { }
      }
      }

    38. Re:Under the Rug by rmstar · · Score: 1
      Side note: Boost Lambda and FC++ are impressive but ugly hacks with horrible syntax, lots of "gotchas" that make code not work (often related to operator precedence and order of evaluation), and compiler errors from hell. Probably not the best examples of the power of C++. (OTOH, maybe that makes them the perfect examples of the "power" of C++ ;-)

      Very well said. C++ should be a member of the Turing tarpit.

    39. Re:Under the Rug by rmstar · · Score: 1
      Anybody who thinks languages like Haskell or ML are fundamentally more powerful than C++ must be unaware of the Boost Lambda library, and of FC++, a set of header files that implements Haskell language semantics for C++ programs. They get along fine without GC, as well

      You have misunderstood the whole thing completely. A GC lets you program more freely. So yes, you sweep stuff under the carpet, which stays there and gets rid of itself, and unless you do really stupid things, it does so at almost no performance cost. What makes almost any other language more powerful than C++ is simply that C++ has a rotten and sick notation.

      But why am I telling you this? If you were interested in a sane notation, and in the ability of programming more freely, you would certainly not be programming in C++.

    40. Re:Under the Rug by Brandybuck · · Score: 1

      The frequency of the memory allocation is irrelevant, because it's the same code that's doing it. The frequency of the allocation does not increase the complexity of the code. For example, a linked list with thousands of nodes is no more complex than a linked list with only two nodes.

      --
      Don't blame me, I didn't vote for either of them!
    41. Re:Under the Rug by blamanj · · Score: 1

      A previous poster noted that most GC algorithms are distinctly unfriendly to virtual memory systems.

      Actually generational collectors are fairly friendly w/r/t virtual memory.

      GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually.

      Bogus argument. That's like saying steel-belted tires are more puncture resistant but they don't solve the problem of oil leakage so they're worthless.

      Management is encapsulated the same way for all [resources in C==].

      You must have a newer version.

      Anybody who thinks languages like Haskell or ML are fundamentally more powerful than C++...

      The word powerful is not typically a useful term in comparing programming languages. Any language that is Turing-complete can be exchanged with any other Turing-complete language. There are great differences, however, in expressivity, resource needs, readability, and appropriateness to a specific task between languages. Consider SQL, Perl, APL, and assembler. C++ may be your favorite hammer, but sometimes you need a drill press.

    42. Re:Under the Rug by Bloater · · Score: 1

      If an object is not allocated by new, but rather has automatic storage class, or is a member, then everything happens automatically.

      Imagine that you have a class that represents an abstract datatype, but it has no methods to manipulate the data. It can have a method that returns an object that has the member functions necessary to access it, as well as taking an appropriate lock. If you overload operator new to prevent creating one of these on the heap the lock is completely managed, and you are guaranteed efficient, safe access.
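
      A rough sketch of that accessor idea, assuming C++11 and std::mutex. NameList, Accessor and access() are hypothetical names chosen for illustration; this is one possible reading of the scheme, not the poster's actual code:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// The data type itself exposes no way to touch the data directly.
class NameList {
    std::mutex m_;
    std::vector<std::string> names_;
public:
    // All access goes through an Accessor, which holds the lock for
    // exactly as long as it exists.
    class Accessor {
        std::unique_lock<std::mutex> lock_;
        std::vector<std::string>& names_;
        friend class NameList;
        Accessor(std::mutex& m, std::vector<std::string>& n)
            : lock_(m), names_(n) {}
    public:
        Accessor(Accessor&&) = default;
        // Forbid heap allocation so an Accessor cannot outlive its scope via new.
        static void* operator new(std::size_t) = delete;

        void insert(const std::string& s) {
            assert(lock_.owns_lock());
            names_.push_back(s);
        }
        std::size_t size() const { return names_.size(); }
    };

    Accessor access() { return Accessor(m_, names_); }
};

// Returns the number of names seen through a second accessor.
std::size_t accessor_demo() {
    NameList list;
    {
        NameList::Accessor a = list.access();  // lock taken here
        a.insert("Bjarne");
    }                                          // lock released here, automatically
    NameList::Accessor b = list.access();      // only works because the lock was released
    return b.size();
}
```

      Because the accessor has automatic storage, there is no unlock call for the user to forget: leaving the block by return, break or exception releases the lock the same way.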

    43. Re:Under the Rug by Gaijin42 · · Score: 1

      Not quite what I meant.

      More like : I implement 10 linked lists/hashes/whatever, 50 arrays, and 10000 strings per program.

      (regardless of the length of string/number of elements etc)

      Therefore I have to get memory allocation right several thousand times.

      On the other hand, I only open up a handful of database connections, and usually only 2 files. Therefore it is much easier to isolate those and make sure they are working right, vs. checking for thousands of pure memory leaks.

    44. Re:Under the Rug by Dr.+Bent · · Score: 1


      Yes, but in that case it's still the programmer and not the platform that's managing the resources. Building a nifty little lock management facade around the problem doesn't make it go away. You can, in fact, do the same thing in Java using an object pool and an abstract factory.

    45. Re:Under the Rug by tyrecius · · Score: 1

      I agree that there are many flaws with the syntax of C++ (especially variable declaration syntax).

      What in particular do you abhor? What would you replace it with? Do you favor LISP/Scheme syntax? Do you favor ML syntax? Do you favor C#/Java syntax?

      --
      char a[]="lbiitgt l e \n\n\0";main(){for(char*c=a; *(short*)c;c+=2){putchar(*(short*)c);}}
    46. Re:Under the Rug by Brandybuck · · Score: 1

      Except we were talking about OO Languages, and not OO Programming. So I was assuming C++. In C++ you don't have to manually allocate memory for strings. And you don't have to allocate memory for arrays either. In fact, with the STL you don't even have to allocate anything for the linked lists either. Of course you could, if you wanted to, but you don't have to.

      Because this is C++, I am using "new" and "delete". On the surface these seem to be about memory allocation, but they are really abstractions for generic resource allocations. I have to match my news with deletes, but I have to do that for any resource.

      Consider a GUI. I allocate memory for a widget with the call "new Widget". A GC will manage the deallocation for me, but I *still* need the destructor because that widget is more than mere memory. I need to properly destroy it because it has logical parents and children. I may have handles to the GUI library I need to release. Etc, etc. So if I have to write a destructor anyway, what's so terribly evil about deallocating the memory there?

      --
      Don't blame me, I didn't vote for either of them!
    47. Re:Under the Rug by Gaijin42 · · Score: 1

      Well, I am primarily a web developer, which changes things for me somewhat (The lifetime of a program is only the lifetime of a page-view, so even if things stick around until the end, its not that long) Additionally, I am not instantiating brushes or anything like that.

      However, even in the winforms world, in c# (and Java) you don't handle getting rid of brushes etc.

      You can "early" dispose them using the dispose() methods we talked about before, but the GC is also smart. It takes into account relative costs of external resources (for GUI only, not for files etc) and will call GC pre-emptively when that cost gets too high.

      And even if it is just a non-resource based class, I do new() on a class a few hundred times per program. Trying to find a delete() for each of those is hard (especially if it is not located near the new(), or if you have multiple execution paths, etc.).

      There is a reason people came up with GC. And that reason is that leaks are very very common :)

    48. Re:Under the Rug by Tom7 · · Score: 1

      Anybody who thinks languages like Haskell or ML are fundamentally more powerful than C++ must be unaware of the Boost Lambda library, and of FC++, a set of header files that implements Haskell language semantics for C++ programs. They get along fine without GC, as well.

      If you like your memory management to be by reference counting, then they "get along fine without GC." But reference counting is an inefficient way to do memory management.
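
      To make the bookkeeping concrete, here is a small sketch using std::shared_ptr as a stand-in for a reference-counted handle. That is an assumption for illustration only; it says nothing about how Boost Lambda or FC++ count references internally:

```cpp
#include <cassert>
#include <memory>

// Returns the owner count observed while a second owner exists.
long refcount_demo() {
    std::shared_ptr<int> a = std::make_shared<int>(42);
    assert(a.use_count() == 1);          // one owner so far
    long during = 0;
    {
        std::shared_ptr<int> b = a;      // copying bumps the shared counter
        during = a.use_count();
    }                                    // b is destroyed: the counter drops again
    assert(a.use_count() == 1);
    return during;
}
```

      Every copy and every destruction of a handle touches the counter (typically with an atomic update), which is the per-operation cost that a tracing collector does not pay.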

      Also, I actually believe that copying garbage collection can be good for cache locality, since it can compact the heap to put long lived objects near each other. This is not easy in C++. Of course, this argument needs to be levelled against each GC implementation individually.

      You are absolutely right about the inability for GC to work well with other kinds of resources, however. On the other hand, most languages let you manually manage non-memory resources, so I don't see how the situation is better in C++ except that programmers are already used to doing it with memory.

    49. Re:Under the Rug by Bloater · · Score: 2, Interesting

      Building a nifty little lock management facade *does* make the problem go away. That is precisely the point. The user of that interface cannot even *compile* a program that is not safe with respect to it (Not to say that C++ is any good in other respects, of course).

      The only real ways to do this in Java are twofold:

      1) Use synchronize, and only return the accessor object if the appropriate monitor is held. Then the accessor object throws an exception for each method if you use it without holding that monitor. But there is no reasonable way to implement a cache within that accessor object. There may be a way with several waiting threads and using notify in a complex way that will probably deadlock - but that is hardly the goal of using an object oriented design.

      The problem is that leaving the synchronize block should cause all cached data in the accessor to be flushed to the *real* object - this is often necessary to guarantee performance when the main memory bus is significantly slower than the on-die cache. Furthermore, using a reference to the accessor outside of the synchronize block in which it was created should not only be prohibited - it should be inhibited.

      2) Continuation passing style. But to get that inline one must define a subclass inline, which means that any access to local variables requires that they be final. That is not always desirable, and frequently undesirable. This *could* be alleviated in Java with some simple changes (such as allowing reference to non-final variables if it can be guaranteed that there will remain no references to the object after return), but it is still quite ugly conceptually.

      I also don't believe that the alternative of folding the cache into the real object lazily when another accessor is created is workable without an excessive level of synchronization, possibly causing immense slowdowns on NUMA architectures.

      So while, in C++, the programmer of the interface manages the resources, the interface they produce causes the user of the interface not to have to worry about it. Without that, the user of the interface must be very cautious of their use, and consider far more exceptions. And probably have to take bug reports from end-users that would *not* happen with the C++ method (as long as they don't use "&", see below).

      This post is not intended to mark GC as bad, as there *may* be a solution. An object can be explicitly deleted at a given moment by waiting for notification performed in the finalize method - if the runtime guarantees to GC the object early when waiting for that notification. That is, however, difficult to make both safe and strict in Java, since the wait may be a long one, and it can be hard to statically prove that there are no other references, or to inhibit their creation. C++ also cannot utterly inhibit the creation of references to the accessor object that exists beyond its lifetime, but that is a problem with allowing pointers.

      An unfortunate choice in C++ is that the register keyword is nothing more than an optimisation hint, while in C it inhibits the address-of ("&") operator.
      An unfortunate choice in Java is that references are just C++ pointers with no arithmetic operations - only the "dereference and resolve member" operator "." ("->" from C++) that can throw an exception, "==", and crucially, assignment.

    50. Re:Under the Rug by Tom7 · · Score: 1

      Using this technique for memory management comes from "regions," which are a kind of statically checked manual memory management system. Regions are good for some kinds of programs (and are easy to check), but often the lifetime of an object is not determined statically. If this is your only resource management strategy, it will be too restrictive to write many programs.

      Any language with higher order functions (which you are calling "closures"--an implementation strategy for higher order functions) can implement this. It's an idiom I sometimes use in ML and elisp has had it for a long time (ie, save-excursion); no doubt it's older than me, even.

    51. Re:Under the Rug by rmstar · · Score: 1
      What in particular do you abhor?

      All sorts of things. The 7 different ways (or so) of calling a function, the whacky precedence rules, the syntax for templates, etc.

      It is also far too easy to write some crap which executes but is useless. Consider

      double a=5.0,b=3.0;

      a=0,5*b;

      I got sick of looking for this kind of idiotic mistake, which is only made possible by a needlessly terse syntax. The convoluted syntax from hell conspires against the programmer in other ways, for instance by producing error messages that make no sense even to experts.
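
      Spelled out, assuming the intent was to write 0.5*b: the comma is parsed as the comma operator, so the statement is an assignment followed by a discarded multiplication. The function names below are made up for the illustration:

```cpp
#include <cassert>

// The typo: a decimal comma where a decimal point was meant.
double comma_typo() {
    double a = 5.0, b = 3.0;
    a = 0,5 * b;      // parsed as (a = 0), (5 * b): a becomes 0, the product is discarded
    return a;
}

// What was presumably intended.
double intended() {
    double b = 3.0;
    return 0.5 * b;
}
```

      comma_typo() returns 0 while intended() returns 1.5. Many compilers can flag the discarded right-hand operand when warnings are enabled, but the code is legal and compiles.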

      What would you replace it with?

      With a large gaping hole of oblivion. This madness should have never happened. I know people who have lost themselves and ruined their careers in this convoluted monstrosity called C++. Fairly experienced programmers who did actually believe and follow all the hype and lost badly.

      Do you favor LISP/Scheme syntax?

      I do, yes. Keep in mind that modern Common Lisp is pretty much an imperative multipurpose OO language, while (apart from C# and Java, which just look like C++; I know them not too well) the other options you mention are rather "functional" in orientation. There are other fairly sane syntaxes; I'm thinking of Delphi here, or Python, if people get dizzy with the parentheses (although this effect usually only lasts a week, if you use a decent editor).

    52. Re:Under the Rug by hak1du · · Score: 3, Insightful

      this article sweeps under the rug most of the reasons why languages dependent on garbage collection have always failed to find much deployment in industrial settings.

      For the same reason people in industry have kept programming in Cobol and Fortran, and for the same reason they keep producing software with all sorts of problems, bugs, and limitations.

      A previous poster noted that most GC algorithms are distinctly unfriendly to virtual memory systems. They usually have similar problems with cache locality, which can result in an enormous slowdown, regardless of the time actually spent in the GC itself

      Not true at all. Generational collectors generally achieve far better locality than malloc-style allocators.

      A more fundamental problem is that memory is only one of many resources a typical industrial program must manage. GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually, as in C.

      Fundamentally, the point behind GC is not to make your life easier, it is to make it possible for the language to be safe. Without GC, a language with heap allocated mutable data structures just cannot be safe. GC generally cannot reliably manage any other resources besides memory and it is not meant to.

      Given a resource management regime that can handle all these other important resources, as is commonly practiced in C++, memory becomes just another resource.

      But memory isn't just any resource, memory is a resource that can contain machine pointers to other memory (as well as references to other resources).

      The problem of resource management for memory is that of arbitrary directed graph reachability. And that is exactly the problem that a garbage collector solves, as efficiently as possible.

      A language that lacks the tools necessary to implement such a regime needs GC, so the presence of GC may actually (as in the case of Java) indicate a fundamental weakness in the language.

      C++ solves a common but limited subset of the resource management problem and then just declares victory. And even that false victory is not very satisfying because in order to achieve it, C++ has sacrificed runtime safety. (In fact, with that choice, it has also sacrificed efficiency, but you aren't going to believe that no matter what I say.)

    53. Re:Under the Rug by Euphonious+Coward · · Score: 1
      You have misunderstood the whole thing completely.

      How insightful to know precisely what I do and do not understand.

      Of course sloppiness in memory management (as practiced in all invisible GC systems) is far easier to tolerate than mismanagement of other resources. Even so, users complain legitimately about enormous memory footprints and code that spends most of its time idle, waiting on main memory. It may be that the majority of programs have no need to manage anything but memory, and for those, GC is the ideal solution, assuming (as is common) that performance doesn't matter.

      Some of us would like nice notation, but need practicality, and we don't get it from the languages that offer the nice notation. The fault lies with those who offer us languages with nice notation but without practical usefulness. (More fault lies with those who offer us neither, such as is embodied in Java.) Is practicality inherently contrary to pleasant notation? I say not.

      I will be the first to abandon C++ when a language that really is practically better comes along, because I have work to do. Pretenses don't help anybody but academics, and not them in the long run.

    54. Re:Under the Rug by Euphonious+Coward · · Score: 1
      Reference counting can be an inefficient way to do memory management if, in fact, it is the only way one does memory management. In practice, in C++ programs, only very few of the objects are reference-counted, so that argument is irrelevant, in context.

      ... most languages let you manually manage non-memory resources

      The goal is to enable encapsulation of resource management. In C++ one normally encapsulates management of all resources the same way. In most GC languages, as in C, you are left with no choice but to manage them manually, because you are offered no other alternative. That's the problem. GC is often an attempt to hide the problem "under the rug". In academia, where programs often manage no resources other than memory, the problem may easily remain hidden. Languages that are successful in academia have a way of failing to escape it, for sound practical (but avoidable!) reasons.

      We sorely need a replacement for C++, but academia has thus far failed to provide one.

    55. Re:Under the Rug by sartin · · Score: 1
      Destructors aren't "automatic". You have to manually free the memory,

      Funny you should use that word. Creating the TCPSocket as an "automatic" variable does, in fact, result in it being destroyed automatically when it goes out of scope.

      {
      TCPSocket socket(args);
      //...
      }

      Similarly, if you make it an instance variable of a class, it will be automatically destroyed (and therefore closed) when the instance containing it is destroyed. And if you feel you must use something pointer-like:

      auto_ptr<TCPSocket> sock(new TCPSocket(args));

      This gives transferrable ownership and the object pointed to will automatically be destroyed when the auto_ptr is automatically destroyed (at block exit or object destruction depending on context).

      C++ is filled with options that let the "Resource Acquisition Is Initialisation" pattern be used, with automatic resource release at automatic destruction.
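
      As a rough sketch of that pattern, assuming a made-up ScopedHandle class that only counts "open" handles instead of touching a real socket, the constructor acquires and the destructor releases, whether the block is left normally or by an exception. None of these names come from the TCPSocket example above.

```cpp
#include <cassert>

// Hypothetical stand-in for sockets or files: just counts open handles.
static int open_handles = 0;

class ScopedHandle {
public:
    ScopedHandle()  { ++open_handles; }   // acquire in the constructor
    ~ScopedHandle() { --open_handles; }   // release in the destructor
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;
};

// Reports how many handles are open while the automatic variable is alive.
int handles_open_inside_scope() {
    ScopedHandle h;
    return open_handles;   // h is destroyed after the return value is computed
}

// Leaving the block by exception still runs the destructor.
int handles_after_exception() {
    try {
        ScopedHandle h;
        throw 42;
    } catch (int) {
        // h was already destroyed during stack unwinding
    }
    return open_handles;
}
```

      After either function returns, open_handles is back to zero without any explicit close call in the client code.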

    56. Re:Under the Rug by Spy+Hunter · · Score: 1

      GC doesn't preclude those constructs. It makes them unnecessary for memory management, and as a result they might be overlooked by bad language designers, but they are not precluded by any means. Did you mean to use a different word?

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
    57. Re:Under the Rug by Pseudonym · · Score: 1

      One more thing.

      A more fundamental problem is that memory is only one of many resources a typical industrial program must manage. GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually, as in C. (Java has this problem, for instance.) "Finalization" simply cannot provide the necessary guarantees.

      Agreed, with a few caveats. Note that I'm not a Java programmer, and I don't particularly like Java, but nevertheless:

      1. Java does not require you to manage mutexes manually. The synchronized keyword takes care of it.
      2. Java also provides finally, which does provide guarantees that finalization doesn't. It's not pretty, but it works for maybe 90% of your resource cleanup jobs.
      3. I think that the manual resource cleanup is not a fault of GC as such. The problem here, I think, is that the paradigms of "imperative programming" and "system manages stuff for you" are inherently incompatible.

      Further on the last point, I submit as evidence the classic example of a stack implemented as a Vector in Java. When you pop an element, you must null out the entry otherwise the element will remain referenced and hence not be GC'd. So even when you have garbage collection, if you're programming imperatively, you still need to do some memory management manually! Even with C++ and smart pointers, if you're implementing something like this (which, thanks to the STL, you probably don't have to) you still need to null out a smart pointer or destruct an object manually. (Note: C++ never claimed to be a garbage collected language, so this isn't false advertising on C++'s part.)
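
      The C++ half of that point can be sketched roughly as follows. SlotStack is a hypothetical array-backed stack of smart pointers (C++11 std::shared_ptr is assumed here, not anything from the article); a pop that forgets to clear the slot keeps the element alive, just like the un-nulled Vector entry in Java.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical array-backed stack: the slots outlive the logical contents.
// No bounds checking, to keep the sketch short.
class SlotStack {
public:
    explicit SlotStack(std::size_t capacity) : slots_(capacity), size_(0) {}

    void push(std::shared_ptr<int> p) { slots_[size_++] = p; }

    // Leaky pop: returns the element but leaves a reference in the slot.
    std::shared_ptr<int> pop_leaky() { return slots_[--size_]; }

    // Careful pop: clears the slot so the stack no longer holds the element.
    std::shared_ptr<int> pop_clean() {
        std::shared_ptr<int> top = slots_[--size_];
        slots_[size_].reset();
        return top;
    }

private:
    std::vector<std::shared_ptr<int>> slots_;
    std::size_t size_;
};

// Pushes one element, pops it, drops the caller's references, and reports
// whether the element is still alive (held only by the stale slot).
bool survives_pop(bool leaky) {
    SlotStack s(4);
    std::weak_ptr<int> watch;
    {
        std::shared_ptr<int> p = std::make_shared<int>(7);
        watch = p;
        s.push(p);
    }
    {
        std::shared_ptr<int> popped = leaky ? s.pop_leaky() : s.pop_clean();
    }
    return !watch.expired();
}
```

      With the leaky pop the element survives until the stack itself is destroyed; with the careful pop it is released as soon as the caller lets go of it.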

      OTOH, you almost never see declarative programmers (ML, Haskell, Prolog, Scheme, whatever) having to do this. That's why I think it's the mix of GC and imperative programming which causes these odd issues rather than just GC.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    58. Re:Under the Rug by Markus+Registrada · · Score: 1
      Lisp's with-open-file and the like are a good adaptation to the lack of more complete facilities. That you mention them, though, emphasizes that lack. Although they allow you to bound the use of a resource lexically, stackwise, they don't allow you to encapsulate lifetime other than lexically. In particular, they don't help you to (e.g.) open the file as part of a library function that returns an object, and arrange to have the file closed automatically when the object, some time later, is no longer needed, even if that object's own lifetime is lexically bound.
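
      For comparison, here is a rough C++ sketch of the case being described: a library function opens a "file", returns an object that depends on it, and the file is closed when that object dies, with no cleanup code in the client. OwnedFile, Report and make_report are hypothetical names, the file is simulated by a counter, and C++11 std::unique_ptr is assumed.

```cpp
#include <cassert>
#include <memory>

static int open_files = 0;   // hypothetical stand-in for OS file descriptors

class OwnedFile {
public:
    OwnedFile()  { ++open_files; }   // "open"
    ~OwnedFile() { --open_files; }   // "close"
    OwnedFile(const OwnedFile&) = delete;
    OwnedFile& operator=(const OwnedFile&) = delete;
};

// Hypothetical library type that holds a file as an implementation detail.
class Report {
public:
    Report() : file_(new OwnedFile) {}
private:
    std::unique_ptr<OwnedFile> file_;
};

// Library function: the file is opened here and outlives this call.
std::unique_ptr<Report> make_report() {
    return std::unique_ptr<Report>(new Report);
}

// Client code: never mentions the file, yet it closes when the Report dies.
int files_open_while_report_alive() {
    std::unique_ptr<Report> r = make_report();
    return open_files;
}
```

      The file's lifetime follows the Report wherever ownership is passed, rather than being tied to the lexical block that opened it.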

      This inability to encapsulate resource management is fundamental in Lisp, so no amount of tricksiness helps much. The ultimate result is that when a Lisp program manages resources, it manages them manually, pretty much like C.

    59. Re:Under the Rug by Markus+Registrada · · Score: 1
      you almost never see declarative programmers (ML, Haskell, Prolog, Scheme, whatever) having to do this. That's why I think it's the mix of GC and imperative programming which causes these odd issues rather than just GC.

      In fact, most programs in those languages never need to manage anything but memory. Academia is like that. When they do have to manage other resources, they do it manually, pretty much like C, and Java. The problem is not, at base, GC. C has no GC, and C programmers suffer the same resource management problems as functional programmers do -- with the same leaks, and the same ill-localized handler code. GC is an attempt (successful, for purely academic uses) to paper over the hole.

      Lisp's "with-this-and-that" and C#'s "using" construct also just paper over the hole -- they simply cannot help encapsulate management of resources that must live after the function that created them returns, so cannot be used to help write a library that yields constructs that depend on a limited resource, and manages it without further manual help from the client. Java's "finally" does no encapsulation either, it's just another notation for manual management.

      While GC isn't, at base, the problem, it does interfere with (somebody said "preclude") the constructs that are necessary to solve the problem. Tying resource lifetime to object lifetime is the only effective means I know of to enable a solution, the "resource acquisition-is-initialization" discipline. Splitting out memory management means that there is no place to bind the other resource management apparatus.

      The problem really is fundamental. No amount of writhing or wiggling can fill in the hole. The only question is, are you honest enough to look at the problem without blinking? In my experience, among functional programmers, Lisp people are the least willing to face it honestly, and the OCaml people are the most willing.

    60. Re:Under the Rug by Anonymous Coward · · Score: 0

      Most C++ compilers will issue a diagnostic for the example you gave, and most decent editors make the mistake obvious anyways. I'm not sure if you are actually trying to convince anybody that they shouldn't be using C++, but you come across as little more than a child throwing a temper tantrum.

    61. Re:Under the Rug by Pseudonym · · Score: 2, Informative
      In fact, most programs in those languages never need to manage anything but memory. Academia is like that.

      Ah, yes. Any language which isn't procedural is "academia only".

      There are many counter-examples, some of which I can't really talk about. Probably the most famous is Erlang, a functional language used to implement highly scalable, fault-tolerant telephone exchanges. Declarative symbolic languages like O'Caml are also widely used in bioinformatics, but somebody will probably dismiss that as "academia", too. (Admittedly, memory is usually the most critical resource that bioinformatics software has to manage.)

      Lisp's "with-this-and-that" and C#'s "using" construct also just paper over the hole [...]

      Actually, they solve a fairly large proportion of resource management problems. Many resource acquisitions really are scoped. I work writing highly scalable database servers for a living (in C++, incidentally) and I can report that for our domain at least, this is true most of the time.

      "Resource acquisition is initialisation" is an extremely good discipline for object-oriented-esque imperative languages, such as C++. Still, you have to remember that in C++, object lifetimes are still scoped! Even those which are stored on the heap are still, when you get down to it, managed by scoped objects. Those which aren't are, when you get down to it, managed manually. In other words, "resource acquisition is initialisation" doesn't free you from manual resource management, it just pushes it to a higher level. Another way to think about this is that it effectively ties related resources together, letting you manage them as an abstracted bundle.

      Still, most of the time, outside of "academia", the abstractions still leak. In our database server, for example, you still have to be aware of what locks are held when to avoid deadlock, no matter how much you abstract it. So even if you don't have to write code to manage resources manually, you still have to go through most of the thought processes that you would if you did manage them manually.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    62. Re:Under the Rug by rmstar · · Score: 1
      Most C++ compilers will issue a diagnostic for the example you gave, and most decent editors make the mistake obvious anyways.

      It was a simple example, meant to illustrate a point. One can construct arbitrarily subtle and convoluted examples. If you have programmed C++ long enough, you know that they come up just by themselves.

      I'm not sure if you are actually trying to convince anybody that they shouldn't be using C++, but you come across as little more than a child throwing a temper tantrum.

      If you don't like the message, attack the messenger. Anonymous Coward indeed.

    63. Re:Under the Rug by Markus+Registrada · · Score: 1
      Any language which isn't procedural is "academia only".

      Hardly. Still, academia's criteria for language merit certainly do discount the ability (and inability) to abstract and encapsulate resource management, as we have seen demonstrated repeatedly right here. A non-procedural language that also provides the tools to abstract out resource management is certainly possible. The problem is just that there aren't any yet. (Mercury might be an exception.) When a better language that can solve the problems that C++ does comes along, we will all embrace it eagerly. Unfortunately, Lisp ain't it, and neither is O'Caml, nor Erlang.

      Actually, ["with-this-and-that" etc.] solve a fairly large proportion of resource management problems.

      Of course, in C++ as well. However, let us not be distracted from the point: they remain utterly insufficient to encapsulate management of library implementation resources. We can invent dozens of other tricks that also don't solve the problem, but to mention them would only amount to more wriggling.

      Most abstractions don't leak. Some people like to say all do, but that is because they only notice the ones that do, and (in Joel's case) because they spend much of their time working around buggy vendor libraries. (Bugs leak.) When was the last time the quantum behavior of transistors leaked into your code?

    64. Re:Under the Rug by arkanes · · Score: 1

      I should rather say that the implementation of GC in the most popular languages that use it precludes them. D (digitalmars.com/d) has automatic objects and GC and if it had the backing of Sun or Microsoft I think it'd be tremendously popular.

    65. Re:Under the Rug by DavidTurner · · Score: 1
      C++ solves a common but limited subset of the resource management problem and then just declares victory. And even that false victory is not very satisfying because in order to achieve it, C++ has sacrificed runtime safety. (In fact, with that choice, it has also sacrificed efficiency, but you aren't going to believe that no matter what I say.)

      I agree that there are some resource management issues that are more difficult to deal with in C++ than in a garbage collected language (i.e. those for which shared_ptr and weak_ptr are not sufficient). However, these are very, very few and far between. Also, they tend to arise in situations where careful thought about the design is required in any case. I'd rather explicitly think about lifetime and ownership issues in these cases than just sweep the problems under the rug that is garbage collection.

      C++ is just as efficient and easy to use in the vast majority of memory management cases as Java or C#. What's more, it trumps those languages when it comes to managing other types of resources.

      For those who are about to point out C#'s using statement: yes, this is not a bad idea, but you still have to remember to use it, and it's still therefore error-prone. In C++, we can make our lock() and unlock() methods accessible only to the RAII proxy, thus ensuring that there is exactly one unlock() for every lock(). Try enforcing usage contracts like that in Java/C#.
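
      A minimal sketch of that arrangement, assuming a toy Lock that only counts (a real one would wrap an OS mutex) and a hypothetical Guard proxy declared as its friend:

```cpp
#include <cassert>

// Hypothetical lock that only counts how deeply it is held.
class Lock {
public:
    Lock() : depth_(0) {}
    int depth() const { return depth_; }
private:
    friend class Guard;            // only Guard may call lock()/unlock()
    void lock()   { ++depth_; }
    void unlock() { --depth_; }
    int depth_;
};

// RAII proxy: locks in the constructor, unlocks in the destructor.
class Guard {
public:
    explicit Guard(Lock& l) : lock_(l) { lock_.lock(); }
    ~Guard() { lock_.unlock(); }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
private:
    Lock& lock_;
};

// Client code can only take the lock by creating a Guard.
int depth_inside_guard(Lock& l) {
    Guard g(l);
    // l.lock();   // would not compile: Lock::lock() is private
    return l.depth();
}
```

      Each Guard that is destroyed pairs one unlock() with one lock(). The sketch does not by itself stop a client from writing new Guard(l) and never deleting it.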

      Which brings me to my next point: properly written C++ programs are much safer than Java/C# programs. One reason for this is that the C++ compiler is capable of forcing usage patterns in the manner described above. Another reason is that the judicious use of generic programming and encapsulated types significantly reduces the scope for the sins of omission that are all too common amongst programmers.

      Yes, it's still far easier to produce a core dump in C++. On the other hand, it's also far easier to produce a provably correct C++ program of any complexity than to produce the equivalent Java/C# program.

    66. Re:Under the Rug by Anonymous Coward · · Score: 0

      Ah, but you say....you have to manually call closeSocket() to close the socket! True, but that's no different than your implementation. The only difference is that instead of calling 'mySocket.closeSocket()', I have to call 'free tcpSocket'. Either way, the programmer is responsible for managing the resource.

      If you're going to compare two languages, you might want to know both of them before you shoot your mouth off. Your C++ ignorance is showing.

      a) The function "free" doesn't call destructors. It is in the C library and therefore doesn't know WTF a destructor is.

      b) Since "free" is a function, parentheses are required to call it.

      c) C++ does not require you to dynamically allocate all of your objects.

    67. Re:Under the Rug by Pseudonym · · Score: 1
      Most abstractions don't leak.

      You're right. Let me expand on what I meant, because I don't think it came across well.

      The "with-this-and-that" approach can be made to encapsulate by defining it in such a way that it will work with abstractions. Haskell's "using" function does the job quite nicely, for example. An abstraction (in OO, it would be an object; in FP, it would most likely be a higher-order function) encapsulates the bundle of resources which you need for the operation.

      I said that "with-this-and-that" solves a large proportion of resource management problems. That's true. The 5-10% that they don't solve are inherently hard to solve to begin with.

      Let's try an analogy on for size.

      When you talk about multi-threaded programming with certain people, they will say, "multi-threaded programming has problems, so use a single-threaded event loop or multiple processes". They're almost right. Avoiding threads does solve 90-95% of synchronisation problems. If your problem is inherently sequential or each task has a quick turn-around time, then an event loop (or even better, coroutines) will avoid problems. If your problem is inherently parallel and SMP scalability is desired, multiple processes will avoid problems.

      What they fail to point out is that these are the easy cases. If your problem is somewhere in between there, then you have a hard concurrency problem which only multithreading can solve. They see complex solutions with leaky abstractions and assume threads are broken. No, threads are not broken. The problem was difficult to begin with, and so the solution is tricky.

      Similarly, with resource management, you do see leaky abstractions sometimes not because they're poorly designed, but because the problem is so difficult that every other candidate solution is worse. Once again, I work with highly scalable database servers, so this may be my bias coming into play. We deal with horrible concurrency problems (though not as horrible as some hard real-time programmers do) and our solutions are correspondingly complex.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    68. Re:Under the Rug by Pseudonym · · Score: 1
      Most abstractions don't leak.

      I re-read what I said, and I did indeed say something that I didn't mean to say. You're right; my apologies.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    69. Re:Under the Rug by hak1du · · Score: 1

      C++ is just as efficient and easy to use in the vast majority of memory management cases as Java or C#.

      But it isn't about ease of use. It's about safety. C++ isn't safe and it cannot be made safe without garbage collection.

      Which brings me to my next point: properly written C++ programs are much safer than Java/C# programs

      Safety is not a property of programs, it's a property of languages and runtimes. A safe language ensures that no piece of memory can be accessed or modified in a way that violates the type system, and it catches all such attempts. Any memory management error anywhere in a C++ program may result in the unexpected alteration of data in any module anywhere else in the C++ program.

      Which brings me to my next point: properly written C++ programs are much safer than Java/C# programs. One reason for this is that the C++ compiler is capable of forcing usage patterns in the manner described above.

      You may claim that proper use of C++ abstractions lets you avoid many bugs. That may or may not be the case, but most C++ programs contain large amounts of code over which you have no control. I don't want my bank statement to be off by $1000 because someone didn't abstract properly in the printer driver.

      And, in fact, most C++ abstractions are merely advisory and offer no guarantees. Just because you tie methods to an object that you intend to be stack allocated doesn't mean that I won't allocate it on the heap and fail to release the lock because of that.

    70. Re:Under the Rug by oliveira · · Score: 1

      Please explain to me exactly how you can implement a resource management system (or regime as you call it) in C++ for, let's say, managing socket connections, that has no equivalent in Java. You are aware of this method, right? (close() method)

      And your GC with automatic resource management goes out of the window right there! The point is GC is suitable for memory management only. But there's a lot more resources other than memory that requires managing. Well designed systems have to have some support for that too, otherwise you end up doing it with your bare hands (as in Java).

    71. Re:Under the Rug by oliveira · · Score: 1

      You can "early" dispose them using the dispose() methods we talked about before, but the GC is also smart. It takes into account relative costs of external resources (for GUI only, not for files etc) and will call GC pre-emptively when that cost gets too high.

      I'm enjoying this! GC is not (yet) smart enough to handle any resources but memory. You should get to know how the management of brushes, pens, and other such objects is implemented in .NET. There are separate small resource managers for that.

      And even if it is just a non-resource based class, I do new() on a class a few hundred times per program. Trying to find a delete() for each of those (especially if it is not located near the new() or if you have multiple execution paths etc.)

      You're obviously not familiar with the Boost smart_ptr library or even the RAII idiom. Manual new/delete is not an issue in C++.

      It amuses me that people not familiar with basic C++ ideology (like Web developers for example) tend to speculate about C++ deficiencies.

    72. Re:Under the Rug by voodoo1man · · Score: 1
      GC takes over memory management, but leaves the other scarce resources -- file descriptors, sockets, mutexes, database connections -- to be managed manually, as in C.
      What? Do you mean to tell me that all this time my Lisp implementation has been lying to me about collecting file descriptors, sockets and processes? But how can that be? My web server would have run out months ago!

      Oh yes, now I get it! You're an ignoramus troll who doesn't know what he is talking about.

      Believe it or not, file descriptors, sockets, and pretty much any other objects, are, well, objects, and so they have to reside in memory, where the garbage collector eventually comes to take them away. When it does so, it can tell apart a file descriptor from a pile of dung (please sir, will you spare me a few tag bits?), and call the appropriate routine to free the resource! Yes, the miracles of modern computers, indeed. Now please go away, before you embarrass yourself much further.

      With the amount of responses this guy had, I am amazed that no one noticed this. Perhaps we're all a little too busy arguing about GCed languages instead of using them?

      --

      In the great CONS chain of life, you can either be the CAR or be in the CDR.

    73. Re:Under the Rug by DavidTurner · · Score: 1
      C++ isn't safe and it cannot be made safe without garbage collection.

      I'd be very hesitant to make such a bold assertion. Define "isn't safe"! As I said, C++ programs can be made a lot safer than Java programs. They also can be made a lot less safe. It depends on who's writing them.

      Safety is not a property of programs, it's a property of languages and runtimes.

      That sounds suspiciously like an advertising jingle to me. Safety is most certainly a property of programs. Either a program is "safe", or it isn't. How the program was developed is another matter: some languages certainly promote safe practises more than others. Java promotes some safe practises, but in the process rides roughshod over a lot of other safety considerations. This makes it good for some applications (where the first set of safety criteria is all-important) and less than good for others (where the second set comes into play). Does this surprise anyone?

      ...most C++ programs contain large amounts of code over which you have no control.

      That may or may not be the case. On the other hand, with C++ I have a lot of choice as to exactly which code that is not under my control I use. So "not under my control" is a relative thing ;-).

      And, in fact, most C++ abstractions are merely advisory and offer no guarantees. Just because you tie methods to an object that you intend to be stack allocated doesn't mean that I won't allocate it on the heap and fail to release the lock because of that.

      Once again, I'd hesitate before making such an assertion. Here is a code fragment that may make you think twice:

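      #include <memory>

      class Foo {
      private:
          Foo() { }   // private: clients cannot write "new Foo" or declare a Foo themselves
      public:
          static std::auto_ptr<Foo> create()
          {
              // auto_ptr's pointer constructor is explicit, so wrap the raw pointer by hand
              return std::auto_ptr<Foo>(new Foo());
          }
      };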

      As always, it depends on what you're trying to achieve, and most importantly, on who is writing the code. Use the right tool for the right job.

  7. Circular References by bcore · · Score: 3, Insightful

    Another flaw of ref counting is that if you have two objects which are no longer referenced by any part of the active application, but which have references to each other, they will not get GC'ed, leading to memory leaks. Ref counting alone is just not good enough for any serious application, unless you force the programmer to look after cleaning up circular references, which kinda defeats a lot of the benefit of using a GC'ed language.
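
    The cycle problem is easy to reproduce with any reference-counted pointer. The sketch below assumes C++11 std::shared_ptr and std::weak_ptr; StrongNode, WeakNode and the live_nodes counter are made up for illustration. The strong-link version has to break the cycle by hand through a raw pointer, which is exactly the manual cleanup described above.

```cpp
#include <cassert>
#include <memory>

static int live_nodes = 0;   // counts constructed-but-not-destroyed nodes

struct StrongNode {
    std::shared_ptr<StrongNode> other;   // owning link: takes part in the count
    StrongNode()  { ++live_nodes; }
    ~StrongNode() { --live_nodes; }
};

struct WeakNode {
    std::weak_ptr<WeakNode> other;       // non-owning link: does not keep the peer alive
    WeakNode()  { ++live_nodes; }
    ~WeakNode() { --live_nodes; }
};

// Builds a two-node cycle with owning links, drops both external references,
// and returns how many nodes are still alive at that point.
int live_after_dropping_strong_cycle() {
    StrongNode* raw = nullptr;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;
        raw = a.get();
    }
    int stranded = live_nodes;   // both nodes keep each other alive
    raw->other.reset();          // manual cycle break so the demo itself does not leak
    return stranded;             // raw is dangling from here on and must not be used
}

// Same shape with non-owning links: nothing is stranded.
int live_after_dropping_weak_cycle() {
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->other = b;
        b->other = a;
    }
    return live_nodes;
}
```

    With owning links both nodes outlive every external reference; with weak links they are destroyed as soon as the block ends.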

    1. Re:Circular References by wormbin · · Score: 1

      if you have two objects which are no longer referenced by any part of the active application, but which have references to each other, they will not get GC'ed

      This is completely incorrect.

      The mark/sweep algorithm will catch unreachable circular references--just read the article. All the popular Java VMs and MS's .NET implementation GC unreachable circular references.
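
      A toy illustration of why a tracing collector is indifferent to cycles: it marks whatever is reachable from the roots and sweeps everything else, regardless of how the unreachable objects point at each other. ToyHeap below is a hypothetical model in which objects are just indices into a table; it is not how any real VM lays out its heap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap: objects are indices; edges[i] lists the objects that i refers to.
struct ToyHeap {
    std::vector<std::vector<std::size_t>> edges;
    std::vector<bool> alive;

    std::size_t alloc() {
        edges.push_back({});
        alive.push_back(true);
        return edges.size() - 1;
    }

    void link(std::size_t from, std::size_t to) { edges[from].push_back(to); }

    // Mark everything reachable from the roots, then sweep the rest.
    // Returns the number of objects reclaimed.
    std::size_t collect(const std::vector<std::size_t>& roots) {
        std::vector<bool> marked(edges.size(), false);
        std::vector<std::size_t> work(roots);
        while (!work.empty()) {                      // mark phase
            std::size_t obj = work.back();
            work.pop_back();
            if (!alive[obj] || marked[obj]) continue;
            marked[obj] = true;
            for (std::size_t next : edges[obj]) work.push_back(next);
        }
        std::size_t reclaimed = 0;
        for (std::size_t i = 0; i < edges.size(); ++i) {   // sweep phase
            if (alive[i] && !marked[i]) {
                alive[i] = false;
                edges[i].clear();
                ++reclaimed;
            }
        }
        return reclaimed;
    }
};

// Builds root, a, b with a <-> b pointing at each other; optionally links
// root -> a, then collects with root as the only root.
std::size_t reclaimed_from_cycle(bool reachable) {
    ToyHeap heap;
    std::size_t root = heap.alloc();
    std::size_t a = heap.alloc();
    std::size_t b = heap.alloc();
    heap.link(a, b);
    heap.link(b, a);
    if (reachable) heap.link(root, a);
    return heap.collect({root});
}
```

      The unreachable cycle is reclaimed in one pass, while the same cycle hanging off a root is left alone.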

    2. Re:Circular References by Anonymous Coward · · Score: 0

      Mark/sweep is not reference counting.

  8. Re:Sigh. It's not a "feature" of other languages.. by QuantumG · · Score: 0, Troll

    God this is crap.

    --
    How we know is more important than what we know.
  9. Give me break by Great_Jehovah · · Score: 1
    How about some facts:
    • All the languages mentioned in the article are just as "stack-based" as C and Java (with the possible exceptions of prolog and haskell, depending on your definition.)
    • If you turn off the GC, all the GC'd languages will leak memory but continue to work "correctly".
    • Java has no means of managing memory sans GC.
    1. Re:Give me break by random_static · · Score: 1
      All the languages mentioned in the article are just as "stack-based" as C and Java (with the possible exceptions of prolog and haskell, depending on your definition.)

      there are stack-based (and RxRS-compliant) Scheme implementations out there? i thought it was tricky at best to implement continuations in a stack-based system, don't you have to allocate them on the heap? (or have i just proven how clueless i really am about Scheme compiler design...?)

  10. Re:Sigh. It's not a "feature" of other languages.. by jonadab · · Score: 3, Interesting

    > By "correctly," I'm specifically leaving out memory leaks.

    What a thing to leave out. Memory leaks are one of the hardest-to-track-down
    and most annoying kinds of bugs that we perpetually see in application after
    application. Okay, crashes are more annoying and pervasive, sure. And
    buffer overruns (which are not a problem in most languages that have GC,
    albeit GC is not the reason they're not a problem). But memory leaks are
    high on the list.

    > And in functional programming, you're creating functions on the fly.

    I'm trying to imagine a programming language that doesn't let you create
    functions on the fly but is powerful enough for writing real applications.
    The only thing I can come up with is that you could write what basically
    amounts to an interpreter so that you wouldn't have to write "functions"
    in the implementation language but could write them in the interpreted
    language instead. But that seems like a really ugly hack, just to avoid
    including real memory management in the compiler/interpreter/vm/whatever.

    It is possible to get around the need for closures (i.e., anonymous routines
    that hold references to otherwise-out-of-scope lexicals), if you have a
    sufficiently powerful object system. But again, it seems like a questionable
    goal; sometimes closures are really the most convenient way to accomplish
    something. (Sometimes they're not, of course... that's why I favour
    multiparadigmatic languages.)

    > So for all those languages, it's not an "ease of use" thing. It's a
    > "there's no way for a programmer to even do it manually at all" thing.
    > GC is the only option.

    Strictly *theoretically*, the programmer can do all that stuff in any
    Turing-complete language; it's possible to do functional programming in
    8086 assembly language, for example, if you're willing to go far out of
    your way to do it. But in practice, neither assembly language nor C
    really makes that easy or practical, no. But then, there are actually
    quite a lot of things that those languages don't make easy or practical.

    --
    Cut that out, or I will ship you to Norilsk in a box.
  11. err, what? by Anonymous Coward · · Score: 0
    Please explain to me exactly how you can implement a resource management system (or regime as you call it) in C++ for, let's say, managing socket connections, that has no equivalent in Java.

    Dammit - I can't tell whether you are truly retarded or just trolling. But for anyone else who might be confused, the grandparent is referring to RAII (Resource Acquisition Is Initialization), which ties the lifetime of a resource to the lifetime of the managing object. This is the part that requires deterministic finalization - i.e. destructors. (The assumption is that the managing object's lifetime is controlled via something like auto_ptr or some ref-counted smart pointer).
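
    For anyone following along, here is a rough Java sketch of what that means in practice. With no destructor to run at scope exit, the release step that RAII gets automatically has to be written out by every caller in a finally block. TrackedResource and its openCount counter are made up for illustration; a real socket wrapper would look different.

    ```java
    // Hypothetical resource wrapper; stands in for a socket or file handle.
    class TrackedResource {
        static int openCount = 0;        // how many resources are currently open
        private boolean closed = false;

        TrackedResource() { openCount++; }   // "acquire" in the constructor

        void use(boolean fail) {
            if (fail) throw new IllegalStateException("simulated failure");
        }

        void close() {                       // must be called explicitly
            if (!closed) { closed = true; openCount--; }
        }
    }

    class FinallyDemo {
        // Java has no destructor that runs at scope exit, so release goes in finally.
        static void work(boolean fail) {
            TrackedResource r = new TrackedResource();
            try {
                r.use(fail);
            } finally {
                r.close();   // runs on normal return and on exception
            }
        }

        public static void main(String[] args) {
            work(false);
            try { work(true); } catch (IllegalStateException e) { /* expected */ }
            System.out.println("open resources: " + TrackedResource.openCount);
        }
    }
    ```

    Forget the finally in one caller and the resource stays open until a finalizer happens to run, if it ever does.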

    1. Re:err, what? by jrumney · · Score: 1

      No, the grandparent specifically said that finalizers in Java do not meet his criteria, so that cannot be it.

    2. Re:err, what? by TheSunborn · · Score: 1
      Are you using finalize as a destructor? That's not good. Just look at the documentation for finalize (as described by Sun in the Object class):


      The Java programming language does not guarantee which thread will invoke the finalize method for any given object. It is guaranteed, however, that the thread that invokes finalize will not be holding any user-visible synchronization locks when finalize is invoked. If an uncaught exception is thrown by the finalize method, the exception is ignored and finalization of that object terminates.


      Welcome to the world of leaked resources in Java(TM).
    3. Re:err, what? by Anonymous Coward · · Score: 0

      ugh. Learn to read. Or write clearly. Or something.

      I (the AC who posted "err, what?") was agreeing with my grandparent post (Markus Registrada: "Under the Rug") about Java being weak in this area compared to C++, i.e. that finalizers were not enough.

  12. Dilbert GC by StarWynd · · Score: 5, Funny

    This is my kind of garbage collection!

  13. a way to give the GC a hint? by Doppler00 · · Score: 3, Interesting

    It might be useful if some languages had an optional method of hinting that an object should be garbage collected soon. This would help in languages like Java, where you get a huge amount of data stored and then all at once the disk thrashes as it GCs everything. For some algorithms, it would be nice to tell Java ahead of time that you're done with the object and you're not going to reference it anymore. The nice thing is, though, it wouldn't be a requirement, so you wouldn't have to worry about deleting an object still in use by mistake. I wonder how efficient this would be.

    1. Re:a way to give the GC a hint? by Nasarius · · Score: 2, Informative
      Simply doing something like:
      foo = null;
      is sufficient. If you really want to, you can call the garbage collector manually:
      System.gc();
      --
      LOAD "SIG",8,1
    2. Re:a way to give the GC a hint? by metalix · · Score: 1

      or in .net:

      var = Nothing

      or manually

      var.Dispose()

    3. Re:a way to give the GC a hint? by iwadasn · · Score: 1


      This can be done with escape analysis, which Java currently doesn't do, but hopefully will in the future. What happens is that the bytecode compiler analyzes the program and detects all pointers that don't leak. Basically, the rules are like this...

      A pointer in a function is "stackable" if it is...

      1) Never assigned to a static variable.
      2) Never assigned to a member variable of a non-stackable class.
      3) Never passed as a non-stackable argument to a function.

      And there might be other conditions as well, some function arguments might be semi-stackable, that is, stackable only if the pointer to the object itself is stackable, and no non-stackable variable is ever assigned from a member of the object. A semi-stackable pointer can be assigned to any member of the "this" pointer.

      In this way, for instance, Map.add(object o) would probably have a semi-stackable argument o, Comparable.compareTo(object o) should have a stackable pointer for o....

      At this point, all stackable objects can be allocated on the stack, and no garbage collection is needed for them. Just run their finalizers when the stack frame returns and discard them. You probably also have to check that the finalizer can't resurrect the object, but that should be rare.
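
      To make rules 1-3 concrete, here is a small sketch of one method whose object never escapes and one whose object does. The class and method names are invented for the example, and whether a given JVM actually stack-allocates anything here is up to that JVM.

      ```java
      class EscapeDemo {
          static final class Point {
              final int x, y;
              Point(int x, int y) { this.x = x; this.y = y; }
          }

          static Point leaked;   // static field: anything stored here escapes

          // p is never stored in a static or member variable and never handed
          // to other code, so it would count as "stackable" under the rules above.
          static int stackable(int a, int b) {
              Point p = new Point(a, b);
              return p.x + p.y;
          }

          // p is assigned to a static variable (rule 1), so it has to live on the heap.
          static int notStackable(int a, int b) {
              Point p = new Point(a, b);
              leaked = p;
              return p.x + p.y;
          }

          public static void main(String[] args) {
              System.out.println(stackable(2, 3));
              System.out.println(notStackable(2, 3));
          }
      }
      ```

      Both methods compute the same thing; only the second one forces the collector to track the Point.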

      That should wipe out the advantage of value objects, and reduce the amount of garbage to be collected, probably cut it in half.

      Another possibility is to consider (I don't know the technical term for this...) object compounding. Basically any object that is internal to another object should just be inlined inside it. That would reduce the total number of objects by 1, and the pointer to the new object (someObject.blah) is just calculated as (someObject.this + blah_offset). As long as nobody can ever get this pointer outside of someObject, you don't have to track it separately. This could probably greatly reduce the number (but not the total size) of live objects, and thus accelerate garbage collection.

      Basically, what I'm getting at is that intelligent algorithms would have very little to gain (and a lot to lose) by listening to human input.

    4. Re:a way to give the GC a hint? by tcopeland · · Score: 2, Informative
      > you can call the garbage collector manually

      Unfortunately, this doesn't force it to run. From the JDK 1.4.2 Javadocs for System.gc():
      Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse.
      Note the "suggests". No guarantees...
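
      A small sketch of what can and cannot be relied on, using a WeakReference to watch the object. The class name is made up. The only guaranteed result is the first one: an object that is still strongly referenced is never cleared. What happens after the reference is nulled depends on the JVM.

      ```java
      import java.lang.ref.WeakReference;

      class GcHintDemo {
          static Object strong;   // a strong reference keeps the object reachable

          // Returns true if the object is still there after a GC request.
          static boolean survivesWhileReferenced() {
              strong = new Object();
              WeakReference<Object> weak = new WeakReference<>(strong);
              System.gc();                 // only a request; the JVM may ignore it
              return weak.get() != null;   // always true: still strongly reachable
          }

          public static void main(String[] args) {
              System.out.println("alive while referenced: " + survivesWhileReferenced());
              WeakReference<Object> weak = new WeakReference<>(strong);
              strong = null;               // the "foo = null" hint: now eligible
              System.gc();
              // May print true or false depending on the JVM; nothing is guaranteed.
              System.out.println("cleared after hint: " + (weak.get() == null));
          }
      }
      ```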
    5. Re:a way to give the GC a hint? by wormbin · · Score: 1

      I've always wondered if the opposite would help. i.e. being able to say "never GC this allocation"

      Many of my programs consist of objects that exist for the life of the program, and every time the generational GC hits I picture it scanning over all of these permanent objects, forcing page hits, swapping, and possibly thrashing.

    6. Re:a way to give the GC a hint? by hding · · Score: 1

      Well, a typical generational collector will promote the object to a longer lived generation after touching it a few times without collecting it (perhaps even just once), and many will allow you to simply allocate it into a longer lived (or even permanent) generation to begin with.
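
      A toy model of that promotion policy, for illustration only. Liveness is faked with a flag instead of being traced from roots, and the threshold of two survived collections is an arbitrary assumption, not what any particular collector uses.

      ```java
      import java.util.ArrayList;
      import java.util.List;

      class ToyGenerations {
          static final int PROMOTE_AFTER = 2;   // assumed threshold, purely illustrative

          static final class Obj {
              final String name;
              boolean live = true;   // stand-in for "reachable from a root"
              int survived = 0;      // minor collections survived so far
              Obj(String name) { this.name = name; }
          }

          final List<Obj> nursery = new ArrayList<>();
          final List<Obj> tenured = new ArrayList<>();

          Obj allocate(String name) {
              Obj o = new Obj(name);
              nursery.add(o);
              return o;
          }

          // Minor collection: only the nursery is scanned; tenured objects are not touched.
          void minorCollect() {
              List<Obj> stillYoung = new ArrayList<>();
              for (Obj o : nursery) {
                  if (!o.live) continue;   // dead: simply dropped
                  if (++o.survived >= PROMOTE_AFTER) tenured.add(o);
                  else stillYoung.add(o);
              }
              nursery.clear();
              nursery.addAll(stillYoung);
          }
      }
      ```

      Once a long-lived object lands in tenured, later minor collections skip it, which is the effect the parent post was asking for.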

    7. Re:a way to give the GC a hint? by hak1du · · Score: 1

      It might be useful if some languages had an optional method of hinting that an object should be garbage collected soon

      Good garbage collected languages let the programmer give extensive hints to the garbage collector about locality, expected lifetime, desired latencies, mutability, and other relevant properties. That kind of functionality goes back to at least the early 1980's. Over the last two decades, however, some garbage collectors have gotten so good that they don't need a lot of those hints anymore.

      Whether giving hints to the GC would help Sun's Java implementation is hard to tell. It might or it might not.

    8. Re:a way to give the GC a hint? by napir · · Score: 1

      I've seen a lot of papers relating to escape analysis in Java. I've never seen one that gets a significant improvement in GC performance from stack allocation compared to a generational collector. (Note what I'm saying, because escape analysis can do wonderful things for removing lock/unlock calls in objects that never escape their thread.)

      Basically, it seems like allocating on the stack just isn't really cheaper than bump pointer allocation into the nursery. Both spaces also exhibit good locality, so your cache performance is similar. And the stack-allocatable objects don't live that long, so they never get copied out of the nursery, so their reclamation is free.

      Not to mention that a heavyweight escape analysis that finds a lot of candidates to stack allocate is generally pretty darned expensive, and that you're compiling Java at runtime, so the user sees the added compilation time.

      The "object compounding" you talk about is actually called "object inlining" like you hinted. It's a nifty idea, and you'll probably start seeing a lot more work done on the idea soon in the research community.

    9. Re:a way to give the GC a hint? by dvdeug · · Score: 1

      Note the "suggests". No guarantees...

      Pretty much every programming language runs on the as-if rule. The C code "a = b + c;" doesn't necessarily perform an addition and store; in theory, it could create a new closure a that includes references to the static values of b and c at that time and an addition operator, like might be done on a machine designed around lazy-evaluation functional programming.

      Even on a normal architecture with GCC, GCC could possibly change that to "a = b;" (if it knew c was 0), or delete it altogether (if it was unused, or only used in an if statement that could be proved to be constant), or hoist it outside a loop.

      It's quite possible the gc won't run if it just ran, or is currently running, or is below a certain memory usage.

    10. Re:a way to give the GC a hint? by tcopeland · · Score: 1

      > the as-if rule

      Yup. I think I was mostly responding to the original poster's claim that a System.gc() call would _force_ garbage collection, which it doesn't.

      I'd like to see exactly what System.gc() does, but, alas, it delegates to a native method in java.lang.Runtime...

    11. Re:a way to give the GC a hint? by iwadasn · · Score: 1


      I had a paper sitting around that discussed good escape analysis, can't seem to find a link to it now. Anyway, it suggested that the average performance gain was something like 20-40%, though that was mostly from synchronization elimination. Anyway, I can't imagine that the performance hit for escape analysis matters for a Server VM. When I'm starting up Tomcat it already takes two minutes to start. I don't really care if it takes ten minutes, if it makes it 40% faster. Seems to be true of most things. For instance, I don't restart jedit all that often, or Squirrel, or most of my other java apps.

      Anyway, pretty much agree with everything you have to say, though I think the startup time thing is an old saw. People wouldn't really care that much if their apps take a long time to start, so long as they run fast once they're started, I know I don't. This is within reason of course. A minute or two isn't devastating, especially if you can take most of it at boot time, or take it in bits and pieces as the app gets exercised.

  14. GC has always been efficient by hak1du · · Score: 3, Insightful
    however, [manual storage management] can be more efficient in many ways if properly handled. This discrepancy in efficiency has slowed the widespread adoption of the automated approach.

    There hasn't been a "discrepancy in efficiency". Good garbage collectors have been comparable to, or better than, manual storage allocators for decades.

    The perception of a "discrepancy in efficiency" has several causes:
    • Garbage collection allows programmers to get sloppy about storage management: if a non-GC program gets sloppy about storage management, it crashes, if a non-GC program gets sloppy about storage management, it just runs slowly. Unfortunately, as a result, many core libraries in garbage collected languages are pretty sloppily written and slow--the fault is with the libraries, not with garbage collection.
    • Garbage collection allows language implementors to make different design decisions. Many garbage collected languages will do memory allocation every time you use a floating point number. Imagine how slow C would be if you called "malloc" for every floating point number.
    • Garbage collection often bundles memory management overhead into single chunks of time, while manual storage allocators don't. Furthermore, garbage collector implementations really rub your nose in it, printing messages like "[starting garbage collection... done]". But doing a lot of storage management at once is usually more efficient overall--in aggregate, manual storage managers spend more time, they just diffuse it out. However, both kinds of behaviors exist with both storage managers, and you can pick and choose.
    The article is right that garbage collection is a good choice today. It is wrong in that it has pretty much always been a good choice. Garbage collection could have been widely adopted in the 1970's or 1980's, and we would have saved ourselves a lot of headaches and troubles without any loss in efficiency.
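
    The floating point case can be sketched in Java, which has both primitive and boxed doubles. This is only an illustration with invented names: the boxed version may allocate a new Double on each step (a JIT is free to optimize that away), while the primitive version never touches the heap.

    ```java
    class BoxingDemo {
        // Primitive doubles: no per-number heap allocation.
        static double sumPrimitive(int n) {
            double total = 0.0;
            for (int i = 0; i < n; i++) total += i * 0.5;
            return total;
        }

        // Boxed Doubles: each step may allocate a new Double object, which is
        // roughly the "malloc for every floating point number" situation.
        static Double sumBoxed(int n) {
            Double total = 0.0;
            for (int i = 0; i < n; i++) total = total + i * 0.5;
            return total;
        }

        public static void main(String[] args) {
            System.out.println(sumPrimitive(4));
            System.out.println(sumBoxed(4));
        }
    }
    ```

    Same answer both ways; the difference is how much garbage gets produced on the way there.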
    1. Re:GC has always been efficient by Brandybuck · · Score: 1

      Either I've had too much beer this evening, or your post is completely nonsensical. Given that I've had no beer this evening, I greatly suspect the latter. Unfortunately, since the moderators are on crack, I fear that your post will be moderated up.

      --
      Don't blame me, I didn't vote for either of them!
    2. Re:GC has always been efficient by Anonymous Coward · · Score: 1, Insightful

      Given that your response has zero technical content, you should consider the possibility that you are still so drunk that you don't even know how drunk you are.

      Maybe you should think your response through a little more and respond with some technical insight or point.

    3. Re:GC has always been efficient by HuguesT · · Score: 1

      What about real-time constraints? GC are generally non-deterministic (they start and finish according to their own rules), which might destroy your maximal response time in an RTOS. It is this very issue that has been the thorn in the side of GC adoption for the C and C++ standards.

      How about one of the earlier comments to the effect that mark-and-sweep type algorithms page-faults all the memory used by an application? That has got to be inefficient, and since virtual memory is not under the control of the application by definition there is nothing that can be done, except if the GC is directly under the control of the OS, which doesn't often makes sense (it's not very flexible then).

      The article itself says that there is no way to make a GC perform as well or better as a finely tuned hand-micro-managed in every case. The article being hugely in favour of GCs I'd take this comment as probably true. The advantage of GC is that it makes memory management easier, not necessarily more efficient.

      In languages that don't have GCs you can add one yourself (Boehm's GC works fine for C/C++, and is in fact used for GCJ, the GNU implementation of the Java language), with the benefit that you can turn it off if you don't want it for some reason, something you can't do in Java for example.

      Finally, memory is but one of many critical resources that need to be managed in a program. GCs only manage memory. C++ teaches a nice way of working with all critical resources, including memory: the Resource Acquisition is Initialization idiom. Worth learning about, and deterministic.

    4. Re:GC has always been efficient by ratboy666 · · Score: 1

      Micro-managed memory vs. GC Efficiency

      GC wins. And here is why...

      Most implementations of "micro-managed" memory use the allocate/free model. Programmers are very careful to allocate what they need, and free it when done.

      But... Allocation is usually very cheap. You have a big hunk of memory, and a "high water" mark. If the new allocation fits, just take it and advance the mark. Free is not so cheap. Blocks need to be coalesced.

      GC approach is to give the memory (same low cost as allocate), and simply NOT FREE the memory. When we run out of memory is the time that the "free" operation needs to be done. And only then. Also, only "live" memory needs to be dealt with. Everything else is "dead" (garbage). So, the "free" overhead tends to be less.

      Consider the following pattern:

      a = allocate
      b = allocate
      c = allocate
      d = allocate

      free c
      free a
      free d
      free b

      In a GC system, none of the free calls is done. a, b, c, d go out of scope or die, and when GC is needed, the dead memory is contiguous. No fancy "joining" needed.

      To achieve this in a "careful" non-GC system, the programmer must introduce object caches, and must know about the global behaviour of her program. Typically not the case.

      So, in general, GC actually beats "careful micro-managed memory".
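
      A minimal sketch of the "high water mark" allocation described above, written as a toy arena over a byte array. BumpArena and its methods are invented for illustration; reclaimAll() stands in for a collection in which nothing survived, which is the cheapest possible case and not what a real collector can assume.

      ```java
      class BumpArena {
          private final byte[] heap;
          private int highWater = 0;   // everything below this mark is allocated

          BumpArena(int size) { heap = new byte[size]; }

          // Returns the offset of the new block, or -1 when the arena is full
          // (the point at which a real collector would have to run).
          int allocate(int size) {
              if (size < 0 || size > heap.length - highWater) return -1;
              int offset = highWater;
              highWater += size;       // allocation is just advancing the mark
              return offset;
          }

          // All dead blocks are contiguous, so reclaiming them is one assignment;
          // there is no per-block free and no coalescing.
          void reclaimAll() { highWater = 0; }

          int used() { return highWater; }

          public static void main(String[] args) {
              BumpArena arena = new BumpArena(64);
              int a = arena.allocate(16);
              int b = arena.allocate(8);
              System.out.println(a + " " + b + " used=" + arena.used());
              arena.reclaimAll();
              System.out.println("used after reclaim=" + arena.used());
          }
      }
      ```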

      "Resource Acquisition is Initialization" idiom... Can easily lead to deadlocks. Pre-declaration of resources is arguably superior for your RTOS. Other approaches are available as well.

      Ratboy

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    5. Re:GC has always been efficient by hak1du · · Score: 4, Insightful

      What about real-time constraints?

      What about them? Real-time garbage collectors give you guaranteed real-time responses.

      I suspect that you have actually never used a real-time storage allocator of any form. The memory allocators that ship with major C/C++ compilers certainly make no real-time guarantees. The way people usually get real-time performance out of them is by pre-allocating large chunks of memory. Well, you can do that in garbage collected languages as well.

      GC are generally non-deterministic (they start and finish according to their own rules),

      No, they don't. Just like with malloc implementations, their behavior may differ from implementation to implementation, but it is generally pretty well understood. It can usually also be controlled well.

      Simple garbage collectors will only start a garbage collection when you ask for a block of memory and they can't satisfy the request; they don't just start up for no reason at all. Parallel garbage collectors may run a thread in parallel with the main program without ever stopping it. Incremental collectors do a little bit of work each time you allocate. Real-time collectors guarantee well-defined maximum response times for allocation.

      If the garbage collector in your language (Java?) doesn't do what you want, it's not a problem with garbage collection in general, it's a problem with the specific implementation your vendor has chosen to give you. Just like there are mediocre or bad malloc implementations, there are mediocre or bad garbage collectors.

      How about one of the earlier comments to the effect that mark-and-sweep type algorithms page-faults all the memory used by an application? That has got to be inefficient, and since virtual memory is not under the control of the application by definition there is nothing that can be done, except if the GC is directly under the control of the OS, which doesn't often makes sense (it's not very flexible then).

      Well, that comment is wrong. First of all, you don't have to use a mark-and-sweep collector. Most high-performance collectors are, in fact, generational and are very VM friendly (moreso than malloc/free in many cases). Second, operating systems have interfaces to their VM subsystems, so the GC can, in fact, control what is happening with paging--prefetching pages, etc. And they do. Even 20 years ago, Berkeley UNIX had system calls specifically designed to let Franz Lisp let the kernel know what it was doing. Third, a malloc implementation cannot move pointers around to make accesses more local or sequential--good garbage collectors do, so GC is actually superior in that regard.

      The article itself says that there is no way to make a GC perform as well or better as a finely tuned hand-micro-managed in every case.

      You can "micro-manage" and "fine tune" in the presence of a GC as much as you can in its absence. But in the presence of GC, you have the freedom to be sloppy and your code will still run--so many people don't bother. In C/C++, you don't have a choice.

      In languages that don't have GCs you can add one yourself (Boehm's GC works fine for C/C++, and is in fact used for GCJ, the GNU implementation of the Java language), with the benefit that you can turn it off if you don't want it for some reason, something you can't do in Java for example.

      No, that is backwards. In languages without GC, you cannot add a GC and get all the benefits from the GC. Boehm's GC, for example, may retain arbitrary amounts of garbage, and its lack of integration with the language and compiler means that it can't be anywhere near as efficient as an integrated GC. Boehm's GC is a great hack, and it works really well, but it is not something you can ultimately rely on. Furthermore, if you add Boehm's GC to a language without GC, you are still left with an unsafe programming language.

      Secondly, languages with garbage collection often give you full control over the GC: you can enable it or disable i

    6. Re:GC has always been efficient by Brandybuck · · Score: 1

      Very well. One example:

      "if a non-GC program gets sloppy about storage management, it crashes, if a non-GC program gets sloppy about storage management, it just runs slowly." What is it? Does it crash or run slowly? It can't do both!

      --
      Don't blame me, I didn't vote for either of them!
    7. Re:GC has always been efficient by rikkus-x · · Score: 1

      There's a way to avoid many mallocs/frees all at the same time, which is to use a memory pool and have some object look after it. Allocations usually come from an already allocated pool of memory, deallocations are simply the setting of a flag.

      The object handling the pool keeps some free space in the pool so that a burst of new small allocations won't cause a 'real' malloc.

      This technique is used by KDE. It's all done behind the scenes, so as a developer, you don't need to worry about it too much.
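
      A bare-bones sketch of the idea, with invented names and none of the growth logic: every block is allocated once up front, handing one out sets a flag, and giving it back just clears the flag. This is not KDE's code, only an illustration of the flag-based scheme.

      ```java
      class FlagPool {
          private final byte[][] blocks;   // allocated once, up front
          private final boolean[] inUse;   // one flag per block

          FlagPool(int count, int blockSize) {
              blocks = new byte[count][blockSize];
              inUse = new boolean[count];
          }

          // Hands out the index of a free block, or -1 if the pool is exhausted.
          int acquire() {
              for (int i = 0; i < inUse.length; i++) {
                  if (!inUse[i]) { inUse[i] = true; return i; }
              }
              return -1;
          }

          // "Deallocation" is just clearing the flag; nothing goes back to the system.
          void release(int index) { inUse[index] = false; }

          byte[] block(int index) { return blocks[index]; }

          public static void main(String[] args) {
              FlagPool pool = new FlagPool(2, 16);
              int a = pool.acquire();
              int b = pool.acquire();
              pool.release(a);
              System.out.println(a + " " + b + " reused=" + pool.acquire());
          }
      }
      ```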

      Rik

    8. Re:GC has always been efficient by Tom7 · · Score: 1

      A friend of mine did his thesis on a Garbage Collector that is parallel, concurrent, real-time and (mostly) tagless. We have it running in a real compiler. (Here's a later paper he wrote.) So real-time GC is absolutely possible.

      How about one of the earlier comments to the effect that mark-and-sweep type algorithms page-faults all the memory used by an application?

      Not all garbage collectors are mark-and-sweep. Also, copying GCs have much better fragmentation than malloc, so if you're concerned about keeping the working set small, a copying GC is absolutely the way to go.

    9. Re:GC has always been efficient by hak1du · · Score: 1
      That was a typo. Sorry. The correct reading (which I thought was still fairly clear from context, hence I posted no correction) is:
      if a non-GC program gets sloppy about storage management, it crashes, if a GC program gets sloppy about storage management, it just runs slowly.

      And "GC program" is shorthand for "program written in a language with garbage collection".
    10. Re:GC has always been efficient by hak1du · · Score: 1

      There's a way to avoid many mallocs/frees all at the same time, which is to use a memory pool and have some object look after it. Allocations usually come from an already allocated pool of memory, deallocations are simply the setting of a flag.

      Yes, and the same technique works in garbage collected systems and has been used for pretty much as long as garbage collectors have been around. Good garbage collectors actually give you extensive APIs to manage areas and pools in a way that the GC knows about and can help you with, but you can do it "by hand" in even the simplest of them.

      This technique is used by KDE. It's all done behind the scenes, so as a developer, you don't need to worry about it too much.

      It's unclear that it's a good idea for something like KDE to do this because many malloc implementations already do it for you. Building this sort of functionality in user code generally does a worse job than when it's integrated into the memory allocator.

    11. Re:GC has always been efficient by rikkus-x · · Score: 1
      It's unclear that it's a good idea for something like KDE to do this because many malloc implementations already do it for you. Building this sort of functionality in user code generally does a worse job than when it's integrated into the memory allocator.

      KDE uses dlmalloc, slightly patched to add some spinlocks. glibc uses dlmalloc now, apparently, so the speed advantage has probably gone away on Linux.

      Rik

    12. Re:GC has always been efficient by HuguesT · · Score: 1

      Thanks, this is a good reply, I still have more questions:

      1- It is true that even though I've programmed in RT environments, I've never used a RT GC, I'm just relaying the comment that both the C and C++ committee have delayed adoption of a GC standard for those two languages and that a serious issue is real-time performance. The C/C++ standards committees are not a bunch of idiots so I take the view that there must be something true in there.
      You are correct that for some hard real-time problems even using malloc is a no-go area.

      2- Language safety. I'm going back to the comment regarding other resources than memory. If you need a GC for your language to be safe then you need some other automated mechanism to close down open resources for you as well. GCs are not the be-all and end-all of language safety.

    13. Re:GC has always been efficient by hak1du · · Score: 1

      It is true that even though I've programmed in RT environments, I've never used a RT GC, I'm just relaying the comment that both the C and C++ committee have delayed adoption of a GC standard for those two languages and that a serious issue is real-time performance. The C/C++ standards committees are not a bunch of idiots so I take the view that there must be something true in there.

      And the pope used to say that the earth was at the center of the universe. Well-known people have their hangups, just like everybody else, and particular religions gather in their own churches.

      You are correct that for some hard real-time problems even using malloc is a no-go area.

      Using "malloc" is a "no-go area" in each and every hard real-time problem: "malloc" makes no guarantees whatsoever about latency. The only way you can call "malloc" in a hard real-time system is if you know its implementation and know that it actually makes hard real-time guarantees. But the

      If you need a GC for your language to be safe then you need some other automated mechanism to close down open resources for you as well. GCs are not the be-all and end-all of language safety.

      Oh, but it is. Leaking a file descriptor is not a question of runtime safety, it's just a bug. However, deallocating memory that is still in use has unpredictable and undefined consequences, and that is a violation of runtime safety.

    14. Re:GC has always been efficient by voodoo1man · · Score: 1
      I'm just relaying the comment that both the C and C++ committee have delayed adoption of a GC standard for those two languages and that a serious issue is real-time performance. The C/C++ standards committees are not a bunch of idiots so I take the view that there must be something true in there.
      I'm not familiar with the stance of the C and C++ committees on garbage collection, but "real-time" (however you choose to define it) considerations are the least of the problems of getting GC into C. hak1du already mentioned some of the faults of Boehm's collector. All these can be traced to the root of all evil: pointers and weak typing. The reason that Boehm's collector is called "conservative" is that it cannot reliably predict what is and isn't a pointer (it works on probabilities derived from heuristics), since in C on contemporary architectures there isn't a difference between one and an int. This means that some garbage may in fact go unclaimed, and even worse, there are some situations (but you really have to go out of your way to produce them) where legitimate data is wiped. In a "pure" garbage collected language, there is no concept of pointers or dereferencing - you always pass everything by reference (objects like integers are just made immutable). This is, IMO, the biggest benefit garbage collection has for the programmer (safety concerns are for the end users :)).
      I'm going back to the comment regarding other resources than memory. If you need a GC for your language to be safe then you need some other automated mechanism to close down open resources for you as well.
      Most GCed languages I've come across automatically close things like file descriptors and sockets. However, the good ones provide you with explicit commands to do so, since it's quite easy (on Unices, anyway) to use up all the file descriptors between GC runs.
      --

      In the great CONS chain of life, you can either be the CAR or be in the CDR.

  15. Feels like by nate+nice · · Score: 2, Interesting

    I feel like I just read a small section in the memory management section of an operating systems or programming languages text book. I'm not sure what to discuss here; no new ideas were expressed or presented. Perhaps the author could have postulated new ideas for memory management or suggested how current ideas could be improved. Interesting read if you're a programmer who never really got into the mechanics of a programming language and what certain runtime systems do to make your program work. Then again, I would probably call you a strict-scripter, and when scripting you're generally more concerned with expressions rather than mechanics.

    Although, the point the author made about CPUs being cheaper and faster, and how this allows the programmer to care less and less about mechanics and use the extra power to make programming a more expressive rather than mechanical practice, is interesting.

    Personally, I see no problem with one day having high level application programmers who know nothing of hex, memory management or physical hardware but rather algorithms, computability and productions, etc. Of course, there will always be a place for the "computer programmer", but also a place for the "analytical abstractionist engineer".

    --
    "If you are a dreamer, a wisher, a liar, A hope-er, a pray-er, a magic bean buyer ..."
    1. Re:Feels like by Just+Some+Guy · · Score: 1
      Personally, I see no problem with one day having high level application programmers who know nothing of hex, memory management or physical hardware but rather algorithms, computability and productions, etc. Of course, there will always be a place for the "computer programmer", but also a place for the "analytical abstractionist engineer".

      A lot of us already work in that space. I do happen to know my way around bare metal, but I haven't done any low-level coding in a long, long time. I'm currently writing a lot of Python that runs within a Zope server, and none (repeat: none) of my programming involves hex, memory management, or physical hardware. Instead, whenever we find a bottleneck, I analyze the high-level algorithms to find non-obvious design characteristics that can be remodeled.

      I have tens of thousands of Python LOC (would be hundreds of thousands of C LOC) online and used by customers every day, and I have no idea what the memory profile of any of it would look like. I know that the system never touches swap, and that the whole site is robust and responsive, but beyond that, I simply don't care anymore.

      I've put in my time at the hardware level. However, that's not what I want to be doing for the rest of my life, and with the current crop of programming languages and design environments, I don't have to.

      --
      Dewey, what part of this looks like authorities should be involved?
    2. Re:Feels like by nate+nice · · Score: 1

      Yeah, Python is a personal favorite of mine as well. It's very expressive in that I tell a computer what to do, not how. I don't really have anything else smart to say, but I would agree with you that Python is a language that allows you to get on with it and still look under the hood if necessary. It's the language I use most often nowadays, and I'm still blown away from time to time at how easy and natural it is to do almost anything, whereas in my C++ life I had to spend so much time writing so many basic algorithms (and yes, I know the STL very well) or searching for some kind of library that works nothing like my other libs and takes time to adapt to the author's great ideas of programming. In Python, you get it all and it's all predictable (this is a good thing), and I only hope I am as lucky as you to find work using it within the domain of other smart language systems.

      --
      "If you are a dreamer, a wisher, a liar, A hope-er, a pray-er, a magic bean buyer ..."
  16. The extreme situations are the only ones ... by taigu · · Score: 1

    The extreme situations are the only ones that are valuable. If you are not coding an "extreme situation," your job is outsourced. Any application that can tolerate garbage collection is trivial. Thanks anyway -- I'll stick to C and assembly. At least I will have a job tomorrow.

    1. Re:The extreme situations are the only ones ... by Tom7 · · Score: 2, Interesting

      Any application that can tolerate garbage collection is trivial. Thanks anyway -- I'll stick to C and assembly.

      Wow, with this attitude I can see why you are worried about keeping your job.
      Did you know that GCC uses a garbage collector? They found it too difficult to manually manage memory. Is GCC a trivial application?
      In my program we write loads of decidedly non-trivial software all of the time that not only tolerates, but benefits greatly from GC.

      Garbage collection is not appropriate for every task, but to assert that all tasks worth doing demand C and assembly is ridiculous.

    2. Re:The extreme situations are the only ones ... by cheesybagel · · Score: 1

      So *THAT* is why GCC is so slow and a memory hog. ;-)

    3. Re:The extreme situations are the only ones ... by Just+Some+Guy · · Score: 1
      I'll stick to C and assembly. At least I will have a job tomorrow.

      Most schools will continue to let you "work" there as long as you keep paying tuition.

      On the other hand, I can't think of a single programmer I respect that doesn't understand "different tools for different jobs".

      --
      Dewey, what part of this looks like authorities should be involved?
  17. The GC pitfall by jtheory · · Score: 4, Insightful

    Good article, though very limited in scope (basically just a list of GC methods, wrapping up with the methods used by recent Java and .NET interpreters). I was a little disappointed that they didn't get into the implications of using languages with GC.

    One pitfall that I've noticed basically comes along with the benefit of avoiding "micro-managed" explicit memory management -- there are a lot of Java coders who don't think at *all* about memory management, because they think it's all handled for them. Mix that in with an over-excitement about OO, and you get some impressively slow and non-scaleable code.

    You DO need to understand, at least on a basic level, what's going onto the heap, and what the garbage collector has to do to keep up with your "garbage". Carefully nulling out objects that are going to be out of scope in a millisecond is just wasting space, but you should definitely keep an eye on what objects you're allocating within that loop that runs a million times. They're all going on the heap; are they all going to be on there at the same time? When are they going to be eligible for collection? Are they just Strings, or larger objects (which possibly create other objects when they are created)?

    If you have to optimize a section of code, consider sticking to primitives and Strings (obviously you're balancing this against the cost of possibly less-maintainable code!), and don't forget that when you instantiate com.foo.Bar, the constructors of all of its superclasses run too, creating any member objects they hold. And don't make a variable static for no reason -- it won't get collected with the object instance....

    Two useful things to think about -- heap size (the objects you're actively using at a given moment, so they can't be collected), and churn rate (how fast you're creating and trashing objects). Object creation/destruction isn't as costly as it was with the early versions of Java (no, you probably don't need that Thread pool!). But any application that needs to scale requires some thought on memory usage and churn before you start coding.

    --
    There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
    1. Re:The GC pitfall by Antity-H · · Score: 1
      But any application that needs to scale requires some thought on memory usage and churn before you start coding.

      While I agree with your point, you might want to (re)read this previous discussion.

      Do applications _need_ to scale? Well, yes, of course a very restricted set of applications do need to scale, but for most applications, being developed by
      [... Java] coders who don't think at *all* about memory management, because they think it's all handled for them
      is perfectly alright, because in practice the result is that memory management actually _is_ all handled for them.
    2. Re:The GC pitfall by StrawberryFrog · · Score: 2, Interesting

      there are a lot of Java coders who don't think at *all* about memory management, because they think it's all handled for them. Mix that in with an over-excitement about OO, and you get some impressively slow and non-scaleable code.

      While you are entirely right, this is no different from previous generations of programming languages. You always do better if you have a bit of understanding of the wiring behind the board.

      I'm sure that there were objections to high-level languages by assembler coders who objected that "there are a lot of C coders who don't think at *all* about the assembler generated" and "there are a lot of C++ coders who don't think at *all* about the pointers behind those object refs".

      --

      My Karma: ran over your Dogma
      StrawberryFrog

    3. Re:The GC pitfall by jtheory · · Score: 1

      While you are entirely right, this is no different from previous generations of programming languages

      Fair enough, though if you don't understand memory management in C you will know it, because you'll have massive memory leaks or serious, noticeable bugs.

      I think I mistakenly gave the impression that I don't like GC. I love it, and I think it's definitely worth the few drawbacks -- I just wanted to point out that it's not a silver bullet. And I do get frustrated with inexperienced programmers who speak scornfully of how slow Java is compared with C or C++, because their own class projects were slow...

      --
      There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
    4. Re:The GC pitfall by jtheory · · Score: 1

      of course a very restricted set of applications do need to scale, but for most applications being developed by [novice coders it is] perfectly alright, because in practice the result is that memory management actually _is_ all handled for them.

      Yeah, I think I clicked "Reply" with the idea of giving a few GC pointers, and ended up with a weird kind of elitist rant. I love GC, and it's totally worth the few tradeoffs (which are more education issues than anything else).

      You might be minimizing the number of projects that need some level of scaleability, though. Any server-centered technology (i.e., interactive website) is based on scaling -- thin client, all the work done on the server, that sort of thing. Though I guess it comes down to this: I have seen a lot of projects coded by people who probably weren't ready for them yet... but that happens everywhere, and that's why projects have a crazy failure rate, and it isn't a problem specific to GC in any way.

      Eh... I should avoid those late night posts.

      --
      There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
    5. Re:The GC pitfall by StrawberryFrog · · Score: 1

      Fair enough, though if you don't understand memory management in C you will know it, because you'll have massive memory leaks or serious, noticeable bugs.

      Well, you might not know. The program I'm currently working on had moderate memory leaks, because most of the engineers working on it mostly understood how to use a non-GC OO language. Everybody suspected, nobody knew, and nobody was quite sure of the impact, until I used a tool to locate the memory leaks and fix them.

      --

      My Karma: ran over your Dogma
      StrawberryFrog

    6. Re:The GC pitfall by angel'o'sphere · · Score: 1

      Hm ...

      I'm not convinced ....

      Most of the time you absolutely do not need to consider anything you preach here. First reason: all objects in Java live on the heap. So you simply can't do anything about it.

      Second: what have static variables to do with garbage collection? Static variables get collected just like non static ones.

      Third: why does everybody repeat the myth of "carefully nulling references"? You assign null if you WANT the reference to be null, and you don't null to assist the GC. Nulling does not assist, it only costs runtime ... and likely that's why you claim Java is slow.

      (no, you probably don't need that Thread pool!)

      This one beats it all. Sure you need a thread pool. Creating a new thread, does not only allocate a thread object, but it likely makes a kernel call to get a new thread handle. You pool to avoid the kernel call. Not to avoid the memory allocation/deallocation.

      angel'o'sphere

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    7. Re:The GC pitfall by jtheory · · Score: 1

      I can clarify some of my points, and give examples. None of these are hard and fast rules, by the way -- it all does depend on what you're doing, and what you need. I do NOT advocate making your code unmaintainable for the sake of questionable optimizations.

      Most of the time you absolutely do not need to consider anything you preach here. First reason: all objects in Java live on the heap. So you simply can't do anything about it.

      Correct, many times you *don't* need to worry about memory in Java at all, especially in client-side applications. Server-side, though, it often helps to know a little more. I'm not sure what you mean by "there's nothing you can do about" the objects on the heap... your design can have a large effect on how many objects are created, how big they are, and how long they stay on the heap. If you are processing a massive resultset from a sql query, creating a complex hierarchy of objects that each knows how to process a certain type of row, and creating one for each row, will throw a lot of objects onto the heap, and you'll get a lot more churn. If you store each one in a collection to process after you're done reading, then the garbage collector can't touch them yet, and instead of churn, your heap just grows a lot.
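
      A rough sketch of the churn-versus-growth difference, written in Python rather than Java and using the standard `tracemalloc` module to compare peak memory. `RowHandler` and `rows` are hypothetical stand-ins for the per-row processor objects and the SQL result set; the exact numbers will vary by interpreter and version.

```python
import tracemalloc

class RowHandler:
    """Hypothetical per-row object, standing in for a 'processor' instance."""
    def __init__(self, row):
        self.payload = [row] * 50  # some per-object state

    def process(self):
        return len(self.payload)

def rows(n):
    # Stand-in for a database cursor yielding n rows.
    for i in range(n):
        yield i

def peak_bytes(fn, n):
    # Run fn(n) under tracemalloc and report the peak traced allocation.
    tracemalloc.start()
    fn(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

def streaming(n):
    # Churn: each handler becomes garbage as soon as its row is done.
    total = 0
    for row in rows(n):
        total += RowHandler(row).process()
    return total

def retained(n):
    # Heap growth: every handler stays reachable until the whole set is read.
    handlers = [RowHandler(row) for row in rows(n)]
    return sum(h.process() for h in handlers)

peak_streaming = peak_bytes(streaming, 20_000)
peak_retained = peak_bytes(retained, 20_000)
```

      Both functions do the same work; only the lifetime of the objects differs, and that is what shows up in the peak figure.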

      Static variables get collected just like non static ones.
      Think about a static reference to a large object. Even when all instances of class "Foo" have been GC'ed, that reference to "bigThing" remains, and bigThing is not eligible for collection, until you specifically set Foo.bigThing = null. It's not necessarily bad -- maybe there are lots of instances of Foo that all use bigThing, and you want to keep it on the heap.
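
      The same effect can be sketched in Python, where a class-level attribute plays roughly the role of a Java static field. This is only an analogy, assuming CPython; `BigThing`, `Foo` and `probe` are illustrative names, and the weak reference is there purely to observe whether the object is still alive.

```python
import gc
import weakref

class BigThing:
    """Hypothetical large object."""
    def __init__(self):
        self.data = bytearray(1_000_000)

class Foo:
    big_thing = None  # class-level attribute, the analogue of a static field

Foo.big_thing = BigThing()
probe = weakref.ref(Foo.big_thing)  # observes liveness without keeping it alive

instances = [Foo() for _ in range(3)]
del instances         # every Foo instance is now garbage
gc.collect()
alive_after_instances_gone = probe() is not None

Foo.big_thing = None  # drop the class-level reference explicitly
gc.collect()
alive_after_clearing = probe() is not None
```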

      Third: why does everybody repeat the myth of "carefully nulling references"? You assign null if you WANT the reference to be null, and you don't null to assist the GC. Nulling does not assist, it only costs runtime

      I'll check my post again, but I think I said that people who carefully null references are *wasting their time*. Nulling out variables *technically* does affect GC (because an object is "dead" and ready to collect when all references to it are gone), but there's usually no reason to null a local variable for this reason, because it's going out of scope in a few millisecs anyway.

      ... and likely that's why you claim Java is slow.
      Java isn't slow at all! I do most of my server-side work in Java, and it's pretty zippy and very scalable. Anyway...

      This one beats it all. Sure you need a thread pool. Creating a new thread, does not only allocate a thread object, but it likely makes a kernel call to get a new thread handle. You pool to avoid the kernel call. Not to avoid the memory allocation/deallocation.

      I said "probably". If you're writing a webserver or something like that, which otherwise would create scads of threads and only use them for a short time, yes. (Object pool, never -- but thread pool, sometimes). In most other applications, a thread pool is just adding complexity (especially if you write it yourself).

      How long do you think it really takes to allocate a new Thread (system call, memory and all)? It's less than a millisecond; on my old laptop it's about 1/5th of a millisecond. Try it out. Even if your app creates a new Thread every 10 seconds, you're probably wasting your time with a thread pool, and you should be optimizing your SQL instead.
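
      For anyone who wants to try it, here is a small timing sketch. It measures Python threads, not Java ones, so treat it as a template for the experiment rather than a confirmation of the figure above; `time_thread_creation` is a made-up helper name and the result depends heavily on the OS and hardware.

```python
import threading
import time

def time_thread_creation(count=200):
    """Average seconds to create, start and join one do-nothing thread."""
    start = time.perf_counter()
    for _ in range(count):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.perf_counter() - start) / count

per_thread = time_thread_creation()
print(f"about {per_thread * 1000:.3f} ms per thread")
```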

      --
      There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
    8. Re:The GC pitfall by Salamander · · Score: 1
      Sure you need a thread pool. Creating a new thread, does not only allocate a thread object, but it likely makes a kernel call to get a new thread handle. You pool to avoid the kernel call. Not to avoid the memory allocation/deallocation.
      If your program uses lots of short-lived threads there's something fundamentally wrong with it that thread pools won't fix. Threads are often used as a crutch, where an event-based model running on one or more long-lived threads (not tied to specific operations) would perform much better. I've written about this extensively on my own website, most notably in my server-design guide.
      --
      Slashdot - News for Herds. Stuff that Splatters.
    9. Re:The GC pitfall by angel'o'sphere · · Score: 1

      Sorry,

      I was slightly offtopic, mixing two posts contents :D

      You did not claim that nulling references is important. But for static references you suddenly shift levels.

      Of course, an object only gets collected when it is no longer reachable. If any reference, static or not, remains reachable from the application's root objects, that object is not collected.

      So, you likely want to say that having static references might be a cause for program errors/memory leaks?


      but there's usually no reason to null a local variable for this reason, because it's going out of scope in a few millisecs anyway.


      Yes, indeed, but it is also not necessary to null away attribute references of other objects. Some people seem to think in a GC environment you have to manually null every reference to an object .... or it won't get collected soon enough.

      angel'o'sphere

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    10. Re:The GC pitfall by jtheory · · Score: 1

      Ah, now I see where you're coming from.

      Of course, an object only gets collected when it is no longer reachable. If any reference, static or not, remains reachable from the application's root objects, that object is not collected.

      Right. The tricky aspect of static references that I'm trying to point out is that they're *always* reachable by those root objects, so a static reference must be explicitly nulled to free the object it refers to. This is different from normal references in object instances, which are collected along with the instance. For example -- java.awt.Color has static members white, black, etc. - each of the static references refers to an instance of the class Color. Once java.awt.Color is referenced anywhere, all of those objects will be created, and they won't be destroyed until the JVM shuts down; it doesn't matter if the instance of Color that you were using is garbage collected. See what I'm talking about? You can reference Color.white from anywhere, so it can't be collected.

      So, you likely want to say that having static references might be a cause for program errors/memory leaks?

      It can be, if the coder doesn't understand what the static modifier does. I've seen code where the member variables of a class were declared static for no reason I could see, since they were accessed as if they were non-static. This caused strange errors occasionally -- when two separate instances of the class found themselves modifying the same member object (because it was static, there was only one shared by all instances). If this error were widespread, and lots of objects had these mistaken static references, then memory usage could be dramatically increased because even though the instances were GC'd, their members never could be (because they were static).
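
      The "two instances modifying the same member" symptom is easy to reproduce. Here is a Python sketch of the analogous mistake, with a class-level list used where a per-instance list was intended; `Counter`, `hits` and `own_hits` are hypothetical names.

```python
class Counter:
    hits = []  # class-level: ONE list shared by every instance (like a static member)

    def __init__(self):
        self.own_hits = []  # instance-level: one list per object

    def record(self, value):
        self.hits.append(value)      # mutates the shared list
        self.own_hits.append(value)  # mutates only this object's list

a = Counter()
b = Counter()
a.record("from a")
b.record("from b")

# a.hits and b.hits are the same object, so each instance sees the other's data.
shared = a.hits is b.hits
```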

      This isn't a common problem, I don't think. It just came to mind as one of the many ways your code choices will affect what the VM can do with memory.

      Yes, indeed, but it is also not necessary to null away attribute references of other objects. Some people seem to think in a GC environment you have to manually null every reference to an object .... or it won't get collected soon enough.

      Correct. Assuming all non-static references, once an object is no longer accessible, that means its members aren't accessible either, and the whole thing is eligible for collection.

      --
      There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
  18. Education? by Arngautr · · Score: 1

    So if we start using GC for everything will this yield programmers who produce sloppy code and don't know about memory management? If so, will it matter?

    Admittedly I haven't used a language with native GC for a few years, but I don't have fond memories of it.

    For small programs, i.e. less than 2000 or so LOC, I would prefer no garbage collection, so compiler directives would be nice. Some object property that allows me to choose whether to place the object into the list of objects that may need GC would also be nice.

    1. Re:Education? by adamofgreyskull · · Score: 2, Insightful

      There are Godlike Real Programmers and there are sloppy programmers. I don't think that GC will automatically make everyone a sloppy programmer. If it stops some of the sloppy programmers from creating applications with huge memory leaks, isn't that a good thing?

    2. Re:Education? by renoX · · Score: 1

      Unfortunately you can still have memory leaks with a GC: what if you are sloppy and never put a 'ref = 0;' in your program to drop a reference you no longer need?

      You have a memory leak.

    3. Re:Education? by BCoates · · Score: 1

      I recently used a PHP library that keeps track of objects by appending them to a (global) array, and passing the array index around to the various functions that need to access the data. This array is never emptied or deleted from until the end of the PHP session, so it basically creates its own GC-defeating memory space, with a giant memory leak.
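
      A sketch of that pattern in Python rather than PHP, with made-up names (`registry`, `register`, `Handle`). The second half shows one possible mitigation, a `weakref.WeakValueDictionary`, which is my assumption about what could help and not something the library in question did.

```python
import gc
import weakref

class Handle:
    """Hypothetical object that a library wants to look up by index or key."""

# Leaky pattern: a module-level list that is only ever appended to.
registry = []

def register(obj):
    registry.append(obj)
    return len(registry) - 1  # the index that gets passed around

# One possible alternative: hold entries weakly, so the table alone
# does not keep them alive.
weak_registry = weakref.WeakValueDictionary()

def register_weak(key, obj):
    weak_registry[key] = obj
    return key

h1, h2 = Handle(), Handle()
idx = register(h1)
key = register_weak("h2", h2)
del h1, h2
gc.collect()

leaked = isinstance(registry[idx], Handle)  # still reachable through the list
reclaimed = key not in weak_registry        # entry vanished with its last strong reference
```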

  19. Re:Sigh. It's not a "feature" of other languages.. by devphil · · Score: 1


    I knew some kid was going to start bitching and moaning about the memory leak comment. I'm not saying they're not important. I'm saying that one has nothing to do with the other.

    I'm trying to imagine a programming language that doesn't let you create functions on the fly but is powerful enough for writing real applications.

    C, C++, Java. None of these support closures or lambdas. C++'s Boost makes a good try, but none of them allow me to construct new functions using nothing more than standard language features.

    --
    You cannot apply a technological solution to a sociological problem. (Edwards' Law)
  20. Re:Sigh. It's not a "feature" of other languages.. by Anonymous Coward · · Score: 0


    This is insightful?

    If you have something to say, pull your head out of your ass and say it.

  21. Reference counting by Antity-H · · Score: 3, Informative

    It was mentioned earlier that reference counting was pretty good, but had a few drawbacks when it came to cycles and multi-threading.

    I took a bit of time to go and read Wikipedia's page

    In the description they give, they mention that reference counting GC can represent managed objects by directed graphs.
    I know there exist algorithms to find cycles in such graphs, so I suppose these could be applied to this problem. Other proposals are to use a tracing GC to detect them. To which it was replied that this would be able to reclaim the memory but not to properly finalize the objects. I don't see why that would be true. I mean, if you have found a member of the cycle to be collected, can't you just finalize that one and let the whole cycle unravel itself? If there are cycles inside that cycle, just do it again on these, etc.
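
    CPython happens to work this way (reference counts plus a tracing pass for cycles), so the idea can be sketched there. This assumes CPython 3.4 or later, where objects in a cycle can still have their `__del__` finalizers run; `Node`, `probe` and `finalized` are illustrative names.

```python
import gc
import weakref

finalized = []

class Node:
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)  # stands in for "properly finalizing" the object

gc.disable()  # keep the tracing collector from running on its own during the demo
try:
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a  # a <-> b: a two-object cycle
    probe = weakref.ref(a)
    del a, b                 # counts never reach zero: each node still refers to the other
    alive_with_refcounts_only = probe() is not None

    gc.collect()             # the tracing pass finds the unreachable cycle
    alive_after_tracing = probe() is not None
finally:
    gc.enable()
```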

    As I said, another common objection was the cost of updating the counters in multithreaded environments. Multiple solutions have been proposed, some more portable than others (using processor/platform-specific atomic increments, or deferring the update until it is really necessary and using the standard mutex protection).

    All this said, I try to understand a couple of things.
    -I am no genius, thus these ideas must not be new, what is the problem which can't be solved with these?
    -Reference counting seems to integrate better in the runtime of the program. All the other techniques proposed seem to imply some monolithic operation on the memory, summing up all the overheads at one time and doing the cleaning once in a while, with the possibility of becoming a bottleneck in heavily loaded systems. Reference counting OTOH seems to allow the cleanup to continually add a little bit of overhead to the system but nothing which will bring the whole thing to a grinding halt before allowing it to go on. What have I missed?

    1. Re:Reference counting by hak1du · · Score: 1

      It was mentioned earlier that reference counting was pretty good, but had a few drawbacks when it came to cycles and multi-threading.

      Reference counting is not a "pretty good" technique: it is very hard to implement well, and the implementations everybody uses are inefficient and don't even handle the general case.

      In the description they give, they mention that reference counting GC can represent managed objects by directed graphs.
      I know there exist algorithms to find cycles in such graphs, so I suppose these could be applied to this problem. Other proposals are to use a tracing GC to detect them.


      The two solutions amount to roughly the same thing.

      Reference counting OTOH seems to allow the cleanup to continually add a little bit of overhead to the system but nothing which will bring the whole thing to a grinding halt before allowing it to go on. What have I missed?

      Normal reference counting does not at all release things "continually": when you lose the last reference to a huge, complex data structure, you may end up doing millions of deallocations at once. To cope with that, you need to do a lot more work. And when you are done, you end up with an implementation that's complicated and still doesn't solve the general case (cycles), so you need a garbage collector anyway.
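
      A small CPython sketch of that cascade: one `del` on the only reference to a container frees every object inside it on the spot. The `Item` class and the `counts` dictionary are made up for the demonstration, and the length of the pause is not something I would quote a number for.

```python
import time

counts = {"freed": 0}

class Item:
    def __del__(self):
        counts["freed"] += 1  # count each deallocation

# One container holding the only references to many objects.
structure = [Item() for _ in range(100_000)]
freed_before = counts["freed"]

start = time.perf_counter()
del structure  # dropping this single reference frees all 100,000 items right here
pause = time.perf_counter() - start
freed_after = counts["freed"]
```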

      The simplest available algorithms to do gradual cleanup of memory are, in fact, certain kinds of garbage collection algorithms.

  22. Quicksort (OT?) by hummassa · · Score: 1

    The reason quicksort is great is that its innermost loop is very tight: two incs, a cmp and maybe a data swap. All other sort algorithms have more complicated inner loops. So, even if the complexity is O(n log n) in the average case and O(n^2) in the worst case, quicksort wins because its constant k in k*(n log n) is smaller than, e.g., mergesort's j in j*(n log n).
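
    To make the "tight inner loop" concrete, here is a plain in-place quicksort sketch in Python, using a two-pointer partition in the style of Hoare's scheme with a middle-element pivot. It is one common textbook variant, not a tuned implementation.

```python
def quicksort(a, lo=0, hi=None):
    """In-place quicksort with a two-pointer (Hoare-style) partition."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:
        # The tight inner loops: an increment (or decrement) and a compare.
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]  # the occasional swap
            i += 1
            j -= 1
    quicksort(a, lo, j)
    quicksort(a, i, hi)

data = [5, 2, 9, 1, 5, 6, 0, 3]
quicksort(data)
```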

    --
    It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
    1. Re:Quicksort (OT?) by Anonymous Coward · · Score: 0
      > reason quicksort is great is that its innermost loop is very tight

      The average case of the qsort you mention is only 5-20% faster than a good heapsort or mergesort. However, there is a faster version of qsort which uses iteration instead of recursion and has a two-sided pivot (hint: it's quite a bit more complicated). It runs around 40% faster than a performance tuned heapsort or mergesort.

      The main advantage of qsort is the fact that it doesn't move its elements very much (average case). Thus a good qsort implementation can "short circuit" when the list is already sorted, but heapsort/mergesort still have to move all of the elements (twice).

  23. WOW by Anonymous Coward · · Score: 0

    Wish I had mod points for you C++ boys. It's really lame how slashdot is populated with Java-advocating idiots. I mean, I'm no Java lover, but I gotta feel sorry for the language when its proponents are retards...

  24. Finalization by Latent+Heat · · Score: 1

    GC has the problem of non-deterministic finalization. With reference counting, every time you give up a reference to an object (decrement the reference count), you check whether the count went to zero; if it did, you can not only release the object but also invoke its destructor to close file handles and stuff like that.
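
    CPython's reference counting shows this behaviour, so a short sketch is possible there. This is specific to CPython (other Python implementations may delay `__del__`), and `Resource` and `events` are hypothetical names.

```python
events = []

class Resource:
    """Hypothetical wrapper around something that must be released, like a file handle."""
    def __init__(self, name):
        self.name = name

    def __del__(self):
        events.append(f"closed {self.name}")  # destructor runs when the count hits zero

r1 = Resource("log")
r2 = r1                      # second reference: count is now 2
del r1                       # count drops to 1; nothing happens
closed_early = list(events)
del r2                       # count drops to 0; __del__ runs immediately
closed_late = list(events)
```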

  25. Relevant references by hding · · Score: 2, Informative

    A couple of relevant references for garbage collection are the following website (which unfortunately hasn't been updated for a while - still, it's useful):

    The Memory Management Reference

    and of course Jones and Lins's book, Garbage Collection: Algorithms for Automatic Dynamic Memory Management

  26. Re:Sigh. It's not a "feature" of other languages.. by jonadab · · Score: 2, Insightful

    > > I'm trying to imagine a programming language that doesn't let you
    > > create functions on the fly but is powerful enough for writing real
    > > applications.
    >
    > C, C++, Java.

    [Scratches Java off list of languages to learn.]

    I know C and C++ have been traditionally used for writing applications, but I
    have long been of the opinion that they're not really powerful enough for the
    job. It takes several times as many programmer-hours as it ought to to do
    anything, from prototyping to new feature work to debugging, which IMO means
    that "powerful enough" is a real stretch. These languages get by and continue
    to be used at this point mostly because a lot of people know them.

    In the past, these languages were selected because programmer time was cheaper
    than computer resources (with which they're more miserly than a higher-level
    language), but that's no longer anywhere near true, as the article points out;
    the *average* computer has enough RAM to run three horribly-inefficient
    extreme memory-hog applications at the *same time* without needing any swap,
    and newer models are coming with more and more. You talk about GC screwing
    up virtual RAM algorithms, but it's really not an issue on most systems; if
    a process grows to three or four *times* the size it needs to be, it doesn't
    actually have any user-noticeable impact on performance. Memory leaks are
    actually much worse, because in that case the wasted memory doesn't ever get
    collected and eventually it becomes a problem, after a couple of hours of
    use. (Actually, a very small memory leak can go for days without being a
    problem, but those aren't the ones we notice so much.) In 1996, when most
    consumer-grade operating systems were so stable that you had to reboot every
    few hours, memory leaks weren't such a big deal (provided you had lots of
    swap space), but now that almost any modern OS (and most applications) can
    run for weeks and weeks if not months or even years without being restarted,
    memory leaks are now a big deal. It's okay to continually use five times as
    much RAM as you technically need; it's not okay for your memory requirements
    to keep growing as a function of how long you've been running, because that
    can get to be *way* more than five times what you need.

    Back to creating functions on the fly, I'm just a little bit surprised to
    learn that Java doesn't have such an important feature; I had been led to
    believe it was a relatively high-level language with fairly high-level
    features. It runs on a virtual machine, for crying out loud; I had imagined
    it would be fairly modern and flexible in its design. Are you sure it can't
    create functions on the fly, or is that just something you don't know how to
    do in Java? That's a pretty serious accusation to level at a language,
    almost as bad as saying it can't allocate extra memory on the fly.

    --
    Cut that out, or I will ship you to Norilsk in a box.
  27. Re:Sigh. It's not a "feature" of other languages.. by myzz · · Score: 1
    "> And in functional programming, you're creating functions on the fly.

    I'm trying to imagine a programming language that doesn't let you create functions on the fly but is powerful enough for writing real applications."

    In most functional languages you can write something like this:

    (* This function takes a function f as argument,
    applies it to number 2 and prints out the result *)
    let print_f_2 f = print_int (f 2);;

    (* This function sums two integers *)
    let plus a b = a + b;;

    (* integer x is read from stdin *)
    let x = int_of_string (input_line stdin) in

    (* function plusx is created, which adds value of x to its argument *)
    let plusx = plus x in

    (* plusx is given as argument to print_f_2,
    which in turn prints x + 2 *)
    print_f_2 plusx;;
    In this OCaml code, plusx is created "on the fly", and it is a different function depending on the value of x that is read at runtime. How do you do this in C?
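
    For comparison, roughly the same thing can be written in Python with `functools.partial`. This is only a sketch: `make_plusx` is a helper name I made up, and a fixed value stands in for the number read from stdin so the snippet runs unattended.

```python
from functools import partial

def print_f_2(f):
    """Apply f to 2, print the result, and return it."""
    result = f(2)
    print(result)
    return result

def plus(a, b):
    return a + b

def make_plusx(x):
    # A new function is built at run time; which function depends on x.
    return partial(plus, x)

x = 40  # stand-in for a value read from stdin at run time
plusx = make_plusx(x)
answer = print_f_2(plusx)
```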
  28. Re:Sigh. It's not a "feature" of other languages.. by Urkki · · Score: 1
    [memory leaks]
    • What a thing to leave out. Memory leaks are one of the hardest-to-track-down
      and most annoying kinds of bugs that we perpetually see in application after
      application.

    Well, there are plenty of applications that never need to release any memory (and, for optimal performance, never should); they only shuffle it around, and they have well-defined points for releasing other resources (such as closing files and sockets) that can't be left to be done in the background. Such applications have no fear of memory leaks (well, mostly anyway; you can always screw up with data structures and pointers).

    • I'm trying to imagine a programming language that doesn't let you create
      functions on the fly but is powerful enough for writing real applications.

    This I don't quite understand. Any compiled language by definition can't create functions on the fly; every function needs to be compiled before the program is run. So what do you mean by "creating functions on the fly", actually?
  29. Java doesn't have *a* garbage collector by blamanj · · Score: 3, Informative

    It has different collectors, which you can select according to the needs of your application. Currently there are two, the default collector (generational) and an incremental collector which is slower but less likely to pause.

    Also, the default collector is a 3-generation one, not 2, at least as of Java 1.4.1. More details here.

  30. Re:Sigh. It's not a "feature" of other languages.. by jonadab · · Score: 2, Interesting

    > Any compiled language by definition can't create functions on the fly

    This is flat-out false. There are various compiled languages (compiled as
    in compiled to native machine code, yes) that not only allow creating functions
    on the fly but actively encourage it. Common Lisp is just one example. Yes,
    garbage collection gets compiled in. (This is no weirder than compiling a
    memory-management library into a C program, and actually being standardized
    is an advantage.)

    Besides that, the whole compiled-versus-interpreted-languages argument is
    getting fairly blurry these days. It's no longer as simple as C and C++ on
    the one extreme, which take hours to compile and then run on systems that
    don't even have a compiler, and BASIC on the other extreme where you can stop
    the program while it's running, change some variables and maybe some lines of
    code, and set it running again (possibly at a different line) in-progress
    with the state intact. There are all kinds of in-between cases now, Perl
    and Java and Python and so on, which technically are both compiled and
    interpreted or neither or somewhere in-between. Java runs on a virtual
    machine, okay, and Perl6 will, but what do you do with Perl5 and others like
    it, which don't really run on a vm per se but have separate compile-time and
    run-time phases yet allow more code to be compiled later at run time (through
    eval and things like it), ... and then there's JIT compilation... and then
    you have compilers that take languages designed to compile to a virtual
    machine and instead compile them to native machine code for a specific
    platform...

    --
    Cut that out, or I will ship you to Norilsk in a box.
  31. those who claim it can't be done... by random_static · · Score: 1
    Any compiled language by definition can't create functions on the fly

    i wish people would learn LISP before claiming LISP is impossible.

    the trick is to have a runtime system that includes a parser and compiler for your language; it can then compile any newly created functions on the fly for you. it's not as grotty as it sounds - we've got several decades of experience with it already, most of the bugs got ironed out back fifteen years ago or so.

    of course, an alternative method is to not have your language be compiled, or be at most bytecode-compiled; interpreters and byte-code compilers are often a bit lighter-weight than a "full" native-language compiler, so don't burden you with quite as large a runtime library. whether the gain is worth it is a bit debatable, though.

    1. Re:those who claim it can't be done... by Urkki · · Score: 1
      • i wish people would learn LISP before claiming LISP is impossible.

      Point taken, I was inaccurate. By compiled language I meant basically non-interactive languages, i.e. languages that you can't make a meaningful interactive interpreter for (without changing their syntax at least a bit).

      Though if you really wanted, you could do runtime compilation even in C. Just output C code, compile it into a dynamic object, load it, and call it. Wrap all of that in a nice function and it's even simple to use; it could also compile any given piece of code only once and just re-execute it if exactly the same code is asked for again. Not practical, of course, but possible.
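
      The "generate source, compile, load, call, and cache" loop described above can be sketched in Python, where the compiler is already part of the runtime. This is only an illustration: compile_function and make_adder_source are hypothetical helper names, and exec() must never be given untrusted text.

```python
_compiled = {}  # (source text, function name) -> function object


def compile_function(source, name):
    """Compile `source`, which must define a function called `name`, and return it.

    Identical source is compiled only once; later requests reuse the cached
    function object.
    """
    key = (source, name)
    if key not in _compiled:
        namespace = {}
        exec(compile(source, "<generated>", "exec"), namespace)
        _compiled[key] = namespace[name]
    return _compiled[key]


def make_adder_source(n):
    """Generate source text for a function that adds the constant n."""
    return f"def add_n(v):\n    return v + {n!r}\n"


if __name__ == "__main__":
    add2 = compile_function(make_adder_source(2), "add_n")
    print(add2(40))
    # Asking again with the same text reuses the cached function.
    print(compile_function(make_adder_source(2), "add_n") is add2)
```

      The C version would replace exec() with a compiler invocation plus dlopen, but the caching logic would look much the same.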
  32. Re:Sigh. It's not a "feature" of other languages.. by tyrecius · · Score: 1

    Are you sure it can't create functions on the fly, or is that just something you don't know how to do in Java? That's a pretty serious accusation to level at a language, almost as bad as saying it can't allocate extra memory on the fly.

    Java can't create functions on the fly as LISP or Scheme can do. It does have runtime reflection and class loading. This means that classes (and therefore the methods in those classes) can be loaded at runtime. But it would be quite a bit of work to use this facility to generate new functions from inside a program.

    I'm more curious as to why you are so adamant that generating functions at runtime is such an important capability. Many people avoid runtime code generation because they find it harder to reason about. Could you give me some examples where you have used runtime function-generation to good effect?

    --
    char a[]="lbiitgt l e \n\n\0";main(){for(char*c=a; *(short*)c;c+=2){putchar(*(short*)c);}}
  33. Re:Sigh. It's not a "feature" of other languages.. by Anonymous Coward · · Score: 0
    >How do you do this in C ?
    Answer: You allocate memory.
    value_t
    eval (value_t x, int argc, value_t * argv)
    {
        switch (x.type)
        {
            case TYPE_CLOSURE:
                return (*x.closure.code) ((*x.closure.data), argc, argv);
            // ...
        };
    }

    value_t
    curry_add (data_t data, int argc, value_t * argv)
    {
        // do whatever asserts you need.
        return add (argv[0], data.argv[0]);
    }

    value_t
    makeadd (value_t x)
    {
        value_t tmp;
        tmp.type = TYPE_CLOSURE;
        tmp.closure.code = curry_add;
        tmp.closure.data = alloc_closure_data (1, &x); /* yes, you'll need a GC */
        return tmp;
    }

    void
    somewhere(/* ... */)
    {
        value_t two, add2, input, output;
        two.type = TYPE_INTEGER;
        two.intval = 2;
        add2 = makeadd(two);

        input = readvalue();
        output = eval( add2, 1, &input );
        writevalue( 1, &output );
    }
  34. Are you crazy? by Tom7 · · Score: 1

    The implementation of higher order functions as closures does not require garbage collection, if you are willing to leak memory. The same exact issue comes up whenever someone returns an allocated object from a function in Java or C. The creation of a closure is an allocation of an object (which may copy values from the current stack frame into it), but the stack frame still goes away when you leave it, as do the local variables.

    You seem to have a mistaken impression of the way functional languages are implemented.

    On top of that, I don't see why we'd even bother talking about an implementation where we leak memory indefinitely.

  35. Re:Sigh. It's not a "feature" of other languages.. by hak1du · · Score: 1

    Stack-based languages like the C family (including Java) don't need GC to operate correctly

    Correct memory management in the presence of heap-allocated mutable objects requires garbage collection, even in C.

    There simply is no way around it: you can allocate an arbitrary graph of objects, and what can be freed depends on what is reachable from the roots.
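
    To make "what can be freed depends on what is reachable from the roots" concrete, here is a minimal sketch of the marking half of a tracing collector. It runs on a toy graph of strings, not a real heap, and the function names are made up for illustration.

```python
def reachable(roots, edges):
    """Return the set of objects reachable from `roots`.

    `edges` maps each object name to the list of names it references.
    """
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(edges.get(obj, ()))
    return marked


def garbage(roots, edges):
    """Every allocated object (every key in `edges`) that no root can reach."""
    return set(edges) - reachable(roots, edges)


if __name__ == "__main__":
    heap = {
        "a": ["b"],
        "b": ["a", "c"],  # a and b form a cycle but are reachable
        "c": [],
        "d": ["e"],       # d and e reference each other ...
        "e": ["d"],       # ... but nothing live points at them
    }
    print(sorted(garbage(["a"], heap)))  # ['d', 'e']
```

    Note that d and e still reference each other, so no purely local rule (such as "free it when nobody points to it") identifies them as garbage; only the walk from the roots does.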

    And if garbage collection isn't built into the language, then the language has to be unsafe, like C is.

    So, if you want a safe language with heap allocation of mutable objects, you need garbage collection. Lexical closures have nothing to do with it.

    Which means that locally-declared variables have to keep existing after the creating function returns, even if the coder can't get to them anymore. And the only way to do that is to have the runtime system manage its own heap, which means a garbage collector.

    There is no such requirement: you can deallocate closures manually just like any other data structure. But once people get to that point, they generally realize that it's a bad idea.

  36. Re:Sigh. It's not a "feature" of other languages.. by Anonymous Coward · · Score: 0

    Hi. I'm not the grandparent poster, but I'd like to address your question about when runtime function generation works well.

    First of all, remember that a runtime generated function can be as simple as a string of bytecode -- then you just need an interpreter (either as a library or written in the language in question). More advanced languages can emit native machine code and copy it into executable memory pages, but technically you could even spawn a new process to invoke the compiler and then use dlopen (hint: you'd need a consistent way of defining symbol names, and you might need to invoke ld and/or nm)...

    Runtime generated functions are usually combined with lexical closures or blocks, and that's what most people actually mean when they refer to runtime generated functions. By allowing a 'child' function to access its parents' local variables, you can produce some really nifty effects. But is this really any different from OO programming? Barely. You can simulate it in C++ by allocating a new object with a "virtual obj operator()(obj* arg0, ...)" member function, then you can call the result like a function. Combine this with the runtime code generation method mentioned above (i.e. invoke g++ and use dlopen), and you're good to go.
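
    The closure-versus-object equivalence is easy to see side by side. The sketch below uses Python rather than C++ (so `__call__` plays the role of `operator()`); Adder and make_adder are illustrative names only.

```python
class Adder:
    """Function object: the captured variable lives in an explicit field."""

    def __init__(self, x):
        self.x = x

    def __call__(self, y):
        return self.x + y


def make_adder(x):
    """Closure: the language captures x without any explicit field."""
    def add(y):
        return x + y
    return add


if __name__ == "__main__":
    by_object = Adder(3)
    by_closure = make_adder(3)
    print(by_object(4), by_closure(4))  # 7 7
```

    Both callables carry their own copy of x; the only difference is who writes the bookkeeping.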

    *cough* ok that's a lot of work, and I pity whoever is tasked with maintaining such a system. ;) It's a lot easier to let the language decide which variables are needed for closures and silently do the necessary memory allocations in the background.

    Now to answer your original question: when are runtime generated functions useful? Answer: Any application where the user (or administrator) is allowed to write code that affects the running program: Word processors (emacs uses a lisp dialect), spreadsheets, database access programs, CAD programs (autocad uses/used lisp), MUDs/MMORPGs, etc. (In a strict technical/hair-splitting sense, this also includes dynamically loadable kernel modules, and if you want to split even more hairs, it also includes every program you've ever used from the shell/gui.)

  37. *sigh* by Estanislao Martínez · · Score: 2, Informative
    In this OCaml code, plusx is created "on the fly", and it is a different function depending on the value of x that is read at runtime.

    I'm ready to believe you're simplifying this deliberately to illustrate functional programming techniques, but I think the simplification here is confusing.

    It's important to keep in mind the difference between code routines and closures. The term "function" as commonly used doesn't respect this difference. C's "functions" are code routines, while OCaml's are closures, i.e. a pair of a routine and an invocation frame.

    What's being "created on the fly" are closures, which are like stack frames (storage for local values of identifiers in an invocation), but which:

    1. are allocated in the heap,
    2. have a pointer to a "parent" frame (the bindings in the enclosing environment), and
    3. have unlimited extent (since the invocation of a closure A might return a closure B whose "parent" is A, requiring that A be kept around indefinitely after the call to it returns).
    I think the point the original poster is making can be expressed in another way, but one that's more revealing of what's at stake: stack allocation is a form of automatic memory management.
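
    The three properties listed above can be modelled explicitly. The following is a toy sketch, not how any real implementation lays out memory: Frame, Closure, and make_offset are hypothetical names, and Python's own closures are deliberately avoided for the captured value so the parent pointer is visible.

```python
class Frame:
    """A heap-allocated set of bindings plus a pointer to the enclosing frame."""

    def __init__(self, bindings, parent=None):
        self.bindings = dict(bindings)
        self.parent = parent

    def lookup(self, name):
        """Search this frame, then each parent in turn."""
        frame = self
        while frame is not None:
            if name in frame.bindings:
                return frame.bindings[name]
            frame = frame.parent
        raise NameError(name)


class Closure:
    """A pair of a code routine and the frame it was created in."""

    def __init__(self, code, env):
        self.code = code
        self.env = env

    def call(self, **args):
        # Each invocation gets a fresh frame whose parent is the captured one.
        return self.code(Frame(args, parent=self.env))


def make_offset(start):
    """Invocation 'A': returns closure 'B', which keeps A's frame alive."""
    frame_a = Frame({"start": start})
    return Closure(lambda f: f.lookup("start") + f.lookup("step"), frame_a)


if __name__ == "__main__":
    b = make_offset(10)     # A has returned, yet its frame survives
    print(b.call(step=5))   # 15
```

    frame_a cannot be popped when make_offset returns, because b still points at it. That is the "unlimited extent" in item 3, and it is why such frames cannot live on an ordinary stack.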

    In any modern language, there is some form of automatic storage management behind the scenes for function-local storage. Imagine if in C, you had to manually allocate the stack frame of any function you called, and every function had to deallocate its frame before returning. This would be tedious and repetitive. Automatic management of a stack of limited-extent frames provides the programmer a simple (but restricted) way of doing this.

    It would be possible to have a functional language where the storage for closures was managed by hand. Imagine a language like C except that it allowed you, when you called a function, to specify a heap-allocated binding to be used in the invocation, instead of a stack frame.

    This would be similar to the hypothetical C variant from above, where the programmer was responsible for creating and destroying stack frames. But much harder, since closures have unlimited extent. In the stack allocation case, it's clear when the allocations and deallocations need to happen (before a function is called, and before one returns). In the manually allocated closure language, the programmer would have to figure out on his own the extent of every closure, and when and where it's safe to free them. This is not simply tedious like in stack allocation, but rather devilishly complex in general.

    So, garbage collection solves it.

  38. Re:Sigh. It's not a "feature" of other languages.. by renoX · · Score: 1

    > Memory leaks are actually much worse

    GC doesn't protect against memory leaks, so I fail to see how it is relevant to the GC vs no-GC discussion.

    Also, considering that nearly all the languages currently in use do not allow you to create functions at runtime, you could have a serious problem finding a job if you refuse to learn them.

  39. Why not use a GC Processor and Memory by razmaspaz · · Score: 1

    Why not employ a processor with its own ram, separate from the main system to do nothing but GC? Expensive? Not really. Consider a machine with 8 processors and 8GB of ram for an app server. What's 1 more of each to the cost? This solves the paging problems as memory can be paged into the separate memory space. It does not eat up processor because it is a dedicated task. It only hurts disk performance when the disk has to be hit for a page fault. A dedicated "device" (term used loosely here) would create an almost impact free way to manage memory. Would this work? I'm still thinking it out myself.

    --
    I tried for 5 years to come up with a clever sig...only to realize that I am not clever.
  40. Good example by taigu · · Score: 1

    I guess that is a good example of what I am talking about. Gcc does not make anyone any money, directly. And it is not an especially great compiler either. Gcc is a wonderful commodity. Try benchmarking low level code with the (also recently free beer) MS compiler (under W2000) against gcc (under W2000 or Linux) and you will find out what I am talking about.

    I did not intend the comment as a troll, though I suppose it was overly terse. People are constantly worried about outsourcing and the devaluation of engineering jobs. I'm not saying easier jobs have no value or are not worth doing, just that harder problems have higher relative value.

    Basically, I think everything will become a commodity except what is still hard. And the only way to attack what is still hard is at the lower levels, because the main problem with hard problems is consistently time. The "extreme situations" are the ones we should be attacking; in the beginning all we worked on were extreme situations, and I and other software engineers were far more professional and respected than we are today. By trying to abstract and simplify so much we have denigrated the value of the computer profession as a whole, usually for the benefit of the proponents of the specific "solution" or abstraction. It's great that it is so easy to put together an online transaction system these days, but whatever happened to natural language recognition?

    Also, there is a big difference between using task specific garbage collection in the context of a proprietary data structure and trying to develop a general abstract collector. It is even worse to disallow programmatic memory management like Java does.

  41. How can I explain this? by taigu · · Score: 1

    I should try being more eloquent. I was obviously in a bad mood when I wrote that comment.

    To an extent, it is like being a cabinet maker. There are different kinds of people who make cabinets. Some are in love with wood and form. Some build cabinets to make money. Some people are actually more interested in the tools than the piece they are building.

    It is an axiom of the artistic cabinetry world that the best work is done with the fewest and simplest tools. A band saw, a couple hand saws, some chisels, a couple of hand planes, a bit and brace. That's pretty much it. The tools force you to work directly with the thing that matters -- the wood and the construction. With these tools you have complete expressiveness in the material and complete control. You keep track of exactly what the grain of the wood is doing, and how the joints are holding up.

    Many of the people who work as custom cabinet makers make a lot of money. Their pieces are worth hundreds of thousands of dollars. Other people work as laborers in furniture making factories. Of course here the managers apply the "different tools for different jobs" notion, though the employees probably don't care that much -- they get paid by the hour. It is the carefully chosen "different tool" that is most likely to take their fingers off anyway.

    One of the programmers I have the most respect for, the one who wrote my favorite programming books, is Donald Knuth. His books contain some of the most advanced constructs I have seen in book form, including a lengthy discussion on garbage collection. But he chose to present his ideas using an assembly language -- MIX. Garbage collection is not the tool, it is a product of the tool. I think the reason I responded so strongly to the assertions of the parent article is because it is like (for instance Java is like) Large Woodworking Corp telling me I cannot use hand saws and chisels anymore. LWC says I have to buy their premolded modular furniture components and join them together with LWC fasteners. Which was not really the intent of the article. The article was just talking about improvements in GC techniques. But the statement about GC being a panacea was absurd -- unless you are working on a nearly trivial problem.

    There is a disease amongst Computer Scientists that makes us get lost in the "tools". It is as if our job is to tell other people how to solve problems, instead of pursuing the solution of real problems ourselves. What is the "wood" of computer science? I think the substance is the problem, especially the hard problem. How are we doing with computer vision, with natural language, with common sense and reasoning? Not too good. A couple of decades ago it got hard, despite our initial optimism. So everyone gave up and started selling "tools" instead of promising solutions. I'm sure that, if anything, all these tools just get in the way.

  42. There are more than 2! by Anonymous Coward · · Score: 0

    Actually the number has increased for the SERVER VM (YES, THERE IS A DIFFERENCE BETWEEN CLIENT AND SERVER VM FOR THOSE OF YOU DOING BENCHMARKS!)
    It's now more like 4 or 5 garbage collectors to choose from rather than 2.

  43. Parallel GC by Anonymous Coward · · Score: 0

    Java provides a parallel garbage collector for the Server VM as an option. Not the same, but it takes advantage of multi-processing environments.

  44. Re:Sigh. It's not a "feature" of other languages.. by Anonymous Coward · · Score: 0

    Now to answer your original question: when are runtime generated functions useful? Answer: Any application where the user (or administrator) is allowed to write code that affects the running program: Word processors (emacs uses a lisp dialect), spreadsheets, database access programs, CAD programs (autocad uses/used lisp), MUDs/MMORPGs, etc.

    I can't speak for DB access programs or CAD programs, but I've played around with the scripting features of office suites and MUDs enough to state that what these programs implement is probably not what most programmers think of when they hear the term "runtime function generation". What you're describing is more like parsing. MUDs, for example, might be written in C, but then invent their own specialized scripting language that simply is an API for calling the C functions that were written by (human) programmers.

    If you accept "parsing" as a form of "runtime function generation", you could stretch the analogy so that you could consider a file deletion program to do runtime function generation. The program asks you if you're sure you want to delete a file. Based on the user input, it either generates the "deleteFile" function (if it receives 'y'), or the "doNothing" function (if it receives 'n').

    When I hear "runtime function generation", I picture perhaps an operating system that detects security holes and patches itself without the need for a human developer to tell it what the security flaw was in the first place (of course, merely connecting to the internet to download patches off a server is cheating.) Or perhaps an OS for which you'd use its paint application to draw rough sketches of screenshots, and the OS figures out what you want to do from those screenshots. If it already has that feature, great, it runs it for you. Otherwise, it writes a new program that does it for you. Perhaps I could describe to the computer that I want a game like Diablo II, except set in the future, and it'd generate the program for me.

  45. Really? by Anonymous Brave Guy · · Score: 1
    The world's busiest e-commerce websites are largely written in Java.

    Actually, the last real data I saw (but note that this was some time ago) was that 7 of the 10 most visited web sites in the world ran on C++ back-ends. Sorry, no link off the top of my head, but IIRC there was some discussion of that statistic in these parts, so a search will probably turn it up. Do you have any more up-to-date information?

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    1. Re:Really? by sfjoe · · Score: 1

      Do you have any more up-to-date information?

      Too lazy to look. I know Ebay, for one, is wrapping up their transition from C++ to Java.

      --
      It's simple: I demand prosecution for torture.
  46. C# has this by Anonymous Coward · · Score: 0
    In C# the "using" keyword is overloaded for this:
    using (FileStream f = File.OpenRead(path))
    {
        // do everything but close file here...
    }
    Thus, the file will automatically be closed as soon as the using block is exited, no matter how. This is quite handy for things like GUI objects.
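
    Other garbage-collected languages offer the same scoped-cleanup pattern. As a minimal sketch, here is the Python equivalent using a `with` block; `opened` is a made-up stand-in for a file or GUI handle, and the `events` list exists only to make the order of operations visible.

```python
from contextlib import contextmanager

events = []  # records what happened, in order


@contextmanager
def opened(name):
    """Stand-in for acquiring and releasing a file or GUI handle."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs on normal exit and also when an exception escapes the block.
        events.append(f"close {name}")


def use_resource(fail):
    """Use the resource; optionally raise in the middle of the block."""
    with opened("report.txt") as handle:
        events.append(f"use {handle}")
        if fail:
            raise ValueError("something went wrong")


if __name__ == "__main__":
    use_resource(fail=False)
    try:
        use_resource(fail=True)
    except ValueError:
        pass
    print(events)
```

    The point in both languages is that release happens at a known place in the program text, independent of when the collector eventually reclaims the object's memory.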

  47. Re:Sigh. It's not a "feature" of other languages.. by oliveira · · Score: 1

    the *average* computer has enough RAM to run three horribly-inefficient extreme memory-hog applications at the *same time* without needing any swap

    You know my computer runs at least 50 apps at the moment, not 3 or 4. RAM is still not a resource conscious developers prefer to waste.

    You talk about GC screwing up virtual RAM algorithms, but it's really not an issue on most systems; if a process grows to three or four *times* the size it needs to be, it doesn't actually have any user-noticeable impact on performance.

    Depends. It's not noticeable only if that memory is not actually accessed, i.e. no frequent cache misses and, God forbid, swaps.

    Memory leaks are actually much worse, because in that case the wasted memory doesn't ever get collected and eventually it becomes a problem, after a couple of hours of use.

    I'm sick of the "memory leak" argument. Are we comparing well-written apps here or sloppily written apps? If GC allows you to afford sloppiness in software development and still have a "decently" performing app, then so much it's worth.

  48. Re:An Obvious Fault [in your post] by voodoo1man · · Score: 2, Insightful

    I think you mean "mark and sweep" collectors. "Stop and copy" collectors just trace the working set from whatever your heap root is. Add in the copy step, and you only touch twice the size of your working set. If your collector is well-written and the OS provides the hooks, it will ask for the new space to be allocated in core, and the old space to be discarded, wherever it is.
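
    A toy model of the copy step shows why only live data gets touched. This is a sketch in the style of Cheney's algorithm under heavy simplifying assumptions: addresses are plain integers, objects contain nothing but references, and copy_collect is a name invented here.

```python
def copy_collect(from_space, roots):
    """Copy everything reachable from `roots` into a fresh to-space.

    `from_space` maps an address to the list of addresses it references.
    Returns (to_space, new_roots, forwarding).
    """
    to_space = {}
    forwarding = {}  # old address -> new address

    def forward(old):
        """Copy `old` on first visit; always return its new address."""
        if old not in forwarding:
            new = len(to_space)  # next free slot, so to-space stays compact
            forwarding[old] = new
            to_space[new] = list(from_space[old])  # fields still hold old addresses
        return forwarding[old]

    new_roots = [forward(r) for r in roots]

    scan = 0
    while scan < len(to_space):  # scan pointer chases the allocation pointer
        to_space[scan] = [forward(ref) for ref in to_space[scan]]
        scan += 1
    return to_space, new_roots, forwarding


if __name__ == "__main__":
    old_heap = {100: [200], 200: [100, 300], 300: [], 400: [500], 500: [400]}
    new_heap, new_roots, moved = copy_collect(old_heap, [100])
    print(new_heap)
    print(sorted(moved))  # only the live addresses were ever visited
```

    Objects 400 and 500 are never read at all: each live object is read once in from-space and written once in to-space, which is where the "twice the working set" figure comes from.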

    --

    In the great CONS chain of life, you can either be the CAR or be in the CDR.

  49. that's easy by Estanislao Martínez · · Score: 1
    Unfortunately you can have memory leaks with a GC: what if you are sloppy and don't put some 'ref = 0;' in your program?

    When ref goes out of scope, the object it references becomes available for collection. It's that simple.

    In fact, your misconceived example negates the whole point of GC-- your ref = 0, in terms of programmer logic, amounts to free(ref), which is exactly what you don't have to do if you have GC!

    1. Re:that's easy by renoX · · Score: 1

      >When ref goes out of scope, the object it references becomes available for collection. It's that simple.

      Except that it may take *much longer* to have ref going out of scope after you don't have the need of the object it references.

      During this time, your program is using more memory than it needs.. This time may be the whole duration of the program if your reference is referenced by a global variable --> memory leak.

      >In fact, your misconceived example negates the whole point of GC-- your ref = 0, in terms of programmer logic, amounts to free(ref), which is exactly what you don't have to do if you have GC!

      Not really. The difference between free(ref) and ref=0 is that if another part of the program has another reference to the same object, in the first case you can get a core dump when it tries to access the object, while in the second case it works correctly.

      I agree that if the other part of the program shares the same reference there is a problem, but it is much easier to have different references to the same object and to clear each reference as soon as it isn't needed anymore. This way it is possible to have simple reference management and still avoid consuming too much memory, which is unfortunately all too common in programs that use a GC sloppily.
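
      The "leak despite a GC" case is easy to reproduce. In this sketch (Python, with made-up names Payload, registry, and remember) a module-level list plays the global variable, and a weak reference is used only as a probe to see whether the object is still alive. The True/False results assume CPython; other runtimes may reclaim the object later.

```python
import gc
import weakref


class Payload:
    """Stand-in for a large object."""


registry = []  # long-lived, module-level container (the "global variable")


def remember():
    """Create an object, stash it in the registry, and return a weak probe."""
    obj = Payload()
    registry.append(obj)
    return weakref.ref(obj)  # a weak reference does not keep obj alive


def is_alive(probe):
    """Request a collection, then report whether the object still exists."""
    gc.collect()
    return probe() is not None


if __name__ == "__main__":
    probe = remember()
    print(is_alive(probe))  # True: the registry still references the object
    registry.clear()        # the 'ref = 0' step
    print(is_alive(probe))  # False on CPython once the last reference is gone
```

      Until registry.clear() runs, no collector can reclaim the object, no matter how good it is, because the object is still reachable.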

  50. God, that's a horrible design. by Estanislao Martínez · · Score: 1
    Global variables should be avoided vehemently. At any rate, I will venture the guess that whatever this library does, it should be implemented in terms of a stack, FIFO queue, priority queue, or some other such data structure that imposes a discipline on adding, processing and removing objects.

    If there is something bad programmers should be forbidden from authoring, libraries are it.

  51. Scope and good programming style by Estanislao Martínez · · Score: 1
    Except that it may take *much longer* to have ref going out of scope after you don't have the need of the object it references.

    In good programming style, a function should be short and simple: it should do just one thing, and return. Which means that any references in local scope should cease to exist quickly. If your function allocates objects A, B, C, doing something with them, then allocating D and E, and doing something else with those two, and then returning, you should rewrite that as two functions.

    During this time, your program is using more memory than it needs.. This time may be the whole duration of the program if your reference is referenced by a global variable --> memory leak.

    Good programming style also avoids the use of global variables. And if one does need such variables, one certainly doesn't stick ephemeral values in them.

    Essentially, the scope of a variable is a way of managing the lifetime of the objects it refers to. If you want an object to live forever, you reference it from a variable with a very wide scope. If you want it to live for a very short time, you confine it to a narrow scope. Given these reasonable programming practices, GC can identify objects as soon as they are available for collection.
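
    A small sketch of scope acting as lifetime management, in Python with invented names (Temp, short_task, live_count). Weak references serve only as probes; the counts assume CPython, where an object is reclaimed as soon as its last strong reference disappears.

```python
import gc
import weakref


class Temp:
    """Stand-in for a short-lived object."""


probes = []  # weak references only; they never keep a Temp alive


def short_task():
    """Allocate, use, and return; the only strong reference is the local t."""
    t = Temp()
    probes.append(weakref.ref(t))
    return "done"


def live_count():
    """Number of probed objects that still exist after a collection."""
    gc.collect()
    return sum(1 for p in probes if p() is not None)


if __name__ == "__main__":
    short_task()
    print(live_count())  # 0 on CPython: t died with its scope

    keeper = Temp()      # wide scope: lives as long as the module does
    probes.append(weakref.ref(keeper))
    print(live_count())  # 1
```

    The narrow-scope object needs no explicit cleanup, while the wide-scope one stays alive for as long as its variable does.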

    If you have unneeded objects that can't be collected, you're misscoping references.

    1. Re:Scope and good programming style by renoX · · Score: 1

      You make it sound like memory management is very easy. If it is so easy, why do we need GCs?

      >Good programming style also avoids the use of global variables.

      Well now, we don't use global variables, we use singleton classes; it's better, but it doesn't change the problem much.

      Also, you basically divide objects in two: objects that live only for the duration of a method call, and objects that live forever.
      The problem, of course, is the objects that need to live for more than a method call but don't need to live forever.

      That is where the problem arises: reference an object from an attribute, and don't clear the attribute even when the object will not be used anymore.
      Do this all the time, with a few singleton classes in the mix, and you have memory leaks.

      I believe the reason languages with GCs have such a bad reputation concerning memory usage is programmers who mismanage memory: "oh, it is the GC's problem".

      GCs are very helpful, but to be memory efficient one must still be careful to clear unneeded references to long-lived objects once we're sure we will not need them anymore in this part of the program.