C# Memory Leak Torpedoed Princeton's DARPA Chances
nil0lab writes "In a case of 20/20 hindsight, Princeton DARPA Grand Challenge team member Bryan Cattle
reflects on how their code failed to forget obstacles it had passed. It was written in Microsoft's C#, which isn't supposed to let you have memory leaks. 'We kept noticing that the computer would begin to bog down after extended periods of driving. This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles. The computer performance would just gradually slow down until the car just simply stopped responding, usually with the gas pedal down, and would just drive off into the bush until we pulled the plug. We looked through the code on paper, literally line by line, and just couldn't for the life of us imagine what the problem was.'"
The linked "article" is just a "sponsored review" for a C# profiler...
Just because a language is garbage collected doesn't mean you can't "leak" memory (in the more standard definition of "waste memory over time"), it only means you can't completely lose track of references to objects (which is often used as a more technical definition of "leak"). It is quite common for people coding in such languages to accidentally generate live object structures that are mostly made up of garbage that they should have released their references to. Put another way: these people's program was legitimately claiming memory and never releasing it due to their limited understanding of how event handlers work.
This is a programming error, plain and simple. From TFA:
Though we thought we had cleared all references to old entries in the list, because the objects were still registered as subscribers to an event, they were never getting deleted.
So references were held to the objects in two places - the list of encountered obstacles, and the list of event subscribers. They were being removed from the list of encountered obstacles, but not being unsubscribed from the event.
How do you think event subscription works? Something has to hold a reference to the objects that are subscribed to the event! That thing is going to hold a reference until you unsubscribe the object - it neither knows nor cares about any other list of references you may be maintaining separately, how could it?
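To make the point concrete, here is a minimal sketch of the same mistake. It is written in Python rather than C#, and the ObstacleDetector and Obstacle classes are made up for illustration; they are not the team's code. The subscriber list plays the role of the event's invocation list, and a weak reference is used only as a probe to see whether the object is still alive.

```python
import gc
import weakref


class ObstacleDetector:
    """Hypothetical event source: holds a strong reference to every subscriber."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def unsubscribe(self, handler):
        self._handlers.remove(handler)


class Obstacle:
    def __init__(self, detector):
        # The bound method self.on_scan carries a strong reference to self.
        detector.subscribe(self.on_scan)

    def on_scan(self):
        pass


detector = ObstacleDetector()
obstacles = [Obstacle(detector)]
probe = weakref.ref(obstacles[0])      # observes the object without keeping it alive

obstacles.clear()                      # "we cleared all references to old entries"
gc.collect()
still_alive_after_clear = probe() is not None   # the subscriber list still holds it

detector.unsubscribe(probe().on_scan)  # the step that was missing
gc.collect()
alive_after_unsubscribe = probe() is not None
```

Clearing the obstacle list alone leaves the object alive; only after the unsubscribe call does the probe report it gone.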
This is a coding error. A subtle, non-obvious one perhaps, but a bug nevertheless. It is not an error in the CLR, and in fact the article never paints it as such. That particular bit of spin is wholly down to the submitter.
It's official. Most of you are morons.
It would have to, because otherwise you could register an anonymous class or other similarly constructed object as an event handler, release all your other references, and the GC would collect it while it still needs to be able to receive messages. (That, or the GC would unregister the event handler, which would be utterly mysterious to debug if you didn't already know the intricacies of the GC, especially since you have no way of predicting when your event handler is collected).
This problem is actually less drastic in C#, since you have the ability to instantiate anonymous methods (delegates) instead of anonymous classes when doing event handling, and they have lower overhead. So even if you mess up your resource management, it's still much easier to control your resource usage because you have access to cheaper primitives to work with.
using namespace slashdot;
troll::post();
I've RTFA, it wasn't a memory leak caused by C#, it was caused by bad programming
We've had similar trouble trying to build big systems on an MS platform. Stubborn memory leaks that don't seem to have a common cause. It's happened in VB and C#. You can't blame them all on bad programming because we've had similar problems from completely different groups of programmers with varying skill levels. We build mainly web apps and this was a desktop app but their description was strangely familiar.
It's pretty callous to blame the programmer when they trust garbage collection to do its job. That was one of the big selling points of the whole .NET framework from day one. This great tool that manages memory for you. Like a lot of what MS sells it turned out to be more of a false sense of security.
As is the case with Microsoft's GC, Java's won't delete things that are still being referenced by other things, because it quite reasonably assumes that an object which is referenced by another object that hasn't itself been marked for collection isn't garbage.
The main problem with garbage collectors (I like GCs, so this isn't a diatribe against them) is that far too many mediocre programmers assume they have a magical ability to know precisely what they want their code to do. The reality of course is that they use algorithms to decide what should be collected, when it should be collected, and how it should be collected, and those who are unfamiliar with the particular strategies that their GC uses can therefore not only write code with more than a few memory leaks, but also code that results in the GC being used so inefficiently that it does vastly more work than would be necessary if the same functionality was implemented in a slightly different way.
There are plenty of articles about Java memory leaks that can be found by Googling "java memory leaks". Googling "java GC tuning" will produce some useful links to articles containing tips on ensuring that it's not used inefficiently.
Well the Event Subscribed 'problem' is well known and makes sense if you think about it. I mean subscribing to an Event means placing a pointer to a delegate of a method in an event subscriber list.. when someone raises that event then each delegate in the list is invoked... so basically it is an implicit reference and hence can prevent the subscribed object from being marked for garbage collection.
However, i had another memory 'leak' problem where the Garbage Collector simply didn't collect in time which caused my application to use more and more memory until it reached the system limit and crashed... i found that simply calling
GC.Collect();
GC.GetTotalMemory(true); // 'true' forces a full collection before measuring
once would fix this problem... i thought i needed to call it every minute or so... but when calling just once it did SOMETHING that prevented this problem from occurring again.. no idea exactly what.. but it works
They can't have had anyone on the team with experience of coding for Swing in Java then - you get these all the time, sometimes hanging tens of megabytes of unwanted GUI objects off a single listener registration, and learn how to spot and fix them.
They also didn't pick a very good hack because it didn't leave the car in a safe state when the software broke.
Lack of practical experience I'd say. A few more events like that and they'll make decent devs one day.
It's not the garbage collector's fault. If an object is still in use, it can't be collected and destroyed. Managed memory only prevents the kind of memory leak where the programmer "loses" all references to the memory and thus never frees it. It also prevents the kind of bug where memory which is still in use is freed. Programs usually crash when that happens (either the OS terminates them due to a memory protection violation or they overwrite their own data and crash later on). That is also what would likely have happened in this case if it weren't for managed memory, because obviously the programmers mistakenly thought that these objects were no longer in use, so they would have freed them when they were still handling events.
Unintentional object retention. I think that is the official name of the problem. It occurs with managed languages. You have a big application with hundreds of objects referencing each other, it's inevitable that you will forget to null out a reference somewhere. (and by the way this is not exclusively a problem of event handlers, it can happen anywhere).
The solution is to use weak references.
When I switched from C++ to Java, it seemed very obvious to me that this would happen, I just assumed that all experienced coders were using weak references. Out of curiosity, I asked on the forums how many people use weak references, the answer was "what is a weak reference?". When I explained that it's to prevent memory leaks, they just laughed at me: "dude, we have a garbage collector, we don't have to worry about memory leaks"
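For anyone asking the same question, a weak reference lets you reach an object without keeping it alive. Below is a rough Python sketch of an event that holds its subscribers weakly. WeakEvent and Obstacle are invented names, and the same idea in Java or C# would use their own WeakReference classes instead of Python's weakref.WeakMethod.

```python
import gc
import weakref


class WeakEvent:
    """Hypothetical event that holds subscribers only through weak references."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, bound_method):
        # WeakMethod does not keep the method's owner alive.
        self._handlers.append(weakref.WeakMethod(bound_method))

    def fire(self):
        live = []
        for ref in self._handlers:
            handler = ref()
            if handler is not None:    # the owner still exists
                handler()
                live.append(ref)
        self._handlers = live          # prune subscribers that have died
        return len(live)


class Obstacle:
    hits = 0

    def on_scan(self):
        Obstacle.hits += 1


event = WeakEvent()
kept = Obstacle()
dropped = Obstacle()
event.subscribe(kept.on_scan)
event.subscribe(dropped.on_scan)

del dropped          # note: no unsubscribe call
gc.collect()
live_count = event.fire()
```

The trade-off is the one other posters mention: a handler that nothing else references silently disappears, which can be just as surprising as a leak.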
just to be clear, THE BUG WAS NOT IN THE RUNTIME, not by any stretch.
there are very clear constructs in place in the language/runtime to allow any object to unregister itself from event registrations it initiated.
this was VERY MUCH a bug in the end-user software, not the runtime (i've written code almost IDENTICALLY to this and blew lots of time having made this same mistake).
the only thing the runtime could do to protect the idiot developer (myself included) is automagically make all event references WEAK references, but that has plenty of undesirable side-effects too... in clr, you can do this yourself if you're so inclined... (just like in a JVM)
cheers.
Peter
They've simply neglected to remove the event handlers for the "obstacle" objects, so they were still referenced and wouldn't be garbage collected: "Though we thought we had cleared all references to old entries in the list, because the objects were still registered as subscribers to an event, they were never getting deleted." So, maybe it's better to read and understand the article (which does seem like an advertisement for the ANTS profiler), before just bashing an innocent programming language for no reason at all...
The objects didn't get deleted in time, because there were always (literally ;-) ) junk near them in the game, hence not getting garbage collected due to their object detection algorithm.
Just read the CodeProject article to see why:
- "so it wasn't a memory leak per se"
- "It was the closest thing to a memory leak that you can have in a "managed" language."
- "Unfortunately, our system was seeing and cataloging every bit of tumbleweed and scrub that it could find along the side of the road."
So they just goofed up.
Bad Slashdot. Bad Slashdot.
these don't have any problems with circular reference structures - if it can't be reached from a root and marked, it'll get collected.
still just a blunder, as you say.
this article should be binned - misleading title and nothing but a puff-piece for a profiler. i much prefer YourKit, incidentally:-)
Actually, C# doesn't reference count at all, it 'Reference Traces' :)
Please, let me explain; it's quite sad how often people don't get this ...
.Net has its block of managed memory, called the Managed heap. It's separated into 3 'generations'. This heap has 2 areas, free space and reserved space, from top to bottom.
When you allocate an object on the heap, by using the new operator (object o = new object();), there is a set of rules that has to be enforced:
The GC manages Reference tracing, and this doesn't occur when the object goes out of scope, it actually happens when the Heap is full and you attempt to allocate a new object.
In something called 'the sweep', the GC goes through each object in the heap to see if it's reachable. To do this it starts with so-called 'roots'. It then traces to see which objects are referenced by these roots.
A root identifies a storage location, which refers to objects on the managed heap, or to objects that are set to null. For example, all of an application's global and static objects are considered to be its roots. (hence the reason that all C# apps have a static void main).
When the sweep starts, it assumes that all objects are garbage. So for each root object, it builds up a graph of the objects that root references, and marks them as being live.
However, if it finds an object that's already in the graph, it stops traversing that path. This is to (massively) increase performance by not scanning the same object twice, and more importantly, it stops you getting into an infinite loop by scanning a circular list.
The pinch is, it prevents the circumstance that you mentioned! :)
Because the strong reference to a linked circular list is gone, the circular list isn't attached to a root object, so it gets collected. If you don't want it to get dropped unless there's a memory shortage, the C# GC also supports something called Weak References, but I'm not going to go into those here as it's head-hurting.
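A toy version of that marking walk, as a sketch only: it is Python, not what the CLR actually runs, and the heap is modelled as a plain dictionary of made-up object names mapped to the names they reference.

```python
def mark_reachable(roots, references):
    """Return the set of objects reachable from the roots.

    `references` maps each object name to the names it refers to.
    """
    live = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in live:        # already in the graph: stop traversing this path
            continue
        live.add(obj)
        stack.extend(references.get(obj, ()))
    return live


heap = {
    "main": ["config"],
    "config": [],
    "a": ["b"],    # a -> b -> c -> a is a cycle that no root points at
    "b": ["c"],
    "c": ["a"],
}
live = mark_reachable(["main"], heap)
garbage = sorted(set(heap) - live)
```

The cycle a, b, c ends up in `garbage` because nothing reachable from a root refers to it, and the "already in the graph" check is what keeps a rooted cycle from looping forever.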
So once all the roots have been checked and we've got a nice graph of all the objects that are referenced by the live parts of the application somehow, the second stage of GC happens.
Any objects that haven't been touched by the walk are of course still marked as Garbage. The GC now walks up the heap linearly, looking for contiguous groups of garbage which are now considered to be free space. The GC looks for the next live object and moves it to the start of this free space with a good old memcpy :)
This of course invalidates all the root pointers, so the GC then updates the pointers in the root objects. :)
So now, we've got rid of all the garbage and our heap is pleasantly compacted; Take that Heap Fragmentation, Kerpow!!
But, that's not all she wrote of course
Now we're free'd and compacted, the 'nextObjPtr' is moved to the top of the heap. At this point the new object creation that triggered the collection is performed and the new object appears at the top of the heap.
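The compaction step can be sketched the same way. This is a simplified Python model, not the real collector: the heap is a list of (name, size) pairs in address order, and the returned forwarding table stands in for the pointer fix-up. The names are invented apart from nextObjPtr, which appears here as next_obj_ptr.

```python
def compact(heap, live):
    """Slide live objects toward the start of the heap.

    `heap` is a list of (name, size) pairs in address order.
    Returns (new_heap, forwarding, next_obj_ptr), where `forwarding` maps each
    surviving object to its new address.
    """
    new_heap, forwarding = [], {}
    next_obj_ptr = 0
    for name, size in heap:
        if name in live:
            forwarding[name] = next_obj_ptr   # where pointers must now point
            new_heap.append((name, size))
            next_obj_ptr += size
    return new_heap, forwarding, next_obj_ptr


heap = [("a", 16), ("b", 32), ("c", 8), ("d", 24)]
new_heap, forwarding, next_obj_ptr = compact(heap, live={"a", "c", "d"})
```

After the pass, next_obj_ptr sits just past the last survivor, which is where the allocation that triggered the collection would be placed.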
This is a dramatic over-simplification and I've not attempted to explain finalization or weak references, but it's still good to know this stuff, it helps us as .Net programmers to consider how to write our code properly :)
The other thing I've not explained is how the Generations work:
There's also the issue where you need to explicitly remove your event listeners when you no longer need the object. The listener keeps a reference to the object (via the interface) so even if it goes out of scope or what-have-you, YOU may think you don't have any references to the object but it implicitly does, through the listener you handed to the system. So... if you're using event listeners, make sure you explicitly remove them in your object's destructor... or else you'll end up with a memory 'leak'.
I'm from team Cornell, #26. We finished the race (although slowly due to what looks like a buggy throttle controller). C# was used exclusively in our system for the strategic planner. It was also used quite a bit for the behavior/operational systems.
I'm very much a C++ programmer, and with a strong focus on micros to add to that, so yeah, I was a bit... skeptical.
At one point in development we did have a "memory leak" issue but it was entirely our fault (while obviously there are no "new"/"delete"'s around, if you don't drop your references to things then essentially you get a nice "memory leak" with all of the associated symptoms).
I think that C# really sped up our development time, and, in the end, our car finished. I'm sure that there are other fully valid languages/IDEs/etc, but we happened to be most proficient in MSVS; we tested the crap out of C#'s compiler's performance on our machine for our specific application; we used it and that part of the system performed admirably. C# also let us write numerous support utilities quickly.
Microsoft may have many faults but I'm pretty sure that C# / the Visual Studio IDE environment as a whole aren't among them.
Princeton seemed to have a number of issues outside of "slowing down" during runs. They completely scrapped their first two qualification runs, ran maybe once, and then left.
P.S. No, i'm not paid by microsoft or such. Aside from the usual departmental benefits like free copies of MSVS and winXP, we didn't get any kind of sponsorship from them.
There's actually an accepted safe way to do memory management - reference counts and weak references. That's what both Perl and Python have settled on, and it's worth noting that programmers in those languages seldom have serious memory management problems. In C and C++, one has to obsess on memory management issues, and even in Java and C#, which are garbage collected, it takes more attention than it should.
Reference counts have the advantage of repeatability - deletion will occur at predictable times. This allows the use of destructors. You can safely use destructors to manage other assets, like windows, open files, network connections, and such.
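A small sketch of that predictability, assuming CPython (other Python implementations do not promise this timing). Connection is a hypothetical resource wrapper, and the log list is only there to record the order of events.

```python
log = []


class Connection:
    """Hypothetical resource wrapper."""

    def __init__(self, name):
        self.name = name
        log.append(f"open {name}")

    def __del__(self):
        # In CPython this runs as soon as the last reference goes away.
        log.append(f"close {self.name}")


def handle_request():
    conn = Connection("db")
    log.append("work")
    # conn's reference count drops to zero when the function returns


handle_request()
log.append("after call")
```

The close is recorded before "after call", so the cleanup happens at a known point rather than whenever a collector gets around to it.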
Destructors in systems with garbage collection make for an unhappy marriage. Calling a destructor or finalizer from the garbage collector is essentially equivalent to calling it at some random time from another thread. So race conditions are possible. Check out Microsoft's "managed C++" for an attempt to get all the cases for this right. It's not pretty.
The classic complaint about reference counts is "what about cycles"? There's a simple answer - cycles, that is, loops of strong pointers, are errors. This isn't a severe restriction; it just requires some data structure design. With trees, for example, links towards the leaves are strong pointers, and links towards the root are weak. (I've revised Python's BeautifulSoup HTML parser to work that way; "down" and "forward" links are strong, while "up" and "backwards" links are weak. It took about 20 lines of code and eliminated annoying problems in programs dealing with HTML trees.)
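The tree arrangement looks roughly like this in Python. This is an illustrative sketch, not the actual BeautifulSoup change; Node is a made-up class, and the immediate freeing shown at the end relies on CPython's reference counting.

```python
import weakref


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []     # strong links toward the leaves
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Weak link toward the root; returns None once the parent is gone.
        return self._parent() if self._parent is not None else None


root = Node("html")
body = Node("body", parent=root)
parent_name = body.parent.name

probe = weakref.ref(root)
del root                           # drop the only strong reference to the root
root_freed = probe() is None       # no cycle, so reference counting frees it
orphan_parent = body.parent
```

Because the "up" link is weak there is no cycle of strong pointers, so dropping the root releases it without any help from a cycle collector.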
If you really need a symmetrical circular list, which might happen in, say, a window library with many links between widgets, there's a simple solution. Have all the objects owned by some collection, then use weak pointers between them. When the collection is dropped, all the bits and pieces go away, in a well defined order.
In Python, you can turn off garbage collection while leaving reference counting active, then list any orphaned cycles at program end for debugging purposes. This is a practical way to program without leaks or garbage collection. It's generally easy to find cycles, because cycles are created by data structure design, not by bugs. So if a program has cycles, it will probably have them every time, and thus they can be found early in debugging. With better language support for debugging, cycles could be caught at the moment of creation, which would make it easy to eliminate them.
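One way to do that with the standard gc module is sketched below. Widget is a made-up class, and the explicit gc.collect() near the end stands in for "at program end".

```python
import gc


class Widget:
    def __init__(self):
        self.peer = None


gc.disable()                      # reference counting still runs; the cycle collector does not
gc.collect()                      # start from a clean slate
gc.set_debug(gc.DEBUG_SAVEALL)    # keep whatever the collector finds in gc.garbage

a, b = Widget(), Widget()
a.peer, b.peer = b, a             # a cycle of strong references
del a, b                          # unreachable, but the refcounts never reach zero

gc.collect()                      # explicit pass, e.g. at program end
orphans = [o for o in gc.garbage if isinstance(o, Widget)]
orphan_count = len(orphans)

# Restore the defaults so the rest of the program is unaffected.
gc.set_debug(0)
gc.garbage.clear()
gc.enable()
```

Anything that shows up in `orphans` was kept alive only by a cycle, which points straight at the data structure that needs a weak link.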
Now if we could get this into a hard-compiled language, we'd have the problem solved. Repeated attempts to bolt reference counting onto C++ via templates have resulted in fragile systems. The fundamental problem is that C++ still requires access to raw pointers to get anything done, and this puts a hole in the protection provided by the reference counting system. It takes language support to make this work right.
I thought I'd add:
The next version of the .NET Framework, 3.0/3.5, uses Weak References by default for almost all event handlers you care about. .NET 3.5 runs on the 2.0 CLR though, so it's simply finally using the weak references provided by the CLR.
It's quite clear in the article that they forgot to unregister their "deleted" objects from events. Since they were still registered, they weren't garbage collected. And rightly so. This was THEIR programming mistake, and has nothing to do with a GC bug in C# or any such thing. Fuck slashdot is pissing me off these days... as soon as they see a story that could be spun as "Microsoft screwed up!", they publish it without any fact-checking (or even reading the goddamn article!).
I wonder if MS could sue Slashdot for slander?
Jeremy
First: The bug was not in the runtime. It was a simple programming bug. Second: the bug had nothing to do with parallel processing. It was an object leak due to event handling. The fastest way to solve it would have been to print out the object graph of the program after it had started running and then again after it had "slowed down". They would have seen a particular class of object had become much more numerous over time. That's your leaker. Memory leaks ARE often easier to track down empirically rather than by just reading the code over. After all, the bug is that the application is in an unwanted state. So why wouldn't you want to characterize the nature of the unwanted state (not just "memory is gone" but "event listener objects are leaking")?
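Here is a sketch of that before-and-after comparison. It is in Python using the standard gc module, since that is easy to run; in .NET you would get the same picture from a memory profiler. TrackedObstacle and the simulated leak are invented for the demonstration.

```python
import gc
from collections import Counter


def type_counts():
    """Count live, GC-tracked objects by type name."""
    gc.collect()
    return Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after, top=5):
    """Types whose instance count grew the most between two snapshots."""
    delta = Counter({k: after[k] - before[k] for k in after if after[k] > before[k]})
    return delta.most_common(top)


class TrackedObstacle:
    """Hypothetical leaked type for the demonstration."""

    def __init__(self):
        self.samples = []


leaked = []
before = type_counts()
leaked.extend(TrackedObstacle() for _ in range(1000))   # simulated leak
after = type_counts()
report = growth(before, after)
```

The report lists TrackedObstacle with a growth of 1000, alongside the containers it drags in, which is the "particular class of object had become much more numerous" signal.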
"NO! It is not a good thing, if a program slowly leaks memory then it just makes it harder to find the bug. If you have to reboot the app every week because it has a little leak, no-one's going to be bothered (except the users who see it slowly getting slower). If it has to be restarted daily then you're going to be looking to fix the bug."
Actually the good companies do debug the slow memory leaks, and the bad ones don't debug the slow ones. Besides, any memory leak in a Java program is possible in a C app, so you are eliminating a class of leaks, not replacing them with harder to find leaks. Thus your entire argument is moot. Furthermore, where are you getting the idea that Java memory leaks are going to be slow while C memory leaks are going to be fast? I've seen slow C memory leaks and fast Java ones. I can think of nothing regarding the nature of garbage collection that would affect the speed of the leak.
"I have a good analogy - Firefox. I use FF a lot, I like it, but it does tend to increase its memory usage over time, and has been rightly criticised for it. Now, I'm sure the 'bug' is an aspect of its design and not a programming bug (and I don't want to start a FF memory discussion - I'm only using it as a real-world example) but just imagine if *every* program was like FF - slowly using more and more RAM over time until you restarted it."
First, that's not an analogy, that's an example. Second, Firefox is not an application, at least not in this day and age. Today it's a platform for web applications, which are just as vulnerable to memory leaks as any other. If that cool new JavaScript app that is running on the page you are loading leaks memory, there really isn't a whole lot Firefox can do.
Third, I fail to see your point. Do memory leaks suck? Of course. Is it best to get rid of them? Of course. Will garbage collection get rid of memory leaks? Of course not. Will it make the problem any worse? No, any code that leaks in Java will also leak in C. Will it make it better? Of course, there are types of leaks which simply are not possible in Java. Those will be eliminated, resulting in fewer leaks (though it is of course impossible to eliminate them completely).