C# Memory Leak Torpedoed Princeton's DARPA Chances
nil0lab writes "In a case of 20/20 hindsight, Princeton DARPA Grand Challenge team member Bryan Cattle
reflects on how their code failed to forget obstacles it had passed. It was written in Microsoft's C#, which isn't supposed to let you have memory leaks. 'We kept noticing that the computer would begin to bog down after extended periods of driving. This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles. The computer performance would just gradually slow down until the car just simply stopped responding, usually with the gas pedal down, and would just drive off into the bush until we pulled the plug. We looked through the code on paper, literally line by line, and just couldn't for the life of us imagine what the problem was.'"
I'll show you my perpetual motion machines if you show me your perfect autonomous garbage collector. You go first.
"There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy."
The linked "article" is just a "sponsored review" for a C# profiler...
This is a stupid, stupid article headline. Of course you can have a memory leak in a managed language! Any Java programmer who's decent understands that.
It's not C#'s fault. The team had references to the obstacle list (event handlers), which prevented garbage collection. The .NET CLR did its job, just like it was supposed to.
Just because a language is garbage collected doesn't mean you can't "leak" memory (in the more standard definition of "waste memory over time"), it only means you can't completely lose track of references to objects (which is often used as a more technical definition of "leak"). It is quite common for people coding in such languages to accidentally generate live object structures that are mostly made up of garbage that they should have released their references to. Put another way: these people's program was legitimately claiming memory and never releasing it due to their limited understanding of how event handlers work.
This is a programming error, plain and simple. From TFA:
Though we thought we had cleared all references to old entries in the list, because the objects were still registered as subscribers to an event, they were never getting deleted.
So references were held to the objects in two places - the list of encountered obstacles, and the list of event subscribers. They were being removed from the list of encountered obstacles, but not being unsubscribed from the event.
How do you think event subscription works? Something has to hold a reference to the objects that are subscribed to the event! That thing is going to hold a reference until you unsubscribe the object - it neither knows nor cares about any other list of references you may be maintaining separately, how could it?
This is a coding error. A subtle, non-obvious one perhaps, but a bug nevertheless. It is not an error in the CLR, and in fact the article never paints it as such. That particular bit of spin is wholly down to the submitter.
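To make the two-lists point concrete, here is a minimal Python sketch of the same mistake. It is not the team's code: `ObstacleDetector`, `Obstacle`, and every method name below are hypothetical, and Python bound methods stand in for C# delegates. The obstacle is removed from the application's own list, yet the subscriber list keeps it alive until it is explicitly unsubscribed.

```python
import gc
import weakref


class ObstacleDetector:
    """Hypothetical event source: keeps a list of subscriber callbacks."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def unsubscribe(self, callback):
        self._subscribers.remove(callback)

    def fire(self, position):
        for callback in list(self._subscribers):
            callback(position)


class Obstacle:
    """Hypothetical tracked obstacle that listens for position updates."""

    def __init__(self, name):
        self.name = name
        self.last_seen = None

    def on_position(self, position):
        self.last_seen = position


detector = ObstacleDetector()
obstacles = [Obstacle("tumbleweed")]
# The bound method holds a strong reference to the obstacle.
detector.subscribe(obstacles[0].on_position)

# A weak reference lets us observe the object without keeping it alive.
probe = weakref.ref(obstacles[0])

# Drop the obstacle from our own list only, as described in the article.
obstacles.clear()
gc.collect()
still_alive_after_clear = probe() is not None

# Unsubscribing removes the last strong reference.
leaked = probe()
detector.unsubscribe(leaked.on_position)
del leaked
gc.collect()
alive_after_unsubscribe = probe() is not None
```

The first check comes out true and the second false: the collector only reclaims the obstacle once the event source lets go of it.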
It's official. Most of you are morons.
I've RTFA; it wasn't a memory leak caused by C#, it was caused by bad programming. After that, the whole article starts to advertise some obscure profiling tool. Maybe they should have written the whole thing in C++ and used valgrind instead. Just an idea...
It's not as if C doesn't leak memory when you mishandle resources. All these people needed to do was spend 5 minutes with the (free) MS .NET profiler and look at the allocation and GC graphs, and they'd be done.
using namespace slashdot;
troll::post();
It would have to, because otherwise you could register an anonymous class or other similarly constructed object as an event handler, release all your other references, and the GC would collect it while it still needs to be able to receive messages. (That, or the GC would unregister the event handler, which would be utterly mysterious to debug if you didn't already know the intricacies of the GC, especially since you have no way of predicting when your event handler is collected).
This problem is actually less drastic in C#, since you have the ability to instantiate anonymous methods (delegates) instead of anonymous classes when doing event handling, and they have lower overhead. So even if you mess up your resource management, it's still much easier to control your resource usage because you have access to cheaper primitives to work with.
using namespace slashdot;
troll::post();
This section totals 15 points.
Background:
There are more types of resource leaks than just memory leaks. A memory leak is when your program keeps hold of memory it's not using. An object leak is when your program keeps hold of objects it's not using. A file descriptor leak is when your program fails to reuse the descriptors for files it has closed and will not reopen. Many other types of leaks could be considered.
Exercises:
1. Determine which issue this scenario describes.
2. Figure out which issue can be handled by automatic memory management.
3. Discuss whether, and if so why, the answers to Exercises 1 and 2 mean there is some conceptual discord between the wording of the scenario and the use of the term "memory leak".
I don't see why they just didn't write it in C.
They were using massive cooling systems and having very thorough code reviews, sounds like a perfect reason to use C over C#.
What surprises me most is the small size of their software: only 10 thousand lines of source code (I think the average car processor already has that much in today's cars, for the ignition and braking systems). Given a team of a dozen programmers working for a year, I was expecting at least 50KLOC, or maybe 200KLOC (for example, the GCC compiler is 3MLOC, and the Linux kernel is of comparable size).
Of course memory leaks can happen with garbage collected languages, but these leaks are a little easier to find....
Maybe they should have coded in a higher level language like Ocaml, Haskell.
And yes, I'm sure most of an autonomous vehicle's software is not low-level drivers, but planning and perception tasks. On such tasks, higher-level languages definitely make sense.
I also did not understand what kind of libraries these teams are using.
I'm also surprised that it is apparently so easy to get funded to have only 10KLOC inside a car!
I see quite a few comments from C/C++ coders who wonder whether managed memory people know how event handling works. If they knew a little more about managed memory languages, they'd know a reference does not have to be "hard": you can have a reference to an object that does *not* prevent garbage collection.
So I guess the real question here is whether event handlers should be hard-referenced (as they are here), or just soft/weak referenced...
From a developer perspective it's quite natural to think that, as long as his code doesn't hold any reference to an object, it should be garbage collectable. If registerEvent() shall hard-reference handlers, documentation should be *very* explicit about it (and the need to unregister a handler for GC to work on it).
On the other hand, if handlers are not hard-referenced you can no longer register anonymous class event handlers...
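The trade-off can be seen in a few lines of Python. This is only a sketch: `WeakEvent` is a hypothetical class, not an API from .NET or Java, and Python's `weakref.ref` plays the role of a weak reference. A named handler that the caller still holds keeps working, while an anonymous handler is collected almost immediately because nothing else refers to it.

```python
import gc
import weakref


class WeakEvent:
    """Hypothetical event that holds handlers only through weak references."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(weakref.ref(handler))

    def fire(self, value):
        live = []
        for ref in self._handlers:
            handler = ref()
            if handler is not None:
                handler(value)
                live.append(ref)
        # Forget handlers that have already been collected.
        self._handlers = live


received = []
event = WeakEvent()


def named_handler(value):
    received.append(("named", value))


event.subscribe(named_handler)
# Nothing else references this lambda, so it does not survive.
event.subscribe(lambda value: received.append(("anonymous", value)))

gc.collect()
event.fire(1)
```

After `fire(1)`, `received` contains only the entry from `named_handler`. The anonymous handler silently disappeared, which is exactly the debugging surprise a weak-by-default design would create.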
:(){
Just like most windows machines it bogs down and starts crashing after about 40 minutes of hard use.
In C#, the problem manifests itself as a memory leak. In C/C++ however, you would have freed the memory even while the listeners were still active. Now you have a reference to previously freed memory. I know what I would prefer. The only advantage is that - maybe - the C/C++ error would show up earlier, but the form of the manifestation might vary.
That's why there are no memory leaks in C/C++ code [/sarcasm]
As user of programs written in GCed languages, in my experience usually they are bad memory hogs. And don't tell me that memory is cheap. People constantly forget that we are not any more in the days of DOS, where there was essentially only one program running at any time.
The Tao of math: The numbers you can count are not the real numbers.
(1) You are supposed to test your software.
(2) You are particularly supposed to test your software if you send $200k and 1 ton of hardware careening through the street on autonomous real-time control.
(3) Garbage collectors do not prevent memory leaks.
(4) Garbage collected systems can be good for building real-time systems, but you need a real-time garbage collector or you need to treat the system as if it didn't have a garbage collector at all.
What "ruined their chances" was not that they overlooked a memory leak, what ruined their chances was that they didn't know what they were doing.
As is the case with Microsoft's GC, Java's won't delete things that are still being referenced by other things, because it quite reasonably assumes that an object which is referenced by another object that hasn't itself been marked for collection isn't garbage.
The main problem with garbage collectors (I like GCs, so this isn't a diatribe against them) is that far too many mediocre programmers assume they have a magical ability to know precisely what they want their code to do. The reality of course is that they use algorithms to decide what should be collected, when it should be collected, and how it should be collected, and those who are unfamiliar with the particular strategies that their GC uses can therefore not only write code with more than a few memory leaks, but also code that results in the GC being used so inefficiently that it does vastly more work than would be necessary if the same functionality was implemented in a slightly different way.
There are plenty of articles about Java memory leaks that can be found by Googling "java memory leaks". Googling "java GC tuning" will produce some useful links to articles containing tips on ensuring that it's not used inefficiently.
I'm not going to change your sheets again, Mr. Hastings.
This kind of thing makes me so happy. Sure, it's not really a bug in C#, but this is even better, a perfect demonstration of how GC does next to nothing to prevent this type of bug, and instead fools people into complacency while making the bug much more subtle.
In my opinion there is a proper language level for nearly any task. For kernel programming, drivers, or RT stuff, C. User-level stuff is usually better in C++. Well, I'm a big fan of C++ and more comfortable there so I'll usually extend its range down to some lower-level work and sometimes I'll bang out a quick-and-dirty app or script type thing (lots of user input parsing and other things C++ isn't great at) in it too, even if it could be done better (yes, better as in higher quality) or faster in another language.
Anyway, although I could be making incredibly wrong assumptions about the nature of the problem, I'm pretty sure that C# wasn't the right language for the job. C# very nicely occupies the space between C++ level languages and scripting languages, but for a problem that involves probably no parsing whatsoever (it shouldn't, anyway), needs to be perfectly stable (in my experience GC apps are buggier, I'm not going to go off on that tangent now and explain, but it's been my experience), and have as deterministic runtime as possible, it's C or a subset of C++ (little to no STL) all the way. This paragraph was brought to you by Lisp.
This problem was caused by, I'm going to go out on a limb here and say the wrong language choice. If this was C/C++, there would have been a segfault (easy to debug--usually) or the old reference wouldn't have mattered at all. C#'s real strong point, its huge and well-integrated library, probably didn't help them out very much.
Every programmer who wants to call themselves a real programmer should learn as many languages in as wide a range as possible. Sure, have favorites, but that should mean trying to work in your language's realm, not extending it way beyond its range.
<xml><I><am><so><damn>Web 2.0</damn></so></am></I></xml>
Well, the Event Subscribed 'problem' is well known and makes sense if you think about it. I mean, subscribing to an Event means placing a pointer to a delegate of a method in an event subscriber list.. when someone raises that event then each delegate in the list is invoked... so basically it is an implicit reference and hence can prevent it from being marked for garbage collection.
;) :)
However, I had another memory 'leak' problem where the Garbage Collector simply didn't collect in time, which caused my application to use more and more memory until it reached the system limit and crashed... I found that simply calling
GC.Collect();
GC.GetTotalMemory(true); // passing true 'forces' a collection
once would fix this problem... I thought I needed to call it every minute or so... but calling it just once did SOMETHING that prevented this problem from occurring again.. no idea exactly what.. but it works
They can't have had anyone on the team with experience of coding for Swing in Java then - you get these all the time, sometimes hanging tens of megabytes of unwanted GUI objects off a single listener registration, and learn how to spot and fix them.
Yikes. So these guys have the smarts to make a computer drive a car on its own, but managed to forget some basic safety mechanisms such as a watchdog and other failsafe mechanisms ?
Geez guys - real world engineering 101: Do not let a computer control anything that might have a remote chance of harming someone without appropriate safety mechanisms.
They also didn't pick a very good hack because it didn't leave the car in a safe state when the software broke.
Lack of practical experience I'd say. A few more events like that and they'll make decent devs one day.
I just read TFA and it doesn't give any details. My guess? I just checked, and C# apparently uses reference-count garbage collection. That means that an object will stay around until there are zero references to it. The best way to create an object that will never go away is to create a circular linked list, then delete the reference to the list. All the items refer to each other, but there is nothing else that references them. But any complicated data structure that can have circular references will leak memory.
A mark-sweep garbage collector will catch this, but at the cost of interrupting the program temporarily to do GC. This isn't exactly friendly to real-time applications.
So basically this looks like a classic noob blunder. Just because there is "automatic" garbage collection doesn't mean that you can turn your brain off.
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
Unintentional object retention. I think that is the official name of the problem. It occurs with managed languages. You have a big application with hundreds of objects referencing each other; it's inevitable that you will forget to null out a reference somewhere. (And by the way, this is not exclusively a problem of event handlers; it can happen anywhere.)
The solution is to use weak references.
When I switched from C++ to Java, it seemed very obvious to me that this would happen; I just assumed that all experienced coders were using weak references. Out of curiosity, I asked on the forums how many people use weak references, and the answer was "what is a weak reference?". Then I explained it's to prevent memory leaks, and they just laughed at me: "dude, we have a garbage collector, we don't have to worry about memory leaks"
A funny thing happened during my co-op this summer:
I was working at a coal-fired power plant which needed a new pollution control device before 2010. There, I would dig through the literature, and try to find suitable products and operating conditions for this device. Anyway, this involved a lot of meetings, conference calls, and business lunches with the suppliers in question.
Then there was Joe.
Joe was our Alstom sales rep: portly, humorless, slow to speak and slower to understand. He was also a devout Utahnian.
Well, one day, we were killing time while waiting on a conference call, my supervisor left the room, and we started talking about universities. Then he dropped the bomb:
"In my Senior year, I worked on developing perpetual motion machines."
My supervisor then reentered the room, and we got back to work. I felt like I'd just seen a dancing frog.
The objects didn't get deleted in time, because there was always (literally ;-) ) junk near them in the game, hence not getting garbage collected due to their object detection algorithm.
Just read the CodeProject article to see why:
- "so it wasn't a memory leak per se"
- "It was the closest thing to a memory leak that you can have in a "managed" language. "
- "Unfortunately, our system was seeing and cataloging every bit of tumbleweed and scrub that it could find along the side of the road."
So they just goofed up.
Bad Slashdot. Bad Slashdot.
Beware: In C++, your friends can see your privates!
Criticisms of the team aside, I would like to say that neither Java nor C# has made any steps to remedy problems like this, which seem to be all too common with inexperienced developers. Both Java and C# need to support attaching to events with "weak" handlers. That is, the handler will not hold onto the object which defines the handler (and will automatically deregister itself sometime after the object has been collected). In many cases, there is a need for an object to listen to and handle an event from another object, but only whilst the object that is listening is still referenced (with the exception of the reference held by the object firing the event).
In C#, the (admittedly ugly) way to implement this is to use an anonymous method and a weak reference: the "closure" that is created for the anonymous method does not hold a reference to "this", as it does not access any of "this"'s fields or methods unless it's through the weak reference.
The code has a flaw where the event handler code (only a few bytes to hold the closure) will never be deregistered or collected unless the event is fired sometime after the owner object has been collected. This can be fixed by using a NotifyingWeakReference (a weak reference that raises an event when it has been collected).
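As a rough Python analogue of the pattern described above (not the commenter's C# code), the sketch below registers a small closure that reaches its owner only through a weak reference. `Event`, `Listener`, and `subscribe_weakly` are hypothetical names. It reproduces the stated flaw too: the closure lingers in the handler list until the event fires once after the owner has been collected.

```python
import gc
import weakref


class Event:
    """Hypothetical event source that holds strong references to handlers."""

    def __init__(self):
        self.handlers = []

    def fire(self, value):
        for handler in list(self.handlers):
            handler(value)


class Listener:
    """Hypothetical owner object that wants to receive events."""

    def __init__(self):
        self.seen = []

    def handle(self, value):
        self.seen.append(value)


def subscribe_weakly(event, listener):
    """Register a closure that reaches the listener only via a weak reference."""
    listener_ref = weakref.ref(listener)

    def trampoline(value):
        target = listener_ref()
        if target is None:
            # The owner is gone, so the closure removes itself.
            event.handlers.remove(trampoline)
        else:
            target.handle(value)

    event.handlers.append(trampoline)


event = Event()
listener = Listener()
subscribe_weakly(event, listener)

event.fire("rock")
seen_while_alive = list(listener.seen)

del listener
gc.collect()
handlers_before_fire = len(event.handlers)  # the small closure is still registered
event.fire("bush")
handlers_after_fire = len(event.handlers)   # it deregistered itself during fire
```

In Python, the role of the "NotifyingWeakReference" could be played by the optional callback argument of `weakref.ref`, which runs when the referent is about to be finalized and could remove the closure without waiting for the next event.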
Slashdot editors are even more pathetic than I thought they were. It's bad enough that they didn't skim through the article, but they apparently didn't even take a look at the URL. Look at this thing:
http://www.codeproject.com/showcase/IfOnlyWedUsedANTSProfiler.asp
"IfOnlyWedUsedANTSProfiler"? That didn't raise any flags?
Of course, I'm trying to assume good faith and not just conclude that the editors knew this was an advertisement, but they sure are making that difficult.
...on Slashdot if it was written in Perl on the Linux platform. This is just an opportunity for someone to read part of a story and make a snappy title that bashes Microsoft, based on a misunderstanding of the technology in the article.
This is not a C# memory leak; it was a memory leak written in C#. The developers used a commercial tool to find their problem, a trial version even. So how about the title "Commercial Code Profiler Saves the Day For DARPA team"?
Oh, because then it would never be a Slashdot article, ugh.
Developers making a mistake != c# bug
The IDisposable interface is there for a reason.
It's the programmer and the language. Give the world's best carpenter a ball-peen hammer and ask him to build you a beautiful armoire, see what happens.
You can say now that they'll be much further next year, but until then "Which means that the language did the job very nicely" should be "Which would mean that the language did the job very nicely." If you put in a reminder of some sort to come back and say I told you so, I'd be more than happy to eat my words if they continue using C# and place in the top 33%. Hell, I'd even concede that you might be right if they manage the top 50%.
I say, however, that there is a right language for the job. Sure, there's overlap, but you don't implement your FFT in Perl when the problem is that you need the fastest FFT possible, you don't write a word-processor in assembly, and you don't write anything in Brainfuck even though they're all Turing-complete. Anyone who says you can do anything in any language is trying to justify using their favorite language for absolutely everything.
<xml><I><am><so><damn>Web 2.0</damn></so></am></I></xml>
I usually like /. articles, even the ones against MS, but I cannot just skip over this one:
First, if the moderator had read the article he would have noticed that it was an advertisement for the profiler product, not just a review of it (it was written directly by Red Gate).
Second, the article itself says that they found that the error was in how they coded the application, because they left some references in place so the garbage collector didn't throw away the objects.
This is a really bad article and bad information.
Come on. Really. What kind of idiot marketer sends in stories like this to Slashdot? We know what happens. First, you get derided mercilessly for trying to sway us with your ridiculously transparent attempt at marketing. Then, the real experts come out and poke holes in everything you've said. Then everyone else chimes in with better (and often free) alternatives. You and your company end up looking like buffoons, and your product ends up looking like utter garbage.
You may think you're pulling one over on the editors, and maybe you are. But you aren't pulling one over on us, and I think after all these years, the editors know this. So, just don't. Unless your product or service is absolutely bulletproof people here are more likely to shoot it full of holes than rush out and buy it.
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
I'm from team Cornell, #26. We finished the race (although slowly due to what looks like a buggy throttle controller). C# was used exclusively in our system for the strategic planner. It was also used quite a bit for the behavior/operational systems.
I'm very much a C++ programmer, and with a strong focus on micros to add to that, so yeah, I was a bit... skeptical.
At one point in development we did have a "memory leak" issue but it was entirely our fault (while obviously there are no "new"/"delete"'s around, if you don't dereference things then essentially you get a nice "memory leak" with all of the associated symptoms).
I think that C# really sped up our development time, and, in the end, our car finished. I'm sure that there are other fully valid languages/IDEs/etc, but we happened to be most proficient in MSVS; we tested the crap out of C#'s compiler's performance on our machine for our specific application; we used it and that part of the system performed admirably. C# also let us write numerous support utilities quickly.
Microsoft may have many faults but I'm pretty sure that C# / the Visual Studio IDE environment as a whole aren't it.
Princeton seemed to have a number of issues outside of "slowing down" during runs. They completely scrapped their first two qualification runs, ran maybe once, and then left.
P.S. No, I'm not paid by Microsoft or such. Aside from the usual departmental benefits like free copies of MSVS and WinXP, we didn't get any kind of sponsorship from them.
There's actually an accepted safe way to do memory management - reference counts and weak references. That's what both Perl and Python have settled on, and it's worth noting that programmers in those languages seldom have serious memory management problems. In C and C++, one has to obsess on memory management issues, and even in Java and C#, which are garbage collected, it takes more attention than it should.
Reference counts have the advantage of repeatability - deletion will occur at predictable times. This allows the use of destructors. You can safely use destructors to manage other assets, like windows, open files, network connections, and such.
Destructors in systems with garbage collection make for an unhappy marriage. Calling a destructor or finalizer from the garbage collector is essentially equivalent to calling it at some random time from another thread. So race conditions are possible. Check out Microsoft's "managed C++" for an attempt to get all the cases for this right. It's not pretty.
The classic complaint about reference counts is "what about cycles"? There's a simple answer - cycles, that is, loops of strong pointers, are errors. This isn't a severe restriction; it just requires some data structure design. With trees, for example, links towards the leaves are strong pointers, and links towards the root are weak. (I've revised Python's BeautifulSoup HTML parser to work that way; "down" and "forward" links are strong, while "up" and "backwards" links are weak. It took about 20 lines of code and eliminated annoying problems in programs dealing with HTML trees.)
If you really need a symmetrical circular list, which might happen in, say, a window library with many links between widgets, there's a simple solution. Have all the objects owned by some collection, then use weak pointers between them. When the collection is dropped, all the bits and pieces go away, in a well defined order.
In Python, you can turn off garbage collection while leaving reference counting active, then list any orphaned cycles at program end for debugging purposes. This is a practical way to program without leaks or garbage collection. It's generally easy to find cycles, because cycles are created by data structure design, not by bugs. So if a program has cycles, it will probably have them every time, and thus they can be found early in debugging. With better language support for debugging, cycles could be caught at the moment of creation, which would make it easy to eliminate them.
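A small sketch of the tree discipline and the debugging trick described above, assuming CPython's reference counting. `Node`, `add_child`, and `root_survives_drop` are hypothetical names, and this is not the BeautifulSoup change mentioned earlier. With the cycle collector disabled, a strong "up" link orphans the tree, while a weak "up" link lets reference counting free it immediately.

```python
import gc
import weakref


class Node:
    """Hypothetical tree node."""

    def __init__(self, name):
        self.name = name
        self.children = []  # strong links toward the leaves
        self.parent = None  # link toward the root, set by add_child


def add_child(parent, child, weak_up_link):
    parent.children.append(child)
    # A weak "up" link avoids a loop of strong pointers.
    child.parent = weakref.ref(parent) if weak_up_link else parent


def root_survives_drop(weak_up_link):
    """Build a two-node tree, drop it, and report whether the root is still alive."""
    root = Node("html")
    add_child(root, Node("body"), weak_up_link)
    probe = weakref.ref(root)
    del root
    return probe() is not None


gc.collect()  # start from a clean slate
gc.disable()  # leave only reference counting active
try:
    strong_cycle_leaks = root_survives_drop(weak_up_link=False)
    weak_link_leaks = root_survives_drop(weak_up_link=True)
    # An explicit collection still works and reports the orphaned objects.
    orphans_found = gc.collect()
finally:
    gc.enable()
```

Here `strong_cycle_leaks` is true, `weak_link_leaks` is false, and `orphans_found` is nonzero because of the one strong cycle. Setting `gc.set_debug(gc.DEBUG_SAVEALL)` before the final collection would additionally keep the orphaned objects in `gc.garbage` for inspection.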
Now if we could get this into a hard-compiled language, we'd have the problem solved. Repeated attempts to bolt reference counting onto C++ via templates have resulted in fragile systems. The fundamental problem is that C++ still requires access to raw pointers to get anything done, and this puts a hole in the protection provided by the reference counting system. It takes language support to make this work right.
People complain that Slashdot sucks: the headlines are sensationalistic, the editors get commissions based on the number of dupes they post, and articles about 6-month-old events get posted as "news".
So why do I even bother visiting Slashdot? The answer is two things: the community of posters, and Slashcode moderation.
The value of Slashdot is in its community. You and I, dear Slashdotters. Our collective mind will pick through the various articles, point out their flaws, expose sensationalist FUD for what it is (and, surprisingly, will do this equally for anti-Linux and anti-MS FUD), debate various trends, and provide a significantly international (though heavily USA-centric) perspective.
This value is enhanced by Slashdot's moderating system, so that information and insight can bubble to the top among the mass of inane posts. Metamoderation limits the amount of crack that the moderators can be on.
So, Slashdot editors, take note! *WE* are the reason we are here. *YOU* are not. Many of us don't even bother to read the articles any more, preferring to soak up the collective wisdom of techies from varying age groups and fields. If you piss us off, and the collective community of Slashdot deteriorates, then there's no reason for me (or others) to keep coming back.
Think about it.
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
I suspect it is the fault of the Slashdot user base as much as the editors. I bet a lot of users were in the firehose, saw the sensationalist title, etc, and rated it highly. The editor comes in, sees it has a sensationalist title and is now colored red, meaning users really think it is great, and posts it. So yes, the editor may not have read the article, but I'm sure the user base didn't either, at least not until after it got posted.
Beware of bugs in the above code; I have only proved it correct, not tried it.
I thought I'd add:
The next version of the .NET Framework, 3.0/3.5, uses weak references by default for almost all event handlers you care about. .NET 3.5 runs on the 2.0 CLR though, so it's simply finally using the weak references provided by the CLR.
We are the fire that lights our world.. and we are the fire that consumes it.
Hey, here's a wacky idea that's just crazy enough to work - DON'T USE DYNAMIC MEMORY ALLOCATION! Why in holy hell would someone construct what amounts to an embedded real-time system using dynamic memory? Define fixed memory allocations for everything. Run tests. If the memory is insufficient, the program crashes. Then you can see where the program crashes and why. Then you can fix it.
Just because you *can* do something doesn't mean you should.
Brett
Slashdot has editors. I know this, because the stuff below "nil0lab writes..." is heavily edited from what I actually submitted! In fact, I started my actual submission with something like "in a shameless plug for some code analysis product..."
I sent this story to Slashdot, and I'm not a marketer nor do I have any relationship with the product. In fact, I started my submission (which was edited, see other comment above) with something like "in a blatant plug for some kind of profiling product..."