Firefox Memory Leak is a Feature
SenseOfHumor writes "The Firefox memory leak is not a bug. It's a feature! The 'feature' is how the pages are cached in a tabbed environment." From the article: "To improve performance when navigating (studies show that 39% of all page navigations are renavigations to pages visited less than 10 pages ago, usually using the back button), Firefox 1.5 implements a Back-Forward cache that retains the rendered document for the last five session history entries for each tab. This is a lot of data. If you have a lot of tabs, Firefox's memory usage can climb dramatically. It's a trade-off. What you get out of it is faster performance as you navigate the web."
The answer to that is pretty simple:
The heap, where dynamic allocations occur, is only allowed to grow or to be truncated. An application cannot release memory in the middle of the heap without also releasing the memory at the end of the heap.
So let's say Firefox makes 10 one-page allocations, and frees the first 9. The memory layout might look something like:
XXXXXXXXXU (X- unused, U- used)
Those 9 pages worth of memory aren't being used, but it's impossible to release them back to the OS.
Thankfully, there is some good news: when Firefox needs to allocate more memory, it can and will just reuse those 9 unused pages instead of allocating more memory from the OS and growing the heap.
The best solution to this problem is to use a compacting garbage collector. Which is something that Java and C# and other higher-level langauges can easily make use of (and many do use them), but which C and C++ can't really make use of given the complete lack of compiler support. That's one reason why a Java or C# app can actually out-perform a similar C/C++ app, especially with a good native-code compiler and an library implementation with a modern GC.
I think this submission is confusing two points. First of all, is this really a memory leak? A program that uses a lot of memory is not necessarily a leaking program. A memory leak is a programmatic error where memory is allocated but never freed, even when there's no way to use that object again. As the program continues to allocate memory, the heap size of the process increases until eventually the OS terminates the process (eg., the OOMKiller). Actually, many applications you normally use leak memory - but as long as they don't waste a ridiculous amount of memory most people don't care, especially since most process lifetimes are relatively short (compared to a daemon process like apache), and after termination the OS reclaims all the program's memory, leaked or not.
What is being described here sounds much more like a cache of recent pages, which in my opinion is perfectly sane for a browser. Sure, maybe the cache is a bit overzealous, but even if that's the case, just disable it - worse case scenario, you edit the source. But otherwise, this is definitely a feature - I can promise you it's much more programming effort to save old pages for a quick redraw than to free the old page and replace it with the new.
So I guess the discussion here is, "is it right for firefox to use so much memory?" My answer is yes. It is not a memory leak, it seems like a very valid design decision. But if you disagree, old versions of firefox still work great (I still haven't upgraded myself).
http://www.talknerdy.org
You're confusing garbage collection with heap compaction.
People have written garbage collectors for C++, and they work just fine. But they do not help with fragmentation, which is the problem you're describing. That requires a heap-compacting allocator (aka a "handle" allocator). Many languages with garbage collection also use a heap-compacting allocator. C++ does not, because of a low-level language "feature": pointers, and specifically, pointer arithmetic.
If an object moves in memory, then people have to be notified that it has moved, or they won't know where to access it. Languages like Java handle this behind-the-scenes; the system library tracks objects for you, and your program never knows (or cares) whre an individual object is.
C++ allows direct access to system memory, and it tells you precisely where your objects are located. Programmers are then free to do all kinds of things like compute distance to other objects, or convert the location to a number and do arbitrary math operations on it.
When an object moves, anything that refers to it needs to be updated. Well, good luck figuring that out in a language with pointer arithmetic! The system would need to magically determine whether or not a numeric value was actually a memory locations. And what if a program computed the distance between two objects, and later on used that distance to get from one object to the other? The system has no idea of what can be safely moved, and what has to stay put. So nothing can ever be moved.
There are workarounds of course -- if you write a program with heap-compaction in mind, then you can use a "handle" system, where every object has an ID. You remember the ID, and to access the object, you ask the system for a temporary memory location. And as soon as you're done, you "forget" the memory location and let the system shuffle things around in memory. The next time you give that ID to the system, you might get back a different memory location, but you were already expecting that so your program doesn't mind.
But handle allocation is slower, less efficient, and more annoying to use than a traditional fixed-location allocator. You have to start your project with it in mind; retrofitting existing code to use a handle allocator is a giant timesink and prone to conversion errors. And if you don't mind the loss of performance due to using a handle allocator, why are you using C++ in the first place?