Java Urban Performance Legends
An anonymous reader writes "Programmers agonize over whether to allocate on the stack or on the heap.
Some people think garbage collection will never be as efficient as direct memory management, and others feel it is easier to clean up a mess in one big batch than to pick up individual pieces of dust throughout the day. This article pokes some holes in the oft-repeated performance myth of slow allocation in JVMs."
How much time have I spent with Electric Fence and valgrind finding memory leaks in my C programs? In Java, the auto garbage collection is as good as Perl's, without that tricky "unreadable code" problem ;). And I can always tune garbage collection performance by forcing a garbage collect when I know my app's got the time, like outside of a loop or before creating more objects in storage.
--
make install -not war
The memory management routines normally run when the JVM thinks it's best, but as a programmer it is usually possible to predict the best time to actually take care of the housekeeping. Even if the cleanup takes the same time in both cases, Java has a tendency to run it in the middle of everything. So if I as a programmer do the garbage collection at the end of a displayed page and Java does it uncontrollably in the middle of the page, the latter case is more annoying to the user.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
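The idea raised in the comments above, asking for a collection at a quiet moment such as after a page has been rendered, can be sketched roughly as follows. This is a minimal illustration, not a recommendation: `GcHintDemo`, `renderPage` and `usedHeapBytes` are made-up names, and `System.gc()` is only a request that the JVM may ignore or defer (for instance, HotSpot can be started with `-XX:+DisableExplicitGC`).

```java
// Hypothetical sketch: class and method names are illustrative only.
final class GcHintDemo {
    /** Simulates building a page by creating many short-lived strings. */
    static int renderPage(int items) {
        int total = 0;
        for (int i = 0; i < items; i++) {
            String row = "row-" + i;   // becomes garbage after this iteration
            total += row.length();
        }
        return total;
    }

    /** Bytes currently in use on the Java heap, as reported by the runtime. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        int total = renderPage(100_000);
        long before = usedHeapBytes();
        // Only a hint: the JVM decides whether and when a collection runs.
        System.gc();
        long after = usedHeapBytes();
        System.out.println("characters rendered: " + total);
        System.out.println("heap in use before hint: " + before + " bytes");
        System.out.println("heap in use after hint:  " + after + " bytes");
    }
}
```

Whether the hint helps or hurts depends on the collector in use, so it is worth measuring before relying on it.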
What Java apps are you talking about?
There aren't many professional-grade Java Desktop apps out there, and those that _are_ out there are generally built for maintainability, not speed (Eclipse, Netbeans). I built a Java web client that blew the pants off its older C++ version. The reason Java apps are generally slower is that the Java culture is about maintenance over performance.
Don't even get me started on C#.
.Net violates the EULA. And we
wouldn't want that!
I should hope not! Any form of benchmarking of Microsoft's
So when Microsoft declares their interpreted inverse-polyglotic language as "faster" than compiled pure C, just accept it. Best for everyone that way.
the FACT is that people's real-world experience, no matter how anecdotal, consistently demonstrates that Java is MASSIVELY slower than similar apps in C or C++.
Well, the linked article contained a number of what I will graciously call "assumptions" (rather than "outright lies") about allocation patterns in C/C++ that simply don't hold true in most cases.
For example, the parent mentions the old "stack or heap" question... Which no serious C coder would ask. Use the heap only when something won't fit on the stack. Why? The stack comes for "free" in C. If you need to store something too large, you need the heap. But then, you can allocate it once, and don't even consider freeing it until you finish with it (generational garbage collection? Survival time? Gimme a break - It survives until I tell it to go away!). As for recursion... You can blow the stack that way, but a good programmer will either flatten the recursion, or cap its depth.
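To make "flatten the recursion" concrete, here is a small sketch. It is written in Java to match the rest of the thread, although the comment is about C; the technique is the same in either language. `FlattenRecursionDemo`, `Node` and `chain` are hypothetical names invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: the same tree sum written recursively and flattened.
final class FlattenRecursionDemo {
    /** Minimal binary tree node, for illustration only. */
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    /** Recursive sum: call-stack depth grows with the height of the tree. */
    static long sumRecursive(Node n) {
        if (n == null) return 0;
        return n.value + sumRecursive(n.left) + sumRecursive(n.right);
    }

    /** Flattened sum: pending nodes are kept in a heap-allocated deque. */
    static long sumIterative(Node root) {
        long total = 0;
        Deque<Node> pending = new ArrayDeque<>();
        if (root != null) pending.push(root);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            total += n.value;
            // ArrayDeque rejects nulls, so only push real children.
            if (n.left != null) pending.push(n.left);
            if (n.right != null) pending.push(n.right);
        }
        return total;
    }

    /** Builds a degenerate, list-shaped tree of the given height. */
    static Node chain(int height) {
        Node root = null;
        for (int i = 0; i < height; i++) {
            Node n = new Node(1);
            n.left = root;
            root = n;
        }
        return root;
    }

    public static void main(String[] args) {
        Node deep = chain(1_000_000);
        // The recursive version would probably overflow a default-sized
        // thread stack at this depth; the flattened one does not use it.
        System.out.println(sumIterative(deep));
    }
}
```

The flattened version trades call-stack frames for an explicit worklist, so its depth limit becomes available heap rather than stack size.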
And the article dares to justify its "assumptions" by comparing Java against a language interpreter such as Perl. Not exactly a fair comparison - Yes, Perl counts as "real-world", but in an interpreter, you can't know ahead of time anything about your memory requirements. At best, you can cap them and dump the interpreted program if it gets too greedy. Now, some might point out that Java gets interpreted as well - And I'll readily admit that, for doing so, it does a damn fine job with its garbage collection. But if you want to compare Java to Perl, then do so. Don't try to sneak in a comparison to C with a layer of indirection.
One last point - The article mentions that you no longer need to use object pooling. SURE you no longer need it - Java does it implicitly at startup. You can avoid all but a single malloc/free pair in C as well, if you just steal half the system's memory in main(). Sometimes that even counts as a good choice - I've used it myself, with the important caveat that I've done so when appropriate. Not always. I don't have malloc() as the second and free() as the second-to-last statements in every program I write. And that most definitely shows in the minimal memory footprints attainable between the two languages... Try writing a Java program that eats less than 32k.
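For readers who have not seen it, the hand-rolled object pooling under discussion looks roughly like the sketch below. `BufferPool` and its methods are hypothetical names, the class is not thread-safe, and nothing here is taken from the article; it only illustrates the pattern whose usefulness is being debated.

```java
import java.util.ArrayDeque;

// Hypothetical sketch of a simple, single-threaded buffer pool.
final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;

    /** Pre-allocates `count` buffers of `bufferSize` bytes each. */
    BufferPool(int count, int bufferSize) {
        this.bufferSize = bufferSize;
        for (int i = 0; i < count; i++) {
            free.push(new byte[bufferSize]);
        }
    }

    /** Hands out a pooled buffer, or allocates a fresh one if none is free. */
    byte[] acquire() {
        byte[] b = free.poll();
        return (b != null) ? b : new byte[bufferSize];
    }

    /** Returns a buffer to the pool; buffers of the wrong size are dropped. */
    void release(byte[] b) {
        if (b != null && b.length == bufferSize) {
            free.push(b);
        }
    }

    /** Number of buffers currently waiting in the pool. */
    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(2, 4096);
        byte[] buf = pool.acquire();
        // ... use buf ...
        pool.release(buf);
        System.out.println("buffers available: " + pool.available());
    }
}
```

The cost of this pattern is that the caller takes back responsibility for object lifetimes: forgetting `release`, or using a buffer after releasing it, reintroduces the kind of bug garbage collection is meant to remove.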
The copying collector sounds really fast indeed, but I can immediately see two problems:
The first one is the need for a huge amount of memory. It would seem that the optimal way of dealing with this is restricting the amount of memory available to the application, otherwise any app can grow to the maximum size allowed by the VM, whether it needs it or not. But this sounds rather crappy to me: now every developer needs to figure out the right limit for the application.
The second is that performance is going to suck when garbage collection is performed. The slowdown could be a lot larger than a single execution of malloc/free, especially if virtual memory is taken into account. The unused half of the memory will often be quickly moved to swap by the VM, especially when the process grows to sizes in the order of hundreds of MB. Then GC will force bringing all that back to RAM, while possibly swapping the previously used half to disk. Exactly the same situation as what's described with heap allocation, but a whole lot worse.
It sounds to me that even if malloc is slower, it's a lot less inconvenient in applications like games, where something that is always slow can be taken into account, but where a sudden run of the GC could be really inconvenient.
But this is not my area of experience, so it's just what came to mind. Can anybody confirm or refute these suspicions?
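As a way to reason about these two concerns, here is a toy model of a two-space copying collector. It is a deliberately simplified sketch and not how HotSpot or any real JVM is implemented: `ToySemispace` is a made-up class, every "object" is two ints (a value and a reference to another object), and the forwarding table is kept in a side array for clarity. It does show the two properties in question: one of the two spaces is always idle, and the work done by a collection is proportional to the live objects it copies, not to the garbage left behind.

```java
import java.util.Arrays;

// Hypothetical toy model of a semispace copying collector.
final class ToySemispace {
    static final int NULL = -1;
    static final int WORDS = 2;   // every object: [value, reference]

    private int[] from;           // space currently allocated into
    private int[] to;             // idle space, used only during collect()
    private int top = 0;          // bump pointer into from-space
    private int toTop = 0;        // bump pointer into to-space while copying

    ToySemispace(int wordsPerSpace) {
        from = new int[wordsPerSpace];
        to = new int[wordsPerSpace];
    }

    /** Bump-pointer allocation; returns NULL when the space is full. */
    int alloc(int value, int reference) {
        if (top + WORDS > from.length) return NULL;
        int addr = top;
        from[addr] = value;
        from[addr + 1] = reference;
        top += WORDS;
        return addr;
    }

    int value(int addr) { return from[addr]; }
    int next(int addr)  { return from[addr + 1]; }
    int usedWords()     { return top; }

    /** Copies everything reachable from roots, updates roots, swaps spaces. */
    void collect(int[] roots) {
        int[] forward = new int[from.length];   // old address -> new address
        Arrays.fill(forward, NULL);
        toTop = 0;
        for (int i = 0; i < roots.length; i++) {
            roots[i] = copy(roots[i], forward);
        }
        // Scan the objects already copied and copy whatever they refer to.
        for (int scan = 0; scan < toTop; scan += WORDS) {
            to[scan + 1] = copy(to[scan + 1], forward);
        }
        int[] tmp = from;
        from = to;
        to = tmp;
        top = toTop;   // unreachable objects are simply left behind
    }

    private int copy(int addr, int[] forward) {
        if (addr == NULL) return NULL;
        if (forward[addr] != NULL) return forward[addr];   // already moved
        int moved = toTop;
        to[moved] = from[addr];
        to[moved + 1] = from[addr + 1];
        toTop += WORDS;
        forward[addr] = moved;
        return moved;
    }

    public static void main(String[] args) {
        ToySemispace heap = new ToySemispace(8);
        int a = heap.alloc(1, NULL);
        int[] roots = { heap.alloc(2, a) };
        heap.alloc(3, NULL);                     // garbage
        heap.collect(roots);
        System.out.println("words in use after collection: " + heap.usedWords());
    }
}
```

Note that the model says nothing about paging behaviour or pause times on a real system; those depend on the actual collector, heap sizing and operating system, which is exactly what the question above is asking about.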
Here is a paper (PostScript) from 1987 on the topic of GC being faster than manual allocation.
The author went on to make a very fast GC that set speed records.
If you are looking for factual arguments, with performance measurements and so on, just look at his work over the last few decades -- you'll see he did a lot of work in these very practical areas.
When you see how productive guys like him can be, it makes me wish that some people would just stay alive, and keep working, for a few hundred years more, instead of our typical mortal lifespans.
http://www.thebricktestament.com/the_law/when_to_
Programmer cycles are expensive.
Indeed. It might be worth (pardon my pun) reiterating what those cycles really are, in regard to application performance.
In all languages I know of, you get some library functions ready-made, and you need to code some stuff yourself.
Most performance problems occur in the code you made yourself.
In my experience, you get most bang for buck when you are able to efficiently allocate your programmer time to a) program a functionally complete draft version, b) optimize those parts which need optimization and c) maintain the program, in a manner which is BALANCED, but biased towards maintenance.
De facto, you get a better balance between those things, and the most bang for buck, using languages such as Java, as opposed to languages such as C++. Java offers a pretty coherent conceptual framework (class libraries) for creating your draft in a maintainable way, and it gives you access to excellent non-invasive performance measurement tools such as YourKit and JProfiler, which let you objectively find out where you need to do performance work.
This means you can do only the optimization work that is necessary, and create optimized packages which extend the default class library interfaces, which in turn means that maintenance programmers generally don't have to put nearly as much work into figuring out how the optimizations affect the draft work.
It's not perfect, but it gets you more bang for buck, which is what matters to you when you manage resources.
Not the default developer perspective, I know.
The article's main point is that Java's memory allocation is faster than malloc, and its garbage collection is better than cumulative frees.
However, that's not the problem. All memory in a Java program has to be allocated dynamically. Other languages offer static memory alternatives. Static memory use will be more efficient in many cases.
The "my language is faster than yours" argument is inherently stupid. There is no "best" language. You need to use the right tool for the right job.
--Barry
I have been doing Java programming for several years now and have ported many C/C++ applications to Java, mostly server-side apps, and I'd say roughly 85% of the time the Java apps outperformed the originals, sometimes by an order of magnitude. Now these were more redesigns than straight ports, and the performance gains were not because Java was any better from a performance standpoint, but because design is more of a factor in speed than the language used, especially for larger applications. Usually when I find big performance hurdles that are hard for me to overcome, I find I would have the same issues in most languages, so finding a better design is usually the solution. If you are writing small to medium apps or mostly GUI apps then I might have reservations about Java, but for larger apps Java is a good choice.
The common code path for new Object() in HotSpot 1.4.2 and later is approximately 10 machine instructions (data provided by Sun; see Resources), whereas the best performing malloc implementations in C require on average between 60 and 100 instructions per call (Detlefs et al.; see Resources).
Wow, that's really shocking. Until you actually look at the Detlefs paper and realize that it was published in 1994, 11 years ago!! Who knows, maybe things have improved a bit in 11 years. The author certainly thinks Java is getting better; maybe it's possible that C/C++ compilers have improved as well.