Java Performance Urban Legends
An anonymous reader writes "Urban legends are kind of like mind viruses; even though we know they are probably not true, we often can't resist the urge to retell them (and thus infect other gullible "hosts") because they make for such good storytelling. Most urban legends have some basis in fact, which only makes them harder to stamp out. Unfortunately, many pointers and tips about Java performance tuning are a lot like urban legends -- someone, somewhere, passes on a "tip" that has (or had) some basis in fact, but through its continued retelling, has lost what truth it once contained. This article examines some of these urban performance legends and sets the record straight."
has poor start up time, and requires an absolutely massive amount of memory. That, and garbage collection makes almost-real time ("soft" real time I believe is the technical term) UIs more difficult than they should be.
Oh, good, another one to shoot down. While I don't have any numbers at all, I know that Apple 'fixed' this problem to an extent by making parts of java shared, just like any shlibs. This alleviates the 14 apps, 14 bags of shit problem to some extent.
Apple then returned the changes to SUN, who rolled them into 1.4.x.
I wish I had numbers. Sorry.
I think he's talking about something like this:
/* snip */
while(true)
{
Object o = new Object();
o = null;
}
The GC won't free the memory in realtime (or, sometimes, ever), as would be the case for C++/C with new/delete malloc/free.
Do you even lift?
These aren't the 'roids you're looking for.
I've worked on two embedded projects using Java on low power (energy consumption/CPU performance both) platforms. Both projects had amazingly similar things happen. I stated up from, "Java is interpreted; it will be slower than the C code of the previous project on the platform, potentially significant."
The reply, "We don't care about performance."
Four months later... "Why the hell is your code so slow?"
Interpreted is as interpreted does.
I used to wonder what was so holy about a silent night, now I have a child.
In a real project, using JDK 1.3 on various platforms, we had performance issues. So, we measured speed in various ways, and found three main problems.
1: Synchronization.
This is slow. Really slow. And it just gets worse when you're running on dual or quad processor machines. StringBuffer is a major offender; in a lame attempt to save one object allocation, it uses a simple reference counting device which requires synchronization for operations as trivial as appending a character. Writing a simple UnsynchronizedStringBuffer gave a measurable performance boost.
2: Object creation
This is the real problem. GC is slow. GC on SMP machines is still really slow in JDK 1.3 -- maybe JDK 1.4 is better, my experience is a little out of date. By rewriting large chunks of code to create fewer objects (often by using arrays of primitives) we made it much faster -- close to twice as fast, if memory serves.
3: Immutable objects
Yes, these add to GC, and so are bad for performance. But not such a great evil, so long as you don't overuse them.
Funny that the article "debunks" these myths without figures, when our thorough measurements showed that the problems are real, and in our case would have killed our chances of meeting performance targets had we not found them and dealt with them.
Some bigger issues for server-side design: be careful how you use remote calls (such as RMI) and how you use persistence (such as JDO). But the small things, which the article seems to misrepresent, matter too.
I've used Java on embedded applications, on systems that create lots, and lots of objects. And I don't recall ever running out of memory, if there wasn't a bug in the Java program.
But I'm not saying you're lying or wrong, only that a well tuned, well supported, JVM doesn't do this.
the opportunistic garbage collection of C/C++ simply leads to better performance than any language that tries to do the garbage collection for you.
What opportunistic garbage collection of C/C++? You mean delete and free? Get real! Personally, I wouldn't trust the average programmer to even collect garbage correctly more than half the time, and that doesn't cut it. I've had way, way less problems with Java GC than I've ever had with C/C++ in a realtime system. People have spent weeks finding memory leaks; and one time a leak I found was a ghastly C++ compiler bug where the compiler screwed up the automatic destructors on unnamed objects.
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"One of the reasons is that interactions with caches are hard to model, making it hard to know what to do to minimize problems. Caching is, inherently, a deal with the devil: you get speed but lose understanding. Sometimes you lose the speed too. Even when you understand, there's not much you can do. Sometimes complicated stuff is inherently expensive.
When I say caching, I mean not just CPU caches of RAM, but also RAM caches of (potentially swapped-out) process space. If you allow a naive garbage-collector to operate freely, it will happily consume the entire address space available, typically the sum of available RAM and swap space, before garbage-collecting, so the process will run not from RAM but from swap. When it garbage-collects, too, it has to walk a lot of that memory, and swap it all in.
Just running "ulimit -d" in the shell where the java (or other GC-language) program runs can help a lot. It will GC a lot more often, but if nothing is swapped out, the GC happens a lot faster, and the program's regular execution doesn't have to touch swapped-out pages. You have to know a lot about the program and the data it uses to guess the right ulimit value, and if you guess wrong the program fails, but a thousand-fold speed improvement earns a lot of forgiveness.
Did you really believe garbage collection would mean you don't have to know about memory management? It makes memory management harder, because the problem remains but there's less you can do about it. (For trivial programs it doesn't matter. If you only write trivial programs, though, you might as well find some other job.)
There's a similar effect with the CPU cache and RAM. Ideally you want the program code and the data it operates on all to live in cache, because touching the RAM takes 100 times as long at touching cache. With bytecodes, you have a lot more "cache pressure" -- you have the bytecodes themselves, the just-in-time compiler, and the native code it generates. At the same time, since your memory manager generally can't re-use memory that you just freed, it allocates other memory that, when touched, pushes out something else that was useful (such as program code).
The result is that no matter how clever the JVM is, there's not much it can do to get the performance of real programs close to optimal, or even within a pleasing fraction of equivalent C++ code. This despite all the toy benchmarks that seem to prove otherwise, and which carefully avoid all these real-world problems.
Of all the promised features of Java (like Lisp before it and C# after it), we're left with the sole remaining feature, that its virtual machine specifies precisely (or abstracts away) enough details of the runtime environment that the code is more portable than a faster native implementation, and the code might get written faster for the author having avoided thinking about details that affect performance.
The sole saving grace is that most programs don't have much need to run very fast anyway, or if they do it's hard to prove that they ought to run faster. Most people take what they get without complaining, or without complaining to anybody who cares, or without doing anything to make whoever is responsible uncomfortable enough to have to do anything differently. A whole generation trained to accept programs that crash daily or hourly is thrilled to find a program whose biggest problem is that they suspect it might be sluggish.