Java Urban Performance Legends
An anonymous reader writes "Programmers agonize over whether to allocate on the stack or on the heap.
Some people think garbage collection will never be as efficient as direct memory management, and others feel it is easier to clean up a mess in one big batch than to pick up individual pieces of dust throughout the day. This article pokes some holes in the oft-repeated performance myth of slow allocation in JVMs."
JVMs are surprisingly good at figuring out things that we used to assume only the developer could know.
:) ) they weren't. The performance of the language has greatly improved while the perception of language has remained the roughly same (at least amoung the general coding community).
Yes they are. Now. 10 years ago when Java applets were being embedded in webpages (to show rippling water below a logo
Just goes to show that even if you have a great technical product you'll still need the marketdriods. Unfortunately.
http://twitter.com/onion2k
These articles keep popping up flatly stating that Java's slowness is a myth. But, no matter how many times you say it is a myth and how hard you try to create a new perception, the FACT is that people's real-world experience, no matter how anecdotal, consistently demonstrates that Java is MASSIVELY slow than similar apps in C or C++.
Java is slower. Don't even get me started on C#.
You so called counter argument is 6 years old... a lot of it weren't true to begin with, and a lot of things have happened since then.
In the end though, the MOST important thing is that these days, processor cycles are cheap. Programmer cycles are expensive. Therefore it makes sense to sacrifice a little bit of program performance to get more productive programmers.
Being bitter is drinking poison and hoping someone else will die
If you spend alot of time using stuff like valgrind and searching for memory leaks, then you are doing something very wrong. Its really not hard to just free the memory you allocate. The fact that people can write entire operating systems in C without having these problems indicates that the problems are with people, not C.
Of course, that's why I program everything in machine language. Memory allocation in hex opcodes worked for me in 1981, it still works in 2005. Programming languages don't leak memory, programmers leak memory.
In seriousness, look at the bugzillas and changelogs of any programming project. Memory leaks abound. So the reality is that memory de/allocation programming eats quite a lot of productivity, and impedes many otherwise ready releases. Of course we can de/allocate memory properly in C. And we can compute with scientific calculators. When we've got something better, we use it.
--
make install -not war
1. Make it work.
2. Make it work well.
3. Make it work fast
And #3 is the most interesting... how fast is fast? In an absolute sense, sure, C/C++ will always be faster. But does the end user notice this in a webapp? NO!
I have a p3 450mhz box running Tomcat/MySql. It serves my local applications fast enough. The server can render the page much more quickly than my browser (on a p4 1.5ghz box) can render it. As a webapp, java on an old machine is plenty fast.
Java as a desktop GUI is an altogether different story, but I'm not using java on the desktop. This point is moot to me.
"Fast enough" to not be noticeable. That is the secret of #3. In a webapp, this is easily achieved in java.
Then use Delphi, or better yet, C#. (or even Python and a few other choices)
Faster productivity, less bugs, no ram guzzling 5 minute startup. Java isn't the only language that reduces development time, it's just the only one (besides VB) that makes you sacrifice big things to get it.
Also, let us not forget that java has memory leaks too and in my opinion they are much less obvious than C++ memory leaks.
A *good* C++ programmer will probably write code that outperforms the equivalent in Java. A *good* C+ programmer will remember to deallocate all of his objects to prevent memory leaks. A *good* C++ programmer will copy his strings correctly to prevent buffer overflow exploits.
If you have been involved in developmnent for any reasonable amount of time or worked on projects of reasonable size, you know that *good* programmers are hard to come by. When you add the real world to the picture you find that simple things like garbage collection and a virtual machine can make a mediocre java programmer outperform a mediocre C++ programmer.
If schools actually learned to produce good programmers, and HR departments learned how to identify them, and job interviews verified them, we wouldn't be having this discussion.
----- If communism is a system where the government owns business, what do you call a system where business owns govern
Valid point... for 1999. But Java has been through a lot of changes since "the original JBuilder" came out. There have been three major changes to the class libraries (1.2, 1.4, and the 5.0).
Honestly, I understand why people are down on Java -- it's because there have always been two strengths to the Java "platform":
That said, I think you hit the nail ont the head -- people are still thinking in terms of 1999 and not 2005 when they think of Java -- at least on Slashdot. But the typical Java developer is busy writing enterprise apps, not writing kernel code, device drivers, or anything else that requires C/C++.
You're right, Java does have memory leaks, in a sense, but these same problems can arise in any language. You could have a cache implemented in C/C++ that is never properly emptied, so over time it fills up. Or you could have a request log where, due to a bug, requests are not always removed from the log after they are processed. Sure, you might have included code to free the objects in the cache or log when you are done with them, but the programming error is that you forgot to remove the item. This is not a language issue, this is just a programming mistake.
At least in Java you only need to worry about one problem: removing items from caches, logs, queues, etc when you are done with them; in C++ you have to also worry about de-allocating that object, which requires strict planning since the object must not be referred to anywhere else when you de-allocate it. In general you can't prove that the object isn't still referred-to by someone who had it before it was put into the cache/log/whatever; instead you need to establish rules and write documents to make sure everyone knows what the lifecycle of an object should be.
As for your company not having a memory bug logged in 3.7 years, that's great, but without knowing what kind of applications you write, or how they're used, it's hard to say if that's significant. Some programs, like, say, grep, or mv or ls or any one-time-use utilities may be rife with memory leaks, but who cares? When the program terminates the memory is freed, and the user will likely not notice the leak. But even if your apps are long-running network services or whatever, I bet the time spent by developers preventing memory leaks is significant, even if that time is completely spread out throughout the development cycle instead of in large, easy-to-spot chunks running Valgrind or some other profiler.
Please forward me the resumes of these programmers, which include "Skills" like "I never write memory leaks". And whose "Certifications" include "Good Programmer", not those other resumes which say "Bad Programmer".
The fact is that Java programs include more intelligence about programming from the compiler and the JVM. So programmers are more productive than with C. Which means that the same programmer budget can produce more product, or a lower budget can produce the same product. With distributed teams, outsourcing, OSS integration projects, those considerations are crucial to project success. Sure, quality projects also require good design and management. But Java eliminates some quality problems right out of the box. That can't be dismissed merely by saying "hire good programmers, not bad". Because it's a continuum, with a sliding scale of costs that isn't always proportional to productivity.
--
make install -not war
Unfortunately, programs such as Azureus, which run piggishly slow on a 1.2GHz laptop do nothing to dispell these 'performance myths'.
Manually determining how much memory to give to an application was one of the things that us Mac users were more than happy to leave behind with the transition to OS X. Hell, we were more than happy to kill that "feature." Dancing on its coffin and all that. Are you seriously suggesting the world go back?
Besides, I thought the biggest examples of Java was that it was all cool and dynamic and took care of things for you. It seems fair to me to expect such a language/platform to be smart enough to figure out that it's not polite to walk to the buffet, heap three plates with a ridiculous pile of food, and go back to the table and graze on a few pieces of celery, leaving the rest of the food untouched.
Which is the point.
Systems programmers write systems code. There is no one size fits all. There is no silver bullet. Comparing out-of-the-box C/C++ to out-of-the-box Java is a non-starter in my opinion because I've never used out-of-the-box C/C++ for large scale performance applications. What Google did in writing custom systems software is something that cannot happen with Java and is the accepted practice for C/C++.
Java programmers write applications in a "one-size-fits-all" performance environment. Comparing Java to C/C++ is like comparing apples to oranges.
Serious C/C++ systems programmers write their own malloc and systems software.
You can't just compare Java allocation to:
Foo *foo = new Foo();
You also have to compare it to:
Foo foo;
Experienced C++ programmers will prefer the latter both because it is faster and because you don't have to worry about freeing it. One of the problems with a lot of the "See, Java's faster than C++!" comparisons is that they tend to translate directly from Java to C++, ignoring the things you can do in C++ but not Java (like stack allocation) that have a huge performance impact.
The cake is a pie
Except, you're wrong. You're assuming that the compiler is doing *no* optimization. So, how would a compiler treat the first "inefficient" code example? Let's take a look:
public double getDistanceFrom(Component other) {
Point otherLocation = other.getLocation();
int deltaX = otherLocation.x - location.x;
int deltaY = otherLocation.getY() - location.getY();
return Math.sqrt(deltaX*deltaX + deltaY*deltaY);
}
Look at the implementation of getLocation(). It's a simple non-recursive procedure. In virtually all compilers, a simple procedure like this is going to be inlined, producing:
public double getDistanceFrom(Component other) {
Point otherLocation = new Point(other.location) ;
int deltaX = otherLocation.getX() - location.getX();
int deltaY = otherLocation.getY() - location.getY();
return Math.sqrt(deltaX*deltaX + deltaY*deltaY);
}
(Note that to the compiler, all variables are effectively public, so this would not be an access violation.)
Next, the compiler can (easily) prove (by inspection) that Point(Point) is a copying constructor. As a result, it can replace the use of new Point(other.location) with other.location so long as it proves it will not modify otherLocation, and of course, it can prove this quite easily by inspection of getDistanceFrom(). This results in:
public double getDistanceFrom(Component other) {
Point otherLocation = other.location ;
int deltaX = otherLocation.getX() - location.getX();
int deltaY = otherLocation.getY() - location.getY();
return Math.sqrt(deltaX*deltaX + deltaY*deltaY);
}
Also, note that getX() and getY() are also simple non-recursive leaf procedures, so the compiler would have inlined them at the same time, so you would actually get code equivalent to this:
public double getDistanceFrom(Component other) {
Point otherLocation = other.location ;
int deltaX = otherLocation.x - location.y;
int deltaY = otherLocation.y - location.y;
return Math.sqrt(deltaX*deltaX + deltaY*deltaY);
}
Which is now faster that your hand-written "optimization." Note in fact, that you "over-optimized" by replacing otherLocation with other.location twice. If a compiler were actually dumb enough to implement it that way, literally, it would have involved an extra (and needless) pointer dereference to get the value of other.location twice, when it already had it sitting in a register. (Fortunally, most any compiler nowadays would have caught your "optimization" and fixed it.)
In fact, the compiler wouldn't stop here. It might even rearrange the order in which it fetches x and y to avoid cache pollution and misses. Do you consider that each time you fetch a variable? Most programmers don't, and shouldn't. The moral of the story is that most programmers fail to realize how smart compilers have become. Compiler writers design optimizations with "good coding practices" such as this in mind, and the programmer will be rewarded if he or she uses them.
See, that right there is a huge chunk of the whole problem. If programmers weren't being lazy and arrogant jerks and assuming that nobody cares about how poorly designed and coded their algorithms are, or how blatently horrible their memory usage is, we would have a lot fewer problems.
I don't care that Java will get there at some point, or that it's better now. I care about whether it actually works right, and right now it does not. It does have too long a startup, most of the UI toolkits are horribly slow, it does not follow native platform UI conventions, and it is just not as nice to use a Java app as it is an equally well written native app. It does take far too memory, especially compared to other languages. Even the cross-platform nature of the code is largely a lie. Different JREs don't support the same things, and many platforms lack a JRE. There are massive differences between the various editions of Java so as to make it useless to try to write something across them. Your Java app written against "Vendor X Version Y Edition Z JRE" should work fine on any other platform with the same version of "Vendor X Version Y Edition Z JRE". That's all you can safely say without a good amount more testing.
However, most of all, my CPU time and RAM might be cheap, but they are abused every time I use an application written in that mindset. I have memory needlessly abused, and CPU time needlessly wasted every time I run some poorly coded program. That means that I lose time and productivity every time I run your poorly written app. That attitude pisses me off, why we should all squander our system resources so that a few lazy programmers can write some app a little faster and with less effort, rather than doing their job right.
Programmer cycles are a lot cheaper than making thousands of people upgrade their hardware. I imagine those programmer cycles are a few orders of magnitude cheaper.
Summary: "If JVMs were smart, garbage collection would be fast."
Reality: "JVMs are mostly very stupid, and you can never be sure what JVM your users are going to use, so in the real world of deployed applications garbage collection performance--and Java performance generally--is a nightmare."
I am so tired of GC advocates talking smugly about theoretical scenarios. Who cares?. When I can run a Java app on an arbitrary JVM and not have it come to a grinding halt every once in a while as the garbage collector runs--or worse yet bring the machine to a grinding halt because the garbage collector never runs--only then will GC will be useful.
The weasel-words in the article are worthy of a PHB: "the garbage collection approach tends to deal with memory management in large batches" Translation: "I wish GC dealt with memory management in large chunks, but it doesn't, so I can't in all honesty say it does, but I can imagine a theoretical scenario where it does, so I'll talk about that theoretical scenario that I wish was real instead of what is actually real."
This is not to say that there aren't one or two decent JVMs out there that have decent GC performance. But having managed a large team that deployed a very powerful Java data analysis and visualization application, and having done work in the code myself, and having had to deal with user's real-world performance issues and having seen the incredible hoops my team had to go through to get decent performance, I can honestly say that up until last year, at least, Java was Not There with regard to GC and performance.
The most telling proof: my team did such a good job and our app was so fast that many users didn't believe it was written in Java. It was users making that judgement, not developers. Users whose only exposure to Java was as users, and whose empirical observation of the language was that it resulted in extremely slow apps. They didn't observe that it was theoretically possible to write slow apps. They observed that it was really, really easy to write slow apps, in the same way it's really easy to write apps that fall over in C++, despite the fact that theoretically you can write C++ apps that never leak or crash due to developers screwing up memory.
Every language has its strengths. Java is a good, robust language that is safe to put into the hands of junior developers to do things that would take a senior developer to do in C++. But its poor performance isn't a myth, nor is its tendency to hog system resources due to poor GC. Those are emprical facts, and this article introduces no actual data to demonstrate otherwise.
Blasphemy is a human right. Blasphemophobia kills.