Java Native Compilation Examined
An Anonymous Coward writes: "DeveloperWorks has an interesting article about compiling Java apps to run as native appplications on the target machine without the need for a JVM."
← Back to Stories (view on slashdot.org)
Lets see TowerJ has been out since when ? 1999. Having tried my hand at this I have some reservations. The project I was on ... a large dinosaur of a thing which will remain nameless had 12,000 classes which TowerJ turned into C files which were then compiled by GCC. Resulted in 50 megabyte executables on a Sun. Didn't really solve the problem which wasn't really about speed but throughput. The solution was a better design using servlets and Java Webserver ... result DRAMATICALLY faster without any need for native compilation.
Mind you I noticed in the IBM article that the memory footprint was much smaller. That might be nice.
Bitter and proud of it.
Most notably, there is very little support for AWT, making GCJ unsuitable for GUI applications.
That's the real shame of the matter. Java shines most in its ease-of-use for creating complex GUIs -- unfortunately that's also where the worst memory/performance problems appear. For instance, Swing is good for client apps if you can ship with minimum RAM requirements of 64+ mb (and even that's cutting it close). Performance is most important in the UI, where the user notices any lack of responsiveness. Hopefully some Java native compilers will help out here.
Different compilers support differing levels of class library; Excelsior JET is one compiler that claims to completely support AWT and Swing.
Maybe there's hope yet!
I'm not going to try to champion Java, JITs, or native compilation, I'm just going to point out what's wrong with this "study".
This has to be the third or fourth weak study of Java performance I've seen over several years. Issues such as whether or not all Java features are in place in the native compilations (e.g. array bound checking, but note that GCJ turns this on by default) or what sort of optimizations are performed by the native compiler and JVMs are completely ignored. The author also suggests that compiling from bytecode to native code is the natural route when it's quite possible that gains could be made by compiling directly from source to native. While GCJ accepts either Java source or bytecode, it's not clear from the documentation I've read whether or not it first translates source to bytecode or goes straight from source to native.
When comparing disk space, the author comments that an installed JVM can be up to 50 MB vs. 3 MB or so for libgcj. This is a ridiculous comparison since those directories probably include all of the tools, examples and libraries, and as far as I know, libgcj doesn't include the whole J2SE API set, so it's nowhere near a fair comparison. It's a pretty limited set of benchmarks for making these judgements too.
I played around with native compilation of Java a few years ago. At one point (probably around 1996/7?) native compilers could offer noticable gains over Sun's JVM on a numerically intensive application (neural network stuff). However, after the initial versions of HotSpot and IBM's JIT, I couldn't find a native compiler that gave competitive performance on similar apps. I think this is largely due to underdeveloped native compilers with poor optimization (HotSpot is quite insane in its runtime optimizations).
Anyway, I sure hope IBM doesn't pay too generously for studies this poor. Its final message is essentially "who knows?" - not terribly useful.
Alea
I found it interesting that this author, an IBM researcher, chose to only test a single java-to-native compiler, the GCJ (GNU product). This is an immature open-source package that I would not expect much performance from. His paper rehashes a lot of really basic info, then gives some performance results which show IBM's JVM spanking Sun, Kaffe, and GCJ. This is no great surprise; IBM is tooting it's own horn - fine, they deserve to IMHO. But as an exercise in "the state of native compilation" it's useless. What would actually be really useful is a comparison that also included at least a half-dozen other major players in the java native compiler market. I suspect you'd see some different results.
As an aside; I see people call Java "painfully slow," but in my experience it's not that painful post 1.3. I'm not giving you benchmarks, and anti-Java people will just "no" me, but these are my experiences after a few hundred thousand lines of Java code over the past few years. Anyway, it's a good exercise to ask naysayers what _their_ basis is; they often have none.
Also, as other posters have pointed out, the speed loss must be seen in the runtime safety context, as bounds checking and garbage collection yield stability and security dividends and, at the end of the day, we almost always want those things and are willing to wait the extra few ms for the guarantees.
All these complaints about speed are especially ironic given how many massive wasters there are in the modern computer, _especially_ in Windows NT/2k/XP.
But the biggest flaw in this Java vs. C debate is that often you don't get a choice between writing code in Java vs. C/C++, since you don't have the extra 50% in your time budget to do C/C++, and your real choice is between Java and VBScript...
All the people shouting "I can write C++ 10 times as fast as you can write Java, loser" please form a line in an orderly fashion, you will be spanked in the order you arrive...
We're on the road to Tycho.
Lots of experts here.
Some experts who have never used Java want to tell me that it's no good, and will never be any good--why? They don't know, but they know!
And some experts who want to tell me all about why Java's compilation, why it is hard or easy even though they really don't know anything about a compiler.
And some experts on Java's market share who really don't know anything about who uses Java.
And some experts who sat in a room where Java was... gosh gee... being implemented, telling me... well I don't quite know what, but gosh!
So many experts here--I must be reading slashdot!
-
for
(long test=2; test < i; test++)
If the compiler generates slow code for that, something is very wrong in the compiler.{ if (i%test == 0) { return false; }
}
On the safety front, subscript checking is almost free if done right. Subscript checking can usually be hoisted out of loops. Old studies on Pascal optimization showed that 95% of subscript checks can be optimized out at compile time without loss of safety. GCC, though, being a C compiler at heart, isn't likely to know how to do that. Merely putting a Java front end on it doesn't make it a Java compiler deep inside.