Java VM & .NET Performance Comparisons
johnhennessy writes "Just came across some good virtual machine performance benchmarks (on Mono's mailing list). It covers executing java bytecode via a host of different Java runtimes and also the mono runtime. Not only does it give numbers for IBM's runtime (1.4.2 and 1.3.1), Sun's runtime (1.5.0 and 1.4.2), GCJ, Mono, Jikes and much more! These numbers are also given for both Intel and Opteron (where relevant). Before the flames begin, don't forget to read the authors description of how the benchmark was carried out. Hopefuly this should inspire educated discussion on ones favourite JVM/CLR."
Everyone else knows it as well as Sun. The language and VM spec are public knowlege.
Sun has a better JVM not because they have some mystical insight because they invented it, but because they made it a huge part of their company survival strategy. Improving the JVM gets a very high priority in the company. IBM on the other hand might not care if they are second, as long as their performance is competitive.
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
Unless I'm reading the chart wrong, it looks like IBM's VM is faster than Sun's for many applications. (It just doesn't seem to have an Opteron version)
The policy of the United States is worse than bad---it is insane. -- Ludwig von Mises, Economic Policy(1959)
Look again. It is faster in the Linpack 500x500 test. That shows there is potential, at least, but obviously there's work to be done.
Besides, there isn't much reason why a JIT should be slower than a natively compiled binary, besides startup time. The code still gets compiled, just at runtime. In fact, since profiling feedback is close at hand, it has an advantage (though newer versions of gcc/gcj can use profiling data to optimize as well).
Yes you are crazy. Don't use java then. Really when I look at C functions I think they are an awful mess of vomit, however I except that this is the aesthetic/style that C expect and I deal with it. If you worked with Java enough you woudln't have to look it up. What I like about java is that since there are so many layers I can take out the FileInputStream reader and throw a NewtworkStream and leave the rest of the code the same since it uses the buffered reader. For exapmle in C i hate how there is fprintf, sprintf, printf. As far as I am concernred those should be all rolled into one function. The point is either except the way that Java is or don't use it: don't make Java like C. I'm sure you can see how much you'd hate C if I tried to make it more like Java.
Your CPU is not doing anything else, at least do something.
You'd think that natively compiled and optimized code would be faster than an interpreter.
Code run in modern Java VM's is not interpreted...it's compiled on the fly with whatever performance optimizations are appropriate for the particular machine at that particular point in the execution path.
Sigh... for the last time: Java code executed with Sun or IBM JVMs is not interpreted, but compiled to native and heavily optimized code. Before anybody complains: yes, the Sun VM uses mixed-mode execution, which means that code is interpreted first until it has been used for a certain amount of times, then it's compiled. It might seem like a contradiction, but that's mostly a performance improvement (turning byte code into native code with optimizations costs time... often more time than it would take to simply interpret the code; not to mention the memory overhead of creating native code for every one-off method). IBMs VMs always JIT compile their code, but they have different levels of JIT Compilers; the first time code is used, a baseline compiler simply transforms byte code into native code. If that code is heavily used, it is optimized and the old code is replaced with the faster version. Again: No Bytecode interpretation, Dynamic compilation! Next week: Why Java != JavaScript
murphee
Because for that to work, every OS would have to support it, and since actually none DO (except maybe Hurd, which nobody uses), ipso facto, it must be done in the libraries. You could of course argue whether it is done in libraries that are visible to programs, or built into the VM itself but that sort of academic.
Furthermore, I don't think it is necessarily true that everything-is-a-file is the best abstraction for all IO forever and ever. If you want such an abstraction simply use java.net.URL and write protocol handlers for any protocols not provided. So far, http[s]:// and file:// (and probably some others) work out of the box. Add your own.
It's 10 PM. Do you know if you're un-American?
The big advantage of VMs and dynamic compilers is that they can take advantage of the CPU that they are running on. Ie. if you use gcj to compile software, you have to compile optimize it for a specific CPU (instruction order is important and makes a difference, as different OutOfOrder engines on different CPUs (think Intel vs. AMD) act differently). It also means more complicated deployment, for instance: if your code uses SSE2, your program only runs on Pentium4 and other CPUs that have SSE2. If you don't want that, you have to factor out the functions that make use of SSE2 and provide replacements that don't use it (which also means having to check the CPU at startup and choosing the libs appropiately).
.NET) is a dynamic environment, ie. you can load classes at runtime. This proves to be a bit of a problem with static compilation, or I should say: with optimization. Hotspot (Suns Dynamic Compiler or VM and also other JVMs) do some optimizations that improve OOP code, like devirtualization (which removes the vtable lookup overhead for virtual method calls) and even virtual method inlining.
This doesn't matter for a compiler working at runtime, as it knows about the capabilities of the CPU and just uses them if possible (I know that the JVM uses SSE when available, though I haven't found out what exactly it is used for).
A JIT/DynamicCompiler also knows everything about your CPU, for instance Cache sizes,... and can exploit that knowledge to further optimize the code or data layout.
Also: you have to keep in mind, that Java (like
Now: these optimizations have one problem: they may be accurate when the code is generated... but what happens when a new class is loaded? For instance, the compiler devirtualizes a method in class A, because Class Hierarchy Analysis tells him, that this method is never overridden. So... 10 minutes later, a class is loaded, that is derived from class A, and overrides the aforementioned method... so... the code is suddenly incorrect. In this case, the dynamic compiler, takes back the old optimization, and everything is OK again.
This kind of scenario makes it necessary for many optimizations to have a system (compiler) working at runtime.
Mind you: Runtime code loading or generation is a point where gcj gets really slow, because then it has to run that code with gij (the interpreter). There are people that got Eclipse compiled with gcj, but they ran into that problem too (the Plugins in Eclipse are loaded at runtime; they worked around that by compiling the plugins as shared libraries).
murphee
Performance is not the biggest reason to use the GCJ. It is that you don't have to have a pre-installed Java Runtime Environment. Sun's JRE license does not allow you to redistribute, so for every install, you have to be sure there's a JRE or download it from the internet. This is most annoying, especially for disconnected networks.
With GCJ-built code, you can put everything needed on a single installation medium.
There's no time to stop for gas, we're already late.
So... instead of RTFM you ask here? Oh sorry... I forgot, this is Slashdot. They aren't seconds, they are some kind of score on the respective benchmarks, the higher the score the better;
murphee
The other replyer did a good job, but missed your question, which was that you have to recompile on every execution.
On SUN, -server means that a full JIT is applied to all code before it is run (though some enhancements might not be possible until a few runs are performed; run-time profiling / enhancements). -client means that code is only JIT'd if it's run more than a few times (meaning it's a hot-spot which is worth trading off compile-time for run-time). While -client does a good amount of inital JITing, you can see that there's not a terrible amount of performance difference between -client and -server. At least not compared to gcj.
This should convince you that the overhead of compiling is nothing compared to the amount of work being executed. In my experience, runing tomcat with -server v.s. -client is night-and-day as far as load-time is concerned (2 to 3 times slower to get into a state ready to accept web-requests), but I don't notice tremendous differences in runtime performance (compilation is hidden in between web requests). What does this say about the differences between tomcat and the benchmark? Again that there must be several orders of magnitude more time spent executing than loading.
If you wanted to write "ls" in java, then yes, you'd prefer something LIKE gcj (don't know what their load time is so I can't say with certainty). Something that does very little, and thus wants a minimalist foot-print is not going to like java. Even perl was too much over-head a few years ago for most small tasks, which is why awk, sed are still in use today, even though perl totally blows them out of the water for even the most trivial tasks. Today machines are fast enough that the human-perceivable difference in latency goes away. Perhaps when we reach 10GHZ VLIW 1024reg CPUs and have 1TB of RAM, the latency of "java -server" will go away too. Note multi-CPU / hyper-threading isn't going to help single-application load-time.
SUN JVM 1.5 has already started to do what you're essentially asking though; they've pre JIT'd most of the rt.jar file. Much of this file is used for even the most trivial possible hello-world application, so it made sense to pre-store that material. I suspect in future versions, we'll see dynamicly cached JIT code, which would tremendously improve start-up time. Of course, with major java applications (web servers, database wrappers) the only time you tend to restart java is when you're upgrading the jar files, so who knows if anyone really cares.
-Michael
What these benchmarks suggest to me is that C#/.NET is fully competitive in terms of performance with the best Java implementations, and C#/Mono is good enough for most work. In fact, given the additional language features of C#, it may well be easier to write fast code for many compute-intensive problems using Mono than Java today.
The level of performance of Mono is even more impressive given how young the project is; at the same point in time in its evolution, Sun had barely managed to produce a JIT compiler.
It's no surprise Sun's JDK seems to perform the best out of the other versions. They know Java the best since they created it,
The techniques necessary for compiling Java well have largely been around since long before Sun even released Java. Sun has no special, secret knowledge there, they have just been hacking on their implementation longer than anybody else. If anything, Sun's progress on Java has been pokey.
Sun Java is probably pretty close to what is theoretically possible, so they will largely be standing still, while other implementations will keep improving until they also hit the limit. In a couple of years, you can expect that all actively developed JVM and CLR implementations will have roughly the same performance on comparable code.
- The default character encoding resulting from your particular combination of JVM and platform will be correct and non-lossy every time the program is run (e.g. not in Windows, which defaults to ISO8859-1), or;
- You're certain that the file contains only 7-bit ASCII.
Readers and Writers deal with characters, whereas InputStreams and OutputStreams deal with bytes. FileReader and FileWriter are miscreants that should be deprecated, because they hide a very important implementation detail, namely that they always use the platform's default character encoding, which is often lossy.If you want to read characters from a file (or socket) you need to come up with some way to agree on the character encoding and specify it precisely. Not even HTTP does a good job of this--you don't know the character encoding of a request or response until the Content-Type header has been transferred, and often not even then.
What's the character encoding for URLs and domain names? Convention seems to be settling on UTF-8 but AFAIK it's just that.
The equivalent technique that's less risky (but of course much more verbose) is:
Where "UTF-8" is a sane default non-lossy character encoding. If you don't know the encoding that was used to write the file you're about to read, you're sort of screwed. You can try some heuristics to try to detect its encoding, or if you're "lucky" you might find a Unicode Byte Order Mark.
Note that none of this headache is particular to Java, it's just that the designers of Java knew early on that a character is not a byte and formalized that distinction (poorly at first) in the language and libraries.
If you'd know what dynamic compilation is, you'd fully expect the opposite. I'm only surprised by the difference in numbers. GCJ is doing much worse than I expected. From the looks of it, the only times it comes close to competing is when executing simplistic numerical benchmarks (which are presumably easy to optimize). What you are seeing here is that runtime optimization actually works when doing real-life complicated stuff.
The use of the word interpreter is really suggestive. A compiler is nothing more than a static interpreter. The only two differences with a JIT compiler is that it permanently stores the results of the interpretation and that it has much less information to predict the performance of the code it is generating. A static compiler will do well on simplistic programs whereas a run-time optimizing jit compiler will be able to always match the performance of a static compiler (simply by running the same optimizations) or beat it (by applying optimizations to address observed performance bottlenecks in the running program). Of course this costs some computation time but you can take that into account as well when deciding to optimize or not.
In simplistic scenarios such as simple benchmarks, there won't be much to gain from run-time optimization. So performance is about equal with a slight advantage for the static compiler because it is not wasting resources on figuring out how to optimize the program. Despite this GCJ's performance is disappointing even on these kinds of benchmarks.
Jilles
For one of my classes, I've been using Eclipse to do Java development. The CS lab at school has only Windows machines (XP Pro; P4-3.2G w/ HT, I think) and Eclipse runs very nicely on these computers.
At home, I run Linux on a P4-2.6G w/ HT. Frankly, Eclipse runs like ass on Linux. I've tested it with pretty much every available JRE/J2SDK for Linux, and it sucks with all of them.
This is using Eclipse with SWT compiled for GTK+. I have used WebSphere with the Motif libraries, and it was quite fast, but Motif is just horrid to use next to GTK+. But, SWT with GTK+ is just so damned slow. I've run Eclipse on my PowerBook G4, and it sucks even more on Mac OS.
I find this funny that Windows is the best supported platform for Java GUI programs, considering that MS hates Java. So, is GUI performance with Java apps ever going to be acceptable on Linux? It's really pathetic at this point.
I know Sun added OpenGL 2D acceleration to their 1.5.0 JRE, but that gives me lots of artifacts with the included demo programs, and SWT doesn't use the acceleration at all.
1, 1, 2, 3, 5, 8, 13, 21 -- Mathematics is the Language of Nature.