Book Review: Java Performance
jkauzlar writes "The standard Oracle JVM has about sixty 'developer' (-XX) options which are directly related to performance monitoring or tuning. With names such as 'UseMPSS' or 'AllocatePrefetchStyle', it's clear that Joe Schmo Code Monkey was not meant to be touching them, at least until he/she learned how the forbidding inner recesses of the JVM work, particularly the garbage collectors and 'just-in-time' compiler. This dense, 600-page book will not only explain these developer options and the underlying JVM technology, but discusses performance, profiling, benchmarking and related tools in surprising breadth and detail. Not all developers will gain from this knowledge and a few will surrender to the book's side-effect of being an insomnia treatment, but for those responsible for maintaining production software, this will be essential reading and a useful long-term reference." Keep reading for the rest of jkauzlar's review.
Java Performance
author
Charlie Hunt and Binu John
pages
693
publisher
Addison Wesley
rating
9/10
reviewer
Joe
ISBN
0-13-290525-6
summary
Java performance monitoring and tuning
In my experience, performance tuning is not something that is given much consideration until a production program blows up and everyone is running around in circles with sirens blaring and red lights flashing. You shouldn't need a crisis, however, before worrying about slow responsiveness or long pauses while the JVM collects garbage at inconvenient times. If there's an opportunity to make something better, if only by five percent, you should take it, and the first step is to be aware of what those opportunities might be.
First off, here's a summary of the different themes covered:
The JVM technology: Chapter 3 in particular is dedicated to explaining, in gory detail, the internal design of the JVM, including the Just-In-Time Compiler and garbage collectors. Being requisite knowledge for anyone hoping to make any use of the rest of the book, especially the JVM tuning options, a reader would hope for this to be explained well, and it is.
JVM Tuning: Now that you know something about compilation and garbage collection, it's time to learn what control you actually have over these internals. As mentioned earlier, there are sixty developer options, as well as several standard options, at your disposal. The authors describe these throughout sections of the book, but summarize each in the first appendix.
Tools: The authors discuss tools useful for monitoring the JVM process at the OS level, tools for monitoring the internals of the JVM, profiling, and heap-dump analysis. When discussing OS tools, they're good about being vendor-neutral and cover Linux as well as Solaris and Windows. When discussing Java-specific tools, they tend to have a bias toward Oracle products, opting, for example, to describe NetBeans' profiler without mentioning Eclipse's. This is a minor complaint.
Benchmarking: But what good would knowledge of tuning and tools be without being able to set appropriate performance expectations? A good chunk of the text is devoted to lessons on the art of writing benchmarks for the JVM and for an assortment of application types.
Written by two engineers from Oracle's Java performance team (one former and one current), this book is as close to being the definitive document on the topic as you can get, and there's not likely to be any detail related to JVM performance that these two men don't already know about.
Unlike most computer books, there's a lot of actual discussion in Java Performance, as opposed to just documentation of features. In other words, there are pages upon pages of imposing text, indicating that you actually need to sit down and read it instead of casually flipping to the parts you need at the moment. The subject matter is dry, and the authors thankfully don't try to disguise this with bad humor or speak down to the reader. In fact, it can be a difficult read at times, but intermediate to advanced developers will pick up on it quickly.
What are the book's shortcomings?
Lack of real-world case studies: Contrived examples are provided here and there, but I'm really, seriously curious to know what the authors, with probably two decades between them consulting on Java performance issues, have accomplished with the outlined techniques. Benchmarking and performance testing can be expensive processes and the main question I'm left with is whether it's actually worth it. The alternatives to performance tuning, which I'm more comfortable with, are rewriting the code or making environmental changes (usually hardware).
3rd Party tool recommendations: The authors have evidently made the decision not to try to wade through the copious choices we have for performance monitoring, profiling, etc., with few exceptions. That's understandable, because 1) they need to keep the number of pages within reasonable limits, and 2) there's a good chance they'll leave out a worthwhile product and have to apologize, or that better products will come along. From my point of view, however, these are still choices I have to make as a developer and it'd be nice to have the information with the text as I'm reading.
As you can see, the problems I have with the book are what is missing from it and not with what's already in there. It's really a fantastic resource and I can't say much more than that the material is extremely important and that if you're looking to improve your understanding of the material, this is the book to get.
You can purchase Java Performance from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
It seems this kind of volatile, deep, undocumented black magic might change radically from JVM revision to revision. Although the Oracle "documentation" page seems to contain a lot of "legacy" options, there still seems to be a risk that this book would be outdated as soon as the next JVM release.
Oh, well, the tech publishing industry seems to be doing pretty well, even if the rate of technology change means that a tech fact is OBE before it's committed to ink.
Welcome to the Panopticon. Used to be a prison, now it's your home.
If you don't like JNI, you can use SWIG, it generates JNI code for you.
There is a _big_ difference between clumsily optimized (or unoptimized) Java and carefully-optimized Java--more, in my experience, than the difference between clumsily optimized Java and clumsily optimized C or C++. So if you are already using Java for some reason (robustness to faults, ease of parallelism of certain kinds (w.r.t. C), library that does exactly what you need, etc.), you should figure out how to optimize it before bailing out and using a different language.
Only if you absolutely must get as much out of your hardware as physically possible should you start using C/C++, and at that point, don't expect to be using ANSI C; you should be issuing SSE4 instructions and such (basically writing targeted assembly, even if you are doing so in a way that looks like C functions) that have been cleverly crafted to do exactly what you need.
(And don't forget that while you are taking extra time to write all this low-level high-performance code, your computers _could_ have been running using the slower code, making progress towards a solution, or serving customers albeit with delays, etc.)
-XX11 is useful as it's +1 faster than -XX10, should be the default really.
The JVM really needs to get smarter. 60 different controls and switches is just too much. How hard can it be for the JVM to look at the available number of cores and just turn on the Parallel Garbage Collector? Do I really have to manually turn it on so that Minecraft will use it? Why can't the JVM allocate more memory on its own? Does it really need permission to use more than 1 Gig of memory? It just sits there waiting for the day some user decides to import every single possible datapoint into it, then crashes, having used 1 Gig of 8, with an "Out of Memory" error. It's not like developers know what the Xmx and Xms settings need to be. They just set them arbitrarily high in hopes that some user doesn't try to find out what the maximum datafile it can take is. That just slows it down and means that when the GC finally does fire off, it has 10x the amount of trash it would have if the value were set lower. Those options are only useful on internal applications that never get into the hands of everyday users. It's probably a great book for server side development, but it is highlighting a major failing of the JVM.
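For anyone wondering what the JVM actually picked on a given machine, here is a minimal sketch that prints the core count, the heap ceiling, the collectors in use and the options the process was started with. The class and method names (`JvmSettingsReport`, `collectorNames`, `maxHeapMegabytes`) are made up for illustration; the calls themselves are standard `java.lang.Runtime` and `java.lang.management` APIs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: reports the settings the running JVM ended up with.
public class JvmSettingsReport {

    /** Names of the garbage collectors the running JVM selected. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Upper bound on the heap the JVM will try to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Cores visible to JVM: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Max heap (MB):        " + maxHeapMegabytes());
        System.out.println("Collectors in use:    " + collectorNames());
        // Only the options given on the command line show up here, not the defaults.
        System.out.println("Options passed:       "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

Running it once with no options and once with something like `java -Xmx512m -XX:+UseParallelGC JvmSettingsReport` should show how much of the configuration comes from defaults and how much from explicit flags. What the defaults are depends on the JVM version and the machine.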
In my experience, performance tuning is not something that is given much consideration until a production program blows up and everyone is running around in circles with sirens blaring and red lights flashing.
- if production blows up it signals that the underlying problem is not likely to be fixed with 'performance tuning'.
There is performance deficiency and then there is "production blows up" and those are different things and must be addressed by different sets of practices at different times.
Production blowing up means the design is flawed, it means misunderstanding of how the application was going to be used.
Slow response time on the other hand is about tuning, but it's very unlikely that environment tuning can help really to fix this.
Back in 2001 I was working for then Optus that was bought out by Symcor and the main project they brought me to (contract) was for this Worldinsure insurance provider, and the project was to do some weird stuff, business wise speaking, allow clients to compare quotes from different insurance providers. Business model was changing all the time, because insurance providers do not want their products to be compared against one another on line (big surprise).
The contract was expensive (5 million) and WI wouldn't pay the last bit (a million I think) until the application would start responding at 200 requests per second, and it was doing 20 or so :)
If anybody thinks that just some VM tuning can fix a problem where an application is 10 times slower than expected, well, you haven't done it for long enough to talk about it then.
It took a month of work (apparently I wrote a comment on it before) that included getting rid of middleware persistence layer, switching to jsp from xslt, reducing session size by a factor of 100, desynchronising some data generators, whatever. Finally it would do 300 requests per second.
But the point is that when things are crashing or when performance is really a huge issue, you won't be optimising the VM.
VM optimisation is not generally done because I think the application has to do something that is not generic.
Imagine an application that only does one thing - say it only reads data from a file and then runs some transformation on it, maybe it's polygon rendering. Well then you know that your app. is doing only ONE THING, then you can probably use VM optimisation, because you can check that the one thing your app does will become faster or more responsive, whatever.
But if your app includes tons of functionality and tons of libraries that you don't have control over and it runs in some weird container on top of JVM, then what do you think you are going to achieve with this?
You likely will optimise something very specific and then you'll introduce a strange imbalance in the system that will hit you later and you won't see it coming at all.
If your app does one thing, maybe you have a distributed cluster with separate instances being responsible for one type of processing, then you probably can use specific optimisation parameters.
You can't handle the truth.
Fair points but I'd say there's a few places that Java just doesn't cut it - strict scheduling and / or real time needs. Oh, and GUI stuff, too.
So basically, no GUI, no audio, no video. Which only more or less leaves what it currently is king of the hill for - server side business processing.
I've tried real time and Java, and the jitter that garbage collection introduces (yes, even with parallel garbage collection and all that stuff) makes it hugely unpredictable.
Strangely enough, the company formerly known as SUN knew this, and tried to create a "real time" flavour of Java with such extensions - but it takes so much effort to port your code (and the JVM is slightly less than brilliant) that you're far better off going to C/C++.
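A rough way to see this kind of jitter for yourself is sketched below: a loop that allocates garbage and records the longest gap between two consecutive iterations, alongside the total collection time the JVM reports. The names (`JitterProbe`, `worstGapNanos`, `totalGcMillis`) are invented for this example, and the measured gap also includes OS scheduling noise, so treat it as an indication rather than a pure GC pause measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe: observes the worst stall in an allocating loop.
public class JitterProbe {

    // Written on every iteration so the allocation cannot be optimized away.
    static volatile byte[] sink;

    /** Runs an allocating loop and returns the longest gap (ns) between iterations. */
    static long worstGapNanos(int iterations) {
        long worst = 0;
        long last = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = new byte[64 * 1024];          // short-lived garbage for the collector
            long now = System.nanoTime();
            worst = Math.max(worst, now - last); // includes GC pauses and OS noise
            last = now;
        }
        return worst;
    }

    /** Sum of the accumulated collection time reported by all collectors, in ms. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();     // -1 when the collector does not report it
            if (t > 0) total += t;
        }
        return total;
    }

    public static void main(String[] args) {
        long worst = worstGapNanos(1_000_000);
        System.out.println("Worst gap between iterations (us): " + worst / 1_000);
        System.out.println("Total GC time so far (ms):         " + totalGcMillis());
    }
}
```

Comparing the worst gap across different collectors and heap sizes gives a feel for how much of the unpredictability is down to the collector.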
Or just use the FFI, Java Native Access.
Ezekiel 23:20
I've worked with Java since 1.0. The only optimization options I've ever used were the heap and stack size adjustments.
Setting your memory heap too high actually degrades performance, oddly enough. I've got 4GB on this box and over 2.5G is normally used by disk cache, but if I allocate more than about 768MB to the heap, the performance suffers.
Maybe some of these options have real effects on certain production code characteristics, but I've found the best performance tuning options are:
So there you have it -- my favourite REAL WORLD, TESTED, and PROVEN TO WORK performance tweaks.
I do not fail; I succeed at finding out what does not work.
Oh, please, neither Java nor C++ is superior to the other. They both have strengths at certain kinds of programming but altogether support extremely similar semantics in most areas. There are very difficult-to-use portions of both Java and C++. You should get yourself some more experience.
Brian Fundakowski Feldman
Bugger. I had modded a bunch of posts on this thread and now have to chuck them away just to counter your ignorance (lest your incorrect assumption spread to others). It's cool you are taking a stab at why Java might limit memory, but unfortunately that guess is not correct - so I hope to set you (and any other readers) straight on why Java has this feature (if you don't understand the feature you'll see it as a limitation, when in fact it is very important for security).
The limit Java imposes on memory ensures that critical resources (memory) are preserved for the system, and gives the Java user a way to limit what any application can do. This is a very important protection if you follow Java's internet model, where Applets (or these days, WebStart applications) allow remote applications to run. System administrators can also partition what any application can do, in case one is rogue (they can't trust those damn application developers, like me) and uses up all the system memory shared between many applications when running on Big Iron. I don't know whether any remote running .NET applications have this protection and would be interested to hear if they do. If they don't, then any .NET application could in theory bring your server (and all the other important services you have running on it) to a crawl (as it swaps memory) or possibly even crash it through memory exhaustion.
More like you didn't read it at all. Right in the summary it says: This dense, 600-page book will not only explain these developer options and the underlying JVM technology, but discusses performance, profiling, benchmarking and related tools in surprising breadth and detail.
Unfortunately, when you really get down to it you will have similar problems in C or C++. You cannot allocate memory from the system heap (malloc or operator new) in the critical path of real-time code. That means everything is preallocated. That means you could have been using Java anyways, with GC disabled. There may be other reasons to use C or C++ for these systems, but the nondeterminism of dynamic memory allocation really applies across the board. It just hits Java users earlier.
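To make the "everything is preallocated" point concrete, here is a minimal sketch of a fixed-size buffer pool in Java: all allocation happens in the constructor, and the acquire/release path only moves references around. `BufferPool` and its methods are hypothetical names for this illustration, and the sketch is single-threaded; a real-time system would also need a thread-safe or per-thread variant.

```java
import java.util.ArrayDeque;

// Hypothetical pool: allocates every buffer up front, none on the hot path.
public class BufferPool {
    private final ArrayDeque<byte[]> free;
    private final int bufferSize;

    BufferPool(int count, int bufferSize) {
        this.bufferSize = bufferSize;
        this.free = new ArrayDeque<>(count);   // capacity reserved once, at startup
        for (int i = 0; i < count; i++) {
            free.push(new byte[bufferSize]);   // all allocation happens here
        }
    }

    /** Hands out a preallocated buffer, or null when the pool is exhausted. */
    byte[] acquire() {
        return free.poll();
    }

    /** Returns a buffer to the pool for reuse. */
    void release(byte[] buf) {
        if (buf.length != bufferSize) {
            throw new IllegalArgumentException("buffer does not belong to this pool");
        }
        free.push(buf);
    }

    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(2, 1024);
        byte[] a = pool.acquire();
        byte[] b = pool.acquire();
        System.out.println("exhausted: " + (pool.acquire() == null));
        pool.release(a);
        pool.release(b);
        System.out.println("available: " + pool.available());
    }
}
```

Returning null on exhaustion rather than allocating a new buffer is the design choice that matters here: the critical path stays allocation-free, and running out becomes an explicit condition the caller has to handle.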
Woah, browsers! Nice one, in that way you left out Qt, which only runs on Windows, Mac and Linux. And blows SWT out of the water in every way.
Have you got your LWN subscription yet?
Maybe you could use JNI for that, in certain very specialized cases, but if you write parts of your application in C/JNI you run the risk of just combining Java's weaknesses (memory, performance) with C's weaknesses (error prone). A null pointer in a native code part of your application will unceremoniously crash your JVM and everything running in it.
JNI is often used for things that can only be done in native code. An example I can think of is the atomic compare-and-swap operations in the java.util.concurrent package. These are implemented via a JNI method essentially calling just one machine instruction (on Intel; it may be a handful on some architectures). Yes, this is for performance reasons: an atomic compare-and-swap is faster than using locks, not because native code runs faster but because it's the only efficient way to implement it.
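From the Java side, compare-and-swap looks roughly like the sketch below: a lock-free counter built on `AtomicInteger.compareAndSet`, which retries when another thread got in first. `CasCounter` and `runDemo` are names made up for this example, and how the JDK maps `compareAndSet` to machine instructions underneath is not something the sketch depends on.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical lock-free counter built on compare-and-swap.
public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Increments without a lock: retry until our CAS wins. */
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            // Succeeds only if nobody changed the value since we read it.
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    int get() {
        return value.get();
    }

    /** Starts the given number of threads, each incrementing perThread times. */
    static int runDemo(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000)); // prints 40000
    }
}
```

In practice `AtomicInteger.incrementAndGet()` does the same job in one call; the explicit loop is spelled out here only to show the compare-and-swap retry pattern.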