Java Performance Tuning, 2nd Ed.
Every developer has written a microbenchmark (a bit of code that does something 100-1000 times in a tight loop and measure the time it takes for the supposed "expensive operation") to try and prove an argument about which way is "more efficient" based on the execution time. The problem, is when running in a dynamic, managed environment like the 1.4.x JVM, there are more factors that you don't control than ones that you do, and it can be difficult to say whether one piece of code will be "more efficient" than another without testing with actual usage patterns. The second edition of Review of Java Performance Tuning provides substantial benchmarks (not just simple microbenchmarks) with thorough coverage of the JDK including loops, exceptions, strings, threading, and even underlying JVM improvements in the 1.4 VM. This book is one of a kind in its scope and completeness.
The Gory Details
The best part of this book is that it not only tells you how fast various standard Java operations are (sorting strings, dealing with exceptions, etc.), but he has kept all of the timing information from the previous edition of the book. This shows you how the VMs performance has changed from version 1.1.8 up to 1.4.0, and it's very clear that things are getting better. The author also breaks out the timing information for 3 different flavors of the 1.4.0 JVM: mixed interpreted/compiled mode (standard), server (with Hotspot), and interpreted mode only (no run time optimization applied).
Part 1 : Lies, Damn Lies and Statistics
The book starts off with three chapters of sage advice about the tools and process of profiling/tuning. Before you spend any time profiling, you have to have a process and a goal. Without setting goals, the tuning process will never end and it will likely never be successful.
The author outlines a general strategy that will give you a great starting point for your tuning task forces. Chapter 2 presents the profiling facilities that are available in the Java VM and how to interpret the results, while chapter 3 covers VM optimizations (different garbage collectors, memory allocation options) and compiler optimizations.
Part 2 : The Basics
Chapters 4-9 cover the nuts and bolts, code-level optimizations that you can implement. Chapter 4 discusses various object allocation tweaks including: lazy initialization, canonicalizing objects, and how to use the different types of references (Phantom, Soft, and Weak) to implement priority object pooling. Chapter 5 tells you more about handling Strings in Java that you ever wanted to know. Converting numbers (floats, decimals, etc) to Strings efficiently, string matching -- it's all here in gory detail with timings and sample code.
This chapter also shows the author's depth and maturity; when presenting his algorithm to convert integers to Strings, he notes that while his implementation previously beat the pants off of Sun's implementation, in 1.3.1/1.4.0 Sun implemented a change that now beats his code. He analyzes the new implementation, discusses why it's faster without losing face. That is just one of many gems in this updated edition of the book. Chapter 6 covers the cost of throwing and catching exceptions, passing parameters to methods and accessing variables of different scopes (instance vs. local) and different types (scalar vs. array). Chapter 7 covers loop optimization with a java bent. The author offers proof that an exception terminated loop, while bad programming style, can offer better performance than more accepted practices.
Chapter 8 covers IO, focusing in on using the proper flavor of java.io class (stream vs. reader, buffered vs. unbuffered) to achieve the best performance for a given situation. The author also covers performance issues with object serialization (used under the hood in most Java distributed computing mechanisms) in detail and wraps up the chapter with a 12 page discussion of how best to use the "new IO" package (java.nio) that was introduced with Java 1.4. Sadly, the author doesn't offer a detailed timing comparison of the 1.4 NIO API to the existing IO API. Chapter 9 covers Java's native sorting implementations and how to extend their framework for your specific application.
PART 3 : Threads, Distributed Computing and Other Topics
Chapters 10-14 covers a grab bag of topics, including threading, proper Collections use, distributed computing paradigms, and an optimization primer that covers full life cycle approaches to optimization. Chapter 10 does a great job of presenting threading, common threading pitfalls (deadlocks, race conditions), and how to solve them for optimal performance (e.g. proper scope of locks, etc).
Chapter 11 provides a wonderful discussion about one of the most powerful parts of the JDK, the Collections API. It includes detailed timings of using ArrayList vs. LinkedList when traversing and building collections. To close the chapter, the author discusses different object caching implementations and their individual performance results.
Chapter 12 gives some general optimization principles (with code samples) for speeding up distributed computing including techniques to minimize the amount of data transferred along with some more practical advice for designing web services and using JDBC.
Chapter 13 deals specifically with designing/architecting applications for performance. It discusses how performance should be addressed in each phase of the development cycle (analysis, design, development, deployment), and offers tips a checklist for your performance initiatives. The puzzling thing about this chapter is why it is presented at the end of the book instead of towards the front, with all of the other process-related material. It makes much more sense to put this material together up front.
Chapter 14 covers various hardware and network aspects that can impact application performance including: network topology, DNS lookups, and machine specs (CPU speed, RAM, disk).
PART 4 : J2EE Performance
Chapters 15-18 deal with performance specifically with the J2EE APIs: EJBs, JDBC, Servlets and JSPs. These chapters are essentially tips or suggested patterns (use coarse-grained EJBs, apply the Value Object pattern, etc) instead of very low-level performance tips and metrics provided in earlier chapters. You could say that the author is getting lazy, but the truth is that due to huge number of combinations of appserver/database vendor combinations, it would be very difficult to establish a meaningful performance baseline without a large testbed.
Chapter 15 is a reiteration of Chapter 1, Tuning Strategy, re-tooled with a J2EE focus. The author reiterates that a good testing strategy determines what to measure, how to measure it, and what the expectations are. From here, the author presents possible solutions including load balancing. This chapter also contains about 1.5 pages about tuning JMS, which seems to have been added to be J2EE 1.3 acronym compliant.
Chapter 16 provides excellent information about JDBC performance strategies. The author presents a proxy implementation to capture accurate profiling data and minimize changes to your code once the profiling effort is over. The author also covers data caching, batch processing and how the different transaction levels can affect JDBC performance.
Chapter 17 covers JSPs and servlets, with very little earth shattering information. The author presents tips such as consider GZipping the content before returning it to the client, and minimize custom tags. This chapter is easily the weakest section of the book: Admittedly, it's difficult to optimize JSPs since much of the actual running code is produced by the interpreter/compiler, but this chapter either needs to be beefed up or dropped from future editions.
Finally, chapter 18 provides a design/architecture-time approach towards EJB performance. The author presents standard EJB patterns that lend themselves towards squeezing greater performance out of the often maligned EJB. The patterns include: data access object, page iterator, service locator, message facade, and others. Again, there's nothing earth shattering in this chapter. Chapter 19 is list of resources with links to articles, books and profiling/optimizing projects and products.
What's Bad?
Since the book has been published, the 1.4.1 VM has been released with the much anticipated concurrent garbage collector. The author mentions that he received an early version of 1.4.1 from Sun to test with. However, the text doesn't state that he used the concurrent garbage collector, so the performance of this new feature isn't indicated by this text.
The J2EE performance chapters aren't as strong as the J2SE chapters. After seeing the statistics and extensive code samples of the J2SE sections, I expected a similar treatment for J2EE. Many of the J2SE performance practices still apply for J2EE (serialization most notably, since that his how EJB, JMS, and RMI ship method parameters/results across the wire), but it would be useful to fortify these chapters with actual performance metrics.
So What's In It For Me?
This book is indispensable for the architect drafting the performance requirements/testing process, and contains sage advice for the programmer as well. It's the most up to date publication dealing specifically with performance of Java applications, and is a one-of-a-kind resource.
You can purchase Java Performance Tuning, 2nd Edition from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Each String is around 64 bytes of memory minimum. What a stupid decision to make such a fundamental data type so heavy weight.
I have noticed my JAVA programs run considerably faster under the Sun Forte/One IDE. Once the JAVA app is on its own (especially through a browser), it slows considerably. Does anyone else have experience with this phenomenon?
stuff |
We ported some of our internal Java business applications to C# for use with Mono, and emperical results already suggest the solution is several times faster than the Java code. The port was very easy, with each line of Java code mapping onto one line of C# or less. Porting the UI to Gtk# was more difficult, but we find the Gtk# code more maintainable and the UI, along with the Gtk+ WIMP plugin integrates much more nicely with Windows than SWING. We'll be investigating a switch to Linux over the next few months for some of our Point-of-Sales terminals as a result, and it should be easy thanks to the portability of Mono and Gtk#.
We also ported some of our backend tools for use with Mono. In use with the newly released Mono JIT runtime, Mini, we've achieved some truly stunning results. It turns out that some of the optimisations in the new JIT are better than those used by GCC, so once the code is loaded in memory, it performs better than raw C code. Although I don't yet have hard numbers to back up these result (the transition is still in progress), it has to be said that Mono is the real answer to Java performance. Being Open Source, we can also contribute back to the runtime to make it better suit our needs. It also plays nicely with RedHat 9's NPTL threading implementation, which is more than I can say for the current crop of Java JREs.
Why would the users care whether you use a J2EE or PHP4 backend? You must be thinking of Java applets which very few few people actually use for any "web page development". And the chapters 15-18 that talk about J2EE, what's wrong with using J2EE for the generation of webpages and content?
Previously, the startup slowdown was due to the system having to load, verify, and link the twenty or so classes a simple program depends upon. Pjava and J2ME-CDC solved that by storing an image of the heap with the system classes already loaded, verified, and linked (and quickened) so the system was run-ready almost immediately. I wonder if the J2SE folks picked up on that? Alternatively, they could just be skipping the verify for those classes in the signed rt.jar, and offline preverify them prior to signature - the verifier always was the slow part of the process.
Your point about threads is well taken, and applies more generally to much of java programming. Java's language and libraries make it all to easy to write architecturally-slow programs - you really still have to fully understand what you're doing in order to write a decent program, regardless of the language.
## W.Finlay McWalter ## http://www.mcwalter.org ##
I challenge you to make a C++/C# application that is thread-safe and can scale to millions of pageviews per day without writing a ton of supporting code. With a good J2EE app server, a java coder essentially just has to wrap his thread-unsafe code in a syncronized() statement, and he's done-- his app is now thread-safe.
Additionally, the "cross-platform doesn't matter for sysadmins" is a false statement; our CIO asked our net ops group "what would be the impact of us moving to an Intel platform?" and our sysadmins (after consulting with the coders) replied "absolutely no impact". That made our CIO very, very happy. Again, I challenge you to move your C++ apps from Solaris to Linux, or even to Windows, without any hiccup.
All of these other arguments are very specious: "I don't have enough RAM" will get you a reply of "go down to Fry's and spend $125 on another GB" every time. Processor speeds, even on Sun boxes, is getting to the point where the processor will never be a bottleneck for anything. Sure, java won't run as fast as a natively-compiled app. Neither will perl, php, tcl, or what have you. Raw processor speed is not as important when you have a couple of GHz to play with.
For most of theses transformations to be correct, the compiler has to prove there is only one pointer to the object -- in the whole program. Whole-program analyses are expensive, and so are point-to analysis. And there is just not that kind of time to spare in a JIT, where every second spent analysing the program is time spent not executing it.
The optimization the book proposes are all hit-or-miss adventures. Even for a programmer with intimate knowledge of the code, it is sometime difficult to predict if a change will help or imper performance. The compiler has even less chance to do so correctly -- and nobody like a compiler which slows down their code trying to optimize it.
This post was compiled with `% gec -O`. email me if you need the sources
How much does that extra development time cost?
Writing ones' own java.lang.String takes time. Writing routines to convert com.donkeybollocks.String to java.lang.String and back again takes time. Supporting it takes time. And time is money. Me, I'd rather spend an extra £100 on a faster processor, or a Gb of RAM, and take a 25% performance improvement.
Come on guys, one of the major wins of the OO methodology is code reuse. Time was when programmers would always have to write their own I/O routines - I thought those days were long-gone. Rewriting fundamental parts of the Java API is just plain silly, unless it has a bug or a serious limitation (eg, it's non-threadsafe).
The more advanced the technology, the more open it is to primitive attack