Java Performance Tuning, 2nd Ed.


Posted by timothy on Wednesday April 9, 2003 @04:00AM from the percolation dept.

cpfeifer writes "Performance has been the albatross around Java's neck for a long time. It's a popular subject when developers get together: "Don't use Vector, use ArrayList, it's more efficient." "Don't concatenate Strings, use a StringBuffer, it's more efficient." It's a chance for the experienced developers to sit around the design campfire and tell ghost stories of previous projects where they implemented their own basic data structures {String, Linked List...} that were anywhere from 10-50% faster than the JDK implementation (and in the grand oral tradition of tall tales, it gets a little more efficient every time they tell it)." Want to kill the albatross? Read on for the rest of cpfeifer's review of O'Reilly's Java Performance Tuning, now in its 2nd edition.

Java Performance Tuning, 2nd Edition
author: Jack Shirazi
pages: 570
publisher: O'Reilly and Associates
rating: 9/10
reviewer: cpfeifer
ISBN: 0596003773
summary: It's the most up to date publication dealing specifically with performance of Java applications, and is a one of a kind resource.

Every developer has written a microbenchmark (a bit of code that does something 100-1000 times in a tight loop and measures the time it takes for the supposed "expensive operation") to try to prove an argument about which way is "more efficient" based on the execution time. The problem is that when running in a dynamic, managed environment like the 1.4.x JVM, there are more factors that you don't control than ones that you do, and it can be difficult to say whether one piece of code will be "more efficient" than another without testing with actual usage patterns. The second edition of Java Performance Tuning provides substantial benchmarks (not just simple microbenchmarks) with thorough coverage of the JDK including loops, exceptions, strings, threading, and even underlying JVM improvements in the 1.4 VM. This book is one of a kind in its scope and completeness.
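
To make the idea concrete, here is a minimal sketch of the kind of microbenchmark described above. The class and method names are hypothetical and nothing here is taken from the book; it uses modern Java (System.nanoTime, underscore literals) rather than the 1.4 API of the review's era. It adds a warm-up pass and consumes every result, two common precautions, but it still measures one operation on one JVM and says nothing about real usage patterns.

```java
// Hypothetical harness for illustration; not code from the book.
class MicroBenchSketch {
    // The supposed "expensive operation": build a string of n digits.
    static String buildDigits(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i % 10);
        }
        return sb.toString();
    }

    // Runs the operation `iterations` times and returns elapsed nanoseconds.
    static long timeRun(int iterations, int n) {
        long start = System.nanoTime();
        long sink = 0; // consume each result so the work cannot be optimized away
        for (int i = 0; i < iterations; i++) {
            sink += buildDigits(n).length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink != (long) iterations * n) {
            throw new IllegalStateException("unexpected result");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        timeRun(20_000, 100); // warm-up pass, timing discarded
        for (int round = 1; round <= 5; round++) {
            System.out.println("round " + round + ": " + timeRun(20_000, 100) + " ns");
        }
    }
}
```

Printing several rounds rather than a single number makes the run-to-run variation visible, which is usually the first hint that the factors you don't control are dominating the measurement.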

The Gory Details
The best part of this book is that it not only tells you how fast various standard Java operations are (sorting strings, dealing with exceptions, etc.), but also keeps all of the timing information from the previous edition. This shows you how the VM's performance has changed from version 1.1.8 up to 1.4.0, and it's very clear that things are getting better. The author also breaks out the timing information for 3 different flavors of the 1.4.0 JVM: mixed interpreted/compiled mode (standard), server (with Hotspot), and interpreted mode only (no run time optimization applied).

Part 1 : Lies, Damn Lies and Statistics
The book starts off with three chapters of sage advice about the tools and process of profiling/tuning. Before you spend any time profiling, you have to have a process and a goal. Without setting goals, the tuning process will never end and it will likely never be successful.

The author outlines a general strategy that will give you a great starting point for your tuning task forces. Chapter 2 presents the profiling facilities that are available in the Java VM and how to interpret the results, while chapter 3 covers VM optimizations (different garbage collectors, memory allocation options) and compiler optimizations.

Part 2 : The Basics
Chapters 4-9 cover the nuts and bolts: code-level optimizations that you can implement. Chapter 4 discusses various object allocation tweaks including lazy initialization, canonicalizing objects, and how to use the different types of references (Phantom, Soft, and Weak) to implement priority object pooling. Chapter 5 tells you more about handling Strings in Java than you ever wanted to know. Converting numbers (floats, decimals, etc) to Strings efficiently, string matching -- it's all here in gory detail with timings and sample code.
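
The allocation techniques named above can be sketched roughly as follows. This is an illustration under assumptions, not the book's code: the class, field, and method names are hypothetical, the sizes are arbitrary, and the syntax is modern Java (generics) rather than 1.4. It shows lazy initialization, a simple canonicalizing map, and a SoftReference-backed cache whose contents the garbage collector is allowed to reclaim under memory pressure.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; a sketch of three allocation tweaks.
class AllocationSketch {
    // Lazy initialization: the table is only allocated on first use.
    private int[] lookupTable;

    int[] lookupTable() {
        if (lookupTable == null) {
            lookupTable = new int[256];
        }
        return lookupTable;
    }

    // Canonicalization: equal values share one instance, so callers can
    // hold fewer duplicate objects.
    private static final Map<String, String> CANONICAL = new HashMap<>();

    static synchronized String canonical(String value) {
        String existing = CANONICAL.get(value);
        if (existing == null) {
            CANONICAL.put(value, value);
            existing = value;
        }
        return existing;
    }

    // Soft reference: the collector may clear the cached block when memory
    // is tight, in which case it is simply rebuilt on the next request.
    private SoftReference<byte[]> cachedBlock = new SoftReference<>(null);

    byte[] block() {
        byte[] block = cachedBlock.get();
        if (block == null) {
            block = new byte[4096];
            cachedBlock = new SoftReference<>(block);
        }
        return block;
    }

    public static void main(String[] args) {
        AllocationSketch sketch = new AllocationSketch();
        System.out.println(sketch.lookupTable() == sketch.lookupTable());
        System.out.println(canonical(new String("key")) == canonical(new String("key")));
        System.out.println(sketch.block().length);
    }
}
```

Note that the canonical map above holds strong references and never shrinks; pairing it with weak references is the usual refinement, which is presumably where the chapter's discussion of reference types comes in.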

This chapter also shows the author's depth and maturity; when presenting his algorithm to convert integers to Strings, he notes that while his implementation previously beat the pants off of Sun's implementation, in 1.3.1/1.4.0 Sun implemented a change that now beats his code. He analyzes the new implementation and discusses why it's faster without losing face. That is just one of many gems in this updated edition of the book. Chapter 6 covers the cost of throwing and catching exceptions, passing parameters to methods and accessing variables of different scopes (instance vs. local) and different types (scalar vs. array). Chapter 7 covers loop optimization with a Java bent. The author offers proof that an exception-terminated loop, while bad programming style, can offer better performance than more accepted practices.
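
For readers who have not seen one, an exception-terminated loop looks roughly like the second method below. This is a sketch with hypothetical names, not the book's benchmark. The first method uses the conventional bounds test; the second drops the test and relies on the array's own bounds check to end the loop. Whether the second form is actually faster depends on the JVM and on the array size, so treat it as something to measure rather than a rule.

```java
// Hypothetical names; illustrates the two loop styles, not their timings.
class LoopSketch {
    // Conventional loop: compares the index against the length on every pass.
    static long sumChecked(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }

    // Exception-terminated loop: no explicit test; running off the end of
    // the array throws, and the catch block marks the end of the loop.
    static long sumExceptionTerminated(int[] data) {
        long total = 0;
        try {
            for (int i = 0; ; i++) {
                total += data[i];
            }
        } catch (ArrayIndexOutOfBoundsException endOfArray) {
            // Reaching here is the normal exit path for this style of loop.
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 7;
        }
        System.out.println(sumChecked(data) == sumExceptionTerminated(data));
    }
}
```

Creating the exception has a fixed cost, so any gain can only show up when the loop body runs many times per exception thrown.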

Chapter 8 covers IO, focusing on using the proper flavor of java.io class (stream vs. reader, buffered vs. unbuffered) to achieve the best performance for a given situation. The author also covers performance issues with object serialization (used under the hood in most Java distributed computing mechanisms) in detail and wraps up the chapter with a 12-page discussion of how best to use the "new IO" package (java.nio) that was introduced with Java 1.4. Sadly, the author doesn't offer a detailed timing comparison of the 1.4 NIO API to the existing IO API. Chapter 9 covers Java's native sorting implementations and how to extend their framework for your specific application.
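
The buffered vs. unbuffered distinction can be tried out with a sketch like the one below. The class name, helper names, and file size are assumptions for illustration, and the code uses modern try-with-resources rather than 1.4 syntax. It reads the same temporary file one byte at a time, first straight from a FileInputStream and then through a BufferedInputStream, and prints both timings without claiming any particular ratio.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper names; a sketch, not the book's benchmark.
class IoSketch {
    // Writes `size` zero bytes to a temporary file and returns it.
    static File writeSample(int size) {
        try {
            File file = File.createTempFile("io-sketch", ".bin");
            file.deleteOnExit();
            try (OutputStream out = new FileOutputStream(file)) {
                out.write(new byte[size]);
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts bytes with single-byte read() calls, optionally buffered.
    static long countBytes(File file, boolean buffered) {
        try (InputStream raw = new FileInputStream(file);
             InputStream in = buffered ? new BufferedInputStream(raw) : raw) {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = writeSample(1 << 20);
        long t0 = System.nanoTime();
        countBytes(file, false);
        long t1 = System.nanoTime();
        countBytes(file, true);
        long t2 = System.nanoTime();
        System.out.println("unbuffered: " + (t1 - t0) + " ns");
        System.out.println("buffered:   " + (t2 - t1) + " ns");
    }
}
```

The unbuffered variant asks the underlying stream for every single byte, while the buffered one refills an in-memory array in larger chunks, which is the effect the chapter's stream comparisons are concerned with.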

Part 3 : Threads, Distributed Computing and Other Topics
Chapters 10-14 cover a grab bag of topics, including threading, proper Collections use, distributed computing paradigms, and an optimization primer that covers full life cycle approaches to optimization. Chapter 10 does a great job of presenting threading, common threading pitfalls (deadlocks, race conditions), and how to solve them for optimal performance (e.g. proper scope of locks, etc).
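
As an illustration of what "proper scope of locks" can mean, the sketch below contrasts a broad synchronized block with a narrow one. All names are hypothetical and the code uses modern Java (lambdas, generics); it is not the book's example. In the narrow version the per-item work happens outside the lock, so threads only contend for the brief shared update.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; a sketch of narrowing a lock's scope.
class LockScopeSketch {
    private final Object lock = new Object();
    private final List<String> results = new ArrayList<>();

    // Stand-in for work that does not touch shared state.
    static String format(int value) {
        return "item-" + Integer.toHexString(value);
    }

    // Broad scope: formatting is done while holding the lock.
    void recordBroad(int value) {
        synchronized (lock) {
            results.add(format(value));
        }
    }

    // Narrow scope: only the shared list update is protected.
    void recordNarrow(int value) {
        String formatted = format(value);
        synchronized (lock) {
            results.add(formatted);
        }
    }

    int size() {
        synchronized (lock) {
            return results.size();
        }
    }

    // Starts `threads` workers, each recording `perThread` items.
    static int runThreads(int threads, int perThread) {
        LockScopeSketch sketch = new LockScopeSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    sketch.recordNarrow(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sketch.size();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000));
    }
}
```

Both methods are equally correct; the difference is only in how long each thread keeps the others waiting.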

Chapter 11 provides a wonderful discussion about one of the most powerful parts of the JDK, the Collections API. It includes detailed timings of using ArrayList vs. LinkedList when traversing and building collections. To close the chapter, the author discusses different object caching implementations and their individual performance results.
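
A rough way to reproduce the flavor of those traversal timings is sketched below. The names and list size are assumptions, and the code uses generics that did not exist in 1.4. It sums an ArrayList and a LinkedList twice each, once by index with get(i) and once with an iterator. LinkedList.get(i) has to walk the list to reach position i, so the indexed traversal of the linked list is the case to watch.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical names; a sketch, not the book's benchmark.
class CollectionsSketch {
    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Indexed traversal: cheap per step for ArrayList, a walk for LinkedList.
    static long sumByIndex(List<Integer> list) {
        long total = 0;
        for (int i = 0; i < list.size(); i++) {
            total += list.get(i);
        }
        return total;
    }

    // Iterator traversal: one step per element for both implementations.
    static long sumByIterator(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    static void report(String label, List<Integer> list) {
        long t0 = System.nanoTime();
        sumByIndex(list);
        long t1 = System.nanoTime();
        sumByIterator(list);
        long t2 = System.nanoTime();
        System.out.println(label + " by index: " + (t1 - t0)
                + " ns, by iterator: " + (t2 - t1) + " ns");
    }

    public static void main(String[] args) {
        int n = 20_000;
        report("ArrayList ", fill(new ArrayList<>(), n));
        report("LinkedList", fill(new LinkedList<>(), n));
    }
}
```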

Chapter 12 gives some general optimization principles (with code samples) for speeding up distributed computing including techniques to minimize the amount of data transferred along with some more practical advice for designing web services and using JDBC.

Chapter 13 deals specifically with designing/architecting applications for performance. It discusses how performance should be addressed in each phase of the development cycle (analysis, design, development, deployment), and offers tips and a checklist for your performance initiatives. The puzzling thing about this chapter is why it is presented at the end of the book instead of towards the front, with all of the other process-related material. It makes much more sense to put this material together up front.

Chapter 14 covers various hardware and network aspects that can impact application performance including: network topology, DNS lookups, and machine specs (CPU speed, RAM, disk).

Part 4 : J2EE Performance
Chapters 15-18 deal with performance specifically with the J2EE APIs: EJBs, JDBC, Servlets and JSPs. These chapters are essentially tips or suggested patterns (use coarse-grained EJBs, apply the Value Object pattern, etc) instead of the very low-level performance tips and metrics provided in earlier chapters. You could say that the author is getting lazy, but the truth is that due to the huge number of appserver/database vendor combinations, it would be very difficult to establish a meaningful performance baseline without a large testbed.

Chapter 15 is a reiteration of Chapter 1, Tuning Strategy, re-tooled with a J2EE focus. The author reiterates that a good testing strategy determines what to measure, how to measure it, and what the expectations are. From here, the author presents possible solutions including load balancing. This chapter also contains about 1.5 pages about tuning JMS, which seems to have been added to be J2EE 1.3 acronym compliant.

Chapter 16 provides excellent information about JDBC performance strategies. The author presents a proxy implementation to capture accurate profiling data and minimize changes to your code once the profiling effort is over. The author also covers data caching, batch processing and how the different transaction levels can affect JDBC performance.
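
The review does not show how the book's proxy is built, so the following is only one plausible shape for the idea, using java.lang.reflect.Proxy. Everything named here is an assumption: QueryRunner is a hypothetical stand-in for a JDBC interface such as java.sql.Statement, and the two maps are an arbitrary way to collect results. The point it illustrates is that callers keep using the interface unchanged, so the wrapper can be removed once profiling is done.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a timing proxy; not the book's implementation.
class TimingProxySketch {
    // Stand-in for a JDBC interface; an assumption for this example.
    interface QueryRunner {
        int run(String sql);
    }

    static final Map<String, Long> NANOS_BY_METHOD = new ConcurrentHashMap<>();
    static final Map<String, Integer> CALLS_BY_METHOD = new ConcurrentHashMap<>();

    // Wraps `target` so every interface call is timed and counted.
    static <T> T timed(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the real object threw
            } finally {
                long elapsed = System.nanoTime() - start;
                NANOS_BY_METHOD.merge(method.getName(), elapsed, Long::sum);
                CALLS_BY_METHOD.merge(method.getName(), 1, Integer::sum);
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        QueryRunner real = sql -> sql.length(); // pretend result
        QueryRunner profiled = timed(QueryRunner.class, real);
        profiled.run("SELECT 1");
        profiled.run("SELECT name FROM users");
        System.out.println("calls: " + CALLS_BY_METHOD);
        System.out.println("nanos: " + NANOS_BY_METHOD);
    }
}
```

With real JDBC the same wrapper would be applied to the Connection and to the Statement objects it hands out, which is more plumbing than this sketch attempts.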

Chapter 17 covers JSPs and servlets, with very little earth-shattering information. The author presents tips such as GZipping the content before returning it to the client and minimizing custom tags. This chapter is easily the weakest section of the book. Admittedly, it's difficult to optimize JSPs since much of the actual running code is produced by the interpreter/compiler, but this chapter either needs to be beefed up or dropped from future editions.
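
The GZip tip can be sketched with the standard java.util.zip classes. The class and method names below are hypothetical, and the servlet plumbing is deliberately left out: in a real servlet the compressed bytes would be sent with a Content-Encoding: gzip response header, and only to clients whose Accept-Encoding request header allows it. The sketch just shows compressing a repetitive page and reading it back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical names; compression only, no servlet API involved.
class GzipSketch {
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // complete once the gzip stream is closed
    }

    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz =
                     new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gz.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder page = new StringBuilder("<table>");
        for (int i = 0; i < 200; i++) {
            page.append("<tr><td>row</td></tr>");
        }
        page.append("</table>");
        byte[] raw = page.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println(raw.length + " bytes raw, " + packed.length + " bytes gzipped");
    }
}
```

Compression trades CPU time on the server for bytes on the wire, so it pays off mainly for large, repetitive text responses.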

Finally, chapter 18 provides a design/architecture-time approach towards EJB performance. The author presents standard EJB patterns that lend themselves towards squeezing greater performance out of the often maligned EJB. The patterns include: data access object, page iterator, service locator, message facade, and others. Again, there's nothing earth-shattering in this chapter. Chapter 19 is a list of resources with links to articles, books and profiling/optimizing projects and products.

What's Bad?

Since the book was published, the 1.4.1 VM has been released with the much anticipated concurrent garbage collector. The author mentions that he received an early version of 1.4.1 from Sun to test with. However, the text doesn't state that he used the concurrent garbage collector, so the performance of this new feature isn't indicated by this text.

The J2EE performance chapters aren't as strong as the J2SE chapters. After seeing the statistics and extensive code samples of the J2SE sections, I expected a similar treatment for J2EE. Many of the J2SE performance practices still apply for J2EE (serialization most notably, since that is how EJB, JMS, and RMI ship method parameters/results across the wire), but it would be useful to fortify these chapters with actual performance metrics.

So What's In It For Me?

This book is indispensable for the architect drafting the performance requirements/testing process, and contains sage advice for the programmer as well. It's the most up to date publication dealing specifically with performance of Java applications, and is a one-of-a-kind resource.

You can purchase Java Performance Tuning, 2nd Edition from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

Definite purchase by Timesprout · 2003-04-09 04:13 · Score: 3, Informative

I have drastically cut back on my tech book purchases in recent times but this book will definitely be on my shopping list. The First edition offered many insights into not only getting the best performance from Java but also solid guidelines for when and where to apply optimisations.
As a side note, I would disagree about performance being an albatross for Java. Well-written Java code can be highly performant, just as poorly written code in ANY language can perform slowly. Many of the performance issues associated with Java stem from inexperienced developers using inappropriate methods and objects.

--
Do not try to read the dupe, thats impossible. Instead, only try to realize the truth
What truth?
There is no dupe
Correct ISBN is 0596003773 by zipwow · 2003-04-09 04:24 · Score: 5, Informative

The bn.com link is broken for me, here's the correct ISBN:

0596003773

--
I don't know which is more depressing, that 2/3 didn't care enough to vote, or that 1/2 of those that did are crazy.
Re:Isn't this the compiler's job? by chrisseaton · 2003-04-09 04:35 · Score: 3, Informative

Think about the string object problem - people have to use StringBuffer because strings are immutable.

When a program thrashes strings around, why doesn't the compiler detect that, and switch to a string buffer object to perform those operations, and then convert the final result back to a string?
Re:Isn't this the compiler's job? by cmburns69 · 2003-04-09 04:39 · Score: 4, Informative

With non-bytecode languages, the compiler can optimize to the environment. It can re-order code based on the fastest execution time for the platform the code is compiled for.

Java (and other bytecode languages) were designed to run well not just on a single platform, but on a variety of platforms. So as a trade-off, you lose environment-specific optimizations at compile time.

JIT JRE/compilers can work to prevent this. They can further optimize the bytecodes at execution time because they are platform specific.

An online Starcraft RPG? Only at
In soviet Russia, all your us are belong to base!

--
Online Starcraft RPG? At
Dietary fiber is like asynchronous IO-- Non-blocking!
Inherent performance issues by MSBob · 2003-04-09 04:44 · Score: 2, Informative

There are certain design decisions that were made by the Java team that limit Java's performance in a number of ways. Lack of stack objects comes to mind, as do collections that cannot store basic types.
That said, for most network-centric applications Java is plenty fast. Now if we only stopped short of introducing the unbelievable overhead of XML's excessive verbosity...

--
Your pizza just the way you ought to have it.
String/StringBuffer by toriver · 2003-04-09 05:10 · Score: 4, Informative

It does under the hood whenever you use + for concatenation; this is why using String + String in a loop is inefficient: you create a new StringBuffer object per iteration. The solution in this case is to declare the StringBuffer outside the loop and use append() explicitly within.

For concatenating two strings, the concat() method can be faster than using StringBuffer, since it only needs to create a new char[] and do a (fast) arraycopy from the two internal arrays.

Also, everyone should be aware of the 1.4.1 memory leak associated with using StringBuffer's toString() and setLength() methods.
Re:More efficient != better by blamanj · 2003-04-09 05:43 · Score: 5, Informative

Actually, that's exactly what the compiler does. The problem occurs in cases like this:
String foo = "";
while (source.hasMoreTokens()) {
    foo += source.nextToken();
}
where you are creating and destroying a large number of strings. In this case, using a StringBuffer is far more efficient and doesn't really harm readability.
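
A minimal sketch of the two variants discussed in this thread, assuming source is a java.util.StringTokenizer (the comment does not say) and using hypothetical class and method names. The first method is the += loop from the comment; the second hoists a single StringBuffer out of the loop and calls append() inside it. Code written today would normally use StringBuilder, the unsynchronized equivalent added in Java 5.

```java
import java.util.StringTokenizer;

// Hypothetical names; `source` is assumed to be a StringTokenizer.
class ConcatSketch {
    // The += form: each pass builds a new String from the old one.
    static String joinWithPlus(String text) {
        StringTokenizer source = new StringTokenizer(text);
        String foo = "";
        while (source.hasMoreTokens()) {
            foo += source.nextToken();
        }
        return foo;
    }

    // One buffer declared outside the loop, append() inside it.
    static String joinWithBuffer(String text) {
        StringTokenizer source = new StringTokenizer(text);
        StringBuffer foo = new StringBuffer();
        while (source.hasMoreTokens()) {
            foo.append(source.nextToken());
        }
        return foo.toString();
    }

    // For exactly two strings, concat() avoids the buffer altogether.
    static String joinPair(String left, String right) {
        return left.concat(right);
    }

    public static void main(String[] args) {
        String text = "don't concatenate strings in a loop";
        System.out.println(joinWithPlus(text).equals(joinWithBuffer(text)));
        System.out.println(joinPair("String", "Buffer"));
    }
}
```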
Re:Who cares? by Randolpho · 2003-04-09 05:49 · Score: 3, Informative

Perhaps you should pick your examples better. Here's an excerpt from the StringBuffer JavaDoc:
String buffers are used by the compiler to implement the binary string concatenation operator +. For example, the code:
x = "a" + 4 + "c"
is compiled to the equivalent of:
x = new StringBuffer().append( "a" ).append( 4 ).append( "c" ).toString()

Granted, people should get in the habit of coding optimizations automatically, but in this case it's actually more efficient to do String + String + String; it takes less time to code than typing the method calls, and is easier to read/understand.

Which just brings me to my biggest beef about Java: no syntactic sugar. Operator overloading should be a part of Java, and bugger whatever the purists say. I want to save time typing dammit! :)

--
"Times have not become more violent. They have just become more televised."
-Marilyn Manson
Java is plenty fast by ChrisRijk · 2003-04-09 05:57 · Score: 1, Informative

I've just been testing with a FFT benchmark I have, where I have both a Java version and a C version. Using GCC 3.2 on Linux, I've yet to be able to build a faster binary than what Sun's 1.4.2 beta JVM can do. IBM's JVMs are generally best at this type of benchmark, though Sun's been catching up fast, quite possibly passed them.

Even with CPU specific optimisations, advanced compiler options etc, the Java version is 30-80% faster than GCC's binary. (this is on both AMD and Intel CPUs) To get anything faster, you'd have to pay for it.

I also do server side programming, and I don't see why so many Linux users complain about Java's performance, while using/promoting Perl and PHP. If you want a high performance, responsive site, Java completely blows Perl and PHP away. I've only used JSP and servlets so far but they're all most web sites need anyway.
1. Re:Java is plenty fast by be-fan · 2003-04-09 06:57 · Score: 4, Informative
  
  The FFT benchmark is a very specific case. Once the JIT kicks in, it's not Java vs C++ anymore, it's the JVM optimizer vs the GCC one. Contrary to popular belief, the GCC optimizer is very good (check out benchmarks vs ICC at coytegulch.com). However, the FFT benchmark is a case where the additional information available to the JIT optimizer allows it to outperform native code. The whole benchmark is so small, it probably even fits in cache, and doesn't really stress any of the performance pitfalls of the language itself. Now, if you have a larger application, that doesn't consist of a single inner-loop, and meanders through a lot of varied code (ie. most real applications) then the performance story will be very different. At that point, Java's performance faults (excessive bookkeeping overhead, object allocation/deallocation, overhead from the JVM, etc) come much more into play.
  
  --
  A deep unwavering belief is a sure sign you're missing something...
2. Re:Java is plenty fast by be-fan · 2003-04-09 13:41 · Score: 2, Informative
  
  Why? It's smaller than most code, but why does that inherently benefit Java?
  >>>>>>
  The reason it inherently benefits Java is the characteristics of the Java language. First of all, it's a JIT language. Thus, if you have a tight inner loop, the JIT optimizer can optimize the hell out of it (even more so because it has access to runtime information that the static C++ optimizer does not) and just hand it over to the processor for execution. The JVM isn't even executed again until the loop is over. This situation doesn't invoke any of the language overhead that makes Java slow. This overhead takes many forms. The JVM has a large instruction cache footprint. Java objects all have an extra header containing type information that causes a data cache footprint and impacts memory bandwidth. The garbage collector can be a big problem. As Tanenbaum said, avoiding disaster is more important than optimal performance. The average case allocation/deallocation might be quite fast for a garbage collector, but when an actual collection occurs, you get that "disaster" that you're trying to avoid. The collection process thrashes the cache and occupies the application for a (comparatively) long time. When a function is invoked, the JVM has to check to see if the JIT has the code already cached. This takes time. Move beyond that to the Java APIs themselves. The Java APIs are designed for purity and ease of use. APIs like the C++ STL are designed for pure performance. For example, a dynamic cast is inherently slow, by the nature of the operation. Yet, the Java Collection APIs require them for every access. All Java class methods are by default indirect. Last time I measured (on a PII 300) indirect calls are about 10 times slower than direct calls. A small numeric benchmark doesn't hit any of these performance issues, but real application code hits them, hard.
  
  --
  A deep unwavering belief is a sure sign you're missing something...
Re:Pre-written appendix for Java Tuning by egomaniac · 2003-04-09 06:36 · Score: 4, Informative

That article is the most absurd joke I have ever read. He spends half the article complaining about Java's startup time (which (A) does not apply in any server situation, and (B) is unfair, because you don't count the machine's bootup time when talking about the performance of C programs, do you?).

Then he invents other ways to talk about the startup time without seeming to talk about the startup time (for instance, trussing Hello World results in a ton of output, but naturally that's Java starting up and loading its classes. Again, do you consider what the machine has to do to boot itself up when you're talking about C programs?). I will point out again that Java's startup time is almost irrelevant, especially in a server environment (which is what he's talking about).

The rest of the article is picking on the "jar" tool. jar is a program written in Java. Criticisms against the jar tool no more reflect on Java than criticisms against gzip reflect on C. The fact that jar doesn't do a good job of reporting errors is (A) irrelevant, because it's a developer tool and we know how to read exceptions, and (B) still more irrelevant, because how well it reports errors has nothing to do with what language it was written in. Tons of C programs have lousy error reporting as well, such as a number of Unix utilities I might name.

Further, this article is obviously very old. He's talking about Java 1.1.8, which is what, five years old now? Might as well criticize Linux by talking about obscure video driver bugs that were fixed five years ago. Obviously, that's not the article's fault for having been written so long ago, but it is the parent poster's fault for bringing it up as if it is somehow still relevant.

--
ZFS: because love is never having to say fsck
Re:More efficient != better by avandesande · 2003-04-09 06:59 · Score: 2, Informative

Totally agree. Unless you are doing bioinformatics routines or building multimegabyte docs, you don't need the string buffer. Might as well keep the server busy while you are waiting for that 30ms query to complete.

--
love is just extroverted narcissism
Re:The original post is wrong, anyway... by tgd · 2003-04-09 07:09 · Score: 2, Informative

Well, there are two ways you can prove it -- write a test... do a million iterations of one, versus a million of the other and watch your RAM usage and the time it takes. Alternately, compile it and then use a Java disassembler, and look at the resulting code. The string concatenation is very heavily optimized, and structures like that where you are concatenating hardcoded strings basically do the equivalent of interning them (there aren't real string objects for those). Compilers are smart. If you concatenate to produce the string, it knows what you are trying to do and can optimize. If you self-optimize and get it wrong (which is what happens when you use a StringBuffer), it doesn't know what you are doing and can't optimize it. Two seconds of Google searching turned up this, if you don't want to test it yourself: http://www.precisejava.com/javaperf/j2se/StringAndStringBuffer.htm
Re:New math? by Anonym0us+Cow+Herd · 2003-04-09 15:27 · Score: 2, Informative

You only need 4 bytes to make reflection work. Each object has a pointer to what class it is. The class information has everything else, the vtable so to speak, what interfaces are implemented, etc. All String objects have one single pointer to a common String class object that defines what a String is, what its members are, what its methods are, etc. everything needed for reflection. Think of the vtable as being more than just an array of method pointers, but meta information (the class object) about the object. More than just a vtable. Therefore, reflection requires no overhead really, since an OO language needs a vtable pointer.
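
The shared class object is easy to observe from Java code, as the small sketch below shows. The class and method names are hypothetical. It only demonstrates that every String reports the same Class object and that reflection metadata hangs off that object; it does not measure the per-object header, whose actual size depends on the JVM.

```java
// Hypothetical names; demonstrates the shared Class object only.
class ClassPointerSketch {
    // True when both objects refer to the very same Class instance.
    static boolean shareClassObject(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        String first = "albatross";
        String second = "percolation";
        System.out.println(shareClassObject(first, second));
        System.out.println(first.getClass().getName());
        // Method metadata lives on the Class object, not on each String.
        System.out.println(first.getClass().getMethods().length
                + " public methods visible through reflection");
    }
}
```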

--
The price of freedom is eternal litigation.