Java Performance Tuning, 2nd Ed.
Every developer has written a microbenchmark (a bit of code that does something 100-1000 times in a tight loop and measures the time the supposed "expensive operation" takes) to try to prove an argument about which way is "more efficient" based on execution time. The problem is that when running in a dynamic, managed environment like the 1.4.x JVM, there are more factors that you don't control than ones that you do, and it can be difficult to say whether one piece of code will be "more efficient" than another without testing with actual usage patterns. The second edition of Java Performance Tuning provides substantial benchmarks (not just simple microbenchmarks) with thorough coverage of the JDK including loops, exceptions, strings, threading, and even underlying JVM improvements in the 1.4 VM. This book is one of a kind in its scope and completeness.
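To make the warm-up problem concrete, here is a minimal sketch of the kind of naive microbenchmark described above. It is written against a modern JDK (System.nanoTime and lambdas both postdate the 1.4 VM under review), and the class name, the timed operation, and the iteration counts are illustrative assumptions, not code from the book.

```java
class NaiveMicrobenchmark {
    // Written on every iteration so the JIT cannot discard the work as dead code.
    static volatile int sink;

    // Returns the average wall-clock nanoseconds per call of the operation.
    static long averageNanos(Runnable operation, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            operation.run();
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        // Hypothetical "expensive operation": an int-to-String conversion.
        Runnable operation = () -> sink = Integer.toString(sink + 1).length();

        long cold = averageNanos(operation, 1_000);  // first run, little or no JIT-compiled code yet
        averageNanos(operation, 200_000);            // warm-up pass, gives the JIT a chance to compile
        long warm = averageNanos(operation, 1_000);  // same code measured again after warm-up

        System.out.println("cold: " + cold + " ns/op, warm: " + warm + " ns/op");
    }
}
```

The two figures often differ widely for identical code, which is the point: a tight loop measures the state of the VM as much as the operation itself.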
The Gory Details
The best part of this book is that it not only tells you how fast various standard Java operations are (sorting strings, dealing with exceptions, etc.), but also keeps all of the timing information from the previous edition. This shows you how the VM's performance has changed from version 1.1.8 up to 1.4.0, and it's very clear that things are getting better. The author also breaks out the timing information for three different flavors of the 1.4.0 JVM: mixed interpreted/compiled mode (standard), server (with HotSpot), and interpreted mode only (no run-time optimization applied).
Part 1 : Lies, Damn Lies and Statistics
The book starts off with three chapters of sage advice about the tools and process of profiling/tuning. Before you spend any time profiling, you have to have a process and a goal. Without setting goals, the tuning process will never end and it will likely never be successful.
The author outlines a general strategy that will give you a great starting point for your tuning task forces. Chapter 2 presents the profiling facilities that are available in the Java VM and how to interpret the results, while chapter 3 covers VM optimizations (different garbage collectors, memory allocation options) and compiler optimizations.
Part 2 : The Basics
Chapters 4-9 cover the nuts and bolts: code-level optimizations that you can implement. Chapter 4 discusses various object allocation tweaks including lazy initialization, canonicalizing objects, and how to use the different types of references (Phantom, Soft, and Weak) to implement priority object pooling. Chapter 5 tells you more about handling Strings in Java than you ever wanted to know. Converting numbers (floats, decimals, etc.) to Strings efficiently, string matching -- it's all here in gory detail with timings and sample code.
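For readers who haven't used the reference classes mentioned for Chapter 4, the following is a rough sketch of how lazy initialization and soft references can combine into a simple cache. It is not the book's priority pooling code; SoftCache, its loader function, and the loads counter are hypothetical names invented for this illustration, and the generics and lambdas assume a modern JDK.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;
    int loads = 0; // how many times the loader actually ran (illustration only)

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Lazily creates the value on first request; later requests reuse it
    // unless the garbage collector has cleared the soft reference.
    synchronized V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<Integer, String> cache = new SoftCache<>(n -> Integer.toString(n));
        String first = cache.get(42);
        String second = cache.get(42);
        // first is still strongly reachable here, so the cached instance is reused
        System.out.println(first == second);
    }
}
```

Softly referenced values may be reclaimed when memory runs low, so the cache trades a possible reload for a lower risk of running out of heap.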
Chapter 5 also shows the author's depth and maturity; when presenting his algorithm to convert integers to Strings, he notes that while his implementation previously beat the pants off of Sun's implementation, in 1.3.1/1.4.0 Sun implemented a change that now beats his code. He analyzes the new implementation and discusses why it's faster, without losing face. That is just one of many gems in this updated edition of the book. Chapter 6 covers the cost of throwing and catching exceptions, passing parameters to methods and accessing variables of different scopes (instance vs. local) and different types (scalar vs. array). Chapter 7 covers loop optimization with a Java bent. The author offers proof that an exception-terminated loop, while bad programming style, can offer better performance than more accepted practices.
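For anyone who hasn't seen the trick, an exception-terminated loop looks roughly like the sketch below. The array-summing workload and the method names are assumptions chosen for illustration; the book's own test code is not reproduced here, and whether this is actually faster depends heavily on the VM.

```java
class LoopTermination {
    // Conventional form: the loop condition is tested on every iteration.
    static long sumWithBoundsCheck(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Exception-terminated form: no explicit condition; the VM's own array
    // bounds check ends the loop by throwing when i runs past the end.
    static long sumUntilException(int[] data) {
        long sum = 0;
        try {
            for (int i = 0; ; i++) {
                sum += data[i];
            }
        } catch (ArrayIndexOutOfBoundsException endOfArray) {
            // Expected: this is how the loop terminates.
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumWithBoundsCheck(data) + " " + sumUntilException(data));
    }
}
```

The second form saves one comparison per iteration at the price of constructing an exception once, so any gain only shows up on long loops, and it hides real indexing bugs inside the loop body.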
Chapter 8 covers IO, focusing on using the proper flavor of java.io class (stream vs. reader, buffered vs. unbuffered) to achieve the best performance for a given situation. The author also covers performance issues with object serialization (used under the hood in most Java distributed computing mechanisms) in detail and wraps up the chapter with a 12-page discussion of how best to use the "new IO" package (java.nio) that was introduced with Java 1.4. Sadly, the author doesn't offer a detailed timing comparison of the 1.4 NIO API to the existing IO API. Chapter 9 covers Java's native sorting implementations and how to extend their framework for your specific application.
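As a rough illustration of the buffered vs. unbuffered choice, the sketch below reads the same temporary file one byte at a time, first through a bare FileInputStream and then through a BufferedInputStream. The class name, the 1 MiB file size, and the timing approach are assumptions made for this example, and the temp-file helpers come from java.nio.file, which is newer than the JDKs the book covers.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedReadDemo {
    // Reads one byte per call; the cost of each call depends on the stream underneath.
    static long countBytes(InputStream in) throws IOException {
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffered-demo", ".bin");
        try {
            Files.write(file, new byte[1 << 20]); // 1 MiB of zeros

            long start = System.nanoTime();
            try (InputStream raw = new FileInputStream(file.toFile())) {
                countBytes(raw); // every read() goes to the operating system
            }
            long unbufferedNanos = System.nanoTime() - start;

            start = System.nanoTime();
            try (InputStream buffered = new BufferedInputStream(new FileInputStream(file.toFile()))) {
                countBytes(buffered); // most read() calls are served from an in-memory buffer
            }
            long bufferedNanos = System.nanoTime() - start;

            System.out.println("unbuffered: " + unbufferedNanos / 1_000_000 + " ms, buffered: "
                    + bufferedNanos / 1_000_000 + " ms");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The gap matters most for many small reads; code that already reads in large chunks gains far less from the extra buffer.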
Part 3 : Threads, Distributed Computing and Other Topics
Chapters 10-14 cover a grab bag of topics, including threading, proper Collections use, distributed computing paradigms, and an optimization primer that covers full life cycle approaches to optimization. Chapter 10 does a great job of presenting threading, common threading pitfalls (deadlocks, race conditions), and how to solve them for optimal performance (e.g. proper scope of locks).
Chapter 11 provides a wonderful discussion about one of the most powerful parts of the JDK, the Collections API. It includes detailed timings of using ArrayList vs. LinkedList when traversing and building collections. To close the chapter, the author discusses different object caching implementations and their individual performance results.
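The ArrayList vs. LinkedList difference is easiest to see when traversing by index. The sketch below is not the book's benchmark; the class and method names and the element count are assumptions, and the timings it prints will vary by machine and VM.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListTraversal {
    // Appends 0..n-1 to the given list and returns it.
    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Index-based traversal: constant time per get() on an ArrayList,
    // but each get() on a LinkedList has to walk the chain of nodes.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Iterator-based traversal: one step per element for both list types.
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 20_000;
        List<Integer> array = fill(new ArrayList<>(), n);
        List<Integer> linked = fill(new LinkedList<>(), n);

        long start = System.nanoTime();
        sumByIndex(array);
        long arrayNanos = System.nanoTime() - start;

        start = System.nanoTime();
        sumByIndex(linked);
        long linkedNanos = System.nanoTime() - start;

        System.out.println("by index: ArrayList " + arrayNanos / 1_000 + " us, LinkedList "
                + linkedNanos / 1_000 + " us");
    }
}
```

Switching the LinkedList traversal to sumByIterator removes most of the gap, so the choice of traversal matters as much as the choice of collection.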
Chapter 12 gives some general optimization principles (with code samples) for speeding up distributed computing including techniques to minimize the amount of data transferred along with some more practical advice for designing web services and using JDBC.
Chapter 13 deals specifically with designing/architecting applications for performance. It discusses how performance should be addressed in each phase of the development cycle (analysis, design, development, deployment), and offers tips and a checklist for your performance initiatives. The puzzling thing about this chapter is why it is presented at the end of the book instead of towards the front, with all of the other process-related material. It makes much more sense to put this material together up front.
Chapter 14 covers various hardware and network aspects that can impact application performance including: network topology, DNS lookups, and machine specs (CPU speed, RAM, disk).
Part 4 : J2EE Performance
Chapters 15-18 deal specifically with the performance of the J2EE APIs: EJBs, JDBC, Servlets and JSPs. These chapters are essentially tips or suggested patterns (use coarse-grained EJBs, apply the Value Object pattern, etc.) instead of the very low-level performance tips and metrics provided in earlier chapters. You could say that the author is getting lazy, but the truth is that, due to the huge number of appserver/database vendor combinations, it would be very difficult to establish a meaningful performance baseline without a large testbed.
Chapter 15 is a reiteration of Chapter 1, Tuning Strategy, re-tooled with a J2EE focus. The author reiterates that a good testing strategy determines what to measure, how to measure it, and what the expectations are. From here, the author presents possible solutions including load balancing. This chapter also contains about 1.5 pages about tuning JMS, which seems to have been added to be J2EE 1.3 acronym compliant.
Chapter 16 provides excellent information about JDBC performance strategies. The author presents a proxy implementation to capture accurate profiling data and minimize changes to your code once the profiling effort is over. The author also covers data caching, batch processing and how the different transaction levels can affect JDBC performance.
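The review doesn't say how the book's profiling proxy is built, so the following is only one plausible shape for the idea, using java.lang.reflect.Proxy to time every call made through an interface. TimingProxy, wrap, and the two maps are hypothetical names; for JDBC you would presumably pass java.sql.Connection.class and a real connection, while a Runnable stands in here so the sketch runs without a database.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TimingProxy {
    // Accumulated wall-clock time and call count per method name.
    static final Map<String, Long> totalNanos = new ConcurrentHashMap<>();
    static final Map<String, Integer> callCounts = new ConcurrentHashMap<>();

    // Returns an object implementing iface that forwards to target and records timings.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(
                TimingProxy.class.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    long start = System.nanoTime();
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow what the real method threw
                    } finally {
                        totalNanos.merge(method.getName(), System.nanoTime() - start, Long::sum);
                        callCounts.merge(method.getName(), 1, Integer::sum);
                    }
                });
    }

    public static void main(String[] args) {
        // Stand-in for a slow database call.
        Runnable slowCall = () -> {
            long x = 0;
            for (int i = 0; i < 1_000_000; i++) {
                x += i;
            }
        };
        Runnable profiled = wrap(Runnable.class, slowCall);
        profiled.run();
        profiled.run();
        System.out.println("run: " + callCounts.get("run") + " calls, "
                + totalNanos.get("run") / 1_000 + " us total");
    }
}
```

Because callers only ever see the interface, removing the profiling afterwards comes down to dropping the wrap call, which matches the "minimize changes to your code" goal described above.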
Chapter 17 covers JSPs and servlets, with very little earth-shattering information. The author presents tips such as GZipping the content before returning it to the client and minimizing custom tags. This chapter is easily the weakest section of the book. Admittedly, it's difficult to optimize JSPs since much of the actual running code is produced by the interpreter/compiler, but this chapter either needs to be beefed up or dropped from future editions.
Finally, chapter 18 provides a design/architecture-time approach towards EJB performance. The author presents standard EJB patterns that lend themselves towards squeezing greater performance out of the often maligned EJB. The patterns include: data access object, page iterator, service locator, message facade, and others. Again, there's nothing earth-shattering in this chapter. Chapter 19 is a list of resources with links to articles, books and profiling/optimizing projects and products.
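The GZip tip is simple enough to sketch with nothing but java.util.zip. The code below only shows the compression step on an in-memory page; in a real servlet you would presumably wrap the response's output stream instead, and only when the client's Accept-Encoding header allows gzip. GzipSketch and its methods are hypothetical names, and readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipSketch {
    // Compresses a byte array into gzip format.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(plain);
        } // closing the stream writes the gzip trailer
        return buffer.toByteArray();
    }

    // Reverses gzip(), as a browser would on the receiving end.
    static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        // Repetitive markup, typical of generated HTML tables.
        StringBuilder page = new StringBuilder("<table>\n");
        for (int i = 0; i < 1_000; i++) {
            page.append("<tr><td>row ").append(i).append("</td></tr>\n");
        }
        page.append("</table>\n");

        byte[] plain = page.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(plain);
        System.out.println(plain.length + " bytes before, " + compressed.length + " bytes after gzip");
    }
}
```

Compression costs CPU on every response, so it pays off mainly for large, repetitive text sent over slow links.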
What's Bad?
Since the book has been published, the 1.4.1 VM has been released with the much anticipated concurrent garbage collector. The author mentions that he received an early version of 1.4.1 from Sun to test with. However, the text doesn't state that he used the concurrent garbage collector, so the performance of this new feature isn't indicated by this text.
The J2EE performance chapters aren't as strong as the J2SE chapters. After seeing the statistics and extensive code samples of the J2SE sections, I expected a similar treatment for J2EE. Many of the J2SE performance practices still apply for J2EE (serialization most notably, since that is how EJB, JMS, and RMI ship method parameters/results across the wire), but it would be useful to fortify these chapters with actual performance metrics.
So What's In It For Me?
This book is indispensable for the architect drafting the performance requirements/testing process, and contains sage advice for the programmer as well. It's the most up-to-date publication dealing specifically with the performance of Java applications, and is a one-of-a-kind resource.
You can purchase Java Performance Tuning, 2nd Edition from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Networking and secure transactions aside, I have a major problem with things like scrolling text Java applets. The problem is I see it too much.
If all these performance hacks are documented, why doesn't the compiler implement them?
I've often found that with bytecode languages (Java, C#...) the bytecode instructions are made for the language so that the compiler can just throw them out easy peasy, but they seem to overlook the sort of optimizations that C compilers, for example, work hard to implement.
there is a difference, you know.
There appears to be a huge market for Java tuning books and tools. This seems to be a warning sign. Maybe Sun should just simplify and reduce Java so that some of the more onerous issues are just eliminated.
The book starts off with three chapters of sage advice about the tools and process of profiling/tuning. Before you spend any time profiling, you have to have a process and a goal. Without setting goals, the tuning process will never end and it will likely never be successful.
No, you have to profile first. Profiling will tell you whether there is even any point in tuning, and, if so, what goals are reasonable.
Remember there is a distinction between client- and server-side Java. Java on the server makes me very happy.
Perhaps it is more efficient. I say, let the compiler do it for me. Code like this:is much more readable/maintainable than
Okay, flippant comment but let's think about this for a second: The majority of the time the alleged efficiency advantage is small or, as is generally the case, a pointless optimisation. Java coders seem to have a major efficiency/speed hangup - they use it to lord it over scripting programmers but they want/lack/desire the swiftness of C. (And yes, I do program in Java.)
To my mind, this is approaching the problem from entirely the wrong direction: CPU time and CPU power are far cheaper than developer time and designer time. Therefore, rather than use some cobbled-together hack, use the standard implementations and take the performance hit.
This will be cheaper, probably 95% as efficient and, most importantly, be 195% easier to maintain or change at a later date. Consider the big picture rather than a single aspect.
NB - YMMV, for certain apps, it really does make sense to break all of the above ideas and principles, but if you REALLY need it to run that fast, you should be using C anyway.
Elgon
Because the sort of people who like to get involved in discussions about whether C# is 'better' than Java or Java is 'better' than Perl or crunchy peanut butter is 'better' than textured masonry paint can't cope with more than one thing at a time, and tend to apply their religious zealotry with great vigour.
Those of us who can program in more than one language and know that sometimes it's a matter of choosing the right tool for the job (peanut butter for sandwiches, masonry paint for walls) tend to go through three stages:
1) Try to engage in such discussions on the premise that there's actual intelligent debate going on.
2) Discover ourselves becoming violently opposed to whatever rant we're reading at the time, writing tracts about how Java sucks when we're reading the work of a Java fanatic and drooling about the glory of Java when faced with a C++-toting moron.
3) Either give up in disgust and let the language fanboys get on with it, or sit on the sidelines and snipe at both sides - similar to stage 2, but more consciously applied. Normally that progresses towards giving up, though, since the zealots are just too easy and predictable...
Amazingly, Java actually performs very well once the JVM loads. Sure, it can't match uber-efficient C code, but, let's be honest here, how much C code really is efficient? I'm sure it's less than C programmers like to believe. ;)
That said, "slow" performing Java GUI apps are not so much the fault of the platform itself as they are the fault of the Java programmer's inability to deal efficiently with threads.
"Times have not become more violent. They have just become more televised."
-Marilyn Manson
You could have saved yourself some porting by just compiling your java code with GCJ. GCJ allows you to compile your java byte code to native executables.
This might become an option in a few years, but the GNU classpath is as yet not complete enough for our years. We actually didn't find gcj output that performant, despite it being compiled to native code. The JRE still beat it in many cases.
Use SWT with Java. SWT uses Windows native widgets on Windows or GTK on Linux.
We also investigated this. SWT is a _horrendous_ API which offers very little abstraction. You end up writing your code once for the Gtk+ target, and again for the native Windows target. It isn't really a cross-platform abstraction like WxWindows, and it's probably the reason why the Eclipse codebase is so large. You end up writing your application for each UI target platform. Gtk# runs and integrates with the platform instead, so you only write your code once.
Either you're telling a big lie or don't have your facts straight. Unless you can show hard facts you're not going to sway anyone into believing interpreted code outperformed compiled.
I did mention the results are empirical, but they're also pretty obvious from where I stand. You don't need benchmarks when something performs, in some cases, eight times faster than the original implementation. I may well put together some benchmarks and post them to mono-list or linuxtoday.com. I don't have benchmarks yet; does that make me a liar? Sigh.
What is exactly wrong with Java's use of native threads on Linux boxes?
It's pointless to interface with the threads layer directly when pthreads exists. It makes the runtime essentially unportable to other unices/operating systems. Mono plays nicely with the environment, so the runtime can just be compiled on any POSIX-compliant system. Linux is great, but being attached to it so firmly that your application breaks when Linus changes some internal interfaces is not.
Sorry, that should be "for our uses".
Don't be an idiot. The size of the standard API does not relate to any inefficiency Java has. How can the number of standard classes translate to inefficiency? What is the magic number of standard classes to be "just right"?
The best thing about java is the richness of the api. And the size of the documentation. C++/C should take a page from java's book in this department.
You don't have to use the standard classes, go ahead and write the classes you need.
Jonathan
I read the first edition of this book completely. There are some good tips for extracting a few percentage points of improved performance. However, nothing has as profound an impact as simply using a better VM ... for example, many of my applications saw 25%+ speed increases simply by switching from the 1.2.x series VM to the 1.3.x series VM.
Java does a pretty good job as a language of encouraging best practices, e.g. the inclusion of a standard StringBuffer. Extreme optimization at the code level will always be limited given the high abstraction of the language. However, extreme optimization at the VM level is a very real thing, and it doesn't take a whole lot of effort for the Java programmer.
Java isn't just about applets. In fact, applets are the least used feature of Java -- they're a neat little toy usage. Java is used primarily for back-end code now. Servlets talking to databases, for instance, are where Java is most often found.
Bwahaha, riiiight.
Sorry, but the C# compiler produces CLR bytecode. The JIT compiler runs at run-time.
With a Java JIT compiler you're running native code at run-time too.
Java and C# work the same way.
Java would've been far better if they'd stuck to a few basic classes, and let people develop the classes they need as they go.
Well, gosh, you go right ahead and write your own replacement classes for everything that Sun has done already. What's stopping you?
That's exactly why I like Java. They have a lot of good built-in libraries that cover a wide-range of applications. I don't have to reinvent the freaking wheel every time I write an app.
The bottleneck in our applications is not how fast whatever server-side language we use runs, and I imagine this is similar in most IT shops.
Our bottleneck is how fast we can execute lots and lots of stored procedures in our SQL and Oracle databases.
It really hasn't mattered if one of our coders has been terminating loops via try{}catch{}, or ending on a condition.
The most important thing has been, "Does each line, each method, each class do what it's actually supposed to do?"
Our bottlenecks have always been flow back and forth between different systems, including Lotus Domino, Oracle, MS SQL Server, Websphere, etc. etc.
Java is a small player in all this... C++, C#, Fortran, Lisp would not speed this up for us.
SWT offers a very high level of abstraction. If you want a still higher level of abstraction, then use the jface interface.
I've written a filesystem tool for QFS (QNX file system) and it runs without a single line of modification on QNX, windows and solaris!
SWT is a very sweet API. After using the utter crap that is Swing, it's refreshing to see SWT. It uses native widgets so the app doesn't feel "out-of-place" !!
Combine this with the fact that SWT is as fast as any other GUI toolkit interface in a higher level language (higher than C/C++) and that it's a filesystem tool, nobody ever suspects that it's written in Java !!!
And no dude, the Eclipse codebase is not huge... it's not just another IDE as you think... it's a complete platform!! You can write your own whole software platform using that baby!!
- mritunjai
I disagree.
You should always try to find the best, most efficient, most cost-effective approach/solution to your problem.
If your internal time is billed out at $50 per hour, and you want to save your company money, you aren't going to spend 4 hours to create a custom garbage collector just to save another 5k of RAM-- you're going to go out and buy another stick of memory.
I agree wrt bad coding habits and the like, but everything has its price. If someone can push an application out the door rapidly that can still be easily maintained and only requires a bit more memory or a bit faster processor, I'm more than willing to expense the money for that new hardware.
Since other posters have already indicated that gcj /does/ lead to better performance, I think I have a cause for your performance increase beyond "Java sux":
Re-implementation removed the bottleneck.
What kind of profiling did you do against your original Java application? Where was the time being spent? I've worked on some pretty high-performance java applications, and have found them to be quite scalable.
If you're talking about GUI responsiveness (not client/server or high processing interactions), then you may have a point. All the nefarious interactions between the platform-specific GUI toolkits and their OS of choice (this applies both to Windows and Linux) do a lot of very specific optimizations that just can't be done as well cross platform.
Interestingly, the original AWT used components based on native ones for just this reason, but that turned out to be problematic.
Anyway, if you have the intention of supporting your claim that your application had performance problems due to Java itself, I'd be interested in hearing about your profiling process.
-Zipwow
The FFT benchmark is a very specific case.
Why? It's smaller than most code, but why does that inherently benefit Java?
Once the JIT kicks in, it's not Java vs C++ anymore, it's the JVM optimizer vs the GCC one.
That's the whole point. Unless you only care about programs where the entire execution time is a few seconds, the JVM optimisation time isn't going to be much of an issue.
However, the FFT benchmark is a case where the additional information available to the JIT optimizer allows it to outperform native code.
I compiled specifically for the machine I was running on. I tried everything I could to make it faster than Java. For Java VMs, being able to get "additional information" is what always happens. It's not specific to the benchmark.
The whole benchmark is so small, it probably even fits in cache, and doesn't really stress any of the performance pitfalls of the language itself.
The code is a 10KB file with a number of critical functions. A good optimiser would have to do inlining, loop unrolling as well as a lot of data-flow optimisation. I ran it across a range of data sizes, and Java did better at bigger array sizes (until memory bandwidth was the limiting factor). You have it the wrong way around - the smaller the code/data, the more language specific issues show up.
Now, if you have a larger application, that doesn't consist of a single inner-loop, and meanders through a lot of varied code (ie. most real applications) then the performance story will be very different.
Unless your app's performance is dependent on I/O, OS calls or strings, making the application bigger is not going to make things very different. Actually, it can give JVMs a number of extra advantages due to the super-inlining capabilities available to run-time generated code.
At that point, Java's performance faults (excessive bookkeeping overhead, object allocation/deallocation, overhead from the JVM, etc) come much more into play.
I don't know what you mean by "bookkeeping overhead" but C code has to allocate/deallocate memory too, and has problems like memory fragmentation. How things compare with a garbage collector depends on the code, system, JVM etc, but modern JVMs can handle 10s of gigabytes of data in the heap. Simply to say that GC is slow is simplistic. Unless you have code that only runs for a few seconds, "overhead from the JVM" (if by that you mean the optimiser) isn't going to be a problem.
I'm not saying there aren't areas where Java is at a disadvantage (I've listed some in this post). In some areas, it's going to remain inherent, but in others, JVMs are becoming advanced enough that issues like run-time optimisation can be a distinct advantage. Anyway, as far as I'm concerned, I have no problems with Java performance for anything I've done, including GUI code.
These days when writing highly optimised web-server code in Java, I have to get a super-accurate timer since Java's standard timer is only accurate to one millisecond, and that's too coarse grained to tell how much difference the various optimisations I add make. This is where all the HTML is dynamically generated for each request btw.
When I write a Java program... if it's too slow today, then, in time, the problem will go away without any more effort on the part of the programmer. In a year from now, we'll certainly have faster computers, which will make up for any speed problems.
On the other hand...
A year from now, we will almost certainly not have CPUs that are suddenly immune from dangling pointers and memory leaks.
In other words, there are no plausible, near-future-foreseeable advancements in computing hardware that could fix the worst problems of C/C++. Meanwhile, the near-future advancements in hardware are almost guaranteed to fix Java's worst problem.
The same holds true for doing your computing today... regardless of what hardware is available a year from now. Personally, I'd rather have a slow program that could keep running than one that was really fast, but crashed before I could save my work.
There's an additional 16 bytes involved in the object class descriptor pointer and the reference to the object's monitor (*2 - one for the String and one for the char[], which is a fully fledged object in its own right), plus probably a couple of bytes overhead for the memory allocator. About 32 bytes seems reasonable. I think the point to make is, though, there is no _sensible_ way of making the string any smaller without sacrificing performance. Plus the objects have the ability to share the array between two strings that have similar data (say one is a substring of another) and that _substantially_ reduces memory requirements. I'd say the Java string implementation is about as good as it gets.