Java Performance Tuning, 2nd Ed.
Every developer has written a microbenchmark (a bit of code that does something 100-1000 times in a tight loop and measures the time it takes for the supposed "expensive operation") to try to prove an argument about which way is "more efficient" based on execution time. The problem is that when running in a dynamic, managed environment like the 1.4.x JVM, there are more factors that you don't control than ones that you do, and it can be difficult to say whether one piece of code will be "more efficient" than another without testing with actual usage patterns. The second edition of Java Performance Tuning provides substantial benchmarks (not just simple microbenchmarks) with thorough coverage of the JDK including loops, exceptions, strings, threading, and even underlying JVM improvements in the 1.4 VM. This book is one of a kind in its scope and completeness.
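For readers who have not written one, the kind of microbenchmark described above might look like the sketch below. The class name, the `expensiveOperation` helper, and the iteration count are all made up for illustration and are not taken from the book. It also uses `System.nanoTime()`, which arrived after the 1.4 VM; on 1.4 you would fall back to `System.currentTimeMillis()`.

```java
class NaiveMicrobenchmark {
    // Hypothetical stand-in for the supposed "expensive operation".
    static String expensiveOperation(int i) {
        return Integer.toString(i);
    }

    // Runs the operation in a tight loop and returns the elapsed nanoseconds.
    static long timeLoop(int iterations) {
        int sink = 0; // accumulate results so the loop body is not trivially dead code
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += expensiveOperation(i).length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never true here; only keeps 'sink' in use
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Early runs include class loading and interpreted execution;
        // later runs may be compiled, so the numbers usually drift between runs.
        for (int run = 1; run <= 5; run++) {
            System.out.println("run " + run + ": " + timeLoop(1000) + " ns");
        }
    }
}
```

Running it a few times tends to give noticeably different numbers from one run to the next, which is the reviewer's point about factors you don't control.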
The Gory Details
The best part of this book is that it not only tells you how fast various standard Java operations are (sorting strings, dealing with exceptions, etc.), but also keeps all of the timing information from the previous edition of the book. This shows you how the VM's performance has changed from version 1.1.8 up to 1.4.0, and it's very clear that things are getting better. The author also breaks out the timing information for 3 different flavors of the 1.4.0 JVM: mixed interpreted/compiled mode (standard), server (with Hotspot), and interpreted mode only (no run time optimization applied).
Part 1 : Lies, Damn Lies and Statistics
The book starts off with three chapters of sage advice about the tools and process of profiling/tuning. Before you spend any time profiling, you have to have a process and a goal. Without setting goals, the tuning process will never end and it will likely never be successful.
The author outlines a general strategy that will give you a great starting point for your tuning task forces. Chapter 2 presents the profiling facilities that are available in the Java VM and how to interpret the results, while chapter 3 covers VM optimizations (different garbage collectors, memory allocation options) and compiler optimizations.
Part 2 : The Basics
Chapters 4-9 cover the nuts and bolts, code-level optimizations that you can implement. Chapter 4 discusses various object allocation tweaks including: lazy initialization, canonicalizing objects, and how to use the different types of references (Phantom, Soft, and Weak) to implement priority object pooling. Chapter 5 tells you more about handling Strings in Java than you ever wanted to know. Converting numbers (floats, decimals, etc.) to Strings efficiently, string matching -- it's all here in gory detail with timings and sample code.
This chapter also shows the author's depth and maturity; when presenting his algorithm to convert integers to Strings, he notes that while his implementation previously beat the pants off of Sun's implementation, in 1.3.1/1.4.0 Sun implemented a change that now beats his code. He analyzes the new implementation and discusses why it's faster, without losing face. That is just one of many gems in this updated edition of the book. Chapter 6 covers the cost of throwing and catching exceptions, passing parameters to methods and accessing variables of different scopes (instance vs. local) and different types (scalar vs. array). Chapter 7 covers loop optimization with a Java bent. The author offers proof that an exception-terminated loop, while bad programming style, can offer better performance than more accepted practices.
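To make the exception-terminated loop concrete, here is a minimal sketch of the two styles side by side. It is my own illustration, not the author's code; the class and method names are hypothetical, and whether the second form is actually faster depends on the VM and the loop length.

```java
class LoopTermination {
    // Conventional loop: the bound is tested on every iteration.
    static long sumWithCondition(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Exception-terminated loop: no explicit bound test; the VM's own
    // array bounds check ends the loop by throwing.
    static long sumWithException(int[] data) {
        long sum = 0;
        try {
            for (int i = 0; ; i++) {
                sum += data[i];
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // Reached the end of the array; nothing else to do.
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(sumWithCondition(data));
        System.out.println(sumWithException(data));
    }
}
```

Both methods return the same sum; the only difference is how the loop ends, which is why the second one reads as bad style even where it measures well.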
Chapter 8 covers IO, focusing in on using the proper flavor of java.io class (stream vs. reader, buffered vs. unbuffered) to achieve the best performance for a given situation. The author also covers performance issues with object serialization (used under the hood in most Java distributed computing mechanisms) in detail and wraps up the chapter with a 12 page discussion of how best to use the "new IO" package (java.nio) that was introduced with Java 1.4. Sadly, the author doesn't offer a detailed timing comparison of the 1.4 NIO API to the existing IO API. Chapter 9 covers Java's native sorting implementations and how to extend their framework for your specific application.
Part 3 : Threads, Distributed Computing and Other Topics
Chapters 10-14 cover a grab bag of topics, including threading, proper Collections use, distributed computing paradigms, and an optimization primer that covers full life cycle approaches to optimization. Chapter 10 does a great job of presenting threading, common threading pitfalls (deadlocks, race conditions), and how to solve them for optimal performance (e.g. proper scope of locks, etc).
Chapter 11 provides a wonderful discussion about one of the most powerful parts of the JDK, the Collections API. It includes detailed timings of using ArrayList vs. LinkedList when traversing and building collections. To close the chapter, the author discusses different object caching implementations and their individual performance results.
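As a rough sketch of the kind of comparison involved (not the book's benchmark, and with no timings claimed here), the code below builds both list types and traverses them two ways. The class and method names are hypothetical. The thing to notice is that `get(i)` on a `LinkedList` has to walk the list on each call, while an iterator moves one node at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class ListTraversal {
    // Appends 0..n-1 to the given list and returns it.
    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(Integer.valueOf(i));
        }
        return list;
    }

    // Indexed traversal: cheap per step on ArrayList, a list walk per step on LinkedList.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i).intValue();
        }
        return sum;
    }

    // Iterator traversal: one step per element on either implementation.
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            sum += it.next().intValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> array = fill(new ArrayList<Integer>(), 20000);
        List<Integer> linked = fill(new LinkedList<Integer>(), 20000);
        long start = System.nanoTime();
        sumByIndex(array);
        System.out.println("ArrayList by index:     " + (System.nanoTime() - start) + " ns");
        start = System.nanoTime();
        sumByIndex(linked);
        System.out.println("LinkedList by index:    " + (System.nanoTime() - start) + " ns");
        start = System.nanoTime();
        sumByIterator(linked);
        System.out.println("LinkedList by iterator: " + (System.nanoTime() - start) + " ns");
    }
}
```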
Chapter 12 gives some general optimization principles (with code samples) for speeding up distributed computing including techniques to minimize the amount of data transferred along with some more practical advice for designing web services and using JDBC.
Chapter 13 deals specifically with designing/architecting applications for performance. It discusses how performance should be addressed in each phase of the development cycle (analysis, design, development, deployment), and offers tips and a checklist for your performance initiatives. The puzzling thing about this chapter is why it is presented at the end of the book instead of towards the front, with all of the other process-related material. It makes much more sense to put this material together up front.
Chapter 14 covers various hardware and network aspects that can impact application performance including: network topology, DNS lookups, and machine specs (CPU speed, RAM, disk).
Part 4 : J2EE Performance
Chapters 15-18 deal with performance specifically with the J2EE APIs: EJBs, JDBC, Servlets and JSPs. These chapters are essentially tips or suggested patterns (use coarse-grained EJBs, apply the Value Object pattern, etc) instead of the very low-level performance tips and metrics provided in earlier chapters. You could say that the author is getting lazy, but the truth is that due to the huge number of appserver/database vendor combinations, it would be very difficult to establish a meaningful performance baseline without a large testbed.
Chapter 15 is a reiteration of Chapter 1, Tuning Strategy, re-tooled with a J2EE focus. The author reiterates that a good testing strategy determines what to measure, how to measure it, and what the expectations are. From here, the author presents possible solutions including load balancing. This chapter also contains about 1.5 pages about tuning JMS, which seems to have been added to be J2EE 1.3 acronym compliant.
Chapter 16 provides excellent information about JDBC performance strategies. The author presents a proxy implementation to capture accurate profiling data and minimize changes to your code once the profiling effort is over. The author also covers data caching, batch processing and how the different transaction levels can affect JDBC performance.
Chapter 17 covers JSPs and servlets, with very little earth-shattering information. The author presents tips such as considering GZipping the content before returning it to the client, and minimizing custom tags. This chapter is easily the weakest section of the book: admittedly, it's difficult to optimize JSPs since much of the actual running code is produced by the interpreter/compiler, but this chapter either needs to be beefed up or dropped from future editions.
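The GZip tip can be sketched outside of any servlet container using only `java.util.zip`. This is an illustration under my own assumptions (the class name, helper names, and in-memory buffers are hypothetical), not the book's code; in a real servlet you would also have to check that the client accepts gzip and set the response headers accordingly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipSketch {
    // Compresses a byte array in memory, standing in for a response body.
    static byte[] compress(byte[] raw) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(out);
            gz.write(raw);
            gz.close(); // close() finishes the gzip stream and writes the trailer
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Reverses compress(); roughly what the client side would do.
    static byte[] decompress(byte[] packed) {
        try {
            GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gz.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            gz.close();
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Repetitive markup such as generated HTML tables tends to compress well, which is where the tip pays off; the trade is extra CPU on the server for fewer bytes on the wire.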
Finally, chapter 18 provides a design/architecture-time approach towards EJB performance. The author presents standard EJB patterns that lend themselves towards squeezing greater performance out of the often maligned EJB. The patterns include: data access object, page iterator, service locator, message facade, and others. Again, there's nothing earth-shattering in this chapter. Chapter 19 is a list of resources with links to articles, books and profiling/optimizing projects and products.
What's Bad?
Since the book has been published, the 1.4.1 VM has been released with the much anticipated concurrent garbage collector. The author mentions that he received an early version of 1.4.1 from Sun to test with. However, the text doesn't state that he used the concurrent garbage collector, so the performance of this new feature isn't indicated by this text.
The J2EE performance chapters aren't as strong as the J2SE chapters. After seeing the statistics and extensive code samples of the J2SE sections, I expected a similar treatment for J2EE. Many of the J2SE performance practices still apply for J2EE (serialization most notably, since that is how EJB, JMS, and RMI ship method parameters/results across the wire), but it would be useful to fortify these chapters with actual performance metrics.
So What's In It For Me?
This book is indispensable for the architect drafting the performance requirements/testing process, and contains sage advice for the programmer as well. It's the most up to date publication dealing specifically with performance of Java applications, and is a one-of-a-kind resource.
You can purchase Java Performance Tuning, 2nd Edition from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Networking and secure transactions aside, I have a major problem with things like scrolling text java applets. The problem is I see it too much.
If all these performance hacks are documented, why doesn't the compiler implement them?
I've often found that with bytecode languages (Java, C#...) the bytecode instructions are made for the language so that the compiler can just throw them out easy peasy, but they seem to overlook the sort of optimizations that C compilers, for example, work hard to implement.
I have drastically cut back on my tech book purchases in recent times but this book will definitely be on my shopping list. The First edition offered many insights into not only getting the best performance from Java but also solid guidelines for when and where to apply optimisations.
As a side note I would disagree about performance being an albatross for Java. Well written Java code can perform very well, just as poorly written code in ANY language can perform slowly. Many of the performance issues associated with Java come from inexperienced developers using inappropriate methods and objects.
Do not try to read the dupe, that's impossible. Instead, only try to realize the truth
What truth?
There is no dupe
Java Performance Tuning: A course in C programing
--
http://www.dennistighe.com
there is a difference, you know.
There appears to be a huge market for Java tuning books and tools. This seems to be a warning sign. Maybe Sun should just simplify and reduce Java so that some of the more onerous issues are just eliminated.
The bn.com link is broken for me, here's the correct ISBN:
0596003773
I don't know which is more depressing, that 2/3 didn't care enough to vote, or that 1/2 of those that did are crazy.
Each String is around 64 bytes of memory minimum. What a stupid decision to make such a fundamental data type so heavy weight.
I have noticed my JAVA programs run considerably faster under the Sun Forte/One IDE. Once the JAVA app is on its own (especially through a browser), it slows considerably. Does anyone else have experience with this phenomenon?
The book starts off with three chapters of sage advice about the tools and process of profiling/tuning. Before you spend any time profiling, you have to have a process and a goal. Without setting goals, the tuning process will never end and it will likely never be successful.
No, you have to profile first. Profiling will tell you whether there is even any point in tuning, and, if so, what goals are reasonable.
Java has performance troubles? I thought we were all supposed to deny that. Did I miss a memo or something?
Don't use a system that is a rip off of source code from public domain that a company decided to call theirs and make profit, like Sun Java.
Anyone wanting to know about JAVA performance tuning should be sure to look at the time honored review of Why JAVA Sucks for Sysadmins .
Keith
This is awesome! I am just getting back into a Java project again and wasn't aware that this book existed.
The only frustration is that I use safari.oreilly.com and love it, but they don't seem to have the 2nd edition from what I can tell... oh well - I'll add the edition that comes up in the search and that's better than nothing.
There are some odd things afoot now, in the Villa Straylight.
Remember there is a distinction between client- and server-side Java. Java on the server makes me very happy.
now that the shine has worn off Java. They're discarding the slow Java code and rewriting it in C++. Most companies cannot afford a Gig of RAM per web page view. By the way - how much of Google or Yahoo is written in Java... let's see - none of it.
Please, send a copy to Ian Clarke and Matthew Toseland...
HAHA!! Agreed!!
We ported some of our internal Java business applications to C# for use with Mono, and empirical results already suggest the solution is several times faster than the Java code. The port was very easy, with each line of Java code mapping onto one line of C# or less. Porting the UI to Gtk# was more difficult, but we find the Gtk# code more maintainable and the UI, along with the Gtk+ WIMP plugin, integrates much more nicely with Windows than SWING. We'll be investigating a switch to Linux over the next few months for some of our Point-of-Sales terminals as a result, and it should be easy thanks to the portability of Mono and Gtk#.
We also ported some of our backend tools for use with Mono. In use with the newly released Mono JIT runtime, Mini, we've achieved some truly stunning results. It turns out that some of the optimisations in the new JIT are better than those used by GCC, so once the code is loaded in memory, it performs better than raw C code. Although I don't yet have hard numbers to back up these results (the transition is still in progress), it has to be said that Mono is the real answer to Java performance. Being Open Source, we can also contribute back to the runtime to make it better suit our needs. It also plays nicely with RedHat 9's NPTL threading implementation, which is more than I can say for the current crop of Java JREs.
That said for most network centric applications java is plenty fast. Now if we only stopped short of introducing the unbelievable overhead of XML's excessive verbosity...
Your pizza just the way you ought to have it.
Why do programming languages have to be an either/or situation? Everyone here assumes that anyone who programs in Java does not know C/C++... why is that? Can't someone know multiple prog langs? I know many (too many to really list here) and find it asinine that people really think that everyone should just program in one lang.
...another Slashdot book report.
Yes, I am an agent of Satan, but my duties are largely ceremonial.
Perhaps it is more efficient. I say, let the compiler do it for me. Code like this:is much more readable/maintainable than
It does under the hood whenever you use + for concatenation; this is why using String + String in a loop is inefficient: you create a new StringBuffer object per iteration. The solution in this case is to declare the StringBuffer outside the loop and use append() explicitly within.
For concatenating two strings, the concat() method can be faster than using StringBuffer, since it only needs to create a new char[] and do a (fast) arraycopy from the two internal arrays.
Also, everyone should be aware of the 1.4.1 memory leak associated with using StringBuffer's toString() and setLength() methods.
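The loop advice and the concat() point above can be sketched as follows. The class and method names are hypothetical, and nothing here measures speed; it only shows the three shapes of code being compared.

```java
class StringBuilding {
    // Concatenation with + inside the loop: each pass builds a temporary
    // buffer and a brand new String.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (int i = 0; i < parts.length; i++) {
            result = result + parts[i];
        }
        return result;
    }

    // One StringBuffer declared outside the loop, append() used inside it.
    static String joinWithBuffer(String[] parts) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < parts.length; i++) {
            buffer.append(parts[i]);
        }
        return buffer.toString();
    }

    // For exactly two strings, String.concat() avoids the intermediate buffer.
    static String joinTwo(String a, String b) {
        return a.concat(b);
    }

    public static void main(String[] args) {
        String[] parts = {"tune", " ", "later"};
        System.out.println(joinWithPlus(parts));
        System.out.println(joinWithBuffer(parts));
        System.out.println(joinTwo("Java", "1.4"));
    }
}
```

All three produce the same text for the same input, so which one to use is purely a question of how many temporary objects you are willing to create.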
"The J2EE performance chapters aren't as strong as the J2SE chapters. After seeing the statistics and extensive code samples of the J2SE sections, I expected a similar treatment for J2EE. Many of the J2SE performance practices still apply for J2EE (serialization most notably, since that his how EJB, JMS, and RMI ship method parameters/results across the wire), but it would be useful to fortify these chapters with actual performance metrics."
J2SE has more coverage, because this is the area where Sun is focusing right now on improving speed. J2EE has been fairly successful - also, since CPU, RAM, HD resources tend to be more plentiful on servers than desktops, J2EE speed on the server isn't as critical as J2SE speed on the desktop. Getting Java-based desktop apps to perform as well as their C/C++ brethren is the 'holy grail' of Java/J2SE development right now, so the focal point of this book makes perfect sense.
Okay,
flippant comment but let's think about this for a second: The majority of the time the alleged efficiency advantage is small or, as is generally the case, a pointless optimisation. Java coders seem to have the major efficiency/speed hangup - they use it to lord it over scripting programmers but they want/lack/desire the swiftness of C. (And yes, I do program in Java.)
To my mind, this is approaching the problem from entirely the wrong direction: CPU time and CPU power are far cheaper than developer time and designer time. Therefore, rather than use some cobbled-together hack, use the standard implementations and take the performance hit.
This will be cheaper, probably 95% as efficient and, most importantly, be 195% easier to maintain or change at a later date. Consider the big picture rather than a single aspect.
NB - YMMV, for certain apps, it really does make sense to break all of the above ideas and principles, but if you REALLY need it to run that fast, you should be using C anyway.
Elgon
This is offtopic but I saw this story at CNET last week talking about JBOSS . The idea is basically creating the underpinnings for JAVA much the way .NET creates most of the underpinnings for .NET apps. Maybe this is reinventing the wheel but it sounded like a good idea to me.
Anyone who's ever done any performance testing in Java knows that these days, concatenating produces FAR more efficient code than the StringBuffer method...
It's Wednesday. Today we should have the weekly "XML sucks ass" -discussion, not the weekly Java bashing thread...
I was looking on CNN and etc for a news report- couldn't find one.
Amazingly, Java actually performs very well once the JVM loads. Sure, it can't match uber-efficient c code, but, let's be honest here, how much c code really is efficient? I'm sure it's less than c programmers like to believe. ;)
That said, "slow" performing Java GUI aps are not so much the fault of the platform itself as they are the fault of the Java programmer's inability to deal efficiently with threads.
"Times have not become more violent. They have just become more televised."
-Marilyn Manson
Java was never meant as a language for performance. If you're rewriting the lowest-level parts of the JDK, why not just do it in C? You'll be writing pretty much the same code, while getting an another massive speed boost. Multiplatform problems are moot: if efficiency is that important any Java graphics library will be unusable, and nongraphics stuff is just as crossplatform in C as in java.
So what's the idea? Is this a book for people so addicted to Java they can't use any other language, even when they unquestionably should?
"C" is designed to run well on a variety of platforms and you can gain "environment specific optimizations" at compile time. There is not really a big tradeoff there. Furthermore there's no reason Java can't be compiled. See GNU Java compiler. I think the original poster's question still stands. Can we rely on the Java compiler to do a good job of optimizing the code?
The ultimate answer is "maybe". Sure, it'll be able to do some stuff, perhaps a lot. It won't be able to fix poor data structure usage, bad algorithms and pussy footing around.
I have Limewire.org and DVarchive already. I know about Moneydance, which might be popular someday. Freenet might work well enough someday to qualify. Anything else? If you got 'em, post 'em.
Ick.
The albatross doesn't need killing -- it's already dead. The albatross was hanging from the mariner's neck because he had killed it, and by doing so had brought bad luck upon his ship.
Quoting from memory here, because I can't be bothered to go find my copy of the poem:
As I said, that's from memory, so there are probably plenty of mistakes in there, but I'm sure a little googling will turn up a proper copy of the poem.
Once enough corporations realize they have been sucked into one of the greatest technical marketing programs ever produced... they'll open their eyes and look for other solutions.
Today's business world cannot wait for solutions and the development paradigms need to change to respond to these needs. The scripting world does a better job of answering today's business needs than traditional languages, including Java. Coding today has to be immediate, and it has to be possible to throw away code as the business changes.
So both language and methodologies are mostly out-of-step for today's business needs.
case in point... a developer using php/apache/mysql will out produce the java/iplanet/oracle developer on any given day.
the business world does not need to spend money to develop software to last 20 years, 10 years, 5 years... the market will have changed by then. Even if the company is in a market where technology change is slow... why spend the money and get a lousy ROI
BigusDadicus
isn't killing the albatross what got the ancient mariner into so much trouble?
Sorry for the flamebait; but the man's got a good point! Any more software other than that toy Java app you wrote for your company? Any serious Java apps out there? Don't throw away your K&R and Stroustrup yet, guys.
Don't be an idiot. The size of the standard api does not relate to any inefficiency java has. How can the number of standard classes translate to inefficiency? What is the magic number of standard classes to be "just right"?
The best thing about java is the richness of the api. And the size of the documentation. C++/C should take a page from java's book in this department.
You don't have to use the standard classes, go ahead and write the classes you need.
Jonathan
I read the first edition of this book completely. There are some good tips for extracting a few percentage points of improved performance. However, nothing has as profound an impact as simply using a better VM ... for example, many of my applications saw 25%+ speed increases simply by switching from the 1.2.x series VM to the 1.3.x series VM.
Java does a pretty good job as a language of encouraging best practices, e.g. the inclusion of a standard StringBuffer. Extreme optimization at the code level will always be limited given the high abstraction of the language. However, extreme optimization at the VM level is a very real thing, and it doesn't take a whole lot of effort for the Java programmer.
(Score:-1, Wrong)
Well, performance IS king. And, when it comes to performance and ease, Fortran still is king, even if it is nearly 50 years old. Java has no chance. C++ may fight for it at times, and C may be on par if you are a non-sucky programmer.
However, development costs favor "development performance", not binary performance! So, Java and to some extent C++ may have the upper hand. Object Oriented development DOES rule here, period.
Then, semi-good news! Fortran is going OO! However, it is seriously delayed. But, progress is here.
Check out the latest draft of the Fortran 2000 standard here.
And, read the recent discussion on-topic here:
"If you're looking for information on the content of the next standard,
assuming it gets adopted, try looking at John Reid's (only very slightly
out-of-date) summary:
ftp://ftp.nag.co.uk/sc22wg5/N1501-N1550/N1507.pdf
Java isn't just about applets. In fact, applets are the least used feature of Java -- they're a neat little toy usage. Java is used primarily for back-end code now. Servlets talking to databases, for instance, are where Java is most often found.
Join Tor today!
I've just been testing with a FFT benchmark I have, where I have both a Java version and a C version. Using GCC 3.2 on Linux, I've yet to be able to build a faster binary than what Sun's 1.4.2 beta JVM can do. IBM's JVMs are generally best at this type of benchmark, though Sun's been catching up fast, quite possibly passed them.
Even with CPU specific optimisations, advanced compiler options etc, the Java version is 30-80% faster than GCC's binary. (this is on both AMD and Intel CPUs) To get anything faster, you'd have to pay for it.
I also do server side programming, and I don't see why so many Linux users complain about Java's performance, while using/promoting Perl and PHP. If you want a high performance, responsive site, Java completely blows Perl and PHP away. I've only used JSP and servlets so far but they're all most web sites need anyway.
Previously, the startup slowdown was due to the system having to load, verify, and link the twenty or so classes a simple program depends upon. Pjava and J2ME-CDC solved that by storing an image of the heap with the system classes already loaded, verified, and linked (and quickened) so the system was run-ready almost immediately. I wonder if the J2SE folks picked up on that? Alternatively, they could just be skipping the verify for those classes in the signed rt.jar, and offline preverify them prior to signature - the verifier always was the slow part of the process.
Your point about threads is well taken, and applies more generally to much of java programming. Java's language and libraries make it all too easy to write architecturally-slow programs - you really still have to fully understand what you're doing in order to write a decent program, regardless of the language.
## W.Finlay McWalter ## http://www.mcwalter.org ##
I challenge you to make a C++/C# application that is thread-safe and can scale to millions of pageviews per day without writing a ton of supporting code. With a good J2EE app server, a java coder essentially just has to wrap his thread-unsafe code in a synchronized() statement, and he's done-- his app is now thread-safe.
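For anyone unfamiliar with the statement being referred to, a minimal sketch of wrapping thread-unsafe code in a synchronized block looks like this. The class, the lock field, and the thread counts are hypothetical, and this only shows mutual exclusion on one shared counter; it says nothing about how an app server schedules requests or how such code scales.

```java
class SynchronizedCounter {
    private final Object lock = new Object();
    private int count = 0;

    // count++ is a read-modify-write and is not safe on its own;
    // the synchronized block lets only one thread at a time run it.
    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts several threads that all increment one shared counter,
    // waits for them to finish, and returns the final count.
    static int runThreads(int threadCount, final int incrementsPerThread) {
        final SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < incrementsPerThread; i++) {
                        counter.increment();
                    }
                }
            });
            workers[t].start();
        }
        for (int t = 0; t < threadCount; t++) {
            try {
                workers[t].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10000));
    }
}
```

With the synchronized blocks in place no increments are lost; the cost is that the threads take turns on that section, which is worth keeping in mind when the claim is about scaling.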
Additionally, the "cross-platform doesn't matter for sysadmins" is a false statement; our CIO asked our net ops group "what would be the impact of us moving to an Intel platform?" and our sysadmins (after consulting with the coders) replied "absolutely no impact". That made our CIO very, very happy. Again, I challenge you to move your C++ apps from Solaris to Linux, or even to Windows, without any hiccup.
All of these other arguments are very specious: "I don't have enough RAM" will get you a reply of "go down to Fry's and spend $125 on another GB" every time. Processor speeds, even on Sun boxes, are getting to the point where the processor will never be a bottleneck for anything. Sure, java won't run as fast as a natively-compiled app. Neither will perl, php, tcl, or what have you. Raw processor speed is not as important when you have a couple of GHz to play with.
First, I'm not a Java lover, however I have grown to respect the language. But, I would never ever want to be coding in it ( just preference and my feeling of lack of control ) . I'm currently a co-op for a wireless handheld device manufacturer (try to guess which one!) and I've learnt that Java, although slow has its merits. It's publicly known that we use a C/C++ based OS and optimized VM, and then on top of that we have our apps which are written in Java Micro edition. Although the apps are slower than our older c++ apps, we've found that the scalability really helps. We no longer need to have separate builds for each platform, and third/second party developers don't need to modify their software to have it run on our devices. It really simplifies things when you don't have to worry about CPU/chipset specific issues and worry about functionality, where the C/C++ guys (I'm in this group) can do all the hardcore work with the hardware. I'm curious to see what the next few reviews on this book say, as I'm sure some of the Java people here will appreciate it.
Java would've been far better if they'd stuck to a few basic classes, and let people develop the classes they need as they go.
Well, gosh, you go right ahead and write your own replacement classes for everything that Sun has done already. What's stopping you?
That's exactly why I like Java. They have a lot of good built-in libraries that cover a wide-range of applications. I don't have to reinvent the freaking wheel every time I write an app.
Your hybrid is not saving the environment. Its purpose is to make you feel good about buying something.
Ugh. If C++ took the Java route, alternative operating systems would be impossible. C is so popular because it has a very minimal runtime system. Java is extremely difficult to port because of its huge runtime. Java and C++ aren't really aimed at the same market. C++ is aimed at systems programming, where people can take the time to find the best external libraries for particular jobs, and need the performance of native code. Java is aimed (right now) at the server market, where having a quickly accessible, well documented (though not necessarily top-quality, if only because of the lack of competition) platform is more important. Personally, I think Java has more to fear from languages like Python (which have the extensive class libraries and are much more high-level to boot) than languages like C++.
A deep unwavering belief is a sure sign you're missing something...
Java Performance Tuning is an excellent book and covers its subject well. Kudos for the review.
The problem with focusing a discussion of Java on performance is that it ignores the real value of Java. Well written Java programs will never perform better than equally well written C/C++ code that makes efficient use of high level functionality as well as inline assembly. The tools available for a capable developer who understands his target architecture to develop native code are too good to beat with interpreted code. The real value of Java is the combination of portable "Write once, run anywhere" bytecode with purist object-orientation allowing development that focuses on core design issues and leaves platform implementation to someone else.
Having a book about Java performance is like having a book about how to cook shit the most tasty way.
heh. Actually, it got a score of one, because it's in your preferences. You add 1 to people who log in. See, for me, they don't get that. Also, AC's get negative 1 rather than zero. It works out well to weed out people like me and you, I think.
Also, if really quoting from memory, the author deserves some credit, since it's letter-but-not-punctuation perfect for the stanzas quoted, at least according to this source.
(Well, except for two missing lines in this stanza:
O happy living things ! no tongue
Their beauty might declare
A spring of love gushed from my heart,
And I blessed them unaware
Sure my kind saint took pity on me,
And I blessed them unaware.)
I seem to remember seeing some benchmark that said that native compiled code was actually slower than the Hotspot JRE.
Can anyone confirm this and/or explain how this is possible?
Take threads, for instance - synchronizing primitives are cheap in a Java VM that fakes threads, more expensive in a uniprocessor machine with real threads, and still more expensive in a multi-processor implementation. Code that performs well on the simple VM may be a dog on the later, more-capable VM.
Storage management (garbage collection) is another example. There are a bunch of different strategies for garbage collection, with potentially different costs for creating primitive types versus class objects, class objects with no references versus those with references, etc. Some types of garbage may be almost cost-free in some VMs, and very expensive in others.
But, optimizing code for a particular VM/platform is exactly the thing that Java was supposed to free us from!
The most important advice this book gives is the stuff in Part 1:
To a Lisp hacker, XML is S-expressions in drag.
The bottleneck in our applications is not the speed of whatever server-side language we use, and I imagine this is similar in most IT shops.
Our bottleneck is how fast we can execute lots and lots of stored procedures in our SQL and Oracle databases.
It really hasn't mattered if one of our coders has been terminating loops via try{}catch{}, or ending on a condition.
The most important thing has been, "Does each line, each method, each class do what it's actually supposed to do?"
Our bottlenecks have always been flow back and forth between different systems, including Lotus Domino, Oracle, MS SQL Server, Websphere, etc. etc.
Java is a small player in all this... C++, C#, Fortran, Lisp would not speed this up for us.
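For anyone who hasn't seen the idiom, the two loop-termination styles mentioned above look roughly like this. It is only a sketch: the class and method names are made up for illustration, and it says nothing about which style is faster on a given VM.

```java
// Hypothetical illustration of the two loop-termination styles.
class LoopTermination {

    // Conventional loop: stops when the index reaches the array length.
    static long sumWithCondition(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Exception-terminated loop: runs off the end of the array and relies on
    // ArrayIndexOutOfBoundsException to stop. Same result, harder to read.
    static long sumWithException(int[] data) {
        long sum = 0;
        int i = 0;
        try {
            while (true) {
                sum += data[i++];
            }
        } catch (ArrayIndexOutOfBoundsException endOfArray) {
            // Reaching the end of the array is the only expected way out.
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumWithCondition(data));
        System.out.println(sumWithException(data));
    }
}
```

Both methods return the same sum, which is the point of the comment: when the time goes to the database, the choice between them hardly registers.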
How many times have we changed the way accounting is done? What about payroll, HR, Inventory, Purchasing, asset management, etc. etc. These are fundamental systems that should not have to change at the whim of some monkey-ass MBA!
...and not by the whim of some marketing dumb-ass and the monkey-assed MBAs
Today's business world cannot wait for solutions and the development paradigms need to change to respond to these needs
It is statements like this that KILL me! The agile business...BS! These changes are brought about by someone selling snake oil.
Innovation, revolution, change: all should have solid business reasons.
At what point will companies stop spending millions just to save thousands?
Java bytecode can be compiled at run time; this is called Just-In-Time (JIT) compilation. Since all the details of the current system are available at run time, the code can be optimized more than if you are just targeting a platform in advance. It is true that this is not done much in real life, but the theory is there. Note that JIT compilers are not necessarily platform-specific; they just know the platform when run. The code for all the JIT compilers *CAN* be in the same executable. The parent post is, for all I know, correct in practice and mostly in theory as well. These are just details that are seldom used.
.... which still does not excuse you from bad coding habits, poor algorithm choice, a neophytic style of programming and a plethora of other things they taught you about in CS school. I find the "RAM is cheap and so is CPU power" argument laughable when it is spoken in the face of good, efficient programming. You should _always_ try to find the best (or most optimal, I should say) approach/solution to your problem. As soon as you've done that, you can use the "RAM is cheap, CPU power's cheap" argument, but not before that.
Let's do a linear search of every list we ever create in Java - 3Ghz is FAST! It'll find our value FAST! Instead of maybe doing a binary search on a pre-sorted list, right? Or let's always use bubble sort, it's fast on a 3Ghz CPU - forget quicksort, heapsort and the like.. too complex.
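To make the sarcasm concrete, here is a small sketch of the two searches using only `java.util.Arrays`. The class and method names are hypothetical, and the array is assumed to be sorted already, as in the comment above.

```java
import java.util.Arrays;

// Hypothetical comparison of the two lookups named in the comment.
class SearchComparison {

    // Linear search: looks at up to n elements. Returns the index or -1.
    static int linearSearch(int[] values, int target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                return i;
            }
        }
        return -1;
    }

    // Binary search on a pre-sorted array: about log2(n) comparisons.
    // Arrays.binarySearch returns a negative number when the key is absent.
    static int binarySearch(int[] sorted, int target) {
        int index = Arrays.binarySearch(sorted, target);
        return index >= 0 ? index : -1;
    }

    public static void main(String[] args) {
        int[] sorted = new int[1000];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = 2 * i; // even numbers, already in order
        }
        System.out.println(linearSearch(sorted, 1000));
        System.out.println(binarySearch(sorted, 1000));
    }
}
```

On a million-element list the linear version may touch every element while the binary version needs about twenty comparisons, and no clock speed closes that gap as the list grows.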
'A lie if repeated often enough, becomes the truth.' - Goebbels
... that during the time you develop or debug your C code, Java guys can focus on optimizations ;)
... the platform is here, and given the advantages it brings, you could not expect an experienced Java developer to go back to C or C++ ;)
;)
Here is the difference: a project takes 40% less time to develop in Java than in C++ (the same project, with good C++ technique).
C was good, but then came C++.
C++ was good, but then came Java.
But the main difference is that Java is not only a language. Even if it will continue to bring revolution (see 1.5 features) to each OS,
I used to be a C (K&R, then ANSI) fan; now I am a caffeine addict.
Try to jump into the Java wisdom and you will understand why people are so happy to use Java!
-SLK
Java is the most redundant general purpose language in major use today. I write in java every day because my employer insists certain software should be written in java.
Java will *never* be as fast as straightforward C/C++. This is a simple fact borne out every time after every time in practice, and you should distrust anyone who seeks to persuade you otherwise. They are either stupid or dishonest or both.
The difference in effort needed to write code in either of Java or C/C++ is for all non-numpties not significant. The only exception is where the API set available to the language is appropriate to the problem domain. This is again a simple fact borne out every time after every time in practice.
So what is Java for? If you want fast-to-run code then you have to use C/C++. If you want fast-to-write code then you can more easily use Perl/Python/Ruby/J[ava]script/VBScript/...
This difference in effort for non-numpties should be contrasted with the effort for numpties. It is a fact that the numpties can write code faster in java. However we already had a solution perfect for the numpties -- VB!
The sooner Java fades into the shadows, the better for the world. It is an aberration. Sure some people will have to retrain, and the colleges will have to rework some course materials, but this is no big deal.
I disagree.
You should always try to find the best, most efficient, most cost-effective approach/solution to your problem.
If your internal time is billed out at $50 per hour, and you want to save your company money, you aren't going to spend 4 hours to create a custom garbage collector just to save another 5k of RAM-- you're going to go out and buy another stick of memory.
I agree wrt bad coding habits and the like, but everything has its price. If someone can push an application out the door rapidly that can still be easily maintained and only requires a bit more memory or a bit faster processor, I'm more than willing to expense the money for that new hardware.
squeak
netrexx
rexx
euphoria
python
xbasic.org
tcl/tk
I accept. Yes, I did have to recompile. Looks like it worked. Do I win something? How about getting scored down to flamebait for refuting a Java diatribe?
I thought so.
Oh well.
WTF? You don't know what you're talking about.
A fast disk or fast screen is not going to find the largest prime number. "Never"? Wrong word.
If Java were the answer, real-life apps that you use every day would be written in it. Your OS is written in C. Your browser is written in C++. So are your word processor and email client. Where's the beef?
This is a legit question. Where are the Java apps? Outside of the custom toys Java app developers write for in-house use, where are the Java apps? Bottom line: their C/C++ cousins kick their ass so horrendously that there are none.
Where's the beef?
If your internal time is billed out at $50 per hour, how much money did that comment cost your employer? Mine spent about $1 on this comment. :)
Dammit. I knew some of those stanzas should've had six lines. Still, I'm pretty chuffed that I did as well as I did. :)
Once upon a time, I could recite the entire first part from memory. But, unexercised, that skill lasted only a fortnight or so before I started forgetting bits.
Anyway, I think this has strayed a little off-topic. But having spent the last few months coding in Java, I can say that Coleridge's poetry is a far more pleasant thing to be thinking about.
The classes of the Java standard library are, by default, thread-safe. This means that all methods that could cause race conditions are synchronized. Unfortunately, unneeded synchronizations are a major performance hit (it depends on the thread implementation).
So, whether you write s1 + s2 + s3 or rewrite this expression using a StringBuffer (which is, anyway, what the compiler does), you incur on most implementations a performance hit because the StringBuffer will be treated as if it could be shared between several threads.
Now, in this case, it would be sufficient to have a StringBuffer_unsynchronized class. In more complex cases, you would like to compile all methods with or without synchronization and have a system automatically switch to unsynchronized methods if it is safe to do so.
Unfortunately, telling whether it is safe to use unsynchronized methods is nontrivial. Essentially, you have to know whether your objects may escape to other threads. Of course, as with any nontrivial semantic property, this is undecidable (which means there's no generic algorithm always giving the right answer to the question in finite time). There are whole doctoral theses written on such topics!
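A short sketch of the three ways of building the same string. One thing that postdates this thread: `java.lang.StringBuilder`, added in Java 5, is essentially the unsynchronized StringBuffer wished for above. The class and method names below are made up for illustration, and nothing here measures the actual synchronization cost, which depends on the VM.

```java
// Hypothetical side-by-side of the concatenation styles under discussion.
class ConcatStyles {

    // Plain concatenation, as written in source code.
    static String withPlus(String s1, String s2, String s3) {
        return s1 + s2 + s3;
    }

    // Explicit StringBuffer: every append() call is a synchronized method.
    static String withStringBuffer(String s1, String s2, String s3) {
        StringBuffer buffer = new StringBuffer();
        buffer.append(s1).append(s2).append(s3);
        return buffer.toString();
    }

    // StringBuilder (Java 5 and later): same API, no synchronization.
    // Only appropriate while the builder stays confined to one thread.
    static String withStringBuilder(String s1, String s2, String s3) {
        StringBuilder builder = new StringBuilder();
        builder.append(s1).append(s2).append(s3);
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("a", "b", "c"));
        System.out.println(withStringBuffer("a", "b", "c"));
        System.out.println(withStringBuilder("a", "b", "c"));
    }
}
```

In each method the buffer is a local variable that never leaves the method, which is exactly the "does not escape to other threads" case described above.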
Since other posters have already indicated that gcj /does/ lead to better performance, I think I have a cause for your performance increase beyond "Java sux":
Re-implementation removed the bottleneck.
What kind of profiling did you do against your original Java application? Where was the time being spent? I've worked on some pretty high-performance java applications, and have found them to be quite scalable.
If you're talking about GUI responsiveness (not client/server or high processing interactions), then you may have a point. All the nefarious interactions between the platform-specific GUI toolkits and their OS of choice (this applies both to Windows and Linux) do a lot of very specific optimizations that just can't be done as well cross platform.
Interestingly, the original AWT used components based on native ones for just this reason, but that turned out to be problematic.
Anyway, if you have the intention of supporting your claim that your application had performance problems due to Java itself, I'd be interested in hearing about your profiling process.
-Zipwow
I don't know which is more depressing, that 2/3 didn't care enough to vote, or that 1/2 of those that did are crazy.
If your internal time is billed out at $50 per hour, and you want to save your company money, you aren't going to spend 4 hours to create a custom garbage collector just to save another 5k of RAM-- you're going to go out and buy another stick of memory.
This reminds me how broken many (most?) corporate accounting systems are. Where I work, for a stick of RAM (or software, or whatever), it would take at least four hours spread over a couple weeks just to figure out who to submit the request to, wait for our "purchasing agent" to get a couple signatures from bureaucrats, wait for the purchase order to work its way to the top of a pile, and finally get the RAM only to discover they ordered the wrong type. All the while, they'll happily pay for labor hours wasted on slow computers with inadequate RAM (for example).
Why there is such a fundamental disconnection between spending money on labor versus spending it on time-saving equipment and software leaves me questioning reality.
Healthcare article at Kuro5hin
"interpreted code outperformed compiled"
All modern JREs have a JIT compiler, which compiles frequently used functions to native code. It is possible that the JIT compiler in the JRE is better than gcj.
From a more general point of view, it is possible for JIT compilation to optimize better than ordinary compilation. This is because the JIT compiler has access to dynamic profiling information that is not available to the "normal" compiler (though you can feed profile information from benchmarks to some "normal" compilers - but these benchmark profiles may not fit the dynamic load).
Because I know C, C++, ASP, PHP, Perl, Ada, Python, Lua, Java, Delphi, Pascal, Objective-C, Visual Basic, QBasic, blah blah blahitti blah blah blah, and I have to say I like proggin' in Java from time to time. I know a lot of other people that do as well... and really, if it's compiled source (which is what you are comparing it to, correct? Compiled C source? I'm not talking about JIT compilers but something like a true binary compiler, which exists, mostly as a GCC extension), Java is not much slower than other higher-level languages.
I'm soooo sick of language snobbery.
How much does that extra development time cost?
Writing one's own java.lang.String takes time. Writing routines to convert com.donkeybollocks.String to java.lang.String and back again takes time. Supporting it takes time. And time is money. Me, I'd rather spend an extra £100 on a faster processor, or a Gb of RAM, and take a 25% performance improvement.
Come on guys, one of the major wins of the OO methodology is code reuse. Time was when programmers would always have to write their own I/O routines - I thought those days were long-gone. Rewriting fundamental parts of the Java API is just plain silly, unless it has a bug or a serious limitation (eg, it's non-threadsafe).
The more advanced the technology, the more open it is to primitive attack
I'd thought a major bottleneck would be attending all those meetings... :)
I don't know what plethora of things they taught you in CS school, but much of the wisdom they taught some of us can be summarized:
- Big O matters. Optimization of constants is an expensive luxury.
- Reimplementing the wheel for the sake of marginal efficiency is a sure way to get a square and inefficient wheel.
Most algorithms of any common use are provided in the standard libraries of each language. If not there, any algorithm can be implemented in any language by virtue of its Turing-completeness. This guarantees you bigO efficiency, which is what matters in the long run.
The article complains about Java being slow for the sake of its pcode nature. That's a constant factor, not bigO. It's automatically defeated by "CPU is cheap, RAM is cheap", i.e.: constant factor acceleration is cheap.
You better have a good reason to worry about constant factors: if your program demands so much from the machine that the constants make the difference on whether it's practical or not, you better be experimenting with the 'bleeding edge' or there's something really wrong with your program.
Efficient algorithms are used in every language by any programmer worth 2 bucks. Java has the advantage of implementing a bunch of them in standard libraries that work quite well, thank you. Someone who uses bubblesort in Java outside of a classroom is not lazy, he's an idiot. Implementing bubblesort is more complex and expensive than calling Arrays.sort(). The same thing actually applies to any programming language.
If your concerns about speed as a typical sysadmin (servers and workstations) or even worse, as a developer, are dominated by constant factors, it's time to go back to take data structures and algorithm analysis at CS school.
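The bubblesort point is easy to see side by side. This is a sketch with made-up class and method names: the hand-written version is both longer and O(n^2), while the library call is one line.

```java
import java.util.Arrays;

// Hypothetical comparison: hand-rolled bubble sort versus the library call.
class SortComparison {

    // Bubble sort: repeatedly swaps adjacent out-of-order elements. O(n^2).
    static void bubbleSort(int[] a) {
        for (int end = a.length - 1; end > 0; end--) {
            for (int i = 0; i < end; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                }
            }
        }
    }

    // The standard library sort: one call, already tuned and tested.
    static void librarySort(int[] a) {
        Arrays.sort(a);
    }

    public static void main(String[] args) {
        int[] first = {5, 3, 1, 4, 2};
        int[] second = {5, 3, 1, 4, 2};
        bubbleSort(first);
        librarySort(second);
        System.out.println(Arrays.toString(first));
        System.out.println(Arrays.toString(second));
    }
}
```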
Freedom is the freedom to say 2+2=4, everything else follows...
Sounds like this guy's really done his homework, but if timing is that critical, ask whether Java should be your first choice.
I primarily use Java, both for work and home projects, and I have to say that I have almost no performance issues with it. I think that, given the way CPU speed has increased (and will probably continue to do so), bytecode-interpreted languages now have less of a performance gap against natively compiled code.
I personally use Optimizeit Suite for Java development from Borland as early as possible in the development cycle. This gives me an edge in keeping the code fast and efficient without too much effort. Including profiling in your development cycle avoids big surprises that you would otherwise have to fix when it may be too late to revise your programs. Performance sometimes involves re-evaluating algorithms, and that can't be done in code freeze. As it turns out, they even have a profiler for .NET that has just been released.
Another point is that java wasn't created for server side. It's a complex language and it's slow to boot.
I agree - he doesn't know what he's talking about.
'go buy $125 on another Gb' of RAM... yeah, and how many slots do you have to put that RAM in? You'll run out eventually. He also talks about Solaris machines. RAM for Solaris is NOT $125 per Gb, oh no.
You need to write thread-safe code... just wrap the non-thread-safe code in a synchronized statement... which will happily synchronise the object that code lives in and kill performance, for a million page-view app.
Semi-on topic: a while ago, I gave some thought to pooling StringBuffer objects to improve performance. Bad idea. This page explains my findings.
It does not matter at all if this method is in an n, n^2, exp(n) or even exp(exp(n)) loop, since this relates to the size of the input problem. During the program, the method will be called some number of times (say x). If the method first cost u time, then after changing it, it will cost (u*k) time. Calling it x times will take (u*k)*x = (u*x) * k time. So you are still slowed down by a factor k.
Now let's look at the problem size. If complexity is n^2, then doubling your problem size will increase execution time by a factor of 4 (2^2). If the method is slower, and we double the problem size, execution time will still increase by a factor of 4, since the complexity itself does not change. If we compare the two double-size problems, there is a factor k between them.
Don't try to dazzle us with mathematics if you can't handle them properly.
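The argument above can be checked by counting abstract cost units instead of timing anything. A minimal sketch, with hypothetical names: `costPerCall` stands in for u (or u*k), and the n^2 loop is just one example of a complexity class.

```java
// Hypothetical cost model: an n^2 loop whose body costs a fixed number of units.
class ConstantFactor {

    // Total cost units spent when the inner method is called n * n times.
    static long quadraticCost(int n, long costPerCall) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                total += costPerCall;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long u = 1; // original cost of one call
        long k = 3; // slowdown factor of the changed method
        // Slowdown at size n and at size 2n: both equal k.
        System.out.println(quadraticCost(100, u * k) / quadraticCost(100, u));
        System.out.println(quadraticCost(200, u * k) / quadraticCost(200, u));
        // Growth when the size doubles: 4 for both the fast and slow method.
        System.out.println(quadraticCost(200, u) / quadraticCost(100, u));
        System.out.println(quadraticCost(200, u * k) / quadraticCost(100, u * k));
    }
}
```

The slowdown stays at k for every problem size, and doubling the size costs a factor of 4 either way, which is the separation between constant factors and complexity that the comment describes.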
Excellent rant, man! I agree totally.
There are no trolls. There are no trees out here.
Turing-completeness. This guarantees you bigO efficiency
I agree that bigO efficiency does matter, and you can get that in any half-decent language. But you cannot write efficient programs on a Turing machine; you will often add a linear factor to the time usage by implementing it on a Turing machine.
Implementing bubblesort is more complex and expensive than calling Arrays.sort().
Reminds me about the first book I read about assembly code. It started explaining that performance was one of the reasons to write assembly. The first code example was an implementation of bubblesort. (Duh!)
Do you care about the security of your wireless mouse?
OK, so I was curious, so I checked up on this. Maybe I'm wrong, but I don't see 64 bytes. Here's a reduced snippet of the member variables in a String class (1.4.1_06):
public final class String
    implements java.io.Serializable, Comparable, CharSequence
{
    /** The value is used for character storage. */
    private char value[];

    /** The offset is the first index of the storage that is used. */
    private int offset;

    /** The count is the number of characters in the String. */
    private int count;

    /** Cache the hash code for the string */
    private int hash = 0;
There are 3 ints and a char array. Each int is 4 bytes, the reference to the array is 4 bytes, and the array has a length attribute (another int) that is 4 bytes. That's 3*4 + 4 + 4 = 20 bytes. Then you would add 2 bytes for each character. But an empty string would be 20 bytes by my count.
Where did you come up with 64 bytes?
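The count above can be written out as arithmetic. One caveat: it only adds up the declared fields. A real VM also stores a header on every object and rounds sizes up for alignment, and those amounts vary by VM, so the second method below takes the header size as a caller-supplied assumption and is not a measured figure. All names are hypothetical.

```java
// Hypothetical back-of-the-envelope estimate of String storage.
class StringSizeEstimate {

    static final int INT_BYTES = 4;
    static final int REFERENCE_BYTES = 4; // as assumed in the comment above
    static final int CHAR_BYTES = 2;

    // The commenter's count: three int fields, one array reference,
    // the array's length field, plus two bytes per character.
    static int fieldsOnly(int length) {
        return 3 * INT_BYTES + REFERENCE_BYTES + INT_BYTES + CHAR_BYTES * length;
    }

    // Same count plus an ASSUMED per-object header for the two objects
    // involved (the String and its char[]). Alignment padding is ignored.
    static int withAssumedHeaders(int length, int headerBytesPerObject) {
        return fieldsOnly(length) + 2 * headerBytesPerObject;
    }

    public static void main(String[] args) {
        System.out.println(fieldsOnly(0));
        // 8 is only an example header size, not a claim about any VM.
        System.out.println(withAssumedHeaders(0, 8));
    }
}
```

With any realistic header size the empty string comes out larger than 20 bytes, which may account for part of the gap between the two figures being argued over.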
$foo = "$frob noz $baz->barCount() bars found.";
The size of the runtime really is irrelevant to porting Java to a given platform since the majority of the runtime is java class files. The part that needs to be ported is the VM and any other native stuff (which isn't much).
My blog: http://jkratz.dyndns.org/~jason/blog/
I've seen a lot of responses in this thread from people saying "bugger CPU time, just spend a few hundred dollars on more/faster CPU/RAM. Dev time is expensive!"
Well, yes it is, but it's not always that simple.
I have a Java app here I'm performance tuning for my PhD that allocates frequency hop sets to mobile phone networks. Running on a 2.2Ghz Athlon with 512Mb RAM, a 15-transmitter test case takes an hour, it scales exponentially with the number of transmitters, and I want to address a 458-transmitter case. It's about 10^500 calculations, or it was before I started improving the algorithm. Even so, it's still going to be billions of iterations through the inner loop. Even a 1% speedup is hours off my runtime and that's a big thing for me.
So when you dismiss all performance tuning with a wave of your hand, remember us poor beleaguered scientists. We actually need all this stuff.
You win again, gravity!
Don't post when you're completely ignorant on a subject. Not only is this statement wrong,
I challenge you to make a C++/C# application that is thread-safe and can scale to millions of pageviews per day without writing a ton of supporting code.
as, in the context of this statement, although naive, and frankly, well, wrong again,
"With a good J2EE app server, a java coder essentially just has to wrap his thread-unsafe code in a synchronized() statement, and he's done-- his app is now thread-safe."
It's just factually incorrect. Pick up a C# book and look up "lock". Leave development to the engineers, ok?
When I write a Java program... if it's too slow today, then, in time, the problem will go away without any more effort on the part of the programmer. In a year from now, we'll certainly have faster computers, which will make up for any speed problems.
On the other hand...
A year from now, we will almost certainly not have CPUs that are suddenly immune from dangling pointers and memory leaks.
In other words, there are no plausible, foreseeable near-future advancements in computing hardware that could fix the worst problems of C/C++. Meanwhile, the near-future advancements in hardware are almost guaranteed to fix Java's worst problem.
The same holds true for doing your computing today... regardless of what hardware is available a year from now. Personally, I'd rather have a slow program that could keep running than one that was really fast, but crashed before I could save my work.
Sun Microsystems made claims of an incredible language that just weren't true. They claimed the language would be revolutionary and that software written for it could work everywhere. Partially true but no one wants a slow computer either. The whole reason people buy new computers is for speed.
Java is not liked nor used by everyone. No one wants it on their computer. No one wants large apps written in java.
Don't be part of the herd mentality. Just because Sun or Microsoft tells you to use their software doesn't mean you have to use it.
Try these languages if you don't like java.
modula-3
squeak
netrexx
rexx
euphoria
python
xbasic.org
tcl/tk
1) *The* Java VM is very difficult to port. Java VMs in general aren't, but if you want something that is performance competitive, you need one of the big ones. This is a pain at the application level, but would suck for a systems-level language like C/C++.
2) C/C++ doesn't have the benefit of a portable runtime. If it had a giant class library like Java, you'd have to port a whole lot of native code.
This is nothing to slight Java. For the places where Java is used, the large runtime isn't a big disadvantage. But C/C++ is used in very different places than Java.
A deep unwavering belief is a sure sign you're missing something...
Java apps feel creepy. Java may work buried deep in the back room, serving up crap. Java as a client app tastes funny.
I have been writing Java apps for 6 years and all the projects have succeeded. The Java apps are stable and scalable.
We tried similar apps in C#; they just broke down under load, and the holes are bigger than Venus.
"1) *The* Java VM is very difficult to port. Java VMs in general aren't, but if you want something that is performance competitive, you need one of the big ones. This is a pain at the application level, but would suck for a systems-level language like C/C++."
I think you're confusing the *java vm* with the *java runtime environment*, e.g.
http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html - approximately compiled (on win32 platform) to << 1mb dll's (could probably squeeze it into << 100kb if you wanted to remove some parts - e.g. CDC, using SWT instead of Swing + AWT etc.). If you consider porting these rather few native files hard... ok, so be it - and
http://java.sun.com/j2se/1.4.2/docs/api/index.html - runtime environment which consists of "pure" java classes that, when necessary, bind with the earlier mentioned dll's. These classes need not be ported at all (don't mention bugs please - remember WORA ;)
Why did the thread bitching about + vs. append() get so many points?
Blar.
I can't find any of your numbers at the Sun sites you linked to. I'm talking about porting the HotSpot or the IBM VM. Those are most definitely not less than 1MB of code. The HotSpot VM is more than 235,000 lines of code. For reference, that's about as large as the non-driver core of Linux 2.0. Imagine trying to write an OS in C/C++ and having to port 235,000 lines of code first!
A deep unwavering belief is a sure sign you're missing something...
I challenge you to make a C++/C# application that is thread-safe and can scale to millions of pageviews per day without writing a ton of supporting code.
With a good J2EE app server, a java coder essentially just has to wrap his thread-unsafe code in a synchronized() statement, and he's done-- his app is now thread-safe.
LOL. Firstly C# has a "lock" statement and C/C++ have libraries with similar functionality.
Secondly, you know nothing about concurrent programming. Simply locking a bit of code doesn't suddenly make your program "thread safe". Get a clue.
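A small sketch of why locking individual pieces of code is not the same as being thread-safe. The `HitCounter` class and its methods are invented for illustration: every accessor is synchronized, yet the compound "read, add one, write back" in `recordUnsafe` can still lose updates, because another thread may run between the two locked calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical page-hit counter used only to illustrate the point.
class HitCounter {
    private final Map<String, Integer> hits = new HashMap<String, Integer>();

    // Each accessor is synchronized, so each single call is safe on its own.
    synchronized Integer get(String page) {
        return hits.get(page);
    }

    synchronized void put(String page, int count) {
        hits.put(page, count);
    }

    // NOT atomic: two threads can both read the same old value here,
    // and one of the two increments is then silently lost.
    void recordUnsafe(String page) {
        Integer old = get(page);
        put(page, old == null ? 1 : old + 1);
    }

    // Atomic: the lock is held across the whole read-modify-write.
    synchronized void recordSafe(String page) {
        Integer old = hits.get(page);
        hits.put(page, old == null ? 1 : old + 1);
    }

    // Runs several threads against recordSafe and returns the final count.
    static int runSafe(final int threads, final int perThread) {
        final HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        counter.recordSafe("/index");
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        Integer total = counter.get("/index");
        return total == null ? 0 : total;
    }

    public static void main(String[] args) {
        System.out.println(runSafe(4, 10000));
    }
}
```

Swapping `recordSafe` for `recordUnsafe` in the worker may produce a smaller total on some runs and the right one on others, which is what makes this class of bug hard to catch in testing.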
That Java is interpreted is bad enough compared to compiled languages like C. Just imagine the situation within a loop - suppose there is an access of a class' data member. The interpreter / JVM reads in this instruction, parses it and executes it within its own space. Imagine on the other hand what compiled code would do - the class.value thing would be translated directly to a memory address. And loading code is done by the processor. Everything is some orders of magnitude faster. And this whole thing is in a loop that might run a zillion times. I know most people around know this, but I was trying to describe it in its full glory so that the optimisations that all these posts are discussing look powerless.
So optimisation for Java will, most probably, not do much. But still, it makes sense to make it as fast as it can get. In this context, it has to be pointed out that the development of faster and faster C compilers was spurred on by a lot of academic research. That was the time when those things were still new and hot. Now, I doubt if that magnitude of research goes into JVM & javac optimisation.
This sig is empty.
yup, that's it. go fuck yourself.
Anyway, it's good to know what your app is actually doing to the db. IE basically just swaps your JDBC driver and captures all the driver's interactions with the db.
Sad news, Stephen King troll wrong again at 14.
Karma: Undead.
Contrary to your CS conventional wisdom, the difference adds up in many cases when you are talking dollars and cents. It doesn't just apply to the "bleeding edge."
I have a positive modifier on Troll. When I mod someone Troll their karma should go UP!
Wow, people sure get offended when you say one language is better than another. I wonder if it is because we learn a few things and then we don't want to use something else because it hurts our pride and the time invested.
Java is just a tool to get the job done. It has its pros and cons just like anything else. I love it for server-side stuff, but if speed was the ultimate issue then yes, I would code in something else. The point is that you use the right language to get the job done.
Also, there is a reason why we all don't code in binary. It sure would make the fastest code, but it would take a hell of a long time!
20% sounds like an inflated number. The typical app spends most of its time waiting for I/O or doing silly stuff for user interaction. A 20% overall difference would imply what, 50%, 100% slower when it's actually working?
There's some overhead, but it's never that bad. Sure, the overhead matters, which is why there's an investment on improving VM technology, providing access to native operations, etc.
But also the worst overhead offenders are not VM issues, but application design issues: blocking I/O, threading bugs, NOT using multithreading when you should, etc. Also, using Swing/AWT in a non-trivial GUI. I'm beginning to consider Swing/AWT just a giant bug.
However, let's assume the 20% speed difference in an application...
It's not that constant factors don't matter at all, it's that they matter the least. And when you have so many more important problems to fix, they take the last place in the priority list.
By definition, choosing your implementation language is an early decision that takes place long before that list is filled up. Although speed is always a concern, algorithmic speed is more important than PL speed, and independent of it. So PL speed shouldn't have that much to do with the decision.
Saving 20% in hardware is great, and the efficiency marketing advantage is important too. But those advantages are worth nothing if your application is buggier, less extendable, less flexible than your competitors', and especially if it gets to the market AFTER your competitors.
Developing applications in C is more expensive and complicated than developing them in Java, and the difference is typically more than the difference in speed. (I'm not saying that C applications are inherently worse than Java apps, just that to develop the same application with the same extensibility, features and stability takes more time, and a bunch of non-standard libraries).
Development costs add much more quickly than hardware costs, and unlike hardware costs, development costs are not guaranteed a return in performance. These are not dollars and cents, but hundreds and tens of dollars: development is more expensive than hardware.
Then you go to the client 6 months after your competitors and try to sell them the application. If they haven't already bought from the competition, you'll try to convince them that although your application is more expensive, and although it hasn't been tested in the market for as long as the other ones, and although they'll need a C programmer versed in your favorite non-standard libraries to maintain it (you do give them API documentation and the tools to maintain it, right?), your application will save them a few bucks in hardware.
Unless the difference in hardware costs has more than 4 digits, I think your customer will advise you to take an accounting class.
If the difference is more than 4 digits, you are pushing the technology and you need to care about constant factors.
Either that or you're not dealing with a typical application. For example, a scientific analysis program that spends most of its time in pure computation needs all the juice it can get. Although I understand Java works fine for pure computational tasks.
Now, if you can make your applications in C as cheap, fast, and safely as with Java, then you have great C developers so you should just keep doing that. Most people can't.
Freedom is the freedom to say 2+2=4, everything else follows...
I have seen the same, just that after spending another six months on software development, it didn't make a difference and they had to buy new hardware nonetheless just as advised in the beginning.
And yes, it solved the performance problem (faster drives it was for a database application).