Book Review: Java Performance
jkauzlar writes "The standard Oracle JVM has about sixty 'developer' (-XX) options which are directly related to performance monitoring or tuning. With names such as 'UseMPSS' or 'AllocatePrefetchStyle', it's clear that Joe Schmo Code Monkey was not meant to be touching them, at least until he/she learned how the forbidding inner recesses of the JVM work, particularly the garbage collectors and 'just-in-time' compiler. This dense, 600-page book will not only explain these developer options and the underlying JVM technology, but discusses performance, profiling, benchmarking and related tools in surprising breadth and detail. Not all developers will gain from this knowledge and a few will surrender to the book's side-effect of being an insomnia treatment, but for those responsible for maintaining production software, this will be essential reading and a useful long-term reference." Keep reading for the rest of jkauzlar's review.
Java Performance
author
Charlie Hunt and Binu John
pages
693
publisher
Addison Wesley
rating
9/10
reviewer
Joe
ISBN
0-13-290525-6
summary
Java performance monitoring and tuning
In my experience, performance tuning is not something that is given much consideration until a production program blows up and everyone is running around in circles with sirens blaring and red lights flashing. You shouldn't need a crisis, however, before worrying about slow responsiveness or long pauses while the JVM collects garbage at inconvenient times. If there's an opportunity to make something better, if only by five percent, you should take it, and the first step is to be aware of what those opportunities might be.
First off, here's a summary of the different themes covered:
The JVM technology: Chapter 3 in particular is dedicated to explaining, in gory detail, the internal design of the JVM, including the Just-In-Time Compiler and garbage collectors. Being requisite knowledge for anyone hoping to make any use of the rest of the book, especially the JVM tuning options, a reader would hope for this to be explained well, and it is.
JVM Tuning: Now that you know something about compilation and garbage collection, it's time to learn what control you actually have over these internals. As mentioned earlier, there are sixty developer options, as well as several standard options, at your disposal. The authors describe these throughout sections of the book, but summarize each in the first appendix.
Tools: The authors discuss tools useful for monitoring the JVM process at the OS level, tools for monitoring the internals of the JVM, profiling, and heap-dump analysis. When discussing OS tools, they're good about being vendor-neutral and cover Linux as well as Solaris and Windows. When discussing Java-specific tools, they tend to have a bias toward Oracle products, opting, for example, to describe the NetBeans profiler without mentioning Eclipse's. This is a minor complaint.
Benchmarking: But what good would knowledge of tuning and tools be without being able to set appropriate performance expectations? A good chunk of the text is devoted to lessons on the art of writing benchmarks for the JVM and for an assortment of application types.
Written by two engineers for Oracle's Java performance team (one former and one current), this book is as close to being the de facto document on the topic as you can get and there's not likely to be any detail related to JVM performance that these two men don't already know about.
Unlike most computer books, there's a lot of actual discussion in Java Performance, as opposed to just documentation of features. In other words, there are pages upon pages of imposing text, indicating that you actually need to sit down and read it instead of casually flipping to the parts you need at the moment. The subject matter is dry, and the authors thankfully don't try to disguise this with bad humor or speak down to the reader. In fact, it can be a difficult read at times, but intermediate to advanced developers will pick up on it quickly.
What are the book's shortcomings?
Lack of real-world case studies: Contrived examples are provided here and there, but I'm really, seriously curious to know what the authors, with probably two decades between them consulting on Java performance issues, have accomplished with the outlined techniques. Benchmarking and performance testing can be expensive processes and the main question I'm left with is whether it's actually worth it. The alternatives to performance tuning, which I'm more comfortable with, are rewriting the code or making environmental changes (usually hardware).
3rd Party tool recommendations: The authors have evidently made the decision not to try to wade through the copious choices we have for performance monitoring, profiling, etc, with few exceptions. That's understandable, because 1) they need to keep the number of pages within reasonable limits, and 2) there's a good chance they'll leave out a worthwhile product and have to apologize, or that better products will come along. From my point of view, however, these are still choices I have to make as a developer and it'd be nice to have the information with the text as I'm reading.
As you can see, the problems I have with the book are what is missing from it and not with what's already in there. It's really a fantastic resource and I can't say much more than that the material is extremely important and that if you're looking to improve your understanding of the material, this is the book to get.
You can purchase Java Performance from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
It seems this kind of volatile, deep, undocumented black magic might change radically from JVM revision to revision. Although the Oracle "documentation" page seems to contain a lot of "legacy" options, there still seems to be a risk that this book would be outdated as soon as the next JVM release.
Oh, well, the tech publishing industry seems to be doing pretty well, even if the rate of technology change means that a tech fact is OBE before it's committed to ink.
Welcome to the Panopticon. Used to be a prison, now it's your home.
I want to purchase, but I see a book with the same title, but different ISBN, and different number of pages on amazon. Can you double check these values?
The development team at my company is currently reading this book. I'm three chapters in, and am having a hard time following it all. I often have to reread paragraphs, or entire pages, as soon as I finish them just to keep terms and names straight. Some of the applications discussed have names five words long, like Microsoft Process Performance Analysis Console, or something head-spinning like that. That's the actual name of the application, so there is no other way to refer to it; it's just the nature of the material. It's a good reference, and I'd highly recommend reading it if you work with Java full-time. The information for debugging and fixing performance issues is invaluable. Just be sure to take your Ritalin or Adderall beforehand.
That's what JNI is for. Write your performance-critical code in C.
In a band? Use WheresTheGig for free.
Sure if you can stand the pain and ugliness of using JNI.
No, it's more like raising the efficiency of your diesel 18-wheeler by a few mpg and pocketing the savings.
Nicely played. Let's see if you get any bites!
If you don't like JNI, you can use SWIG, it generates JNI code for you.
Loser: Javascript != Java
I'd just like to point out that often even economy cars are worth upgrading for performance because car-makers often hamper their own designs for various reasons (e.g., severely under-dampening the suspension).
Also, do any JREs actually support NIO direct buffers*? I tried it on OS X and Amazon's Karmic 64 nodes about a year and a half ago and it didn't work, thus doubling my memory and severely hampering a project (since I unfortunately had to program sockets for the first time in over a decade and was advised to go with Java over C). Unless distributions actually support direct buffers, using C in Java is much worse than it needs to be.
*I think that's the right term for sharing memory between C and Java.
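For reference, "direct buffer" is the usual term: a direct `ByteBuffer` lives outside the Java heap, so native code reached through JNI can work on the same memory without an extra copy. A minimal sketch of the Java side only, assuming nothing about the project described above (the class and method names are made up for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; the names here are illustrations, not from the thread.
final class DirectBufferDemo {

    // Allocates off-heap memory in the platform's native byte order.
    // Native code could obtain its address through the JNI function
    // GetDirectBufferAddress; that side is not shown here.
    static ByteBuffer allocate(int capacityBytes) {
        return ByteBuffer.allocateDirect(capacityBytes).order(ByteOrder.nativeOrder());
    }

    // Writes an int at offset 0 and reads it back, to show the buffer is usable.
    static int roundTrip(int value) {
        ByteBuffer buf = allocate(Integer.BYTES);
        buf.putInt(0, value);
        return buf.getInt(0);
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocate(1024);
        System.out.println("direct=" + buf.isDirect() + " capacity=" + buf.capacity());
        System.out.println("roundTrip(42)=" + roundTrip(42));
    }
}
```

If `allocateDirect` succeeds and `isDirect()` reports true, the JRE supports direct buffers; whether a given C library can then share that memory depends on the JNI glue, which this sketch leaves out.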
You're confusing ECMAScript (JavaScript) with Java. You should know what you're talking about before you make a fool out of yourself again.
I'm working on the inverse right now: porting C code to Java and improving it. It does take longer to run, but it's also doing a lot more. However, setting up a worker thread pool in Java is much easier than doing threading in C. With the -server flag and some tuning for max RAM usage, it can do a reasonable amount of work. Of course, the RAM needed for the program is much larger than its C counterpart.
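To give a sense of how little code a worker thread pool takes, here is a rough sketch using `java.util.concurrent`. It is not the poster's code; the class name and the toy workload (summing squares) are assumptions chosen only to keep the example self-contained:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; the workload is a stand-in for real tasks.
final class WorkerPoolDemo {

    // Sums the squares of 1..n, submitting one small task per number.
    static long sumOfSquares(int n, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long x = i;
                Callable<Long> task = () -> x * x;
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } finally {
            pool.shutdown(); // lets the worker threads exit once the queue drains
        }
    }

    public static void main(String[] args) throws Exception {
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(sumOfSquares(10, threads));
    }
}
```

Something like `java -server -Xmx512m WorkerPoolDemo` would apply the kind of flags mentioned above; the 512m figure is an arbitrary placeholder, not a recommendation.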
They should be 0. Or maybe 1. This just proves that Java is a piece of shit in every regard.
20 words to say that? There should be 0. Or maybe one. This just proves that AC is a piece of shit in every regard.
Write boring code, not shiny code!
There is a _big_ difference between clumsily optimized (or unoptimized) Java and carefully-optimized Java--more, in my experience, than the difference between clumsily optimized Java and clumsily optimized C or C++. So if you are already using Java for some reason (robustness to faults, ease of parallelism of certain kinds (w.r.t. C), library that does exactly what you need, etc.), you should figure out how to optimize it before bailing out and using a different language.
Only if you absolutely must get as much out of your hardware as physically possible should you start using C/C++, and at that point, don't expect to be using ANSI C; you should be issuing SSE4 instructions and such (basically writing targeted assembly, even if you are doing so in a way that looks like C functions) that have been cleverly crafted to do exactly what you need.
(And don't forget that while you are taking extra time to write all this low-level high-performance code, your computers _could_ have been running using the slower code, making progress towards a solution, or serving customers albeit with delays, etc.)
I can see the use for options to specify heap sizes, and to tweak latency vs. throughput of the GC. These are critical for memory-constrained and timing-constrained applications, respectively. I presume the other options are performance related, but how effective are they really?
Higher Logics: where programming meets science.
I wonder what type of projects would need to tweak the JVM. I am assuming it's huge Java web-based enterprise applications.
-XX11 is useful as it's +1 faster than -XX10, should be the default really.
The JVM really needs to get smarter. 60 different controls and switches is just too much. How hard can it be for the JVM to look at the available number of cores and just turn on the Parallel Garbage Collector? Do I really have to manually turn it on so that Minecraft will use it? Why can't the JVM allocate more memory on its own? Does it really need permission to use more than 1 Gig of memory? It just sits there waiting for the day some user decides to import every single possible datapoint into it, then crashes, having used 1 Gig of 8, with an "Out of Memory" error. It's not like developers know what the Xmx and Xms settings need to be. They just set them arbitrarily high in hopes that no user tries to find out the largest datafile the program can take. That just slows it down, so that when the GC finally does fire off it has 10x the amount of trash it would have had if the value were set lower. Those options are only useful on internal applications that never get into the hands of everyday users. It's probably a great book for server-side development, but it is highlighting a major failing of the JVM.
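For anyone unsure what limit their JVM actually picked, the running process can report it. A small sketch using only `java.lang.Runtime` (the class name `HeapReport` is made up here); `maxMemory()` roughly corresponds to the `-Xmx` ceiling, whether it was set explicitly or chosen by default:

```java
// Hypothetical helper class; only java.lang.Runtime is used.
final class HeapReport {

    // Converts a byte count to whole mebibytes (rounding down).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Summarises the heap limits the running JVM reports about itself.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();          // most the heap may grow to
        long committed = rt.totalMemory();  // heap currently reserved by the JVM
        long used = committed - rt.freeMemory();
        return "max=" + toMiB(max) + " MiB, committed=" + toMiB(committed)
                + " MiB, used=" + toMiB(used) + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with, say, `-Xmx256m` and once without shows what the default ceiling is on a given machine; the 256m value is only an example.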
In my experience, performance tuning is not something that is given much consideration until a production program blows up and everyone is running around in circles with sirens blaring and red lights flashing.
- if production blows up it signals that the underlying problem is not likely to be fixed with 'performance tuning'.
There is performance deficiency and then there is "production blows up" and those are different things and must be addressed by different sets of practices at different times.
Production blowing up means the design is flawed, it means misunderstanding of how the application was going to be used.
Slow response time, on the other hand, is about tuning, but it's very unlikely that environment tuning can really help to fix this.
Back in 2001 I was working for then Optus that was bought out by Symcor and the main project they brought me to (contract) was for this Worldinsure insurance provider, and the project was to do some weird stuff, business wise speaking: allow clients to compare quotes from different insurance providers. The business model was changing all the time, because insurance providers do not want their products to be compared against one another online (big surprise).
The contract was expensive (5 million) and WI wouldn't pay the last bit (a million, I think) until the application would start responding at 200 requests per second, and it was doing 20 or so :)
If anybody thinks that just some VM tuning can fix a problem where an application is 10 times slower than expected, well, you haven't done it for long enough to talk about it then.
It took a month of work (apparently I wrote a comment on it before) that included getting rid of middleware persistence layer, switching to jsp from xslt, reducing session size by a factor of 100, desynchronising some data generators, whatever. Finally it would do 300 requests per second.
But the point is that when things are crashing or when performance is really a huge issue, you won't be optimising the VM.
VM optimisation is not generally done because I think the application has to do something that is not generic.
Imagine an application that only does one thing - say it only reads data from a file and then runs some transformation on it, maybe it's polygon rendering. Well then you know that your app. is doing only ONE THING, then you can probably use VM optimisation, because you can check that the one thing your app does will become faster or more responsive, whatever.
But if your app includes tons of functionality and tons of libraries that you don't have control over and it runs in some weird container on top of JVM, then what do you think you are going to achieve with this?
You likely will optimise something very specific and then you'll introduce a strange imbalance in the system that will hit you later and you won't see it coming at all.
If your app does one thing, maybe you have a distributed cluster with separate instances being responsible for one type of processing, then you probably can use specific optimisation parameters.
You can't handle the truth.
-XX:+UseConcMarkSweepGC helps a ton if you need low latency, and have at least 4 cores.
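A quick way to confirm which collectors a flag like this actually selected is to ask the JVM through the standard `java.lang.management` API. The sketch below is only an illustration (the class name is made up), and note that the CMS collector was removed in JDK 14, so the flag applies to older JVMs only:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class built on the standard management beans.
final class GcReport {

    // Returns the names of the collectors the running JVM exposes.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "not available".
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Comparing the collection counts and times before and after changing a GC flag gives a first, rough impression of whether the change did anything for pause behaviour.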
And does the numeric value of that option dictate the degree of JIT-time optimization performed?
Higher Logics: where programming meets science.
You could also just use a faster programming language.
Fair points, but I'd say there are a few places where Java just doesn't cut it: strict scheduling and/or real-time needs. Oh, and GUI stuff, too.
So basically, no GUI, no audio, no video. Which only more or less leaves what it currently is king of the hill for - server side business processing.
I've tried real time and Java, and the jitter that garbage collection introduces (yes, even with parallel garbage collection and all that stuff) makes it hugely unpredictable.
Strangely enough, the company formerly known as SUN knew this, and tried to create a "real time" flavour of Java with such extensions - but it takes so much effort to port your code (and the JVM is slightly less than brilliant) that you're far better off going to C/C++.
My three favourite languages for programming that include an integrated garbage collector and are at the top level of CPU performance are:
Successful stories: Apache's Hadoop over JVM, Plasma compiled by OCaml, Agda type assistant, etc.
JCPM: and what about slower programming languages? I won't use them much unless stupid code around there (e.g. bash for commands, html for layout, etc).
Or just use the FFI, Java Native Access.
Ezekiel 23:20
Agreed--you need to use libraries written in C/C++ for audio/video. But you can certainly call them from Java, as long as they handle the streaming side of things on their own. Whether or not it's suitable for GUI work depends on what you want your GUI to look like; if platform integration is your top priority (default widget sets, color schemes, etc.) then no, it doesn't work so well. Otherwise, it's not fantastic but it's serviceable (again assuming blistering performance isn't necessary).
Hmmm, what are you talking about? Java is about as much of a scripting language as z80 assembly. When you run the JVM, it essentially is an emulator, emulating imaginary hardware. Just like running a z80 processor emulator on your intel/amd/ppc/sparc cpu. Would you call z80 assembly a scripting language? When the OP talks about relations between the JVM and Joe Schmo Code Monkey, what he really meant was "Programming in Java is very different from knowing your hardware and the hardware you are trying to emulate in order to figure out the most efficient way to complete tasks". Then again, Java was never designed to be efficient. The goal was to make a standard portable environment across many processors and operating systems. So performance tuning in one environment may very well be detrimental in another. As such, mastery in performance tuning also varies from environment to environment. Someone who's awesome at tuning Java performance on solaris/sparc may totally suck at tuning Java performance on linux/intel. All in all, I think your assessment is oversimplifying and generally ignorant.
Where is the "Ignorant" mod tag?
Why do you need to specify heap size at all?
Every native program can allocate just as much heap as it needs from Virtual Memory, and no more.
No for audio/video you need libraries written using SIMD assembly. Anything else is going to be way too slow.
I've worked with Java since 1.0. The only optimization options I've ever used were the heap and stack size adjustments.
Setting your memory heap too high actually degrades performance, oddly enough. I've got 4GB on this box and over 2.5G is normally used by disk cache, but if I allocate more than about 768MB to the heap, the performance suffers.
Maybe some of these options have real effects on certain production code characteristics, but I've found the best performance tuning options are:
So there you have it -- my favourite REAL WORLD, TESTED, and PROVEN TO WORK performance tweaks.
I do not fail; I succeed at finding out what does not work.
The more you abstract what's underneath the less performance you can achieve. Assembly vs C, C vs Java/C#/Python, Running code on a host vs a guest virtual machine, I think you get the point.
Sounds like an oxymoron to me...
You have a point, and the answer isn't any good either. You specify the heap size so that the Garbage Collector can be lazy and not clean up memory until the specified heap is at a certain level. I think it's just a Java thing. C# and .NET grow as needed up to the system's limit. GoLang, even though native, is garbage collected (sorta) and I've yet to find some arbitrary default, if one exists. Java only wants to use 1/4th or less of the available system memory, and I can only think of reasons linked to Garbage Collection.
Oh, please, neither Java nor C++ is superior to the other. They both have strengths at certain kinds of programming but altogether support extremely similar semantics in most areas. There are very difficult-to-use portions of both Java and C++. You should get yourself some more experience.
Brian Fundakowski Feldman
Java's performance is worse due to many factors:
JCPM: Java is still a toy, a baby after >10 years of unsustainable growing, and it's not adult yet.
Bugger. I had modded a bunch of posts on this thread and now have to chuck them away just to counter your ignorance (lest your incorrect assumption spread to others). It's cool you are taking a stab at why Java might limit memory, but unfortunately that guess is not correct - so I hope to set yourself (and any other readers) straight on why Java has this feature (if you don't understand the feature you'll see it as a limitation, when in fact it is very important for security).
The limit Java imposes on memory is there to ensure that critical resources (memory) are preserved for the system, and it gives the Java user a way to limit what any application can do. This is a very important protection if you follow Java's internet model, where Applets (or these days, WebStart applications) allow remote applications to run. System administrators can also partition what any application can do, in case one is rogue (they can't trust those damn application developers, like me) and uses up all the system memory shared between many applications when running on Big Iron. I don't know whether any remote running .NET applications have this protection and would be interested to hear if they do - if they don't then that means any .NET application could in theory bring your server (and all the other important services you have running on it) to a crawl (as it swaps memory) or possibly even crash through memory exhaustion.
Cool story bro
Direct buffers work on Windows. You just need enough contiguous memory space available to allocate the buffer.
So basically, no GUI, no audio, no video.
That's why Vuze doesn't have an integrated video player or a native, cross platform GUI.
That's why Vuze doesn't have an integrated video player or a native, cross platform GUI.
Given Vuze does have (kinda) those things, I assume you're advocating that Java does?
I'd hardly say using third party toolkits and/or native blobs written in another language (for both of those you mention) adds weight to the strengths of Java....
In fact, it confirms what I said - you don't do those things in Java.
The standard Oracle JVM has about sixty 'developer' (-XX) options which are directly related to performance monitoring or tuning... This dense, 600-page book
600 pages to explain 60 compiler options? That's why I really hate Java.
The switches and controls described are run-time options, not code optimizations. They are more like the 'economy or performance' transmission setting. Sometimes you want to get the best MPG and don't particularly care how fast you accelerate. Other times you want quick acceleration, and are willing to let mileage suffer. Having a control means you don't have to re-build your car every time you want a change.
These options are the same kinds of things. If you only have one application running on a box you may as well let the JVM grab as much of the physical memory as it can, so it does as little garbage collecting as possible. If you add more applications you may want to dial back the memory usage and take a performance hit for that application so other more important apps run better.
Can you do the same things with C or C++? Of course, as long as each and every application provides its own controls, coded into the applications, for doing that.
Microsoft Works
Military Intelligence
Java Performance
If you do not have a performance issue, that is fine. Personally, I prototype in Python and port time-critical parts to C. (Object-oriented C, as far as it makes sense.)
What I do not like about Java is that it is neither a really modern language like Python, nor a really fast and memory-efficient language like C. It is sort of a jack-of-all trades and good at none. It is also a master of syntactic clutter and complex, long code. Still, if you really know what you are doing, you can write good Java code.
As to the worker-thread pool, if you stay within the limits of the VM, it is admittedly easy to do something like that in Java. But as soon as you go beyond the not very impressive capabilities of the typical Java VM, you are screwed. But my guess would be you know that you are staying within these limits. So, yes, if you know what you are doing, even porting C to Java can make sense.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
That is nonsense. As long as you get good scheduling responsiveness, C is entirely fine. The problem with Java is not even the language itself (slow as it may be), but the garbage collection.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Double Bugger. I was going to mod you up, but you're sort of missing a piece of the puzzle. Java is supposed to be Write once run anywhere. So, yes a decent operating system should be able to enforce memory limits of applications so they don't bring your server to a crawl, but that's not built into Java, so they had to do it by default with the stupid limitation on all programs.
Well.. maybe. Or Maybe not. But Definitely not sort of.
I must have misread the article. 600 pages to describe VM parameters? How insane is that! Java is turning into a more and more complex mess all the time.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
More like you didn't read it at all. Right in the summary it says: This dense, 600-page book will not only explain these developer options and the underlying JVM technology, but discusses performance, profiling, benchmarking and related tools in surprising breadth and detail.
The C Programming Language is just 228 pages. It explains the entire language.
This book is 600 pages. It explains the 60 possible command line options for the JVM.
Java: Enterprisey.
Optimizing Java is sort-of like optimizing a low-powered family car: It does not make a lot of sense. If you really need performance, go to C
Probably true on the client, but not on servers. Ever heard of JIT?
http://en.wikipedia.org/wiki/Just-in-time_compilation
-XX11 is useful as it's +1 faster than -XX10, should be the default really.
I hear they're contemplating a "-XX12" option that threatens to rip asunder the very fabric of time and space.
.there is enough of everything for everyone.
SWT is quite a popular Java library written by one of the JVM vendors. Java is the only platform I know that has libraries to write a GUI that runs on Windows, Mac, Linux, Firefox, Chrome, Opera and Internet Explorer.
Are you really that dense? It describes the command line options (which would have to be written into your own application if you wanted to have that level of tuning in C). It describes the technology of the JVM (so add some books describing processors and OS internals to your list). It describes performance, profiling, benchmarking, and the tools used to do those things (add a few dozen more books to your list).
Now obviously C is much easier to do those things with. That is why the first tool for analyzing C memory usage (ThreadSpotter) takes a 149 page book to describe it. So your C library is already up to 377 pages, and you haven't even touched on the technology issues. And of course any documentation on how to tune your application must be written by you.
Please elaborate how this affects the "Write once run anywhere". I'm afraid I don't quite get what you mean. Thanks.
Actually, a modern Java Virtual Machine and JavaScript Virtual Machine are far more similar than they are different.
Brian Fundakowski Feldman
Unfortunately, when you really get down to it you will have similar problems in C or C++. You cannot allocate memory from the system heap (malloc or operator new) in the critical path of real-time code. That means everything is preallocated. That means you could have been using Java anyways, with GC disabled. There may be other reasons to use C or C++ for these systems, but the nondeterminism of dynamic memory allocation really applies across the board. It just hits Java users earlier.
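The "everything is preallocated" approach looks much the same in Java as in C. Below is a rough sketch of a fixed pool of buffers that are all created up front, so the hot path only hands them out and takes them back. It does not disable the garbage collector; it just avoids producing garbage on the critical path. The class name and sizes are made up for illustration, and the pool is not thread-safe:

```java
import java.util.ArrayDeque;

// Hypothetical fixed-size pool; not thread-safe, sizes chosen by the caller.
final class BufferPool {
    private final ArrayDeque<byte[]> free;
    private final int bufferSize;

    // All allocation happens here, before any time-critical work starts.
    BufferPool(int count, int bufferSize) {
        this.bufferSize = bufferSize;
        this.free = new ArrayDeque<>(count);
        for (int i = 0; i < count; i++) {
            free.push(new byte[bufferSize]);
        }
    }

    // Hands out a preallocated buffer, or null if the pool is exhausted.
    byte[] acquire() {
        return free.poll();
    }

    // Returns a buffer to the pool; rejects buffers of the wrong size.
    void release(byte[] buf) {
        if (buf == null || buf.length != bufferSize) {
            throw new IllegalArgumentException("buffer does not belong to this pool");
        }
        free.push(buf);
    }

    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(4, 256);
        byte[] b = pool.acquire();
        System.out.println("available after acquire: " + pool.available());
        pool.release(b);
        System.out.println("available after release: " + pool.available());
    }
}
```

Returning null on exhaustion rather than allocating a fresh buffer is a deliberate choice in this sketch: the caller has to handle the shortage explicitly instead of silently falling back to the allocator.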
Yes; GCC and ICC are quite good at optimizing SSE intrinsics, to the point that hand-coded asm can be largely avoided. I have heard otherwise for Visual Studio; fortunately, Intel's compiler works on Windows as well. As to the rest of the code, well... just keep an eye on the assembler output from every. single. build.
The book is discussing run-time tuning, so 'write once run anywhere' does not enter into it.
It seems that a lot of people on here don't 'get' run-time tuning, or they don't think it is important. In the environment in which these parameters would normally be used (enterprise software), tuning is critical. All enterprise middleware and most applications have tons of tuning. In an enterprise environment you do want the ability to trade off memory usage for processor usage, for instance. You want to control how much memory is being used. You want to control how many cores your application can use. And those things change depending on lots of variables. For example, at different points during the year different applications may have higher or lower priorities. Workload can be added to or removed from an image, requiring rebalancing of resources. You may find that scaling an application up or down requires different handling of garbage collection.
Now, all these things can be done (and must be done in an enterprise) regardless of language. If you don't have a tunable run-time then the other option is that every application builds in its own tuning parameters.
An issue recently came up on my engineering team where a Pig MapReduce job that stores in HBase slowed over the course of completing tasks until all the tasks failed due to timeouts. What appeared to be happening was a GC failure and pause due to tenure region exhaustion and the built-in cluster function to kill off the garbage-collecting regionserver. The link below describes the issue and possible workarounds by implementing a custom memory allocation strategy. It's also a must-read for anyone who isn't a Java garbage collection expert. http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
Java is an OOP butter knife that does its very best to obfuscate hardware requirements.
C++ is the swiss army knife equivalent that requires full hardware awareness and customization when applicable, boost and TR2 when not.
If you're a code monkey, enjoy Java, Struts, Spring, and Swing, but optimizing for performance is like breaking down the foundation of the language.
Wow, I'm surprised at the driveby downmods in this thread. I mean, I held this image of Java developers as open minded individuals, but maybe there is a significant minority that just aren't. Guys: it is a fact. If you need your program to go fast, plus be OOP, then you should write it in C++. GCC will hand you somewhere between 10% and 200% speedup "for free" on real life algorithms, plus startup time normally faster by a multiple, and a fraction of the memory footprint. And small binaries, and runtime support always installed by default.
Java has its uses. It is generally faster to develop in and less tricky for the naive programmer. You can effectively build a project faster with less skilled and less expensive, more freely available developers. That is a big deal, not to be underestimated. But if your functional requirements include fast and tight, Java is not the right choice. Well, it's faster than Python or Bash. And sometimes not even that if you include JIT time.
Have you got your LWN subscription yet?
(Object-oriented C, as far as it makes sense.)
It never does, trust me. If what you need is object-oriented C, use C++.
Only if you absolutely must get as much out of your hardware as physically possible should you start using C/C++...
Disagree. You should choose C++ if the speedup will be *noticeable*, not just if you need to squeeze out every last erg of CPU power. In many cases the difference between Java and C++ performance, especially startup time, can be huge. Don't even think about writing light little utilities in Java.
Woah, browsers! Nice one, in that way you left out QT, which only runs on Windows, Mac and Linux. And blows SWT out of the water in every way.
Optimizing Java is sort-of like optimizing a low-powered family car: It does not make a lot of sense. If you really need performance, go to C
Probably true on the client, but not on servers. Ever heard of JIT?
Also true on servers, even with the JIT.
In many cases the difference between Java and C++ performance, especially startup time, can be huge.
Quite apart from the fact that there are applications where startup time really isn't very important, it's important to emphasize that it is possible to write bad code in any language. It's certainly possible to do bad code in both Java and C++. It should be possible for the best of C++ code to beat the best of Java code, but it will be quite difficult to reach that level with either language (both are quite subtle in places) and good code in either will beat bad code in either. (Choosing a good algorithm is still important, folks!)
The weird thing is that I've found some utilities definitely written in Java still start up very quickly; there are obviously some costs there that most apps are paying which they shouldn't be.
"Little does he know, but there is no 'I' in 'Idiot'!"
Maybe you could use JNI for that, in certain very specialized cases, but if you write parts of your application in C/JNI you run the risk of just combining Java's weaknesses (memory, performance) with C's weaknesses (error prone). A nullpointer in a native code part of your application will unceremoniously crash your JVM and everything running in it.
JNI is often used for things that can only be done in native code. One example I can think of is the atomic compare-and-swap operations in the java.util.concurrent package. These are implemented via a JNI method essentially calling just one machine instruction (on Intel; it may be a handful on some architectures). Yes, this is for performance reasons: an atomic compare-and-swap is faster than using locks, not because native code runs faster but because it's the only efficient way to implement it.
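For anyone who hasn't used these operations, here is a minimal sketch of a lock-free counter built on AtomicInteger.compareAndSet from java.util.concurrent.atomic. The CasCounter class is invented for illustration; how a given JVM maps compareAndSet onto the hardware is implementation-specific and not shown here.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical lock-free counter built on compare-and-swap. */
final class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Retries until our compare-and-swap wins, then returns the new value. */
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // Another thread changed the value first; loop and try again.
        }
    }

    int get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CasCounter counter = new CasCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.get()); // 200000: no update is lost
    }
}
```

No thread ever blocks here; a thread that loses the race simply re-reads the value and retries.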
Whenever I see Java and performance in the same sentence, the only words that belong between them are "has no", as in:
java "has no" performance
Imho the best test for a language (implementation) is whether it is possible to implement the language in itself. Would it be possible to write a JVM in Java alone? Then back to the drawing board.
How come you geeks can memorize mountains of trivia and argue the most subtle and arcane points about that trivia, yet none of you understands the apostrophe? It's one of the smallest punctuation marks and is dead easy to understand, yet it utterly defeats all of you.
Unfortunately, when you really get down to it you will have similar problems in C or C++.
Not quite - I do realtime audio processing and when you get down to millisecond / submillisecond timing issues, you simply can't do this in standard Java - even when using a no-allocation loop (as you mention). I know, I've tried apples to apples.
You can go down the route of using the special realtime JVM as I mentioned above, but then you get other problems (you're not really writing Java, you can't use any of the standard libraries etc).
I prototype my audio stuff in Java, simply because writing and debugging the DSP code is so much easier with good tools, but the bare metal implementation is C++.
Not my experience. I'm continually impressed with how fast Java and C# are, and how well systems written in these languages perform in realtime apps. Sure, you get outliers, but then you get outliers from the OS, core swaps, networking stacks, etc etc, it's just one more area you have to watch and carefully consider, that's all. I'm not suggesting that code which hasn't been thought about performs well in this environment, but that it's possible to produce perfectly functional realtime systems with these languages.
I'm not saying it's not fast, I'm saying that tight timing related things (like nano sleeps to wait until the soundcard buffers are ready to be filled again) have too much jitter to be useful.
I can happily get 10 ms audio latency in Java, but going any lower is where it just doesn't cut it. Letting the thread busy wait isn't really an option when the machine has to be doing other things at lower priority too.
In short, using Java's nanosleep with tight timing tolerances seems to randomly get wakeup jitter when the VM chooses to do some internal stuff. This is with a no-allocation loop confirmed with tracing tools.
My best guess from profiling the system calls is that Java is doing something mutex-related that's causing the jitter. I have neither the time nor the inclination to "fix" this problem, as the solution is quite simple: write it in a language where I have control over it.
Don't forget, fast != predictable and that's where C/C++ earns its paycheck.
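If you want to see this kind of wakeup jitter for yourself, here is a rough sketch of a measurement loop. It assumes LockSupport.parkNanos as the sleep primitive (the post above doesn't say which call was used), and the SleepJitter class is made up for the example. The numbers depend heavily on the OS, the JVM and machine load, so treat them as indicative only.

```java
import java.util.concurrent.locks.LockSupport;

/** Hypothetical helper: how late do short timed sleeps wake up? */
final class SleepJitter {

    /**
     * Requests `iterations` sleeps of `requestedNanos` each and returns the
     * worst observed overshoot in nanoseconds. parkNanos may return early,
     * so early wakeups are treated as zero overshoot.
     */
    static long worstOvershootNanos(long requestedNanos, int iterations) {
        long worst = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            LockSupport.parkNanos(requestedNanos);
            long overshoot = (System.nanoTime() - start) - requestedNanos;
            if (overshoot > worst) {
                worst = overshoot;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        long requested = 500_000L; // 0.5 ms per sleep
        long worst = worstOvershootNanos(requested, 2_000);
        System.out.println("worst overshoot: " + (worst / 1_000) + " microseconds");
    }
}
```

The loop allocates nothing itself, so a large worst case here points at the scheduler or the VM rather than at garbage collection triggered by this code.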
Quite apart from the fact that there are applications where startup time really isn't very important...
Funny, why is it that I don't know any of those applications?
It is not only quite possible for C++ code to beat Java in performance, startup time and memory footprint, it is quite easy.
Could you please give me an example of a utility written in Java, and I will measure its startup time. Thanks.
Wow, I'm surprised at the driveby downmods in this thread. I mean, I held this image of Java developers as open minded individuals, but maybe there is a significant minority that just aren't. Guys: it is a fact. If you need your program to go fast, plus be OOP, then you should write it in C++. GCC will hand you somewhere between 10% and 200% speedup "for free" on real life algorithms, plus startup time normally faster by a multiple, and a fraction of the memory footprint. And small binaries, and runtime support always installed by default.
Java has its uses. It is generally faster to develop in and less tricky for the naive programmer. You can effectively build a project faster with less skilled and less expensive, more freely available developers. That is a big deal, not to be underestimated. But if your functional requirements include fast and tight, Java is not the right choice. Well, it's faster than Python or Bash. And sometimes not even that if you include JIT time.
OK, what is it about Java programmers and not being able to tolerate factual statements about their favorite language?
foreigners, non native speakers
Except you can't use your QT code in both a thick client and a web client.
Dumbass: java scripters do not write javascript.
It is not only quite possible for C++ code to beat Java in performance, startup time and memory footprint, it is quite easy.
Yup, the startup may be quicker with an app written in C++, but all too often it's followed by a catastrophic crash.
So... why haven't I seen any of those catastrophic crashes you talk of in as long as I can reasonably remember? Oh wait, right, Chrome used to go "snap" and Firefox would oom once a week. Gosh.
ditch the book - use "new" and "delete" to allocate and free memory ; could be a whole new paradigm!
it's and its are not easy to understand, since a rule is broken.
Struts's vs. struts' is not easy to understand, since they are edge/corner cases.
Hallowe'en is not easy to understand because it is archaic.
> It never does, trust me. If what you need is object-oriented C, use C++.
Don't get stupid. There's no reason to use a complex language with ultra-long compile-times and tons of inline code if you can just as well use C. Especially in the context of Python extensions or interfacing with other C code, using C++ is not a good option, as you have to make sure to stay C compatible and not leak exceptions to the C code calling you.
IIRC, Java VMs work on bytecode whereas JavaScript VMs usually work on ASTs.
That particular reason would only apply to applets, where the default is set to 64 megs. The first and primary reason to go in and change the memory allocation is to decrease garbage collections. If Eden is TOO big your garbage collections take TOO long, but if your Eden is TOO small your garbage collections are TOO frequent. Only lazy and arrogant system admins would think that the memory settings were for them to enforce some form of system policy.
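A quick way to watch the frequency-versus-duration trade-off described above is to ask the JVM for its own GC counters. This sketch uses the standard GarbageCollectorMXBean API; the GcStats class and the allocation loop are invented for the example, and the collector names and counts you see will depend on the JVM and its settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper that reports how often each collector has run. */
final class GcStats {

    /** Sums the collection counts reported by all collectors so far. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means "undefined"
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Churn through short-lived objects to put pressure on the young generation.
        long bytes = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] garbage = new byte[1024];
            garbage[0] = (byte) i;
            bytes += garbage.length;
        }
        System.out.println("requested roughly " + bytes + " bytes");

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections: " + totalCollections());
    }
}
```

Running it a few times with different young-generation sizes (on HotSpot, for example, -Xmn8m versus -Xmn512m) should show the collection count and the time per collection moving in opposite directions, though the exact behaviour varies by collector.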
Sysadmins don't do it for 'policy' - they do it for performance and server safety.
Ok, I wasn't specifically talking about the context of the book, but rather the rationale behind the run time options. Their "write once, run anywhere" philosophy extends beyond the actual writing of the code to the run time tuning parameters. Comprendes?
Well.. maybe. Or Maybe not. But Definitely not sort of.
With the memory limitation built in, you can easily port your application to a server of completely different architecture that lacks the memory limitation feature. I don't see why that's so hard to understand, but apparently two of you do.
No, you are entirely wrong. Please show us any document that states what you said. "Write once, run anywhere" means exactly what it says. The developer does things once, and the user runs it in whatever environment they want. How does a user specifying different tuning parameters when he starts his JVM affect what the developer did? The application was still written (and compiled) once. And the application will run, unchanged, no matter what the user's priorities are with regard to things like heap sizes, GC techniques, etc. That is pretty much the very essence of "write once, run anywhere".
" it's faster than Python or Bash. And sometimes not even that if you include JIT time."
TROLL.
I just started reading this book, based on the review, and it looks like it'll be worthwhile.
However, in the first chapter, in the section titled Choosing the Right Platform and Evaluating a System (which really just focuses on how the newer Sun Sparc processors differ from other multi-core, multi-threaded processors), they manage to repeat themselves four or five times, using almost the same words each time.
I'm hopeful that this is an isolated case and not something that happens throughout the book. The earlier parts of that first chapter didn't have the same issue.
Stop thinking that everything is a utility. 90% of Java "applications" are not client-side applications but run server side, where it's irrelevant whether startup takes 5 seconds or 30 seconds.
When you start hitting heavy server-side "applications", even the C++ ones take a while to start up.
Then learn to be a good sysadmin and set the memory limits for processes using the OS. Using the Java -Xmx setting will only screw with the garbage collection, and you'll still not have accounted for the GC and VM overhead. I can actually tell Java to use 128 megs and it still takes 200. If you're trying to optimize the performance of the garbage collector then go ahead and start fiddling with the settings. Other systems have either determined that it's not worth the hassle or have figured out a better way of managing the ratio of heap size to garbage collection.
" And the application will run, unchanged, no matter what the users priorities are with regards to things like heap sizes, GC techniques, etc. That is pretty much the very essence of "write once, run anywhere"."
No, the application's performance is very much affected by run time switches, obviously. I can cause it to completely not run at all, with a run time switch, by setting the heap to a ridiculously small number.
But the main point in our disagreement is you have an exceedingly small ability to understand abstract concepts. When people talk about the "spirit of open source" they are talking about more than strict license compliance. When I speak of "write once, run anywhere", I speak of more than just what the developer does (the write part), but also what someone who uses the application does (the run part), and I assume the purpose of the heap size run time flag was part of that same line of thinking.
All that shows is that it takes the resources of a Google and an app architecture that acknowledges any large program written in C/C++ will leak in order to create a "snappy" app that doesn't crash. (Although I've seen colleagues crash Chrome plenty of times.) Basically, I've worked for too many companies with labyrinthine, buggy C++ codebases to want to use that boondoggle of a language again. There are only so many times I could bear working on bugs related to thread issues caused by the myriad ways encapsulation can be broken in C++, or others where it was down to Stroustrup's knack of making the wrong decision whenever he was faced with a choice in how to implement a part of the C++ object model.
In production I have had to change the garbage collector limits to stop it thrashing when the total size of objects was near the default heap size.
You do need to have this ability, and the O/S doesn't provide it (you can't control the garbage collector using O/S switches). Oh, and if the overhead of the VM is 72M you can easily do the math to determine what size to set for your memory limit - again, at least you have the ability to do this with Java. So, the memory limit switch can be used in two ways.
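To make the "do the math" part concrete: the heap limit and the memory the VM uses outside the heap can both be read from inside a running JVM. The sketch below uses Runtime.maxMemory and the standard MemoryMXBean; the Footprint class is invented for the example. Note the assumption that this non-heap figure is a useful lower bound: it does not include thread stacks or other native allocations, so the real process footprint will be higher still.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

/** Hypothetical helper comparing the heap limit with non-heap usage. */
final class Footprint {

    /** Maximum heap the JVM will try to use (what -Xmx controls). */
    static long heapMaxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Memory the JVM reports using outside the Java heap. */
    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heapMax = heapMaxBytes();
        long nonHeap = nonHeapUsedBytes();
        System.out.println("heap limit:    " + (heapMax / mb) + " MB");
        System.out.println("non-heap used: " + (nonHeap / mb) + " MB");
        System.out.println("lower bound on footprint at full heap: "
                + ((heapMax + nonHeap) / mb) + " MB");
    }
}
```

Running it with an explicit limit, for example java -Xmx128m Footprint, shows why a 128 MB heap setting does not mean a 128 MB process.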
All that shows is that it takes the resources of a Google and an app architecture that acknowledges any large program written in C/C++ will leak in order to create a "snappy" app that doesn't crash.
Wow, that's funny considering A) Google's apps are generally embarrassingly overweight and sloppily coded and B) most of the best C++ apps come from small teams or individuals. Can you spell Carmack?
Thus we are back to my original point. The Java memory switches are for garbage collection management. They are not, as you put it, so that "System administrators can also partition what any application can do, in case one is rogue". They are there so that you can adjust the GC, and your OS has its own controls to stop an application from going rogue and taking down your system. Using them as memory limits to prevent an application from going rogue is counterproductive to using them to optimize garbage collection.
The switches were designed for, and are used for, more than one purpose. It is that simple.