Slashdot Mirror


Java Faster Than C++?

jg21 writes "The Java platform has a stigma of being a poor performer, but these new performance benchmark tests suggest otherwise. CS major Keith Lea took time out from his studies at Rensselaer Polytechnic Institute in upstate New York's Tech Valley to take the benchmark code for C++ and Java from Doug Bagley's now outdated (Fall 2001) "Great Computer Language Shootout" and run the tests himself. His conclusions include 'no one should ever run the client JVM when given the choice,' and 'Java is significantly faster than optimized C++ in many cases.' Very enterprising performance benchmarking work. Lea is planning next on updating the benchmarks with VC++ compiler on Windows, with JDK 1.5 beta, and might also test with Intel C++ Compiler. This is all great - the more people who know about present-day Java performance, the better."

25 of 1,270 comments

  1. Um, it's online by JoshuaDFranklin · · Score: 5, Informative
    If you want, you can read the actual author's piece instead of a news story about it:

    The Java is Faster than C++ and C++ Sucks Unbiased Benchmark

    1. Re:Um, it's online by cheesybagel · · Score: 4, Informative
      I just looked at hash (C++, Java), but it seems he uses C++ STL and the Java API. This may end up being more of an API than a language test...

      It also does stupid things. Like this:

      int c = 0;
      for (int i=n; i>0; i--) {
          sprintf(buf, "%d", i);
          if (X[strdup(buf)]) c++;
      }
      When this would have worked just fine:
      int c = 0;
      for (int i=n; i>0; i--) {
          if (X[atoi(i)]) c++;
      }

      The alternative is actually shorter, besides being faster and using less RAM.

      I think the person who wrote this didn't know how to program in C++ very well. The two pieces of code are not even equivalent. The second loop is traversed backwards in the C++ version while it is not in the Java version. Don't ask me why.

    2. Re:Um, it's online by Anonymous Coward · · Score: 5, Informative

      It also does stupid things. Like this:

      int c = 0;
      for (int i=n; i>0; i--) {
          sprintf(buf, "%d", i);
          if (X[strdup(buf)]) c++;
      }

      When this would have worked just fine:

      int c = 0;
      for (int i=n; i>0; i--) {
          if (X[atoi(i)]) c++;
      }

      The code is dumb, yes, but you are wrong, nonetheless. That code won't even compile. I think you meant itoa(), which would be about the same as sprintf in terms of functionality.

      That for() loop is not equivalent to the Java code's for loop, either. In the java code, he used

      if (ht.containsKey(Integer.toString(i, 10))) c++;

      which means that he should have used

      if (X.count(somestringrepofi)) c++;

      X[somestringrepofi] will create an entry for the key if it is not found, making it very different from containsKey().
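
The difference between a pure lookup and a lookup that inserts on a miss can be sketched in Java. This is only an illustration: the class and method names are hypothetical, and `computeIfAbsent` is used here as a stand-in for the insert-on-miss behavior of C++'s `X[key]`, not as something the benchmark itself does.

```java
import java.util.HashMap;
import java.util.Map;

public class LookupDemo {
    /** Builds a small sample table with keys "1", "2", "3" (illustrative data). */
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 1; i <= 3; i++) {
            map.put(Integer.toString(i, 10), i);
        }
        return map;
    }

    /** Counts hits with a pure lookup; the map is never modified. */
    static int countWithContainsKey(Map<String, Integer> map, int n) {
        int c = 0;
        for (int i = n; i > 0; i--) {
            if (map.containsKey(Integer.toString(i, 10))) c++;
        }
        return c;
    }

    /** Counts hits with an insert-on-miss lookup (assumed analogue of C++ X[key]). */
    static int countWithInsertOnMiss(Map<String, Integer> map, int n) {
        int c = 0;
        for (int i = n; i > 0; i--) {
            // A missing key is inserted with the value 0 before being tested.
            if (map.computeIfAbsent(Integer.toString(i, 10), k -> 0) != 0) c++;
        }
        return c;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = sample();
        System.out.println(countWithContainsKey(a, 10) + " hits, size " + a.size());
        Map<String, Integer> b = sample();
        System.out.println(countWithInsertOnMiss(b, 10) + " hits, size " + b.size());
    }
}
```

Both loops report the same number of hits, but the second one leaves the table larger than it found it, which is extra work the `containsKey()` version never does.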

    3. Re:Um, it's online by GooberToo · · Score: 5, Informative

      The more I read the code, it sure is starting to look more and more like an apples and oranges comparison; which is usually what happens when people do java benchmarks.

      As typically observed, I'm seeing the benchmarks take serious advantage of java's GC mechanism, whereby, they never pay the piper. With C++, the piper is constantly being paid. All of the benchmarks which allocate objects and then delete them, are therefore, invalid. To be fair, I think, you either need to add a System.gc() line in the java code where C++ is doing its deletes or you need to implement your own new and delete operators to function more in line with what java is doing. Until you do either one of those things, the comparisons where objects are being allocated and deallocated are invalid. And frankly, I'm still not sure adding System.gc() is even fair, on either side. The reason is, calling System.gc() simply hints that it's a good idea to collect. There is nothing which requires the collection to take place. So, technically, the call could still be many times faster. On the other hand, I don't know enough about how they handle their gc-hint logic nor am I aware of exactly how much overhead is involved in the actual collection process. If it occurs too often, the shift in workload may be too unfair. Nonetheless, it's a point of very serious contention.

      Just for kicks, I modified objinst.java with, "if( i%(n/1000) == 0 ) System.gc() ;", on the lines that the C++ code had its delete. When I timed it, it was over twice as slow as the C++ code (24+s vs 55+s). Worse, when I ran it with a 1:1 ratio of delete:System.gc(), I simply got tired of waiting, having waited over 5 minutes.
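
A minimal sketch of that kind of modification, assuming a hypothetical `Toggle` class in place of the benchmark's real objects. It is not the poster's actual code. The `Math.max` guards are an addition: the one-liner quoted above would divide by zero for n below 1000.

```java
public class GcHintDemo {
    /** Hypothetical stand-in for the small objects the benchmark allocates. */
    static final class Toggle {
        final boolean state;
        Toggle(boolean state) { this.state = state; }
    }

    /**
     * Allocates n short-lived objects and issues about `hints` System.gc()
     * requests spread evenly over the run. Returns the number of requests made.
     */
    static int run(int n, int hints) {
        // Guard against a zero step (and thus i % 0) when n is smaller than hints.
        int step = Math.max(1, n / Math.max(1, hints));
        int issued = 0;
        int live = 0;
        for (int i = 0; i < n; i++) {
            Toggle t = new Toggle((i & 1) == 0); // garbage after this iteration
            if (t.state) live++;
            if (i % step == 0) {
                System.gc(); // only a request; the JVM decides whether to collect
                issued++;
            }
        }
        return issued;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int issued = run(1_000_000, 1000);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(issued + " gc hints, elapsed ms: " + elapsed);
    }
}
```

Varying the second argument shows how strongly the measured time depends on how often a collection is requested, which is the point of contention in this subthread.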

      So, basically, I'm not nearly as impressed as I first was. Simplistically, it's starting to look like a serious apples and oranges comparison. Elsewhere, you can find other examples of just plain bad code, where correct C++ code came in about twice as fast as the Java code, with more optimizations still possible on the C++ side.

      So, it looks like we're seeing a combination of things here. Looks like a combination of bad code, ideal corner cases for java's hot spot, and invalid comparisons with memory allocation.

      Sadly, I'm once again seriously disappointed in java.

    4. Re:Um, it's online by Alan+Shutko · · Score: 4, Informative

      As typically observed, I'm seeing the benchmarks take serious advantage of java's GC mechanism, whereby, they never pay the piper. With C++, the piper is constantly being paid.

      That's not a bug in the tests, it's a feature.

      The theory behind garbage collection isn't just that it allows the programmer to avoid the effort of watching when to delete things. It's that garbage collection can actually improve performance on certain workloads.

      Forcing a garbage collection for every delete is completely unfair, since it does a full scan of memory, as opposed to just twiddling bits to free a single data value.

      There's no memory leak for these benchmarks... both C++ and Java free all memory used when the process exits. Perhaps you'd prefer a longer-running test with a lot of garbage generation (forcing gc to run at some point).

  2. The Great Computer Language Shootout by thebra · · Score: 4, Informative

    Correct link

  3. What are -client and -server? by JoshuaDFranklin · · Score: 4, Informative
    1. Re:What are -client and -server? by Jennifer+E.+Elaan · · Score: 4, Informative
      "Some of the other differences include the compilation policy used, heap defaults, and inlining policy."

      Am I the only one who noticed the "inlining policy" thing? Considering "method call" was one of the most compelling arguments for his case (by orders of magnitude!), the fact that the methods being "called" are being called *INLINE* should mean something.

      If you're allowed to turn on the java inliner, surely you can spare the time to turn on the C++ one as well (he used -O2, not -O3, for compiling the C++ apps).

  4. Re:If you don't run the JVM... by Tar-Palantir · · Score: 4, Informative

    He claims you should use the server JVM instead, stating that it is faster but slower to startup and consumes more memory.

  5. Re:Caught up with the speed, but still the ugliest by mark-t · · Score: 4, Informative
    You haven't played with the pluggable look and feel for Swing much, have you?

    Oh... and as of Java 1.5, Swing apps can now be skinned to look however you'd like them to.

  6. Re:This doesn't make any sense... by Ianoo · · Score: 4, Informative

    Java isn't "emulation". Modern JVMs use a JIT (just-in-time compiler) to translate bytecode instructions into pure binary assembled object code just before it is reached in the program (hence "just in time"). This is cached, so the next time that particular code is executed, it will run at full assembler speed.

    Something I've often wondered is whether this caching could be persistent, i.e. be kept between runs of the JVM. Eventually, the entire program would be translated to pure assembler with the cost of translation largely amortised across many sessions. You still keep the safety, cross platform compatibility and ease-of-programming of a bytecode language (i.e. Java, C#) but you get the bonus of the cached object code being just as fast, even during startup and shutdown.

  7. Re:my arse by kaffiene · · Score: 5, Informative

    *sigh* have you people never heard of runtime optimisations? There are some things you can optimise at runtime (like runtime constants) which are *impossible* to optimise at compile time.

    This whole "x is written in y, so x can't be faster than y" rubbish is just that - rubbish.

  8. been there, done that by Anonymous Coward · · Score: 5, Informative

    1) javac (Sun's Java compiler) is written in Java. You can even access it programmatically at runtime if you really want to.

    2) While it's not an id game, IL2 Sturmovik is a critically-acclaimed flight simulator that was written almost entirely in Java.

  9. Very true, if you don't know what you are doing by Pac · · Score: 4, Informative

    Out of the box Swing is amazingly ugly. The people choosing default colors at Sun could well be substituted by a randomizer without a difference in results. I mean, who was the genius who thought purple bars in a menu were cute?

    Now, when you need to change that quickly and without much overhead, there are ways. A little known global Hashtable called UIDefaults lets you change just about everything on the visual interface without having to write your own LookAndFeel (which you obviously can do too, for very large projects). You can have your scrollbars, menus, etc in any colour, size and shape, using any font. You can easily change all default colours without having to set every control. After a while the ugliness ceases to be a problem.

  10. Re:Caught up with the speed, but still the ugliest by Inf0phreak · · Score: 4, Informative
    Would you please turn off the moronic "smart quotes" feature in IE?
    Seeing things like this:
    I’m
    is hurting my eyes.

    This page has more information about this horrible malfeature.

    --
    ________
    Entranced by anime since late summer 2001 and loving it ^_^

  11. Slow C++ compiler by siesta+at+uni · · Score: 4, Informative

    The article says he used GCC to compile the C++ versions, but GCC produces code that isn't nearly as good as the Intel compiler for example. (Here, but no good if you don't subscribe)
    A lot of the test results are close, and I think a different compiler would change the outcome.

  12. Re:He used g++ to compare C++ with Java... by ky11x · · Score: 5, Informative

    g++'s goal is modularity for ease of porting in cross-platform cross-compiling. aggressive optimization is not one of its strengths. the point of such benchmarks is really not a language comparison, but a comparison between the code generated by the most optimized compilers for that language on a specific platform. Using g++ for this simply causes the study to lose credibility

  13. different requirements by vlad_petric · · Score: 4, Informative
    The server one is optimized for throughput and concurrency, whereas the client one for latency.

    You might think that the two are the same, but the two settings actually make a visible impact if you're running on a multi-processor system. Most notably, the garbage collector and locking primitives are implemented differently.

    --

    The Raven

  14. Re:He used g++ to compare C++ with Java... by exp(pi*sqrt(163)) · · Score: 4, Informative

    g++ isn't great at optimizing. For code I write it's somewhere between 0 and 50% slower than MSVC. It depends a lot on the type of code of course. For pure numerical work I think the Intel compiler usually scores highly so I'm surprised you're not seeing much difference.

    --
    Doesn't it make you feel good to know that our freedoms are protected by politicians, lawyers and journalists.

  15. String concat silliness by danharan · · Score: 4, Informative
    The article mentions Lea modified the String concatenation code, although Java still lost to C++ in that test. He unfortunately didn't do a great job:
    import java.io.*;
    import java.util.*;

    public class strcat {
        public static void main(String args[]) throws IOException {
            int n = Integer.parseInt(args[0]);
            StringBuffer str = new StringBuffer();

            for (int i=0; i<n; i++) {
                str.append("hello\n");
            }

            System.out.println(str.length());
        }
    }
    Instantiating the StringBuffer with an approximate size would prevent it from having to reassign a char array every time it runs out of space. new StringBuffer(n*6) for n=10000000 as used in his test should have a pretty large impact.

    I could not run the test for 10M, but ran it for up to 1M. 541 milliseconds in one case, 280 in the other. Here's the code I used (I had to modify the timing cause I'm running XP):
    import java.io.IOException;

    public class Strcat2 {
        public static void main(String args[]) throws IOException
        {
            long start, elapsed;
            start = System.currentTimeMillis();

            int n = Integer.parseInt(args[0]);
            // Pre-size the buffer: n copies of the 6-character "hello\n".
            StringBuffer str = new StringBuffer(n*6);

            for (int i=0; i<n; i++)
            {
                str.append("hello\n");
            }

            System.out.println(str.length());
            elapsed = System.currentTimeMillis() - start;
            System.out.println("Elapsed time: "+elapsed);
        }
    }
    The only difference in the class Strcat besides the class name is the instantiation of StringBuffer.
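
The effect of pre-sizing can also be observed without a timer by watching `StringBuffer.capacity()`. The sketch below is an illustration only; `CapacityDemo` and `countGrowths` are hypothetical names, and the exact growth policy of the default buffer is an implementation detail that may differ between JVM versions.

```java
public class CapacityDemo {
    /**
     * Appends "hello\n" n times and counts how often the buffer's capacity
     * changes, i.e. how often its backing array had to be replaced.
     */
    static int countGrowths(StringBuffer str, int n) {
        int growths = 0;
        int capacity = str.capacity();
        for (int i = 0; i < n; i++) {
            str.append("hello\n");
            if (str.capacity() != capacity) {
                capacity = str.capacity();
                growths++;
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("default:   " + countGrowths(new StringBuffer(), n));
        System.out.println("pre-sized: " + countGrowths(new StringBuffer(n * 6), n));
    }
}
```

With the pre-sized buffer the capacity never changes, because exactly n*6 characters are appended. The default buffer has to grow repeatedly, and each growth copies everything appended so far.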

    NB: I'm not accusing the author of bias against Java, nor am I ignorant of the fact a bunch of /.'ers could kick my ass in C++ optimization. It would be interesting however to have a distributed benchmark, where in the true spirit of OSS we could fiddle with it until we could not wring any more performance gains.
    --
    Information: "I want to be anthropomorphized"

  16. Re:He used g++ to compare C++ with Java... by mwillis · · Score: 5, Informative

    Intel gives their c++ compiler away free for non-commercial hobbyist use on linux.

    The windows version has a free trial that runs for 30 days.

    Try it. See if it makes a difference. If it doesn't, torch it. If you find it makes your critical code run 2x faster, then... have a look at what a computer that runs 2x faster will cost you, and then decide what to do.

  17. Command and source/test review. by Ninja+Programmer · · Score: 4, Informative

    erm ... I only checked the fibonacci routine, but it's actually quite funny - he's branching recursive calls, a clear case when a smart-enough runtime optimization would work better. I mean, any reasonably smart optimizer would eventually figure out that there are too many calls to the same function with the same argument to just stand by and watch. I'd say that given this difference c++ did quite alright in that one.

    This is known as the "halting problem". No, the compiler cannot guarantee the ability to transform a recursive solution to a non-recursive one. The case of the fibonacci algorithm is a particularly difficult one to transform properly if the compiler hasn't special cased it.
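
To make the disagreement concrete, here is a sketch of what "too many calls to the same function with the same argument" means for the branching recursive Fibonacci. The caching is applied by hand; the code does not claim that any compiler or JVM performs this transformation. All names are hypothetical, and the base case of 1 for n < 2 is an assumption about the benchmark's definition.

```java
public class FibDemo {
    static long naiveCalls = 0;
    static long memoCalls = 0;

    /** Branching recursion: recomputes the same arguments over and over. */
    static long fib(int n) {
        naiveCalls++;
        return n < 2 ? 1 : fib(n - 2) + fib(n - 1);
    }

    /** Same recurrence, but each argument is computed only once (manual memoization). */
    static long memoFib(int n, long[] memo) {
        memoCalls++;
        if (n < 2) return 1;
        if (memo[n] != 0) return memo[n]; // already computed for this argument
        memo[n] = memoFib(n - 2, memo) + memoFib(n - 1, memo);
        return memo[n];
    }

    public static void main(String[] args) {
        int n = 25;
        long a = fib(n);
        long b = memoFib(n, new long[n + 1]);
        System.out.println(a + " via " + naiveCalls + " calls");
        System.out.println(b + " via " + memoCalls + " calls");
    }
}
```

The two versions return the same value, but the call count of the naive version grows exponentially with n while the memoized one grows linearly. That is why a benchmark built on the naive form mostly measures call overhead.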

    That said -- Ack and Fib are call overhead limited. They are examples of poor quality code whose performance is not inner loop based.

    Hash will be C-string (specifically strcmp and sprintf) limited in performance. The performance is therefore very data dependent (since Java uses length delimited strings.) Using a fast string class such as "The Better String Library" (http://bstring.sf.net) would have yielded C++ far better performance. A similar comment applies to the strcat test.

    The Heapsort is a particularly bad implementation. In good implementations, the Intel compiler really takes gcc to town. See: http://www.azillionmonkeys.com/qed/sort.html

    Integer Matrix multiplying is an extremely rare application. So I wouldn't put too much stock in the results here -- though, I would be surprised if there was much differentiation between Java and C++ on this test.

    The method calling, I think, will be very much limited by the compiler's ability to inline past method calls. I think Intel C/C++ differentiates itself on such things.

    The Nestedloop and random tests are interesting -- I don't see how Java is supposed to beat C++ on it, but it's possible to be equal.

    I don't know enough about the Java object system and barely enough about C++ object system to comment on sieve or objinst.

    It seems to me that sumcol and wc are going to be IO limited.

    I don't think this test is exactly fair, as the code is not representative for tasks where performance really matters.

    1. Re:Command and source/test review. by arkanes · · Score: 4, Informative
      The method call is a really egregious case of bad methodology. The C++ test he's using was designed to test method call overhead, which is why it returns by value instead of reference. The massive performance improvement from the JVM (server only, note) is the JVM aggressively inlining the call (and it's a single method call in a tight loop, an obvious inline candidate), so what he's measuring is just loop performance, not method call overhead. If he hadn't disabled loop unrolling and function inlining in GCC (via O2), C++ would have performed much better.

      It's worth pointing out that inlining is a case where a VM can really shine for optimizing because it has a lot more options available (partial inlining, etc) and can make better decisions about tradeoffs. But this particular benchmark is comparing apples to oranges.

    2. Re:Command and source/test review. by norton_I · · Score: 4, Informative

      The method test is limited by two things: First, gcc appears unable to inline virtual method calls through a pointer to an object, even when the exact type is available, second, he didn't turn on loop unrolling. Loop unrolling is a huge, huge advantage on any sort of null benchmark like this.

      Changing the objects to be stack allocated and adding -funroll-loops moves the c++ benchmark up to just ahead of the java benchmarks.

      Of course, this does point to several advantages of a runtime optimizer. A static optimizer will never be as good at optimizing virtual function calls as a runtime optimizer, since it will never be as good at identifying types. Also, a runtime optimizer will always be better at creating specializations of existing functions (creating a special version of a routine when some input value is treated as a constant).
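
For readers who have not seen this kind of test, the sketch below shows the general shape of a method-call micro-benchmark: one virtual call in a tight loop. The class names and logic are illustrative assumptions, not the benchmark's actual source. Whether a given JVM inlines the call is up to its optimizer and is not something this code can show directly.

```java
public class MethCallDemo {
    /** Base class with an overridable (virtual) method. */
    static class Toggle {
        protected boolean state;
        Toggle(boolean start) { state = start; }
        boolean value() { return state; }
        Toggle activate() { state = !state; return this; }
    }

    /** Subclass that flips its state only on every countMax-th call. */
    static class NthToggle extends Toggle {
        private final int countMax;
        private int counter = 0;
        NthToggle(boolean start, int countMax) {
            super(start);
            this.countMax = countMax;
        }
        @Override
        Toggle activate() {
            if (++counter >= countMax) {
                state = !state;
                counter = 0;
            }
            return this;
        }
    }

    /** The loop body is a single virtual call, so call overhead dominates unless it is inlined. */
    static boolean run(Toggle t, int n) {
        boolean val = t.value();
        for (int i = 0; i < n; i++) {
            val = t.activate().value();
        }
        return val;
    }

    public static void main(String[] args) {
        int n = Integer.parseInt(args.length > 0 ? args[0] : "1000000");
        System.out.println(run(new Toggle(true), n));
        System.out.println(run(new NthToggle(true, 3), n));
    }
}
```

Each loop only ever sees one concrete type behind `t`, which is the situation where a runtime optimizer can know the exact target of the call, as described above.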

  18. Java performance "truths" change over time by eduardodude · · Score: 5, Informative

    Check out this recent IBM Developerworks article which looks at how modern JVMs handle allocation and garbage collection.

    Some very surprising tidbits there. For instance:

    "Performance advice often has a short shelf life; while it was once true that allocation was expensive, it is now no longer the case. In fact, it is downright cheap, and with a few very compute-intensive exceptions, performance considerations are generally no longer a good reason to avoid allocation. Sun estimates allocation costs at approximately ten machine instructions. That's pretty much free -- certainly no reason to complicate the structure of your program or incur additional maintenance risks for the sake of eliminating a few object creations."

    Read the article for an excellent nuts-and-bolts analysis of many current performance considerations. From the posts in this thread, it's pretty clear a lot of people haven't looked into what's actually done in a server JVM these days.