Slashdot Mirror


Java Faster Than C++?

jg21 writes "The Java platform has a stigma of being a poor performer, but these new performance benchmark tests suggest otherwise. CS major Keith Lea took time out from his studies at Rensselaer Polytechnic Institute in upstate New York's Tech Valley to take the benchmark code for C++ and Java from Doug Bagley's now outdated (Fall 2001) "Great Computer Language Shootout" and run the tests himself. His conclusions include 'no one should ever run the client JVM when given the choice,' and 'Java is significantly faster than optimized C++ in many cases.' Very enterprising performance benchmarking work. Lea is planning next on updating the benchmarks with VC++ compiler on Windows, with JDK 1.5 beta, and might also test with Intel C++ Compiler. This is all great - the more people who know about present-day Java performance, the better."

74 of 1,270 comments

  1. Um, it's online by JoshuaDFranklin · · Score: 5, Informative
    If you want, you can read the actual author's piece instead of a news story about it:

    The Java is Faster than C++ and C++ Sucks Unbiased Benchmark

    1. Re:Um, it's online by Too Much Noise · · Score: 3, Informative

      erm ... I only checked the fibonacci routine, but it's actually quite funny - he's branching recursive calls, a clear case when a smart-enough runtime optimization would work better. I mean, any reasonably smart optimizer would eventually figure out that there are too many calls to the same function with the same argument to just stand by and watch. I'd say that given this difference c++ did quite alright in that one.
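      To make the repeated-call point concrete, here is a minimal hand-written sketch in C++. It is not something any JVM or compiler is confirmed to do automatically; it only shows how much redundant work the doubly recursive form contains. The names fib_naive and fib_memo are hypothetical, and the fib(0)=0, fib(1)=1 convention is an assumption that may differ from the benchmark's.

```cpp
#include <cstdint>
#include <unordered_map>

// Doubly recursive Fibonacci: the same arguments are recomputed many times.
// Assumed convention: fib(0) = 0, fib(1) = 1.
std::uint64_t fib_naive(unsigned n) {
    return n < 2 ? n : fib_naive(n - 1) + fib_naive(n - 2);
}

// Same recurrence, but each result is stored the first time it is computed,
// so later calls with the same argument become a table lookup.
std::uint64_t fib_memo(unsigned n,
                       std::unordered_map<unsigned, std::uint64_t>& cache) {
    if (n < 2) return n;
    auto it = cache.find(n);
    if (it != cache.end()) return it->second;
    std::uint64_t value = fib_memo(n - 1, cache) + fib_memo(n - 2, cache);
    cache[n] = value;
    return value;
}

// Convenience overload that owns the cache for a single top-level call.
std::uint64_t fib_memo(unsigned n) {
    std::unordered_map<unsigned, std::uint64_t> cache;
    return fib_memo(n, cache);
}
```

      The number of calls made by fib_naive grows exponentially with n, while fib_memo computes each value once.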

      So yes, there are cases when runtime optimizations that are unavailable at compile time can speed things a lot. Does this make Java faster? yes, if you look at the right corner case. Hell NO, if you look at the wrong one.

      The right tool for the job, as usual. And the right tool wielder, otherwise any tool will suck.

    2. Re:Um, it's online by mukund · · Score: 3, Informative

      Your wish just came true. Check out the JNode project.

      --
      Banu
    3. Re:Um, it's online by cheesybagel · · Score: 4, Informative
      I just looked at hash (C++, Java), but it seems he uses C++ STL and the Java API. This may end up being more of an API than a language test...

      It also does stupid things. Like this:

      int c = 0;
      for (int i=n; i>0; i--) {
          sprintf(buf, "%d", i);
          if (X[strdup(buf)]) c++;
      }

      When this would have worked just fine:

      int c = 0;
      for (int i=n; i>0; i--) {
          if (X[atoi(i)]) c++;
      }

      The alternative is actually shorter, besides being faster and using less RAM.

      I think the person who wrote this didn't know how to program in C++ very well. The two pieces of code are not even equivalent. The second loop is traversed backwards in the C++ version while it is not in the Java version. Don't ask me why.

    4. Re:Um, it's online by Anonymous Coward · · Score: 5, Informative

      It also does stupid things. Like this:

      int c = 0;
      for (int i=n; i>0; i--) {
          sprintf(buf, "%d", i);
          if (X[strdup(buf)]) c++;
      }

      When this would have worked just fine:

      int c = 0;
      for (int i=n; i>0; i--) {
          if (X[atoi(i)]) c++;
      }


      The code is dumb, yes, but you are wrong, nonetheless. That code won't even compile. I think you meant itoa(), which would be about the same as sprintf in terms of functionality.

      That for() loop is not equivalent to the Java code's for loop, either. In the java code, he used

      if (ht.containsKey(Integer.toString(i, 10))) c++;

      which means that he should have used

      if (X.count(somestringrepofi)) c++;

      X[somestringrepofi] will create an entry for the key if it is not found, making it very different from containsKey().
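      A minimal sketch of that difference, assuming a std::map keyed by std::string rather than the benchmark's actual container and key type (the function names are hypothetical too). It only illustrates that operator[] inserts missing keys while count() does not:

```cpp
#include <map>
#include <string>

// Lookup via operator[]: every key that is not found is inserted with a
// value-initialized (zero) value, so the table grows as a side effect.
int count_hits_with_index(std::map<std::string, int>& table, int n) {
    int c = 0;
    for (int i = n; i > 0; --i) {
        if (table[std::to_string(i)]) ++c;
    }
    return c;
}

// Lookup via count(): a pure membership test, the closer analogue of
// Java's containsKey(). The table can be const because nothing is inserted.
int count_hits_with_count(const std::map<std::string, int>& table, int n) {
    int c = 0;
    for (int i = n; i > 0; --i) {
        if (table.count(std::to_string(i))) ++c;
    }
    return c;
}
```

      Note that the two also disagree on a key that is present with value 0: operator[] tests the stored value, count() tests presence.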

    5. Re:Um, it's online by Tranzig · · Score: 2, Informative

      The second loop is traversed backwards in the C++ version while it is not in the Java version. Don't ask me why.

      There is an instruction called loop in the x86 instruction set which decreases the value of the register cx by 1 on every iteration, until it becomes 0. So a decrementing for loop could be transformed into a mov and a loop. Two decades ago that was faster than increasing a variable, checking whether it had reached the required value and, if it was still less, jumping to the beginning.
      But nowadays x86 processors work differently, and loop has become slower than the jumping method. It seems the writer of this code forgot this one. Anyway, optimizing compilers don't generate loop instructions for decrementing for loops anymore.

    6. Re:Um, it's online by Pointer80 · · Score: 3, Informative

      Read the article. He said that he included JVM startup time in the benchmarks.

      /pointer

      --
      [%- PROCESS life -%]
    7. Re:Um, it's online by GooberToo · · Score: 5, Informative

      The more I read the code, it sure is starting to look more and more like an apples and oranges comparison; which is usually what happens when people do java benchmarks.

      As typically observed, I'm seeing the benchmarks take serious advantage of java's GC mechanism, whereby, they never pay the piper. With C++, the piper is constantly being paid. All of the benchmarks which allocate objects and then delete them, are therefore, invalid. To be fair, I think, you either need to add a System.gc() line in the java code where C++ is doing its deletes or you need to implement your own new and delete operators to function more in line with what java is doing. Until you do either one of those things, the comparisons where objects are being allocated and deallocated are invalid. And frankly, I'm still not sure adding System.gc() is even fair, on either side. The reason is, calling System.gc() simply hints that it's a good idea to collect. There is nothing which requires the collection to take place. So, technically, the call could still be many times faster. On the other hand, I don't know enough about how they handle their gc-hint logic nor am I aware of exactly how much overhead is involved in the actual collection process. If it occurs too often, the shift in workload may be too unfair. Nonetheless, it's a point of very serious contention.

      Just for kicks, I modified objinst.java with, "if( i%(n/1000) == 0 ) System.gc() ;", on the lines where the C++ code had its delete. When I timed it, it was over twice as slow as the C++ code (24+s vs 55+s). Worse, when I ran it with a 1:1 ratio of delete:System.gc(), I simply got tired of waiting, having waited over 5 minutes.

      So, basically, I'm not nearly as impressed as I first was. Simplistically, it's starting to look like a serious apples and oranges comparison. Elsewhere, you can find other examples of just plain bad code, where, again, correct C++ code came in about twice as fast as the Java code, and more optimizations were still possible with the C++ code.

      So, it looks like we're seeing a combination of things here. Looks like a combination of bad code, ideal corner cases for java's hot spot, and invalid comparisons with memory allocation.

      Sadly, I'm once again seriously disappointed in java.

    8. Re:Um, it's online by Alan Shutko · · Score: 4, Informative

      As typically observed, I'm seeing the benchmarks take serious advantage of java's GC mechanism, whereby, they never pay the piper. With C++, the piper is constantly being paid.

      That's not a bug in the tests, it's a feature.

      The theory behind garbage collection isn't just that it allows the programmer to avoid the effort of watching when to delete things. It's that garbage collection can actually improve performance on certain workloads.

      Forcing a garbage collection for every delete is completely unfair, since it does a full scan of memory, as opposed to just twiddling bits to free a single data value.

      There's no memory leak for these benchmarks... both C++ and Java free all memory used when the process exits. Perhaps you'd prefer a longer-running test with a lot of garbage generation (forcing gc to run at some point).

    9. Re:Um, it's online by Sinterklaas · · Score: 2, Informative

      Technically, the majority of the time, according to java's docs, it should do nothing at all.

      It doesn't say that at all:

      "Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects."

      So, if the majority of the calls are more or less an empty function call, what's the harm in doing it?

      The VM can't determine whether a gc is useful or not without a full heap scan. AFAIK, calling System.gc() will usually result in a full scan, totally defeating the standard gc'ing (which is supposed to be efficient as is). You aren't supposed to call System.gc() except in special circumstances.

  2. The Great Computer Language Shootout by thebra · · Score: 4, Informative

    Correct link

  3. Sorry, no. by Anonymous Coward · · Score: 0, Informative
    Java Faster Than C++?

    No, it isn't. It's much slower.

    I wrote a program that simply counts to 10000 and then quits. Time from double-clicking the icon until when the program exits:
    C++: 0.5 seconds
    Java: 20 seconds
    How hard is that?
    1. Re:Sorry, no. by Anonymous Coward · · Score: 1, Informative
      I think you forgot to use -O3. If you had, your c program would run a 2 instruction loop: decrement, branch. I seriously doubt you found a way to write a faster program in assembly. ;)

      Input:
      int i;
      for (i = 0; i < 1000000000; ++i) {
      }
      Output (gcc -O3):
      movl $999999999, %eax
      .p2align 4,,15
      L6:
      decl %eax
      jns L6
  4. What are -client and -server? by JoshuaDFranklin · · Score: 4, Informative
    1. Re:What are -client and -server? by Jennifer E. Elaan · · Score: 4, Informative
      "Some of the other differences include the compilation policy used, heap defaults, and inlining policy."

      Am I the only one who noticed the "inlining policy" thing? Considering "method call" was one of the most compelling arguments for his case (by orders of magnitude!), the fact that the methods being "called" are being called *INLINE* should mean something.

      If you're allowed to turn on the java inliner, surely you can spare the time to turn on the C++ one as well (he used -O2, not -O3, for compiling the C++ apps).

  5. Re:If you don't run the JVM... by Tar-Palantir · · Score: 4, Informative

    He claims you should use the server JVM instead, stating that it is faster but slower to start up and consumes more memory.

  6. Cross platform by leakingmemory · · Score: 2, Informative

    Java's strength is mostly its cross-platform compatibility. I have never really liked C++ very much. It seems complicated to write cross-platform code with C++. (Header troubles, and every OS seems to have its own implementation. E.g., try to compile Linux code on FreeBSD, or vice versa.) As a conservative coder I therefore prefer C, which is as fast as you (the coder) make it.

  7. Re:Caught up with the speed, but still the ugliest by mhale2243 · · Score: 3, Informative

    True, which is why the Eclipse project (www.eclipse.org) created and maintains SWT, a portable native widget toolkit. See http://www.eclipse.org/articles/Article-SWT-Design-1/SWT-Design-1.html for more info.

  8. Re:-O3 by Anonymous Coward · · Score: 1, Informative

    Don't forget strip -s

  9. Re:Caught up with the speed, but still the ugliest by mark-t · · Score: 4, Informative
    You haven't played with the pluggable look and feel for Swing much, have you?

    Oh... and as of Java1.5, Swing apps can now be skinned to look however you'd like them to.

  10. Re:Just one game by Mindcry · · Score: 2, Informative

    Unreal uses a Java-like scripting language, but it's for gameplay code, not the actual rendering; however, I think that's as close as it comes right now. (The Java-like stuff is something like 20x as slow as the C++ code, because of the overhead etc.)

    P.S. A lot of gameplay code is also in C++ and compiled into DLLs (I believe), but mods don't have access to the headers to compile C++ code into the game readily.

  11. Re:This doesn't make any sense... by Ianoo · · Score: 4, Informative

    Java isn't "emulation". Modern JVMs use a JIT (just-in-time compiler) to translate bytecode instructions into pure binary assembled object code just before it is reached in the program (hence "just in time"). This is cached, so the next time that particular code is executed, it will run at full assembler speed.

    Something I've often wondered is whether this caching could be persistent, i.e. be kept between runs of the JVM. Eventually, the entire program would be translated to pure assembler with the cost of translation largely amortised across many sessions. You still keep the safety, cross platform compatibility and ease-of-programming of a bytecode language (i.e. Java, C#) but you get the bonus of the cached object code being just as fast, even during startup and shutdown.

  12. Re:Caught up with the speed, but still the ugliest by C. E. Sum · · Score: 3, Informative

    One of the best things about OS X is Aqua-ized Java apps.

    http://developer.apple.com/documentation/Java/Conceptual/Java141Development/UI_Toolkits/chapter_5_section_2.html

    --
    -- Have you ever imagined a world with no hypothetical situations?
  13. Sorry, yes by Anonymous Coward · · Score: 1, Informative

    This means nothing.

    That is the most typical benchmark of somebody who doesn't know anything about compiler optimization.

    This is almost certainly a case of the C++ compiler making an optimization that the Java compiler doesn't. Assigning to an unused variable is a useless operation. A decent compiler removes the assignment, then notices an empty loop, then removes the loop.

    Not to mention, your double-click takes into account the entire VM initialization, which greatly, greatly outweighs the test itself, rendering the test useless on that account as well.

    So you end up benchmarking the entire VM initialization with a NOP. Gee, wonder which one's going to win?

    This is why benchmarks are hard. This is why the SpecINT benchmarks are notoriously bad as they (at least were) easy to optimize against.

  14. Re:Caught up with the speed, but still the ugliest by Anonymous Coward · · Score: 1, Informative

    By the sounds of it you have no idea what a good GUI is. Skins and themes do not a pretty UI make.

  15. Re:my arse by kaffiene · · Score: 5, Informative

    *sigh* have you people never heard of runtime optimisations? There are some things you can optimise at runtime (like runtime constants) which are *impossible* to optimise at compile time.

    This whole "x is written in y, so x can't be faster than y" rubbish is just that - rubbish.

  16. Re:every year this happens... by bckrispi · · Score: 3, Informative

    Ummm, wrong. The majority of java class libraries, and (significant parts, if not all) of the compiler are written in Java. There is, of course, some C++ for doing really low level stuff, but not the amount that you're implying.

    --
    Xenon, where's my money? -Borno
  17. been there, done that by Anonymous Coward · · Score: 5, Informative

    1) javac (Sun's Java compiler) is written in Java. You can even access it programmatically at runtime if you really want to.

    2) While it's not an id game, IL2 Sturmovik is a critically-acclaimed flight simulator that was written almost entirely in Java.

  18. Very true, if you don't know what you are doing by Pac · · Score: 4, Informative

    Out of the box Swing is amazingly ugly. The people choosing default colors at Sun could well be substituted by a randomizer without a difference in results. I mean, who was the genius who thought purple bars in a menu were cute?

    Now, when you need to change that quickly and without much overhead, there are ways. A little known global Hashtable called UIDefaults lets you change just about everything on the visual interface without having to write your own LookAndFeel (which you obviously can do too, for very large projects). You can have your scrollbars, menus, etc in any colour, size and shape, using any font. You can easily change all default colours without having to set every control. After a while the ugliness ceases to be a problem.

  19. Re:Caught up with the speed, but still the ugliest by Inf0phreak · · Score: 4, Informative
    Would you please turn off the moronic "smart quotes" feature in IE?
    Seeing things like this:
    I&#146;m
    is hurting my eyes.

    This page has more information about this horrible malfeature.

    --
    ________
    Entranced by anime since late summer 2001 and loving it ^_^
  20. Slow C++ compiler by siesta at uni · · Score: 4, Informative

    The article says he used GCC to compile the C++ versions, but GCC produces code that isn't nearly as good as the Intel compiler for example. (Here, but no good if you don't subscribe)
    A lot of the test results are close, and I think a different compiler would change the outcome.

  21. Re:Caught up with the speed, but still the ugliest by DeckerEgo · · Score: 2, Informative

    Also check out SkinLF from L2FProd - it's a library that makes it very easy to use GTK themes, KDE themes or even both together to make a very nice native-looking interface. I use it with ConsultComm and have had very nice results.

  22. Re:Could use a good analysis by Magnus Pym · · Score: 2, Informative

    I agree with you. This does not match my experience at all. The Java programs I have used (especially anything with a GUI) have been bloated and much slower than programs in C++ that do 10 times as much. I would be curious to see if these benchmarks included things like opening windows, pulling down menus etc.

    Magnus.

  23. c# is FASTER than JAVA by gamesmash · · Score: 2, Informative

    Because C# is faster than Java, and Java is faster than C++, then by transitivity, C# is faster than C++. To back up my claim that C# is indeed faster, I cite as my source a researcher at Cornell: http://www.cs.cornell.edu/vogels/weblog/2002/11/24.html

  24. Re:Nort really surprising by IMarvinTPA · · Score: 2, Informative

    I think you have some fatal flaws in your C code. Your string buffer isn't long enough to hold what you are doing. And they aren't doing the same thing. The first code is going to count to 10240. The second code is going to display ascii characters from null through to character 255 and then do some very bad things.

    Bad code, BAD!

    IMarv

  25. Re:He used g++ to compare C++ with Java... by ky11x · · Score: 5, Informative

    g++'s goal is modularity for ease of porting in cross-platform cross-compiling. Aggressive optimization is not one of its strengths. The point of such benchmarks is really not a language comparison, but a comparison between the code generated by the most optimized compilers for that language on a specific platform. Using g++ for this simply causes the study to lose credibility.

  26. Magicosm by wurp · · Score: 2, Informative
    Magicosm is a 3D real time persistent online world. It's not released yet, but we have about a dozen beta testers. If we were funded, we would be way done by now; working on things only in your spare time is a bitch ;-)

    We use Xith3D (primarily written by our main client developer), a Java3D workalike. Xith3D was spun off in response to Sun's news that Java3D would no longer be supported. Sun's decision may have been reversed; I'm not entirely sure.

    Anyway, we have slick looking 3D that performs just fine; comparably to other engines. It's on top of an API (Xith3D/Java3D) that sits on top of opengl.

    There have been several good 3D java games displayed at the GDCs, stuff from FullSail and GetAmped.

    By the way, the project is currently going through a lull as I work on another side project (an online yard sale) and the primary client developer has had to leave the team to spend more time with his family. Send us a note at jobs@magicosm.net if you want to help out as a developer, 3D artist, system administrator, or (especially) investor!

  27. Re:Largest Prime by jfengel · · Score: 2, Informative

    I know you're joking to make a point, but you do realize that 1 isn't prime, right? That's not just a matter of arbitrary definitions; a lot of theorems that apply to primes don't apply to 1.

  28. different requirements by vlad_petric · · Score: 4, Informative
    The server one is optimized for throughput and concurrency, whereas the client one for latency.

    You might think that the two are the same, but the two settings actually make a visible impact if you're running on a multi-processor system. Most notably, the garbage collector and locking primitives are implemented differently.

    --

    The Raven

  29. Quake2, Alien Flux, Tribal Trouble by maxgilead · · Score: 2, Informative

    Sure, there's a Java port of Quake 2, there's Alien Flux, Tribal Trouble. But, as others already mentioned, Java is mostly used for programming game logic. Its performance is constantly improving, and only recently has it gained enough speed to be seriously considered for writing entire game engines.

  30. Re:One example of why the tests are BS by andy55 · · Score: 2, Informative
    Why allocate and deallocate an object within the scope of a function?

    You would do it if you need a scrap object only sometimes (and didn't want to pay the overhead penalty of instantiating it every time the proc got called). Here's an example:


    void foo() {
        ...
        if ( SomeRareCondition() ) {
            AReallyNastyObject* temp = new AReallyNastyObject;
            ...
            delete temp;
        }
        ...
    }

  31. Re:He used g++ to compare C++ with Java... by exp(pi*sqrt(163)) · · Score: 4, Informative

    g++ isn't great at optimizing. For code I write it's somewhere between 0 and 50% slower than MSVC. It depends a lot on the type of code of course. For pure numerical work I think the Intel compiler usually scores highly so I'm surprised you're not seeing much difference.

    --
    Doesn't it make you feel good to know that our freedoms are protected by politicans, lawyers and journalists.
  32. Re:One example of why the tests are BS by Bloater · · Score: 3, Informative

    The cout is done twice, and the new and delete are each done only once. They are not the reason for the poor performance.

    The problem is that g++ probably does not optimise it all inline, whereas the particular java VM he has chosen to use probably does.

    Although defining the Toggle variables with auto storage class may give g++ the hint it needs to realise this.

    Additionally, the activate method is declared to be virtual, this shouldn't be a problem, except that it may further hide the optimisation opportunity from g++. Note that the description of the test does not stipulate that it is testing virtual methods.

  33. Quick analysis... by the_skywise · · Score: 2, Informative

    In the bulk of his results, C++ on an i686 beat out the CLIENT JVM every time except in two tests. Object creation and word count. In the object creation test the code is biased towards Java. He's creating the objects AND DELETING THEM in C++, but Java's garbage collection probably isn't doing the deletion at all.

    The other test is the word count. This one is interesting because he sets the streambuffer to 4k in both Java and C++. But in the C++ version the stream won't preload to fill the buffer. So the amount being cached is UNKNOWN. I can't speak for the Java version but I bet it preloads the entire file.

    That leaves the Server JVM switch. In which case I think you're seeing a lot more code inlining than the standard C++ compiler generates.

    Either way, this is hardly a definitive test.

  34. Re:He used g++ to compare C++ with Java... by vlad_petric · · Score: 2, Informative
    Try inter-procedural optimizations. For that you have to give icc/icpc the -ipo flag, and furthermore use xiar instead of ar and xild instead of ld.

    The results might shock you.

    --

    The Raven

  35. Re:C++ hash code is hobbled? by Anonymous Coward · · Score: 1, Informative

    Ah. I see. I'm used to using strings or ints as keys in std::map functions.

    Most C++ programmers would do that. Then again, most C++ programmers would also bother to deallocate the arrays returned by strdup(), too, so I guess this guy isn't a C++ programmer.

    I don't normally use hash_map because it's not in the g++ 3.1 distro (my company mandates this version).

    IME, most people don't use hash_map, since it's not standard. My understanding is that it will be standard in the next revision of the C++ standard, so maybe it will gain more popularity then.

    Is there another source for free STL implementations out there besides the archaic SGI STL?

    Ever used STLPort? It's supposed to be pretty decent, considering that it's free.

    not sure about ::hash_map, but the ::map lookup calls a function in this case. using k->second would dereference an ESI pointer to memory, which is faster than an inline call to a member function. not sure if hash_map does the same. yes?

    You're right. It probably would improve performance to some measurable extent. The previous post stating otherwise is probably wrong.

  36. Re:Riiiiiiight by l33t-gu3lph1t3 · · Score: 2, Informative

    Windows nowadays *is* very stable. I've been working as a systems administrator over Windows NT, 2000, XP, and 2003 Systems for about 2 years now (obviously I haven't spent that much time on 2003...) and you know what? Except for one really screwy problem that later turned out to be memory-related, I have never had stability problems with those flavours of MS Windows.

    Ok that's not entirely true. I've seen Windows stability problems. They were the result of user stupidity like truckloads of spyware and hard disks with no space left.

    And as far as "idiotic debug statements" go, GCC holds the frikkin crown for those. When java compiler or runtime crashes, it tells you more about what went wrong than C++ does...perhaps the reason you see amateurish java programs is that the amateurs and programming students are switching from C++ (a language no one should use unless they're supremely gifted as a programmer) to Java (a language that saves them from their own stupidity).

    --
    ------- "From bored to fanboy in 3.8 asian girls" ----------
  37. Re:Explanation by areve · · Score: 2, Informative

    java is JIT compiled not interpreted

  38. Benchmarks? by slapout · · Score: 2, Informative

    It's not the benchmarks that count. It's the programs I need to run. Every program I've tried that's written in Java takes longer to start up than one written in C/C++. Although that may change with .Net :-)

    Java has gotten better though. The programs are usable nowadays. (They just have that start-up time as the virtual machine loads.) It used to be that the programs loaded slow and ran slow.

    --
    Coder's Stone: The programming language quick ref for iPad
  39. String concat sillyness by danharan · · Score: 4, Informative
    The article mentions Lea modified the String concatenation code, although Java still lost to C++ in that test. He unfortunately didn't do a great job:
    import java.io.*;
    import java.util.*;

    public class strcat {
        public static void main(String args[]) throws IOException {
            int n = Integer.parseInt(args[0]);
            StringBuffer str = new StringBuffer();

            for (int i=0; i<n; i++) {
                str.append("hello\n");
            }

            System.out.println(str.length());
        }
    }
    Instantiating the StringBuffer with an approximate size would prevent it from having to reallocate its char array every time it runs out of space. new StringBuffer(n*6) for n=10000000 as used in his test should have a pretty large impact.
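    The same preallocation idea has a counterpart on the C++ side. This is a minimal sketch assuming std::string is used for the concatenation, which may not match the benchmark's actual strcat source; the function names are hypothetical. reserve() plays the role of the StringBuffer capacity argument:

```cpp
#include <cstddef>
#include <string>

// Appends "hello\n" n times and lets the string grow on demand, which may
// reallocate and copy the buffer several times along the way.
std::string build_without_reserve(std::size_t n) {
    std::string s;
    for (std::size_t i = 0; i < n; ++i) {
        s += "hello\n";
    }
    return s;
}

// Same loop, but capacity for the final size is requested up front.
// "hello\n" is 6 characters, hence n * 6.
std::string build_with_reserve(std::size_t n) {
    std::string s;
    s.reserve(n * 6);
    for (std::size_t i = 0; i < n; ++i) {
        s += "hello\n";
    }
    return s;
}
```

    Both functions produce the same string; any difference is only in how often the buffer is reallocated, and how much that matters depends on the library's growth policy.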

    I could not run the test for 10M, but ran it for up to 1M. 541 milliseconds in one case, 280 in the other. Here's the code I used (I had to modify the timing cause I'm running XP):
    import java.io.*;

    public class Strcat2 {
        public static void main(String args[]) throws IOException
        {
            long start, elapsed;
            start = System.currentTimeMillis();

            int n = Integer.parseInt(args[0]);
            // Preallocate room for n copies of the 6-character "hello\n"
            StringBuffer str = new StringBuffer(n*6);

            for (int i=0; i<n; i++)
            {
                str.append("hello\n");
            }

            System.out.println(str.length());
            elapsed = System.currentTimeMillis() - start;
            System.out.println("Elapsed time: "+elapsed);
        }
    }
    The only difference in the class Strcat besides the class name is the instantiation of StringBuffer.

    NB: I'm not accusing the author of bias against Java, nor am I ignorant of the fact a bunch of /.'ers could kick my ass in C++ optimization. It would be interesting however to have a distributed benchmark, where in the true spirit of OSS we could fiddle with it until we could not wring any more performance gains.
    --
    Information: "I want to be anthropomorphized"
  40. Re:He used g++ to compare C++ with Java... by mwillis · · Score: 5, Informative

    Intel gives their c++ compiler away free for non-commercial hobbyist use on linux.

    The windows version has a free trial that runs for 30 days.

    Try it. See if it makes a difference. If it doesn't, torch it. If you find it makes your critical code run 2x faster, then... have a look at what a computer that runs 2x faster will cost you, and then decide what to do.

  41. Re:-O3? by BenjyD · · Score: 3, Informative

    Because -O3, despite what many people say, doesn't very often generate faster code. In many cases the extra inlining can create slower code.

    For example:

    methcall.cpp -O2 1.8s -O3 1.8s
    fib.cpp -O2 3.7s -O3 3.7s
    matrix.cpp -O2 1.8s -O3 1.8s (interestingly, adding -march=athlon-xp for my machine reduces time to 1.5s)

  42. Re:He used g++ to compare C++ with Java... by CoughDropAddict · · Score: 3, Informative
  43. Command and source/test review. by Ninja Programmer · · Score: 4, Informative

    erm ... I only checked the fibonacci routine, but it's actually quite funny - he's branching recursive calls, a clear case when a smart-enough runtime optimization would work better. I mean, any reasonably smart optimizer would eventually figure out that there are too many calls to the same function with the same argument to just stand by and watch. I'd say that given this difference c++ did quite alright in that one.

    This is known as the "halting problem". No, the compiler cannot guarantee the ability to transform a recursive solution to a non-recursive one. The case of the fibonacci algorithm is a particularly difficult one to transform properly if the compiler hasn't special cased it.

    That said -- Ack and Fib are call overhead limited. They are examples of poor quality code whose performance is not inner loop based.

    Hash will be C-string (specifically strcmp and sprintf) limited in performance. The performance is therefore very data dependent (since Java uses length delimited strings.) Using a fast string class such as "The Better String Library" (http://bstring.sf.net) would have yielded C++ far better performance. A similar comment applies to the strcat test.

    The Heapsort is a particularly bad implementation. In good implementations, the Intel compiler really takes gcc to town. See: http://www.azillionmonkeys.com/qed/sort.html

    Integer Matrix multiplying is an extremely rare application. So I wouldn't put too much stock in the results here -- though, I would be surprised if there was much differentiation between Java and C++ on this test.

    The method calling, I think, will be very much limited by the compiler's ability to inline past method calls. I think Intel C/C++ differentiates itself on such things.

    The Nestedloop and random tests are interesting -- I don't see how Java is supposed to beat C++ on them, but it's possible for the two to be equal.

    I don't know enough about the Java object system and barely enough about C++ object system to comment on sieve or objinst.

    It seems to me that sumcol and wc are going to be IO limited.

    I don't think this test is exactly fair, as the code is not representative for tasks where performance really matters.

    1. Re:Command and source/test review. by arkanes · · Score: 4, Informative
      The method call is a really egregious case of bad methodology. The C++ test he's using was designed to test method call overhead, which is why it returns by value instead of by reference. The massive performance improvement from the JVM (server only, note) comes from the JVM aggressively inlining the call (it's a single method call in a tight loop, an obvious inlining candidate), so what he's measuring is just loop performance, not method call overhead. If he hadn't left loop unrolling and function inlining disabled in GCC (by using -O2), C++ would have performed much better.

      It's worth pointing out that inlining is a case where a VM can really shine for optimizing because it has a lot more options available (partial inlining, etc) and can make better decisions about tradeoffs. But this particular benchmark is comparing apples to oranges.

    2. Re:Command and source/test review. by norton_I · · Score: 4, Informative

      The method test is limited by two things. First, gcc appears unable to inline virtual method calls through a pointer to an object, even when the exact type is available. Second, he didn't turn on loop unrolling. Loop unrolling is a huge, huge advantage on any sort of null benchmark like this.

      Changing the objects to be stack allocated and adding -funroll-loops moves the c++ benchmark up to just ahead of the java benchmarks.

      Of course, this does point to several advantages of a runtime optimizer. A static optimizer will never be as good at optimizing virtual function calls as a runtime optimizer, since it will never be as good at identifying types. Also, a runtime optimizer will always be better at creating specializations of existing functions (creating a special version of a routine when some input value is treated as a constant).
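
      A minimal sketch of the two variants, assuming a made-up class `Switch` with a virtual `flip()` (these names are hypothetical, not from the benchmark): the same loop over a heap object reached through a pointer, and over a stack object whose exact type the compiler can see.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class with one virtual method called in a tight loop.
struct Switch {
    explicit Switch(bool start) : state(start) {}
    virtual ~Switch() = default;
    virtual Switch& flip() { state = !state; return *this; }
    bool value() const { return state; }
    bool state;
};

// Heap object behind a pointer: each flip() is a virtual dispatch
// unless the compiler manages to prove the dynamic type.
bool run_heap(int n) {
    assert(n >= 0);
    std::unique_ptr<Switch> s(new Switch(true));
    bool v = true;
    for (int i = 0; i < n; ++i) v = s->flip().value();
    return v;
}

// Stack object: the exact type is known at the call site, so the
// call can be resolved statically and is a candidate for inlining.
bool run_stack(int n) {
    assert(n >= 0);
    Switch s(true);
    bool v = true;
    for (int i = 0; i < n; ++i) v = s.flip().value();
    return v;
}
```

      Both functions compute the same thing; whether the second is actually faster depends on the compiler and flags, so treat this as the shape of the change rather than a measured result.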

  44. Really? Try this one. by rjh · · Score: 3, Informative
    Given *any* algorithm, I can come up with a c++ implementation that is faster than a Java implementation. Period.
    I'd like to see a C++ implementation of the Halting Problem that's faster than a Java implementation, please, thank you.

    Oh, wait, you can't do that because nobody can write Halting.

    I guess that means there are some algorithms for which you can't write a faster C++ version. Next time, try less rhetoric and more facts. "There exist lots of algorithms for which I can code a C++ implementation that's faster than a Java implementation" is good. The instant you make a unilateral statement like the one you just made, though, it shows that you don't know as much about computer science as you think you know.

    Fact: there exist cases where Java is faster due to its ability to optimize on the fly. And if you know C++ as well as you think you do, this shouldn't surprise you. C++ beats C so handily for many tasks because C++ is able to do much better compile-time optimization largely on account of the C++ compiler having access to much more type information than a C compiler. When Java beats C++, it's on account of Java having access to much more information about runtime paths than a C++ compiler. ("Much more" may be an understatement; C++ doesn't even try!)

    In other words, the JVM (sometimes) beats C++ for the exact same reason that C++ almost always spanks C; the faster implementation has access to more information and uses that information to make more efficient use of cycles.

    I don't think these situations will appear all that often, and I am deeply skeptical of this guy's "in the general case, Java is faster" conclusion. But my skepticism isn't leading me to make rash statements which cannot be backed up.
  45. One mistake (Was: Re:Explanation) by jdennett · · Score: 2, Informative

    Almost no C++ implementation calls the OS (kernel) for every memory request, precisely because that's too slow.

    More to the point, C++ doesn't *have* to use dynamic allocation so often, but in badly written code it may well do so, and that will hurt performance. In C++ you can drop objects on the stack, in Java you can't. Heap/GC allocation can be pretty quick, but not quite as quick as stack allocation.

    1. Re:One mistake (Was: Re:Explanation) by Anonymous Coward · · Score: 1, Informative
      More to the point, C++ doesn't *have* to use dynamic allocation so often, but in badly written code it may well do so, and that will hurt performance. In C++ you can drop objects on the stack, in Java you can't. Heap/GC allocation can be pretty quick, but not quite as quick as stack allocation.

      Still, it's a case of C++ having some advantages and Java having some other advantages. Java can't put things on the stack, but the way it allocates objects is that it dedicates a block of memory called "Eden", and just goes through it sequentially, allocating one object right after another, not doing any searching for available space.

      Then, because there are no native, machine-level pointers in Java like there are in C++, it is possible to move objects without screwing up references. So when "Eden" gets full, you can just compact the sequence of objects. Java uses a generational garbage collector that takes advantage of the fact that most objects are short-lived, which means that in most cases, Eden (the newbie generation of objects) will have a lot of objects in it that can be freed. Those that cannot be freed are recognizable as relatively long-lived objects and can be moved somewhere else, giving you a nice big block of memory that's once again free to use to allocate objects very quickly.

      The other thing Java has going for it is that all this can happen behind the scenes because objects can be moved and references changed without causing a problem. C++ can't do this. So, in many cases (such as when the application is blocked waiting for I/O), the Java garbage collector thread can actually get some work done that C++ would have to do either at the time you do a new or delete (i.e. inline instead of offline).
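
      The sequential allocation described above can be sketched as a bump-pointer arena. This is a toy under stated assumptions (the class name `BumpArena` and its interface are hypothetical, and it is not how any particular JVM is implemented); it shows only why allocation is a bounds check plus an addition, and leaves out the compaction and moving of survivors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical arena that hands out memory by advancing an offset.
class BumpArena {
public:
    explicit BumpArena(std::size_t bytes) : buf_(bytes), top_(0) {}

    // Returns nullptr when the arena is full; a real collector would
    // run a young-generation collection at that point instead.
    void* allocate(std::size_t bytes) {
        assert(top_ <= buf_.size());
        const std::size_t align = alignof(std::max_align_t);
        if (bytes > buf_.size() - top_) return nullptr;
        const std::size_t rounded = (bytes + align - 1) / align * align;
        if (rounded > buf_.size() - top_) return nullptr;
        void* p = buf_.data() + top_;
        top_ += rounded;  // the whole cost of an allocation
        return p;
    }

    // Stand-in for "everything here is dead": reuse the block from the start.
    void reset() { top_ = 0; }

    std::size_t used() const { return top_; }

private:
    std::vector<unsigned char> buf_;
    std::size_t top_;
};
```

      There is no free-list search and no per-object free; the cost of reclaiming space is paid in bulk when the block fills up.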

  46. Java performance "truths" change over time by eduardodude · · Score: 5, Informative

    Check out this recent IBM Developerworks article which looks at how modern JVMs handle allocation and garbage collection.

    Some very surprising tidbits there. For instance:

    "Performance advice often has a short shelf life; while it was once true that allocation was expensive, it is now no longer the case. In fact, it is downright cheap, and with a few very compute-intensive exceptions, performance considerations are generally no longer a good reason to avoid allocation. Sun estimates allocation costs at approximately ten machine instructions. That's pretty much free -- certainly no reason to complicate the structure of your program or incur additional maintenance risks for the sake of eliminating a few object creations."

    Read the article for an excellent nuts-and-bolts analysis of many current performance considerations. From the posts in this thread, it's pretty clear a lot of people haven't looked into what's actually done in a server JVM these days.

  47. Re:Explanation by neurojab · · Score: 2, Informative

    >When it comes to speed, compiled languages will always run faster than interpreted ones, especially in real-world applications.

    It's a bit simplistic to call Java an "interpreted" language. A Java source file is compiled into bytecode. Modern JVMs then take this bytecode and use a Just In Time compiler to compile it into machine code just before it runs. Naturally this is a bit of overhead up front, but once the class is JITed, it will perform on the order of natively compiled code written in another language.

  48. Re:java *can* be fast... by Fnkmaster · · Score: 2, Informative
    Yes, EJBs are generally layers of needless cruft. There are lots of perfectly good situations that require distributed systems, and EJBs provide one of the worst models to solve truly partitionable problems and effectively distribute load, and if you just want reliability or failover capabilities, there are plenty of easier ways to get that than using EJBs.


    I agree with you - the only business apps I've seen that really NEEDED C++ were some very tight-loop mathematically intensive things where the 2x-4x performance difference imposed by lots of array bounds-checking became a limiter to performance optimization with the Java VM implementation - and that was easily solved by a small chunk of JNI code implementing the iteration over the arrays.
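
    For reference, the native side of that trade-off looks roughly like this. It is a sketch with hypothetical function names, not the JNI code from those apps: the same dot product written once with a bounds check on every access and once without.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Checked: at() validates the index on every access and throws
// std::out_of_range on a bad one.
long long dot_checked(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    long long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += static_cast<long long>(a.at(i)) * b.at(i);
    return sum;
}

// Unchecked: operator[] does no validation, so the loop body is
// just load, multiply, add.
long long dot_unchecked(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    long long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += static_cast<long long>(a[i]) * b[i];
    return sum;
}
```

    An optimizer can sometimes hoist or remove the checks in the first version on its own, so the gap between the two varies with the compiler or VM.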

  49. Re:The language does matter by gregor_b_dramkin · · Score: 2, Informative

    I believe you can declare a java method or class as "final". This optional behaviour is like C++'s default behaviour.

    BTW. I looked at the "shootout" article when it first reared its ugly head. I recall the Python code looking like it had been written by someone who had read 3 chapters of "Teach Yourself * in 21 Days". Utter crap.

    --
    You can never equivocate too much.
  50. Re:Could use a good analysis by Paul+Lamere · · Score: 2, Informative
    One of the papers cited on the page is "FreeTTS - A Performance Case Study", a paper written by the speech team here at Sun's Research Labs.

    This paper describes the performance issues we encountered when developing FreeTTS. I think it is a pretty good representation of the issues involved in developing a high-performance Java application along with a comparison between a Java and a native-C version of the same application. This paper describes how we ported a native-C synthesizer (Flite) to Java (FreeTTS) and how we were able to get better performance from our engine.

    This is not a toy application but a real application that performs well in a domain where performance really matters.

  51. Re:I don't actually care hugely about performance by Xetrov · · Score: 2, Informative

    Converting objc -> c is trivial -- There are a number of objc -> c converters.

    Actually I think that might be how most of the compilers actually work :)

    So if you like Objective-C and would rather code in it... you can!

  52. Re:Caught up with the speed, but still the ugliest by Anonymous Coward · · Score: 1, Informative

    You've got it wrong. This is correct:

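    #include <cmath>

    const double sqrt5 = std::sqrt(5.0);
    const double gold = (1.0 + sqrt5) / 2.0;   // golden ratio
    const double goldi = (1.0 - sqrt5) / 2.0;  // its conjugate

    // Closed-form (Binet) Fibonacci; exact only while the result fits double precision.
    int fibonacci(int n)
    {
        // Round to nearest: floor() can land one short if the quotient comes out as 7.999...
        return static_cast<int>(std::lround((std::pow(gold, n) - std::pow(goldi, n)) / sqrt5));
    }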

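    If you round to the nearest integer rather than flooring, you don't need the goldi part at all: |goldi^n / sqrt5| is below 0.5 for every n >= 0.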

  53. Re:Java is not faster than optimized c++ by p3d0 · · Score: 2, Informative
    Runtime optimization analyzes the bytecode that's running and can come up with ways of optimizing things at runtime. However, this optimization was coded into the JVM, and there's absolutely nothing that stops you from emulating that approach in C.
    Wrong. JIT compilation offers you two things you can't get from a static compiler: (1) dynamic information about a particular run of the program, and (2) the opportunity to make optimistic assumptions, knowing you can re-compile if your assumptions are violated.

    Suppose that during a particular run, there is a particular variable x that, whenever we look at it, always has the value 42. Now we want to compile a method that uses that value. A JIT can optimistically assume that value will always be 42. Now it can go to town on optimizations involving that variable. Got an "if(x > 10)" statement? You can omit the compare and branch. Got a division with x as the denominator? You can omit the divide-by-zero check, and turn the divide into a cheaper multiply-by-reciprocal. And so on, and so on. Then, you register your assumptions with a runtime service that is capable of re-compiling the method if your assumptions turn out to be incorrect. If you ever see that variable with a value other than 42, you recompile it (and do some fancy footwork to deal with threads that might already be running the obsolete version).

    What you end up with is a method implementation that a static compiler just could not produce. Perhaps, if it were very smart, it could use profile-directed feedback to create a specialized version of the method for each of a number of "likely" values of x. However, unless you want to try to cover all 4 billion possible integer values, you must include a backup version for when you turn out to be wrong. Plus, you'll always have the overhead of choosing which of the specialized versions to call in the first place. This is all overhead the dynamic version doesn't have.
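
    Here is a sketch of what the static alternative looks like, using made-up function names and a made-up zero-divisor policy: one version specialized for x == 42, a generic backup, and the dispatch that has to pick between them on every call.

```cpp
#include <cassert>

// Generic backup: works for any x, so it keeps the zero check,
// the comparison, and a real division.
int scale_generic(int value, int x) {
    if (x == 0) return 0;        // hypothetical policy for a zero divisor
    if (x > 10) value += 1;
    return value / x;
}

// Specialized for x == 42: "x > 10" is known true, the zero check is
// gone, and division by a constant can be strength-reduced.
int scale_x42(int value, int x) {
    assert(x == 42);             // the assumption this version relies on
    (void)x;                     // unused when asserts are compiled out
    return (value + 1) / 42;
}

// Static dispatch: this guard is the per-call overhead described above.
int scale(int value, int x) {
    return x == 42 ? scale_x42(value, x) : scale_generic(value, x);
}
```

    A JIT that has registered the "x is always 42" assumption can call the specialized body directly and only falls back, by recompiling, if the assumption is ever violated.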

    So it is definitely not impossible for a dynamic compiler to outperform a static one. It is just very rare, given the current state of the art of compilers.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  54. you have no idea what you're talking about by autopr0n · · Score: 2, Informative

    Are you trolling or what?

    This is known as the "halting problem". No, the compiler cannot guarantee the ability to transform a recursive solution to a non-recursive one. The case of the fibonacci algorithm is a particularly difficult one to transform properly if the compiler hasn't special cased it.

    No it's not. In fact, it's not even close to the definition of the "halting problem". The Halting Problem is "Given input X and program Y, will Y ever finish its calculation and halt when given X as an input?". It's a 'problem' that no computer program can be written to solve.

    It has nothing to do with whether or not a compiler can convert a recursive algorithm into a non-recursive one.

    That it's possible to convert looping programs into recursive programs is trivially true, because recursive functions alone are Turing complete. The reverse holds too: a Turing machine, which has no concept of the 'stack' needed for recursive calls, can run any recursive program by emulating the stack on its tape.

    If you want a real world example, just compile any recursive code in any language to MIPS or some other RISC instruction set without push/pop and call functions like those found on X86 chips. You don't "call" functions directly, you just move around in memory and jump all over the place, just the same as you would in looping code.
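
    The same mechanical conversion can be done by hand at the source level. A minimal sketch (function names are mine, for illustration): the recursive Fibonacci next to a version with no recursive calls, where the pending arguments live on an explicit stack.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Ordinary branching recursion.
std::uint64_t fib_recursive(int n) {
    assert(n >= 0);
    if (n < 2) return static_cast<std::uint64_t>(n);
    return fib_recursive(n - 1) + fib_recursive(n - 2);
}

// Same computation as a loop: every pending call becomes an entry
// on an explicit stack, and base cases add their value to the total.
std::uint64_t fib_explicit_stack(int n) {
    assert(n >= 0);
    std::vector<int> pending{n};
    std::uint64_t total = 0;
    while (!pending.empty()) {
        const int k = pending.back();
        pending.pop_back();
        if (k < 2) {
            total += static_cast<std::uint64_t>(k);
        } else {
            pending.push_back(k - 1);
            pending.push_back(k - 2);
        }
    }
    return total;
}
```

    Note that this removes the recursion, not the work: the loop still visits every node of the call tree. Turning it into the linear-time loop is the separate, harder transformation.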

    --
    autopr0n is like, down and stuff.
    1. Re:you have no idea what you're talking about by Ninja+Programmer · · Score: 1, Informative

      Are you trolling or what?

      I am a professional programmer with plenty of math background. Besides running a porno-site, who are you?

      This is known as the "halting problem". ...

      No it's not. In fact, it's not even close to the definition of the "halting problem". The Halting Problem is "Given input X and program Y, will Y ever finish its calculation and halt when given X as an input?". It's a 'problem' that no computer program can be written to solve.

      I didn't say it's the definition of the problem. But you clearly don't understand how this definition applies to the real world. Any algorithm that determines whether or not a recursion has a simple degeneration to an iterative formula has to solve arbitrarily complex math problems.

      Solving or knowing if these math problems can be solved is equivalent to the halting problem.

      The point is that the best compilers can do here is pattern matching, which isn't going to be worth it given the relative infrequency of any given special case.

  55. K5 doesn't allow new user sign-ups by tepples · · Score: 2, Informative

    At last count, Kuro5hin was closed to new users. Therefore, K5 is not a general purpose discussion site. New users have mostly gone to K6 instead.

  56. Re:Troll by Ninja+Programmer · · Score: 2, Informative

    Gcc is designed for compatibility with a wide range of architectures, and is not optimized for a single one. He also (apparently) used stock glibc from Red Hat. And only one "test", the method call test, showed java to be a real winner. And even then, it's server-side Java, which is meaningless when you talk about it as a day-to-day dev language (i.e., creating standalone client-side apps).

    Intel's (heavily optimized) C++ compiler should be a damn sight faster, and so should VC++.


    This is a slight misrepresentation. gcc actually does quite respectably on x86 platforms -- it's easily as good as MSVC++, and it's clear that the gcc people have put a lot of work into this compiler. Of course, the Intel C++ compiler is truly awesome and leaves pretty much every other x86 compiler in the dust, but this is really a case of Intel just putting a truly amazing effort into their compiler rather than anyone else coming up short.

    The real issue with these tests is that pretty much none of them have real computational inner loops. They all measure unlikely program overhead that could easily be removed with any reasonable rerendering of the code.

  57. Re:My Hero! by Kosgrove · · Score: 2, Informative

    One of the most important reasons software has advanced is exactly BECAUSE of information abstraction - APIs, object-oriented computing.

    In my experience, it's not a bad thing to be beholden to an API author. The author(s) likely know the system better than you. It's one of the things that helped humanity advance in general - specialization. It shouldn't be taken to the extreme, of course: SOMEONE has to know how the stuff works, but not most programmers.

    Assembly language is good to know, but many (most?) coders will not ever have a need to touch it, unless they are doing embedded design or compiler development. There are much more useful things to learn for those of us that write software. For example, I am absolutely shocked at the number of students who graduated with me that had taken a SPARC assembly class (required), but knew zero about relational databases. (Although to be fair to my alma mater, Penn State, I believe that situation has since been rectified.)

    By the way, I know assembly. MC68k as an undergraduate, a year of MIPS architecture as a graduate.

  58. Re:He used g++ to compare C++ with Java... by Anonymous Coward · · Score: 1, Informative
    This can also be because G++'s floating-point code has higher accuracy than MSVC's.

    For example, sin() calls in MSVC lead to a single FSIN assembly instruction. But Intel's FSIN instruction is known to be quite inaccurate for certain ranges of angles, so G++ inserts code before and after FSIN which corrects the result to be more accurate. (This feature can be disabled with the -ffast-math switch.)

  59. Re:Write a JVM in Java??? by ArsenneLupin · · Score: 2, Informative
    For example, the GCC compiler can compile its own source code.

    Yes, the GCC compiler. However a JVM is an interpreter.