Performance Benchmarks of Nine Languages
ikewillis writes "OSnews compares the relative performance of nine languages and variants on Windows: Java 1.3.1, Java 1.4.2, C compiled with gcc 3.3.1, Python 2.3.2, Python compiled with Psyco 1.1.1, Visual Basic, Visual C#, Visual C++, and Visual J#. His conclusion was that Visual C++ was the winner, but in most of the benchmarks Java 1.4 performed on par with native code, even surpassing gcc 3.3.1's performance. I conducted my own tests pitting Java 1.4 against gcc 3.3 and icc 8.0 using his benchmark code, and found Java to perform significantly worse than C on Linux/Athlon."
I am not a compiler nerd (IANACN?), so maybe someone else can answer the following simple question:
Why are the Microsoft languages so fast with the Trig functions?
I'm a 2000 man.
Not sure of the accuracy. Benchmark is on a loop:
32-bit integer math: using a 32-bit integer loop counter and 32-bit integer operands, alternate among the four arithmetic functions while working through a loop from one to one billion. That is, calculate the following (while discarding any remainders)....
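For anyone without the benchmark source to hand, here is a minimal Python sketch of the loop shape that description implies. The function name `int_arithmetic` and the starting value of 1 are assumptions, not taken from the article's code. Python integers are arbitrary precision rather than 32-bit, so this shows the structure only, not the overflow behaviour of the compiled languages.

```python
def int_arithmetic(limit):
    """Cycle through subtract, add, multiply, divide while counting up to limit.

    Hypothetical reconstruction of the loop shape described in the text;
    integer division (//) discards the remainder.
    """
    result = 1
    i = 1
    while i < limit:
        result -= i   # subtraction
        i += 1
        result += i   # addition
        i += 1
        result *= i   # multiplication
        i += 1
        result //= i  # division, remainder discarded
        i += 1
    return result


if __name__ == "__main__":
    # A small limit keeps the run short; the benchmark described uses one billion.
    print(int_arithmetic(1_000_001))
```

Every pass through the loop does four arithmetic operations and four counter updates, so the counter handling costs about as much as the arithmetic being measured.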
It also relies on the strength of the compiler, not just the strength of the language.
The Custom Mary
Why did VB do so badly on I/O compared to the other .NET languages? They were pretty much equal up until the I/O benchmarks. Any chance of getting the code that was used to test this published?
Well, for performance it does. For cross platform compilation it rocks the house. If you really want performance you need to be using something like Intel's C compiler (which oddly was not tested)
Finkployd
Why benchmark the various ".NET languages" (those languages whose compilers target the CLR)? Every compiler targeting the CLR produces Intermediate Language, or more specifically MSIL. The only differences you'd find are in the optimizations each compiler performs, which usually don't amount to much (for example, VB.NET allocates a local variable for the old "Function = ReturnValue" syntax whether you use it or not).
Look at the results for C# and J#. They are almost exactly the same, except for the I/O, which I highly doubt. Compiler optimizations could squeeze a few more ns or ms out of each procedure, but nothing like that. After all, it's the IL from the mscorlib.dll assembly that's doing most of the work for both languages in exactly the same way (it's already compiled and won't differ in execution).
When are people going to get this? I know a lot of people who claim to be ".NET developers" but only know C# and don't realize that the class libraries can be used by any language targeting the CLR (and each has its shortcuts).
...some analysis of the code generated by Visual C++ and gcc side by side, particularly for those trig calls. If there's that great a discrepancy between the runtimes, that's a good clue that either one of the compilers is under-optimising (i.e. missing a trick), or the other is over-optimising (i.e. applying some transformation that only approximates what the answer should be). I didn't see any mention of the numerical results obtained being checked against what they ought to be (or even against each other).
:)
As any games/DSP programmer will tell you, there are a million ways to speed up trig providing that you don't *really* care after 6dps or so.
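One such trick is a lookup table with linear interpolation. The sketch below is only an illustration: the names `TABLE_SIZE`, `SINE_TABLE` and `fast_sin` are made up for this example, and in Python the table will not actually beat `math.sin`; the speed benefit shows up in compiled code. What it does show is the kind of accuracy check the benchmark never reported: comparing the approximation against the library result.

```python
import math

TABLE_SIZE = 4096                       # assumed table resolution
STEP = 2 * math.pi / TABLE_SIZE
# One extra entry so interpolation at the top of the range has a right neighbour.
SINE_TABLE = [math.sin(k * STEP) for k in range(TABLE_SIZE + 1)]


def fast_sin(x):
    """Approximate sin(x) by linear interpolation in a precomputed table."""
    pos = (x % (2 * math.pi)) / STEP
    # Clamp the index in case rounding pushes pos up to exactly TABLE_SIZE.
    k = min(int(pos), TABLE_SIZE - 1)
    frac = pos - k
    return SINE_TABLE[k] + frac * (SINE_TABLE[k + 1] - SINE_TABLE[k])


def worst_error(samples):
    """Largest absolute difference from math.sin over the given sample points."""
    return max(abs(fast_sin(x) - math.sin(x)) for x in samples)


if __name__ == "__main__":
    points = [i * 0.001 for i in range(-10000, 10000)]
    print(worst_error(points))
```

With 4096 entries the interpolation error stays below one part in a million, which is fine for a game and useless for anyone who needs the last digits.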
OK, maybe I'm just bitter because I was expecting gcc 3.1 to wipe the floor.
These sigs are more interesting tha
Well, unfortunately, comparing Java to C# on a Windows machine is like comparing a bird's and a dolphin's ability to swim in water: several components of C# are integrated right into the operating system, so naturally it's going to run faster on a Windows machine. Compare C#, C++ and Java on machines where the components aren't integrated and then we will have a FAIR benchmark.
Oh wait! C# only runs on one operating system. Can you name any other development languages that only run on ONE OS, boys and girls? Neither can I.
This is my sig. There are many like it but this one is mine.
I think anyone who has done much work with either developing or running large scale java programs knows that speed can definitely be an issue.
I would consider myself part of that "anyone," and I disagree with you. Other than load times (which aren't as bad as they used to be), Java can perform as fast or faster than C code. The main thing is to use a good VM - IBM's J9 VM significantly outperforms Sun's.
Someone should do a study on the time taken to design, implement and debug a reasonably complex chunk of code under C++ and Java. I'm pretty sure that the result would show the huge advantage of Java over C++.
The difference between Canada and the USA is that in Canada healthcare is a right and gun ownership is a privilege.
Given the ever-accelerating clock speed of processors, is the raw performance of languages that big an issue? Except for CPU-intensive programs (3-D games, high-end video/audio editing), current CPUs offer more than enough horsepower to handle any application. (Even 5-year-old CPUs handle almost every task with adequate speed.) Thus, code performance is not a big issue for most people.
On the other hand, the time and cost required by the coder is a bigger issue (unless you outsource to India). I would assume that some languages are just easier to design for, easier to write in, and easier to debug. Which of these languages offers the fastest time to "bug-free" completion for applications of various sizes?
Two wrongs don't make a right, but three lefts do.
The Java performance is best explained by an article by Prof Kahan: "How JAVA's Floating-Point Hurts Everyone Everywhere" http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf also see "Marketing vs. Mathematics" http://www.cs.berkeley.edu/~wkahan/MktgMath.pdf I suspect the relatively poor floating-point performance of gcc is also caused by the desire to achieve accurate results.
Using the IBM Java VM, I've consistently been able to cut my runtimes in half compared with the Sun VM. Anyone currently using the Sun VM for production work should test the IBM one and consider the switch.
The application I benchmarked is data, network and memory intensive, although not math intensive, so that's what I can speak for. We consistently use 2 GB of main memory and pump a total of 2.5 TB (yes, TB) of data (doing a whole bunch of AI-style work inside the app itself) through the application over its life cycle, and we cut our total runtime from 6 days to 2.8 days by switching to the IBM VM.
According to these benchmarks it doesn't.
The short of it is that GCC 3.2.1 is highly competitive with ICC 7.0, except for two cases:
FP-intensive code on the Pentium 4
Code that allows Intel C++ to auto-generate SSE vector code for it
A deep unwavering belief is a sure sign you're missing something...
I code in a plethora of languages (mainly Python, C, C++, Java) and trust me, execution speed is not an area I look at anymore when deciding on a C or Java implementation (unless, of course, you're dealing with cross-platform graphics). It hasn't been for 3 years.
I think anyone who has done much work with either developing or running large scale java programs knows that speed can definitely be an issue.
I think it may be time for you to crawl out from that rock you live under.
There was an interesting article in Dr Dobb's a few months back. They did a performance (C++) comparison of 6 or so compilers, gcc included. The end result was that performance-wise (execution AND code size) gcc came in last place in all their testing. However, gcc did win when it came to conformance to the C++ standard, as it was the only compiler that supported all the language features.
Actually, I'm quite comfortable with the performance numbers Python turned in. I use Python quite a bit, and the things the benchmark exercised are exactly the kind of code I'd find when looking for bottlenecks and would, in turn, implement in C or C++.
Python's huge win is not in speed, but in the ability to express the program in a very concise and easy to understand way.
The fact that Psyco can provide huge speed ups via a simple import is just icing.
Conveniently I have the same system configuration as ikewillis (dual 2.0 GHz Athlon MP), but am running Windows XP instead of Linux. I also have Intel C++ 8.0, which he used on Linux to generate his results.
So I ran the same tests that he ran under Linux under Windows. Here are my results from Intel C++ 8.0, with Profile Guided Optimization turned off (comparing to his with PGO on):
Running the same tests under Windows with PGO turned on, the numbers did not change except on the least-significant digits, so I won't bother to list those too. Before running the tests, I set the program to run at high priority on one processor to avoid unnecessary interference from other running applications, or unnecessary processor-jumping--although when I tried it without, there wasn't much of a difference (< 1%).
Conclusions? First, it seems the 64-bit integer performance problem is something that exists only for Intel C++ 8.0 on Linux, not Windows. Second, it seems stdlib I/O performance is significantly higher under Linux than under Windows for this benchmark.
It's hard for thee to kick against the pricks.
I would like to see benchmarks between Java vendors on the same platform for 1.4.x. Specifically, I'd like to see Sun JVM, IBM J9, and BEA JRockit. The question is how other commercial JVMs really stack up against the Sun standard.
I would also like to see benchmarks of the same JVM across different operating systems on the same processor, namely Windows, Linux, BSD, and (if it matters) Solaris x86. The question is how do other JVMs stack up against the Windows 'standard'.
It would also be nice to see a 'leveling' benchmark across different processors, specifically comparing a suite of Java benchmarks on WinTel and MacOS.
Python did pretty badly in the tests. The reason is that in Python it takes a long time to translate a variable name into a memory address (It happens at runtime instead of compile time).
The benchmark code has stuff that basically looks like this:
Adding 1 to i takes no time at all, but looking up i takes a little time. In C this is going to be a lot faster.
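One way to see the interpreter overhead being described is to list the bytecode for such a loop. This is a sketch using the standard `dis` module; the function `count_to` is a hypothetical stand-in for the benchmark loop, and the exact opcode names vary between Python versions.

```python
import dis


def count_to(limit):
    """Hypothetical stand-in for the benchmark's counting loop."""
    i = 0
    while i < limit:
        i = i + 1
    return i


# Names of the interpreter instructions the function compiles to.
opnames = [ins.opname for ins in dis.get_instructions(count_to)]
# The instructions that fetch a variable or constant before it can be used.
load_ops = [name for name in opnames if name.startswith("LOAD")]

if __name__ == "__main__":
    print(opnames)
```

Each `i = i + 1` becomes several interpreter instructions (loads, an add, a store) instead of the single machine instruction a C compiler can emit. Inside a function, locals are fetched by slot index; at module level every access to `i` goes through a dictionary lookup, which is slower still.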
Python did really badly when "i" from the example above was a long, compared to when it was a long in C. That's because Python has big number support, but in C a long is limited to just 4 bytes.
Python did OK in the trig section because the trig functions are implemented in C. It still suffers because it takes a long time to look up variables though.
In real life, variable look up time is sometimes a factor. However, for programs that I've written getting data from the network, or database was the bottleneck.
Why didn't they include ActivePerl?
In the article it rather sounds like they just assumed Python performance would be an indicator of performance for interpreted languages generally, but is there anything to back this up?
This Like That - fun with words!
I actually use C++ for portability, not speed or generic programming (which are nice to have).
If you avoid platform, compiler, and processor specific features, C++ is even more portable than Java. Java on the other hand tends to drag all platforms down to the least common denominator, then requires the use of contorted logic and platform extensions just to attain acceptable performance.
People seem to have forgotten the original intention of C: portable code.
I was a bit surprised by this quote in the article:
"Even if C did still enjoy its traditional performance advantage, there are very few cases (I'm hard pressed to come up with a single example from my work) where performance should be the sole criterion when picking a programming language. I"
I can only assume from this that he has never done, or known anyone who has done, any realtime programming. If you're going to write something like a car engine management system, performance is the ONLY criterion; hence a lot of these sorts of systems are still hand-coded in assembler, never mind C.
I ran four tests using Portable.net and mono. Out of laziness I only ran the Int and Trig benchmarks. All tests were performed on a 2.4 GHz P4.
First, I compiled Benchmark.cs using cscc (portable.net) and mcs (mono).
I then ran each binary with mono and ilrun (portable.net). Results are interesting.
Portable.net compiler: cscc -O3 Benchmark.cs
$ ilrun Benchmark.portable.exe
Int arithmetic elapsed time: 12996 ms
Trig elapsed time: 28700 ms
$ mono Benchmark.portable.exe
Int arithmetic elapsed time: 16235 ms
Trig elapsed time: 4534 ms
Mono Compiler: mcs Benchmark.cs
$ ilrun Benchmark.exe
Int arithmetic elapsed time: 13784 ms
Trig elapsed time: 27939 ms
$ mono Benchmark.exe
Int arithmetic elapsed time: 15994 ms
Trig elapsed time: 4596 ms
As you can see, Portable.net has slightly faster Int math, but crumbles under the trig functions. There is no significant difference between the compilers.
The Portable.net runtime had a serious bug where the time calculated was an order of magnitude out. I used the unix time command to get a more accurate result.
It would be interesting to do this comparison using Microsoft.NET as well. I would assume Microsoft.NET would handily beat these results.
n.b. Please note this was not a comprehensive benchmark. I disabled some of the tests because I didn't feel like waiting (So sue me), while X, xmms, xchat, etc were running.
If cars followed Moore's law we'd all be driving at the speed of light about now. And guess what -- that's completely unnecessary.
No, I wouldn't want a 20 HP engine in my car. But I don't feel the need for a 1.6e9 HP engine, either.
Performance and scale are two different beasts.
However, there are definitely times when some languages are more appropriate than others.
Being mostly a one-language person (used to be guru level on others, but skills have lapsed) I restrict my development to the areas that language is strong in. And let people fluent in the other languages do the other work.
For the things most people use Java for, speed isn't that important compared to the reasons they're using it.
For people who need speed, maybe Java isn't the right choice. So pick something else and get on with it.
For the example environment shown, i.e. writing software to run on windows, I'd pick Delphi anyway - all the speed advantages of C, many of the programming language niceties of Java, all of the front-end simplicity of VB. Delphi rocks for Windows development. Shame it wasn't also benchmarked..
~Cederic
Changing this to 'linesToWrite = [myString] * ioMax' dropped time on my system from 2830ms to 1780ms (I'd like to note that I/O on my system was already much faster than his *best* I/O score, thank you very much Linux)
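The code being replaced is not shown here, so the comparison below is only a sketch under the assumption that the original built the list one element at a time. The function names `build_with_loop` and `build_with_repeat` are made up for this example; `my_string` and `io_max` stand in for the benchmark's `myString` and `ioMax`.

```python
def build_with_loop(text, count):
    """Assumed original approach: append the same string count times."""
    lines = []
    for _ in range(count):
        lines.append(text)
    return lines


def build_with_repeat(text, count):
    """The suggested change: let list repetition do the work in one step."""
    return [text] * count


if __name__ == "__main__":
    my_string = "a line of test data\n"   # placeholder content
    io_max = 1000                         # placeholder count
    print(build_with_loop(my_string, io_max) == build_with_repeat(my_string, io_max))
```

Both produce the same list; the second avoids running a bytecode loop for every element.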
In the trig test, I used numarray to decrease the runtime from 47660.0ms to *6430.0ms*. The original timing matches his pretty closely, which means that numarray would probably beat his gcc timings handily, too. Any time you're working with a billion numbers in Python, it's a safe bet that you should probably use numarray!
I didn't immediately see how to translate his other mathematical tests into numarray, but I noted that his textual explanation in the article doesn't match the (python) source code!
(My system is a 2.4GHz Pentium IV running RedHat 9)
Hate stupid software on freshmeat? Laugh at
Almost all my work programming is function calls and string, dictionary, and list manipulations in a single and multi-threaded apps. A little integer, very little float (money), no trig. In this case the only useful benchmark in the article is I/O.
My guess is that a majority of programmers aren't developing apps heavy in integer, float, or trig. So this benchmark article suggests that we not develop the next iteration of Quake in Python.
Does anyone know of a multiple language benchmark relevant to the rest of us lumpen proletariat programmers?
I think enough other people have already pointed out that the kind of comparison done in this article is rather useless. Different languages are designed for different uses, and while some languages might favour faster code, others might favour ease of development or portability.
Anyway, even when remaining within the same language or language family, the benchmarks are still quite meaningless, for instance when you want to compare the performance of MSVC++ and GCC. The benchmark has several flaws:
- the code is too trivial. It doesn't show how good the compilers really are at optimizing
- the code is too library dependent. For instance, in the trig benchmark, only the runtime library is really benchmarked and not the code generated by the compiler itself
- for the floating point benchmarks, the options chosen for both compilers do not match. For MSVC++, the options chosen favour speed over accuracy, while the GCC options favour accuracy over speed.
The last point can very easily be illustrated with the trig benchmark.
On my computer (P4, 2.8GHz), I get the following results:
1) Options from the article: 10.9s
2) additional option -ffast-math : 6.9s
(this option is also a significant win for the double benchmark)
3) options above plus linking with CRT_fp8.o : 2.8s
The last option may need some explanation:
Programs compiled by MSVC++ by default set the math coprocessor to 64-bit precision, while GCC programs set it to 80-bit. Linking with CRT_fp8.o on Windows platforms makes GCC programs behave like MSVC++ programs and use only 64-bit precision. For arithmetic operations this makes no difference, but the built-in transcendental functions become much faster if you reduce the precision of the coprocessor. So all in all, we were able to reduce the running time of the trig benchmark by a factor of 3.9 just by changing the compilation options. This is almost exactly the difference seen in the article between the MSVC++ and the GCC results for the trig benchmark.
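For anyone who wants to check the factor, the arithmetic on the three timings quoted above is simply the baseline divided by each result. The dictionary keys below are labels invented for this sketch; the numbers are the ones from this post.

```python
# Trig benchmark timings in seconds, as quoted in this post.
timings = {
    "article options": 10.9,
    "-ffast-math": 6.9,
    "-ffast-math + CRT_fp8.o": 2.8,
}

baseline = timings["article options"]
# How many times faster each configuration is than the article's options.
speedups = {name: baseline / seconds for name, seconds in timings.items()}

if __name__ == "__main__":
    for name, factor in speedups.items():
        print(f"{name}: {factor:.2f}x")
```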
All in all, for trivial benchmarks like this, if you choose matching compilation options, different compilers give you almost the same results.
The only real weakness that GCC is showing is 64-bit integer arithmetic. This is badly implemented in GCC and could be vastly improved.
Marcel
I think the author underestimates the impact of having Athlon-specific optimizations turned on for his C compiler (where applicable), while the Java HotSpot JIT compiler likely only optimizes very well for the Pentium 4.
Conclusion would be: the JIT-compiled Java on an Athlon is poorly or not at all optimized.
Sidenote: the author of the original benchmark wanted to compare .NET languages with Java and used gcc as a reference as well. Interesting for me is that Java can compete with .NET.
Further: obviously trig functions (which could be compiled to a single math processor opcode) are not optimized at all in Java. At the language level, calling a trig function is a call to a static method in the class Math. If that is "mapped" one to one to machine code, it results in a JSR to the C function, which contains only a few opcodes; so what a C compiler will compile to one opcode becomes about 10 opcodes in trivially "mapped" Java.
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.