Performance Benchmarks of Nine Languages
ikewillis writes "OSnews compares the relative performance of nine languages and variants on Windows: Java 1.3.1, Java 1.4.2, C compiled with gcc 3.3.1, Python 2.3.2, Python compiled with Psyco 1.1.1, Visual Basic, Visual C#, Visual C++, and Visual J#. His conclusion was that Visual C++ was the winner, but in most of the benchmarks Java 1.4 performed on par with native code, even surpassing gcc 3.3.1's performance. I conducted my own tests pitting Java 1.4 against gcc 3.3 and icc 8.0 using his benchmark code, and found Java to perform significantly worse than C on Linux/Athlon."
Not sure of the accuracy. Benchmark is on a loop:
32-bit integer math: using a 32-bit integer loop counter and 32-bit integer operands, alternate among the four arithmetic functions while working through a loop from one to one billion. That is, calculate the following (while discarding any remainders)....
It also relies on the strength of the compiler, not just the strength of the language.
The Custom Mary
The Java performance is best explained by an article by Prof Kahan: "How JAVA's Floating-Point Hurts Everyone Everywhere" http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf also see "Marketing vs. Mathematics" http://www.cs.berkeley.edu/~wkahan/MktgMath.pdf I suspect the relatively poor floating-point performance of gcc is also caused by the desire to acheive accurate results.
Don't forget about the Win32 Compiler Shootout
Note that Python is pretty easy to extend in C/C++, so that speed critical parts can be rewritten in C if the performance becomes an issue. Writing the whole program in C or C++ is a premature optimization.
Save your wrists today - switch to Dvorak
Because the guy who wrote the code decided to use the VB6 compatability features instead of the .NET runtime for VB. Why one would do this, I have no idea.
In the case of Java, you find that the Intel floating point trig instructions don't meet the Java machine spec. So they had to implement them as a function.
It all depends if you want accuracy or speed.
- "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'
Someone should do a study on the time taken to design, implement and debug a resonably complex chunk of code under C++ and Java. I'm pretty sure that the result would show the huge advanatage of Java over C++.
The difference b/w Java and C++ would be dwarfed by the difference b/w Java and Python. Java may be 30-40% more productive than C++, but Python is 1000% more productive than Java. And yes, this applies to larger projects. J2EE may come to its own w/ projects that have hundreds of mediocre programmers, but if you have a mid-size team of highly skilled developers creating something new & unique (something like Zope or Chandler), Python will trounce the competition.
Save your wrists today - switch to Dvorak
There were a number of problems with this benchmark, which are addressed in the OSNews thread about the article.
Namely:
- They only test a highly specific case of small numeric loops that is pretty much the best-case scenario for a JIT compiler.
- They don't test anything higher level, like method calls, object allocation, etc.
Concluding "oh, Java is as fast as C++" from these benchmarks would be unwise. You could conclude that Java is as fast as C++ for short numeric loops, of course, but that would be a different bag of cats entirely.
A deep unwavering belief is a sure sign you're missing something...
If more people would use the SWT libraries (part of the Eclipse project) instead of the crappy AWT/Swing libraries, then this misconception would go away. SWT works by mapping everything to native OS widgets if possible, giving it the look, feel, and speed of a native app. I used Eclipse for quite a while before finding out that it is almost 100% pure Java (other than the JNI code necessary for the native calls).
Site was showing signs of Slashdotting, so I'll quote one of the more important sections...
Results
Here are the benchmark results presented in both table and graph form. The Python and Python/Psyco results are excluded from the graph since the large numbers throw off the graph's scale and render the other results illegible. All scores are given in seconds; lower is better.
int long double trig I/O TOTAL
Visual C++ 9.6 18.8 6.4 3.5 10.5 48.8
Visual C# 9.7 23.9 17.7 4.1 9.9 65.3
gcc C 9.8 28.8 9.5 14.9 10.0 73.0
Visual Basic 9.8 23.7 17.7 4.1 30.7 85.9
Visual J# 9.6 23.9 17.5 4.2 35.1 90.4
Java 1.3.1 14.5 29.6 19.0 22.1 12.3 97.6
Java 1.4.2 9.3 20.2 6.5 57.1 10.1 103.1
Python/Psyco 29.7 615.4 100.4 13.1 10.5 769.1
Python 322.4 891.9 405.7 47.1 11.9 1679.0
Beware: In C++, your friends can see your privates!
It's in many ways unfortunate that with JDK 1.2 (Swing) and onwards, Sun pretty much dumped fast native support for GUI rendering. It has its benefits -- full control, easier portability -- but the fact is that simple GUI apps felt faster with 1.1 than they have done ever since (or even more). This is, alas, especially noticeable on X-windows, perhaps since often the whole window is rendered as one big component as opposed to normal x app components (in latter case, x-windows can optimize clipping better).
Years ago (in late 90s, 97 or 98), I wrote a full VT-52/100/102/220 terminal emulator with telnet handling (plus for fun plugged in a 3rd party then-open SSH implementation). After optimizing display buffer handling, it was pretty much on par with regular xterm, on P100 (Red hat whatever, 5.2?), as in felt about as fast, and had as extensive vt-emulation (checked with vttest). Back then I wrote the thing mostly to show it can be done, as all telnet clients written in Java back then were horribly naive, doing full screen redraw and other flicker-inducing stupidities... and contributed to the perception that Java is and will be slow. I thought it had more to do with programmers not optimizing things that need to be optimized.
It's been a while since then; last I tried it on JDK 1.4.2... and it still doesn't FEEL as fast, even though technically speaking all java code parts ARE much faster (1.1 didn't have any JIT compiler; HotSpot, as tests show, is rather impressive in optimizing). It's getting closer, but then again, mu machine has almost an order of magnitude more computing power now, as probably does gfx card.
To top off problems, in general Linux implementation has been left with much less attention than windows version (or Solaris, but Solaris is at least done by same company). :-/
I like paying taxes. With them I buy civilization -- Oliver Wendell Holmes
According to these benchmarks it doesn't.
The short of it is that GCC 3.2.1 is highly competitive with ICC 7.0, except for two cases:
FP-intensive code on the Pentium 4
Code that allows Intel C++ to auto-generate SSE vector code for it
A deep unwavering belief is a sure sign you're missing something...
Then, with more and more languages, especially ones with VMs, you get further and further away from the hardware. The end result: you lose performance. It does more and more for you, but at the expense of real optimizations, the kind that only you can do.
Now the zealots will come out and say, "Language X is better than language Y, see!" To me this argument is boring. I tend to use the appropriate tool for the job. So:
Yes, my teams use many languages, but they also put their effort to where they get the biggest bang for the buck. And in any business approach, that's the key goal. You don't see carpenters use saws to hammer in nails or drive screws. Wise up!
...tizzyd
The Pentium trig instructions are not IEEE compliant (they don't return the correct values for large magnitude arguments). gcc errs on the side of caution and generates slow, software-based wrappers that correct for the limitations of the Pentium instructions by default. Other compilers (e.g., Intel and probably Microsoft) just generate the in-line instructions with no correction. When you look at the claimed superiority of other compilers over gcc, it is usually such tradeoffs that make gcc appear slower.
You can enable inline trig functions in gcc as well, either with a command line flag, or an include file, or by using "asm" statements on a case-by-case basis. Check the documentation. With those enabled, gcc keeps up well with other compilers on trig functions.
Changing this to 'linesToWrite = [myString] * ioMax' dropped time on my system from 2830ms to 1780ms (I'd like to note that I/O on my system was already much faster than his *best* I/O score, thank you very much Linux)
In the trig test, I used numarray to decrease the runtime from 47660.0ms to *6430.0ms*. The original timing matches his pretty closely, which means that numarray would probably beat his gcc timings handily, too. Any time you're working with a billion numbers in Python, it's a safe bet that you should probably use numarray!
I didn't immediately see how to translate his other mathematical tests into numarray, but I noted that his textual explanation in the article doesn't match the (python) source code!
(My system is a 2.4GHz Pentium IV running RedHat 9)
Hate stupid software on freshmeat? Laugh at