High-level Languages and Speed
nitsudima writes to tell us Informit's David Chisnall takes a look at the 'myth' of high-level languages versus speed and why it might not be entirely accurate. From the article: "When C was created, it was very fast because it was almost trivial to turn C code into equivalent machine code. But this was only a short-term benefit; in the 30 years since C was created, processors have changed a lot. The task of mapping C code to a modern microprocessor has gradually become increasingly difficult. Since a lot of legacy C code is still around, however, a huge amount of research effort (and money) has been applied to the problem, so we still can get good performance from the language."
So we "still can get good performance" from C? The implication is that C will soon be overtaken by some unnamed high-level language. That is just wishful thinking. The article is not very substantial, and where it tries to substantiate, it misses the mark badly. The claim that C cannot handle SIMD instructions well is not true. You can use them directly from C, or the C compiler can use them through autovectorization, as in gcc 4.1. The claim that C cannot inline functions from another source file is also wrong. This is a limitation in gcc, but other compilers can do it, and IIRC the Intel compiler can. It is certainly not "impossible".
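To make the "use them directly from C" point concrete, here is a minimal sketch using the GCC/Clang vector extension rather than any particular instruction set's intrinsics. It is a compiler extension, not standard C, and the names v4f and add_f32 are made up for this illustration; it also assumes the element count is a multiple of four.

```c
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension: a pack of four floats (16 bytes).
   This is a compiler extension, not standard C. */
typedef float v4f __attribute__((vector_size(16)));

/* Adds two float arrays four lanes at a time.
   Assumption for this sketch: n is a multiple of 4. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        v4f va, vb, vr;
        memcpy(&va, a + i, sizeof va);   /* load without alignment assumptions */
        memcpy(&vb, b + i, sizeof vb);
        vr = va + vb;                    /* element-wise add on all four lanes */
        memcpy(dst + i, &vr, sizeof vr); /* store the four results */
    }
}
```

On a target with SIMD hardware the compiler can map the `va + vb` line to a single vector add; on one without, it falls back to scalar code, so the source stays portable across the compilers that support the extension.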
I remember back in the days of the Atari ST and Amiga, C was considered to be a high-level language. People would complain about the poor performance of games written in C (to ease the porting from Amiga to ST and vice versa) over 'proper' Assembly coded games.
Now I hear most people referring to C and C++ as "low level" languages, compared to Java and PHP and Visual Basic and so on. Funny how that works out.
I like Assembler. There's something about interacting intimately with your target hardware. It's a shame that it's no longer feasible with today's variety of hardware.
Argh.
This is not true. What they mean, I think, is "the task of mapping C code to efficient machine code has gradually become increasingly difficult".
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
C is not a low level language. If you're not directly manipulating the registers on the processor, you are not in a low level language (and forget about the "register" keyword, modern compilers just treat register variables in C/C++ as memory that needs to be optimized for speed).
If anything, C is a so-called mid level language. If it wasn't, you'd be using an assembler instead of a compiler.
I got nothin'
The more abstract a language is, the better a compiler can understand what you are doing. If you write out twenty instructions to do something in a low-level language, it's a lot of work to figure out that what matters isn't that the instructions get executed, but the end result. If you write out one instruction in a high-level language that does the same thing, the compiler can decide how best to get that result without trying to figure out if it's okay to throw away the code you've written. Optimisation is easier and safer.
Furthermore, the bottleneck is often in the programmer's brain rather than the code. If programmers could write code ten times faster, that executes a tenth as quickly, that would actually be a beneficial trade-off for many (most?) organisations. High-level languages help with programmer productivity. I know that it's considered a mark of programmer ability to write the most efficient code possible, but it's a mark of software engineer ability to get the programming done faster while still meeting performance constraints.
Bogtha Bogtha Bogtha
The first mistake: Confusing "compile" performance with execution performance. The job of mapping C/C++ code to machine code is trivial.
I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages.
What was true 20 years ago is still true today: well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment.
The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant.
If computer science isn't about computers, what is it about? I hate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?"
Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse.
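For anyone wondering what the shifts-and-adds multiply mentioned above looks like, a minimal sketch in C follows. It assumes 32-bit unsigned operands, and the name mul_shift_add is made up here; it is the plain schoolbook binary method, not a claim about how any particular CPU or compiler does it.

```c
#include <stdint.h>

/* Shift-and-add multiplication: for every set bit k of b, add (a << k).
   Uses only shifts, adds, and bit tests; no multiply operator. */
uint32_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1u)
            result += a;  /* lowest bit of b is set: add the current shifted a */
        a <<= 1;          /* a now stands for the next higher bit position */
        b >>= 1;          /* move on to the next bit of b */
    }
    return result;        /* wraps modulo 2^32, like ordinary unsigned multiply */
}
```

The loop runs once per bit of b up to its highest set bit, so at most 32 iterations for these operands.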
OK, this is nitpicking, but there are some exceptions. I remember that TASM would automatically convert a long conditional jump into the opposite conditional jump plus an unconditional long jump, since there was no long conditional jump instruction.
This paragraph is complete crap. If you're using a Dictionary API in a so-called "low-level language", it's as possible for the API to do the same optimization as it is for the runtime he talks about; and you're still letting "someone else do the optimization".
That's surely true. But the opposite is also true - when you use an immense amount of too complex semantics, they can be translated into a pile of inefficient code. Sure, this can improve in the future, but right now it's a problem of very high level constructs.
Not exactly true I think. Yes, the approach on that page is not standard C, but on section 4 he also talks about some high level performance improvements which are still being experimented on, so...
"I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages."
The "appeal to an expert" fallacy?
"What was true 20 years ago is still true today: well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment."
It also means that portability becomes ever harder, as well as adaptability to new hardware.
"If computer science isn't about computers, what is it about? I hate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?""
It's about algorithms. Computers just happen to be the most convenient means for trying them.
"The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant."
With the trend towards VMs and virtualization, that "hypothetical" computer comes ever closer.
"Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse."
Now who's handwaving?
I didn't see anything mentioning that many high-level languages are written in C. And I don't consider languages like FORTRAN to be high-level. FORTRAN is a language that was designed specifically for numeric computation and scientific computing. For that purpose, it is easy for the compiler to optimize the machine code better than a C compiler could ever manage. The FORTRAN compiler was probably written in C, but FORTRAN has language constructs that are better suited to numeric computation.
Most truly high-level languages, like LISP (which was mentioned directly in TFA), are interpreted, and the interpreters are almost always written in C. It is impossible for an interpreted language written in C (or even a compiled one that is converted to C) to go faster than C. It is always possible for a C programmer to write inefficient code, but that same programmer is likely to write inefficient code in a high-level language as well.
I'm not saying high-level languages aren't great. They are great for many things, but the argument that C is harder to optimize because the processors have gotten more complex is ludicrous. It's the machine code that's harder to optimize (if you've tried to write assembly code since MMX came out, you know what I mean), and that affects ALL languages.
Every serious hacker should have a play with assembler, or even machine code. There is real magic in starting up a uP or uC on a board you built yourself, and making it flash a few LEDs under the control of your hand assembled program. I found a whole new depth of understanding when I built a 68hc11 based board (not to mention memorizing a whole bunch of op-codes). Of course, I'd never want to write a 'serious' piece of code in assembly, and it still amazes me that anyone ever did!
The article addressed this point by mentioning that the definitions of high and low level language are a moving target. Nowadays I think most people consider assembly language to be its own thing, and the low-level classification has now been shifted into a domain that was once described completely by the term high-level. The term "high-level language" has been replaced by the term "programming language."
If you're going to go with the jargon as it's most often used nowadays (which is a perfectly reasonable thing to do), then C would certainly be about as low as you can get without manipulating individual registers - i.e., without being assembly language.
Coding large apps in assembly is usually way beyond the point of diminishing returns in terms of performance.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Low level says what you want the system to do. High level says what you want the language (via compiler, interpreter, etc.) to make the system do.
How many people can read hex if only you and dead people can read hex?
Amazingly far back (try the 80s) a professor friend of mine had a marvelous example of compiler-generated code where the compiler had done such an amazing job of optimising register use that you had to trace through more than 20 pages of assembler output with colored markers to trace from where the register was loaded to where it was used.
No way I would ever have the huevos to code that way in assembler. On a RISC machine or (Heaven help us) the Itanic it gets lots worse.
Lacking <sarcasm> tags,
The article later points out that the native version was running slower because the optimization options were not used correctly. Once that was fixed, the native version ran 15% faster than the managed version.
Copyright infringement is "piracy" in the same way DRM is "consumer rape"
I've probably written more assembly than most slashdot readers, and most of what you say is true:
It used to be the case that I could always increase the speed of some random C/Fortran/Pascal code by rewriting it in asm. Part of that speedup came from realizing better ways to map the current problem to the actual cpu hardware available.
However, I also discovered that much of the time it was possible to take the experience gained from the asm code, and use that to rewrite the original C code in such a way as to help the compiler generate near-optimal code. I.e. if I can get within 10-25% of 'speed_of_light' using portable C, I'll do so nearly every time.
There are some important situations where asm still wins, and that is when you have cpu hardware/opcodes available that the compiler cannot easily take advantage of. For example, back in the days of the PentiumMMX 300 MHz cpu it became possible to do full MPEG2/DVD decoding in sw, but only by writing an awful lot of hand-optimized MMX code. Zoran SoftDVD was the first on the market, I was asked to help with some optimizations, but Mike Schmid (spelling?) had really done 99+% of the job.
Another important application for fast code is in crypto: If you want to transparently encrypt anything stored on your hard drive and/or going over a network wire, then you want the encryption/decryption process to be fast enough that you really don't notice any slowdown. This was one of the reasons for specifying a 200 MHz PentiumPro as the target machine for the Advanced Encryption Standard: If you could handle 100 Mbit Ethernet full duplex (i.e. 10 MB/s in both directions) on a 1996 model cpu, then you could easily do the same on any modern system.
When we (I and 3 other guys) rewrote one of the AES contenders (DFC, not the winner!) in pure asm, we managed to speed it up by a factor of 3, which moved it from being one of the 3-4 slowest to one of the fastest algorithms among the 15 alternatives.
Today, with fp SIMD instructions and a reasonably orthogonal/complete instruction set (i.e. SSE3 on x86), it is relatively easy to write code in such a way that an autovectorizer can do a good job, but for more complicated code things quickly become much harder.
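As an illustration of the "relatively easy" case, here is a minimal sketch of a loop written so that an autovectorizer has little to worry about. The function name saxpy is just the conventional name for this operation, not something from the post, and whether a given compiler actually vectorizes it depends on the compiler and flags (with GCC, for example, -O3 or -ftree-vectorize).

```c
#include <stddef.h>

/* y[i] += a * x[i] for i in [0, n).
   restrict promises the compiler that x and y do not overlap, which removes
   the aliasing doubt that often stops a loop from being vectorized.
   The caller must honor that promise. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];  /* simple counted loop, unit stride, no branches */
}
```

The things that help here are the unit-stride access, the trip count known before the loop starts, and the absence of calls or data-dependent branches in the body. Once a loop loses those properties, which is the "more complicated code" case, the compiler has a much harder job.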
Terje
"almost all programming can be viewed as an exercise in caching"