High-level Languages and Speed
nitsudima writes to tell us Informit's David Chisnall takes a look at the 'myth' of high-level languages versus speed and why it might not be entirely accurate. From the article: "When C was created, it was very fast because it was almost trivial to turn C code into equivalent machine code. But this was only a short-term benefit; in the 30 years since C was created, processors have changed a lot. The task of mapping C code to a modern microprocessor has gradually become increasingly difficult. Since a lot of legacy C code is still around, however, a huge amount of research effort (and money) has been applied to the problem, so we still can get good performance from the language."
Well, we ran our own tests. We took a sizable chunk of supposedly well-written time-critical code that the gang had produced in what was later to become Microsoft C [2] and rewrote the same modules in Logitech Modula-2. The upshot was that the M2 code was measurably faster, smaller, and on examination better optimized. Apparently the C compiler was handicapped by essentially having to figure out what the programmer meant with a long string of low-level expressions.
Extrapolations to today are left to the reader.
[1] I used to comment that C is not a high-level language, which would induce elevated blood pressure in C programmers. After working them up, I'd bet beer money on it -- and then trot out K&R, which contains the exact quote, "C is not a high-level language."
[2] MS originally relabeled another company's C compiler under license (I forget their name; they were an early object lesson).
Lacking <sarcasm> tags,
So we "still can get good performance" from C? The implication is that C will somehow be overtaken by some unnamed high-level language soon. That is just wishful thinking. The article is not very substantial, and where it tries to substantiate, it misses the mark badly. The claim that C cannot handle SIMD instructions well is not true. You can use them directly from C, or the C compiler can use them through autovectorization, as in gcc 4.1. The claim that C cannot inline functions from another source file is also wrong. This is a limitation in gcc, but other compilers can do it, and IIRC the Intel compiler can. It is certainly not "impossible".
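To make the autovectorization point concrete, here is a minimal sketch of the kind of loop a vectorizing compiler can usually turn into SIMD code. The function name `saxpy` is only an illustration, not something from the article. Whether SIMD instructions are actually emitted depends on the compiler, the target, and the flags (with gcc, typically `-O3`, or `-O2 -ftree-vectorize`).

```c
#include <stddef.h>

/* Hypothetical example: y[i] += a * x[i] for n elements.
   'restrict' (C99) promises the compiler that x and y do not overlap,
   which removes one common obstacle to vectorizing the loop. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

The source stays plain, portable C; the choice of SIMD width and instructions is left to the compiler.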
I remember back in the days of the Atari ST and Amiga, C was considered to be a high-level language. People would complain about the poor performance of games written in C (to ease the porting from Amiga to ST and vice versa) over 'proper' Assembly coded games.
Now I hear most people referring to C and C++ as "low level" languages, compared to Java and PHP and Visual Basic and so on. Funny how that works out.
I like Assembler. There's something about interacting intimately with your target hardware. It's a shame that it's no longer feasible with today's variety of hardware.
Argh.
This is not true. What they mean, I think, is "the task of mapping C code to efficient machine code has gradually become increasingly difficult".
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
The speed of code written in a computer language is based on the number of CPU cycles required to carry it out. That means that the speed of any higher-level language is related to the efficiency of code executed by the interpreter or produced by the compiler. Most compilers and interpreters these days are pretty darn good at optimizing, making the drawback of using a higher-level language less and less important.
If you don't believe me, I suggest you look at some of the assembly code output of gcc. I'm no assembly guru, but I don't think I would have done as well writing assembly by hand.
I am officially gone from
Isn't the JIT for Java written in C, though?
ahah now we know why my java program is so slow. damn C slowing it down.
Escher was the first MC and Giger invented the HR department.
Sure, CPUs look quite a bit different now than they did 20+ years ago. On the other hand, CPU designs do heavily take into account what features are being used by the application code expected to be run on them, and one constant you can still depend on is that most of that application code is going to be machine-generated by a C compiler.
For instance, 20 years ago there was nothing strange about having an actual quicksort machine instruction (VAXen had it). One expectation was still, at the time, that a lot of code would be generated directly by humans, so instructions and instruction designs catering to that use-case were developed. But by around then, most code was machine generated by a compiler, and since the compiler had little high-level semantics to work with, the high-level instructions - and most low-level ones too - went unused; this was one impetus for the development of RISC machines, by the way.
So, as long as a lot of coding is done in C and C++ (and especially in the embedded space, where you have the most rapid CPU development, almost all coding is), designs will never stray far away from the requirements of that language. Better compilers have allowed designers to stray further, but stray too far and you get penalized in the market.
Trust the Computer. The Computer is your friend.
The more abstract a language is, the better a compiler can understand what you are doing. If you write out twenty instructions to do something in a low-level language, it's a lot of work to figure out that what matters isn't that the instructions get executed, but the end result. If you write out one instruction in a high-level language that does the same thing, the compiler can decide how best to get that result without trying to figure out if it's okay to throw away the code you've written. Optimisation is easier and safer.
Furthermore, the bottleneck is often in the programmer's brain rather than the code. If programmers could write code ten times faster, that executes a tenth as quickly, that would actually be a beneficial trade-off for many (most?) organisations. High-level languages help with programmer productivity. I know that it's considered a mark of programmer ability to write the most efficient code possible, but it's a mark of software engineer ability to get the programming done faster while still meeting performance constraints.
Bogtha Bogtha Bogtha
The first mistake: Confusing "compile" performance with execution performance. The job of mapping C/C++ code to machine code is trivial.
I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages.
What was true 20 years ago is still true today: well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment.
The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant.
If computer science isn't about computers, what is it about? I hate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?"
Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse.
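For anyone wondering what the "multiply routine with only shifts and adds" question is after, here is one possible sketch of an answer, using the usual binary long-multiplication idea. The name `mul_shift_add` is made up for illustration, and the result wraps modulo 2^32 as unsigned arithmetic does in C.

```c
#include <stdint.h>

/* Multiply two unsigned 32-bit values using only shifts, adds and tests.
   For every set bit k in b, add (a << k) to the running product. */
uint32_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint32_t product = 0;

    while (b != 0) {
        if (b & 1u)          /* lowest bit of b set: this power of two counts */
            product += a;
        a <<= 1;             /* a now holds the next power-of-two multiple */
        b >>= 1;             /* move on to the next bit of b */
    }
    return product;
}
```

The loop runs at most 32 times, once per bit of `b`, which is roughly how a CPU without a hardware multiplier would do the job in software.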
OK, this is nitpicking but there are some exceptions - I remember that TASM would convert automatically long conditional jumps to the opposite conditional jump + an unconditional long jump since there was no long conditional jump instruction.
This paragraph is complete crap. If you're using a Dictionary API in a so-called "low-level language", it's as possible for the API to do the same optimization as it is for the runtime he talks about; and you're still letting "someone else do the optimization".
That's surely true. But the opposite is also true - when you use an immense amount of too complex semantics, they can be translated into a pile of inefficient code. Sure, this can improve in the future, but right now it's a problem of very high level constructs.
Not exactly true I think. Yes, the approach on that page is not standard C, but on section 4 he also talks about some high level performance improvements which are still being experimented on, so...
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
"I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages."
The "appeal to an expert" fallacy?
"What was true 20 years ago is still true today: well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment."
It also means that portability becomes ever harder, as well as adaptability to new hardware.
"If computer science isn't about computers, what is it about? I hate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?""
It's about algorithms. Computers just happen to be the most convenient means for trying them.
"The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant."
With the trend towards VMs and virtualization, that "hypothetical" computer comes ever closer.
"Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse."
Now who's handwaving?
I didn't see anything mentioning that many high-level languages are written in C. And I don't consider languages like FORTRAN to be high-level. FORTRAN is a language that was designed specifically for numeric computation and scientific computing. For that purpose, it is easy for the compiler to optimize the machine code better than a C compiler could ever manage. The FORTRAN compiler was probably written in C, but FORTRAN has language constructs that are better suited to numeric computation.
Most truly high-level languages, like LISP (which was mentioned directly in TFA), are interpreted, and the interpreters are almost always written in C. It is impossible for an interpreted language written in C (or even a compiled one that is converted to C) to go faster than C. It is always possible for a C programmer to write inefficient code, but that same programmer is likely to write inefficient code in a high-level language as well.
I'm not saying high-level languages aren't great. They are great for many things, but the argument that C is harder to optimize because the processors have gotten more complex is ludicrous. It's the machine code that's harder to optimize (if you've tried to write assembly code since MMX came out, you know what I mean), and that affects ALL languages.
Every serious hacker should have a play with assembler, or even machine code. There is real magic in starting up a uP or uC on a board you built yourself, and making it flash a few LEDs under the control of your hand assembled program. I found a whole new depth of understanding when I built a 68hc11 based board (not to mention memorizing a whole bunch of op-codes). Of course, I'd never want to write a 'serious' piece of code in assembly, and it still amazes me that anyone ever did!
Whoa! This article seems to be making up history out of whole cloth. I'm not even sure where to begin. It's just totally out to lunch.
C was not a reaction to LISP. I can't even imagine why anyone would say this. LISP's if/then/else was an influence on ALGOL and later languages.
C might have been a reaction to Pascal, which in turn was a reaction to ALGOL.
LISP was not "the archetypal high-level language." The very names CAR and CDR mean "contents of the address part of register" and "contents of the decrement part of register," direct references to hardware registers on the IBM 704. When the names of fundamental language constructs are those of specific registers in a specific processor, that is not a "high-level language" at all. Later efforts to build machines with machine architectures optimized for implementation of LISP further show that LISP was not considered "a high-level language."
C was not specifically patterned on the PDP-11. Rather, both of them were based on common practice and understanding of what was in the air at the time. C was a direct successor to, and reasonably similar to, BCPL, which ran on the Honeywell 635 and 645, the IBM 360, the TX-2, the CDC 6400, the Univac 1108, the PDP-9, the KDF 9 and the Atlas 2.
C makes an interesting comparison with Pascal; you can see that C is, in many ways, a computer language rather than a mathematical language. For example, the inclusion of specific constructs for increment and decrement (as opposed to just writing A := A + 1) puts it closer, not to PDP-11 architecture, but to contemporary machine architecture in general.
"How to Do Nothing," kids activities, back in print!
I thought it might be helpful for a current student to let you know what it is we learn today at my college. I'm a senior Software Engineering major, not a comp sci major. Comp Sci is another department and has a totally different focus. They focus on super efficient algorithms, we focus on developing large software projects.
My software engineering program has been very Java intensive. My software engineering class, object oriented class, and software testing class were all java based. We dabbled in C# a bit as well.
However, I also had an assembly class, a programming languages class where we learned Perl and Scheme (this language sucks) and about five algorithms classes in C++. I also had an embedded systems class in both C and assembly (learned assembly MCU code, then did C).
I feel like this is all pretty well rounded; I've learned a bunch of languages and am not really specialized in one. I'd say I am best at Java right now, but I can also write C++ code just fine.
I've never been told a computer has any kind of crazy limitless performance. In embedded systems, I learned about performance. Making a little PIC microcontroller calculate arctan was fun (took literally 30 seconds without a smart solution). I also learned that there is a trade off between several things such as performance, development time, readability, and portability.
We are taught to see languages as tools: you look at your problem and pull the tool out of the toolbox that you think fits the problem best. You have to weigh what's important for the project and choose based on that.
The final thing I'd like to point out is that one huge issue with software today is it is bug ridden. How easy something is to test makes a big difference in my opinion. Assembly and C will pretty much always be harder to test than languages like Java and C#.
I don't think the universities are the problem, at least not in my experience.
From:
Subject: The truth about 'C++' revealed
Date: Tuesday, December 31, 2002 5:20 AM
On the 1st of January, 1998, Bjarne Stroustrup gave an interview to the IEEE's 'Computer' magazine.
Naturally, the editors thought he would be giving a retrospective view of seven years of object-oriented design, using the language he created.
By the end of the interview, the interviewer got more than he had bargained for and, subsequently, the editor decided to suppress its contents, 'for the good of the industry' but, as with many of these things, there was a leak.
Here is a complete transcript of what was said, unedited, and unrehearsed, so it isn't as neat as planned interviews.
You will find it interesting...
__________________________________________________ ________________
Interviewer: Well, it's been a few years since you changed the world of software design, how does it feel, looking back?
Stroustrup: Actually, I was thinking about those days, just before you arrived. Do you remember? Everyone was writing 'C' and, the trouble was, they were pretty damn good at it. Universities got pretty good at teaching it, too. They were turning out competent - I stress the word 'competent' - graduates at a phenomenal rate. That's what caused the problem.
Interviewer: problem?
Stroustrup: Yes, problem. Remember when everyone wrote Cobol?
Interviewer: Of course, I did too
Stroustrup: Well, in the beginning, these guys were like demi-gods. Their salaries were high, and they were treated like royalty.
Interviewer: Those were the days, eh?
Stroustrup: Right. So what happened? IBM got sick of it, and invested millions in training programmers, till they were a dime a dozen.
Interviewer: That's why I got out. Salaries dropped within a year, to the point where being a journalist actually paid better.
Stroustrup: Exactly. Well, the same happened with 'C' programmers.
Interviewer: I see, but what's the point?
Stroustrup: Well, one day, when I was sitting in my office, I thought of this little scheme, which would redress the balance a little. I thought 'I wonder what would happen, if there were a language so complicated, so difficult to learn, that nobody would ever be able to swamp the market with programmers?' Actually, I got some of the ideas from X10, you know, X windows. That was such a bitch of a graphics system, that it only just ran on those Sun 3/60 things. They had all the ingredients for what I wanted. A really ridiculously complex syntax, obscure functions, and pseudo-OO structure. Even now, nobody writes raw X-windows code. Motif is the only way to go if you want to retain your sanity.
[NJW Comment: That explains everything. Most of my thesis work was in raw X-windows. :)]
Interviewer: You're kidding...?
Stroustrup: Not a bit of it. In fact, there was another problem. Unix was written in 'C', which meant that any 'C' programmer could very easily become a systems programmer. Remember what a mainframe systems programmer used to earn?
Interviewer: You bet I do, that's what I used to do.
Stroustrup: OK, so this new language had to divorce itself from Unix, by hiding all the system calls that bound the two together so nicely. This would enable guys who only knew about DOS to earn a decent living too.
Interviewer: I don't believe you said that...
Stroustrup: Well, it's been long enough, now, and I believe most people have figured out for themselves that C++ is a waste of time but, I must say, it's taken them a lot longer than I thought it would.
Interviewer: So how exactly did you do it?
Stroustrup: It was only supposed to be a joke, I never thought people would take the book seriously.
C became popular because of Unix. Since you could get the source code for Unix, most big universities used Unix in their OS courses. And since it was written in C, you were going to learn C if you took Computer Science. Textbooks started to assume you knew C. Magazines started to assume you knew C. People wrote free small C compilers and then came GCC, so now you could have a good free C compiler for just about any system. But before GCC all the buzz was about Smalltalk. Smalltalk was the future. OOP was going to replace structured programming. The problem was very few people had a computer that could run Smalltalk. So C++ was born.
A final blow to Modula-2 was simply that Borland didn't create a Modula-2 compiler. For many years when you said Pascal you really meant Turbo or Borland Pascal. Borland was the Pascal company, and they added objects to Pascal and eventually created Delphi.
I am sure Topspeed has closed up shop. There just isn't much room for compiler makers anymore. You have the free software at the bottom end and the Microsoft Monster at the top. Only a few niche players are left. Ada seems to be a place where a good compiler company can still make a few dollars.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Coding large apps in assembly is usually way beyond the point of diminishing returns in terms of performance.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
I've probably written more assembly than most slashdot readers, and most of what you say is true:
It used to be the case that I could always increase the speed of some random C/Fortran/Pascal code by rewriting it in asm; part of that speedup came from realizing better ways to map the current problem to the actual cpu hardware available.
However, I also discovered that much of the time it was possible to take the experience gained from the asm code, and use that to rewrite the original C code in such a way as to help the compiler generate near-optimal code. I.e. if I can get within 10-25% of 'speed_of_light' using portable C, I'll do so nearly every time.
There are some important situations where asm still wins, and that is when you have cpu hardware/opcodes available that the compiler cannot easily take advantage of. I.e. back in the days of the PentiumMMX 300 MHz cpu it became possible to do full MPEG2/DVD decoding in sw, but only by writing an awful lot of hand-optimized MMX code. Zoran SoftDVD was the first on the market, I was asked to help with some optimizations, but Mike Schmid (spelling?) had really done 99+% of the job.
Another important application for fast code is in crypto: If you want to transparently encrypt anything stored on your hard drive and/or going over a network wire, then you want the encryption/decryption process to be fast enough that you really don't notice any slowdown. This was one of the reasons for specifying a 200 MHz PentiumPro as the target machine for the Advanced Encryption Standard: If you could handle 100 Mbit Ethernet full duplex (i.e. 10 MB/s in both directions) on a 1996 model cpu, then you could easily do the same on any modern system.
When we (I and 3 other guys) rewrote one of the AES contenders (DFC, not the winner!) in pure asm, we managed to speed it up by a factor of 3, which moved it from being one of the 3-4 slowest to one of the fastest algorithms among the 15 alternatives.
Today, with fp SIMD instructions and a reasonably orthogonal/complete instruction set (i.e. SSE3 on x86), it is relatively easy to write code in such a way that an autovectorizer can do a good job, but for more complicated code things quickly become much harder.
Terje
"almost all programming can be viewed as an exercise in caching"
High level languages have always been compared to cognitive semantics and grammatical styles. That is, the higher the level of the language, the easier it is for us humans to read and write it. Conversely, the lower the level of the language, the more discrete steps are needed to describe an instruction or data.
The speed of programming languages or machine languages is not measured only by how high or low level they are to us. It is also measured by the time needed to develop and implement the program. The article basically makes the point that it's "better to let someone else" optimize the low-level code while you write in the high-level language. You could write a super fast machine coded program, but it'll take you much longer to write than it would in a simpler, higher level language.
The new debate is over datatypes and the available methods to manipulate them. Older hardware gave us the old debate with primitive datatypes and a general set of instructions to manipulate the data. Newer hardware can give us more than just primitives. For example, a Unicode string datatype seen by the hardware as a complete object instead of an array of bytes. With hardware instructions to manipulate Unicode strings, that would practically take away any low-level implementation of Unicode strings. The same could be done for UTF-8 strings. We could implement hardware support for XML documents and other common protocols. How these datatypes are actually implemented in hardware is the center of the debate.
Eventually, there will be so many datatypes that there will be separate low-level languages specifically designed for a domain of datatypes. The article makes the point that there is an increase in complexity for newer compilers to understand what was intended by a set of low-level instructions. Today's CPUs have a static limit of low-level instructions. The future holds hardware implemented datatypes and their dynamic availability of low-level instructions. Newer processors will need to be able to handle the dynamic set of machine language instructions.
Does the new debate conflict with Turing's goal to simply make a processor unit extensible without the need to add extra hardware? For now, we have virtualization.
I sped up some C code by unrolling a loop with Duff's Device. Duff's Device, for those who haven't encountered it, makes an ingenious use of the often-maligned C behavior that case statements, in the absence of a break or return statement, fall-through.
Duff's Device takes advantage of the fall-through by jumping into the middle of an unrolled loop of repeated instructions. If eight instructions are unrolled, Duff's Device iterates the loop n/8 times (rounded up), where n is the number of items, but enters the loop by jumping to the (n mod 8)'th unrolled instruction from the end of the loop. (This sounds complicated, but isn't; just look at the code and it becomes clear.)
...
...
The whole point of Duff's Device is speed and locality of code. Speed: because the loop is unrolled, more instructions are executed for each jump back to the top (and jumps are, relatively, expensive, because they mean any preloaded instructions must be tossed out and re-read). Locality: (hopefully) all the instructions can be cached, so the processor doesn't have to re-read them from memory.
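For readers who have not seen it, here is a minimal sketch of the classic form of Duff's Device, unrolled eight times. It is not the poster's code: the name `send_halfwords` and the signature are assumptions chosen to match the halfword-to-output-device case described in this comment, so the destination pointer is deliberately not incremented.

```c
#include <stddef.h>
#include <stdint.h>

/* Write 'count' halfwords from 'from' to the single address 'to'
   (for example a memory-mapped output register).
   Assumption: count is greater than zero, as in the classic version. */
void send_halfwords(volatile uint16_t *to, const uint16_t *from, size_t count)
{
    size_t passes = (count + 7) / 8;   /* loop iterations, rounded up */

    switch (count % 8) {               /* jump into the unrolled body */
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--passes > 0);    /* later passes run all eight writes */
    }
}
```

With `count` equal to 11, the switch enters at `case 3`, performs three writes, and the one remaining pass performs the other eight.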
But what gcc does with Duff's Device on ARM targets is just bizarre. gcc uses a jump table (good) to directly change the Program Counter (good, so far). But instead of jumping into the loop (which would be good), gcc uses the jump table to jump to
a redundant assignment and
an unconditional jump.
Yes, gcc very smartly makes a jump table (which directly changes the Program Counter, just like a jump would) to jump to a jump. This is simply a waste of code and time:
Why a jump table just to set up an unconditional jump? Why the redundant mov, which could have been done once, prior to the jump table jump? Who knows, that's what gcc does.
In this particular case, the object is to copy halfwords to a memory address, which address is really mapped to an output device. ARM processors, of course, are optimized for word addresses, so the "best" way to do this would be to load multiple words (LDM), shift the upper
Opinions on the Twiddler2 hand-held keyboard?
The article is a bit simplistic.
With medium-level languages like C, some of the language constructs are lower-level than the machine hardware. Thus, a decent compiler has to figure out what the user's code is doing and generate the appropriate instructions. The classic example is
char tab1[100], tab2[100];
int i = 100;
char* p1 = tab1; char* p2 = tab2;
while (i--) *p2++ = *p1++;
Two decades ago, C programmers who knew that idiom thought they were cool. In the PDP-11 era, with the non-optimizing compilers that came with UNIX, that was actually useful. The "*p2++ = *p1++;" explicitly told the compiler to generate auto-increment instructions, and considerably shortened the loop over a similar loop written with subscripts. By the late 1980s and 1990s, it didn't matter. Both GCC and the Microsoft compilers were smart enough to hoist subscript arithmetic out of loops, and writing that loop with subscripts generated the same code as with pointers. Today, if you write that loop, most compilers for x86 machines will generate a single MOV instruction for the copy. The compiler has to actually figure out what the programmer intended and rewrite the code. This is non-trivial. In some ways, C makes it more difficult, because it's harder for the compiler to figure out the intent of a C program than a FORTRAN or Pascal program. In C, there are more ways that code can do something weird, and the compiler must make sure that the weird cases aren't happening before optimizing.
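As a side-by-side illustration of the point above, here is a sketch of the same copy written three ways. The function names are invented for this example. An optimizing compiler will often produce the same or very similar machine code for all three, though that depends on the compiler and flags and is worth checking in the assembly output rather than assuming.

```c
#include <stddef.h>
#include <string.h>

/* The old pointer idiom: post-increment on both pointers. */
void copy_ptr(char *dst, const char *src, size_t n)
{
    while (n--)
        *dst++ = *src++;
}

/* The same loop written with subscripts. */
void copy_idx(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Stating the intent directly and letting the library and compiler decide. */
void copy_lib(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
}
```

All three assume the source and destination buffers do not overlap.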
The next big obstacle to optimization is the "dumb linker" assumption. UNIX has a tradition of dumb linkers, dating back to the PDP-11 linker, which was written in assembler with very few comments. The linker sees the entire program, but, with most object formats, can't do much to it other than throw out unreachable code. This, combined with the usual approach to separate compilation, inhibits many useful optimizations. When code calls a function in another compilation unit, the caller has to assume near-unlimited side effects from the call. This blocks many optimizations. In numerical work, it's a serious problem when the compiler can't tell, say, that "cos(x)" has no side effects. In C, the compiler can't tell; in FORTRAN, it can, which is why some heavy numerical work is still done in FORTRAN. The C compiler usually doesn't know that "cos" is a pure function; that is, x == y implies cos(x) == cos(y). This is enough of a performance issue that GCC has some cheats to get around it; look up "mathinline.h". But that doesn't help when you call some one-line function in another compilation unit from inside an inner loop.
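One way around the "unknown side effects" problem, sketched here under the assumption that the compiler is GCC or Clang, is to annotate the declaration with the `const` function attribute. This is a compiler extension, not standard C, and the functions `square` and `sum_scaled` are hypothetical names used only for illustration.

```c
/* Assumption: GCC or Clang. __attribute__((const)) tells the compiler the
   result depends only on the arguments and the call has no side effects.
   Imagine this declaration living in a header, with the body in another
   compilation unit. */
__attribute__((const)) double square(double x);

double sum_scaled(const double *v, int n, double x)
{
    double total = 0.0;

    for (int i = 0; i < n; i++)
        total += v[i] * square(x);   /* compiler may hoist this call out of the loop */
    return total;
}

/* Defined here only so the sketch links on its own. */
double square(double x)
{
    return x * x;
}
```

Without the attribute, and without seeing the body, the compiler would have to assume each call to `square` might modify memory and would call it on every iteration.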
C++ has "inline" to help with this problem. The real win with "inline" is not eliminating the call overhead; it's the ability for the optimizers to see what's going on. But really, what should be happening is that the compiler should check each compilation unit and output not machine code, but something like a parse tree. The heavy optimization should be done at link time, when more of the program is visible. There have been some experimental systems that did this, but it remains rare. "Just in time" systems like Java have been more popular. (Java's just-in-time approach is amusing. It was put in because the goal was to support applets in browsers. (Remember applets?) Now that Java is mostly a server-side language, the JIT feature isn't really all that valuable, and all of Java's "packaging" machinery takes up more time than a hard compile would.)
The next step up is to feed performance data from execution back into the compilation process. Some of Intel's embedded system compilers do this. It's most useful for machines where out of line control flow has high costs, and the CPU doesn't have good branch prediction hardware. For modern x86 machines, it's not a big win. For the Itanium, it's essential. (The Itanium needs a near-omniscient compiler to perform well, because you have to decide at compile time which instructions should be executed