Slashdot Mirror


High-level Languages and Speed

nitsudima writes to tell us Informit's David Chisnall takes a look at the 'myth' of high-level languages versus speed and why it might not be entirely accurate. From the article: "When C was created, it was very fast because it was almost trivial to turn C code into equivalent machine code. But this was only a short-term benefit; in the 30 years since C was created, processors have changed a lot. The task of mapping C code to a modern microprocessor has gradually become increasingly difficult. Since a lot of legacy C code is still around, however, a huge amount of research effort (and money) has been applied to the problem, so we still can get good performance from the language."

25 of 777 comments (clear)

  1. Bah by perrin · · Score: 4, Insightful

    So we "still can get good performance" from C? The implication is that C will somehow become overcome by some unnamed high-elvel language soon. That is just wishful thinking. The article is not very substantial, and where it tries to substantiate, it misses the mark badly. The claim that C cannot handle SIMD instructions well is not true. You can use them directly from C, or the C compiler can use them through autovectorization, as in gcc 4.1. The claim that C cannot inline functions from another source file is also wrong. This is a limitation in gcc, but other compilers can do it, and IIRC the intel compiler can. It is certainly not "impossible".

    1. Re:Bah by TheRaven64 · · Score: 5, Insightful
      The claim that C cannot handle SIMD instructions well is not true. You can use them directly from C, or the C compiler can use them through autovectorization, as in gcc 4.1

      You have two choices when using SIMD instructions in C:

      1. Use non-portable (between hardware, and often between compilers) intrinsics (or even inline assembly).
      2. Write non-vectorised code, and hope the compiler can figure out how to optimally decompose these into the intrinsics. Effectively, you think vectorised code, translate it into scalar code, and then expect the compiler to translate it back.
      Compare the efficiency of GCC at auto-vectorising FORTRAN (which has a primitive vector type) and C (which doesn't), if you don't believe me.

      The claim that C cannot inline functions from another source file is also wrong. This is a limitation in gcc, but other compilers can do it, and IIRC the intel compiler can. It is certainly not "impossible".

      When you pass a C file to a compiler, it generates an object file. It has absolutely no way of knowing where functions declared in the header are defined. You can hack around this; pass multiple source files to the compiler at once and have it treat them as a single one, for example, but this falls down completely when the function is declared in a library (e.g. libc) and you don't have access to the source.

      --
      I am TheRaven on Soylent News
    2. Re:Bah by Anonymous Coward · · Score: 5, Insightful

      C is faster in the same sense that assembly is faster: You have more control over the resulting machine code, so the code can by definition always be faster. You can optimize by hand. But that comes at a price: You have to optimize by hand. That's why C isn't always faster, especially not when it's supposed to be portable. The question isn't whether there could be a faster program in a language of choice, it's whether a language is at the right level of abstraction for a programmer to describe what the program must do and not a bit more. Overspecification prevents optimization. If you write for (int i=0; i<100; i++) where you really meant for (i in [0..99]), how is the compiler going to know if order is important? The latter is much more easily parallelized, for example. C is full of explicitness where it is often not needed. Assembly even more so. That's the problem of low level languages.

  2. High Level by HugePedlar · · Score: 4, Insightful

    I remember back in the days of the Atari ST and Amiga, C was considered to be a high-level language. People would complain about the poor performance of games written in C (to ease the porting from Amiga to ST and vice versa) over 'proper' Assembly coded games.

    Now I hear most people referring to C and C++ as "low level" languages, compared to Java and PHP and visual basic and so on. Funny how that works out.

    I like Assembler. There's something about interacting intimately with your target hardware. It's a shame that it's no longer feasible with today's variety of hardware.

    --
    Argh.
    1. Re:High Level by radarsat1 · · Score: 4, Insightful

      No. Well, generally you'll have faster code if you code it in assembly. But things change when you enter the world of embedded programming... you're right, portability isn't AS important as speed. Sometimes. In certain parts of your program. But I recommend you DON'T disregard portability, even when it comes to microprocessors. In a real-world engineering project, you never know when one day parts will change, parts become obsolete, and you don't want to be left having to translate thousands of lines of assembly code.

      Rather, usually whats done is that most of the code is written in C, and only those parts that REALLY REALLY have to be optimized, like interrupt handlers for example, can be done in assembly. People use assembly for routines that, for example, have to take exactly a certain number of instruction cycles to complete.

      But it should be avoided as much as possible. It's just not worth losing the portability.

      More and more these days, microprocessors are embedding higher level concepts, and even entire operating systems, just to make software development easier.

  3. Inaccurate summary by rbarreira · · Score: 4, Insightful
    The task of mapping C code to a modern microprocessor has gradually become increasingly difficult.

    This is not true. What they mean, I think, is "the task of mapping C code to efficient machine code has gradually become increasingly difficult".
    --

    The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
  4. Re:Old debate by StarvingSE · · Score: 4, Insightful

    C is not a low level language. If you're not directly manipulating the registers on the processor, you are not in a low level language (and forget about the "register" keyword, modern compilers just treat register variables in C/C++ as memory that needs to be optimized for speed).

    If anything, C is a so-called mid level language. If it wasn't, you'd be using an assembler instead of a compiler.

    --
    I got nothin'
  5. High-level languages have an advantage by Bogtha · · Score: 5, Insightful

    The more abstract a language is, the better a compiler can understand what you are doing. If you write out twenty instructions to do something in a low-level language, it's a lot of work to figure out that what matters isn't that the instructions get executed, but the end result. If you write out one instruction in a high-level language that does the same thing, the compiler can decide how best to get that result without trying to figure out if it's okay to throw away the code you've written. Optimisation is easier and safer.

    Furthermore, the bottleneck is often in the programmer's brain rather than the code. If programmers could write code ten times faster, that executes a tenth as quickly, that would actually be a beneficial trade-off for many (most?) organisations. High-level languages help with programmer productivity. I know that it's considered a mark of programmer ability to write the most efficient code possible, but it's a mark of software engineer ability to get the programming done faster while still meeting performance constraints.

    --
    Bogtha Bogtha Bogtha
    1. Re:High-level languages have an advantage by Eivind · · Score: 5, Insightful
      If programmers could write code ten times faster, that executes a tenth as quickly, that would actually be a beneficial trade-off for many (most?) organisations.

      Especially since you can combine. Even in high-performance applications there's typically a only a tiny fraction of the code that actually needs to be efficient, it's perfectly common to have 99% of the time spent in 5% of the code.

      Which means that in basically all cases you're going to be better off writing everything in a high-level language and then optimize only those routines that need it later.

      That way you make less mistakes, and get higher-quality better code quicker for the 95% of the code where efficiency is unimportant, and you can spend even more time on optimizing those few spots where it matters.

  6. Typical Java Handwaving by mlwmohawk · · Score: 5, Insightful

    The first mistake: Confusing "compile" performance with execution performance. The job of maping C/C++ code to machine code is trivial.

    I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages.

    What was true 20 years ago is still true today, well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment.

    The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant.

    If computer science isn't about computers, what is it about? I haate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?"

    Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse.

    1. Re:Typical Java Handwaving by cain · · Score: 5, Insightful
      If computer science isn't about computers, what is it about?

      "Computer science is no more about computers than astronomy is about telescopes" -- Edsger Dijkstra quotes (Dutch computer Scientist. Turing Award in 1972. 1930-2002)

      Sorry, you're arguing against Dijkstra: you lose. :)

    2. Re:Typical Java Handwaving by arevos · · Score: 4, Insightful
      The first mistake: Confusing "compile" performance with execution performance. The job of maping C/C++ code to machine code is trivial.

      I've designed compilers before, and I wouldn't class constructing a C/C++ compiler as "trivial" :)

      If computer science isn't about computers, what is it about? I haate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?"

      One could also make the opposite argument. Many computer courses teach languages such as C++, C# and Java, which all have connections to low level code. C# has its pointers and gotos, Java has its primatives, C++ has all of the above. There aren't many courses that focus more heavily on highly abstracted languages, such as Lisp.

      And I think this is more important, really. Sure, there are many benefits to knowing the low level details of the system you're programming on; but its not essential to know, whilst it is essential to understand how to approach a programming problem. I'm not saying that an understanding of low level computational operations isn't important, merely that it is more important to know the abstract generalities.

      Or, to put it another way, knowing how a computer works is not the same as knowing how to program effectively. At best, it's a subset of a wider field. At worst, it's something that is largely irrelevant to a growing number of programmers. I went to a University that dealt quite extensively with low level hardware and networking, and a significant proportion of the marks of my first year came from coding assembly and C for 680008 processors. Despite this, I can't think of many benefits such knowledge has when, say, designing a web application on Ruby on Rails. Perhaps you can suggest some?

      Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse.

      I disagree. I think software sucks because software engineers don't understand programming

    3. Re:Typical Java Handwaving by Oligonicella · · Score: 4, Insightful

      "The job of maping C/C++ code to machine code is trivial."

      Which machine, chum?

      "I've been programming professionally for over 20 years..."

      OK, bump chests. I've been at it for 35+. And? Experience doth not beget competence. There are uses for low-level languages and those that require them will use them. Try writing a 300+ module banking application in assembler. By the time you do, it will be outdated. Not because the language will change, but because the banking requirements will. Using assembler to write an application of that magnitude is like trying to write an Encyclopedia article with paper and pencil. Possible, but 'tarded.

      "Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse."

      More like, 'software sucks today for the same reason it always has -- fossized thinkers can't change to make things easier for those who necessarily follow them.' Ego, no more.

  7. Some comments on the article by rbarreira · · Score: 4, Insightful
    OK, the article isn't bad but contains a few misleading parts... Some quotes:

    one assembly language statement translates directly to one machine instruction

    OK, this is nitpicking but there are some exceptions - I remember that TASM would convert automatically long conditional jumps to the opposite conditional jump + an unconditional long jump since there was no long conditional jump instruction.

    Other data structures work significantly better in high-level languages. A dictionary or associative array, for example, can be implemented transparently by a tree or a hash table (or some combination of the two) in a high-level language; the runtime can even decide which, based on the amount and type of data fed to it. This kind of dynamic optimization is simply impossible in a low-level language without building higher-level semantics on top and meta-programming--at which point, you would be better off simply selecting a high-level language and letting someone else do the optimization.

    This paragraph is complete crap. If you're using a Dictionary API in a so called "low-level language", it's as possible for the API to do the same optimization as it is for the runtime he talks about; and you're still letting "someone else do the optimization".

    When you program in a low-level language, you throw away a lot of the semantics before you get to the compilation stage, making it much harder for the compiler to do its job.

    That's surely true. But the opposite is also true - when you use an immense amount of too complex semantics, they can be translated into a pile of inefficient code. Sure, this can improve in the future, but right now it's a problem of very high level constructs.

    Due to the way C works, it's impossible for the compiler to inline a function defined in another source file. Both source files are compiled to binary object files independently, and these are linked.

    Not exactly true I think. Yes, the approach on that page is not standard C, but on section 4 he also talks about some high level performance improvements which are still being experimented on, so...
    --

    The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
  8. Typical "/." Handwaving by Anonymous Coward · · Score: 5, Insightful

    "I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages."

    The "appeal to an expert" fallacy?

    "What was true 20 years ago is still true today, well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment."

    It also means that portability becomes ever harder, as well as adaptability to new hardware.

    "If computer science isn't about computers, what is it about? I haate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?""

    It's about algorithms. Computers just happen to be the most convienent means for trying them..

    "The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant."

    With the trend towards VM's and virtualization, that "hypothetical" computer comes ever closer.

    "Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and .NET will make software worse."

    Now who's handwaving?

    1. Re:Typical "/." Handwaving by 14CharUsername · · Score: 3, Insightful

      Now who's handwaving?

      I'd say you are. His first statement wasn't a logically fallacy, he was just pointing out this argument has been going on for a long time.

      You made a good point about portability, but I think that was your only point. And its easily shot down byt the fact that its just as easy to port a standard C/C++ API to a new environment as it is to port Java/.NET to a new environment.

      He made an excellent point about many new graduates not knowing how the CPU actually works and you replied with: "It's about algorithms. Computers just happen to be the most convienent means for trying them.." ??? What the hell does that mean? Handwaving indeed.

      His main point was that VM's are always slower compiled machine code. Even if computers are doubling in speed every 18 months or whatever, native machine code will still be faster than virtual machine code.

      With the trend towards VM's and virtualization, that "hypothetical" computer comes ever closer.

      Right there you have just proven yourself to be an academic. Trends do not make reality. Besides that, what about gcj? If VMs were so great, why would anyone want to compile java to native code? In the real world, people care about performance. Academics are satisfied that a problem has a solution. In the real world we need to be able to get a solution in the minimum amount of time. VMs always take more time.

      Now you may continue your handwaving.

    2. Re:Typical "/." Handwaving by Anonymous Coward · · Score: 3, Insightful

      "In the real world we need to be able to get a solution in the minimum amount of time. VMs always take more time."

      I'd argue that in the real world (or at least business world) we need the solution to be developed in the shortest amount of time, with the most amount of security. While a VM based language is not guaranteed to provide quicker time / security, in most cases it probably will.

  9. What I didn't see in TFA... by s_p_oneil · · Score: 4, Insightful

    I didn't see anything mentioning that many high-level languages are written in C. And I don't consider languages like FORTRAN to be high-level. FORTRAN is a language that was designed specifically for numeric computation and scientific computing. For that purpose, it is easy for the compiler to optimize the machine code better than a C compiler could ever manage. The FORTRAN compiler was probably written in C, but FORTRAN has language constructs that are more well-suited to numeric computation.

    Most truly high-level languages, like LISP (which was mentioned directly in TFA), are interpreted, and the interpreters are almost always written in C. It is impossible for an interpreted language written in C (or even a compiled one that is converted to C) to go faster than C. It is always possible for a C programmer to write inefficient code, but that same programmer is likely to write inefficient code in a high-level language as well.

    I'm not saying high-level languages aren't great. They are great for many things, but the argument that C is harder to optimize because the processors have gotten more complex is ludicrous. It's the machine code that's harder to optimize (if you've tried to write assembly code since MMX came out, you know what I mean), and that affects ALL languages.

  10. Assembler by backwardMechanic · · Score: 4, Insightful

    Every serious hacker should have a play with assember, or even machine code. There is real magic in starting up a uP or uC on a board you built yourself, and making it flash a few LEDs under the control of your hand assembled program. I found a whole new depth of understanding when I built a 68hc11 based board (not to mention memorizing a whole bunch of op-codes). Of course, I'd never want to write a 'serious' piece of code in assembly, and it still amazes me that anyone ever did!

  11. Re:Old debate by Bastian · · Score: 5, Insightful

    The article addressed this point by mentioning that the definitions of high and low level language are a moving target. Nowadays I think most people consider assembly language to be its own thing, and the low-level classification has now been shifted into a domain that was once described completely by the term high-level. The term "high-level language" has been replaced by the term "programming language."

    If you're going to go with the jargon as it's most often used nowadays (which is a perfectly reasonable thing to do), then C would certainly be about as low as you can get without manipulating individual registers - i.e., without being assembly language.

  12. The myth of assembly performance by p3d0 · · Score: 4, Insightful
    Well, generally you'll have faster code if you code it in assembly.
    No, generally you'll have slower code. In a few specific, well-chosen places, you may get faster code. If you had unlimited time, patience, and performance tuning expertise, then you could beat the compiler on a large application, but how realistic is that?

    Coding large apps in assembly is usually way beyond the point of diminishing returns in terms of performance.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  13. Re:Old debate by jacksonj04 · · Score: 3, Insightful

    Low level says what you want the system to do. High level says what you want the language (Via compiler, interpreter etc) to make the system do.

    --
    How many people can read hex if only you and dead people can read hex?
  14. Don't be so sure by overshoot · · Score: 3, Insightful
    Well, generally you'll have faster code if you code it in assembly.
    I wouldn't even grant that in the general case.

    Amazingly far back (try the 80s) a professor friend of mine had a marvelous example of compiler-generated code where the compiler had done such an amazing job of optimising register use that you had to trace through more than 20 pages of assembler output with colored markers to trace from where the register was loaded to where it was used.

    No way I would ever have the huevos to code that way in assembler. On a RISC machine or (Heaven help us) the Itanic it gets lots worse.

    --
    Lacking <sarcasm> tags, /. substitutes moderation as "Troll."
  15. Initially by Vexorian · · Score: 3, Insightful

    The article later points out that the native version was running slower due to not using optimization options correctly. And later the native version was running 15% faster than the managed version

    --

    Copyright infringement is "piracy" in the same way DRM is "consumer rape"
  16. An expert assembly programmer in a CPU... by Terje+Mathisen · · Score: 4, Insightful

    I've probably written more assembly than most slashdot readers, and most of what you say is true:

    It used to be the case that I could always increase the speed of some random C/Fortran/Pascal code by rewriting it in asm, parts of that speedup came from realizing better ways to map the current problem to the actual cpu hardware available.

    However, I also discovered that much of the time it was possible to take the experience gained from the asm code, and use that to rewrite the original C code in such a way as to help the compiler generate near-optimal code. I.e. if I can get within 10-25% of 'speed_of_light' using portable C, I'll do so nearly every time.

    There are some important situations where asm still wins, and that is when you have cpu hardware/opcodes available that the compiler cannot easily take advantage of. I.e. back in the days of the PentiumMMX 300 MHz cpu it became possible to do full MPEG2/DVD decoding in sw, but only by writing an awful lot of hand-optimized MMX code. Zoran SoftDVD was the first on the market, I was asked to help with some optimizations, but Mike Schmid (spelling?) had really done 99+% of the job.

    Another important application for fast code is in crypto: If you want to transparently encrypt anything stored on your hard drive and/or going over a network wire, then you want the encryption/decryption process to be fast enough that you really doesn't notice any slowdown. This was one of the reasons for specifying a 200 MHz PentiumPro as the target machine for the Advanced Encryption Standard: If you could handle 100 Mbit Ethernet full duplex (i.e. 10 MB/s in both directions) on a 1996 model cpu, then you could easily do the same on any modern system.

    When we (I and 3 other guys) rewrote one of the AES contenders (DFC, not the winner!) in pure asm, we managed to speed it up by a factor of 3, which moved it from being one of the 3-4 slowest to one of the fastest algorithms among the 15 alternatives.

    Today, with fp SIMD instructions and a reasonably orthogonal/complete instruction set (i.e. SSE3 on x86), it is relatively easy to write code in such a way that an autovectorizer can do a good job, but for more complicated code things quickly become much harder.

    Terje

    --
    "almost all programming can be viewed as an exercise in caching"