High-level Languages and Speed
nitsudima writes to tell us Informit's David Chisnall takes a look at the 'myth' of high-level languages versus speed and why it might not be entirely accurate. From the article: "When C was created, it was very fast because it was almost trivial to turn C code into equivalent machine code. But this was only a short-term benefit; in the 30 years since C was created, processors have changed a lot. The task of mapping C code to a modern microprocessor has gradually become increasingly difficult. Since a lot of legacy C code is still around, however, a huge amount of research effort (and money) has been applied to the problem, so we still can get good performance from the language."
So we "still can get good performance" from C? The implication is that C will somehow become overcome by some unnamed high-elvel language soon. That is just wishful thinking. The article is not very substantial, and where it tries to substantiate, it misses the mark badly. The claim that C cannot handle SIMD instructions well is not true. You can use them directly from C, or the C compiler can use them through autovectorization, as in gcc 4.1. The claim that C cannot inline functions from another source file is also wrong. This is a limitation in gcc, but other compilers can do it, and IIRC the intel compiler can. It is certainly not "impossible".
I remember back in the days of the Atari ST and Amiga, C was considered to be a high-level language. People would complain about the poor performance of games written in C (to ease the porting from Amiga to ST and vice versa) over 'proper' Assembly coded games.
Now I hear most people referring to C and C++ as "low level" languages, compared to Java and PHP and visual basic and so on. Funny how that works out.
I like Assembler. There's something about interacting intimately with your target hardware. It's a shame that it's no longer feasible with today's variety of hardware.
Argh.
This is not true. What they mean, I think, is "the task of mapping C code to efficient machine code has gradually become increasingly difficult".
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
C is not a low level language. If you're not directly manipulating the registers on the processor, you are not in a low level language (and forget about the "register" keyword, modern compilers just treat register variables in C/C++ as memory that needs to be optimized for speed).
If anything, C is a so-called mid level language. If it wasn't, you'd be using an assembler instead of a compiler.
I got nothin'
Ain't it great to know that Modula-2 - and essentially ALL of the strongly typed and structure languages - have pretty much died out. I did piles of stuff in M2, including reading and parsing legacy binary files, re-entrant interrupt handlers in DOS, etc.
The living have better things to do than to continue hating the dead.
The more abstract a language is, the better a compiler can understand what you are doing. If you write out twenty instructions to do something in a low-level language, it's a lot of work to figure out that what matters isn't that the instructions get executed, but the end result. If you write out one instruction in a high-level language that does the same thing, the compiler can decide how best to get that result without trying to figure out if it's okay to throw away the code you've written. Optimisation is easier and safer.
Furthermore, the bottleneck is often in the programmer's brain rather than the code. If programmers could write code ten times faster, that executes a tenth as quickly, that would actually be a beneficial trade-off for many (most?) organisations. High-level languages help with programmer productivity. I know that it's considered a mark of programmer ability to write the most efficient code possible, but it's a mark of software engineer ability to get the programming done faster while still meeting performance constraints.
Bogtha Bogtha Bogtha
The first mistake: Confusing "compile" performance with execution performance. The job of maping C/C++ code to machine code is trivial.
.NET will make software worse.
I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages.
What was true 20 years ago is still true today, well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment.
The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant.
If computer science isn't about computers, what is it about? I haate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?"
Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
OK, this is nitpicking but there are some exceptions - I remember that TASM would convert automatically long conditional jumps to the opposite conditional jump + an unconditional long jump since there was no long conditional jump instruction.
This paragraph is complete crap. If you're using a Dictionary API in a so called "low-level language", it's as possible for the API to do the same optimization as it is for the runtime he talks about; and you're still letting "someone else do the optimization".
That's surely true. But the opposite is also true - when you use an immense amount of too complex semantics, they can be translated into a pile of inefficient code. Sure, this can improve in the future, but right now it's a problem of very high level constructs.
Not exactly true I think. Yes, the approach on that page is not standard C, but on section 4 he also talks about some high level performance improvements which are still being experimented on, so...
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
"I've been programming professionally for over 20 years, and for those 20 years, the argument is that computers are now fast enough to allow high level languages and we don't need those dirty nasty assemblers and low level languages."
.NET will make software worse."
The "appeal to an expert" fallacy?
"What was true 20 years ago is still true today, well written code in a low level language tailored to how the computer actually works will always be faster than a higher level environment."
It also means that portability becomes ever harder, as well as adaptability to new hardware.
"If computer science isn't about computers, what is it about? I haate that students coming out of universities, when asked about registers and how would they write a multiply routine if they only had shifts and adds, ask "why do I need to know this?""
It's about algorithms. Computers just happen to be the most convienent means for trying them..
"The problem with computer science today is that the professors are "preaching" a hypothetical computer with no limitations. Suggesting that "real" limitations of computers are somehow unimportant."
With the trend towards VM's and virtualization, that "hypothetical" computer comes ever closer.
"Software sucks today because software engineers don't understand computers, and that's why languages and environments like Java and
Now who's handwaving?
I didn't see anything mentioning that many high-level languages are written in C. And I don't consider languages like FORTRAN to be high-level. FORTRAN is a language that was designed specifically for numeric computation and scientific computing. For that purpose, it is easy for the compiler to optimize the machine code better than a C compiler could ever manage. The FORTRAN compiler was probably written in C, but FORTRAN has language constructs that are more well-suited to numeric computation.
Most truly high-level languages, like LISP (which was mentioned directly in TFA), are interpreted, and the interpreters are almost always written in C. It is impossible for an interpreted language written in C (or even a compiled one that is converted to C) to go faster than C. It is always possible for a C programmer to write inefficient code, but that same programmer is likely to write inefficient code in a high-level language as well.
I'm not saying high-level languages aren't great. They are great for many things, but the argument that C is harder to optimize because the processors have gotten more complex is ludicrous. It's the machine code that's harder to optimize (if you've tried to write assembly code since MMX came out, you know what I mean), and that affects ALL languages.
Every serious hacker should have a play with assember, or even machine code. There is real magic in starting up a uP or uC on a board you built yourself, and making it flash a few LEDs under the control of your hand assembled program. I found a whole new depth of understanding when I built a 68hc11 based board (not to mention memorizing a whole bunch of op-codes). Of course, I'd never want to write a 'serious' piece of code in assembly, and it still amazes me that anyone ever did!
I love these hard definitions of soft concepts. Just because you write down some rules, it doesn't mean we follow them. Any programmer understands roughly what 'high level' and 'low level' mean, but I'm sure we'll all argue over where the boundaries are - they're not well defined. I guess you stopped at 101?
I believe the phrase the faster your program will compile means "the faster the compiler will translate your program into machine-executable code." Apparently the author means "the compiler will generate faster code." He then makes the same mistake again, equivocating between the process of compilation and the quality of the compiled output.
If you can't manage to write a clear sentence defining what topic you're exploring...what else might you be getting wrong?
Key word is "relatively." C is low level compared to languages such as Java and C#, which do a lot of things such as memory management for you.
I got nothin'
as much as development process.
CPU power is available and cheap but time to market is critical. Most of the time, you don't need to do the fastest program ever, but to do a program that works reasonably well and that you can debug easily (some may say it is the same requirement).
C may not be the best tool for any given task but it is a pretty decent swiss army knife that most people know how to use reasonably well.
Disclaimer: I'm not in web devmnt but in embedded real time on DSP. With 8 dedicated ALU (2 mul, 2 add/sub, 2 logic and 2 load/store) running at the same time on the chip, there is still not many good alternatives to C (let the compiler optim and pray) and ASM (massive headhache).
The article addressed this point by mentioning that the definitions of high and low level language are a moving target. Nowadays I think most people consider assembly language to be its own thing, and the low-level classification has now been shifted into a domain that was once described completely by the term high-level. The term "high-level language" has been replaced by the term "programming language."
If you're going to go with the jargon as it's most often used nowadays (which is a perfectly reasonable thing to do), then C would certainly be about as low as you can get without manipulating individual registers - i.e., without being assembly language.
Computer science is no more about computers than astronomy is about telescopes" -- Edsger Dijkstra quotes (Dutch computer Scientist. Turing Award in 1972. 1930-2002)
I see this quote everywhere, and just because it's by some semi-famous academic, nodody questions it and takes it for granted. The quote is utter rubish.
With astronomy you have stars, which aren't man made and thus only scarcely understood and the tools we use to look at them, teleskopes, which are man-made. We understand them.
Computers and Computer Science are both things that are entirely man-made. There is no natural phenomenon that we call 'computer' and a science that studies this natural phenomenon called "computer science". It's all one thing. The quote is rubbish and contains no usefull information whatsoever. On the contrary: the conclusion it draws in abolutely false.
We suffer more in our imagination than in reality. - Seneca
the author is only a couple of years out of college and he is already well on his way to be becoming a professional troll. I see a bright future for him...
need a free COBOL editor for Windows?
People will always argue over 3 things:
1) whether assembly is faster than C
2) whether interpreted languages are faster than C/C++
The real question here is - which type of language does well for your application?
Ultimately C will be faster if a good programmer, who understands the language and the application. However, will he be more productive? I'd never write a third person shooter in python, perl, or java. However, what I might do is add a 3d engine to a python statistics modeling program that's already written in one of those languages. Most people would agree, writing a web interface in C is just insane if you have anything particularly useful you want to write. However, I'll probably write a multi-process webserver in C, just because it makes sense for speed (I know python has a built-in webserver, but there are features it doesn't have. You may be able to write it in python, but will all those features that apache have be fast?).
The bottom line is:
- Define the application you want to build.
- Define your requirements (responsiveness, rhobustness [security,reliability,etc], extendability, deadlines).
- Do a little research with a few languages (just experimentation). Write prototype interfaces in the language, do a little benchmarking, just play with it.
- Make a decision on a language based on what you've found and what's required.
As more high level languages appear (functional languages look very promising), see what those languages have over what's already out there. If it has an applicability to what you're doing, use it.
I'm tired of seeing everyone beat a dead horse. Yes, I know the two arguments:
- X is faster
- Y is just as fast as X, but can do it in less lines of code.
X & Y are different, there's no ignoring it. There's more dimensions to languages than speed and time to market, don't ignore them.
Coding large apps in assembly is usually way beyond the point of diminishing returns in terms of performance.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Low level says what you want the system to do. High level says what you want the language (Via compiler, interpreter etc) to make the system do.
How many people can read hex if only you and dead people can read hex?
Amazingly far back (try the 80s) a professor friend of mine had a marvelous example of compiler-generated code where the compiler had done such an amazing job of optimising register use that you had to trace through more than 20 pages of assembler output with colored markers to trace from where the register was loaded to where it was used.
No way I would ever have the huevos to code that way in assembler. On a RISC machine or (Heaven help us) the Itanic it gets lots worse.
Lacking <sarcasm> tags,
The article later points out that the native version was running slower due to not using optimization options correctly. And later the native version was running 15% faster than the managed version
Copyright infringement is "piracy" in the same way DRM is "consumer rape"
Huh? I would argue that commercially successful (as in boxes sold to Fortune 500 companies and used in production) operating systems have been written in three languages:
* Assembly
* C
* Lisp
Are there any commercially successful OSs written in C++ yet?
(revealing my ignorance and posting flamebait, all in one)
To a Lisp hacker, XML is S-expressions in drag.
I've probably written more assembly than most slashdot readers, and most of what you say is true:
It used to be the case that I could always increase the speed of some random C/Fortran/Pascal code by rewriting it in asm, parts of that speedup came from realizing better ways to map the current problem to the actual cpu hardware available.
However, I also discovered that much of the time it was possible to take the experience gained from the asm code, and use that to rewrite the original C code in such a way as to help the compiler generate near-optimal code. I.e. if I can get within 10-25% of 'speed_of_light' using portable C, I'll do so nearly every time.
There are some important situations where asm still wins, and that is when you have cpu hardware/opcodes available that the compiler cannot easily take advantage of. I.e. back in the days of the PentiumMMX 300 MHz cpu it became possible to do full MPEG2/DVD decoding in sw, but only by writing an awful lot of hand-optimized MMX code. Zoran SoftDVD was the first on the market, I was asked to help with some optimizations, but Mike Schmid (spelling?) had really done 99+% of the job.
Another important application for fast code is in crypto: If you want to transparently encrypt anything stored on your hard drive and/or going over a network wire, then you want the encryption/decryption process to be fast enough that you really doesn't notice any slowdown. This was one of the reasons for specifying a 200 MHz PentiumPro as the target machine for the Advanced Encryption Standard: If you could handle 100 Mbit Ethernet full duplex (i.e. 10 MB/s in both directions) on a 1996 model cpu, then you could easily do the same on any modern system.
When we (I and 3 other guys) rewrote one of the AES contenders (DFC, not the winner!) in pure asm, we managed to speed it up by a factor of 3, which moved it from being one of the 3-4 slowest to one of the fastest algorithms among the 15 alternatives.
Today, with fp SIMD instructions and a reasonably orthogonal/complete instruction set (i.e. SSE3 on x86), it is relatively easy to write code in such a way that an autovectorizer can do a good job, but for more complicated code things quickly become much harder.
Terje
"almost all programming can be viewed as an exercise in caching"
Haskell does OK. But compare to Clean, another pure lazy functional language. Clean blows away Haskell most of the time and competes favourably with C, sometimes beating it.
Doesn't it make you feel good to know that our freedoms are protected by politicans, lawyers and journalists.
When Fortran was made, nobody thought that CPUs of 30 years in the future will have vector processing instructions. In fact, as Wikipedia says, vector semantics in Fortran arrived only in Fortran 90.
The only advantage of current Fortran over C is that the vector processing unit of modern CPUs is better utilised, thanks to Fortran semantics. But, in order to be fair and square, the same semantics could be applied to C, and then C would be just as fast as Fortran.
The fact that C does not have vector semantics reflects the domain C is used: most apps written in C do not need vector processing. In case such processing is needed, Fortran can easily interoperate with C: just write your time-critical vector processing modules in Fortran.
As for higher-level-than-C languages being faster than C, it is purely a myth. Code that operates on hardware primitives (e.g. ints or doubles) has exactly the same speed in C, Java and other languages...but higher level languages have semantics that affect performance as much as they can help performance. All the checks VMs do have an additional overhead that C does not have; the little VM routines run here and there all add up to slower performance, as well as the fact that some languages are overengineered or open the way for sloppy programming (like, for example, not using static members but creating new ones each time there is a call).
It was an early example of the MS method of software development: buy out someone who has a viable product and do a much better job of marketing that product.
I maintain that MS has never been much of a software development company but, rather, a software marketing company. Certainly, the vast majority of their "innovations" have been in marketing. MS tends to incrementally improve on other developers software while being very innovative in their marketing of that software.
Lattice C was an early example. Excel is a mid-life example. A more recent example is the Groove Networks collaboration tool. MS recently bought them and will include the Groove in the next version of Office. They pretty much had to do this as Office is pretty stale. Who really needs a newer version of Word for example? And OpenOffice is coming along and is free. The only way to improve the Office product enough to warrent an upgrade was to add serious collaboration capabilities. And, this being MS we're talking about, the only way to do that was to go out and buy serious collaboration capabilities. Now they'll integrate it into Office and market the bejeezus out of it.
I rest my case...
Anyone that considers C to be a "cross-platform assembler" probably has never worked in assembler, and almost definitely hasn't done so on more than one platform.
'C' is only "low level" to those that don't get any closer to the hardware than, say, Visual Basic. Anyone that has programmed in assembly language will assure you that 'C' is quite high level. I'd be willing to accept "mid-level", but in reality once you've worked at the assembly level you will realize that there's very little difference between 'C' and Visual Basic. 'C' and VisualBasic are essentially both high-level languages; but 'C' just seems more intimidating than VisualBasic to the VisualBasic programmer. Those that call 'C' mid-level are probably VisualBasic programmers that think 'C' is intimidating so it clearly can't be a high-level language like VB.
When you write a single line in 'C' and realize that that can correspond to hundreds of assembly language instructions, you realize that 'C' is very much a relatively high-level language. When you try to do floating point math on an 8-bit processor with no floating point instructions, you realize that 'C' is very much a high-level language. When you try to add three numbers and multiply it by a fourth, and you come from 'C', you realize that (1 + 2 + 3) * 4 is a heck of a lot more complicated than you imagined.
The main difference between VB and 'C' is that VB gives you more self-contained packages to let you interact with today's GUI's. 'C' gave you printf which was fine for writing to a terminal. VB gives you all kinds of controls to let you do pretty GUI stuff. The concept is exactly the same, and both are high level.
I say all of this having programmed in assembly language, then Basic, then QuickBasic, then 'C', then VisualBasic, and now almost exclusively 'C' and assembly language in truly embedded systems (embedded != Windows or Linux in a small form factor).
Nah, that's nonsense. Python has string objects, Scheme has continuations. Ruby's still slower.
First-class reentrant continuations and dynamic typing (another major efficiency hog) probably constrain you to, in the best case, the same box as compiled Scheme - about the same as Java.
Perl is not ugly, just really really idiomatic. As with all idiomatic languages, you can't grok what something means if you're not exposed to it. :)
It's just a matter of "if you can't stand the line noise, get out from the code-kitchen!".
Even if I can understand easily Perl code, what I can't really stand is C pointer arithmetic if it steps too far...
I do stuff in embedded space using IAR or GreenHills and gcc. For the most part, the proprietary vendors are losing ground to gcc. The proprietary advantage is shrinking, especially with more modern micros and as gcc improves. For the most part, code that comes out of gcc is no worse than code coming out of IAR or GreenHills. Where the priopritary guys have a real advantage is in better Clib implementations. The Clib, and newlib, that are normally used with gcc are huge and bloaty in comparison.
Engineering is the art of compromise.
What fundamental design flaw -- that malloc() is less convenient to use in C++? For crying out loud, use new!
Ok, void pointers are less useful in C++ than in C. In my experience, that has been a non-problem. But then I've never tried to program in C with a C++ compiler -- I have enough problems without creating artificial ones.
Unfortunately, just because a new generation is growing up doesn't mean we'll want to rewrite absolutely everything. It'd be much better if things were developed rationally as soon as possible -- that reduces the total amount of legacy c/c++ code which will ultimately have to be rewritten later.
Besides, it's not a new concept, and if this generation of programmers didn't get it, neither will the new generation, because among the very first generation of programmers were people who understood Lisp machines. Of course, if a new generation really does start using mostly Ruby when the current one can't handle Lisp, we'll know it was those darned parentheses. Just as any sufficiently advanced technology is indistinguishable from magic, any sufficiently advanced language is indistinguishable from Lisp.
It will be funny to see this turned on its head, if there are ever enough, say, Python or Ruby programmers to improve python/ruby compilers/runtimes to where, a couple generations of processors later, it's C that has a lack of optimizations and is actually farther from the hardware. We may actually see a C virtual machine as a necessity!
More practically, I try to work with languages that suit the task at hand, which is really never C unless I'm dealing with a huge existing C codebase.
Don't thank God, thank a doctor!
Oh, hell no.
.NET.
Java feels way slower than anything else. My college courses were mostly in Eclipse. It runs fast enough, but it takes forever to start, which is true of many, many Java apps.
Which means that when these same programmers end up learning C/C++, they'll think Java is slow because it's "interpreted". I guess there's at least the hope that they'll wind up using C#, and thinking Java is slow because it sucks. Which is good enough, because Java sucks for other reasons, even though it isn't really slow.
But really, with Generics, Java has basically picked up most of the features and syntax of C++, added garbage collection and much more anal retentive restrictions, and called it a whole new language. The bytecode and virtual machine is really not relevant to the awfulness of the language itself -- you can write a perfectly good language for the JVM -- but the JVM, specificalyl, has its own drawbacks, in that it's hard to write more libraries for Java, and many of the existing libraries suck in profound ways compared to C/C++ alternatives, or even
Frankly, the only good thing about them learning Java is that at least for awhile, their code may be portable, because it's so hard to make OS-specific or arch-specific Java.
Don't thank God, thank a doctor!
the problem of yum is in the design of the application, not the language.
... that i have seen on that frontier. proper indexes and logical stops would make it much faster. there's your chance to write it. choose whatever language you want. design has to be good.
python is fast enough for almost any package management quest, but yum is the worst piece of
I'd tell you the chances of this story being a dupe, but you wouldn't like it.