High-level Languages and Speed
nitsudima writes to tell us Informit's David Chisnall takes a look at the 'myth' of high-level languages versus speed and why it might not be entirely accurate. From the article: "When C was created, it was very fast because it was almost trivial to turn C code into equivalent machine code. But this was only a short-term benefit; in the 30 years since C was created, processors have changed a lot. The task of mapping C code to a modern microprocessor has gradually become increasingly difficult. Since a lot of legacy C code is still around, however, a huge amount of research effort (and money) has been applied to the problem, so we still can get good performance from the language."
Well, we ran our own tests. We took a sizable chunk of supposedly well-written time-critical code that the gang had produced in what was later to become Microsoft C [2] and rewrote the same modules in Logitech Modula-2. The upshot was that the M2 code was measurably faster, smaller, and on examination better optimized. Apparently the C compiler was handicapped by essentially having to figure out what the programmer meant with a long string of low-level expressions.
Extrapolations to today are left to the reader.
[1] I used to comment that C is not a high-level language, which would induce elevated blood pressure in C programmers. After working them up, I'd bet beer money on it -- and then trot out K&R, which contains the exact quote, "C is not a high-level language."
[2] MS originally relabeled another company's C compiler under license (I forget their name; they were an early object lesson).
Lacking <sarcasm> tags,
I don't believe this as much as the people who I see repeating that sentence all the time...
Not many years ago (with gcc), I got an 80% speed improvement just by rewriting a medium sized function to assembly. Granted, it was a function which was in itself, half C code, half inline assembly, which might hinder gcc a bit. But it's also important to note that if the function had been written in pure C code, the compiler wouldn't have generated better code anyway since it wouldn't use MMX opcodes... Last I checked, MMX code is only generated from pure C in modern compilers when it's quite obvious that it can be used, such as in short loops doing simple arithmetic operations.
An expert assembly programmer in a CPU which he knows well can still do much better than a compiler.
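For what it's worth, here is a rough sketch of the "short loop doing simple arithmetic" shape that a compiler has a decent chance of vectorizing by itself. The function name `add_i16` is made up for illustration, and whether any given compiler actually emits SIMD instructions for it depends on the compiler, flags, and target, so treat it as an assumption rather than a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Element-wise 16-bit add (hypothetical helper, not from any library).
   The loop body has no branches and no dependency between iterations,
   and `restrict` promises the three arrays do not overlap. That is the
   kind of loop an auto-vectorizer can usually recognize.
   Sums outside the int16_t range are converted in an
   implementation-defined way (they wrap on mainstream compilers). */
void add_i16(int16_t *restrict dst, const int16_t *restrict a,
             const int16_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (int16_t)(a[i] + b[i]);
    }
}
```

Anything much more irregular than this (branches, pointer chasing, mixed inline assembly) is where hand-written SIMD still tends to win.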
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
Take Quake II for instance; as quoted from the article 'the managed version initially ran faster than the native version' - which would suggest higher-level languages are certainly capable of comparing to that of their lower-level siblings.
Also, take into account the added developer time gained from factors like memory-management being, well, managed, and ever-falling processor & memory prices, and the logical conclusion is usually "write at a higher-level".
There are of course more considerations than these when deciding on a development platform, but essentially, I think there'd have to be very good reasons for writing green-field projects too close to the machine.
throw new NoSignatureException();
Here's a print view of the article so that you don't have to keep moving through the pages. Despite that annoyance, it was a good article. I wish there had been more concrete examples though.
Who said Freedom was Fair?
The criteria for a high-level language are: 1) you aren't allowed to do direct memory register manipulations (i.e. you can't run off the end of an array into other areas), and 2) you are interpreted. Either of these can qualify a language as high-level. C has direct memory register manipulation and it is not interpreted, therefore it cannot be a high-level language.
stuff |
It seemed to me the article was criticising C and trying to compare Java favourably, i.e. C is a low-level language that cannot be optimised, Java is a high-level language that can. Roughly.
It didn't say much at all otherwise, but it did have a nice collection of adverts.
Optimisation:
You don't have to hack around, some compilers do it for you. The new MS compiler does a 'whole program optimisation' where it will link things together from separate object modules. It still cannot handle libraries, but then, that's just an issue that applies to all programs that are split into component parts. (Except, as the article implies, Java that uses the bytecode in class libraries... except when compiled to native code, which the first page of the article mentioned as a way to boost speed. Can't have it both ways :-) )
The main reason C is "faster" than high-level languages is that C doesn't cover bad programmers' butts with elaborate type checking, ref counting and garbage collection. Take a properly designed C app with graceful error handling and secure inputs, and you will take a performance hit. Let's face it, most of the code we write in C involves error handling and idiot-proofing, and most high-level languages have built-in functionality for these boring, repetitive slabs of code we all hate writing.
I see no reason why a high-level application couldn't be compiled as skillfully as a feature-equivalent low-level application. It's just a matter of breaking down the code into manageable building blocks.
-Billco, Fnarg.com
20 years ago there was nothing strange about having an actual quicksort machine instruction (VAXen had it).
While the VAX had some complex instructions (such as double-linked queue handling), it did not have a quicksort instruction.
Here is the instruction set manual.
No, what they say is "the proof of the pudding is in the eating." (Just pointing it out because most people get it wrong.)
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
"If programmers could write code ten times faster, that executes a tenth as quickly, that would actually be a beneficial trade-off for many (most?) organisations."
This sounds perfectly reasonable in theory. In practice, however, it's not. Users want speedy development AND speedy execution. I developed a Java image management program for crime scene photos, and the Sheriff Patrol's commander told me flat out: we'll never use this. It's too slow.
I rewrote the program using C++ and Qt, and gained a massive speed improvement. The Sheriff Patrol and detective units have been using it ever since, and they love it. I had been a Java booster for upwards of eight years until then. That was (roughly) three years ago, and I haven't written a line of Java since. I have, however, run my historic Java programs in SUN's most recent JVM. The newer hardware runs it faster, but Qt/C++ still smokes Java. Qt gives me speedy development, and C++ gives me fast execution. It's the best of both worlds.
Badly researched to the point of being irresponsible.
1. Unsupported implication that 'C' was created in response to PDP-11 assembly language.
2. Using vector attached processors as evidence of HLL obsolescence. First, the AltiVec/MMX unit is not the entire processor; it doesn't even do most of the work, it's an *attached* unit. There is still a main MPU to do the spaghetti code. Second, they are easily used by HLLs via optimized LIBRARIES; that's the beauty and breakthrough of 'C' that has become a model for HLLs.
3. JIT examples fail to include the runtime of the JIT compiler itself. The program may speed up by 10%, but running the JIT before the program will blow that time out of the water.
4. Article totally ignores the "RISC revolution" of the 80's where processors were actually designed based on HLLs, designed specifically to speed them up, acting in concert with the compilers & linkers. This concept is now old hat. Maybe the author wasn't born yet.
Need I continue??
I thought it might be helpful for a current student to let you know what it is we learn today at my college. I'm a senior Software Engineering major, not a comp sci major. Comp Sci is another department and has a totally different focus. They focus on super efficient algorithms, we focus on developing large software projects.
My software engineering program has been very Java intensive. My software engineering class, object oriented class, and software testing class were all java based. We dabbled in C# a bit as well.
However, I also had an assembly class, a programming languages class where we learned Perl and Scheme (this language sucks) and about five algorithms classes in C++. I also had an embedded systems class in both C and assembly (learned assembly MCU code, then did C).
I feel like this is all pretty well rounded; I've learned a bunch of languages and am not really specialized in one. I'd say I am best at Java right now, but I can also write C++ code just fine.
I've never been told a computer has any kind of crazy limitless performance. In embedded systems, I learned about performance. Making a little PIC microcontroller calculate arctan was fun(took literally 30 seconds without a smart solution). I also learned that there is a trade off between several things such as performance, development time, readability, and portability.
We are taught to see languages as tools: you look at your problem and pull a tool out of the tool box that you think fits the problem best. You have to weigh what's important for the project and choose based off of that.
The final thing I'd like to point out is that one huge issue with software today is that it is bug ridden. How easy something is to test makes a big difference in my opinion. Assembly and C will pretty much always be harder to test than languages like Java and C#.
I don't think the universities are the problem, at least not in my experience.
One interesting feature the compiler/IDE system I was using at the time (TopSpeed's) had was this concept that all their language compilers (M2, C, C++, etc) all compiled into an intermediate binary form, and their final compiler did very heavy optimizations on that "byte code".
That's no different to most compilers. GCC for instance parses the "frontend" language (C, C++, etc) into an intermediate language and performs most optimisations on that intermediate language before translating it to assembler instructions. Optimisation can be performed in the high level language, and even the assembler, but most is performed at the intermediate level as this way all frontends can potentially benefit.
The more recent versions of GCC also perform transformations on a tree-based intermediate form, before converting that into the older RTL form. There are certain high level optimizations that just work better on abstract syntax trees.
C became popular because of Unix. Since you could get the source code for Unix, most big universities used Unix in their OS courses. And since it was written in C, you were going to learn C if you took Computer Science. Textbooks started to assume you knew C. Magazines started to assume you knew C. People wrote free small C compilers and then came GCC, so now you could have a good free C compiler for just about any system. But before GCC all the buzz was about Smalltalk. Smalltalk was the future. OOP was going to replace structured programming. The problem was very few people had a computer that could run Smalltalk. So C++ was born.
A final blow to Modula-2 was simply that Borland didn't create a Modula-2 compiler. For many years when you said Pascal you really meant Turbo or Borland Pascal. Borland was the Pascal company, and they added objects to Pascal and eventually created Delphi.
I am sure Topspeed has closed up shop. There just isn't much room for compiler makers anymore. You have the free software at the bottom end and the Microsoft Monster at the top. Only a few niche players are left. Ada seems to be a place where a good compiler company can still make a few dollars.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
As someone else mentioned, there is no quicksort instruction. That's far too complex and involves looping and conditional branching. Probably the most complex of the VAX instructions was the polyf/polyg instruction, which would compute a polynomial to 7 iterations, thus allowing one instruction to compute a trigonometric function. There were also instructions for copying strings up to 64k (and those instructions were interruptible), and instructions to format numbers a la COBOL pics. These instructions were generally emulated in the smaller MicroVAXen and such, but were in microcode on the larger ones. Note that even x86 has a string copy instruction.
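To make the polynomial instruction a bit more concrete, here is a minimal C sketch of evaluating a polynomial from a coefficient table using Horner's rule, which is how the VAX POLY instructions are generally described. The function name `poly_eval` and the highest-order-first coefficient layout are my own assumptions for illustration; this is not a reproduction of the microcode or its rounding behaviour.

```c
#include <stddef.h>

/* Evaluate coeffs[0]*x^degree + coeffs[1]*x^(degree-1) + ... + coeffs[degree]
   with Horner's rule: one multiply and one add per remaining coefficient.
   Hypothetical helper; the coefficient order is an assumption. */
double poly_eval(double x, const double *coeffs, size_t degree)
{
    double result = coeffs[0];
    for (size_t i = 1; i <= degree; i++) {
        result = result * x + coeffs[i];
    }
    return result;
}
```

Feed it the coefficients of a truncated series for sine or cosine and you get the "one instruction computes a trig function" effect, just spelled out as a loop.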
Now, here's where you're really wrong. Those instructions weren't put in there as a convenience to humans writing in assembly. Instead, they were put in there as a convenience to compiler writers who could make use of the high-level assembly instructions to ease their code generation. The cobol compiler was almost unnecessary. They had numeric data types to cover it, it was nuts.
They also had instructions to deal with octawords (128 bit integers), and of course the vax allowed accesses of any size integer on any boundary, which could mean a couple of fetches for a particular piece of data. There are assembly instructions to force alignment.
The only non-magic of which I'm aware is that it was "required" that between writing a piece of code into memory and executing it there should be an intervening rei instruction, apparently to clear all caching. I put the word "required" in quotes for a reason. A professor at a college that I attended wrote a very popular Scheme compiler. I mentioned one day to a grad-student friend this requirement, and somehow we ended up getting to the prof. He didn't have that in his compiler and it worked just fine writing to a piece of memory then executing it. I showed him the page in the VAX Architecture Handbook (probably around 276 or 278) and we got a good chuckle.
Anyway, shortly after VAX came out people started to seriously think about simplifying the instruction set and putting more burden on the compilers. I still believe the Alpha is probably the king of risc, ironic given that VAX is the king of cisc. Most of the lessons that VAX taught us were in the negative.
Do you have ESP?
You forgot "CONS" which comes from the IBM cons cells (a 36bit machine word on the 704), which is the block holding both a CAR and a CDR.
The thing is, the names only existed because no one found any better name for them, or any more interesting name (Common Lisp now offers the "first" and "rest" aliases to CAR and CDR... yet quite a lot of people still prefer using CAR and CDR).
LISP has always been a high level language, because it was started from mathematics (untyped lambda calculus) and only then adapted to computers.
And the fact that Lisp Machines (trying to get away from the Von Neumann model) were built doesn't mean that Lisp is a low level language, only that AI labs needed power that the Lisp => Von Neumann machine mappings could not give them at that time.
Lisp is a high level language, because Lisp abstracts the machine away (no memory management, not giving a fuck about registers or machine words [may I remind you that Lisp was one of the first languages with unbounded integers and automatic promotion from machine to unbounded integers?])
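For contrast, here is roughly what "caring about machine words" looks like in C. This is only a sketch with a made-up name (`checked_add`): it detects when a signed addition would overflow, which is the point where a Lisp would silently promote to a bignum. It does not implement bignums itself.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *sum when the result fits in a long.
   Returns false and leaves *sum untouched when the addition would
   overflow. The checks are done before adding, because signed overflow
   in C is undefined behaviour. Hypothetical helper for illustration. */
bool checked_add(long a, long b, long *sum)
{
    if ((b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b)) {
        return false;
    }
    *sum = a + b;
    return true;
}
```

In Lisp none of this is the programmer's problem, which is rather the point.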
"The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
No, what they SAY is "The proof is in the pudding" --
From google:
Results 1 - 10 of about 326,000 for "the proof is in the pudding". (0.47 seconds)
Results 1 - 10 of about 118,000 for "the proof of the pudding is in the eating" [definition]. (0.30 seconds)
They're not right, of course, but then, sadly, you're not either, since what people say has changed. It's changed to something nonsensical, which people quote without understanding, which is annoying, like "I could care less!":
Results 1 - 10 of about 2,180,000 for "I could care less". (0.28 seconds)
Results 1 - 10 of about 776,000 for "I couldn't care less". (0.22 seconds)
But "the proof is in the pudding" kind of rolls off the tongue better... like a pudding which tastes nasty and you are therefore gently, but suavely, spitting out.
Most truly high-level languages, like LISP (which was mentioned directly in TFA), are interpreted, ...
... and the interpreters are almost always written in C. It is impossible for an interpreted language written in C (or even a compiled one that is converted to C) to go faster than C.
Programming languages are not "interpreted". A language IMPLEMENTATION may be based on an interpreter. Every major implementation of Common Lisp today has a compiler, and most of them don't even have an interpreter any more - everything, including command-line/evaluator input, is compiled on-the-fly before being executed.
Again, this is a property of implementations, not of languages. The highest-performance Common Lisp implementations have scaffolding written in C and assembly, but they do not use a C compiler when they compile Lisp code. They often use non-C ABI conventions for argument passing and stack handling, to make their style of function calling faster.
I don't mean to be harsh, but the "Lisp is slow because it's interpreted" meme is about twenty years out of date. It tends to be spread primarily by college professors whose last exposure to Lisp was pre-1980, and it really grates on those of us who know better.
To a Lisp hacker, XML is S-expressions in drag.
I agree. Why should we give any weight to the sayings of some random guy? What the hell would he know about computer science?
The quote is rubbish and contains no useful information whatsoever. On the contrary: the conclusion it draws is absolutely false.
It seems to me that you are a good example of the type of person that the OP was complaining about (i.e. not knowing much about computer science). If you read about the history of computer science you would see that it started as a pure mathematical discipline that just happened to use computing devices because the algorithms were too complex to be solved quickly by hand. The early computer was just a tool that made things easier for mathematicians, much like a telescope for astronomers. Of course, modern computer science focuses much more on algorithms specifically related to computer functions like disk caching, task scheduling, etc. So Dijkstra's comment may not be as relevant today, but at the time he said it, it was pretty accurate.
It's possible to say everything said in this article -- vaguely, as it is said in this article -- and be right, and yet still dance around the reality.
Take a look yourself on http://shootout.alioth.debian.org/
C's faster than Java. It will probably always generally be so, unless you're trying to run C code on a hardware Java box.
This article says Java, for example, CAN be faster. But it doesn't say "C is almost always faster than Java or Fortran, usually faster than ADA, and C can be mangled (in the form of D Digital Mars, for instance) to be faster than C usually is. Often, Java is a pig, compared to C, BUT THERE ARE TIMES WHEN IT ISN'T. Really. There are times, few and far between, when it's actually, get this, FASTER. It's fun to look for those few times. And if you write programs which do that, that'd be cool. And as processors get wackier and wackier, there will be more and more times where this is true. Meanwhile, if your developers write good code, Java's easier to develop in and debug." Which would be more completely correct.
Excuse me, now. I have to go back to my perl programming.
Yay. With continued displays of attitudes like that, I'm going to leave the industry.
It is getting increasingly difficult to hire S/W engineers that understand that there is an operating system and also hardware beneath the software they write. I need people NOW that can grok device drivers, understand and use Unix facilities, fiddle with DBs, write decent code in C, C++, Java, and shell, and can also whip together a decent WS interface. Someone who does all of those.
WhyTF has the S/W industry become so compartmentalized? I can hire a device driver person, but he won't know anything about web services. I can hire a DB person, but she won't know a damn thing about poking values into registers. I can hire a web-services person, but he will have never worked on a Unix platform before. WTF? Really, WTF?
In short, I can't hire someone who can take ownership of an entire system. It's always, "Well, that's a hardware thing, go ask Foo", "Oh, it looks like the database, need to talk to Bar", "The Web interface is borked, we'll need to bring Baz in", "Hm, it doesn't do this when we run it on Windows" (this one always pisses me off, because they can never explain why, and that's because they know nothing about Unix). How come I can't hire someone who could understand a whole vertical stack (and maintain it, and provide analysis and fixes when something breaks)?
I do this kind of thing now. If I can do it, it can't be that hard. But everybody thinks they have to specialize. THIS IS WHAT'S WRONG WITH THE INDUSTRY.
In the course of every project, it will become necessary to shoot the scientists and begin production.
I'm not sure that this has to do with a low-level/high-level language debate any more. Consider, for example, that C++ offers both very low and very high-level semantics. When properly used, this yields high-level programs with excellent performance.
But, so what? Neither C++ today nor any other very widely used programming language adequately manages the real problem, which is concurrency.
Herb Sutter has written an excellent paper on this topic, called "The Free Lunch is Over". Let's get off this hobby horse and on to some real (and interesting) problems!
Here is Mr. Sutter's article: http://www.gotw.ca/publications/concurrency-ddj.htm
The quote is utter rubbish. ... With astronomy you have stars, which aren't man made ... Computers and Computer Science are both things that are entirely man-made. There is no natural phenomenon that we call 'computer' and a science that studies this natural phenomenon called "computer science".
Not. Even. Wrong.
If astronomy was called "telescope science" you'd also forget that it was about ways of looking at the skies. Computers are more flexible than that - they are used to model and study all kinds of natural phenomena. Algorithms are strictly speaking mathematics, which is a feature of the universe and not "man made" if anything ever was. Computers are used to store and manipulate data about all kinds of things, most of which are not about computers. Learning how to do all that is computer science.
My Karma: ran over your Dogma
StrawberryFrog
C was a reaction to both BCPL (via the language B) and PL/I. The UNIX designers, Thompson, Ritchie, et al., came from the Multics project, and Multics was mostly written in PL/I with some BCPL. Most (but not all) of the BCPL code was written by the Bell Labs members of the Multics team. BCPL was a completely type-less language. C introduced a few rudimentary types.
By contrast, PL/I had a much more complete type system, although it was not even close to "strongly typed".
PASCAL was still very very new when C was designed.
In particular, PL/I strings and arrays were first class data types with compiler-known lengths, and buffer overflows were MUCH MUCH less common (not impossible - just much less common).
Full PL/I was an enormous language and hard to compile, but the ANSI G subset was actually quite reasonable and not hard to compile for. The DEC PL/I (ANSI G subset) and C compilers for the VAX used the same code generator back-end (written by Dave Cutler who also designed RSX-11/M, VMS, and Windows NT), but the PL/I compiler produced better code for string and array handling, precisely because the compiler knew more about what the programmer actually intended. It could take better advantage of the VAX instruction set, particularly for strings of maximum known length. String instructions, such as on the VAX or the IBM System/360 could easily handle PL/I strings, but null-terminated C strings were much harder to compile for. This is not surprising, since IBM designed PL/I as a language for the System/360.
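The difference is easy to see even in C itself. Below is a small sketch, with hypothetical names (`counted_str`, `counted_copy`, `cstr_copy`) and an arbitrary 64-byte capacity chosen for illustration, comparing a string that carries its length with a NUL-terminated one. It is not how PL/I or the DEC compilers represented strings internally, only an approximation of the idea.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length travels with the data,
   so a copy can be a single block move of known size. */
struct counted_str {
    size_t len;        /* number of bytes in use, assumed <= 64 */
    char   data[64];
};

/* Copy with a known length: no scanning for a terminator. */
void counted_copy(struct counted_str *dst, const struct counted_str *src)
{
    memcpy(dst->data, src->data, src->len);
    dst->len = src->len;
}

/* C-string version: the source has to be scanned for its NUL before
   we even know how much to move. Returns 0 on success, or -1 if the
   string (plus terminator) would not fit in dst. */
int cstr_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);          /* extra pass over the data */
    if (len + 1 > dst_size) {
        return -1;
    }
    memcpy(dst, src, len + 1);         /* include the terminator */
    return 0;
}
```

With the counted form the length is sitting right there for a block-move instruction; with the C form somebody, compiler or library, has to go find the end first.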