The End of Native Code?
psycln asks: "An average PC nowadays holds enough power to run complex software programmed in an interpreted language which is handled by runtime virtual machines, or just-in-time compiled. Particular to Windows programmers, the announcement of MS-Windows Vista's system requirements means that future Windows boxes will laugh at the memory/processor requirements of current interpreted/JIT compiled languages (e.g. .NET, Java , Python, and others). Regardless of the negligible performance hit compared to native code, major software houses, as well as a lot of open-source developers, prefer native code for major projects even though interpreted languages are easier to port cross-platform, often have a shorter development time, and are just as powerful as languages that generate native code. What does the Slashdot community think of the current state of interpreted/JIT compiled languages? Is it time to jump in the boat of interpreted/JIT compiled languages? Do programmers feel that they are losing - an arguably needed low-level - control when they do interpreted languages? What would we be losing besides more gray hair?"
When your web-based-datastore gets 50,000 inserts per second, hovers between 15 and 20 billion rows and endures a sustained query rate of 43,000 queries per hour, tell me which part of it you want to coded in PHP.
As the overhead of interpreted languages gets smaller (through faster systems, JIT, and other optimizations), its inevitable that eventualy we'll all be using one (unless you are one of the few people who have to program the virtual machines, the JIT compilers, etc).
And this is a good thing, because it means more independance from certain CPU architectures.
Someday, you will be able to use any OS on any CPU and any Application on any OS. This is one step in that direction.
-- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
(a) 'loosing': oh jesus christ
(b) the obvious answer is that native vs interpreted is basically simply the balance of developer cost versus cost of end-user resources (ram, cpu, users time). Interpreted code is getting faster every day, no matter what "OMG JAVA IS SO SLOW DUDE" geniuses on the interweb tell you, but there'll always be problem spaces where a 5% speedup pays huge dividends.
there is no need to sign your posts. this isn't usenet. your username is right there above your post. stop it.
Yeah... people keep saying that. Okay, lets take the benchmark I hear about most: http://kano.net/javabench/ "The Java is Faster than C++ and C++ Sucks Unbiased Benchmark". Unbiased my foot. "I was sick of hearing people say Java was slow" is not a good way to start an unbiased benchmark. Lets have a few more problems:
Y'know, I think there's a reason for that...
Y'know, a couple of decades ago I was running non-native applications on a 7Mhz system with 1MB RAM (my old A500). They were fast, but not quite as fast as native. I'm now using a system in the region of 500 times faster, in terms of raw CPU, and with 2,048 times more memory. And y'know what, non-native stuff is fast, but not quite as fast as native. Something about code expanding to fill the available CPU cycles, methinks...
Don't you hate that answer?
Yes, we are seeing more development in non-native code but, it gets it's power from the underlying libraries and core code that is native. The line between them gets fuzzy when you toss in JIT and scripting to native code compilers. It really depends on the problem area. If I'm just parsing apart a bunch of log files to make reports Perl or Python would be the best. Web apps seem to benefit from the safety net of non native code but I'm sure there are exceptions.
OTOH there are plenty of apps that need all the speed and memory the machine can provide. My current job involves real time financial data delivery. Writing that in Python or Java would (probably) not work out too well. OS code that works directly with hardware will probably stay in assembler or C. Fast low level stuff is what allows the slower high level stuff to be useful.
Either way you still need to know what you're doing because in the end both native code and interpreted code run as opcodes on a CPU and use hardware resources. You need to mind memory use in Java just like C. Just in different ways. You've need to watch what you do in inner loops in both Python and C++. Linear lookups can cause scaling problems in Perl, Java, Python or C/C++.
It all depends on how fast you want to get from problem to solution, how much hardware you can throw at it, how complicated the problem is, how much time you have and many other factors.
Languages are tools, not a religion. The broader your knowledge the more tools you have at your disposal. Pick the best one for the job at hand.
The obscure we see eventually. The completely obvious, it seems, takes longer. - Edward R. Murrow
Ok, assuming the post isn't flamebait... This issue keeps coming up. A good programmer should understand that the language choice depends on the task at hand.
If you're making a pretty GUI, you may want to use an easy-to-use and portable language and may not care about performance as much. If you're creating a high-performance backend, or doing some realtime processing, an interpreted language is practially useless.
Before deciding which paradigm is superior, you must narrow down the question to a type of task. Since the variety of tasks we use software for does not seem to be shrinking, it seems that this issue will not be resolved decisively anytime soon.
You seem to be under the impression that these problems you cite display inadequacies in the hardware, rather than the software. But, in the words of some fictional professor from a book I can't remember: "If you speed up a dog's brain by a factor of a million, you'll have a machine that takes only three nanoseconds to decide to sniff your crotch." Given the current software and algorithms available, more computing power alone wouldn't solve any of the problems you describe.
You want the truthiness? You can't handle the truthiness!
Interpreted & JIT languages are "within a constant factor" of native code's speed, and CS students are taught that such things don't matter. ;-)
And for many types of apps, they really don't. Ten times slower than instantaneous, is instantaneous.
But people use computers for lots of things, and believe it or not, some of those things are still CPU-bound, and take so much work that humans can perceive the delay. Your word-processor is 99% idle so surely it doesn't need to be native, but you know that somewhere on this planet, a poor shmuck is staring at an hourglass icon, waiting for a macro to finish. The real question is: who cares? Is that guy's time worth more, or is the programmer's time worth more?
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
but the results are incomparably better
By what metric? Expressiveness? Ease of implementation? Ease of maintenance? Error rate? Because, last I checked, low-level languages like C fail on all those points compared to a higher-level language.
The problem isn't native-code vs interpretive code. It's that our native code languages are terribly flawed.
Programming backed itself into a corner with C and C++. They're useful languages, but they're not safe. Now this has nothing to do with performance; you can have safety in a hard-compiled language. Ada, the Modula family, and the Pascal/Delphi family did it. The problem is that, because of some bad design decisions in C (the equivalence of arrays and pointers being the big one), you have to lie to the language to get anything done. This makes safety hopeless. The basic problem of C is that you have to obsess on "who owns what" for memory allocation purposes, and the language gives you no help with this. The language doesn't even adequately address "how big is this". With those two design defects, we're doomed to have memory safety problems. Which we do.
C++ at first seemed like an improvement, but as it turned out, C++ adds hiding to C without improving safety. Note that this seems to be unique to C++; no prior language did that, and no language since has taken that route. Attempts have been made to work around the problem within the structure of C++, but with limited success. The "auto_ptr" debacle and the endless problems of trying to make sound reference-counted allocation work reliably indicate the fundamental limitations of the language. You just can't fix those problems in C++ without breaking backwards compatibility. (See my postings in comp.std.c++ over the last decade for more details on this.)
Java was invented mostly to get around the memory safety problems of C and C++. The fact that Java is usually semi-interpretive has nothing to do with the language design; that's a consequence of Sun's original focus on applets. There are native-code compilers for Java; GCC contains one. There are competitive advantages of locking the user into a giant environment (J2EE in the Java world, .NET in the Microsoft world), which is part of why we're seeing so much of that. But it's not a language design issue.
Microsoft came up with C# as their answer to Java, and most of the same issues as with Java apply.
What's so embarassing about all this is that it's quite fixable. The solutions were known twenty years ago. If you have a language where the language knows how big everything is, and the subscript checks are hoisted out of loops at compile time, you get safety with high performance. There were Pascal compilers that got this right in the 1980s.
On the allocation front, you can use either garbage collection or reference counting to automate that process. Java and C# are garbage-collected; Perl and Python are reference-counted, and in practice, programmers in those languages seldom have to think about memory allocation issues. Allocation overhead can also be hoisted out of loops. Java compilers are starting to do this, allocating temporary variables on the stack. Reference count updates can be optimized similarly. There's nothing to prevent using these techniques in a native-code compiler.
And that's how we got to where we are today, with buffer overflows, zombies, and blue screens of death, papered over with a layer of inefficient interpreters. Fortunately the hardware people have held up their end and made it possible to live with this, but we on the software side should have the understanding and grace to be embarassed by it.
One guess where 99% of the ccycles arae in that
I'll take a guess! And it's even the one you want me to guess. The db2 instance. That's the fucking *point*. The fast C code that's executing has already been written.. some of it is in the python interpreter, some it is in the ksh and php interpreters, most of it is in the db2 interpreter. Very fast algorithms doing what they do best: optimized, super fast loops operating on static types.
That is WHY python and other interpreted languages achieve the speed they achieve.. because what they do is allow you to glue together C code written by other people. And, because the Python code is much simpler, you can understand the interactions between the fast code more easily, and see where your code fails to perform well. It's always because you're putting loops together inefficiently and making poor design choices, not because of the speed of the interpreter--and now that your code is short enough for you to see that, you can fix it.
Your application logic doesn't need to be super fast. It needs to be super agile, so you can refactor and accommodate changing requirements and make smart decisions about which pieces you are going to use and how you are going to use them together.
C won't die, at least, not for a long, long time*, and that doesn't bother me, a hardcore Python programmer, in the least. Somebody has to do the dirty job of writing those fast loops. Meanwhile I'll be here zipping through the application implementation.
*It will eventually be replaced by Pyrex, of course.
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.