As Languages Evolve...
naph writes "It seems that as programming languages have developed there has been a steady increase in the level of abstraction they use. Early languages were all very low-level, but successive generations have become higher and higher. Is this trend going to continue, or do you think we've reached a kind of happy medium between power and abstraction? Would developers prefer higher level languages, or is the direct control of things good? I was just wondering what other developers out there thought of this."
As I recall from my software engineering class, programmers produce lines of code at the same rate regardless of the language (I believe IBM did the study in the '80s, but don't quote me on that). Therefore, programming languages SHOULD be more abstract to increase productivity. It also comes down to the "reinventing the wheel" factor. The more bug-free features/libraries we can stuff into a language, the quicker we can produce bug-free code. The only problem, of course, is that abstraction comes at the cost of speed. How much more enjoyable is it to program in Java and not have to worry about cleaning up memory than in, say, C or even assembly, where everything is a battle? I don't know about you, but I would much rather type create_new_window() than worry about framebuffers and things of that nature. Hopefully this can be accomplished while keeping speed up and code bloat down.
Two words: Moore's Law. Even heavy abstraction will not keep up with the speed increases from that. And hardware is cheap. IT guys that program app servers aren't. Instead of paying an IT crew a lot of money to optimize a server you could just invest in more/better hardware. And size is not an issue any more as far as code size goes. The data most programs sift through is usually the only thing to consider as far as storage goes, since usually the data is so much larger than the program anyways.
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
I think the hard tie to a single language for a project is slowly going away. I think you are going to see more and more projects done in 3+ languages. You build your first revision in a scripting language (Python) and have it call your database (PL/SQL); then, once most everything is working, you go in and check how it performs. You profile the slow bits and port them over to a quicker language (C++), using a tool to help you tie it all together (SWIG).
I could have written this little made-up example using (Perl), (XSLT+XML), (C), (h2xs) or one of a dozen other combos.
The power of using a scripting language as a major component of your project is that you get rapid prototyping and easy extensibility. The advantage of using a lower-level language is speed and "access" to APIs of hardware you might need. Why anyone would feel the need to limit themselves to one group is beyond me.
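To make the "profile the slow bits" step concrete, here is a minimal sketch using Python's standard cProfile and pstats modules. The functions slow_checksum, handle_request and profile_report are made up for illustration; in a real project the hot spot that shows up at the top of the report would be the candidate for a C++ rewrite.

```python
import cProfile
import io
import pstats


def slow_checksum(data):
    # Hypothetical hot spot: a pure-Python loop over every byte.
    total = 0
    for byte in data:
        total = (total * 31 + byte) % 65521
    return total


def handle_request(payload):
    # Hypothetical application code: cheap bookkeeping plus one expensive call.
    header = payload[:4]
    return header, slow_checksum(payload)


def profile_report(func, *args):
    """Run func under cProfile; return its result and the stats text."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    buffer = io.StringIO()
    # Sort by cumulative time so the most expensive call chains come first.
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _result, report = profile_report(handle_request, bytes(range(256)) * 2000)
    print(report)
```

Only the functions the report singles out need porting; everything else can stay in the scripting language.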
The progressive abstraction of computer languages slowed to a crawl a long time ago. OO has been around for a very long time, just not necessarily in the form of C++. Essentially the CS world has settled on some mutual understanding that the range of abstraction around C++, Java, and Perl is a pretty good place to be, depending on how OO and whatnot you want to be (and of course we will always have the ever-enduring C for simpler and systems programming), and we can't seem to come up with anything better that's got more useful abstraction than that.
There's been a dream of a useful and successful 4GL for many, many years now, and from time to time someone claims they've done it, but it's a shoddy system that isn't flexible enough and is too proprietary (comes with its own crappy OS just for that programming language, etc). A 4GL (4th generation language) is supposed to abstract away the need for code altogether, or at least try to. In my idea of a proper 4GL, programming would consist of composing one well-structured XML document describing the objects your problem domain deals with, what they can do, and your business rules for dealing with them. It should be something a non-technical person who understands the business can write with a little help from a helper GUI. From there 3GL (C++, Java) source code, GUI elements in whatever, middleware servers, database design and SQL code should all spring forth on their own. But like I said, so far 4GL has been a pipe dream; we seem to have reached a point where it's going to be very difficult to get much further without figuring out true AI first, which is some ways off.
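As a toy illustration of the "XML document in, source code out" idea, here is a rough Python sketch. The spec format (the domain, object, field and rule elements and their attributes) is invented for this example and is nowhere near a real 4GL; it only shows the mechanical part, which is the easy bit.

```python
import xml.etree.ElementTree as ET

# Hypothetical spec format: element and attribute names are invented here.
SPEC = """
<domain>
  <object name="Customer">
    <field name="name"/>
    <field name="balance"/>
    <rule name="can_order" expr="self.balance &gt;= 0"/>
  </object>
</domain>
"""


def generate_source(spec_xml):
    """Turn the XML description into Python class source code."""
    lines = []
    for obj in ET.fromstring(spec_xml).findall("object"):
        fields = [f.get("name") for f in obj.findall("field")]
        lines.append(f"class {obj.get('name')}:")
        lines.append(f"    def __init__(self, {', '.join(fields)}):")
        for field in fields:
            lines.append(f"        self.{field} = {field}")
        # Each business rule becomes a method returning the rule expression.
        for rule in obj.findall("rule"):
            lines.append(f"    def {rule.get('name')}(self):")
            lines.append(f"        return {rule.get('expr')}")
        lines.append("")
    return "\n".join(lines)


def load_classes(spec_xml):
    """Execute the generated source and return the new classes by name.

    exec() on a spec is only acceptable in a sketch; never do this with
    untrusted input.
    """
    namespace = {}
    exec(generate_source(spec_xml), namespace)
    return {k: v for k, v in namespace.items() if isinstance(v, type)}


if __name__ == "__main__":
    print(generate_source(SPEC))
```

Generating the class is trivial; the hard part the post points at is getting GUI, middleware and database design out of the same description.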
11*43+456^2
On the contrary, I believe it is you who is being clever or naive. Embedded programming, device drivers, etc. account for maybe one tenth of one percent of all source code written today. I fear that I may even be overestimating that number.
For smaller projects, using low-level facilities can get you big gains. Even for larger projects, optimizing the hotspots in code with low-level constructs from within a high-level framework can show great results. In fact, this is how many high-level languages are implemented.
However, while the human mind is the best code optimizer out there, it is also the most frail and inconsistent. While you can make very tight routines in Assembly on a good day, what happens when you are sleep deprived, up against a hard deadline, and stuck trying to figure out why the program keeps crashing? You know, in the real world. Compilers, while maybe not producing the absolute best code for a particular instance, are very consistent about producing pretty damned good binary output billions of times, 24 hours a day, 7 days a week.
Code production time is also a factor. If you are working on a project that's a few tens of thousands of lines of code, Assembly language -- with a competent code author behind it -- can show amazing results over the Python version (for example). But the Python version was finished and debugged days or weeks before the Assembly language version was code complete.
Your C++ example is a bit misleading though. Chances are that you were looking at the assembler's output of iostreams. iostreams implementations, while getting better, are not anywhere near the small size of stdio.h or raw assembly output to the console. Then again, iostreams is far more portable and flexible than either of the previous two. It's a tradeoff, just like everything else in the world: convenience/specificity. Take out iostreams and replace it with a home-grown implementation or stdio.h and you'll notice some "tighter" code.
In addition, compilers keep getting better. A good optimizing compiler is nothing to sneeze at nowadays. As a whole, compilers are measurably better than five years ago, and worlds better than they were fifteen years ago. Some of the best human minds are writing general code optimizers out there today.
Add in the final tidbit that assembly isn't portable. If you are targeting a particular embedded platform with strict space requirements -- a small minority of all development projects out there -- C with Assembly fits the bill. As soon as your platform changes because a vendor went under or requirements demand a faster processor or whatever, all of that Assembly is basically useless. You might be able to use some of the same general algorithms, but you're basically talking about a rewrite. Then again, if you wrote some Assembly code that is more generic, it's not heavily optimized is it? It also doesn't work too well if your target includes multiple platforms from the start.
Want to write a portable, network-aware program? If you use C, be sure to eat your Wheaties in the morning, because you're gonna have to spend a while typing in all of the #ifdefs. But you'll just make it clean and compile for Linux, FreeBSD, Solaris right? What if Windows or BeOS are requirements for your project? #ifdef #ifdef #ifdef
OR!
You could write it in Perl, Python, Java, or any of the other "dirty" high level languages and worry about those clock cycles after you've profiled it and seen the need. This of course doesn't remove the need for proper design before you start, but we were talking about implementation.
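For contrast with the #ifdef route, here is a minimal sketch of the high-level route in Python, using only the standard socket and threading modules (socket.create_server needs Python 3.8 or later). The helper names serve_once and echo_roundtrip are made up for this example. The same source is meant to run unchanged on any platform Python supports, since the socket module hides the per-OS differences.

```python
import socket
import threading


def serve_once(server):
    # Accept a single connection and echo back whatever arrives, upper-cased.
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def echo_roundtrip(message):
    """Start a throwaway loopback server, send message, return the reply."""
    # Port 0 asks the OS for any free port; no platform-specific code needed.
    with socket.create_server(("127.0.0.1", 0)) as server:
        server.settimeout(5)
        port = server.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply


if __name__ == "__main__":
    print(echo_roundtrip(b"hello"))
```

This is a sketch for short messages only; a real server would loop on recv and handle more than one client.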
Remember: Premature optimization is the root of all evil.
- I don't need to go outside, my CRT tan'll do me just fine.
Computers double in complexity every 18 months. Programmer productivity doesn't. The only way to get programmer productivity to keep pace is by augmenting programmer intelligence with computer intelligence.
Only small and simple processing platforms are good candidates for an all-Assembly solution nowadays.
Show me someone who can effectively handle all of the ins and outs of modern Athlons, P4s, Xeons, and G4s, and I'll show you someone who's smart enough to spend their talents on writing compilers instead of individual applications.
I'm sorry, but normal humans cannot beat optimizing compilers on modern processors. Coding to the metal on the high end isn't a viable option anymore.
Not always. If the constant-factor cost of the O(N) algorithm is much higher, then for a particular range of N the O(N^2) algorithm will be faster.
While complexity analysis is extremely useful, never forget that ultimately the individual program, machine and input will draw the line between which algorithm is truly faster.
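A quick sketch of the constant-factor point, in Python. The cost functions and their constants (1000 units per element for the linear algorithm, 1 unit per pair for the quadratic one) are invented numbers chosen only to make the crossover visible; real constants come from measuring the actual program on the actual machine.

```python
def linear_cost(n, per_item=1000):
    # Hypothetical O(N) algorithm with a large constant factor per element.
    return per_item * n


def quadratic_cost(n, per_pair=1):
    # Hypothetical O(N^2) algorithm with a tiny constant factor per pair.
    return per_pair * n * n


def crossover(max_n=10**6):
    """Smallest n at which the quadratic algorithm costs more than the linear one."""
    for n in range(1, max_n + 1):
        if quadratic_cost(n) > linear_cost(n):
            return n
    return None


if __name__ == "__main__":
    print(crossover())  # 1001 with the made-up constants above
```

With these constants the "worse" O(N^2) algorithm is the cheaper one for every input of up to 1000 elements.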
While it's a general trend that higher abstraction results in lower performance, this isn't always the case. Sometimes you get your cake and eat it too. Templates in C++ are my favorite example... you can get much of the usefulness of virtual classes while not paying as high a price. OCaml would be another example. It's extremely fast, on par with g++, and yet allows a very high level of abstraction.
I think picking the right tool for the job and the coder/team is most important. This is why Java and C++, while covered with warts and sore spots, really are a good thing.
I'd rather use some indexing on such a large amount of data. Like building an array out of the first 4 letters and using that as an index for the rest of the data. That should give us a 4 Meg index file for the first search [2^(5 bits per letter * 4 letters)]. And that reduces the overall data file by a decent amount by reducing duplicates, so it's not a waste.
If it's compressed well it won't save much, but anyway.
But I wouldn't use a 386 for it, as it would require some EXCELLENT variable-length compression in order to fit the data on the hard drives for it to search.
400M names to search for...
If we assume the average name length is ~10 bytes.
[6 bits per letter.]
That would come to somewhat less than 4G. A P4 with 4G of RAM could fit that data in memory, and get it in 0.6 seconds on average, 1.3s worst case.
[Random search.]
Your 386 with binary search should require
log2(400 000 000) ≈ 29 random disk accesses.
That's ~1 second, with HD delays of the time.
So the P4 has the better average case while the 386 has the better worst case.
If the 386 cannot handle enough disks to hold the required data, you should add the network delay for the cluster that handles it, and suddenly your 386 with the algorithm is going to lose.
Isn't Moore's law wonderful? These days people can get far more RAM than the hard drive space of 15 years ago.
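For anyone who wants to check the back-of-the-envelope numbers above, here they are as a few lines of Python. The 4 bytes per index entry is my assumption (it is what makes a 2^20-entry index come out at 4 Meg); the other figures are taken from the post.

```python
import math

NAMES = 400_000_000          # records to search, from the post
AVG_NAME_CHARS = 10          # average name length, from the post
BITS_PER_CHAR_PACKED = 6     # packed encoding, from the post
INDEX_LETTERS = 4            # prefix length used for the index
BITS_PER_INDEX_LETTER = 5    # 5 bits distinguish up to 32 letters
BYTES_PER_INDEX_ENTRY = 4    # assumption: one 32-bit offset per prefix

# Index over the first four letters: 2^(5*4) possible prefixes.
index_entries = 2 ** (BITS_PER_INDEX_LETTER * INDEX_LETTERS)
index_bytes = index_entries * BYTES_PER_INDEX_ENTRY

# Packed size of all names, in bytes.
data_bytes = NAMES * AVG_NAME_CHARS * BITS_PER_CHAR_PACKED // 8

# Worst-case probes for a binary search over all names.
binary_search_probes = math.ceil(math.log2(NAMES))

if __name__ == "__main__":
    print(index_entries, index_bytes, data_bytes, binary_search_probes)
```

The packed data comes to 3,000,000,000 bytes, which is under 4 GiB, and the binary search needs at most 29 probes.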
Emacs is good operating system, but it has one flaw: Its text editor could be better.
This is FUD. There is far more consistency in Java compilers than C compilers in the real world. The difference in output of C compilers is far greater than the difference in output of Java bytecodes. As far as debugging system libraries and the language goes, in my years of Java programming I have come across bugs in the JVM implementation, and once(!!!) I even came across a compiler bug, but I was able to work around the problem in all cases. This is impressive considering that I have done Java development on OS/2, Windows, Linux, and Solaris. I have had far more headaches from C compilers (yes, I code in C as well -- learned it years before Java came along) when bouncing from platform to platform than I have ever had from Java bugs. Come to think of it, I've even come up against a C library bug or two. To suggest that higher level languages are somehow tainted in this respect and C or Assembly is the cleaner answer is laughable.
I misspoke. My intention was to say that iostreams are more portable than an Assembly solution and are more flexible than either. You are correct.
Why? Why does Java need a hack that -- irrespective of the language syntax -- drops in text segments like a preprocessor? Do you need it for plugging in a native implementation when available and a Java one when not? Or conditional compilation of some functionality? Java doesn't need a preprocessor for that. You can do it in Java and still take advantage of the Java syntax validators -- which are crippled by the use of a preprocessor. It isn't about macros being confusing (which would be a #define and not an #ifdef); it's about them being unnecessary and in this case harmful. Since we're talking about macros and Java, what good would they do? If I write a sufficiently small or simple method/function, the Java compiler will inline it for me. For that matter, so will the optimizing C compiler. #define is of limited use today.
And C is one of the only languages left that DOESN'T have a standard, portable networking API. Java, Python and Perl have all had one for years. Java recently gained a secondary API for non-blocking I/O that takes care of a lot of the speed issues plaguing network apps in the past. And it's supported by all current JVMs. And any Java programmer can look up how to use it in any recent Java tutorial or quick reference.
You want a library? So do I. Those libraries are called java.net, java.nio, and IO::Socket. How are they implemented? Probably in C and/or some Assembly. Do I care? Not if the API is stable and the speed fits my minimum requirements for a job (and they almost always do).
Nothing wrong with "testing out an implementation in a higher-level language" eh? What happens when that implementation is plenty fast and/or memory efficient enough to do the job? Why recode in C?
Java is crippled how? By its standard GUI libraries? How can you compare that to C's lack of any standard GUI library? As for the rest of Java, given the non-blocking I/O libraries, how is Java substantially slower? By all means, give me an example larger and more complex than "Hello World."
Study after study that I have read has demonstrated that algorithm makes far more difference than language in speed contests. The worst problems I have seen in Java code were when arrogant C programmers tried to code it like C and then whined about how it wouldn't work right for them. I have seen Java code where people have made classes called Get_Channels and made instances of those objects in order to call an instance method, instead of just making a static method and calling it off of the class definition. This is why Java has gotten a reputation for being slow in the last few years, not because of some inherent limitation of the platform. As far as naming conventions go, thank god there's a standard. Any Java programmer who follows the naming standard can be pretty confident that some other programmer will recognize and understand his constructs with a minimum of time and effort. It isn't until C programmers come in with their need to call classes get_channels that things go haywire.
Scope problems? What scope problems? Can you be more specific because I'm not even sure what you have a problem with here.
Yes, and people never bang their head against C or Assembly. I have had many more problems with the limitations of C in the past seven years than I have ever had with Java or Perl. Sometimes C fits the bill. When speed is truly an issue and every little cycle counts, sometimes C (or more often C++) comes to my rescue. For almost every other problem imaginable, C is my problem, not my solution.
Java and C++ have very well defined behavior for object instantiation. For someone who complains that Java programmers just aren't smart enough to handle preprocessor macros, you seem awfully dismissive of your own ability to see what happens in a higher level language when I and others barely even blink.
Very correct. There exists a project now with the goal of making it easy to create PHP websites. It is called Enzyme.
It is essentially some source code (the XML and templates) that compiles (gets looked at by the set of PHP scripts that write scripts) into PHP and is then run on the target machine. The "compiled" PHP is then again compiled and run when it is needed.
All this abstraction comes at a price. It does slow things down. The benefit is that once the "compiler" is written, many generic PHP sites can be created with just an afternoon of thinking about the requirements, writing the specs, and testing the system. The next day, off to the usability study, and within 2 or 3 days the customer has their site, ready to go.
The other benefit is that rather tedious but necessary steps can be tossed in quite easily. It's much easier to use cleartext passwords, but once someone has written the JavaScript and proper PHP to handle client-encrypted passwords, it gets incorporated into every project, not just the ones written by the guy who really knows his stuff.
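To give a feel for the "spec plus templates compiles to PHP" step, here is a tiny sketch in Python using the standard string.Template class. This is not how Enzyme itself works (I am not reproducing its formats here); the template text, the spec dict and compile_page are all made up for illustration.

```python
from string import Template

# Hypothetical page template. "$$" is how string.Template escapes a literal
# "$", which the generated PHP variables need.
PAGE_TEMPLATE = Template(
    "<?php\n"
    "// generated from spec: do not edit by hand\n"
    "$$title = '$title';\n"
    "$$fields = array($fields);\n"
    "?>\n"
)


def compile_page(spec):
    """Expand one site spec (a plain dict here) into PHP source text."""
    # Note: no quote escaping is done, so this sketch assumes trusted specs.
    fields = ", ".join(f"'{name}'" for name in spec["fields"])
    return PAGE_TEMPLATE.substitute(title=spec["title"], fields=fields)


if __name__ == "__main__":
    print(compile_page({"title": "Guestbook", "fields": ["name", "message"]}))
```

Once a generator like this exists, every new site is a new spec rather than new hand-written PHP, which is where the afternoon-to-spec turnaround comes from.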
Essentially what is happening is that really smart people are making it easier for less smart people to make computers do exactly what they want, and that is a good thing(tm).