Old-School Coding Techniques You May Not Miss
CWmike writes "Despite its complexity, the software development process has gotten better over the years. 'Mature' programmers remember manual intervention and hand-tuning. Today's dev tools automatically perform complex functions that once had to be written explicitly. And most developers are glad of it. Yet, young whippersnappers may not even be aware that we old fogies had to do these things manually. Esther Schindler asked several longtime developers for their top old-school programming headaches and added many of her own to boot. Working with punch cards? Hungarian notation?"
For some reason the article says that only variables beginning with I, J, and K were implicitly integers in Fortran. Actually, it was I through N.
Good Hungarian notation does exactly that, actually. Check out Apps Hungarian, which encodes the semantic type of the data, rather than the language-level data type.
Of course stupid Hungarian notation is stupid. Stupid anything is stupid. Problem is, most people don't hear about the right approach.
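To make the Apps Hungarian idea concrete, here is a minimal sketch. It assumes the commonly cited "us" (unsafe, raw user input) and "s" (safe, HTML-escaped) prefixes; the function names are made up for illustration. Both kinds of value are plain std::string, so the compiler can't tell them apart, but the prefix can.

```cpp
#include <cassert>
#include <string>

// Apps Hungarian sketch: the prefix records what the value MEANS, not its C++ type.
// "us" = unsafe (raw user input), "s" = safe (HTML-escaped). Illustrative names only.

// Converts an unsafe string to a safe one by escaping HTML metacharacters.
std::string sFromUs(const std::string& usInput) {
    std::string sOut;
    for (char ch : usInput) {
        switch (ch) {
            case '<': sOut += "&lt;";  break;
            case '>': sOut += "&gt;";  break;
            case '&': sOut += "&amp;"; break;
            default:  sOut += ch;      break;
        }
    }
    return sOut;
}

// Expects an "s" value; passing a "us" variable here looks wrong at the call site.
std::string sRenderGreeting(const std::string& sName) {
    return "<p>Hello, " + sName + "</p>";
}

void demoHungarian() {
    const std::string usName = "<script>";      // straight from the user: unsafe
    const std::string sName = sFromUs(usName);  // converted: safe
    assert(sName == "&lt;script&gt;");
    assert(sRenderGreeting(sName) == "<p>Hello, &lt;script&gt;</p>");
}
```

A line like sRenderGreeting(usName) would compile fine, which is the whole point: the mismatched prefixes are what flag it during review.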
First off, most of the things on the list haven't gone away, they've just moved to libraries. It's not that we don't need to understand them, it's just that not everyone needs to implement them (especially the data structures one: having a pre-written one is good, but if you don't understand them thoroughly you're going to have really bad code).
On top of that, some of their items
*Memory management- still needs to be considered in C and C++, which are still top 5 languages. You can't even totally ignore it in Java- you get far better results from the garbage collector if you null out your references properly, which does matter if your app needs to scale.
I'd even go so far as to say ignoring memory management is not a good thing. When you think about memory management, you end up with better designs. If you see that memory ownership isn't clear-cut, it's usually the first sign that your architecture isn't correct. And it really doesn't cause that many errors with decent programmers (if any; memory errors are pretty damn rare even in C code). As for those coders who just don't get it, I really don't want them on my project even if the language doesn't need it. If you can't understand the request/use/release paradigm you aren't fit to program.
*C style strings
While I won't argue that it would be a good choice for a language today (heck, even in C, if it wasn't for compatibility I'd use a library with a separate pointer and length), it's used in hundreds of thousands of existing C and C++ libraries and programs. The need to understand it isn't going to go away anytime soon. And anyone doing file parsing or network IO needs to understand the idea of terminated data fields.
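For anyone who hasn't had to deal with both styles, here is a rough sketch of the difference between a terminated string and a "separate pointer and length" one. The Field struct and firstField helper are hypothetical names for illustration, not from any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical "separate pointer and length" string: no terminator needed,
// and the length is known without scanning.
struct Field {
    const char* data;
    std::size_t len;
};

// Extracts the first terminated field from a buffer of known size.
// If the terminator is missing, the whole buffer is returned as the field.
Field firstField(const char* buf, std::size_t size, char terminator) {
    const void* hit = std::memchr(buf, terminator, size);
    if (hit == nullptr) return Field{buf, size};
    return Field{buf, static_cast<std::size_t>(static_cast<const char*>(hit) - buf)};
}

void demoFields() {
    const char request[] = "GET /index.html";   // C string: '\0' marks the end
    const std::size_t n = std::strlen(request); // strlen has to walk every byte
    Field verb = firstField(request, n, ' ');   // field terminated by a space
    assert(verb.len == 3 && std::memcmp(verb.data, "GET", 3) == 0);

    const char raw[] = {'a', '\0', 'b'};        // binary data with an embedded zero byte
    assert(std::strlen(raw) == 1);              // a C-string view stops at the first '\0'
    assert(firstField(raw, sizeof raw, '\n').len == 3); // pointer + length keeps all 3 bytes
}
```

The same "scan for the terminator, but never past the known size" pattern is what file and network parsers end up doing, whatever the terminator happens to be.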
I still have more fans than freaks. WTF is wrong with you people?
If you're going to talk about old school, you gotta mention Mel.
The determined Real Programmer can write Fortran programs in any language.
Self-modifying code
Yup, I actually write asm code.. plus he mentions "modifying the code while it's running".. if you can't do that, you shouldn't be wielding a debugger.
Code that generates code is occasionally necessary, but code that actually modifies itself locally, to "improve performance", has been obsolete for a decade.
IA-32 CPUs still support self-modifying code for backwards compatibility. (On most RISC machines, it's disallowed, and code is read-only, to simplify cache operations.) Even superscalar IA-32 CPUs support it, but the performance is awful. Here's what self-modifying code looks like on a modern CPU:
Execution is going along, with maybe 10-20 instructions pre-fetched and a few operations running concurrently in the integer, floating point, and jump units. Alternate execution paths may be executing simultaneously, until the jump unit decides which path is being taken and cancels the speculative execution. The retirement unit looks at what's coming out of the various execution pipelines and commits the results back to memory, checking for conflicts.
Then the code stores into an instruction in the neighborhood of execution. The retirement unit detects a memory modification at the same address as a pre-fetched instruction. This triggers an event which looks much like an interrupt and has comparable overhead. The CPU stops loading new instructions. The pipelines are allowed to finish what they're doing, but the results are discarded. The execution units all go idle. The prefetched code registers are cleared. Only then is the store into the code allowed to take place.
Then the CPU starts up, as if returning from an interrupt. Code is re-fetched. The pipelines refill. The execution units become busy again. Normal execution resumes.
Self-modifying code hasn't been a win for performance since the Intel 286 (PC-AT era, 1985) or so. It might not have hurt on a 386. Anything later, it's a lose.
This is one of my favourite quotes:
"The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet." - Michael A. Jackson
That being said, when I hit the experts only situation I can usually get 2 orders of magnitude improvement in speed. I just then have to spend the time to document the hell out of it so that the next poor bastard who maintains the code can understand what on earth I've done. Especially given that all too often I am this poor bastard.
All too often, people get the wrong view of a program using the debugger, either because it's not showing what's really there, or because they're not interpreting it right. If you think something's wrong based on what you see in the debugger, write a test program first. More often than not, the test program will pass. After all, the compiler's job is to output code which meets the language specification regarding side-effects, not to make things look right in the debugger. In this case, the developer should have written a simple test which inserted two different values that had the same hash code, and verified that he really could only access one of them in the container. He would have found that they were both still there.
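The kind of test I mean takes a couple of minutes to write. This is only a sketch: it uses std::unordered_map as a stand-in for whatever container was involved, and ConstantHash is a made-up hash that forces every key to collide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Deliberately terrible hash: every key gets the same hash code, so all keys collide.
struct ConstantHash {
    std::size_t operator()(const std::string&) const { return 42; }
};

// Inserts two distinct keys that share a hash code and reports how many entries survive.
std::size_t entriesAfterCollision() {
    std::unordered_map<std::string, int, ConstantHash> table;
    table["first"] = 1;
    table["second"] = 2;
    assert(table.at("first") == 1);   // both keys are still reachable...
    assert(table.at("second") == 2);  // ...even though their hash codes are identical
    return table.size();
}
```

If entriesAfterCollision() returns 2, the container is fine and the problem is in how the debugger view was being read.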
Even mundane programmers would not dream of using a generic library that includes sorts they'll never refer to in, say, an e-mail client or a game. They'll write their own.
Erm, why the hell not? Good programmers, even the best programmers (in fact especially the best programmers), will just use qsort() (or the equivalent for the language they're using). Then, IF performance on their lowest-spec target hardware is unacceptable, they will profile their code and find out what's taking the time. And then, IF it's the sorting algorithm that's the bottleneck, only THEN will they implement a more specific version. Anything else is a waste of time and an additional risk of introducing unnecessary bugs.
Unless we're really pushing the boundaries (and those boundaries are so far away with modern computers that 99% of applications can't even SEE them from their cosy seat in the middle of userland) the stock sorting algorithm your language provides will be plenty fast enough. If you're using a high-level interpreted language, you'll never beat it in efficiency.
What you're saying may have been true 15, or even 10 years ago, but it's certainly not true now.
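For reference, "just use the stock sort" amounts to something like the sketch below, shown with both C's qsort() and the C++ equivalent. The wrapper function names are only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Comparison callback in the form qsort() expects.
int compareInts(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);   // avoids the overflow that "x - y" can cause
}

// Sorts a copy of the input with the C library routine.
std::vector<int> sortedWithQsort(std::vector<int> values) {
    if (!values.empty())
        std::qsort(values.data(), values.size(), sizeof(int), compareInts);
    return values;
}

// Sorts a copy of the input with the C++ standard algorithm.
std::vector<int> sortedWithStdSort(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}

void demoSort() {
    const std::vector<int> input = {3, 1, 2};
    const std::vector<int> expected = {1, 2, 3};
    assert(sortedWithQsort(input) == expected);
    assert(sortedWithStdSort(input) == expected);
}
```

Only if a profiler points at one of these calls is it worth thinking about a hand-written replacement.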
Rampant carbon sequestration destroyed the Dinosaurs' tropical paradise. I'm here to help repair the damage.
The reason people say C++ is slower and uses more memory is because it is. Not due to the language itself (except for one case), but due to how people use it and the mistakes they make:
1)RTTI and exceptions- very slow. If you use them you will be slower than C. Of course most embedded systems avoid them like the plague. (This is the one case where it's a language fault)
2)Passing objects. It's a frequent mistake to pass the object itself rather than a const object&, causing extra constructors and destructors to be called. Honest but costly mistake.
3)The object oriented model and memory. In an object oriented model you tend to do a lot more memory copying. In C, if you have an OS function that returns a string (a char*), you'll generally save that pointer somewhere, use it directly a few times, then free it. In C++ you'll take it, insert it into a string object (which will cause a copy), pass that object around (and even by reference that's less efficient than using a char* directly), probably call c_str() on it if you need to pass it back to the OS, then finally let the destructor free it. More time.
4)The object oriented model and hiding complexity- it can be very easy in an object oriented system to forget the true cost of an operation. Programmers think of x=y as a cheap operation, like it is with ints. With objects, it may be very expensive. Same with other operations that happen "automatically" like string concatenation using +. It can be easy to write code that doesn't look too bad, but really takes thousands of cycles.
5)Constructors, copy constructors, and operator =- some of these can be called in very unusual places, especially when objects are being passed to and/or returned from a function. Read Scott Meyers for a list of all of them. If you had a function that was passed in two Foo objects, manipulated them, created a new Foo object, set it equal to one of the two passed in, and returned that Foo object, I doubt 1 in 10 programmers would correctly guess all of the times these would be called (and I'm not that 1- it's been way too long since I studied the issue). In C these would be at worst 4 memcpys (two for pass in, 1 for assignment, 1 for return). So C++ object quirks can eat up a lot of time in these situations.
All that doesn't mean you shouldn't use C++. But due to it you won't get the sheer execution speed you would in C.
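The pass-by-value cost from point 2 is easy to measure yourself. A minimal sketch, using a hypothetical Foo that counts its own copy-constructor calls (the counter and helper names are assumptions for illustration):

```cpp
#include <cassert>

// Hypothetical class that counts how often its copy constructor runs.
struct Foo {
    static int copies;
    int value = 0;
    Foo() = default;
    explicit Foo(int v) : value(v) {}
    Foo(const Foo& other) : value(other.value) { ++copies; }
    Foo& operator=(const Foo&) = default;
};
int Foo::copies = 0;

int readByValue(Foo f) { return f.value; }            // copies the caller's object
int readByConstRef(const Foo& f) { return f.value; }  // binds to it, no copy

// Number of copy-constructor calls made when an existing Foo is passed by value.
int copiesWhenPassedByValue() {
    Foo foo(7);
    Foo::copies = 0;
    readByValue(foo);
    return Foo::copies;
}

// Number of copy-constructor calls made when an existing Foo is passed by const reference.
int copiesWhenPassedByConstRef() {
    Foo foo(7);
    Foo::copies = 0;
    readByConstRef(foo);
    return Foo::copies;
}

void demoCopies() {
    assert(copiesWhenPassedByValue() == 1);
    assert(copiesWhenPassedByConstRef() == 0);
}
```

One copy of an int-sized Foo is nothing, but the same single copy of an object that owns a large buffer means an allocation plus a memcpy on every call.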
or that a for loop should be processed with >= 0 where possible (or split into a while loop) to reduce computation time.
This is an obfuscating micro-optimization with pitfalls (e.g. unsigned is always >= 0) and should not be a general rule. In many cases the compiler will do any optimization here automatically, and in other cases you need to profile first to make sure this is the bottleneck before obfuscating the code.
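The unsigned pitfall is worth spelling out. A sketch, assuming you want to walk a vector back to front with a std::size_t index (sumBackwards is just an illustrative name):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a vector back to front with an unsigned index.
// The tempting form "for (std::size_t i = n - 1; i >= 0; --i)" never terminates:
// an unsigned i is always >= 0, and decrementing 0 wraps around to a huge value.
long sumBackwards(const std::vector<int>& values) {
    long total = 0;
    // Test, then decrement: the body sees i = n-1 down to 0 and the loop ends cleanly,
    // including when the vector is empty.
    for (std::size_t i = values.size(); i-- > 0; ) {
        total += values[i];
    }
    return total;
}

void demoSumBackwards() {
    assert(sumBackwards({1, 2, 3}) == 6);
    assert(sumBackwards({}) == 0);
}
```

The plain forward loop is easier to read, and the compiler is free to generate the same countdown code on its own if that turns out to be faster.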