Old-School Coding Techniques You May Not Miss
CWmike writes "Despite its complexity, the software development process has gotten better over the years. 'Mature' programmers remember manual intervention and hand-tuning. Today's dev tools automatically perform complex functions that once had to be written explicitly. And most developers are glad of it. Yet, young whippersnappers may not even be aware that we old fogies had to do these things manually. Esther Schindler asked several longtime developers for their top old-school programming headaches and added many of her own to boot. Working with punch cards? Hungarian notation?"
Some of those are obnoxious, and it's good to see them gone. Others, not so much. For instance, sorting/searching algorithms, data structures, etc. Don't they still make you code these things in school? Isn't it good to know how they work and why?
On the other hand, yeah... fuck punch cards.
Heh, I had to turn in a punched card assignment in college (probably the last year THAT was ever required)... but I was smart enough to use an interactive CRT session to debug everything first... then simply send the corrected program to the card punch.
I was an early adopter of the "let the machine do as much work as possible" school of thought.
This issue is a bit more complicated than you think.
I don't get what the big deal is with Hungarian Notation. Why do people consider it a bad thing?
Modern IDEs might reduce the need for it, but not everyone uses an IDE to read or write code.
First of all, most of the actual practices mentioned are alive and well today -- it's just that most programmers don't have to care about them because someone else already did it. And some (systems and library developers) actually specialize in doing just those things. Just recently I had a project that almost entirely consisted of x86 assembly (though at least 80% of it was in assembly because it was based on very old code -- similar projects started now would be mostly in C).
Second, things like spaghetti code and Hungarian notation are not "old", they were just as stupid 20 years ago as they are now. There never was a shortage of stupidity, and I don't expect one any time soon.
Contrary to the popular belief, there indeed is no God.
Actually, the worst spaghetti code I have ever seen (in 30+ years, most of it in life-critical systems) is OO C++. It doesn't have to be that way, but I have seen examples that would embarrass the most hackish FORTRAN programmers.
I am alarmed at the religious fervor and non-functional dogma associated with modern programming practices. Even GOTOs have good applications - yes, you can always come up with some other way of doing it, but why, and with how much extra futzing? But it's heresy.
Brett
* Sorting algorithms
If you don't know them, you're not a programmer. If you don't ever implement them, you're likely shipping more library code than application code.
* Creating your own GUIs
Umm.. well actually..
* GO TO and spaghetti code
goto is considered harmful, but it doesn't mean it isn't useful. Spaghetti code, yeah, that's the norm.
* Manual multithreading
All the time. select() is your friend, learn it.
* Self-modifying code
Yup, I actually write asm code.. plus he mentions "modifying the code while it's running".. if you can't do that, you shouldn't be wielding a debugger, edit and continue, my ass.
* Memory management
Yeah, garbage collection is cheap and ubiquitous, and I'm one of the few people that has used C++ garbage collection libraries in serious projects.. that said, I've written my own implementations of malloc/free/realloc and gotten better memory performance. It's what real programmers do to make 64 gig of RAM enough for anyone.
* Working with punch cards
Meh, I'm not that old. But when I was a kid I wrote a lot of:
100 DATA 96,72,34,87,232,37,49,82,35,47,236,71,231,234,207,102,37,85,43,78,45,26,58,35,3
110 DATA 32,154,136,72,131,134,207,102,37,185,43,78,45,26,58,35,3,82,207,34,78,23,68,127
on the C64.
* Math and date conversions
Every day.
* Hungarian notation
Every day. How about we throw in some reverse polish notation too.. get a Polka going.
* Making code run faster
Every fucking day. If you don't do this then you're a dweeb who might as well be coding in php.
* Being patient
"Hey, we had a crash 42 hours into the run, can you take a look?"
"Sure, it'll take me about 120 hours to get to it with a debug build."
How we know is more important than what we know.
Try overlays...
Back in the day we had to do all the memory management by hand. Programs (FORTRAN) had a basic main "kernel" that controlled the overall flow and we grouped subprograms (subroutines and functions) into "overlays" that were swapped in as needed. I spent hours grouping subprograms into roughly equal sized chunks just to fit into core, all the while trying to minimize the number of swaps necessary. All the data was stored in huge COMMON blocks so it was available to the subprograms in every overlay. You'd be fired if you produced such code today.
Virtual memory is more valuable than full screen editors and garbage collection is just icing on a very tall layer cake...
You will never find a programming language that frees you from the burden of clarifying your thoughts.
http://www.xkcd.com/568/
The worst I saw in my ~25 years, and I include old COBOL and BASIC crap, was not spaghetti in the strict sense of the word. It was a 10000 line Java method written by a VB developer. There were no gotos, but the entire thing was ifs, switches and for loops nested over 10 layers deep. Oh, and you did read that right, it was a method - the entire class had a solitary static method full of copy and pasted chunks. He explained that it was OO because it was Java. I might forgive him if it was a gigantic nested unrolled loop that ran like stink, but it was slow and crash prone.
A bunch of gotos and gosubs are a pleasure to debug compared to that kind of poo, seriously.
No matter how nice a new paradigm comes along, there is always some idiot who can make it suck far, far more than the last paradigm.
Rewrote that as 10 classes of ~20 lines each; it ran faster and never died until it was told to.
I don't therefore I'm not.
A feature like intellisense isn't a feature to save typing time... its primary benefit is to save looking things up in a manual if one happens to not remember the exact spelling of some class member or function. If one knows exactly what one wants to type in the first place, it doesn't stop you, nor should it even slow you down, unless it's implemented poorly.
File under 'M' for 'Manic ranting'
There is no practical difference between OO code and structured code. The article assumed structured code means goto and gosub, but any Real Programmer knows that procedures (which are just gosubs by name rather than address) are still structured programming.
So what's OO? Each class is just a bunch of functions and procedures, with one entry point and one exit point for each - your standard structured programming methodology. The fact that there are different classes makes no difference. Calls between classes don't change the nature of a class any more than pipes between programs change the nature of programs.
I wasn't impressed by other claims, either. Garbage collection is still a major headache in coding, which is why there are so many debugging mallocs and so many re-implementations of malloc() for specialist purposes. Memory leaks are still far, far too common - indeed they're probably the number 1 cause of crashes these days.
Pointer arithmetic? Still very very common. If you want to access data in an internal database quickly, you don't use SQL. You use a hash lookup and offset your pointer.
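For anyone who hasn't done this by hand, here is a minimal sketch in C of a hash lookup that ends in a pointer offset, assuming a small fixed-size in-memory table with linear probing and no deletions. The names (struct record, TABLE_SLOTS, table_put, table_get) are made up for illustration and don't come from any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SLOTS 64u  /* hypothetical capacity; must be a power of two */

struct record {
    char key[16];  /* an empty string marks an unused slot */
    int  value;
};

/* FNV-1a string hash; any reasonable hash would do here. */
static uint32_t hash_key(const char *key)
{
    uint32_t h = 2166136261u;
    while (*key != '\0') {
        h ^= (unsigned char)*key++;
        h *= 16777619u;
    }
    return h;
}

/* Linear probing: returns the record holding key, or the empty slot
 * where key would go, or NULL if the table is full and key is absent. */
static struct record *find_slot(struct record *table, const char *key)
{
    assert(table != NULL && key != NULL);
    size_t start = hash_key(key) & (TABLE_SLOTS - 1u);
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        /* The "offset your pointer" step: base pointer plus slot index. */
        struct record *r = table + ((start + i) & (TABLE_SLOTS - 1u));
        if (r->key[0] == '\0' || strcmp(r->key, key) == 0)
            return r;
    }
    return NULL;
}

/* Insert or overwrite. Returns 0 on success, -1 if the key is empty,
 * too long for the record, or the table is full. */
static int table_put(struct record *table, const char *key, int value)
{
    assert(table != NULL && key != NULL);
    size_t len = strlen(key);
    if (len == 0 || len >= sizeof table->key)
        return -1;
    struct record *r = find_slot(table, key);
    if (r == NULL)
        return -1;
    memcpy(r->key, key, len + 1);  /* length was checked above */
    r->value = value;
    return 0;
}

/* Returns a pointer to the stored value, or NULL if key is absent. */
static const int *table_get(struct record *table, const char *key)
{
    struct record *r = find_slot(table, key);
    return (r != NULL && r->key[0] != '\0') ? &r->value : NULL;
}
```

Usage is just a zero-initialized array of TABLE_SLOTS records passed to table_put and table_get. A real internal database would also need deletion and resizing, which this sketch leaves out.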
Sorts? Who the hell uses a sort library? Sort libraries are necessarily generic, but applications often need to be efficient. Particularly if they're real-time or HPC. Even mundane programmers would not dream of using a generic library that includes sorts they'll never refer to in, say, an e-mail client or a game. They'll write their own.
One of the reasons people will choose a malloc() like hoard, or an accelerated library like liboil is that the standard stuff is crappy for anything but doing standard stuff. This isn't the fault of the glibc folks, it's the fault of computers for not being infinitely fast and the fault of code not being absolutely identical between tasks.
The reason a lot of these rules were developed was that you needed to be able to write reusable code that also had a high degree of correctness. Today, you STILL need to be able to write reusable code that also has a high degree of correctness. If anything, the need for correctness has increased as security flaws become all the more easily exploited, and the need for reusability has increased as code bases are often just too large to be refactored on every version. (Reusability is just as important between versions as it is between programs - a thing coders often forget, forcing horrible API and ABI breakages.)
The reason that software today is really no better, stability-wise, than it was 15-30 years ago is that new coders think they can ignore the old lessons because they're "doing something different", only to learn later on that really they aren't.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
The biggest "new" headache that will probably end up in such an article 20 years from now is web "GUIs", A.K.A. HTML-based interfaces. Just when I was starting to perfect the art of GUI design in the late 90's, the web came along and changed all the rules and added arbitrary limits. Things easy and natural in desktop GUI's are now awkward and backassward in a browser-based equivalent.
Yes, there are proprietary solutions, but the problem is that they are proprietary solutions and require users to keep Flash or Active-X-net-silver-fuckwhat or Crashlets or hacky JimiHavaScript up-to-date, making custom desktop app installs almost seem pleasant in comparison, even with the ol' DLL hell.
On a side note, I also came into the industry at the tail end of punched cards (at slower shops). Once the card copy machine punched the holes about 1/3 mm off, making them not read properly, but on *different* cards each pass thru. It's like including 0.5 with binary numbers, or 0.4999 and 0.5001 with a quantum jiggle.
Good Times
Table-ized A.I.
You're missing the point---it breaks when one of the variables is a reference to the other.
It's a neat algorithm, but the case in which it fails just goes to show that these skills aren't irrelevant. Yes, you should know what a reference is. Using your compiler and libraries as a crutch for your lack of understanding leads to unpleasant bugs.
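The parent post isn't shown here, so this assumes the algorithm in question is the XOR swap. Under that assumption, a minimal C sketch of the failure: when both pointers refer to the same object, the first XOR zeroes it and the value is gone. The function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* The "clever" swap with no temporary. If a and b point at the same
 * object, the first line computes x ^ x == 0 and the value is lost. */
static void xor_swap(unsigned *a, unsigned *b)
{
    assert(a != NULL && b != NULL);
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

/* The boring swap. Aliasing is harmless here. */
static void temp_swap(unsigned *a, unsigned *b)
{
    assert(a != NULL && b != NULL);
    unsigned tmp = *a;
    *a = *b;
    *b = tmp;
}
```

With distinct variables both functions behave the same. Calling xor_swap(&x, &x) leaves x at 0 whatever it held before, while temp_swap(&x, &x) leaves x unchanged.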
You don't need long division in normal life. Regardless of whether you are in a math heavy career or not, you aren't going to waste your time doing it by hand, you'll use a calculator which is faster and more accurate. However, you need to learn it. You need to understand how division works, how it's done. Once you learn it, you can leave it behind and automate it, but it is still important to learn. An understanding of higher level math will likely be flawed if basic concepts aren't learned properly.
The one application of "goto" that I swear by is for cleaning up allocations on failure when coding in C.
Maintaining a huge library of legacy C code, one of the most common bugs we see is leaks due to people using multiple "return" statements and failing to clean up allocations. You can fairly reliably pick such a function at random and find a memory leak: people always get it wrong.
"goto cleanup;" however, is hard to mess up.
I've seen any number of clever tricks to avoid the "goto". Using "break" statements in a do {} while (0) loop, for example. All of them merely obfuscate the code, and make it more likely for bugs to appear.
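A minimal sketch of the "goto cleanup" pattern described above, assuming a function with several allocations that can each fail. join_words and dup_string are hypothetical helpers invented for this example; the copies stand in for whatever real work would need scratch buffers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap copy of s, or NULL if malloc fails. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Builds "<a> <b>" in a fresh allocation stored in *out.
 * Returns 0 on success, -1 on failure (and *out stays NULL). */
static int join_words(const char *a, const char *b, char **out)
{
    int rc = -1;  /* assume failure until every step has succeeded */
    char *left = NULL, *right = NULL, *joined = NULL;
    size_t len;

    assert(out != NULL);
    *out = NULL;

    if (a == NULL || b == NULL)
        goto cleanup;

    left = dup_string(a);
    if (left == NULL)
        goto cleanup;

    right = dup_string(b);
    if (right == NULL)
        goto cleanup;

    len = strlen(left) + 1 + strlen(right) + 1;
    joined = malloc(len);
    if (joined == NULL)
        goto cleanup;

    snprintf(joined, len, "%s %s", left, right);
    *out = joined;
    joined = NULL;  /* ownership moved to the caller; skip the free below */
    rc = 0;

cleanup:
    /* Single exit point: free(NULL) is a no-op, so every path is safe. */
    free(joined);
    free(right);
    free(left);
    return rc;
}
```

Every early exit jumps to the same label, so adding a fourth allocation later means adding one free() in one place rather than auditing every return statement.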
My mother, who was programming before a fair few of us (including me) were born, once told me this: If you think you've found a bug in a compiler, or an operating system, or a programming language, or a well-known commonly used library... you're wrong.
:)
Of course, this doesn't hold true 100% of the time, especially when you're pushing the limits of new versions of large 3rd party libraries, but when one is just starting to program (and hence using very well known, well tested libraries and code) it's true 99.99% of the time.
(Oh, btw, I love your sig. Makes me laugh every time.)
Rampant carbon sequestration destroyed the Dinosaurs' tropical paradise. I'm here to help repair the damage.
Even more puzzling to me is how someone could decide to use a data structure without understanding its behavior (and without at least checking the Java APIs or simply Googling).
Easy. They learned that they should use *insert class here* in Intro to Programming 1 or 2 and never thought about it again since then. Horrendous overuse of StringBuilders is probably the most common example of this, but it can apply to just about anything.
Tomato wedge sperm darts that are Republican.
you get far better results from the garbage collector if you null out your references properly, which does matter if your app needs to scale.
You don't get any difference at all if you null out local variables. In fact, you may even confuse the JIT into thinking that the variable lifetime is larger than it actually has to be (normally, it is determined by actual usage, not by lexical scope).
It is well known that Michael Schumacher is NOT much of a car nut when it comes to the mechanics. How many world championships did he win? Oh, more than ANYONE ELSE?
You need to know about the network stack if it is your job to know about the network stack. If it isn't, you don't need to know about it. What good is it for someone who writes a music codec to know about the network? Parallel programming notepad?
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
>>If you think you've found a bug in a compiler, or an operating system, or a programming language, or a well-known commonly used library... you're wrong.
You apparently never tried doing template coding in C++ ten years ago. =)
Well, obviously all 3 above knew how to use a Hashtable or HashMap, but none of them knew what they really do, and all ended up trying to fix what's not broken.
But the real answer I'm tempted to give is more along the lines of the old wisecrack: In theory there's no difference between theory and practice. In practice there is.
In theory, people shouldn't know more than what collection to use, and they'll be perfectly productive without more than a Java For Dummies course. In practice I find that the people who understand the underlying machine produce better code. Basically that you don't need to actually program anything in ASM nowadays, but if you did once, you'll produce better code ever after. You don't need to chase your own pointers in Java any more, but you _can_ tell the difference between people who once understood them in C++ and those who still struggle with when "x=y" is a copy and when it holds only a reference to the actual object. You theoretically don't need to really know the code that javac generates for string concatenation, but in practice you can tell the difference in the code of those who know that "string1=string2+string3" spawns a StringBuffer too and those who think that spawning their own StringBuffer is some magical optimization. Etc.
And then there are those who are living proof that just a little knowledge is a dangerous thing. I see people all the time who still run into something that was true in Java 1.0 times, but they don't understand why or why that isn't so any more.
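As a simple example, I run into people who think that rewriting an ordinary bounds-checked for loop over an array (one that calls doSomething() on each element) as an endless loop that exits by catching ArrayIndexOutOfBoundsException is some clever optimization, and that it speeds things up because Java doesn't have to check the extra bounds on i any more.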
In reality it's dumb and actually slower, instead of being an optimization. Any modern JIT (meaning since at least Java 1.2) will see that the bound was already checked, and optimize out the checking in the array indexing. So you have exactly one bounds check per iteration, not two. But in the "optimized" version, it doesn't detect an existing check, so it leaves in the one at the array indexing. So you _still_ have one bounds check per iteration. It didn't actually save anything. But this time the exit is done via an exception, which is a much more expensive thing.
For bonus points, it introduces the potential for another bug: what if at some point in the future the doSomething() method throws its own ArrayIndexOutOfBoundsException? Well, they'll get a clean exit out of the loop without processing all values, and without any indication that an exception has occurred.
Such stuff happens precisely to people who don't understand the underlying machine, virtual or not.
A polar bear is a cartesian bear after a coordinate transform.
I think you've got the bar a little high there. I'd settle for not continuing to run into bugs that result because people wrote code that copies a string into a buffer without knowing if the buffer was big enough to hold the string. Or, not quite a bug, people who place arbitrary, and small, limits on the size of strings (or numbers) - cause god forbid that anyone have a name longer than 12 characters, or a feedback comment longer than 40 characters, or ...
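For reference, a minimal sketch in C of copying a string into a buffer while actually checking that it fits. copy_checked is a made-up name; it leans on the standard snprintf, which always terminates the destination when the size is nonzero and returns the length the full string would have needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst, whose total capacity (terminator included) is
 * dst_size. Returns 0 if the whole string fit, -1 if it was truncated.
 * Either way dst ends up NUL-terminated. */
static int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    int needed = snprintf(dst, dst_size, "%s", src);
    return (needed >= 0 && (size_t)needed < dst_size) ? 0 : -1;
}
```

With an 8-byte buffer, copy_checked(buf, sizeof buf, "short") returns 0, while a longer input returns -1 and leaves a truncated but terminated string, so the caller can decide whether to reject the input or grow the buffer.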
The tyrant will always find a pretext for his tyranny - Aesop
LOL, I used to believe that, but I can now reliably make SunPRO, GCC and MSVC miscompile things. SunPRO has a bug where it always considers children of friends to be friends. SunPRO occasionally constructs locals when an exception should have caused flow control to leave the block earlier. GCC insists on copying temporaries passed by const reference. SunPRO outright crashes when you try templating on a member function pointer type. MSVC incorrectly mangles names of symbols in anonymous namespaces contained within other namespaces. GCC won't find global operators inside a namespace that contains operators, even for completely unrelated types. Giving GCC the same specific register constraint for an input and output of an inline assembly block will cause miscompilation - you need to use numeric constraints. People say that I only find this stuff because I'm digging around in the dark corners of the language where no-one else goes. It still sucks to be tearing my hair out over it, though.
This might be okay if you are SO constrained you can't afford one register's worth of temp space, but if you're into performance, this is 4-8x slower than using a temp variable, in every language I've tried it on. Run your own benchmarks, see what I mean. Also, don't obfuscate your code, just to be "clever".
You may not be wrong, but you should exhaust all other possibilities first. I was working in a company where we found a bug in the floating-point calculation on the intel chip. http://en.wikipedia.org/wiki/Pentium_FDIV_bug
:)
Lots of people also found it. You can't even assume that your hardware is right.
(Oh, btw, damn your sig! I'm singing that song now!)
Johns: Well, how does it look now? Riddick: Looks clear.
Um, I'm pretty sure quicksort is still the go-to sort simply because it's the implementation that's built into almost every single programming environment. Then again honestly, I'd say that from the point of view of a pragmatic programmer... it doesn't matter. There's a built-in function (whether it's qsort() in the C standard library, or Arrays.sort() in Java, or whatever) that will take your array and return it, sorted. If your app runs too slow and you profile it and it turns out the speed problem is in the sorting AND you can't find a better algorithm that doesn't depend so much on sorting... THEN you look at optimising it. Never forget the two cardinal rules of optimising:
1) Don't optimise.
2) (Experts only:) Optimise later.
Or as I once read it eloquently expressed:
1) Make it work.
2) Make it work right.
3) Make it work fast.
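The "just call the built-in" route looks roughly like this in C, as a minimal sketch assuming a plain int array. compare_ints and sort_ints are names made up for the example; qsort() is the standard library call. Note that the C standard doesn't actually promise qsort() is a quicksort, only that it sorts.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort(): negative, zero, or positive.
 * Written as two comparisons so it can't overflow the way x - y can. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort an int array in ascending order using the library routine. */
static void sort_ints(int *values, size_t count)
{
    assert(values != NULL || count == 0);
    if (count > 1)
        qsort(values, count, sizeof values[0], compare_ints);
}
```

Sorting {42, 7, 19, -3, 7} this way gives {-3, 7, 7, 19, 42}. If profiling later shows the sort is the bottleneck, only sort_ints needs replacing.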
> Horrendous overuse of StringBuilders is probably the most common example of this
If you have a more efficient way to concat large strings of unknown length, I'd like to hear about it.
Although it's generally true that what I initially think is a compiler bug is almost always my fault, I definitely run into actual compiler or library bugs on occasion.
- I was working on some low-level boot loader assembly code, and found that "call esp" was not working as documented. It turns out that most modern Intel CPUs have a bug they haven't bothered fixing where it jumps to the value of ESP *after* pushing the return address. AMD processors don't do that, which is why it worked for me >_<
- I found that a memcpy() in our Win32 "vectored exception handler" was corrupting memory. The code that caused the exception was a memmove() that had decided to go in reverse to ensure a proper copy. It turns out that Win32's exception handler stub wasn't clearing the x86 direction flag before calling exception handlers, a violation of the Win32 calling convention. Compilers assume that the direction flag is clear at the start of a function, so a constant-sized memcpy() will frequently get inlined as simply "rep movsd".
- We found an extremely obscure bug in Visual C++'s compiler where it incorrectly zero-extended a pointer to 64 bits when it should have sign-extended as per the C standard. A global declaration like this was being used:
int var;
long long extended = (long long) &var;
If the base address of a 32-bit program is >= 0x80000000, as occurs in NT device drivers, the extension to 64-bit will be zero-extended instead of sign-extended. This differs from what happens in local variables.
In fact, getting this right is impossible with the relocations available to Win32 images and object files. The compiler should've thrown an error because it can't do the requested operation. (In C++ it could, but would have to make it a constructor.)
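To make the difference concrete, here is a small C sketch of zero-extension versus sign-extension on a 32-bit value that has its top bit set. It doesn't reproduce the compiler bug (that needs a 32-bit target and a high base address); it only shows what the two results look like on plain integers. The function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extension: the upper 32 bits of the result are always 0. */
static int64_t zero_extend32(uint32_t v)
{
    return (int64_t)v;
}

/* Sign-extension: the upper 32 bits copy bit 31 of the input.
 * Done arithmetically to avoid implementation-defined casts. */
static int64_t sign_extend32(uint32_t v)
{
    return (v & UINT32_C(0x80000000))
        ? (int64_t)v - INT64_C(0x100000000)
        : (int64_t)v;
}

/* Below 0x80000000 the two agree; at or above it they diverge. */
static void extension_demo(void)
{
    assert(zero_extend32(UINT32_C(0x00401000)) ==
           sign_extend32(UINT32_C(0x00401000)));
    assert(zero_extend32(UINT32_C(0x80001000)) ==
           INT64_C(0x0000000080001000));
    assert((uint64_t)sign_extend32(UINT32_C(0x80001000)) ==
           UINT64_C(0xFFFFFFFF80001000));
}
```

That is why the problem only shows up when addresses are at or above 0x80000000: for anything lower, both extensions produce the same 64-bit value.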
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
> My mother, who was programming before a fair few of us (including me) were born, once told me this: If you think you've found a bug in a compiler, or an operating system, or a programming language, or a well-known commonly used library... you're wrong.
So when MSVC prints "Internal Compiler Error" and stops compiling my code, I'm wrong? :)
Five years ago, it was easy to cause MSVC to crash & burn - lately their back-end compiler has gotten much better dealing with C/C++ code.
Why, why, why do people get SO offended when you tell them they have to learn computers to be good at computers?
Because you're essentially attacking them.
What if I responded to you by saying: "I'm sorry, but if you don't understand how flatly attacking people's qualifications for their job is insulting and threatening, you shouldn't be having this discussion. You simply don't have the interpersonal skills to articulate this kind of thing in a manner that would be productive, let alone persuasive."
Get your dander up at all?
And please don't hide behind the "I was just stating a fact, if the shoe fits, wear it." There are lots of good ways to say what you're trying to get at that are probably even closer to the truth.
Which is that you really don't have to learn *everything* about computers in order to be good at computers. It is certainly an underlying truth that the more you know, the better you are as a developer. But it's entirely possible to be a reasonably productive developer without knowing everything... as long as your abilities are matched to what you need to accomplish. And there's more or less a curve of task difficulty to go along with a curve of developer abilities.
I don't know very much about building compilers. Some people would say that makes me a mere dilettante of a software developer. That's a rash overstatement. It's absolutely true I would be a *better* developer if I knew more about these things, and certain problem domains would be more open to me, but there's a huge problem space that really doesn't require this knowledge. This works the other way, too: I probably know more about Linear Algebra and Discrete Mathematics than many developers and even some CS majors (studied Math in school) and I'm familiar with the Logic Programming paradigm (written full programs in Prolog). These things make me a better developer, particularly for some problem domains, but it certainly doesn't mean anyone who doesn't know these things is a simple hack.
I think implementing hashes and other primitives that are now part of libraries/languages falls in this category. Being able to implement them is certainly a *demonstration* that you've mastered certain skills. The inverse doesn't necessarily follow. Not ever having implemented them -- in particular because you've never had to -- doesn't necessarily imply that you lack the ability to solve that class of problem.
And in fact, it might demonstrate a certain stripe of wisdom: there's a limited amount of time and a pretty much infinite supply of problems. What do you spend time learning how to do?
Tweet, tweet.