How to be a Programmer
Martin L. Smith writes "Rob Read has posted his magnum opus, "How to be a Programmer: A Short, Comprehensive and Personal Summary" to Samizdat Press where it can be scarfed by the masses. Rob's book is a forty-page tour through the million-and-one things he thinks a programmer ought to know as he sets out into deep water. One of the reasons he posted this was to get some feedback, so tell him what you think. Samizdat Press is maintained by the Colorado School of Mines to provide a distribution point for free (mostly earth-sciences related) texts."
There is no substitute for experience, but there is something resembling a fast track.
Get paired to a senior programmer/systems engineer
If you have the opportunity to work with a senior on a one-to-one basis, grab it with both hands. There will not be many times when an experienced guy is willing to work with you or coach you, so rejoice when the opportunity presents itself, and take it. A colleague of mine asked me which project he should take: a glamorous one where he would be working in a large team with no coaching, or a boring-looking but difficult job, working under one senior programmer. I advised him to take the latter... which he did, and while he often complained about the job itself, his programming skills improved by leaps and bounds, which made him a senior programmer on the next assignment. I was glad to see he has taken it upon himself to teach in the same manner and spend lots of time with the junior guys.
If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
I especially like:
There is a lot of room for miscommunication about estimates, as people have a startling tendency to think wishfully that the sentence:
"I estimate it might be possible if I really understand that problem that it is about 50% likely to be completed in 5 weeks if no one bothers us in that time."
really means:
"I promise to have it all done 5 weeks from now."
Heh heh heh...
Another good reference for this type of info is The Pragmatic Programmer. It lays out how to write flexible, dynamic, and adaptable code, as well as how to avoid traps that a lot of new programmers fall into. It takes the time to explain the "whys" behind a lot of the engineering approaches advanced programmers take. It is definitely aimed at "junior" programmers, though. Usually when we get someone just out of college, I point them to this book.
Four years ago, I (Mechanical Engineer, major in Design Engineering) was involved in a bigger software project: building a modular simulation system for vehicles, based on a database and a multi-body code with output to Excel and lots of fancy stuff in between to make it all work. Since the customers and users were the people from our Design Dept, i.e. engineers, I assumed that they would have thought through all the specs and that we basically just had to start.
Big mistake! Good and great Design Engineers as they were in the mechanical and electrical domain, regarding software they were as clueless as any Marketing Drone. Whenever we tried to extract specifications, all we got was "make it work like that old APL code we have, but better and more modern, and let it calculate/simulate more correct results". Aaaarrrrggggghhh...
Needless to say, only very few people actually knew how the old system worked and under what assumptions it was built.
Well, we boxed our way through and today I am the only person in the company that has the total insight (the other 2 left). Unfortunately, we were never given time to properly document the system (of course the code itself is quite well documented, but there is more to do than just that). In my naïveté I thought that the Design Dept, with their fixation on drawings and Supplier Specs and Purchase Reservations and Engineering Change Notices, should understand the value of proper documentation...
A reflection I can now make: hiring us Design Engineers to do the work instead of professional Software Engineers was probably the only way for the company to get the job done within reasonable time & budget. Non-existent specs, poorly understood assumptions for certain calculations - what a nightmare for any professional software developer!
Yes, it is probably mistitled. I'd rather expand the essay to include other aspects of programming than change the title...but in the short term I'll do that, thank you.
The biggest clue that the writer has no clue about computer programming is his statement that 50 hour weeks are typical and 60 hour weeks are his limit. If you are writing code for more than about 2 hours a day, you are writing bad code that is horrible and buggy. I always try to explain what I do to people as very complicated math homework. No one can actually do math homework for 60 hours a week. It is far too draining.
The majority of most programmers' days at work is spent processing ideas in the back of their heads while they do other things (like post on Slashdot). The 2 typical tasks in programming, adding a new small feature to an existing program and fixing a bug, are about 100 lines of code and 2 lines of code respectively. These would take in theory half an hour and 2 minutes respectively. But as the old story goes, it's knowing which $1 component to replace in the $1,000,000 machine that costs the $10,000. So it is in programming.
Knowing how to integrate the new features and bug fixes without horribly ruining the existing design is the mark of a good programmer. Actually coding the fix or feature once it has been designed (on paper or in your head) is trivial. Overworking yourself leads to bad design and more bugs, which take even more of your overworked self to fix. This escalating behavior leads to burnout as well, as the human brain cannot spend that much time working on difficult problems every single day.
Anyhow, now that my brain has figured out how it wants to implement the new feature I'm working on, while writing this comment, it's back to work to toss out my 100 lines of well designed code. If my writing seems confusing or poorly structured, it's because my brain was working on code design, not paragraph design.
Here's some feedback.
...
... there is even more you can do. Move I/O handling to a separate thread (more on this in my next comment.)
...)
Re: Divide and Conquer debugging approach
Knowing *where* to split requires less skill than he suggests. While binary splitting is useful from an algorithmic point of view, in the arena of debugging there is no reason to be binary. I will typically split the problem many times (8 or more) at each step. This reflects the fact that the cost of splitting the code is usually much less than the cost of re-running the scenario to see which split it makes it past, or where it fails to run properly.
Neglecting examples in the debugging section is bad. In particular, mis-synchronization of multi-threaded applications is an example that should be shown.
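To make the multi-way split concrete, here is a minimal sketch in Python. It assumes a scenario made of numbered steps where a single run can report how many of several checkpoints were passed before the failure. The names locate_failure, run_with_checkpoints and make_scenario are hypothetical, invented for this illustration, and the "scenario" is simulated rather than a real program.

```python
def locate_failure(n_steps, run_with_checkpoints, fanout=8):
    """Find the first failing step using multi-way splits.

    run_with_checkpoints(points) runs the scenario once and returns how
    many of the sorted checkpoint positions were reached before failing.
    Returns (failing_step, number_of_runs).
    """
    lo, hi = 0, n_steps          # invariant: the failing step is in [lo, hi)
    runs = 0
    while hi - lo > 1:
        span = hi - lo
        # Up to fanout - 1 distinct checkpoints strictly inside (lo, hi).
        points = sorted({lo + span * i // fanout for i in range(1, fanout)} - {lo})
        passed = run_with_checkpoints(points)
        runs += 1
        if passed > 0:
            lo = points[passed - 1]   # last checkpoint we got past
        if passed < len(points):
            hi = points[passed]       # first checkpoint we never reached
    return lo, runs


def make_scenario(failing_step):
    """Simulated scenario (an assumption): fails while executing failing_step."""
    def run(points):
        return sum(1 for p in points if p <= failing_step)
    return run
```

With 1000 steps and an 8-way split, each run narrows the range by roughly a factor of eight instead of two, so far fewer re-runs of the scenario are needed than with a binary split.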
Re: 2.6 How to Optimize Loops
Ok, this is a really short list, and it misses the important principle of "caching", and some of the suggestions are wrong, or typically inconsequential.
1. Sometimes floating point can actually be faster than integer code. This is especially true if the code can be completely pipelined. In particular, trying to change from floating point to fixed point algorithms on modern CPUs may actually *decrease* performance. The details of this require a lot more discussion.
2. Inlining will be ineffective if the function routine is too large, or if the procedure prologue/epilogue cost is either low or unremovable.
3. Fold constants together -- you should be more explicit about what you mean here. Certainly sub-expression elimination is a common technique that usually works well (but compilers are pretty good at finding that for you), but in some CPUs like the x86, immediate absolute value operands are practically cost-free. Perhaps he means "hoist" whenever possible? That certainly does help.
4. As to moving I/O into a buffer
5. "Try not to divide" and "avoid expensive casts" require much more detail. The best thing to say here is that understanding these costs requires understanding the underlying machine code that results from these operations. (Floating point division can actually be relatively cheap in the right context, and differentiating between cheap and expensive casts can sometimes be difficult, and require context as well.)
6. Using pointers rather than indices -- x86's have sophisticated addressing modes wherein there is commonly no difference between these two alternatives.
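As an illustration of the "hoist" idea from point 3, here is a small sketch in Python. The functions scale_naive and scale_hoisted are made up for this example. An optimizing C compiler will often do this transformation on its own; the Python interpreter will not, so the difference is visible there.

```python
import math


def scale_naive(values, angle_deg):
    out = []
    for v in values:
        # The factor is recomputed on every iteration although it never changes.
        out.append(v * math.cos(math.radians(angle_deg)))
    return out


def scale_hoisted(values, angle_deg):
    # Loop-invariant expression hoisted out of the loop and computed once.
    factor = math.cos(math.radians(angle_deg))
    return [v * factor for v in values]
```

Both versions return the same results; the second simply does the trigonometry once instead of once per element.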
Re: 2.7 How to Deal with I/O Expense
An important principle to apply is to realize that parallelism via multithreading can substantially help with these problems. For example, if some I/O is non-negotiable or unpredictable, then at least it can be blocked on, or streamed, in a separate thread. The reasoning behind this is that modern operating systems can take program control (i.e., your execution resources) away from a slow-to-respond thread while it blocks and give it to the faster ones. So you can overlap all your algorithmic work with the delays while waiting for the data.
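A minimal sketch of that arrangement in Python, using only the standard threading and queue modules. The slow_source generator is a hypothetical stand-in for a disk or network read, and the squaring is a placeholder for the real algorithmic work.

```python
import queue
import threading
import time


def slow_source(n, delay=0.01):
    """Stand-in for a slow, unpredictable input (an assumption)."""
    for i in range(n):
        time.sleep(delay)        # simulates waiting on a disk or network
        yield i


def reader(source, q):
    """Runs in its own thread: blocks on I/O and hands items to the worker."""
    for item in source:
        q.put(item)
    q.put(None)                  # sentinel: no more data


def process_all(source):
    q = queue.Queue(maxsize=16)  # bounded, so the reader cannot run far ahead
    t = threading.Thread(target=reader, args=(source, q), daemon=True)
    t.start()
    total = 0
    while True:
        item = q.get()           # only waits when the reader has fallen behind
        if item is None:
            break
        total += item * item     # placeholder for the algorithmic work
    t.join()
    return total
```

While the reader thread sleeps inside the I/O call, the main thread is free to work on items already in the queue. In CPython this overlaps waiting time, not CPU time, which is exactly the case described here.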
Re: 2.8 How to Manage Memory
Something should be said about caching versus non-caching. First of all, point out that cached memory can be tens to hundreds of times faster than main memory (in modern CPUs). Variables on your local stack, and globals that are commonly used in your inner loops, will tend to be cached. However, array streaming will tend to de-cache your data.
Running through your streamed data in multiple passes is especially bad, as it will require reading your data into the cache multiple times.
Again much more can be said here.
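The point about multiple passes can be sketched structurally as follows. The function names are made up for this illustration. Note the assumption: the cache benefit shows up in compiled languages working on large arrays; in Python the built-ins in the first version are likely faster in practice, so this only shows the shape of the transformation, not a measured speedup.

```python
def stats_multi_pass(data):
    # Three separate traversals: the data is streamed through three times.
    return min(data), max(data), sum(data)


def stats_single_pass(data):
    # One traversal: each element is touched once while it is at hand.
    it = iter(data)
    first = next(it)
    lo = hi = total = first
    for x in it:
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
        total += x
    return lo, hi, total
```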
Re: 2.9 How to Deal with Intermittent Bugs
This is an important topic because it represents the hardest debugging problem. We all run into it sooner or later. Even if it is a hard subject to tackle, it has to be expanded on. Giving examples here is invaluable. You have to show that as hard as it is, it is possible to ferret out such bugs.
Re: 2.10 How to Learn Design Skills.
The biggest thing to explain here, I think, is that all code can and should have separate documentation corresponding to it, written *before* the actual code is written.
Re: 3.6 How to Work with Poor Code.
Remember that people may be more open, or willing to learn than you think. If you decide you have to recode something for someone, it may be beneficial to be explicit about this and show them the results. But for such a thing to be effective, and to get over any potential ego problems, you have to make sure the rewrite is absolutely, clearly, obviously better (it should be shorter and more easily readable.) Your goal should be to make sure the programmer that is the target of the rewrite, considers the results to be a better approach that is worth emulating themselves. (Give a man a fish
Section 3.7 needs to be tied to the last paragraph of section 2.1. Scribbling over some "pristine" code is irrelevant if you can easily recover it (which you can with good source control.)
Re: 3.8 Unit testing -- my experience with this is a bit depressing. Unit tests always start out being a good thing, but over time, they are an extreme PITA to maintain. Unit testing is a good thing for what I consider *totally generic modules*. The reason being that truly generic modules do not evolve over time, while other code invariably does.
Unit testing can only be effective if there is an enforced automated testing mechanism. I.e., a failure causes an automatic and non-negotiable rejection of code checked into the tree. I have found it remarkably difficult to convince people that such a policy is worthwhile. (SGI used to use such a mechanism, and, of course, it worked wonderfully for them.)
Sections 3.9 and 2.4 belong together. How is 3.9 a team skill?
Re: 5.2 How to Manage Third Party Software Risks
In my experience, this is trivial -- rely on track record. It's more indicative than anything else. If the software has already shipped and has a history, then there is no problem. If it has not yet shipped (and you are hoping that it will in time for you to use it), then you are going to get version 1.0 software at best, and more likely you are providing a beta test environment for the third party developer. Just put yourself in the shoes of the third party developer. In what way will they maximize the take away from their involvement in a relationship to sell you software? Remember that business relationships can tend to dominate technical ones.
Re: 5.4 How to Communicate the Right Amount
In here you write: It costs its duration multiplied by the number of participants. Please underline and boldface this. It amazes me how managers don't understand this.
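For managers who want it spelled out, the arithmetic is trivial. A sketch, where the hourly rate is an assumption added here purely for illustration and is not from the essay:

```python
def meeting_cost_person_hours(duration_hours, participants):
    """Cost of a meeting: its duration multiplied by the number of participants."""
    return duration_hours * participants


def meeting_cost_money(duration_hours, participants, hourly_rate):
    """Same cost in money, given an assumed average hourly rate."""
    return meeting_cost_person_hours(duration_hours, participants) * hourly_rate
```

A 90-minute meeting with 8 people costs 12 person-hours, i.e. a day and a half of one person's work.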
Re: 6.1 How to Tradeoff Quality Against Development Time
Remember that a good *design* will be resilient against poor code implementations. If good interfaces and abstractions exist throughout the code, then the eventual rewrites will be far more painless. If it's hard to write clear code, or the code is hard to fix, consider what is wrong with the core design that is causing this.
Re: 6.2 How to Manage Software System Dependence
This harks back to a concept I referred to above as *totally generic modules*. These are just libraries that provide useful functionality, can take input without making any non-trivial assumptions, and contain no dependencies whatsoever.
An example of this is the C run time library. A good example that will help make this clear is that the C run time library is able to provide a quicksort implementation without knowing anything about the underlying array it is sorting.
State-less, assumption-free, zero-dependency code is very valuable. Its maintenance and development will be finite in cost, while its utility is on-going. Imagine the cost of rewriting the C library every time you use it.
Impressing this upon programmers will help them recognize the value of reducing dependencies.
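The quicksort example can be sketched in Python as well. C's qsort takes a comparison function and otherwise knows nothing about the elements; the analogue below uses functools.cmp_to_key from the standard library. The names generic_sort, compare_by_length and compare_by_age are made up for this illustration.

```python
from functools import cmp_to_key


def generic_sort(items, compare):
    """Sort any sequence given only a qsort-style comparison function."""
    return sorted(items, key=cmp_to_key(compare))


def compare_by_length(a, b):
    # Negative, zero, or positive, as with a C comparison callback.
    return len(a) - len(b)


def compare_by_age(a, b):
    return a["age"] - b["age"]
```

The same generic_sort serves strings and records alike: generic_sort(["pear", "fig", "banana"], compare_by_length) and generic_sort(people, compare_by_age) share every line of sorting code, and only the comparator knows what the elements are.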
Re: 7.2 How to Utilize Embedded Languages
One option you seem to have avoided is the possibility of embedding pre-canned languages. The real problem with embedding a language of your own is that useful language design is harder than you might think at first. People's aversion to using/learning it is bad enough; what happens when they uncover a flaw in your language that is fatal to its design? People who design real languages put a lot of work into them, and that cannot be trivialized. Whipping up an embedded language is unlikely to yield the most stellar results.
That said, there are currently numerous options for embedding pre-canned languages. Python, Lua and Ruby come to mind. Before going off on some adventure of trying to design your own language, consider whether or not you are going to be able to do a better job than what you could do by embedding one of these languages. From my personal experience, I can tell you that Lua can be embedded in a few hours, and has probably the smallest learning curve of any language in existence.
This is right on - the jobs I've been most attracted to are the ones where they asked me the most technical questions. I'm surprised how little of this sort of questioning many people do when hiring. When I'm interviewing people, I try to put them through their paces as much as possible.
I had one engineer help me take apart a vacuum feedthrough, clean it, and put it back together. She jumped right in and did it. I offered her a job on the spot.
It's not wasting time, I'm educating myself.