Lessons From Your Toughest Software Bugs
Nerval's Lobster writes: Most programmers experience some tough bugs in their careers, but only occasionally do they encounter something truly memorable. In developer David Bolton's new posting, he discusses the bugs that he still remembers years later. One messed up the figures for a day's worth of oil trading by $800 million. ('The code was correct, but the exception happened because a new financial instrument being traded had a zero value for "number of days," and nobody had told us,' he writes.) Another program kept shutting down because a professor working on the project decided to sneak in and do a little DIY coding. While care and testing can sometimes allow you to snuff out serious bugs before they occur, some truly spectacular ones occasionally end up in the release... despite your best efforts.
Program crashing at startup? Okay, let's add debugging statements.
Can't get the debugging statements to execute? Okay, let's try removing code.
Doesn't fix the problem? Okay, let's keep removing more... and more...
A couple hours later, so much code was removed that the entire program had become nothing more than an empty main function that still crashed. This led to the following rule which I try to follow to this day: Make sure that you're actually compiling and executing the same copy of the code that you're modifying. ;)
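One cheap way to follow that rule is to have the program announce which build it is at startup. A minimal sketch in C, assuming a hosted toolchain that provides the standard `__DATE__` and `__TIME__` macros; the function names `build_stamp` and `print_build_stamp` are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Returns the date and time at which THIS source file was compiled.
 * __DATE__ and __TIME__ are standard C predefined macros that expand
 * to string literals, so they concatenate at compile time. */
const char *build_stamp(void)
{
    return __DATE__ " " __TIME__;
}

/* Hypothetical helper: call this first thing in main(). If the stamp
 * on screen is older than your last edit, you are not running the
 * code you think you are. */
void print_build_stamp(void)
{
    const char *stamp = build_stamp();
    assert(stamp[0] != '\0');
    fprintf(stderr, "build: %s\n", stamp);
}
```

The stamp only changes when the file containing it is recompiled, so it belongs in a file that is rebuilt every time, or the build should be set up to force that.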
I'll never forget the last thing grandma said to me before she died: "What are you doing in here with that knife?!?"
Back in the 80's, I was working on a project with three other programmers. Nobody had heard of version control back then; we were using VAX/VMS and it would keep a few versions of a file around after you changed it, which seemed good enough (after all, we all trusted each other, right?)
Well, I don't remember the exact bug(s), but one day I fixed something, and tested it. Fine. A few days later the bug came back. So I went back, fixed it again (wait, didn't I already make this change?). A few days later it came back again.
It turned out that one of the other guys had fixed a different bug, which I had introduced with my fix. So, his fix was to change the code back the way it was. We went back and forth a few times undoing each other's changes before we realized what was going on. Seeing a revision log with comments on the changes might have helped...
Have you read my blog lately?
I recall a proverb, something like
"It takes twice as much intelligence to debug code as it took to write it.
So if you code to the best of your ability you are, by definition,
not qualified to debug it."
In the free world the media isn't government run; the government is media run.
I'm glad you found the truth - that being more careful with pointer math and biasing array memory structures more is truly a blessing. May you also discover the higher truth that coding in languages that need no such nonsense (as their automated memory allocation and deallocation routines have been far better debugged than yours) is even more blessed and may lead you more quickly to the communion with defect-free code you desire.
That is all.
Yeah.. That's why the Democrats are pro open immigration too.
I gave up on the concept that I would be able to write and debug programs correctly the first time. Now all the central data structures in any long-lived control system get error-checking code added to them. For example, the sorted-list code is built with a checker to ensure it stays in order. The communications code gets error-checking. The PID controllers get min/max testing, etc.
Every once in a while I come across bugs that are not in the source code. Often they are compiler errors. Sometimes the bugs involve a rare C/C++ or operating system eccentricity. Sometimes the errors are caused by obscure library changes. Sometimes they are hardware errors.
Especially with the embedded micro-controllers, I leave the consistency checking code in, because you just can't assume that everything always works. The nature of software bugs changes with time, and it is not always in the way a programmer would expect. I am frequently surprised by how obscure some of the bugs are.
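For illustration, the kind of built-in checking described above might look something like this in C. This is only a sketch under assumptions, not the poster's actual code: the `SortedList` type, its fixed capacity, and all function names are hypothetical, and sending a NaN output to the lower limit is an arbitrary choice made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SORTED_CAPACITY 16  /* hypothetical fixed capacity */

typedef struct {
    int    items[SORTED_CAPACITY];
    size_t count;
} SortedList;

/* Checker: true if count is in bounds and items are in ascending order. */
bool sorted_list_is_consistent(const SortedList *list)
{
    assert(list != NULL);
    if (list->count > SORTED_CAPACITY) {
        return false;
    }
    for (size_t i = 1; i < list->count; i++) {
        if (list->items[i - 1] > list->items[i]) {
            return false;
        }
    }
    return true;
}

/* Inserts value in order. Refuses to touch a list that is full or
 * already corrupted, and re-runs the checker after the change. */
bool sorted_list_insert(SortedList *list, int value)
{
    assert(list != NULL);
    if (!sorted_list_is_consistent(list) || list->count == SORTED_CAPACITY) {
        return false;
    }
    size_t pos = list->count;
    while (pos > 0 && list->items[pos - 1] > value) {
        list->items[pos] = list->items[pos - 1];  /* shift larger items up */
        pos--;
    }
    list->items[pos] = value;
    list->count++;
    return sorted_list_is_consistent(list);
}

/* Min/max test for a controller output. NaN compares unequal to
 * itself; mapping it to the lower limit is an assumption of this sketch. */
double clamp_output(double value, double min, double max)
{
    if (value != value) return min;
    if (value < min)    return min;
    if (value > max)    return max;
    return value;
}
```

The checker costs a linear scan per insert, which is usually affordable for the small lists found in control code. The payoff is that corruption from any source (a stray pointer, a compiler bug, flaky hardware) is detected near where it happened instead of much later.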
I was working at my first job writing my first program ever that was not a homework assignment, I decided to write it as a multi-threaded program
^^^ 2015 nominee for most terrifying sentence on Slashdot :)
I don't care if it's 90,000 hectares. That lake was not my doing.
The number of times I've had fellow developers complain that their bug *must* be caused by the compiler, or the OS, or the framework, or the hardware, only for it to turn out to be their fault all along, is the reason why I always suspect my code before I blame anything else.
The order of parameter evaluation is one that bites a lot of people because most compilers do it the expected way. When you're walking an AST to emit some intermediate representation, you're going to traverse the parameter nodes either left-to-right or right-to-left and most compiler IRs don't make it easy to express the idea that these can happen in any order depending on what later optimisations want. If they have side effects that generate dependencies between them (as these do) then they're likely to remain in the order of the AST walker. Most compilers will walk left-to-right (because a surprising amount of code breaks if they don't), but a few will do it the other way.
To understand why this is in the spec, you have to understand the calling conventions. Pascal used a stack-based IR (p-code) and had a left-to-right order for parameter evaluation, which meant that the first parameter was evaluated and then pushed onto the stack, so the last parameter would be at the top of the stack. The natural thing when compiling Pascal (as opposed to interpreting the p-code) was to use the same calling convention, with parameters pushed onto the call stack left to right. Unfortunately, C can't do this and support variadic functions (note: some implementations wanted to do this, which is why the C spec says that variadic and non-variadic functions are allowed to use completely different calling conventions), because if the last variadic argument is the top of the stack then there's no way to find the non-variadic arguments unless you also do something like push the number / size of variadic arguments onto the stack.
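To make the variadic point concrete, here is a minimal sketch in standard C. The function `sum_ints` is made up for illustration. The callee can only reach its variadic arguments by starting from its last fixed parameter (that is what `va_start` does), and it has to learn how many there are from somewhere, here from the fixed `count` argument:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments.
 * `count` is the last (and only) fixed parameter; va_start uses it
 * as the reference point for locating the variadic arguments. */
int sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);

    int total = 0;
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);  /* caller must really have passed ints */
    }

    va_end(args);
    return total;
}
```

A call such as `sum_ints(3, 10, 20, 30)` yields 60. Nothing in the language checks that `count` matches what was actually passed, so the fixed arguments have to be findable without knowing how many variadic ones follow.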
This meant that C implementations tended to push parameters onto the stack right to left. This is less of an issue now that modern architectures have enough registers for most function arguments, but is still an issue on i386. Because of the order of the calling convention, it's more efficient on some architectures to evaluate arguments right to left. Some compilers that are heavily performance-focussed (GPU and DSP ones in particular, where they don't have a large body of legacy code that they need to support) will do this, because it reduces register pressure (evaluate the rightmost argument using some temporaries, push it to the stack, move onto the next, reusing all of those temporary registers).
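The unspecified argument order discussed above can be seen with a small C sketch. All names here (`next_id`, `make_pair`, and so on) are hypothetical. In C the order in which function arguments are evaluated is unspecified, so the first function below may legitimately produce either result depending on the compiler, while the second pins the order down with separate statements:

```c
#include <assert.h>
#include <stdio.h>

static int counter = 0;

/* Has a side effect: each call returns the next integer, starting at 1. */
static int next_id(void)
{
    return ++counter;
}

typedef struct { int first; int second; } Pair;

static Pair make_pair(int first, int second)
{
    Pair p = { first, second };
    return p;
}

/* Both arguments call next_id(). The compiler may evaluate them in
 * either order, so the result is {1, 2} on some compilers and {2, 1}
 * on others. Both are conforming. */
Pair pair_unspecified_order(void)
{
    counter = 0;
    return make_pair(next_id(), next_id());
}

/* Separate statements are sequenced, so this is always {1, 2}. */
Pair pair_left_to_right(void)
{
    counter = 0;
    int first  = next_id();
    int second = next_id();
    return make_pair(first, second);
}

void print_pairs(void)
{
    Pair a = pair_unspecified_order();
    Pair b = pair_left_to_right();
    assert(a.first + a.second == 3);  /* holds in either order */
    printf("unspecified: {%d, %d}\n", a.first, a.second);
    printf("sequenced:   {%d, %d}\n", b.first, b.second);
}
```

If the two lines printed by `print_pairs` differ, your compiler evaluates arguments right to left for this call. Code that depends on either outcome is relying on something the language does not promise.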
I am TheRaven on Soylent News