Slashdot Mirror


Eric S. Raymond Identifies A Common Programming Trap: 'Shtoopid' Problems (ibiblio.org)

"There is a kind of programming trap I occasionally fall into that is so damn irritating that it needs a name," writes Eric S. Raymond, in a new blog post: The task is easy to specify and apparently easy to write tests for. The code can be instrumented so that you can see exactly what is going on during every run. You think you have a complete grasp on the theory. It's the kind of thing you think you're normally good at, and ought to be able to polish off in 20 LOC and 45 minutes.

And yet, success eludes you for an insanely long time. Edge cases spring up out of nowhere to mug you. Every fix you try drags you further off into the weeds. You stare at dumps from the instrumentation until you're dizzy and numb, and no enlightenment occurs. Even as you are bashing your head against a wall of incomprehension, consciousness grows that when you find the solution, it will be damningly simple and you will feel utterly moronic, like you should have gotten there days ago.

Welcome to programmer hell. This is your shtoopid problem.... If you ever find yourself staring at your instrumentation results and thinking "It...can't...possibly...be...doing...that", welcome to shtoopidland. Here's your mallet, have fun pounding your own head. (Cue cartoon sound effects.)

Raymond's latest experience in shtoopidland came while working on a Python-translating tool, and left him analyzing why there are some programming conundrums that repel solutions. "You're not defeated by what you don't know so much as by what you think you do know," he concludes. So how do you escape?

"[I]nstrument everything. I mean EVERYTHING, especially the places where you think you are sure what is going on. Your assumptions are your enemy; printf-equivalents are your friend. If you track every state change in your code down to a sufficient level of detail, you will eventually have that forehead-slapping moment of why-didn't-I-see-this-sooner that is the terminal characteristic of a shtoopid problem."
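
As a rough illustration of "printf-equivalents", here is a minimal C sketch that logs every state change of a tiny word counter to stderr. The choice of C, the `TRACE` macro, and `count_words` are all assumptions made for this example; none of them come from Raymond's post or his tool.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical trace macro: prints file, line, and a formatted message. */
#define TRACE(...) \
    do { \
        fprintf(stderr, "%s:%d: ", __FILE__, __LINE__); \
        fprintf(stderr, __VA_ARGS__); \
        fputc('\n', stderr); \
    } while (0)

/* Count whitespace-separated words, tracing each state transition. */
int count_words(const char *s)
{
    assert(s != NULL);

    int in_word = 0;   /* 1 while the scanner is inside a word */
    int words = 0;

    for (const char *p = s; *p != '\0'; p++) {
        int is_space = (*p == ' ' || *p == '\t' || *p == '\n');
        if (!is_space && !in_word) {
            in_word = 1;
            words++;
            TRACE("offset %d: '%c' starts word %d", (int)(p - s), *p, words);
        } else if (is_space && in_word) {
            in_word = 0;
            TRACE("offset %d: whitespace ends word %d", (int)(p - s), words);
        }
    }
    TRACE("done: words=%d in_word=%d", words, in_word);
    return words;
}
```

Calling `count_words("one two")` writes one line per transition, so a wrong assumption about where a word starts or ends shows up in the log rather than only in the final count.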

Share your own stories in the comments. Are there any programmers on Slashdot who've experienced their own shtoopid problems?

5 of 189 comments

  1. Re:Not always... by StormReaver · · Score: 4, Informative

    ... can't believe it takes so much code to do something that seemed so straightforward.

    While that happens too, it is on the other end of the spectrum of what Eric is describing.

  2. Rubber Duck Debugging by Anonymous Coward · · Score: 2, Informative

    I learned long ago to recognize the feeling that comes when I know I'm missing something obvious. When I do that, I grab a coworker, and explain the issue to them. Just explaining it to someone is frequently enough, but sometimes they spot something glaringly obvious that I've been missing.

    I spent an hour once trying to find an issue where the difference was between I5 and l5. Yeah, depending on your font and display that may be an easy problem, or a hard one. One of those is a capital i, the other a lowercase L.

  3. Sometimes you need to debug optimized code. by Anonymous Coward · · Score: 2, Informative

    So you have a failed assertion. What happened? Fire up the debugger, breakpoint on abort. Breakpoint gets triggered, you get a backtrace. Can't imagine how you got there.

    Days of debugging later...

    The abort function is marked as "noreturn". Consequently instead of calling abort, the compiler saves a few bytes/cycles by jumping to a preexisting abort call, never mind the state of the stack frame. Of course, this single recycled abort call in the whole module is where all backtraces end up. Hooray.

    Now obviously the whole purpose of abort as opposed to exit is to get a core dump. And the whole purpose of a core dump is debugging. And debugging involves backtraces, so abort calls should leave stack and continuation in a useful and recognizable state. So the obvious remedy is not to mark abort as "noreturn". Because you never want to have the stack in a mess when aborting as opposed to exiting.

    Enter your most beloved glibc maintainer of yore. Who refuses to lie to the compiler for any reason at all.

    This shtoopid problem will stick around. -fno-crossjumping for y'all.
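
To make the complaint concrete, here is a hedged C sketch of a function with two separate abort sites. Whether an optimizing build actually merges the two `abort()` calls into one depends on the compiler, its version, and its flags, so treat the merging as a possibility rather than a guarantee. The `ABORT_HERE` macro and `abort_site_line` variable are hypothetical names for one possible workaround: record the source line in a volatile global before aborting, so a core dump still shows which check fired even if the backtrace points at a shared call.

```c
#include <stdlib.h>

/* Hypothetical: line of the failing check, readable from a core dump. */
volatile int abort_site_line = 0;

#define ABORT_HERE() \
    do { \
        abort_site_line = __LINE__;   /* volatile store is not optimized away */ \
        abort(); \
    } while (0)

/* Two distinct failure sites that may share one abort() call when optimized. */
void check_range(int lo, int hi)
{
    if (lo < 0)
        ABORT_HERE();   /* site 1 */
    if (hi < lo)
        ABORT_HERE();   /* site 2 */
}
```

Building the file with GCC's `-fno-crossjumping`, as the comment suggests, is the other route; the sketch above only adds a breadcrumb and does not change how the compiler lays out the calls.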

  4. Re:assert()'s for every assumption by pauljlucas · · Score: 4, Informative

    That is why the assert macro can be disabled via NDEBUG. You enable asserts during development and testing to catch errors so they do not go unnoticed, then disable them for production.
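
A minimal sketch of the mechanism, assuming a hypothetical `average` helper: when `NDEBUG` is defined before `<assert.h>` is included (for example by compiling with `-DNDEBUG`), each `assert(...)` expands to a no-op, so the checks below exist only in development and test builds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: integer average of a non-empty array. */
int average(const int *values, int count)
{
    /* Active by default; compiled out when NDEBUG is defined. */
    assert(values != NULL);
    assert(count > 0);

    long sum = 0;
    for (int i = 0; i < count; i++)
        sum += values[i];
    return (int)(sum / count);
}
```

With the asserts enabled, `average(NULL, 0)` stops at the first failed check; with `-DNDEBUG` the same call goes on to divide by zero, which is exactly the kind of error the asserts are meant to catch during testing.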

    --
    If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.

  5. Re:assert()'s for every assumption by blackhedd · · Score: 4, Informative

    You got the right way to say it: "Assert all of your assumptions."

    Code rots when it gets modified in ways that don't respect the implicit assumptions made in the past. Have you ever said to yourself, "this function is only called from two places and I know those two places validate the parameters"? When you then write the function without checking parameters, you've made an implicit assumption that makes all the sense in the world at that moment. But someone else (or you in three months) will forget or won't know the assumption, and call that function from somewhere else, with unchecked parameters.

    Either document your implicit assumptions (making them explicit), or (better) assert them. The way to get good at asserting implicit assumptions is first to learn how to recognize when you're making an implicit assumption! That takes skill and practice, but if you don't do it, asserts don't help you.
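
As a sketch of what "asserting an implicit assumption" can look like, consider a hypothetical `copy_name` helper whose callers are assumed to have validated their arguments already. The function name and its contract are invented for illustration; the point is that each unstated assumption becomes one `assert` line.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: callers are assumed to pass validated arguments. */
void copy_name(char *dst, size_t dst_size, const char *src)
{
    /* Each implicit assumption, written down and checked. */
    assert(dst != NULL);
    assert(src != NULL);
    assert(dst_size > 0);
    assert(strlen(src) < dst_size);   /* room for the terminating NUL */

    memcpy(dst, src, strlen(src) + 1);
}
```

If someone later calls `copy_name` from a third place with an unchecked, overlong string, the last assert fails at the call that broke the contract instead of silently overflowing `dst`.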

    And I leave the asserts in the final builds, too. In decades of professional C programming, I've never had a case where the asserts imposed a measurable performance penalty.

    To the people who say "use NDEBUG to disable your asserts for production, because customers hate interruptions": no, don't do that. A violation of an implicit assumption is ALWAYS a bug, and it's always better for it to bite your behind sooner rather than later. The assert tells you exactly what you did wrong. I've had this conversation dozens of times:

    [Angry customer]: Your software crashed!
    [Me]: pop open the syslog and search for the word "assert."
    [Angry customer]: It says "line XXX in file YYY"!
    [Me]: you'll have the fix in two hours.
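
The comment does not say how its asserts reach the syslog, so the following is only one possible arrangement, assuming a POSIX system with `<syslog.h>`. `ALWAYS_ASSERT`, `format_assert_message`, and `percent` are hypothetical names; the macro ignores `NDEBUG`, logs the failed expression with its file and line, and then aborts.

```c
#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>

/* Hypothetical helper: builds the log text; returns snprintf's result. */
int format_assert_message(char *buf, size_t size,
                          const char *expr, const char *file, int line)
{
    return snprintf(buf, size, "assert failed: %s, line %d in file %s",
                    expr, line, file);
}

/* Always-on check: unlike assert(), it is not affected by NDEBUG. */
#define ALWAYS_ASSERT(expr) \
    do { \
        if (!(expr)) { \
            char msg_[256]; \
            format_assert_message(msg_, sizeof msg_, #expr, \
                                  __FILE__, __LINE__); \
            syslog(LOG_CRIT, "%s", msg_); \
            abort(); \
        } \
    } while (0)

/* Example use: every caller is assumed to pass a non-zero divisor. */
int percent(int part, int whole)
{
    ALWAYS_ASSERT(whole != 0);
    return part * 100 / whole;
}
```

Because the logged text contains the word "assert" along with the file and line, a search of the log for "assert" leads straight to the violated assumption, which is the workflow the dialogue above describes.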

    Interestingly, there is a handful of classes of bugs that are impervious to asserting your assumptions. The worst of these, in my experience, is the accidentally shadowed variable. But using assert in a disciplined way is incredibly useful.
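
A small sketch of why a shadowed variable slips past asserts, using hypothetical `contains_buggy` and `contains_fixed` functions. Every asserted assumption about the inputs holds, yet the buggy version always returns 0, because the inner declaration creates a second `found` that hides the outer one.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy on purpose: the inner 'found' shadows the outer one. */
int contains_buggy(const int *values, int count, int needle)
{
    assert(values != NULL);   /* holds, but cannot catch the bug below */
    assert(count >= 0);

    int found = 0;
    for (int i = 0; i < count; i++) {
        if (values[i] == needle) {
            int found = 1;    /* BUG: declares a new variable */
            (void)found;      /* silences the unused-variable warning */
        }
    }
    return found;             /* the outer variable, still 0 */
}

/* Fixed: assign to the existing variable instead of redeclaring it. */
int contains_fixed(const int *values, int count, int needle)
{
    assert(values != NULL);
    assert(count >= 0);

    int found = 0;
    for (int i = 0; i < count; i++) {
        if (values[i] == needle)
            found = 1;
    }
    return found;
}
```

GCC and Clang both offer a `-Wshadow` warning that flags the inner declaration at compile time, which covers the gap that runtime asserts leave here.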