Eric S. Raymond Identifies A Common Programming Trap: 'Shtoopid' Problems (ibiblio.org)

Not always... by rbeattie · 2018-09-29 22:49 · Score: 4, Interesting

More times than not, the solution is actually really difficult - you just underestimated the problem. Then you go to GitHub and find a library that shows you how it should be done, and you can't believe it takes so much code to do something that seemed so straightforward.

--
Me

Re:Not always... by StormReaver · 2018-09-29 23:03 · Score: 4, Informative

... can't believe it takes so much code to do something that seemed so straightforward.
While that happens too, it is on the other end of the spectrum of what Eric is describing.
Re:Not always... by igny · 2018-09-30 02:01 · Score: 3, Interesting
The shtoopidest problem I faced was in Transact-SQL. Usually, the syntax there is case insensitive, but there is a difference between
- where timeStamp >= format(watermark,'yyyy-MM-dd hh:mm:ss') --<-- incorrect
- where timeStamp >= format(watermark,'yyyy-MM-dd HH:mm:ss') --<--correct
This bug was extremely elusive for me because the code looks fine and the watermark in our data is almost never between 00:00:00 and 01:00:00, and that is when the bug sometimes caused missing data in our target tables.
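
A rough Python sketch of the same trap, for anyone who wants to see it happen. It assumes Python's strftime codes %I (12-hour) and %H (24-hour) as stand-ins for hh and HH, and it compares plain strings rather than a real timestamp column; the function names and sample values are made up for illustration.

```python
from datetime import datetime

def watermark_12h(ts):
    # Analogue of the 'hh' pattern: 12-hour clock, so hour 00 is rendered as 12.
    return ts.strftime("%Y-%m-%d %I:%M:%S")

def watermark_24h(ts):
    # Analogue of the 'HH' pattern: 24-hour clock.
    return ts.strftime("%Y-%m-%d %H:%M:%S")

watermark = datetime(2018, 9, 30, 0, 30, 0)   # just after midnight
row_timestamp = "2018-09-30 00:45:00"         # a row newer than the watermark

bad = watermark_12h(watermark)
good = watermark_24h(watermark)

# With the 12-hour pattern the watermark jumps to 12:30, so the row is skipped.
print(bad, row_timestamp >= bad)
print(good, row_timestamp >= good)
```

Outside the first hour after midnight (and before noon) both patterns give the same string, which is consistent with the bug staying hidden for so long.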
--
In theory there is no difference between theory and practice. In practice there is. - Yogi Berra
Re:Not always... by Rei · 2018-09-30 03:10 · Score: 4, Insightful

Agreed. But I don't really think Eric's "solution" is that helpful. Here are the last two "shtoopid" problems I had (in a CFD model evolution app):
1) My program (which had a piped child process, which in turn had its own children) was randomly locking up when one of the subprocess's children died. Now, normally that's an eminently solvable problem... except for the fact that it was locking up in a different place each time. I was stuck digging deeper and deeper into pipe magic with no luck. I even went to strace python, but that just added more confusion, as strace was dying at random times, and sometimes not even printing out full lines!
The problem? The output of my program was running through "tee". I was only seeing the last section of the buffer to be printed out :P The real problem was that the subprocess had simply stopped printing data and so the pipe read was hanging; it was instantly obvious when "tee" wasn't used.
2) My program would sometimes go into a "subprocess keeps dying" mode. This started out of the blue with no changes to my code. Again, I kept instrumenting more and more, with no luck.
The problem? I had started, in another window, a shell script that ran on a loop to generate visualization data at regular intervals whenever the process was running. When the visualization data would appear in the middle of a run, it would sometimes interfere with the raw data, due to the way the data processing was set up. But since that visualization data wasn't present when the run started, it took time for the problem to show up, and then would just occur out of the blue.
The short of this is... if you follow Eric's "instrument everything" solution to "shtoopid" problems, you'll sometimes just dig yourself further into a hole. The problem is that you have a base assumption that's wrong. IMHO, the best solution is to bring a third party in and explain everything about what you're doing and where it's going wrong. Not only can their different perspective add insight, but the very act of having to explain and reproduce everything from scratch (and answer their questions) can help you as well.

--
"Who the hell is Nietzche? It's a question stupid people are asking." -- Newscaster, "Jesus Christ Supercop"
Re:Not always... by Aighearach · 2018-09-30 04:56 · Score: 1

I go the exact opposite direction of him!
Just stop looking at the instrumentation as if you're researching a problem or doing R&D. You're not doing R&D when you have a trivial mistake you can't find, you're just debugging.
Stop looking at the tools. Look at the code. If you find yourself saying, "It...can't...possibly...be...doing...that" just whack yourself over the head with the cartoon hammer right there, because telling yourself that is why you're having trouble finding it. You have to instead be asking, "why would it be doing that?" And stop making assumptions, like, "Oh, it can't be that one possibility, because I believe there to be a good enough sanity check." Or worse, "It can't be this one part, because we have a test case." Stop looking at your tools, look at your implementation, stop substituting your memory of what you believe to be correct and just look at the actual code. What does it do? If you had wanted what is really happening, instead of what you intended, what changes would you have made to what you intended? Does that uncover any edge cases? That's a much better type of question to ask than, "where are all the photons at each nanosecond according to the instrumentation?"
The best one I ever had was in '98. I spent 24+ hours looking for a bug in my Perl program. Turns out it was a typo, and I forgot to turn on "use strict" which catches those. I was not used to typos even being a class of bug I needed to consider.
The only other thing I'd say about it is that understanding compiler/interpreter implementation can really help when you're doing edge-case stuff like embedded systems. So I'm not saying to ignore the tools; think about the tools as other pieces of code, don't just use the tools to try to find mistakes. All mistakes are PEBCAK. Thinking about your tool is more useful than asking your tool to think for you.
Re: Not always... by Hognoxious · 2018-09-30 05:33 · Score: 1

I got woken up very early once because somebody knew the 100 year exception for leap years but apparently not the 400 year exception to the exception.
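
For reference, a small sketch of the two versions of the rule. The function names are made up; the standard library's calendar.isleap is used only as a cross-check.

```python
import calendar

def is_leap_naive(year):
    # Knows the 100-year exception, but not the 400-year exception to it.
    return year % 4 == 0 and year % 100 != 0

def is_leap(year):
    # Full Gregorian rule: every 4th year, except centuries, except every 400th year.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

for year in (1900, 2000, 2024, 2100):
    print(year, is_leap_naive(year), is_leap(year), calendar.isleap(year))
```

The two functions disagree only on years divisible by 400, so the naive version can run for a very long time before anyone gets a phone call.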

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Re: Not always... by UnknowingFool · 2018-09-30 06:10 · Score: 3, Interesting

Sometimes it's a very slight difference between environments that causes problems. We rolled out some code to production that had been fully tested in Dev and Test environments. Things started to break due to SQL errors. Ran the SQL directly on the production database server but it ran fine. Somehow the SQL was getting different results running through the production server than it was on the production database server directly.
After some investigation, the only difference between the production server and the other environments was that the production server used a slightly older database driver. It was a minor version difference. The errors arose because the older driver required explicit casts for all math operations, despite what the documentation said, while the newer driver followed the database documentation. So Integer A / Integer B should be implicitly cast as Integer according to the documentation. However, the older driver would cast that as Float for some unknown reason, and that would cause errors.
But this would only happen using the db driver on Production. Testing the SQL directly on Production DB wouldn't have found it. Testing the code and SQL on Dev and Test servers wouldn't have found the bug. The patch notes for the db driver didn't mention the change.
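
A hedged sketch of the kind of code patch this implies: spell out the casts so the result type does not depend on implicit behavior. SQLite (via Python's sqlite3 module) is only a stand-in here, since the actual database and driver are not named, and the scalar helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    # Run a query and return the single value it produces.
    return conn.execute(sql).fetchone()[0]

implicit = scalar("SELECT 7 / 2")                                 # relies on implicit typing
as_real = scalar("SELECT CAST(7 AS REAL) / 2")                    # explicitly floating point
as_int = scalar("SELECT CAST(CAST(7 AS REAL) / 2 AS INTEGER)")    # explicitly integer

print(type(implicit).__name__, implicit)
print(type(as_real).__name__, as_real)
print(type(as_int).__name__, as_int)
```

A check like this, run through the application's own connection at startup, is one way an environment difference of this sort might show up before the real queries do.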

--
Well, there's spam egg sausage and spam, that's not got much spam in it.
Re:Not always... by thsths · 2018-09-30 08:25 · Score: 1

Yes, I agree that some problems are more complicated than they seem, especially if you need to synchronise with some other state, or a poorly documented piece of software.
State is the key problem, the article is right about that. If I find these problems, I try to eliminate states as much as possible. Sometimes a different programming approach helps. Reconstructing a value rather than saving it reduces complexity. But sometimes you need the state, and then you may want to look at state machine theory, which is not always intuitive.
Re:Not always... by glenebob · 2018-09-30 09:17 · Score: 1

Nice example of how to do something completely wrong.
Re: Not always... by SharpFang · 2018-09-30 10:21 · Score: 1

Unfortunately, had to do my own.
Date of Easter, sunrise/sunset times. No good C++ library providing these.

--
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
Re: Not always... by Pig+Hogger · 2018-09-30 12:00 · Score: 1

You must live in unicorn land...
Re: Not always... by UnknowingFool · 2018-10-01 03:18 · Score: 1

And how in the world would this have helped us? Like I said the only difference was the db driver on Production was different. Otherwise everything else was the same. The same applications, same toolset, the same versions, same databases, etc. Testing didn't find the problem because it didn't exist on Test or Dev.

--
Well, there's spam egg sausage and spam, that's not got much spam in it.
Re: Not always... by UnknowingFool · 2018-10-01 03:23 · Score: 1

From what I remember, the SQL vendor didn't write the driver but neither did the application vendor. It was a 3rd party driver.

--
Well, there's spam egg sausage and spam, that's not got much spam in it.
Re: Not always... by UnknowingFool · 2018-10-02 06:39 · Score: 1

There are gads of ways they are different; however, as coders we have neither the ability, the permission, nor the control to make sure that they are identical in every single way. In some cases these are unavoidable. The production environment is a 6 CPU server located in CA, for example. The Dev and Test servers are 2 CPU servers located locally. But here's the thing: as coders we made sure that we developed on an environment as close to Production as we could. We didn't think that a db driver was enough to cause a difference. In this case we had to patch the code immediately and the sysadmins had to test the effect of deploying the new driver to Production. That second part took 6 weeks.

--
Well, there's spam egg sausage and spam, that's not got much spam in it.

Re:He really is old, isn't he? by StormReaver · 2018-09-29 23:11 · Score: 5, Insightful

...he doesn't simply use a debugger to step through the problematic code?

That misses the entire point. In the class of problem he is describing, everything looks fine at the debugging level (regardless of how you are debugging). Or better yet: your debugging tools show that something is wrong, yet how the program gets into that state is elusive. You have traced the program execution in excruciating detail, and everything looks great until the very next line of code morphs your perfect execution state into a problematic one for reasons that appear to be impossible. Eventually, you figure out how it's possible, write a small amount of code that you should have written earlier in the process, and fix the problem.

You then realize the obviousness of the solution, and feel like an idiot for having spent hours, days, weeks, or months figuring it out.

Re:He really is old, isn't he? by ledow · 2018-09-29 23:15 · Score: 4, Interesting

Ever tried debugging deep-level OS kernel code?

To be honest, debuggers also introduce just as many differences - I have crafted code (nothing special, fancy or playing tricks) that, when debugged, works entirely differently to non-debugged. Debugging inserts all kinds of stuff into the code that modifies the pointers of all kinds of data by vast amounts, and can make it "pass" whatever it is you wanted to do.

Also, if you program against many architectures, an architecture-specific bug might be something that you don't have the tools for, despite debugging the code on all your normal platforms. Yes, a debugger is the ultimate solution, but mostly you might just not have that stuff available and it could be days or weeks before you can get it going to the point that you can effectively debug code that you've been working on for 20 years and know inside out.

Plus many problems are not debuggable - maybe your users are having the issue but you're not, and you can't reproduce, but dozens of your users can, and yet they have almost identical environments to you - the only way to debug that is to set up a full programming, debugging and source environment on their machine - which may be something you don't want to do - or give them an instrumented version of the executable, which may not reproduce the problem.

I know for a fact that I have programs that work on Linux, Windows, even HTML5 (via emscripten), that also can work on Mac. But for sure I wouldn't be buying a Mac to diagnose problems on that platform until it was absolutely necessary. And I wouldn't be giving my code to users for them to diagnose it.

But throw in a bunch of printf's and a log and - no matter the architecture or tools available - you can get down to a function, a line, a set of parameters enough to debug before you even need to think "How the fuck am I going to go about getting debug info out of that person/system/architecture?"

I know I have a C macro that I prefix all functions with. In "normal" mode, it just expands to a function definition. In "debug" mode, it expands to the function, and a bunch of debugging lines for when it enters/leaves each function and the parameters given to it. This means one switch change and the program runs basically identically to how it runs without debugging, churns out a huge log file, doesn't modify any structures, pointers, etc. and which I can skim the bottom of after a crash report to know where and why it crashed, on any architecture, with a compiled binary, without including the full -g debugging shit that basically gives away your source code (or a version of it).
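
The macro itself isn't shown, so here is only a loose analogue of the idea in Python: a decorator that logs entry, exit and arguments when a switch is on, and leaves the function untouched when it is off. TRACE_ENABLED, trace_log and traced are hypothetical names, not anything from the original code.

```python
import functools

TRACE_ENABLED = True   # hypothetical switch, playing the role of the "debug" mode
trace_log = []         # in-memory log for the sketch; a real build would write a file

def traced(func):
    if not TRACE_ENABLED:
        return func    # "normal" mode: the function is returned unchanged

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        trace_log.append(f"enter {func.__name__} args={args} kwargs={kwargs}")
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on normal return and on exceptions alike.
            trace_log.append(f"leave {func.__name__}")
    return wrapper

@traced
def scale(value, factor=2):
    return value * factor

result = scale(21)
print(result)
print("\n".join(trace_log))
```

After a crash, the tail of such a log says which function was entered last and with what parameters, which is the part of the approach that carries over regardless of language.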

Nothing said by yanestra · 2018-09-29 23:16 · Score: 1, Insightful

No-one can doubt that Eric S. Raymond is a talker.

Assert is your friend by Serif · 2018-09-29 23:29 · Score: 4, Insightful

Been there, got several wardrobes full of T shirts.

If unit testing and staring at code for more than a few minutes doesn't solve this kind of problem, then the assertion hammer comes out. Assert everything, especially the things that are so obvious that they don't need an assertion. The bugs just have fewer and fewer places to hide and eventually surrender.

Re:Assert is your friend by justthinkit · 2018-09-30 02:43 · Score: 1, Interesting

These are always humorous comment threads.
Everyone has some experience. Everyone has plenty of advice. Lots of absolutes (like the parent's) come out.
Point number one, always, is to check our assumptions. We assume we know how to code. We assume others know how to code. We assume libraries work as documented. We assume compilers are logical. We assume we are.
Men assume women are logical. Fun ensues.
Kids assume parents are good examples. And waste decades of their lives.
Physics drifts away from the best model for one hundred years. Everyone drifts into the ditch with it.
Little to nothing is learned because...follow the money.
Real problem one in programming / languages is lack of examples. Commands are introduced, and given a 10 line example. And off we run.
Real problem two in programming / languages is lack of incentive to do the right thing for the programmer. Microsoft is famous for its excrement. Because if it actually delivered something that was near perfect, who would ever upgrade?
Real problem three in programming / languages is no one gives a crap. For each person weighing in on this thread, one hundred won't. For each person at least reading this thread, one hundred won't. For each person seeking the best language, one hundred will choose the one that gets them a good job. And then all these randomly compromised and diluted groups are placed under a PHB who is more concerned with breaking things -- by making things "better" -- than anything else.
Given the randomness of this forum, the deliberate trash that passes for programming languages and the throw-away nature of advice, we might as well be asking "What good books have you read this year?" The winner so far for me is "Strange Chemistry". Who knew that Visine and Bengay could be so dangerous?

--
I come here for the love
Re: Assert is your friend by Anonymous Coward · 2018-09-30 07:16 · Score: 1

Wow, I hope this was some sort of ironic joke I don't get. Because if not it is the most useless and nonsensical post I have ever read on this site.

assert()'s for every assumption by jrbrtsn · 2018-09-29 23:30 · Score: 5, Interesting

Over my 30 year career, I cannot believe how many 'C' programmers I've come across who are unfamiliar with the assert() macro. This macro is essential for trapping all invalid assumptions! Usually it's as simple as:

if ( ! functionWhichCanFail(a,b,c) ) assert(0);

Run your program from the debugger, and it will stop when the assert(0) is encountered, giving you full and convenient access to everything needed to hunt down the issue.

Re:assert()'s for every assumption by Anonymous Coward · 2018-09-30 00:27 · Score: 1

as a user, nobody wants the program to abruptly shut down when something goes wrong, often losing their work. this is why a lot of modern languages de-emphasize or completely lack things like assert and exceptions
Re:assert()'s for every assumption by pauljlucas · 2018-09-30 01:11 · Score: 4, Informative

That is why the assert macro can be disabled via NDEBUG. You enable asserts during development and testing to catch errors so they do not go unnoticed, then disable them for production.

--
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
Re:assert()'s for every assumption by UnknownSoldier · 2018-09-30 02:40 · Score: 2

A "professional" programmer bragging how much he is ignorant.
*facepalm*
Hint: If you aren't using all the (language's debugging) tools available then you aren't as smart, or professional, as you think you are. A "real" professional would go "Oh cool, this defensive programming -- an implementation of Fail Fast -- will help in tracking down bugs! Nice!"
C programmers who don't use assert() are either ignorant, stupid, writing toy programs, or some combination. Period.
Re:assert()'s for every assumption by blackhedd · 2018-09-30 05:47 · Score: 4, Informative

You got the right way to say it: "Assert all of your assumptions."
Code rots when it gets modified in ways that don't respect the implicit assumptions made in the past. Have you ever said to yourself, "this function is only called from two places and I know those two places validate the parameters"? When you then write the function without checking parameters, you've made an implicit assumption that makes all the sense in the world at that moment. But someone else (or you in three months) will forget or won't know the assumption, and call that function from somewhere else, with unchecked parameters.
Either document your implicit assumptions (making them explicit), or (better) assert them. The way to get good at asserting implicit assumptions is first to learn how to recognize when you're making an implicit assumption! That takes skill and practice, but if you don't do it, asserts don't help you.
And I leave the asserts in the final builds, too. In decades of professional C programming, I've never had a case where the asserts imposed a measurable performance penalty.
To the people who say "use NDEBUG to disable your asserts for production, because customers hate interruptions": no, don't do that. A violation of an implicit assumption is ALWAYS a bug, and it's always better for it to bite your behind sooner rather than later. The assert tells you exactly what you did wrong. I've had this conversation dozens of times:
[Angry customer]: Your software crashed!
[Me]: pop open the syslog and search for the word "assert."
[Angry customer]: It says "line XXX in file YYY"!
[Me]: you'll have the fix in two hours.
Interestingly, there is a handful of classes of bugs that are impervious to asserting your assumptions. The worst of these, in my experience, is the accidentally shadowed variable. But using assert in a disciplined way is incredibly useful.
Re: assert()'s for every assumption by Anonymous Coward · 2018-09-30 06:25 · Score: 1

"I'm a professional C programmer".
"I'm not a 'C' programmer"
?????
Re:assert()'s for every assumption by sysrammer · 2018-09-30 10:08 · Score: 1

And when the dev forgets to turn it off for that one little module, hilarity reigns when the feed gets random debug messages in it.

--
His ignorance covered the whole earth like a blanket, and there was hardly a hole in it anywhere. - Mark Twain
Re:assert()'s for every assumption by Darinbob · 2018-09-30 18:15 · Score: 1

They can sometimes be annoying. I.e., "assert(0)" for me often generates a core dump where I can't see the actual value of the original culprit variables anymore, since the compiler thinks the variables don't need to be saved and the assert code itself scribbles all over the registers.
Another problem with asserts is that they need to be about REAL errors. I've seen many cases where an assert has happened at a customer and when investigating you see that the assert itself is the error. Often the devs just add them willy nilly without thinking about whether the precondition is true or not.
I also see asserts (or exceptions) used as the cheap ass way of handling unexpected stuff, rather than trying to recover from the error and carry on. Ok, you got an unexpected value in a packet from the network, probably best to NOT assert there and instead handle the error (discard the packet probably), and yet I've seen code that did this which asserted after shipping.
Re:assert()'s for every assumption by Darinbob · 2018-09-30 18:17 · Score: 1

Several places I worked had a policy of leaving all asserts in place for production release. I think the rationale is that half the bugs are found by customers, which is another problem in itself.
Re:assert()'s for every assumption by Darinbob · 2018-09-30 18:22 · Score: 1

I have had to remove asserts where the code was too big to fit on the chip anymore. I found a few asserts, verified that they could not possibly happen, and removed them.
For most stuff I've worked on in the last couple of decades, the assert causes a mysterious reboot and coredump.
Re:assert()'s for every assumption by blackhedd · 2018-09-30 19:00 · Score: 2

Not the way I'd do it. An assert is always to be considered an orthogonal instrumentation of code, not code itself. The code must ALWAYS work according to intent, with or without the assert. The point of the assert is to detect when the intent isn't correct or has changed over time due to violated assumptions.
Write it this way:
auto ok = FunctionWhichShouldntEverFail (a,b,c);
assert (ok);
In short: always have the assert as its own statement (easily removed or commented out); and only assert values, not functions or expressions with side effects.
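
The same rule can be sketched in Python, where the -O flag plays roughly the role of NDEBUG by stripping assert statements. To keep the sketch self-contained it uses the optimize argument of the built-in compile() instead of restarting the interpreter; the run helper and the "pending" job list are made up for illustration.

```python
def run(source, optimize):
    """Execute source with asserts enabled (optimize=0) or stripped (optimize=1)."""
    namespace = {"pending": ["job-1", "job-2"]}
    exec(compile(source, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["pending"]

# Anti-pattern: the assert contains the work itself.
SIDE_EFFECT_IN_ASSERT = "assert pending.pop() == 'job-2'"

# Recommended shape: do the work first, then assert on the resulting value.
ASSERT_ON_VALUE = "job = pending.pop()\nassert job == 'job-2'"

for label, source in (("side effect", SIDE_EFFECT_IN_ASSERT), ("value only", ASSERT_ON_VALUE)):
    print(label, run(source, optimize=0), run(source, optimize=1))
```

With the assert stripped, the first form silently stops popping the job, while the second behaves the same either way.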
Re:assert()'s for every assumption by blackhedd · 2018-09-30 19:11 · Score: 1

As with every tool and technique, there are right ways and wrong ways to use it.
To one of your points: sometimes it's not easy to figure out the best or most consistent way to handle a problem, when you expect it to happen never or almost never. A perfect example is when you can't create a realistic test case! In that case, I'd rather write an assert and get a deterministic and easily-fixable failure in the field, than write complicated error handling and recovery that I can't easily test.
The one time I nearly fired a guy was when I hit a NULL-pointer exception in some of his code that was part of an error handler. "Didn't you test this error handler??!!" "Of course not, there was no way to write a test case that triggers it."
Re:assert()'s for every assumption by blackhedd · 2018-09-30 19:24 · Score: 1

One thing I've done a lot is to rewrite the assert macro so that it leaves a syslog trace. Another thing you can do (if you're lucky enough to get a coredump on failure) is to arrange for a distinctive signal (like SIGABRT) to hit your program. That stands out nicely in the coredump.
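
A minimal sketch of the "leave a trace, then fail" idea, using Python's logging module as a stand-in for syslog. The logged_assert helper and the logger name are hypothetical, and nothing here sends a signal or produces a coredump.

```python
import logging

logger = logging.getLogger("assert_sketch")   # hypothetical logger name

def logged_assert(condition, message):
    """Record a failed assumption in the log before raising."""
    if not condition:
        logger.critical("assert failed: %s", message)
        raise AssertionError(message)

# A passing check is silent and costs one comparison.
logged_assert(len("abc") == 3, "length of 'abc' should be 3")
```

Pointing the logger at a syslog handler would be a deployment choice; the sketch only shows the ordering, log first and fail second.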
Re:assert()'s for every assumption by dfsmith · 2018-10-01 10:49 · Score: 1

[Angry customer]: The software I was using crashed with an assert code!
[St. Peter]: Yes, we'll have to talk to the car manufacturer about that.
Most of the software I work on can emit smoke, mangle parts, or lose data if it were to abort. There's a place for assert(), but it's not always an option.
Re:assert()'s for every assumption by Mr.+Slippery · 2018-10-01 11:02 · Score: 1

To the people who say "use NDEBUG to disable your asserts for production, because customers hate interruptions": no, don't do that. A violation of an implicit assumption is ALWAYS a bug
But it's not. I've seen asserts used for things that did not affect program operation. Don't assert the irrelevant.

--
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood

printf() may not work for multithreaded problems by jrbrtsn · 2018-09-29 23:34 · Score: 4, Interesting

A few years ago I had an issue in a multi-threaded program where using printf()'s caused the problem to go away. In order to track the problem down, I ended up writing messages to a buffer in RAM, and dumping the buffer to stdout after the problem occurred.
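
Something along these lines, sketched in Python rather than the original language. It assumes a bounded collections.deque as the RAM buffer; note, worker and dump are made-up names.

```python
import threading
from collections import deque

trace = deque(maxlen=1000)   # bounded in-memory buffer; oldest entries fall off

def note(message):
    # Appending to a deque is thread-safe and involves no I/O,
    # so it disturbs thread timing far less than printing does.
    trace.append((threading.get_ident(), message))

def worker(name, steps):
    for i in range(steps):
        note(f"{name} step {i}")

threads = [threading.Thread(target=worker, args=(f"w{n}", 5)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

def dump():
    # Called only after the problem has occurred.
    for thread_id, message in trace:
        print(thread_id, message)

dump()
```

The buffer can still shift timing a little, so this reduces the observer effect rather than removing it.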

So familiar! by jtgd · 2018-09-29 23:35 · Score: 2

Been there, done that many times. Nothing more frustrating to see something you know is absolutely impossible! But fairly satisfying when you ultimately find the bug.

--
J

Re:So familiar! by turbidostato · 2018-09-30 01:52 · Score: 1

"But fairly satisfying when you ultimately find the bug."
It is not the kind of problem Raymond is talking about, then.
He is talking about the kind of problem where you *know* the solution should be obvious to you from second one and yet it hides for a whole lot of time, during which you know, every single second, that the problem is laughing at you. When, bam!, you finally find the answer, you know you were right all the time: the solution was obvious, it had been laughing at you in plain sight, and you only feel damn stupid.
Re:So familiar! by djinn6 · 2018-09-30 09:00 · Score: 1

Eric's problem is that he tried to write a shitty Python parser instead of doing it the proper way by using the built-in ast module. There would be no mucking about with white space if you start from the parsed ast.
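
For anyone who hasn't used it, a minimal sketch of what starting from the parsed tree looks like. The sample source is invented and deliberately indented inconsistently between functions; whether this would have fit the original task is an assumption.

```python
import ast

source = '''
def greet(name):
        return "hi " + name    # odd but valid indentation

def shout(name):
    return greet(name).upper()
'''

tree = ast.parse(source)

# Top-level function definitions, in source order, with their line numbers.
functions = [(node.name, node.lineno) for node in tree.body
             if isinstance(node, ast.FunctionDef)]
function_names = [name for name, _ in functions]

print(functions)
```

The indentation differences never reach this code; the parser has already turned them into structure.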
Re:So familiar! by cshamis · 2018-10-01 07:04 · Score: 1

I love those. When I see something I know is impossible... I know I've just about nailed it. My debugging litany: Socratic Method (I presume that everything I thought I knew COULD be wrong), Penn and Teller (there is trickery and a truth here...), Sherlock Holmes (once you eliminate the impossible, whatever is left is the answer).

Rubber Duck Debugging by Anonymous Coward · 2018-09-29 23:35 · Score: 2, Informative

I learned long ago to recognize the feeling that comes when I know I'm missing something obvious. When I do that, I grab a coworker, and explain the issue to them. Just explaining it to someone is frequently enough, but sometimes they spot something glaringly obvious that I've missed.

I spent an hour once trying to find an issue where the difference was between I5 and l5. Yeah, depending on your font and display that may be an easy problem, or a hard one. One of those is a capital i, the other a lowercase L.

Re:Rubber Duck Debugging by vtcodger · 2018-09-30 02:10 · Score: 1

Yep. I think 90% of these problems are caused by
1. Incorrect assumptions about how something that looks simple works, e.g. in Python using sort() (sorts in place and returns None) instead of sorted() (returns a new sorted list). I once spent several days tracking down what turned out to be an equation with an ambiguous denominator. The spec author intended (A+B)/(C+D), but it was elegantly typeset without the parentheses in the spec. The programmer read it as A + B/C + D.
2. Missing punctuation
3. Punctuation that doesn't belong where it is
4. Wrong punctuation -- period where comma intended, semicolon instead of colon, etc..
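
Both examples from point 1 fit in a few lines of Python. The numbers are arbitrary, picked only so the two readings of the formula give visibly different results.

```python
values = [3, 1, 2]
in_place_result = values.sort()    # sorts the list itself; the call returns None
copy_result = sorted([3, 1, 2])    # leaves its input alone; returns a new list

A, B, C, D = 6.0, 2.0, 1.0, 3.0
intended = (A + B) / (C + D)       # what the spec author meant: 8 / 4
misread = A + B / C + D            # what the typeset form reads as: 6 + 2 + 3

print(in_place_result, values, copy_result)
print(intended, misread)
```

Writing `values = values.sort()` is the usual way the first one bites: the list is sorted and then immediately replaced by None.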

--
You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
Re:Rubber Duck Debugging by bzipitidoo · 2018-09-30 09:54 · Score: 1

I had one where I fatfingered O instead of 0. They are next to each other on the QWERTY keyboard, you know. Stared at that line of code for some 5 to 1O minutes, not seeing what the problem could possibly be. I did figure it out finally, but it was annoying to have lost that much time on a visual distinction problem. A better font would have helped.

--
Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
Re:Rubber Duck Debugging by fafalone · 2018-09-30 17:18 · Score: 1

Explaining definitely helps. I can't count the number of times I've spent hours trying to find a bug, finally broke down and posted a request for help, then solved the problem myself a minute or two later.

Re: That's when I usually find bugs in other code by Anonymous Coward · 2018-09-29 23:38 · Score: 1

I feel the same way! When my code doesn't work, it's somebody else's fault. :-)

Re:Get a better debugger by serviscope_minor · 2018-09-29 23:41 · Score: 4, Insightful

Or with experience you realise that stepping debuggers are great for some problems and printfs are great for other problems.

--
SJW n. One who posts facts.

Errrm, ... Isn't that simply called 'programming'? by Qbertino · 2018-09-29 23:54 · Score: 1

I know this type of thing; it's my day job. Granted, I'm currently doing total LAMP stack web development on setups degraded beyond imagination and a lot of my work involves coming up with crazy hacks and gluecode that gradually inches its way towards a solution. In order to achieve that, I do exactly what he describes. It's basically what loosely typed web development is all about. But as far as I can tell, this type of problem is a regular thing in development and results in having to bend our abstraction of reality (code) to actual reality. This type of problem will never go away and AFAICT it's a standard thing to run into when doing anything but the most trivial cleanroom script.

--
We suffer more in our imagination than in reality. - Seneca

Re:He really is old, isn't he? by Anonymous Coward · 2018-09-29 23:57 · Score: 2, Insightful

We called these "Heisenbugs" - attempting to study the bug (via debugger/variable dumps, etc.) causes it to vanish from sight.

Off by one...... by Proudrooster · 2018-09-30 00:21 · Score: 2

I feel ya brother.. the off by one still gets me 30 years later.
https://en.wikipedia.org/wiki/...

I wish we could have an agreement that lists, arrays, elements, and anything put into a list, table, query, associative array, start with an index value of either 0 or 1.

I don't care just pick one, and don't use two different standards in the same environment.

Re:Off by one...... by Darinbob · 2018-09-30 18:25 · Score: 2

Don't forget the old Fortran programmer who moved to C/C++ and insists that all of his arrays start at 1, like God intended, and added some helper functions and macros to do this. Pity the developer who inherited that code and had to maintain it.

Re:He really is old, isn't he? by Antique+Geekmeister · 2018-09-30 00:22 · Score: 1

> Ever tried debugging deep-level OS kernel code?

At least I have. I'm remembering various times, long ago for me, when bugs or limitations in gcc caused difficulty for me applying kernel patches. I sympathize with expanding the reporting to make the critical information visible.

Offensive by 110010001000 · 2018-09-30 00:40 · Score: 4, Funny

I find that calling someone "stupid" (even yourself) is offensive and the imagery of "hitting with a mallet" is extremely violent. He shouldn't be allowed to work on open source projects.

Re:Offensive by Darinbob · 2018-09-30 18:26 · Score: 1

I call myself stupid all the time, this lessens the blow of other people calling me stupid.
Re:Offensive by ConceptJunkie · 2018-10-01 07:16 · Score: 1

I always say that all my code should be idiot-proofed because I'm one of the idiots. But I do generally write good code.

--
You are in a maze of twisty little passages, all alike.

Re:He really is old, isn't he? by gtall · 2018-09-30 00:42 · Score: 1

Hardware is replete with things that evade normal state-based analysis. The Spectre bugs show that. Anytime you have a shared resource, you must diagnose the myriad ways in which it could be taken advantage of. In security, we might view many of these as covert channels. Timing channels are typically the hardest to analyze.

Prevention: if (2 == $a) by raymorris · 2018-09-30 00:58 · Score: 1

I've started preventing that by habitually putting the variable on the right side. If I accidentally use = instead of == I'll get a syntax error.

Re:Prevention: if (2 == $a) by NormalVisual · 2018-09-30 02:42 · Score: 1

Where I work, that's a coding standard, and will get a hard fail in a code review if the variable is on the left side. Of course it's not bulletproof, like if you're comparing two variables, but it helps. Other standards, like requiring braces for even single-line if/thens help as well, at the occasional expense of readability.

--
Please stand clear of the doors, por favor mantenganse alejado de las puertas

Easy prevention: if (10 == variable) by raymorris · 2018-09-30 01:01 · Score: 2

I've started preventing that by habitually putting the variable on the right side. If I accidentally use = instead of == I'll get a syntax error. It makes that bug impossible by just changing an arbitrary habit.

if ( 10 == variable )
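
A minimal C sketch of the habit described above. The function name `is_ten` is made up for illustration and is not from the post. With the constant on the left, a mistyped `=` fails to compile because a literal cannot be assigned to; with the variable on the left, the same typo compiles and silently overwrites the variable.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether `variable` equals 10. */
bool is_ten(int variable)
{
    /* Constant on the left: writing `10 = variable` by mistake is
     * rejected by the compiler, since 10 is not an lvalue. */
    if (10 == variable) {
        return true;
    }
    return false;
}

/* For contrast, the typo this guards against. It compiles (many
 * compilers warn about it) and the condition is always true:
 *
 *     if (variable = 10) { ... }
 */
```

As a reply further down notes, this does not help when both sides are variables, so compiler warnings remain worth enabling.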

Re:Easy prevention: if (10 == variable) by turbidostato · 2018-09-30 01:56 · Score: 1

That's the known workaround and highlights another problem that impacts our profession in full: our inability to stand on the shoulders of giants. With each generation, so it seems, we invent a whole lot of new things at the expense of forgetting a lot of things we already knew: two steps forward, one step (and sometimes even two, three or a whole mile) backwards.

Re:Easy prevention: if (10 == variable) by goose-incarnated · 2018-09-30 02:55 · Score: 1

> I've started preventing that by habitually putting the variable on the right side. If I accidentally use = instead of == I'll get a syntax error. It makes that bug impossible by just changing an arbitrary habit.
> if ( 10 == variable )

I used to do that until compilers started issuing warnings for assignments in conditional expressions; now I simply use the extra parentheses so that the compiler *KNOWS* that that construct was intended.

--
I'm a minority race. Save your vitriol for white people.

Sometimes you need to debug optimized code. by Anonymous Coward · 2018-09-30 01:10 · Score: 2, Informative

So you have a failed assertion. What happened? Fire up the debugger, breakpoint on abort. Breakpoint gets triggered, you get a backtrace. Can't imagine how you got there.

Days of debugging later...

The abort function is marked as "noreturn". Consequently instead of calling abort, the compiler saves a few bytes/cycles by jumping to a preexisting abort call, never mind the state of the stack frame. Of course, this single recycled abort call in the whole module is where all backtraces end up. Hooray.

Now obviously the whole purpose of abort as opposed to exit is to get a core dump. And the whole purpose of a core dump is debugging. And debugging involves backtraces, so abort calls should leave stack and continuation in a useful and recognizable state. So the obvious remedy is not to mark abort as "noreturn". Because you never want to have the stack in a mess when aborting as opposed to exiting.

Enter your most beloved glibc maintainer of yore. Who refuses to lie to the compiler for any reason at all.

This shtoopid problem will stick around. -fno-crossjumping for yall.
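
Besides the compiler flag mentioned above, one hedged workaround for hand-rolled checks that call abort directly is to record the call site yourself before aborting, so the message identifies the failing check even when the backtrace does not. The names `format_failure`, `check_failed` and `CHECK` below are hypothetical, and the standard `assert` macro already prints similar information; this is only a sketch of the idea.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "file:line: check failed: expr" into buf.
 * Returns the number of characters snprintf would have written. */
int format_failure(char *buf, size_t size,
                   const char *file, int line, const char *expr)
{
    return snprintf(buf, size, "%s:%d: check failed: %s", file, line, expr);
}

/* Prints the failing location to stderr, then aborts to get a core dump.
 * Even if the optimizer funnels every failure through one shared call,
 * the message still says which check fired. */
void check_failed(const char *file, int line, const char *expr)
{
    char msg[256];

    format_failure(msg, sizeof msg, file, line, expr);
    fprintf(stderr, "%s\n", msg);
    abort();
}

/* Hypothetical assertion macro built on the helper above. */
#define CHECK(cond) \
    ((cond) ? (void)0 : check_failed(__FILE__, __LINE__, #cond))
```

This does not repair the stack frame in the core dump; it only makes sure the file and line survive in the log.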

Re:printf() may not work for multithreaded problem by pauljlucas · 2018-09-30 01:17 · Score: 1

Of course not. There is nothing special about printf: it is just an ordinary function that takes time (multiple cycles) to execute. During that time, multiple values to be printed can be changed by other threads so the printed results are inconsistent. In such cases, you need to use a mutex.

--
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.

Re: He really is old, isn't he? by turbidostato · 2018-09-30 01:36 · Score: 1

While I won't say Eric's name is not suitable, I feel this should be called an "Akerue problem". As to the why, I leave it as an exercise to the reader (and I hope Eric gets to read this).

Re:printf() may not work for multithreaded problem by Wrath0fb0b · 2018-09-30 02:03 · Score: 5, Interesting

Fun story time related by a colleague. A pretty common piece of software (hint: there's probably one running within a few hundred yards of you) had an elusive bug. But as the parent noted, printf caused the problem to go away, and it was suspected because it caused synchronization on stdout. Unlike the parent, the developers didn't have time to actually implement a buffered-log solution to figure this out, so they did the obviously logical thing -- they replaced all the printf calls with barrier() and shipped it. It's still running like this today.

Another good one, I worked with someone who would log everything all the time by fprintfing to a high-numbered pipe. When I asked him, he gave a few advantages that still ring partially true (depends on context): first, he said, I can get the log from any running instance without even stopping by d-tracing the system call. But most critically, he said, all the formatting happens in userland and only after the syscall does the kernel actually realize that there's nothing on the other end of the pipe and drop the write. That means, he reasoned, that the release/debug versions would always have very close behavior and would avoid the class of 'bugs that don't reproduce in debug build'. As with the other story, to this day, there's a slew of machines out there, formatting and writing log messages to a pipe that's never open.

we can do better, but are doing worse by mothlos · 2018-09-30 02:15 · Score: 2

We have solutions to reduce this sort of problem (at least once you get past the learning curve), but the top programming languages tend to implement very few of them. Reasoning about state is difficult, particularly when that state can be altered in unexpected ways. It is difficult to be confident that your code does what you think it does when you don't have a computer-checked method of specifying your intentions separate from what your code does.

There are no magic solutions here, at the least you will end up needing to spend more time writing in a specification language and that requires learning how it works. I would say that a gentle introduction to something like this is Elm which has an aim of stripping down typed functional programming into something that doesn't really need a C.S. degree. Here is a video which helps to explain what a better type system can do for your code. If you want to see something a bit more mind-bending check out Idris which has a much more powerful specification language which can prevent things like off-by-one errors or unbounded recursion in many cases. Moving off the scale of usability a bit, there is ATS which is a difficult language, but its specification language is able to make pointer arithmetic safe and doesn't bind you to immutable data structures. Hell, even Rust is full of good ideas that help to avoid these issues. And if fault-tolerant distributed systems are your thing, you need to check out Erlang (or its sibling Elixir) as there are so many great ideas that have been around for decades yet don't get nearly enough exposure.

This doesn't prevent us all from occasionally falling into this trap, but the theme of the languages listed is to find ways to encourage (or force) you to get the little things right the first time with computer-verified specification and to isolate the search space where problems are likely to occur.

Re:we can do better, but are doing worse by PPH · 2018-09-30 03:27 · Score: 1, Funny

Ah yes. The age-old solution to programming problems: Invent another language to throw at it.

--
Have gnu, will travel.

Re:we can do better, but are doing worse by AHuxley · 2018-09-30 11:47 · Score: 2

Back to Ada.

--
Domestic spying is now "Benign Information Gathering"

Re:He really is old, isn't he? by goose-incarnated · 2018-09-30 02:32 · Score: 1

> Instrument everything? Printf is your friend? The guy is talking about something of a few dozen lines of code, and he doesn't simply use a debugger to step through the problematic code? WTF LOL

You're a moron; many problems don't show up under the debugger.

--
I'm a minority race. Save your vitriol for white people.

Time by FeelGood314 · 2018-09-30 02:46 · Score: 1

Whenever I deal with setting time, daylight savings, time zones etc., managers just assume it is going to be easy. And then the corner cases start popping up. What happens when a co-processor clock drifts 3 seconds ahead during the transition at the end of daylight savings and you had an event that started at 2am local time... Your 20 minute easy bug fix turns into a week of long hours and 2 weeks of testing for the test team. And don't get me started with NTP, that code is awesome but the jitter handling is so complicated.

Re:printf() may not work for multithreaded problem by goose-incarnated · 2018-09-30 02:52 · Score: 2

> A few years ago I had an issue in a multi-threaded program where using printf()'s caused the problem to go away. In order to track the problem down, I ended up writing messages to a buffer in RAM, and dumping the buffer to stdout after the problem occurred.

Similar story, except that the processor would reboot, clearing all the variables I stored leaving no opportunity to grab all the diagnostics.

I examined the map, determined what the last address was, added an interrupt handler on the clock that logged the stack pointer ~250/sec (only needed to log the pointer if it was smaller than the existing one) to determine how much margin I had and used that little space between maximum stack and variables to write my diagnostics to.

Once I had determined the smallest stack address that got used, I wrote my diagnostics into that margin between the stack and the bss. To make sure that the values wouldn't be overwritten on processor startup I could not use actual variables, I had to use a pointer variable that pointed to those ten bytes I could write into. At startup the bootstrap code would grab whatever was in that memory, chuck it via i2c onto another processor, clear the ten bytes, and then proceed with normal bootup.

When booted from cold that memory held nothing, when rebooted the memory was not cleared (because power was not removed) and thus I had my diagnostics from the previous execution.

And yes, I found the bug with the help of the diagnostics (don't recall what it was, but that isn't important).

--
I'm a minority race. Save your vitriol for white people.

BASHisms by Wolfrider · 2018-09-30 03:26 · Score: 1

--Try passing BASH commands over SSH to something that Requires Quotes to work right. There's a reason my hair is short.

--
.
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??

Re:printf() may not work for multithreaded problem by Bite+The+Pillow · 2018-09-30 03:41 · Score: 1

I would assume that printf is thread safe on at least some operating systems, printing the whole message and blocking other calls. That would force thread synchronization with the mutex present but hidden. Especially on Windows, where they do a lot to protect users against bad code, without much outside input in those decisions.

I really hate this damned PC by PPH · 2018-09-30 03:44 · Score: 1

I wish that they would sell it.
It never does quite what I want,
but only what I tell it.

--
Have gnu, will travel.

Why I was very cautious about time estimate by shoor · 2018-09-30 03:45 · Score: 1

Back in the day, I'd be given a programming task, and my boss would, naturally, want me to estimate how long it would take. Sometimes it looked like something I could do in maybe a day or two, but I always worried about that elusive bug that might pop up, where I'd get 95% of the coding done in that day or two, but then spend a week tracking down that one knotty problem.

--
In theory, theory and practice are the same; in practice they're different. (Yogi Berra & A. Einstein)

Re:Profilers by Galactic+Dominator · 2018-09-30 04:09 · Score: 2

> (2) Because they don't distinguish between waste (a) and time consuming functionality (b)

If you are looking for profilers to analyze your code for inefficiencies, then you have a different definition of profiler than I believe most high-level users do. Profilers are there to make a representation of where time/cycles are spent in code. It is up to the author to analyze and act upon such information. And profiling is extremely useful provided you understand the code and infrastructure. You are correct in one way though: it is useless for optimization if you don't know the very basics of programming.

--
brandelf -t FreeBSD /brain

Which language? An expression in C++, C#, Java by raymorris · 2018-09-30 04:14 · Score: 1

Which "most C-syntax languages" do you have in mind?

In C++, C#, Java, Perl, and most languages I can think of, assignment is an expression, which means it returns a value.

The sole exception I can think of is that in Rust there IS NO assignment for many kinds of values. In Rust what looks like an assignment:

A = B;

May actually destroy B, making it no longer accessible. The value is moved from B to A, not copied. There can be only one instance of most value types, there cannot be two variables with the value. For that reason in Rust you can't do:
A = B = C;

That would (in any C derived language) result in A and B having the same value, which is often not supported in Rust.

Is Rust what you meant by "most languages"? Rust is weird in this respect (and a few other respects as well).

Re:printf() may not work for multithreaded problem by pauljlucas · 2018-09-30 04:18 · Score: 1

printf (and the stdio library in particular) may be thread-safe with respect to the FILE* objects but never with respect to the arguments you pass to printf for printing. In particular, something like:

printf( "%d %d\n", i, j );

guarantees nothing about when (or in what order) the values of i and j are read and copied into printf's stack frame (which happens before printf is even called just as every other functions' arguments).
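
A minimal sketch of the mutex approach mentioned earlier in this thread, using POSIX threads. The struct and function names are hypothetical. The idea is to copy both values under a lock that every writer also takes, and then print from the private copy.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical shared state, updated together by worker threads. */
struct pair {
    int i;
    int j;
};

static struct pair shared = {0, 0};
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writers update both fields under the lock so they change as a unit. */
void set_pair(int i, int j)
{
    pthread_mutex_lock(&shared_lock);
    shared.i = i;
    shared.j = j;
    pthread_mutex_unlock(&shared_lock);
}

/* Readers copy both fields under the same lock. */
struct pair snapshot_pair(void)
{
    struct pair copy;

    pthread_mutex_lock(&shared_lock);
    copy = shared;
    pthread_mutex_unlock(&shared_lock);
    return copy;
}

/* Print from the private copy; the slow printf call happens outside the
 * lock, and no other thread can change `copy`. */
void print_pair(void)
{
    struct pair copy = snapshot_pair();

    printf("%d %d\n", copy.i, copy.j);
}
```

This only works if every writer goes through `set_pair` (or otherwise takes the same lock). It also changes timing, so a race elsewhere may still hide once the lock is added.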

--
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.

postfix by TechyImmigrant · 2018-09-30 05:09 · Score: 1

setgrey vs setgray

'nuff said

--
I should use this sig to advertise my book ISBN-13 : 978-1501515132.

Cold your mind and work in a reasonable order. by Anonymous Coward · 2018-09-30 05:17 · Score: 1

Recently I spent "weeks" chasing a configuration problem with a RAID1 in an embedded system I am working with. One of the disks kept dropping out of the RAID at random ...

So I checked the disks and they were OK. I searched for my Linux version and someone suggested that "maybe" it was because of the particular kernel version, so I upgraded the kernel. But the new kernel did not work with my distribution, so I had to change that too ... but the new distribution used different versions of some key libraries, forcing me to recompile my program against the new ones (crossing my fingers that my program would keep working with them) ... after rebuilding everything, the first thing I found was a failure, just faster than before. What .... ( !!! some word not suitable for children !!! ).

In the end, I stopped and cooled my head. Let's see what it could be ... a little more research. Oh, that symptom is similar to a bad cable ... but the cables are brand new, high-quality ones ... let's try a new one. Oh my God !! ... zero problems. The system has now behaved perfectly for many days. That cable was faulty, no matter how much I wanted to trust the supplier.

In general: when something goes wrong, it is important first to stop and enumerate all the possible reasons for the failure. Then order them by difficulty. Check the easiest first, and go deeper as each candidate fails to resolve the problem. Don't try to rebuild the world first; that is not a good solution.

Or even worse, the bug vanishes when looked for by rbrander · 2018-09-30 05:28 · Score: 1

I can't remember the details now. (In particular, I cannot remember the date or who drove me home) But I can recall these kinds of bugs where you put in the "print" statement after every line, and figure, NOW it will be revealed... ...and the bug goes away. And I gradually removed print statements and brought the code back to not-inspected, and the bug stays gone.

Your REAL nightmare would be to have it come back at that point, it would start to feel like the X-files. (It's close to that in Ellen Ullman's nightmarish novel, "The Bug")

I never had it go THAT far, but I did "cure" some bugs by looking for them, having them disappear without me knowing what part of the search process changed something in the non-Print-statements and made it go away, then wondered for ages what the hell had really gone wrong...usually every time I used the program for years, still feeling mistrustful and often double-checking it.

Under Siege 2: Dark Territory by rastos1 · 2018-09-30 05:54 · Score: 1

... places where you think you are sure what is going on.

"Assumption is the mother of all fuck-ups!"

WTF? RMS is only now figuring this out??? by Narcocide · 2018-09-30 06:18 · Score: 1

Sure, printf is your friend but I have been telling people that for decades now. Nobody ever listened to me when they decided to blame their own naivety on the language or the development suite, or me personally, instead.

Context for ESR's comment: Python-to-Go translator by Entrope · 2018-09-30 06:23 · Score: 1

Home page: http://www.catb.org/esr/pytogo...

This is not too surprising given his recent work on reposurgeon, his previous statements that Python is simply not performant enough for converting the gcc repository from Subversion to git, and his exploration of Rust vs Go as systems programming languages.

Debugging and Code Comprehension by BobC · 2018-09-30 06:29 · Score: 1

My debug tools and code comprehension tools for C are nearly identical:
1. An IDE that's smart enough to dim code that's commented or #if'ed out (vscode works OK).
2. A code coverage tool (gcov is fine).
3. An execution trace tool (e.g., rr - https://en.wikipedia.org/wiki/Rr_(debugging)).
4. A "fat" interface to GDB (lots of dynamic content and context display, often a GUI).

The first "rule" is to simply spend NO TIME on code that has nothing to do with the task/problem at hand. Drill down first, then pay attention to the rest only as needed to get the job done. Which means the first step is to determine which code is relevant (vs. proven irrelevant).

I'm getting back into C after first having done 20 years of C then a decade of Python, all in the area of real-time/embedded sensing and control. I'm amazed how little the C environment has changed, and how rarely the newer (and better) C language features are actually being used.

Re:Get a better debugger by mikael · 2018-09-30 06:40 · Score: 1

One problem I encountered with an early C++ compiler was that a chain of function calls which had a function in the middle that had a return result of "int" would actually return the result of the last function call it itself had made, even if no return result was supplied. While this was the actual intention, it was spooky that it worked with no value being returned. Any attempts to add new code would just make this fail.

--
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads

Re:printf() may not work for multithreaded problem by mikael · 2018-09-30 06:44 · Score: 1

printf makes calls to malloc/alloc and free to do allocation and deallocation for string construction. There were some libraries that provided the printf functionality using a static pool of memory and string concatenation.

--
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads

Re:printf() may not work for multithreaded problem by Anonymous Coward · 2018-09-30 07:30 · Score: 1

Yes, I've had printf-obscuring bugs (Heisenbugs?), and bugs where printf and stderr get eaten by the program that wraps the program with the bug, and bugs where redirecting stdout or stderr to a file or a pipe changes the behavior of the program.

Eventually I resorted to putting debug statements inside unlink() calls:
unlink("-- debug message here")
and then used strace to watch what's going on:
strace -f -s 1024 -e trace=unlink programname args args args...

Unlink was chosen because it's not buffered (like printf), it's fast, it's a kernel (not library) call, it's got a single string argument, it doesn't have any[1] side effects, it's not called very much by normal programs, it can't be redirected to /dev/null, and I can grep for it easily.

[1] If you put '--' at the start of the message, there's no way it'll match an existing filename, thus, no side effects.
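
The trick above could be wrapped in a small helper along these lines. The name `debug_mark` is hypothetical, and the sketch assumes no file whose name starts with "-- " exists in the current directory, so the call is expected to fail harmlessly.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: emits a debug marker that shows up in strace
 * output as an unlink() call. The "-- " prefix makes it unlikely that
 * the path names a real file. Returns unlink()'s result, normally -1. */
int debug_mark(const char *msg)
{
    char path[1024];
    int saved_errno = errno;  /* do not disturb the caller's errno */
    int rc;

    snprintf(path, sizeof path, "-- %s", msg);
    rc = unlink(path);
    errno = saved_errno;
    return rc;
}
```

A call such as `debug_mark("entering parse loop")` would then appear under the strace command quoted above. On some platforms the C library may issue `unlinkat` instead of `unlink`, in which case the `-e trace=` filter would need adjusting.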

!= instead of == or vice versa by mark-t · 2018-09-30 07:33 · Score: 2

I know what it is that I mean for the program to do, but sometimes will type exactly the opposite, all the while continuing to read it the way that I meant it. Even putting an assert in will not help because in close proximity to where I've accidentally created this kind of inverted condition, it is unfortunately quite likely I will repeat the mistake. And again, when I make these kinds of mistakes, I cannot easily find them on my own because I see the code I thought I wrote instead of what is necessarily actually there.

--
File under 'M' for 'Manic ranting'

Re:!= instead of == or vice versa by TeknoHog · 2018-09-30 10:04 · Score: 1

In Python, the pass statement helps with these. You can always convert a single if-without-else construct into an if-else one, where one of the cases has the non-inverted condition.

--
Escher was the first MC and Giger invented the HR department.

Re:Get a better debugger by djinn6 · 2018-09-30 08:44 · Score: 1

This would make a lot more sense if you read the disassembled program. Per common C calling conventions on x86, the return value is stored in the eax register before control is handed back to the caller. Suppose function A calls function B, which in turn calls function C. C would set the eax register to some int value that it wants to return; since B does nothing after calling C, eax would be unmodified. When A checks eax for the return value of B, it would get C's return value instead.
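
The A/B/C chain described above, sketched in C with the fix applied. The function names are placeholders. The buggy middle function is shown only in a comment, because relying on the leftover register value is undefined behavior.

```c
/* Innermost function: returns a value the normal way. */
int func_c(void)
{
    return 42;
}

/* Middle function. The buggy version looked like
 *
 *     int func_b(void) { func_c(); }
 *
 * and only appeared to work because func_c's result was still sitting
 * in the return register. The fix is to pass the value along
 * explicitly. */
int func_b(void)
{
    return func_c();
}

/* Outermost caller: now gets func_c's value by design, not by accident. */
int func_a(void)
{
    return func_b();
}
```

GCC and Clang can flag the buggy form with `-Wreturn-type`, which `-Wall` turns on.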

An old systems fart says... by Anonymous Coward · 2018-09-30 09:40 · Score: 1

On an old system that suffered from poor architecture and management refusing to spend money to fix it...

I've had parts (LSB) of pointers get corrupt, causing corruption in other random places, but close to where they should point. A little code change and that pointer is no longer clobbered or in turn clobbers something different with no obvious effect.

I've had to add code to the C preamble that would check if a certain variable was corrupt, and log the address of the current function and call stack on just the first time it is found corrupt. That got me close enough to find it.
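
A rough sketch of that kind of check, under stated assumptions: the watched variable has one known-good value, and the function name stands in for the address and call stack the original logged. All names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define CANARY_OK 0x5AFEC0DEu   /* hypothetical known-good value */

static unsigned int watched = CANARY_OK;   /* the variable being clobbered */
static const char *first_bad_func = NULL;  /* where corruption was first seen */

/* Returns true while the watched variable is intact. On corruption it
 * records the name of the first function that noticed; later detections
 * are ignored so the earliest sighting is preserved. */
bool check_watched(const char *func)
{
    if (watched == CANARY_OK) {
        return true;
    }
    if (first_bad_func == NULL) {
        first_bad_func = func;
    }
    return false;
}

/* Drop this at the top of each function under suspicion. */
#define CHECK_WATCHED() check_watched(__func__)
```

The first function to report corruption is not necessarily the culprit, only the first place it was noticed, which narrows the search to whatever ran since the previous clean check.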

Lockups in Microsoft Windows system calls got bad enough that I finally had to add a timer interrupt and a heartbeat in the code. If the timer ever went off it would log the main program call stack up into the system call. We proved it was a Microsoft bug and pay $10K a year for the right to hear them say "Yep it is our bug, but we have no idea why and can't fix it."

Microsoft compiler bugs that would produce incorrect code if there was a /* */ comment that straddled a 0x8000 boundary in the source code.

MS FAT file system messes up file time stamps in daylight saving time changes.

MS Windows problems with 32 bit timers overflowing after 42 days.

Yes, these bugs were in Windows NT and later, not 3.11 or 9x.

I found a bug in the Z80 processor (push AF doesn't push correct flags if interrupt occurred during previous instruction) and when I reported it to Zilog, they admitted it and gave me some work around code. Find that one if you can.

Interrupt levels were level triggered instead of edge triggered as it was documented. This caused problems with nested interrupts. Very rare occurrence, until I had the idea that might be what it was and put a time wasting loop inside the interrupt handler to make it trivial to recreate.

Porting from real mode to protected mode, I had to write a GPF handler to catch app programmer bugs and allow the code to continue to run if the GPF was not harmful, like scanning past the end of data. That required code that disassembled the running code to find out how long the offending instruction was so that it could be skipped and registers modified to act like it did before.

More, but I've got to go visit father in law just home from the hospital.

Weeping Angels by Weaselmancer · 2018-09-30 11:15 · Score: 2

I like to think of those kinds of bugs as Weeping Angels. They only move when you're not looking at them.

I have about a dozen years experience in MS Embedded CE. There is typically a Release build, and a Debug build. Release will macro out all the debug statements, which changes the execution timing. Enough so to where the bug that is biting you is often seen only in Release. Switch to Debug to chase it, and it goes away.

I had a similar experience recently with a PIC32 project. The devboard they sell has floating inputs on UART1. It never fails in the devboard. It does fail in the board I made. The floating inputs every so often will decide to twitch back and forth rapidly, firing a shitstorm of interrupt requests that crash the firmware. It never dies on the devboard. It occasionally gets twitchy and dies on our board, which is exactly derived from the schematic of the devboard. As an added plus, if you hook up an oscilloscope to the pins that changes impedance, and the float goes away, and the problem goes away. I have no idea how the devboard does not suffer from the same problem.

--
Weaselmancer
rediculous.

Chasing the SIGBUS by mdhoover · 2018-09-30 12:10 · Score: 1

I don't know how many days of my life have been lost chasing the SIGBUS on SPARC while porting libraries over to Solaris.

And don't get me started on code which assumes a null pointer is an empty string

Had one of these last week by tdelaney · 2018-09-30 13:24 · Score: 1

Had one of these last week. I'd upgraded Groovy (used as a scripting engine) from 2.3 to 2.5 (and 2.4 which also exhibited the problem). Suddenly scripts that extended a custom base script couldn't instantiate the base class due to a missing constructor. The initial script was instantiating, and if I removed extending the base class it was able to execute.

Of course, it was actually working in my IDE.

After multiple days of logging, experimenting, configuring my IDE to (partially) replicate the problem, chasing down the possibility of classloaders resulting in the base class not actually extending the correct Script class (prompted by it working in my IDE but not elsewhere), rebuilding Groovy 2.5 to work around a bug in it, etc I eventually found the root cause - and it was a facepalm moment.

The scripts were actually:

Script1:
@BaseScript subpackage.CustomScript

subpackage.CustomScript
@BaseScript CommonCodeScript

It appears that in Groovy 2.4 @BaseScript changed from using the no-arg Script() constructor to the Script(Binding) constructor. And CommonCodeScript didn't implement CommonCodeScript(Binding), so CustomScript didn't implement it either.

The fix was to just @InheritConstructors for CommonCodeScript.

I still don't know how it was working in my IDE ...

Re:He really is old, isn't he? by Darinbob · 2018-09-30 17:39 · Score: 2

I didn't have a debugger, since the stupid chip gets wonky when turning it on. So compile, load the code, look at the oscilloscope, scratch my head, and repeat. I worried I was doing something stupid, like not really loading my new code and still running the old code, but that checked out too. Asked for some help over Skype, but got nowhere.

Stop and stare at it, the change was supposed to be over and done in 5 to 10 minutes and it's been a few hours. Then I see it, I forgot a "~". I wasn't clearing that bit, I was clearing everything but that bit. And that's the first programming related question I tend to ask in interviews, so I felt pretty dumb.

This was Thursday. I will be keeping a journal of my senility.
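
For anyone who has not been bitten by this one yet, here is the missing "~" in miniature. The bit position and function names are made up for illustration.

```c
#include <stdint.h>

#define FLAG_BIT (1u << 3)   /* hypothetical bit to clear */

/* Correct: AND with the inverted mask clears only FLAG_BIT. */
uint32_t clear_flag(uint32_t reg)
{
    return reg & ~FLAG_BIT;
}

/* The bug: without the "~", everything except FLAG_BIT is cleared. */
uint32_t clear_flag_buggy(uint32_t reg)
{
    return reg & FLAG_BIT;
}
```

Starting from 0xFF, the correct version leaves 0xF7 while the buggy one leaves 0x08.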

Re:He really is old, isn't he? by Darinbob · 2018-09-30 17:45 · Score: 1

A kernel debugging itself is tricky, and leads some into a trap when they don't realize that some things won't debug that way. Using a separate JTAG or other external debugging helps a lot, but sometimes that screws things up too.

What I hate is when there's a bug that depends upon timing so that using a debugger causes it to not happen. Tell the debugger to just run until the bug hits, and then you can't step backwards. If you single step through it, the other tasks aren't running so the bug doesn't appear. That is when you put in your old fashioned printfs, or add a mini logging mechanism to tell you the last N function calls, etc.
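
A minimal sketch of such a "last N function calls" log as a ring buffer of function names. The slot count and all names are hypothetical, and the code is not thread-safe as written; a real multitasking system would need atomics or one buffer per task.

```c
#include <stddef.h>

#define TRACE_SLOTS 8   /* hypothetical N: how many recent entries to keep */

static const char *trace_ring[TRACE_SLOTS];
static size_t trace_count = 0;   /* total entries ever recorded */

/* Records a function name, overwriting the oldest slot when full.
 * Only a pointer store and an increment, so it disturbs timing far
 * less than a printf would. */
void trace_enter(const char *func)
{
    trace_ring[trace_count % TRACE_SLOTS] = func;
    trace_count++;
}

/* Returns the k-th most recent entry (0 = newest), or NULL if fewer
 * than k + 1 entries exist or k falls outside the window. */
const char *trace_recent(size_t k)
{
    if (k >= TRACE_SLOTS || k >= trace_count) {
        return NULL;
    }
    return trace_ring[(trace_count - 1 - k) % TRACE_SLOTS];
}

/* Drop this at the top of each function of interest. */
#define TRACE() trace_enter(__func__)
```

After the bug hits, the buffer can be dumped from a debugger or printed by walking `trace_recent(0)` through `trace_recent(TRACE_SLOTS - 1)`.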

Re:He really is old, isn't he? by Darinbob · 2018-09-30 17:54 · Score: 1

Or the bug only happens in optimized code but goes away when compiling for debugging. Which means you end up debugging optimized code so that the debugger is uncertain about a variable's value. You end up cross referencing it with an assembler listing.

The fun ones are alignment errors. Which never happen on x86 code so those who only know Windows or Linux on a PC have never encountered these. Stick in a variable to store state for debugging and suddenly things are aligned again. This sometimes leads to the developer saying "I don't know what the bug was but when I merged in Bob's code it started working again", then it gets checked in and forgotten about for several months until it resurfaces.

Sadly that last seems to happen a lot when you've got really complicated code, poor documentation (or hard to find), and the developers are new and don't understand it as well as they think. The result is devs poking at bugs with sticks trying to get them to go away.

Re:Get a better debugger by Darinbob · 2018-09-30 18:01 · Score: 1

I knew people who relied on non-standard or indeterminate compiler behavior like that. Then they'd move to a new machine or upgrade the compiler and complain that it wasn't working right or that it was now giving warnings about code that they thought was perfectly fine.

Re:Get a better debugger by Darinbob · 2018-09-30 18:02 · Score: 1

Which is why the newer compilers give warnings in that case.

Instrumentation is nice, but ... by Ihlosi · 2018-09-30 19:42 · Score: 1

Instrumentation is nice, but try doing it on a smallish target (think microcontroller) which has to run in real-time, with mediocre and possibly buggy debug adapters.
Shtoopid problems might be programmer hell. Shtoopid problems on a small target that is hard to instrument are the laparoscopic version of programmer hell.

Re: He really is old, isn't he? by Asgerix · 2018-09-30 21:30 · Score: 1

That's a kind of backwards way of naming it, I think ;-)

--
Life is wet, then you dry.

forgot a part by sad_ · 2018-09-30 23:03 · Score: 1

so working on this all day, and while driving home, or laying in bed or whatever... the solution pops in your head! you now know what is wrong.
make haste to your desk, implement the thought of solution, which then... comes very close but still isn't perfect.
repeat.

--
On a long enough timeline, the survival rate for everyone drops to zero.

Re:He really is old, isn't he? by Medievalist · 2018-10-01 06:25 · Score: 1

> The guy is talking about something of a few dozen lines of code, and he doesn't simply use a debugger to step through the problematic code?

Oy veh, get off my lawn, you whippersnapper!

Using the same methods I've used since a powerful computer was one that had 8K main memory, I can debug anything. Why would I want to waste time on a limited tool when I already have globally applicable technique?

Learning how to debug on every OS, in every environment, forever, is a lot better use of my time than learning how to use this year's fad debugger/IDE. I can use the time saved to learn the latest fad language, which will be a lot more fun!

Save your files by Nahor · 2018-10-01 15:19 · Score: 1

A classmate of mine spent 45min trying to debug a crash. Eventually he added some printfs here and there but they didn't trigger. So he added more and more. Still nothing. Then he tried to understand why a "shtoopid" printf didn't work... Eventually he figured out that he never actually saved the file in those 45min, that he had kept running the same binary over and over.
