New & Revolutionary Debugging Techniques?
An anonymous reader writes "It seems that people are still using print statements to debug programs (Brian Kernighan does!).
Besides the ol' traditional debugger, do you know any new debugger that has a revolutionary way to help us inspect the data? (don't answer it with ddd, or any other debugger that got fancy data display), what I mean is a new revolutionary way. I have only found one answer.
It seems that Relative Debugging is quite neat and cool."
Sometimes something goes wrong WITHOUT causing an exception. Those are the bugs hard to find.
You get what's called 'glassnose syndrome' too easily.
Instead concentrate on building software in many small incremental steps so that problems are caught quickly, and on separation of design so that dependencies are rare.
If you can't find a problem, leave it and do something else.
Otherwise, print statements, yes, that's about the right level to debug at.
Ceci n'est pas une signature
Good idea, but isn't unit testing + standard assertions do the same thing but in more automatic way ?
You feed some data to functions, you expect some sane pre-calculated output from them. Simple yet powerful.
And more important it's automatic. So you can integrate it into build process.
- Arwen, I'm your father, Agent Smith.
- Well, you're just Smith, but my father is Aerosmith!
Java Exceptions *were* a revolution in debugging.
Because everyone knows that Java invented exception handling...
I can't escape the suspicion that the anonymous poster is actually in some way connected to Guardsoft, but let's leave that for now...
I think it's a good idea, but I do wonder how many situations you'll be in where you already have an exisiting program that does everything you want to test against.
Having said, that, I can see how this would help with regression testing - making sure that you've not introduced any new bugs when fixing old ones. But I wonder how much it gives you above a general testing framework anyway...
"Relative debugging" seems to be what people have always been doing. Dump some state and comapre it to an expected state. Most frameworks for regression tests do something like that.
The best debugging method is to have a fast build environment so that you can add one printf, rebuild, reproduce the bug, move the printf to an even better place, rebuild and reproduce, etc. The more you rely on your tools to do the work for you, the less you understand the code and the less you understand the code, the more bugs you will make in the future.
There are no shortcuts to good code.
Comparing the "state" of multiple implementations or versions of code is an old technique. You don't need a special debugger for it--you can use a regular debugger and a tiny bit of glue code. Alternatively, you can insert the debugging code using aspects (aspectj.org).
However, like many programming techniques, most real world programmers won't know about them unless they can shell out $1000 for a tool; reading a paper or book just would be too much intellectual challenge, right?
This news item seems to be a thinly veiled attempt to drum up business for that company.
That sounds cool, but it isn't all that useful in practice. Debuggers that support stepping backwards usually end up keeping a lot of state around, which limits them to fairly small, well-defined problems or modules. But the problems where an experienced programmers need a debugger are just the opposite: they involve lots of code and large amounts of data.
Usually, it's best to avoid going back and forth through the code altogether; insert assertions and see which ones fail.
Print statements are a great tool, especially on large pieces of software maintained/enhanced by many people. Once you've debugged your problem, you just #ifdef out the prints, and check the code back into version control.
When the next poor programmer comes along, trying to fix/find a bug in that code, he a) can #ifdef the prints back on and quickly get debugging output about the important events taking place in his run, and b) read the code and see where the hairy bits are, because they tend to be the sections most heavily littered with debugging print calls.
Fancy debugger IDEs just don't support this preservation of institutional knowledge.
It seems to me that a lot more effort is being put into creating good unit tests to identify and prevent bugs, rather than debugging running applications. With an automated testing framework you can seriously reduce the amount of time spent on manual debugging and fixing as the bugs get identified as early as compile time, rather than run time.
Yeah, it's always tempting to leave your debug printfs in and ifdef them out but I find that has two problems:
1. When reading the code for logic, the print statements can be distracting and take up valuable vertical screen realestate. An algorithm without printfs can usually fit on a single screen. With printfs it may spill over two pages. That can make debugging harder if you need to understand what you're looking at at a conceptual level.
2. Almost invariably I find that a previous person's printfs are almost totally useless because that person was usually debugging a different problem. Thus, enabling old printfs will dump out a whole bunch of information I couldn't possibly care about.
Thus, I make it a point to clear out any debug printfs I was using before I check in my code.
...is that they don't help in time-dependent situations.
For example, a program in C that uses lots of signals and semaphores could perform differently when print statements are added. This is because print statements take a (relatively) long time to execute. Print statements can affect the bug their supposed to be monitoring.
I had a situation very much like this. One process would fork and exec another, and they would send signals to each other to communicate. But there were a few small bugs that caused one of the processes to occationally miss a signal. When I added the print statements, it slowed the process down enough that it caught the signal. The only way i was able to successfully debug it was with a line-by-line trace with pencil and paper. I don't know if ddd would have helped (but I didn't know about it at the time).
Of course, as many people who debug multi-threaded programs have found, using print routines to output logs can make the bug 'go away', because quite often CRT functions like printf() etc are mutex'd, which serialises code execution, and thus alters the timing, and voila, race condition begone!
:-S
I know it's happened to me
I'm guessing it's because writing a device driver in LISP is not anybody's idea of a good time?
Seriously though, not all bugs in a program have to do with allocation or buffer overflows. Besides, there are tools to help people with these problems. Most interesting bugs are beyond these types of distractions and can happen regardless of the language, or are you asserting that functional language programs always work perfectly the first time you run them?
Often something goes wrong with no runtime error. Those bugs are often really, really difficult to find.
Nope, looks like marketroid hype to me. Answer me this: what is the point of comparing two separate identical runs of a computer, except in the case of testing platform equivalence, in which case the output of a test set can simply be diff'd.
The key to their idea is that The user first formulates a set of assertions about key data structures, which equals traditional techniques. The reason such traditional techniques have failed and continue to fail is that those assertions are always an order of magnitude simpler than the code itself. These people forget that a program *is* a set of assumptions. Dumbing it down to "x must be > y" doesn't help with the complex flow of information.
Peace & Blessings,
bmac
- The best programmer I've met once told me that once you've dropped into the debugger, you've lost, which over time I've found to be quite true. The best debugging practice is to learn how not to use a debugger. (e.g., Are you using threads when they're not absolutely required? Say hello to debugging hell...)
- When you must debug, print statements cover 97% of the cases perfectly. They allow you to formulate a hypothesis and test it experimentally as efficiently as possible.
- Differential debugging is a nifty idea, but most of the time it'd be better to just use it with your print statements as above (e.g., print to logs and then diff them). For the one time per year (or five or ten years?) that having a true differential debugger might pay off, it's probably a loss anyway because of the cost and learning curve of the tool. (I thought about adding this to SUBTERFUGUE, but realized that no one would likely ever productively use this feature.)
- If you need another reason to avoid this tool in particular, these guys have a (software) patent on it. Blech!
--Mike"Not an actor, but he plays one on TV."
"I haven't used a debugger in years; print statements are the only debugging tool I need."
:)
You shouldn't limit your options. For instance, I can guarantee that I can find a fix a bug with a debugger faster than using printline's when dealing with cross-language projects. Like using Java's JNI to talk with native dll's, for example. The error could be somewhere on the java side or the C++ side. It could be the object on the C++ side, the converter to a Java object, the way the object is being used on the java side, etc. etc.. When you need to see a bunch of data simultaneously, a debugger is the way to go.
Need I mention assembler? Much easier to look at the registers and stack in a debugger.
I'm not saying that printlines are bad, because I use them frequently for quick to moderate debugging. But when I need to examine large objects, functions, etc. I bring out the debugger.
~X~
~X~
Seriously, though. I've worked as a programmer for the last 15 years. Mostly, I've been fixing other people's bugs. Here's what I like to see in code that I need to fix (and generally don't see):
1) Consistency in formatting, style, variable names, design - I don't care what style you use as long as it's consistent. I prefer my own form of Hungarian Notation, where a variable's prefix indicates its scope (global, static, etc), as well as the type. If any of that information changes, I should darn well follow through to make sure I've fixed everything that depends on them. Bring on strong type checking!
2) No spaghetti code. Give me this:
instead of this:It doesn't look like it matters much yet, but try adding eight more error checks to both, and see which you can track better. The "early bailout on error" model clearly surpasses the "endless nesting" model.3) Use of descriptive variable and procedure names. Source code is not meant to be understood by the computer. This is why we have compilers, and interpreters. Source code is meant to be understood by humans. Write your code for humans, and you'll be surprised at how much faster you can grind through code. You'll only write the code once, but when you have to debug it, you'll spend eternity sifting through line after line, wondering what the hell you meant by that overused "temp" variable (temporary value? temperature? celsius? kelvin?). If you had only taken the time to spell out, "surface_temperature_C", you'd know for sure. Vowels are good for you.
4) Comment! Not every line. Not an impossible to maintain function header comment with dates and initials of everyone who's edited it. Don't fall for nor rely on that "self-documenting" code nonsense. Just one comment line every three to ten code lines. That's all. Give me an overview of what's supposed to happen in each logical block of code. Tell me what if conditions are checking for. A good rule of thumb is to sketch out your functions in comments first, then fill in the blanks.
That's all I can come up with off the top of my head, but there are certainly more...
NOTE: for the pedants who think they noticed an apparent conflict between my hungarian notation style and the "surface_temperature_C" variable: since there is no scope or type prefix on the variable, it's a local variable, and I can change it at will, knowing that it will not affect any code outside the function at hand. If it had been "m_fSurfaceTemperature_C", then I'd know it could have repercussions affecting the state of the current object. If it were "g_fSurfaceTemperature_F", then I'd know I could hose my whole program with an invalid value. And should have converted from Celsius to Fahrenheit before doing so...
How's my programming? Call 1-800-DEV-NULL
I have to agree... this sounds as revolutionary as "Junit". Yes, if you follow the paradigm correctly, you'll produce 100% bug-free code, but it would take so long to follow the paradigm correctly, you'd never get anything done. Not to say that it doesn't look like it might be useful, but I think they're being disingenuous about the amount of work that's going to go into using it.
Proud neuron in the Slashdot hivemind since 2002.
We use C-like languages to make things go very, very fast, immediately. Sometimes a high level language _could_ deliver this if we were willing to wait for the hardware architecture to be re-designed in its favour (which we're not), and sometimes it's just not possible because the C-like language lets you do things which cannot be proven by the machine to be safe, yet nevertheless are correct. Even scary old gets() can be quite harmless, under certain carefully controlled circumstances.
Now, some people use C (or even more stupidly, C++) to make petty database GUIs, or configure their font preferences, and here I agree that it's worthless, and a modern high-level language would be more appropriate. But note that even when Red Hat use Python throughout their management applets, they still crash, just with a run-time error (a list is too short, or a string is found when a number was expected) instead of SIGSEGV. Some of those mistakes might even have security implications... Programmers make mistakes in any and every language.
> Use an editor with folding capability.
For one thing my editor doesn't support this, and not all editors do. For another thing it depends on the folding implementation in the editor as to how distracting this is.
> Redirect stdout/stderr to a file. Besides, this sounds like a straw man. There's nothing stopping you from having differently detailed level of debugging output.
So now you want me to sit down and create sed/awk/grep/perl scripts to filter out stuff I don't want. No thanks. I'd rather just put in the print statements I actually need rather than waste time filtering out useless printfs I didn't want in the first place.
Again, I've found most printfs to be useless except to the person debugging the particular problem. A reusable debug printf is a RARE thing. Why bother preserving them?
> Fine, but I hope you document how you tested/debugged the algorithm so another developer can recreate whatever it was that you did.
All I can say is: DUH. First off, I never said to delete comments, and as far as detailing how the file/problem was debugged, that's what the checkin log is for..
--
"I'm too old to use Emacs." -- Rod MacDonald
In high-level languages, you usually don't have memory-allocation or buffer-overflow problems, but quite often there are other traps. In Perl, numerous gotchas are mentioned in the manual. In Python, unexperienced developers often make shallow copies of lists when deep copies are needed. In Lisp, beginners often accidentally modify quoted lists in program sources, and they may write macros that captures variables. In Haskell, hastily-written programs may leak memory because of incorrect handling of laziness. I can't quickly think of an OCaml example, but at least it is easy to get hard-to-find typing errors during compile time if you are not careful... As for Java, I bet lots of beginners write applets that locks up randomly because they are not well aware of AWT/Swing threading issues.
All these, like memory problems in C/C++, are avoidable if the gotchas of the language is well taught and learnt --- and indeed they are mentioned on most books about the language. However if people happen to forget one of these, they will all lead to very hard-to-find bugs. So in this respect, you need self-discipline when programming with present-day languages, even high-level ones.
A problem with functional languages is that they are quite hard to learn (which also makes them interesting if you like computer science). One have to read quite a number of CS papers if he wants to use Haskell well (otherwise he will see cryptic type errors if he tries to do anything advanced, or if he did anything wrong). C is much easier in this respect, and even C++/Perl aren't that hard --- they are just complex.