Not All Bugs Are Random
CowboyRobot writes "Andrew Koenig at Dr. Dobb's argues that by looking at a program's structure — as opposed to only looking at output — we can sometimes predict circumstances in which it is particularly likely to fail. 'For example, any time a program decides to use one or two (or more) algorithms depending on an aspect of its input such as size, we should verify that it works properly as close as possible to the decision boundary on both sides. I've seen quite a few programs that impose arbitrary length limits on, say, the size of an input line or the length of a name. I've also seen far too many such programs that fail when they are presented with input that fits the limit exactly, or is one greater (or less) than the limit. If you know by inspecting the code what those limits are, it is much easier to test for cases near the limits.'"
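To make the summary's point concrete, here is a minimal sketch of a test that targets the decision boundary of a size-dependent algorithm switch. Everything named here (`THRESHOLD`, `hybrid_sort`, `boundary_sizes`, `check_boundaries`) is hypothetical and chosen only for illustration; it is not taken from Koenig's article.

```python
import random

THRESHOLD = 16  # hypothetical cutoff, chosen only for illustration


def insertion_sort(items):
    """Simple quadratic sort used for short inputs."""
    result = list(items)
    for i in range(1, len(result)):
        value = result[i]
        j = i - 1
        while j >= 0 and result[j] > value:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = value
    return result


def merge_sort(items):
    """Recursive merge sort used for longer inputs."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def hybrid_sort(items):
    """Pick an algorithm by input size; this branch is the boundary under test."""
    if len(items) < THRESHOLD:
        return insertion_sort(items)
    return merge_sort(items)


def boundary_sizes(limit):
    """Sizes just below, exactly at, and just above the limit."""
    return [limit - 1, limit, limit + 1]


def check_boundaries(seed=0):
    """Compare hybrid_sort against the built-in sort on both sides of the switch."""
    rng = random.Random(seed)
    for size in boundary_sizes(THRESHOLD):
        data = [rng.randint(-100, 100) for _ in range(size)]
        assert hybrid_sort(data) == sorted(data), f"failed at size {size}"
    return True
```

The sizes come from reading the code (white-box), not from guessing at the output, which is the whole argument of the summary.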
They want their structured programming back...
Fairly obvious statement, but one some newer programmers don't know. Koenig is talking about white-box testing, which is well understood.
We absolutely test all boundary conditions, on both sides. This is standard practice where I work, for just that reason.
http://en.wikipedia.org/wiki/Boundary-value_analysis
Has testing degraded so far that people don't know what a bounds test is?
You know, I remember writing test plans to test inputs that were one below, at, and one above some arbitrary limit when I was a trainee programmer coding up COBOL on mainframes back in the mid 80s.
How on earth does this drivel make it onto Slashdot? This is 30 year old news at least (which makes it straight out of the 17th C in internet years)
Those are features
http://saveie6.com/
And some people cannot be programmers, some cannot hammer a nail, and some cannot wear their wife's clothing in public. You aren't a real man until you can do all three.
If this is a surprise to anyone writing code, you need to bone up on QA and failure modes.
This is a special case of Sherlock's theorem:
It's far easier to debug a sin of omission than a sin of commission. If a piece of code never performs a disallowed function (e.g. leaking memory, failing to sanitize user input) then all failures that remain are sins of omission: the program doesn't actually transfer the file requested, out of excessive restraint on some edge case the programmer never even considered.
Well, the programmer needs to get in there and consider the omission in the harsh light of day. Then the specification document needs to be updated.
And questions need to be asked about the user environment when an edge case is tripped three years into a heavy use cycle.
The only way to achieve software up-front with no failure modes and no functional omissions is to massively gold-plate the validation process, and this rarely works anyway.
I'm never happier writing code than when I'm subtracting stupid.
If I were a columnist with some modest name recognition, I could have converted this mildly amusing (to me at least) observation into a column. But alas, I am not one. So he gets to turn the age-old advice given by old Prof Mahabala, teaching Intro to Computing 201 (Fortran programming, on an IBM 365/155 using punch cards, no less) back in 1980, into a column, and all I get to do is bitch about it.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
we can sometimes predict circumstances in which it is particularly likely to fail. 'For example, any time a program decides to use one or two (or more) algorithms depending on
If you have an MVC architecture with 10000 models, controllers and views; all of your domain data is stored as Object in maps, keyed with strings. If the keys for your middleware and your maps are built up on the fly and type casts are used at the point where your domain data is actually used, then you are probably going to have problems.
Oh, and here's another good one: you have a module blah with an internal namespace com.company.blah, but now you want a newblah, so you copy blah and change two classes but leave the namespace the same. Then you get a bright idea: let's delete all the unchanged stuff from newblah and stick it into the classpath ahead of blah. The new module now inherits from the old one, right? That's a brilliant idea, worthy of a genius. Oh yeah, the classpath we customised isn't actually used to run the software, only to develop and test it.
http://michaelsmith.id.au
Has Dr. Dobbs discovered Software Engineering? They are only a few decades late.
The next time you excuse a bug as a corner case will be your last.
If computers can't make truly random numbers, then they can not make truly random bugs. This means that all bugs are deterministic. No bugs are actually random.
Apparently you've never dealt with race conditions and multi-threaded code.
I do not fail; I succeed at finding out what does not work.
In theory, shouldn't any state change be checked for this kind of interpolation continuity?
Do the sliding weight algo.. overlap N points in X, do both and adjust output as weighted ratio, mathematical discontinuity gone, all derivatives (within resolution of N) will be continuous. I've applied this and a simple 2 node neural-net device to make yes/no reply smooth (in respect to input.)
I've often wondered if this fuzzy logic device had a name.
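The sliding-weight idea above can be sketched roughly as follows. This is only one reading of the comment: inside the overlap both algorithms are evaluated and their outputs mixed by a linear weight, which is often called cross-fading or blending. The functions `low_fn` and `high_fn` and the overlap bounds are hypothetical stand-ins. Note that a linear ramp removes the jump in the value only; a smoother weight curve would be needed to keep derivatives continuous as well.

```python
def blend(x, low_fn, high_fn, start, end):
    """Cross-fade from low_fn to high_fn across the overlap [start, end]."""
    if x <= start:
        return low_fn(x)
    if x >= end:
        return high_fn(x)
    weight = (x - start) / (end - start)  # 0.0 at start, 1.0 at end
    return (1.0 - weight) * low_fn(x) + weight * high_fn(x)


def low_fn(x):
    # Hypothetical algorithm used for small inputs.
    return float(x)


def high_fn(x):
    # Hypothetical algorithm used for large inputs; it disagrees with
    # low_fn by 1.0, so a hard switch would produce a visible jump.
    return float(x) + 1.0
```

With an overlap of 4 to 6, the output moves gradually from `low_fn` to `high_fn` instead of jumping by 1.0 at a single switch point.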
The answer is Yes, of course those bugs must be hunted down and crushed into pulp and the trail backtracked and the colony attacked.
I agree! No Christmas bonus for Joe the janitor! that will teach 'em for making that error.
News at 11.
Sheesh, evil *and* a jerk. -- Jade
I had an interview a few years ago for a systems administrator position and knowledge of C was a requirement. I had programmed in C mostly as a hobby for several years prior but not during the most recent five years. The interviewer kept asking increasingly difficult questions in hopes of tripping me up. The interviewer eventually said I knew more about C than every one of the people on his staff of highly experienced C programmers. I was even asked specifically about what test cases I would use given a certain short algorithm. It was bizarre how easy those test cases were to me but they astounded the interviewer.
Short of hardware issues, no software bug is truly random, 'it just happens sometimes' = 'I lack the skill to troubleshoot this'
(If at first you don't succeed, do it different next time!)
Those aren't random, either; they just appear to be random.
Good point, though it doesn't actually refute GP. Computers are deterministic*. On the other hand, programs, functions, etc may not be deterministic.
* Except devices specifically designed not to be deterministic, such as hardware random number generators that rely on quantum, electron phenomena, etc. Then again I'm not sure it is correct to call such devices computers.
Let's see you force one to happen in the debugger, then.
A study by the company I worked for a few years ago only found one thing that correlated with the number of bugs - the number of comments. It wasn't so much that the algorithm was complicated and needed explanation; more often it was just bad programmers tossing in a bunch of commented out print statements and comments that didn't accurately describe what the code was doing.
That depends. If you are a die-hard TDD fan, for example, then you'll be writing your (unit) tests before you necessarily know where any such boundaries are. Moreover, there is nothing inherent in that process to bring you back to add more tests later if boundaries arise or move during refactoring. It's not hard to imagine that a developer who leaves academia and moves straight into an industrial TDD shop might never have been exposed to competing ideas.
(I am not a die-hard TDD fan, and the practical usefulness of white box testing is one of the reasons for that, but I suspect many of us have met the kind of person I'm describing above.)
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Is there really anyone to whom this is not obvious?
When I started work, almost 30 years ago, in the Functional Verification team at IBM this was day one training.
"Test around the limits" (or bounds testing as it was known) was practically beaten into you.
QA needs to be testing coders are poor QA / testers and we need real live testers not some scripted system that can be coded to pass but still not be right and or well it passes but if any one looks at the out put they will say that does not look right.
Not if the different threads are clocked from different PLLs.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
if you are in 8th grade i give you a pass. but i suspect you suck.
even the simplest of algorithms. No human has ever written 100% correct code 100% of the time.
And QA doesn't usually understand what the designer's intent is, so leaving it all to QA is a bad idea.
It is not random. If you have enough knowledge and the ability to comprehend that knowledge, you can predict what will happen. Nothing is random.
That's like a kindergarten level lesson on debugging.
In my mind all bugs arise from an unexpected set of conditions and/or poor coding. They may be unanticipated but that does not make them random.
Silence is a state of mime.
A program is an algorithm or group of algorithms.
So surely we're talking about "correctness" of algorithms?
Isn't the best textbook on this by Dr Jeffrey Kingston, Algorithms and Data Structures: Design, Correctness, Analysis?
http://www.goodreads.com/book/show/2682170-algorithms-and-data-structures
Have to agree with the above comments, but hey, slashdot is not what it used to be (they let me on for example).
I have to declare a possible conflict of interest too; I know Kingston.
work in progress
It is not random. If you have enough knowledge and the ability to comprehend that knowledge, you can predict what will happen. Nothing is random.
Sure, as long as you start a program and let it run all by itself without touching anything. As soon as you introduce human input, the system may still be deterministic, but the output of the program is in effect random because the behavior of the operator cannot be predicted. The kind of "knowledge and the ability to comprehend that knowledge" that you describe is known as omniscience, and most IDEs currently don't support it.
Breakfast served all day!
Oh.
Excuse me.
"Stochastically Variable".
"any time a program decides to use one or two (or more) algorithms depending on an aspect of its input such as size, we should verify that it works properly as close as possible to the decision boundary on both sides"
...
If you haven't figured this out already, then you shouldn't be writing code
There's a scope for both and shared knowledge can be beneficial.
Part of the automated build process can be running QA scripts via a robot. e.g. QA person files a bug report with steps to reproduce the issue. A VNC script is attached to the bug report, which is then translated to, say, a junit script. Programmer inserts asserts in the junit code to verify expected conditions. In this way regressions are fewer and don't escape to the QA people in the first place.
QA ignorance is not a bad thing. QA should not be working from the designer's POV but from a requirement/user story POV. If it does not satisfy the requirements or the user story, it is a bad design. Period. The less QA knows of the design the better.
putting the 'B' in LGBTQ+
Besides when it's caused by interference from background radiation?
Twinstiq, game news
"that it works properly as close as possible"
The proper word is "closely".
When all words are equal there is no longer any communication. Unfortunately that day seems imminent.
Bugs are never RANDOM. Bugs are, by definition, an error (human blunder ... incorrect design, improper code, etc.)
When exactly will that bit be flipped by the stray subatomic particle hitting your PC from the sun?
Some bugs really are random.
- Michael T. Babcock (Yes, I blog)
> The word "random" also includes the definition of "odd, unusual, or unexpected."
In everyday conversation, people may say "random" when they mean "unexpected". Hopefully they wouldn't WRITE that in a published article, but they may say it in informal conversation.
A computer programmer who passed Programming 201 will distinguish between random, unexpected, and arbitrary in informal conversation, because in our world those words have COMPLETELY different meanings. Among programmers, saying "random" when you mean "unexpected" would be like saying "pizza" when you mean "broom" - they are completely different concepts altogether, and both are important concepts. Witness the recent articles about the NSA messing with random generators - random is an important thing to us.
For a programmer writing for other programmers to conflate the two in a published article is sloppy, very sloppy.
> Sadly, I still see developers not testing, and are practically afraid of writing test scripts.
I'm a little bit afraid of some tests I've been asked to write, and I've been programming professionally since 1997. My credits include WordPress, Apache, testing the Linux kernel raid, and other well known software.
I fear it because tests for this project involve learning at least one and probably two new unit testing frameworks which each try to approach testing in an innovative new way. One has tests that look like English prose, not like code.
Also, I have to rewrite other people's code to make it testable, then try to get those changes, and all of the tests, through an arduous review process. This looks like it's going to be a giant PITA. Both of these problems could have been easily avoided.
If testing were integrated with the language, we wouldn't have a different test framework (or two) for every project. If you know the language, you know how to test it, if the two are integrated. Were it integrated, I also wouldn't need to rewrite the existing code in order to make it testable - any valid and reasonable code would be testable, or nearly so.
is random.
no shit.
No, a select few are random. One of my coworkers from a decade ago randomly copied portions of if-then statements trying to get them to work. "This branch works so I'll copy it, paste it, and fiddle with the conditions until something else happens." Never mind those two branches required completely different decisions. Looking at one part of his program, a third of his if-then branches could never be run because they would always evaluate to false. I pointed that out to him and he couldn't offer a shred of logic behind what he was doing. After I got home, I bleached my brain with a stiff drink and then I hoped I'd never see code so bad again.
Ahh... the good old days of being fresh from college. I've since seen much worse code, but at least it wasn't as random as this guy's methodology.
> Other bugs can even be truly random; a race condition that depends on whether one thread gets scheduled on a processor
The bug exists before the processor is even purchased.
The bug is not random, though the output may be influenced by a random event. The output isn't even random - I debugged one of those in the kernel and the result was spinlock. The only random part is WHEN the problem becomes visible to the user.
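A small sketch of that claim, assuming the classic unsynchronized read-then-write counter increment (the two "threads" A and B and all function names here are hypothetical). Rather than running real threads, it enumerates every possible schedule, so it shows that the losing interleavings are fixed by the code itself and that only which one you get on a given run depends on timing.

```python
from itertools import permutations


def interleavings():
    """All orderings of two threads' two steps, keeping each thread's own order."""
    steps = ["A", "A", "B", "B"]
    return sorted(set(permutations(steps)))


def run(schedule):
    """Simulate an unsynchronized counter increment under one schedule."""
    counter = 0
    local = {}  # value each thread read before writing back
    for thread in schedule:
        if thread not in local:
            local[thread] = counter      # first step: read the shared counter
        else:
            counter = local[thread] + 1  # second step: write back the stale value + 1
    return counter


def outcomes():
    """Map each schedule (as a string such as 'ABAB') to the final counter value."""
    return {"".join(schedule): run(schedule) for schedule in interleavings()}
```

Only the two schedules where one thread finishes before the other starts end with the correct count of 2; the other four lose an update.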
That reminds me of a Heisenbug we had once. Completely off topic, but we once had a bug which would never manifest with debug tools running. You could try it a million times and it would be fine so long as IE's debug console was open. With the debugging tools closed, it would crash most of the time. Guess what the cause was.
Someone had left a call to console.debug in the code, which causes IE to stop execution if there is no object named "console".
Pex does half of the testing for you: http://research.microsoft.com/en-us/projects/Pex/
Even in the computer science sense the article is using random correctly. It's arguing bugs from programming errors are NOT randomly distributed through the code base but cluster around code that does certain implementation tasks.
The article is IMHO vapid, but does use its vocabulary correctly.
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
the seemingly random bug is in fact deterministic and only triggered under certain unintentionally well-defined conditions. Hence all software bugs are deterministic.
Computers are not deterministic. Interrupts are raised at any time by interface hardware. Users generate keystrokes and mouse events at random. Internet packets arrive at random times. Pre-emptive multi-tasking operating systems suspend and resume processes according to the load at the moment, not according to a fixed schedule of n clock cycles.
If you want to go back to something like a TRS-80 where you were programming the hardware, not an OS, then yes, they approached deterministic, but with random user inputs, they have never been truly predictable.
You people are using some perverse definitions of "deterministic" if you think otherwise.
Computers are deterministic.
That's an over-simplification due to the widespread prevalence of multi-core CPUs and operating systems with preemptive (i.e., clock-driven and environment-driven) multitasking and interrupts. These things make even things that are clearly computers be non-deterministic. That "computers are deterministic" is merely a useful model of the world, not reality.
"Little does he know, but there is no 'I' in 'Idiot'!"
This is not news.
https://en.wikipedia.org/wiki/Edge_case
Amateurs
As others have pointed out, this ain't rocket science. I was a tester (and programmer) for years and the number of programmers I encountered that a) refused to do range checking properly or b) failed to use well-tested libraries were legion. I really thought that hooking electrical 'reminders' to them every time they did something like this would have improved their code dramatically.
/// Not a super-genius . . . yet. ///
Isn't a machine's "mistake" as much a deterministic output as its "functionality"?
Who passed that load of unpunctuated garbage?
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
QA ignorance is not a bad thing. QA should not be working from the designer's POV but from a requirement/user story POV. If it does not satisfy the requirements or the user story, it is a bad design. Period. The less QA knows of the design the better.
ALL testing should be done from a requirements point of view. All. Even boundary tests. If one doesn't fail, the test still only makes sense if it shows that a certain requirement is fulfilled under the conditions.
I've seen far too many test-cases going green on a behavior that doesn't make any sense whatsoever. Waste of money for zero gain. "Don't know how it should behave" better produces a nice crash that is easily noticed, and resolved properly.
I am surprised Dr. Dobbs would publish this. Maybe they are trying to politely/indirectly send a message to the ObamaCare software folks. The requirements for testing and re-factoring will never go away. But since it was invented by people over 55 it can't be useful in "modern" programming!
Dick
Yeah no. Not sure if this applies to software testing (although some cases can be so unpredictable that you might as well refer to them as random), but the results of some quantum events are indeed random. Even with perfect knowledge of all variables from some moment in the past, some future events are impossible to predict. There are no hidden variables.
Happy people make bad consumers.
True random data has maximal information content (or close to that at least). Meaningful software does not. It may be incomprehensible, but it is not random in nature, and there will always be many points of view from which it is far from random. Find one of these, and you have a toehold into how it works.
John_Chalisque