Properly Testing Your Code?


Posted by Cliff on Tuesday June 18, 2002 @10:21PM from the hope-it-gets-a-passing-grade dept.

lowlytester asks: "I work for an organization that does testing at various stages, from unit testing (not XP style) to various kinds of integration tests. With all this, one would expect that defects in the code would be close to zero. Yet the number of defects reported is so large that I wonder how much testing is too much? What is the best way to get the biggest bang for your testing buck?" Sometimes it's not the what, it's the how, and in situations like this, I wonder if the testing procedure itself may be part of the problem. When testing code, what procedures work best for you, and do you feel that excessive testing hurts the development process at all?

Testing by CoderByBirth · 2002-06-18 22:34 · Score: 2, Informative

I think JUnit-style testing works great, and I plan to start using it more often.
Testing is good for verifying that your code does exactly what you think it does. A lot of the time I produce code that I "think" works; using JUnit allows me to verify that it actually does.
Check out junit.org.

For those of you who are sceptical about unit testing, you should try it. Setting up the tests is not as tedious as one might think, they force you to think your problem through, and maybe most of all: they make your build look cool :)
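
As a rough illustration of the JUnit style described above, here is a minimal sketch using Python's built-in unittest module, an xUnit-family relative of JUnit chosen here only for brevity. The word_count function and the test names are hypothetical and not taken from the comment; the point is just the shape: a known input, a hand-written expected output, and an automatic comparison.

```python
import unittest


def word_count(text):
    """Return the number of whitespace-separated words in text (hypothetical example)."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    """Each test method checks one known input against one expected output."""

    def test_counts_simple_sentence(self):
        self.assertEqual(word_count("testing is good"), 3)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_extra_whitespace_is_ignored(self):
        self.assertEqual(word_count("  a   b \n"), 2)


def run_suite():
    """Load and run the tests above, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() runs all three tests and returns a result whose wasSuccessful() method says whether the code does what you think it does.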
Code Review, Code Review, Code Review by JohnT · 2002-06-18 22:38 · Score: 5, Informative

My experience has shown that the number one way to find defects is code reviews performed by other developers who can read the code and also understand the intended functionality. This will catch 90% of all defects before they are even released to QA.
For more information, the developer's bible (IMHO), Code Complete (available on Amazon and elsewhere), has some good information on testing strategies and some hard numbers on the effectiveness of testing. Good luck.
1. Re:Code Review, Code Review, Code Review by ssclift · 2002-06-18 23:47 · Score: 5, Informative
  
  Again, violent agreement. Why? Testing is basically just writing the code again, only in a more restricted form. You take a known input, and then program the output expected (rather than derive it another way) and then compare the two implementations.
  
  Inspection, on the other hand, compares the program with the same thing written in another form: human language. Since human language is generally too vague to execute automatically, the only way to test the equivalence of the two is to inspect.
  
  By far the best inspection book is Software Inspection by Tom Gilb. His very generous web site contains a ton of supplementary material.
  
  Remember, proving that two programs are equivalent is undecidable in general (it reduces to the halting problem), so no procedure can check every case mechanically. Testing is a measure of code against code, inspection a measure of code against requirements. Together they kill a lot of bugs because they find different discrepancies between three statements of the same problem.
Understand your real goals by James+Youngman · 2002-06-18 22:55 · Score: 3, Informative
Well, I'm assuming that your systems start with analysis & design, followed by coding, redesign and more coding, go through unit testing (with more redesign if you are unlucky), followed by integration testing once unit testing is complete, and perhaps acceptance testing once integration testing is complete. This is more or less the traditional waterfall cycle when the deliverable is the finished code.
This strategy works - lots of shops use it all the time. However, the real premise of the process is that you want to get through client acceptance testing as soon as possible, as long as the result is not dissatisfaction on the part of the client with the software after they've accepted it. As you have noticed this strategy doesn't actually produce bug-free code.
This is not surprising. What you achieve is after all pretty much determined by what your goal was. You (shops in general) need to think hard about what your actual goal is. If your goal is nearly-zero-defects, then the traditional process isn't doing the right things for you. If however, your goal is to obtain milestone payments from your client, then it's pretty good. This is an area where the business goals determine the software engineering processes.
Let's put another hat on and think about what the negative effects of this strategy might be (negative is really defined in terms of what your goals are, but let's be vague about that for a moment).
- If your goal is not "zero bugs" then you will stop work before there are no bugs left
- Your software is delivered to the customer with bugs in it, which the customer will find
- Your development team will partly move on to other areas, probably leaving a smaller number of people to deal with the remaining bugs
- Maintenance programmers are typically less skilled than some of the original team - because some of the original team have been pulled off to activities which are more important to your business (e.g. delivering another set of code in order to meet a payment milestone somewhere else).
- Skills evaporate over time - after N years it gets very difficult to find authoritative information on why something works like it does.
- As the code is fixed, it gets brittle. The emphasis is on just fixing bugs and making low-risk changes in order to avoid breaking the production code - hence refactoring is rare.
All of the above factors are unpleasant for those left to maintain the code. Many of them also limit the longer term flexibility of the product and hence the useful life of the software. This feeds back into development processes because limited product lifetimes mean that there is less incentive to change your process to produce software which can persist (i.e. why make the effort to ensure that the system is flexible enough to last through 20 years of changing requirements when you expect the system to be retired after only 7 years?)
You mentioned XP - it offers a lot of techniques that resolve these problems:-
- Programming in pairs - this makes for very efficient skills transfer, hence you limit the extent to which the expertise boils off
- XP testing aims for automation - this encourages more testing and the stronger testing ability allows you to contemplate high-risk-high-return activities like refactoring.
- Refactoring - which prevents your code getting old, brittle and hard to change
- Many more I'm sure (I'm not an XP practitioner)
However, XP is best adapted to projects where a single team makes multiple frequent deliveries of code, can work closely with the client, and where the development project continues in the medium to long term. These characteristics allow many of the XP techniques, and this means that techniques taken out of XP may not help projects of a different style.
Having said this, the automated testing angle is a real strength. If testing is done manually, it's time consuming and expensive. Hence people don't do it as much as they might otherwise think is appropriate. Maintenance deliveries often just undergo regression testing, and faults can creep in which might have been caught by the original unit or integration tests. Automated testing has many advantages :-
1. Automated tests are faster, so people actually run them!
2. You can redo all the tests after every change if you want.
3. Automated testing allows you to refactor without danger
4. Being able to re-run all the tests really does keep out the new bugs which would otherwise have been introduced during maintenance
5. Your testing coverage grows with time (since new tests are introduced but tests are only retired when the relevant functionality is changed or dropped)
6. You don't fail to spot errors (quite often with manual testing regimes errors can go unnoticed because the tester doesn't spot a small bit of incorrect behaviour that the original team might just have spotted).
Just as a data point, I work on some software that has an automated test suite. The suite contains between 500 and 1000 test cases; the test suite conducts those tests in under 5 minutes on a very old machine. To do these tests manually would take one full-time person at least a week.
The summary is :-
- Understand your business's real goals
- Cherry-pick techniques that will help achieve those goals (you might even be able to adopt a whole methodology if its processes are designed to achieve the kinds of goals that your business actually has).
Close the loop by edhall · 2002-06-18 23:12 · Score: 5, Informative

The object of finding bugs isn't to result in fewer bugs by fixing them. It's to result in fewer bugs by not writing them in the first place. The developers need to review found bugs on a regular basis, with the objective of changing development methods to avoid them in the future.

It's all fine and good to say "don't write buggy code in the first place," but this sort of feedback is the only way to get there. What makes this so hard in many organizations -- aside from the usual disrespect many developers have for QA people -- is that developers fear that this process is some sort of performance evaluation. As soon as this happens, the focus shifts from finding better processes to defending existing processes: "It's not really a bug," "There isn't really a better way of doing that," "We just don't have time to do it the 'right' way," and so on.

This is why the feedback needs to be direct from QA to the developers, who are then tasked to categorize bugs and develop recommendations for avoiding them. It's the latter that is the "product" required by management, not a list of bugs with developer's names on them. Management should otherwise get the hell out of the way.

-Ed
Re:best way to test is to use automated testing by Cpt_Corelli · 2002-06-18 23:12 · Score: 3, Informative

Of course properly written functionality test scripts (doing what the user does) will find most bugs. The downside is that it is boring to follow test scripts manually.

My company has been successful implementing automated functionality tests with Rational Robot (part of teamtest). If you just take the time to define proper test scripts you can easily redo all functionality tests on various platforms (if you use VMWare or similar sw to simulate different platforms) at the click of a button.

This saves time every release as the developers can focus on finding the really tough bugs instead of running boring functionality tests again.
no one size fits all process by jilles · 2002-06-18 23:45 · Score: 3, Informative

There's no one size fits all process for testing. How much effort you need to spend on testing depends on a lot of factors including but certainly not limited to: code size, number of developers, customer requirements, life cycle of the system etc.

That being said, here are some remarks that make sense for any project:

In general a testing procedure that gives you no defects just indicates your testing procedure is bogus: defect free code does not exist and no test procedure (especially no automated procedure) will reveal all defects.

The XP way of determining when a product is good enough: write a test for a feature before you write code. If your code passes the test it is good enough. This makes sense and I have seen it applied successfully.

A second guideline is to write regression tests: when you fix a bug, write an automated test so you can avoid this bug in the future. Regression tests should be run as often as possible (e.g. on nightly builds). All large software organizations I've encountered do this. Combined with the first approach, this will provide you with a growing set of automated tests that will assure your code is doing what it's supposed to do without breaking stuff.
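
A small sketch of what such a regression test might look like, in Python's unittest. The bug is invented for illustration: assume a hypothetical average function once crashed with a division by zero on empty input. The fix adds a guard, and the test pins the fixed behaviour so the bug cannot quietly return.

```python
import unittest


def average(values):
    """Mean of a sequence of numbers; defined as 0.0 for an empty sequence."""
    values = list(values)
    if not values:  # the fix: this guard is assumed to be missing in the buggy version
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_does_not_crash(self):
        # Regression test for the (hypothetical) bug report: empty input used to raise.
        self.assertEqual(average([]), 0.0)

    def test_normal_input_still_works(self):
        # Guards against the fix breaking the ordinary case.
        self.assertEqual(average([2, 4, 6]), 4.0)


def run_regression_suite():
    """Run the regression tests; a nightly build could call this and fail on errors."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Every fixed bug adds one more test of this kind, which is how the suite grows over time.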

Thirdly, make sure code is reviewed (by an expert and not the new script kiddie on the block) before it is checked in. Don't accept code that is not up to the preset standards. Once you start accepting bad code, your code base will start to deteriorate rapidly.

--

Jilles
Evaluate your test suite for coverage by pfdietz · 2002-06-19 00:28 · Score: 2, Informative

One thing I've found invaluable is to compile your program with a translator that inserts code to detect when branches have been followed. Then run the test suite and see that all the code was executed. Any code that was not executed has not been tested.

It's amazing how poor coverage can be with a naively written set of tests. Ideally you want to write the tests so that the coverage comes out good, but in practice you may have to patch the tests with more tests to cover the parts you missed. You may also have to change the code to make it easier to cover.
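
To make the idea concrete, here is a minimal sketch of a coverage check in Python. It is a simplification under stated assumptions: it records executed lines with the standard sys.settrace hook rather than instrumenting branches at compile time as the comment describes, and the classify function and helper names are hypothetical. Real projects would normally use a dedicated coverage tool.

```python
import dis
import sys


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def executed_offsets(func, args):
    """Run func(*args); return executed line offsets relative to its def line."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if event == "line" and frame.f_code is code:
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active before
    return hit


def uncovered_offsets(func, test_inputs):
    """Line offsets in func's body that no test input caused to execute."""
    code = func.__code__
    all_offsets = {
        line - code.co_firstlineno
        for _, line in dis.findlinestarts(code)
        if line is not None
    } - {0}  # offset 0 is the def line itself, not part of the body
    hit = set()
    for args in test_inputs:
        hit |= executed_offsets(func, args)
    return all_offsets - hit
```

With the naive inputs [(5,), (-1,)], uncovered_offsets(classify, ...) reports the line holding return "zero"; adding (0,) as a third input closes the gap.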

Rare error cases (like malloc failures) can be hard to cover.
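
One common way to reach such paths is to inject the failure. The sketch below is only an illustration in Python, with MemoryError standing in for a failed malloc; make_buffer, its alloc parameter and failing_alloc are hypothetical names, and passing the allocator in as an argument is an assumed design choice, not something the comment prescribes.

```python
def make_buffer(size, alloc=bytearray):
    """Allocate a buffer of the given size, or return None if allocation fails."""
    try:
        return alloc(size)
    except MemoryError:
        # The rare error path: almost never taken with the real allocator.
        return None


def failing_alloc(size):
    """Test double that simulates an out-of-memory condition."""
    raise MemoryError("simulated allocation failure")
```

A test can then call make_buffer(16, alloc=failing_alloc) and check that the result is None, which executes the error branch on demand instead of waiting for memory to run out.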
Re:the best way to test code... by Xentax · 2002-06-19 00:32 · Score: 5, Informative

I agree -- our own company suffers from giving less effort on code reviews than most of us know we should. People try to save time by under-planning for code reviews, but that saved time is always lost at least twofold in uncaught bugs, extra time for optimization, and so on -- all things that would be identified in a solid code review.

Identify the people in your company that have the best "critical eye" for each language you develop in -- and see to it that they get the time they need to really critique code, either during implementation or at least before starting integration testing (preferably before unit testing, actually). It may be hard to convince management the first time around, but if you account your hours precisely enough, the results WILL speak for themselves, in terms of hours spent overall and on testing/integration.

Xentax

--
You shouldn't verb words.
Tried Cleanroom? by CyberGarp · 2002-06-19 01:31 · Score: 2, Informative

My personal recommendation is the "Cleanroom" methodology. You create a functional specification with a mathematical guarantee of completeness and consistency. Auditable correctness is also a part of the process. Then when it comes to testing you generate test cases that cover all states, all arcs and then do statistical test case generation based on a usage model. The overall cost of this process is a bit more up front, but studies have shown that the process far more than pays for itself in greatly reduced maintenance/debugging costs.

So, to answer your question: to generate a decent set of test cases, you really have to understand the problem space and have mapped out the state-space in some manner. Try to derive test cases without a methodical approach and one's testing will be spotty. The worst I've seen so far was a random state-space walker (a la Brownian motion). Statistically, this approach avoids all the difficult cases in the far corners of the state-space.
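
A toy sketch of the two ingredients mentioned above, written in Python: statistical test case generation from a usage model, and a check of how many arcs the generated cases actually cover. The three-arc model, its probabilities and every function name are invented for illustration; a real Cleanroom usage model would be derived from the specification.

```python
import random

# Hypothetical usage model: state -> list of (event, next_state, probability).
USAGE_MODEL = {
    "idle": [("login", "active", 1.0)],
    "active": [("query", "active", 0.7), ("logout", "done", 0.3)],
}
START, END = "idle", "done"


def all_arcs(model):
    """Every (state, event, next_state) arc in the model."""
    return {(s, e, t) for s, arcs in model.items() for e, t, _ in arcs}


def generate_case(model, rng, max_steps=50):
    """Walk the model from START, picking arcs by usage probability."""
    state, steps = START, []
    while state != END and len(steps) < max_steps:
        arcs = model[state]
        weights = [p for _, _, p in arcs]
        event, nxt, _ = rng.choices(arcs, weights=weights, k=1)[0]
        steps.append((state, event, nxt))
        state = nxt
    return steps


def arc_coverage(model, cases):
    """Fraction of the model's arcs exercised by the given test cases."""
    covered = {arc for case in cases for arc in case}
    arcs = all_arcs(model)
    return len(covered & arcs) / len(arcs)
```

Measuring arc_coverage over the generated cases is what exposes the weakness of a purely random walker: arcs with low probability can stay uncovered unless cases for them are added deliberately.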

Now for the bad news: Cleanroom is quite tedious for the programmer. The enumeration phase takes seemingly forever and can be mind-numbingly boring.

Here's the Amazon link on the layman's book on Cleanroom: Cleanroom Software Engineering: Technology and Process by Stacy J. Prowell, Carmen J. Trammell, Richard C. Linger, Jesse H. Poore

And now for the shameless self promotion bit with a long winded sales pitch for executives on Cleanroom: my own Cleanroom company: eLucidSoft.

Just chant over and over: "Hire eLucid, play golf."

--

I used to wonder what was so holy about a silent night, now I have a child.
Re:Trying to fit in implicit restrictions by Grab · 2002-06-19 04:48 · Score: 3, Informative

Re (2) and (3), having separate groups is a *really* good way to get the old "them-and-us" battle going. The testers hate the programmers bcos they see themselves as covering up for the programmers' lack of skills; the programmers hate the testers bcos the testers keep telling them that they're screwing up; and everyone hates QA for telling them how to do their job.

The problem is that very few bugs occur at the purely code stage, and most of them are easy to trace. The real problem is design bugs - they're the killers. The solution at our company is to (a) review, (b) separate development, and (c) follow a V-model quite strictly. We have one group of engineers, where everyone writes, codes, reviews and tests, and for each section, everyone will have a different role so no-one gets stuck just being the tester.

Someone writes a requirements spec, and also writes the system test spec by which they'll prove that the thing works as expected. Writing the test spec forces you to put numbers to your requirements, basically making you self-review the requirements for errors, and it finds a lot of bugs. Someone else checks that the requirements spec makes sense, and also that the system test spec matches up with the requirements spec. And typically we spend over a quarter of our time at the requirements stage, bcos it's dead easy to edit a single line in a Word document but it's a helluva lot more difficult to change a zillion C files, test specs, etc.

If the system is simple, we'll go straight to code. But if the system is complex, we do a detailed design (usually in Simulink and Stateflow these days, since we're coding for embedded control systems). The person who does the design MUST NOT be the same person who wrote the requirements - this effectively gives us another review of the requirements, to catch any oddities which can't be implemented. And someone will review that the design is actually meeting the requirements.

The same person who's done the detailed design will also write a test spec to say how to test the code. This will cover all boundary conditions, so any time there's a comparison, for instance, we'd check that it gets ">=" instead of just ">". We've got an in-house tool which allows us to write test specs in Excel and run them on code automatically. And someone will review this to make sure it matches the design and covers all cases.
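
A minimal Python sketch of a table-driven boundary test in this spirit. It is not the in-house Excel tool, whose format is not described here; the over_limit function, the spec rows and the run_spec helper are all hypothetical. The rows sit just below, exactly on, and just above the boundary, which is what distinguishes ">=" from ">".

```python
def over_limit(value, limit):
    """True when value has reached or passed the limit (the design says '>=')."""
    return value >= limit


def over_limit_buggy(value, limit):
    """Same check with the classic off-by-one mistake: '>' instead of '>='."""
    return value > limit


# Test spec as a table: (value, limit, expected result).
BOUNDARY_SPEC = [
    (99, 100, False),   # just below the boundary
    (100, 100, True),   # exactly on the boundary
    (101, 100, True),   # just above the boundary
]


def run_spec(func, spec):
    """Return the rows of the spec that func gets wrong (empty list means pass)."""
    return [row for row in spec if func(row[0], row[1]) != row[2]]
```

Running the spec against over_limit_buggy flags only the row that sits exactly on the boundary, which is the case a test without boundary values would miss.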

Then someone writes the code. The coder MUST NOT be the designer - as with the requirements/design separation, this gives us a free review. They'll put that through Lint to check for obvious problems (we follow the MISRA coding standard, with a few exceptions where MISRA didn't get it right), and then run their code through the test spec. If it fails, they'll look for the bugs in the code. Sometimes they find bugs in the test spec; in that case the test spec gets modified. Having an automated test spec means we can run tests on code with zero overhead for repeating the tests.

And then the code will be run against the system level tests, and hopefully everything passes. If it doesn't, the system level test has found a bug in the design, where the design isn't meeting the requirements. Rinse and repeat.

It's worth saying that we're writing code for automotive control systems. How much testing to do is really a trade-off of the cost of testing against the cost of failure, and in our case, failure is not an option!

Grab.
Orthogonal Array Based Robust Testing by yokimbo · 2002-06-19 15:20 · Score: 2, Informative

I'm sorry, but I didn't have time to read all the other responses. The replies I did read were mostly questions back at you and clarifications to other replies. So, here is my attempt to answer your questions.

do you feel that excessive testing hurts the development process at all?
Yes, of course it does. You could, theoretically, test code for one program for the rest of your life and still not discover all the bugs. That would be excessive testing and would definitely be a bummer to the development process. I think what you really mean is how do you determine how much testing is enough. For this I refer you to a few good testing books because frankly speaking, people have made careers out of this sort of thing :)

Books: The Art of Software Testing by Glenford J. Myers (hard to find and a little expensive); The Complete Guide to Software Testing by William Hetzel; Code Complete: A Practical Handbook of Software Construction by Steve McConnell. These are some good options to get you started.

What is the best way to get the biggest bang for your testing buck?
I would take a serious look at Orthogonal Array Based Robust Testing, a method of testing developed by Taguchi and Konishi that uses orthogonal arrays to determine test cases. I don't have enough room here to get into details, but basically this type of testing guarantees detection of at least 1st and 2nd order defects with a minimal number of test cases. Madhav S. Phadke's Quality Engineering Using Robust Design mentions this type of testing. Also, Bell Labs has been so kind as to publish online some fairly heavy-strength orthogonal arrays, so you don't have to calculate them. My employer uses this type of testing on many of its projects and it's a huge time saver. I just learned about it in an onsite class by our top tester and am going to pitch it to my project soon.

Good luck, sorry for being so vague in places, and finally, if you have more questions about Orthogonal Array Based Robust Testing, please let me know: redundant_pleonasm@hotmail.com