Retrofitting XP-style Testing onto a Large Project?
Mr Pleonastic submits this query for your consideration: "I work for a small startup (ok, me and another guy comprise the entire development team) that has somehow managed to survive the bust, attract a number of customers, and build up about 300K lines of functionality. Up to now we've made it by being smart and conscientious hackers, but I'm increasingly embarrassed by our shortcomings in testing. I like the XP approach to making enduring, automated test suites, but most of what I read about XP focuses on obvious stuff and changing your programmer culture at the outset. Does anyone have experience with, or advice for, retrofitting it onto a fairly mature project? What do your test suites look like, anyway? The bugs I fear most are of the 'If the user does X and then Y, the result blows away our assumptions' variety, not the 'Oops! My function returned the wrong value' variety (which happens of course). How do you write good test code for the former, without spending even longer debugging the test code? Is XP just for small, new projects?"
Win XP style testing? So, what's so hard? Release an alpha version and call it RC, and let users do the testing...
</wintroll>
My Stack Overflow user
Finding conditions that are outside your assumptions is not something you can do with a unit test. I have found that trying to simulate user creativity (stupidity?) with unit tests is an exercise in futility. Use your unit tests to make sure your methods do what they're supposed to do.
To find all those tricky combinations of use cases that blow away all your assumptions, just stick to the Fail-Fast principle. If you find anything that goes even slightly wrong, complain. Loudly! Throw an exception, pop up a dialog, whatever you need to make sure that everyone knows an error just occurred. This will do two things:
1) You'll find a lot more errors in your code. You'll also be motivated to fix them quicker because the app will be unusable until you do.
2) You'll reduce the likelihood of generating bad data. The only thing worse than your program doing something wrong and crashing is doing something wrong and NOT crashing. Users will usually forgive you if your software crashes. If you start giving them bad data, they'll lose confidence in your app and never trust it again.
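For what it's worth, here is a minimal sketch of what fail-fast checks might look like in Java. The Inventory class and its methods are made up purely for illustration; the only point is that every broken assumption throws immediately instead of quietly storing bad data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical inventory class, used only to illustrate fail-fast checks.
class Inventory {
    private final Map<String, Integer> stock = new HashMap<>();

    void add(String item, int quantity) {
        if (item == null || item.isEmpty()) {
            throw new IllegalArgumentException("item name is required");
        }
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive, got " + quantity);
        }
        stock.merge(item, quantity, Integer::sum);
    }

    void remove(String item, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive, got " + quantity);
        }
        Integer current = stock.get(item);
        if (current == null) {
            throw new IllegalStateException("unknown item: " + item);
        }
        if (current < quantity) {
            // Complain loudly rather than silently storing a negative count.
            throw new IllegalStateException(
                "cannot remove " + quantity + " of " + item + ", only " + current + " in stock");
        }
        stock.put(item, current - quantity);
    }

    int count(String item) {
        return stock.getOrDefault(item, 0);
    }
}
```

The failed remove() leaves the stock untouched, so a crash here never turns into bad data later.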
This is hard to answer in a short comment. I'll try, though you're welcome to contact me for more details through the information on my website.
Retrofitting tests onto an existing project is hard. Not only is it tedious, time-consuming work, but you're always haunted by specters that ask "How do you know the test isn't broken?" It's nice to have the tests, but you'll spend a lot of time and energy creating them that could be better spent adding new features and improving existing features. Besides that, it'll likely sap any motivation you might have had for testing.
It's much easier to draw a line in the sand and say "all new features and bugfixes will have tests, starting now". Before you fix a bug, write a test that explores the bug. It must fail. Fix the bug. The test must now pass. Before you add a new feature, write a customer test that can demonstrate the correct implementation of the feature. It must fail. Add the feature. The test must now pass. From the programmer level, you can write programmer tests through the standard test-driven development style.
It still can be tricky to get started, especially with customer tests, but they don't have to be beautiful, clever, or comprehensive. They just have to test the one feature you're working on sufficiently to give you confidence that you can detect whether or not it works. You'll likely have better ideas as you gain tests and experience and it's okay to revisit the test suite later on to make it easier to use and to understand.
The nice part about this system is that it adds tests where you need them where the code is changing, whether it's a part full of bugs or a part under continual development.
Keep in mind that to do testing this way, you need to be able to work in short, clearly-defined, and frequently-integrated steps (story and task cards, in XP terms). You also need the freedom to change necessary sections of the code (collective code ownership). It helps to have a good set of testing tools, so, depending on your language, there's probably an xUnit framework with your name on it. Also, it can be counterproductive to express your development and testing time estimates separately. At first, testing well will slow you down. It's tempting to throw it out altogether as a time sink. As you learn and your test suite grows, however, the investment will pay off immensely.
Your goal is difficult but doable. It's well worth your time.
how to invest, a novice's guide
I would like to introduce my own method: The CCCC test method (Clicky clicky clicka click).
1. Open the application
2. Click on a totally unexpected object
3. Fill in some text somewhere (if expected)
4. Goto 2.
I find most warnings and bugs here, as long as you have some good assert() in your code. It's best if you use the CCCC method really fast, and test for like 4 minutes every time.
I once did a little Palm programming, and I remember the emulator had a mode where it would randomly click on various controls and enter text really quickly, as a way of stress-testing your app, testing its ability to handle any combination of input and options without blowing up. I wonder if something like that would be useful in the world of typically much more complex PC programs...
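A rough sketch of the random-clicking idea in Java, for anyone who wants to try it without an emulator. Everything here is an assumption for illustration: ListEditor stands in for your application, and Monkey fires random actions at it and checks an invariant after every step. The seed is there so that a failing run can be replayed exactly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical application model: a list editor with a selection index.
class ListEditor {
    final List<String> items = new ArrayList<>();
    int selected = -1; // -1 means nothing is selected

    void addItem(String text) {
        items.add(text);
        selected = items.size() - 1;
    }

    void deleteSelected() {
        if (selected < 0) return;
        items.remove(selected);
        if (selected >= items.size()) selected = items.size() - 1;
    }

    void select(int index) {
        if (index >= 0 && index < items.size()) selected = index;
    }

    // The assumption the monkey is trying to break.
    void checkInvariant() {
        if (selected < -1 || selected >= items.size()) {
            throw new IllegalStateException(
                "selection " + selected + " out of range for " + items.size() + " items");
        }
    }
}

class Monkey {
    // Fires `steps` random actions at a fresh editor.
    // The same seed always produces the same sequence of actions.
    static void run(long seed, int steps) {
        Random random = new Random(seed);
        ListEditor editor = new ListEditor();
        for (int i = 0; i < steps; i++) {
            switch (random.nextInt(3)) {
                case 0: editor.addItem("item" + i); break;
                case 1: editor.deleteSelected(); break;
                default: editor.select(random.nextInt(10) - 2); break; // includes invalid indexes
            }
            editor.checkInvariant();
        }
    }
}
```

Calling Monkey.run(42L, 10000) either finishes quietly or throws at the first step that breaks the invariant.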
A Minesweeper clone that doesn't suck
Not what it doesn't.
As a start, write down test specs for all of your use cases, even if the specs aren't automatable. Then find a tool to simulate the user (e.g. HttpUnit is very good at simulating web users) and try to turn the specs into functional tests. Ensure that your application works today.
The next step is remodularizing your code. Try to find a tool that traces dependency diagrams in your code (e.g. Compuware's Pasta is an excellent free tool for Java). Module interdependency is a strong smell of bugs, so refactor the modules to make the dependencies acyclic, running your tests to keep everything under control.
Then try to write unit tests for the modules, create mock objects from them and check if they do what they're supposed to do. Repeat the second step for your classes (or data structures and functions if it isn't OO). Try to make all dependencies acyclic and create unit tests for them.
And during all these steps use Design by Contract to write down *all your assumptions*. Leave the contracts in production code too (unless removing them is strictly necessary for performance). That way, if your code breaks your assumptions, the contracts will tell you. It'll also force you to rewrite some code to make it checkable (i.e. exposing invariant predicates).
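Java has no built-in Design by Contract, so the following is only a hand-rolled sketch under that assumption. The Account class is hypothetical; it shows a precondition, a postcondition, and an invariant exposed as a public predicate so that tests can check it as well.

```java
// Hypothetical account class with its assumptions written down as contracts.
class Account {
    private long balanceCents;
    private final long overdraftLimitCents;

    Account(long openingCents, long overdraftLimitCents) {
        if (overdraftLimitCents < 0) {
            throw new IllegalArgumentException("overdraft limit must be >= 0");
        }
        this.balanceCents = openingCents;
        this.overdraftLimitCents = overdraftLimitCents;
        requireInvariant();
    }

    // Invariant exposed as a predicate so tests and callers can check it too.
    boolean invariantHolds() {
        return balanceCents >= -overdraftLimitCents;
    }

    long balanceCents() {
        return balanceCents;
    }

    void withdraw(long cents) {
        // Preconditions: the caller must ask for a positive, affordable amount.
        if (cents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (balanceCents - cents < -overdraftLimitCents) {
            throw new IllegalStateException("insufficient funds");
        }
        long before = balanceCents;
        balanceCents -= cents;
        // Postcondition: the balance dropped by exactly the requested amount.
        if (balanceCents != before - cents) {
            throw new IllegalStateException("postcondition violated");
        }
        requireInvariant();
    }

    private void requireInvariant() {
        if (!invariantHolds()) {
            throw new IllegalStateException("invariant violated: balance " + balanceCents);
        }
    }
}
```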
Finally don't forget to check the common XP areas (extreme programming in Yahoo! Groups and news://comp.software.extreme-programming).
It'll be no piece of cake, but when you start to see better-factored code that keeps the bar green you'll be rewarded.
Disclaimer: If I disagree with you I'm probably trolling...
I have a similar situation, I have a bunch of code that "mostly works" and I'd love to have unit and acceptance tests.
But it's really hard to add it later. I mean REALLY HARD. The tests are tedious and boring and after 2-3 I get tired and the tests have errors.
If you follow the XP test-first technique, your code comes out MUCH different. You have low coupling, you have "testable" code where the pieces are interchangeable (so you can easily use mock printer objects or non-RDBMS backends, etc), and generally it's really elegant code with little extra work.
And you don't get bored writing test-first because every time you write a test, you then write the code that passes the test and it's really a feeling of accomplishment! And you don't get "lost in the big picture" because you are focusing only on passing that one little test.
The same is true for acceptance tests. I use HttpUnit to automate web apps, and although I'm not quite as religious about testing the interface, it's great for "add record, query record, delete record" stuff, to make sure it doesn't blow up when the end-user does something basic. For instance I had some code once that worked wonderfully, except login was broken. Since I was testing while logged in and never thought to log out and log back in, I never caught it in my manual tests. Automated tests can catch the stuff you forget.
So I'd recommend requiring tests on all NEW code (you'll see a big difference between the old and new code I bet, in terms of simplicity and low coupling).
And whenever you refactor the old code, start by writing tests that the old code passes.
But it will really be tough to retrofit ALL your old code with tests. I'd even say it's not worth it because your tests will not be good.
And remember: EVERY LINE OF CODE MUST EXIST TO PASS A TEST. That should be your goal on new code.
One word: JUnit
Diplomacy is the art of saying, "Nice doggie!" until you can find a rock.
This is something I've wrestled with too. Start where you'll get the most bang for your buck. Start with regression tests. I assume you're doing *some* testing (or at least your users are ;). When a problem shows up, make an automated regression test that surfaces that bug. Run it often and make sure the bug stays fixed.
With a 300KLOC codebase I have to ask: is it broken down into components that can be tested in isolation? If it is, congratulations, you've done some good software architecture. You can start by testing the interfaces to the components. Make a test that triggers each error condition from each interface function/object. The tests will seem braindead simple (like passing in a NULL when a valid pointer is expected), but these sorts of tests are surprisingly useful. Infrequently exercised error checking is one of the easiest things to let slip through the cracks when modifying an implementation. That will be enough to get your test framework set up, and shake out all the forgotten dependencies between your components. Then it will be straightforward to add more testing.
It won't be easy. You should expect you'll have to modify your code to make it testable. But if you expect to keep this code around for a while, it will pay off in the long run.
1) *Every time* you discover a bug from now on, write a test case that exposes it. Then fix it.
:-)
2) Write new functionality test first. You are not allowed to implement new features unless you first implement a test that fails. Once it runs you are either done, or you got ahead of yourself and need to get back to writing a few more tests.
Hold on dude! You are not a programmer, so you shouldn't give advice on programming.
Your resume clearly indicates that you are just a Perl hacker. As everyone knows, Perl is not a real programming language.
Come back when you know a real programming language.
Thank you and good night.
yes, these automated test tools are called "smart monkeys".
"James Tierney, former Director of Testing at Microsoft has reported in internal presentations that some Microsoft applications groups have found 10-20% of the bugs in their projects using monkey test tools."
cpeterso
I've done this kind of testing with an assembler I once ported. I used a program (from Usenet) that measured the occurrence of each possible multi-letter sequence and generated random output text with the same ratios. Feed it English and you get something that looks like English. I fed it assembly and it output something that looked like assembly. I rewrote the parser to reset when it got too deep and let it run. It took me over a week to fix all the bugs this method found. Most of the bugs were in opcodes that no one ever really used, but it was fun to clean up the code this way.
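I don't have that Usenet program any more, so the following is not it, just a sketch of the same idea as I understand it: a character-level Markov chain. The class name, parameters, and the restart-on-dead-end rule are all my own assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class MarkovText {
    // Generates `length` characters that follow the same order-`order`
    // letter-sequence statistics as `sample`.
    static String generate(String sample, int order, int length, long seed) {
        if (order < 0 || sample.length() <= order) {
            throw new IllegalArgumentException("sample must be longer than the order");
        }
        // Map each `order`-character context to every character seen after it.
        // Keeping duplicates in the list preserves the observed ratios.
        Map<String, List<Character>> followers = new HashMap<>();
        for (int i = 0; i + order < sample.length(); i++) {
            String context = sample.substring(i, i + order);
            followers.computeIfAbsent(context, k -> new ArrayList<>())
                     .add(sample.charAt(i + order));
        }
        Random random = new Random(seed);
        StringBuilder out = new StringBuilder(sample.substring(0, order));
        while (out.length() < length) {
            String context = out.substring(out.length() - order);
            List<Character> next = followers.get(context);
            if (next == null) {
                // Dead end (context only occurs at the very end of the sample):
                // restart from the sample's opening context.
                out.append(sample, 0, order);
                continue;
            }
            out.append(next.get(random.nextInt(next.size())));
        }
        return out.substring(0, length);
    }
}
```

Feed the result to your parser in a loop with different seeds; when it blows up, the seed tells you which input to regenerate.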
Get what the QA people call a "fuzzer".
There are two general types (often bundled together with a few other things as a test suite).
The first generates random keystrokes and the second generates random data either completely randomly or following some set of guidelines (field length etc.)
It still won't do everything that exposure to the real world will, but it'll get you a lot closer!
You're more likely to get people who know XP and TDD on http://groups.yahoo.com/group/extremeprogramming than you are on Slashdot, simply because the yahoo group is focused on that very topic.
No Zen is good zen
What several posters have already suggested.
Don't write tests for existing code just because it's there. Write tests for any code you have to change, and then do the change test-first.
Initially, it's going to be hard because legacy code is usually highly coupled. If you pay attention to reducing coupling, over time your code base will start to improve.
And get one of the xUnit clones for unit tests, and FIT (fit.c2.com) for acceptance tests.
John Roth
If there's something that's particularly hairy in your existing codebase, the next time you need to modify it, spend just a little time refactoring it first so you can make your modification more easily. You don't need to do a six-month rewrite, just a little bit of refactoring to remove local duplication and confusion. And remember this about refactoring: It's not just about improving the design of existing code by making careful transformations of it, it's about rigorous removal of duplication.
Write unit tests for the code you're refactoring immediately before you do so, in order to verify that your refactoring isn't actually changing the functionality. In other words, if you're extracting a method Foo::ExtractedMethod from the method Foo::BigMethod, start by writing a couple unit tests for Foo::BigMethod and verify that they pass. Then write a unit test for Foo::ExtractedMethod and watch it fail (because you haven't done the extraction yet). Then actually perform the extraction.
Once you've added a new feature, don't consider it completed until you've done another small round of refactoring to remove any duplication that adding the feature introduced to your codebase. Do the same thing as above, writing unit tests for any code that you're refactoring that doesn't have any yet, and ensuring that your new feature's tests all continue to pass. After a few feature additions your codebase should actually start getting cleaner and your test coverage should go way up.
Another tip: Try not to have interface-layer (View in MVC terminology) code do too much work. Try to keep the application's business logic in the Model layer and its interaction logic as much as possible in the Controller layer. This will make it much easier to write unit tests; your View layer will be very thin and mostly serve as a wiring-up of your Controller layer to various human interface widgets. The end result is that you can then write a test to validate that entering one value here and another value there doesn't invalidate your assumptions simply by tickling the appropriate Controller-layer code, rather than by trying to "fill in" text fields and "push buttons" via test code.
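To make the thin-View idea concrete, here is a small sketch in Java. The TemperatureModel and TemperatureController classes are invented for the example; no GUI toolkit is involved, which is the whole point. A test just hands the controller the text a user would have typed.

```java
// Hypothetical model: holds application state, knows nothing about widgets.
class TemperatureModel {
    private double celsius;

    double getCelsius() { return celsius; }

    void setCelsius(double value) { celsius = value; }
}

// Hypothetical controller: validates input and updates the model.
class TemperatureController {
    private final TemperatureModel model;
    private String errorMessage = "";

    TemperatureController(TemperatureModel model) {
        this.model = model;
    }

    // Called by the View with the raw text of an input field.
    // Returns true if the model was updated.
    boolean submitFahrenheit(String text) {
        double fahrenheit;
        try {
            fahrenheit = Double.parseDouble(text.trim());
        } catch (NumberFormatException e) {
            errorMessage = "not a number: " + text;
            return false;
        }
        if (Double.isNaN(fahrenheit) || Double.isInfinite(fahrenheit)) {
            errorMessage = "not a finite number: " + text;
            return false;
        }
        double celsius = (fahrenheit - 32.0) * 5.0 / 9.0;
        if (celsius < -273.15) {
            errorMessage = "below absolute zero";
            return false;
        }
        errorMessage = "";
        model.setCelsius(celsius);
        return true;
    }

    String getErrorMessage() { return errorMessage; }
}
```

A real View would only forward the field contents to submitFahrenheit() and display getErrorMessage(); everything worth testing lives below it.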
If you currently have a lot of logic built into your human interface widgets, those are prime candidates for refactoring the next time you need to touch them (or notice they contain some duplication relative to the code that you're currently touching).
And as others have said, check out the Extreme Programming mailing list at Yahoo! Groups. It's a great list with a lot of great people who are very willing to answer questions and help any way they can.
Here are some of my observations having applied many forms of testing.
One of the chief goals of XP is that units can, and should, be redesigned whenever any programmer identifies a unit that is poorly designed (contrast this to RUP, which is far less ad-hoc but still relies upon unit testing). This requires that your code be designed as components, and that each component have a well-defined contract.
Based upon the brief bit you mentioned in your story, it sounds like integration testing is more applicable than unit testing. 300,000 lines of functionality (whatever that means...I follow Mark Twain's concept of "if I had more time I would have written a smaller program") written by smart but conscientious hackers sounds to me like design that does not meet the criteria of small components with well-designed contracts. One developer likely cannot redesign a portion of the program without it having undesirable side effects in different parts of the program. Of course, I am making assumptions about your code that may be incorrect, but you have given scant little regarding the project itself (what language was it in? is it a web app? a server app? a client app?)
If your code meets the requirements for unit testing, write small, simple tests that should require little in the way of debugging (very small--most of my testing methods consist of only a few lines each). Test the contracts defined by each component. Build unit tests that test 3 things: successful conditions (maybe even repeat the test 100s of times to simulate load... code often behaves differently as buffers/queues fill up), errant conditions (to ensure things behave as expected), and boundary conditions (to ensure the contracts are met). My favorite unit testing tool is JUnit (I'm a Java guy).
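Here is a sketch of those three kinds of tests against a made-up BoundedQueue component. I'd normally write them as JUnit test methods; plain static methods are used here only so the example runs with nothing but the JDK.

```java
import java.util.ArrayDeque;

// Hypothetical component. Contract: holds at most `capacity` non-null items, FIFO.
class BoundedQueue {
    private final ArrayDeque<String> items = new ArrayDeque<>();
    private final int capacity;

    BoundedQueue(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    boolean offer(String item) {
        if (item == null) throw new IllegalArgumentException("null items are not allowed");
        if (items.size() == capacity) return false;
        items.addLast(item);
        return true;
    }

    String poll() { return items.pollFirst(); } // null when empty

    int size() { return items.size(); }
}

class BoundedQueueTest {
    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    // Successful conditions, repeated to exercise the queue as it fills and drains.
    static void testSuccess() {
        BoundedQueue queue = new BoundedQueue(4);
        for (int round = 0; round < 1000; round++) {
            check(queue.offer("a" + round), "offer should succeed");
            check(("a" + round).equals(queue.poll()), "poll should return the item just offered");
        }
        check(queue.size() == 0, "queue should end up empty");
    }

    // Errant conditions: bad input must be rejected the way the contract says.
    static void testErrors() {
        BoundedQueue queue = new BoundedQueue(1);
        try {
            queue.offer(null);
            throw new AssertionError("null should be rejected");
        } catch (IllegalArgumentException expected) {
            // this is the documented behaviour
        }
        check(queue.poll() == null, "poll on an empty queue returns null");
    }

    // Boundary conditions: exactly at capacity, and one past it.
    static void testBoundaries() {
        BoundedQueue queue = new BoundedQueue(2);
        check(queue.offer("x"), "first item fits");
        check(queue.offer("y"), "item at capacity fits");
        check(!queue.offer("z"), "item past capacity is refused");
        check(queue.size() == 2, "refused item must not change size");
    }

    static void runAll() {
        testSuccess();
        testErrors();
        testBoundaries();
    }
}
```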
If your code isn't well suited towards unit testing, perhaps integration testing is more appropriate. This won't give you the XP capabilities of allowing complete redesign of components without affecting the system as a whole, but it can lead to more reliable code releases going forward. Furthermore, I find it helpful to build an integration test for each bug that surfaces, so that I don't reintroduce the same bugs I've previously fixed. In my experience, this works better than unit testing when writing tests for existing code. HttpUnit is great for integration testing web applications. There are many off-the-shelf commercial apps to integration test GUI applications, too. Many of these are point-and-click. They basically record macros of mouse/keyboard events that can be replayed. Build up a large enough suite of these and you can regression test code, even if it was not designed well as encapsulated components. However, don't kid yourself--this isn't XP-like testing. These tests won't help you rewrite small units of your code. But they will help you test the functionality of the entire system when changes are made. That is not XP coding. XP really requires component-based design.
Good luck! But don't kid yourself--XP cannot be bootstrapped onto code developed without using proper design principles. Testing is still very useful...but this testing is not in the spirit of XP.
--Be human.
There must have been so many within the 10-20% that they thought they were done.
> 300,000 lines of functionality
If you have this much code, I bet there's some duplicated code in there. Ferret it out with CPD and you'll have that much less code to write tests for.
It probably wouldn't hurt to search for unused code while you're at it - again, you'll reduce the amount of code you need to write tests for.
The Army reading list
Some applications are just way too complex. As your application grows, this test will be less useful. I am trying to learn automated testing of web applications, and so far, I've found that javascript and popups are evil.
This post is kinda silly.
--
say NO to software patents
Donald Knuth Letter to the Patent Office.
Carta a la Oficina de Patentes por el Profesor Donald Knuth
http://arhuaco.org/
If you spend more time debugging test code than debugging your functionality ... then your API/abstraction of the desired functionality is wrong.
.... .... and so on ...
... very simple set up. You have at least 2 scenarios so far, one where the customer gives the correct identification code and one(or more) where he does not. If you elaborate the use case you find also others, e.g. when the account has not enough money to withdraw etc.
... in Java, Python, Smalltalk, you can even run the code but get errors, ERROR means: TEST FAILED.
// this is one scenario from the use case "withdraw Money"
...(); // likely you have a factory for an Facade object or something
//ignore: 2) system asks for identification (4 digit code) // system checks, with result ok 4) check code (E1) // ignore: 5) system askes for amount of money to withdraw // test failed, print it or something
For further development try this:
Use use cases in textual form, probably "essential use cases" (you can google for that); it is the easiest way in case you are not familiar with use cases so far.
Further, use "scenarios" for defining certain things which can be done from the outside with your application. A scenario corresponds to a use case in that it offers different paths of interaction through the same screens, with different data and decisions of the actor (user).
E.g., if you have a use case for an automated teller machine:
Use case: withdraw money
Actor: customer
1) insert bank card
2) system asks for identification (4 digit code)
3) enter code
4) check code (E1)
5) system asks for amount of money to withdraw
E1) report error: wrong code. If tries > 3 abort (continue at E2), otherwise continue at 2)
E2)
That's a use case.
So: the above you likely do already somehow (I assume you don't program with a 400 page Word document as "specification", do you?)
Now FIRST: transform the scenarios into test code. Write code that tries to perform the steps a customer needs to do to withdraw money. Call non-existing methods/functions with that code. Write the code until it compiles! In a C-like language it won't run, because it won't link.
Now start to connect the test to your code, by either writing non existing code or by building a bridge from the test driver to the existing code.
My intention is that the test code is so simple, that it performs one action, a user would do, with one line of code.
If you follow that idea, your API to your backend will get very clear and simple. If you have a simple API, you likely have simple code behind that API. When you have simple code you likely have fewer bugs.
And with a test case derived from a customer scenario you have a test case which shows:
a) broken functionality after code reworks
b) what the system is intended to do -- it becomes specification!
Sample:
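class TestWithdrawelWithCorrectCodeAndEnoughMoney {
    String customerName = "Billy Joe";
    String accountNumber = "123456789";
    String code = "1234";
    double amount = 100.00;
    MyApplication system = new ...(); // likely you have a factory for a Facade object or something

    // One scenario from the use case "withdraw money": one line per user action.
    public void testScenario() throws Exception {
        system.enterCard(customerName, accountNumber);
        system.enterCode(code);
        double oldBalance = system.getBalance();
        double money = system.withdraw(amount);
        if (money != amount) throw new WithdrawErrorException("wrong amount withdrawn");
        double newBalance = system.getBalance();
        // The balance after the withdrawal plus the amount must equal the old balance.
        if (newBalance + amount != oldBalance) throw new WithdrawErrorException("wrong balance after withdrawal");
    }

    public static void main(String[] args) {
        try {
            // testScenario() is an instance method, so it needs an instance.
            new TestWithdrawelWithCorrectCodeAndEnoughMoney().testScenario();
        } catch (Exception x) {
            // test failed, print it or something
            System.err.println("TEST FAILED: " + x);
        }
    }
}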
As you see, the code in testScenario() is simple, and besides the fact that it is
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
As others have noted, retrofitting some sort of automated testing would be very hard, but can be well worth it in the long run, maybe even this late in a development.
The key ingredients of successful automated test suites I've seen are...
Given the above, all you need is a set of macros and a record of the state that should result, perhaps generated by that macro. (You probably want to record some sort of summary of the state that's likely to reflect as small a change in your state as possible, rather than the full state; often it's the fact that something has changed that is important, not the specifics of what the change was.) Then just replay all the macros and compare the state produced by your current program with the recorded, known-good states, and investigate any discrepancies.
Obviously writing an engine to do these things can be a very significant amount of work. For UI code, particularly if you're using a heavily message-based architecture, it's probably easiest to put the macros in at the message level, IME, which at least gives you a relatively easy way to replay those macros later.
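A toy version of the replay-and-compare loop, sketched in Java. The command strings, the TextBuffer class, and the summary format (length plus a hash) are all invented for the example; a real engine would record messages from your own application.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical application state driven by text "macro" commands.
class TextBuffer {
    private final StringBuilder text = new StringBuilder();

    void apply(String command) {
        if (command.startsWith("type ")) {
            text.append(command.substring(5));
        } else if (command.equals("backspace")) {
            if (text.length() > 0) text.setLength(text.length() - 1);
        } else if (command.equals("clear")) {
            text.setLength(0);
        } else {
            throw new IllegalArgumentException("unknown command: " + command);
        }
    }

    // Compact summary of the state, small enough to store next to the macro.
    // It shows that something changed, not what changed.
    String summary() {
        return text.length() + ":" + Integer.toHexString(text.toString().hashCode());
    }
}

class MacroReplay {
    // Replays one macro against a fresh buffer and returns the resulting summary.
    static String replay(List<String> macro) {
        TextBuffer buffer = new TextBuffer();
        for (String command : macro) {
            buffer.apply(command);
        }
        return buffer.summary();
    }

    // Returns the indexes of macros whose state no longer matches the recording.
    static List<Integer> findDiscrepancies(List<List<String>> macros, List<String> knownGood) {
        List<Integer> failures = new ArrayList<>();
        for (int i = 0; i < macros.size(); i++) {
            if (!replay(macros.get(i)).equals(knownGood.get(i))) {
                failures.add(i);
            }
        }
        return failures;
    }
}
```

Record the summaries once from a build you trust, then run findDiscrepancies() on every later build and investigate whatever indexes come back.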
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
In addition to the other good comments already posted (especially those regarding refactoring), let me suggest making increased testability part of the ongoing maintenance process. Specifically, every time a problem is reported, make the creation of a test that exposes the error the first order of business. (Of course, make it as fine-grained as possible, and refactor as necessary to achieve that focus.)
Then use that test to guide the resolution of the problem. Iterate throughout the life of the project.
Write a test document first.
The biggest problem with testing is that people test what the code was programmed to do. However, testing should be based on the requirements, to make sure that it does what it was intended for. Also, if the test document is written before coding, it means people won't cut corners when writing that document (due to optimism) and it may help drive coders to code the really important parts first.
Have you read my journal today?
Instead of having to write tests for HttpUnit, you can simply record tests with JMeter. You also have the added benefit of JMeter's load testing capabilities. The only downside is that JMeter's UI isn't very mature (or intuitive, for that matter).
One thing unit testing is often spectacularly good at is pointing out where assumptions have been made and not spelled out. This often takes the form of "negative testing". (What happens if you add a NULL to that list? What happens if you try to access the -1th element in an array? What happens if you neglect to set an IP address?)
What you'll likely find is that, in a number of spots, the refusal or assertion is not spelled out. Occasionally, it can take some mulling to figure out how to deal with those edge cases (could NULL be valid in some circumstances? Should we make a different derivative or member variable to determine that behavior?)
There's much positive testing to do, too.
Don't bother with testing the extremely simple stuff. If it's to the point where you might as well be questioning whether the compiler can do its job, you won't gain value from it, and it will bore you to tears. (Mind you, if you have one of "those" compilers whose very foundations you question... :)
If you have classes that produce output lists/objects, one nice technique, instead of checking the output manually, is to create an equals/== method for your output, then create the expected output in your test and compare it (via your equals) with the output from the class.
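A sketch of the equals technique in Java, with invented names: LineItem is the output object with value-based equality, and OrderMerger is the class under test. The test builds the expected list and compares it in one step, since List.equals() compares element by element using LineItem.equals().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical output object with value-based equality.
final class LineItem {
    final String sku;
    final int quantity;

    LineItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof LineItem)) return false;
        LineItem that = (LineItem) other;
        return quantity == that.quantity && Objects.equals(sku, that.sku);
    }

    @Override
    public int hashCode() {
        return Objects.hash(sku, quantity);
    }

    @Override
    public String toString() {
        return sku + " x" + quantity; // makes a failed comparison readable
    }
}

// Hypothetical class under test: merges duplicate SKUs, keeping first-seen order.
class OrderMerger {
    static List<LineItem> merge(List<LineItem> input) {
        List<LineItem> result = new ArrayList<>();
        for (LineItem item : input) {
            int index = -1;
            for (int i = 0; i < result.size(); i++) {
                if (Objects.equals(result.get(i).sku, item.sku)) index = i;
            }
            if (index < 0) {
                result.add(item);
            } else {
                int total = result.get(index).quantity + item.quantity;
                result.set(index, new LineItem(item.sku, total));
            }
        }
        return result;
    }
}

class OrderMergerTest {
    static void run() {
        List<LineItem> actual = OrderMerger.merge(Arrays.asList(
            new LineItem("A", 1), new LineItem("B", 2), new LineItem("A", 3)));
        List<LineItem> expected = Arrays.asList(new LineItem("A", 4), new LineItem("B", 2));
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}
```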
Some other miscellaneous observations I've made:
One of the hardest things about writing unit tests is trying to interface to the outside world. Whenever you can, avoid it. You can fake things to a point (using 127.0.0.1 as an IP address in some tests, for example), but you'll have to fall back on functional testing at some point. That's another good reason for keeping as much logic out of the view as you can.
One note of hope: the most difficult part of unit testing is getting started. Honestly. Once you "get it", you will always "get it", so hang in there :)
Binary geeks can count to 1,023 on their fingers
I ran across an article a couple of years ago by Chuck Allison in C/C++ Users Journal about The Simplest Automated Unit Test Framework That Could Possibly Work. It included test frameworks written in C, C++, and Java and opened my eyes to doing best practices to the extreme. It also showed me how I could apply unit testing to my C code. You can download free Test Frameworks (Test Suites) for other languages.
Unit testing was the first XP key practice that I started to use. When I would have to make a change in my mature code, I would add a unit test section to the module I was changing (using #define TEST), and add a main() to execute the unit test (using #define TEST_MODULENAME). See examples of this on my software page. I then began using test-first programming by writing the unit test first, seeing it fail, then writing just enough code to make it pass.
Other extreme programming sites that have been useful have been extremeprogramming.org , which has a great tutorial that includes an introduction and overview, and the site Extreme Programming.
I would recommend some reading: first Refactoring, which talks about code "smells".
One such smell is "Inappropriate Intimacy";
recipes are given to resolve these kinds of problems. The nice thing about it is that it talks about symptoms of problematic code, so you will more easily understand where your problems occur and why. Next, read a good book on unit testing: Unit Testing In Java
At some point, you might get interested in Design Patterns a good alternative for Java with more practical examples.
Don't try to add all tests at once, do it on a regular basis, one by one, as needed to support your refactorings, and for regression testing. You don't want to find the same bug more than once.
Start using Ant as a build tool, automatically executing your test suites at least with every build.
Try to achieve an MVC (Model - View - Controller) architecture. The View (awt/swing, html/jsp, console,...) should only be responsible for showing the state of the model. The Controller should only validate user input and manipulate the Model (Mediator pattern). The Model should represent the business logic (you'll typically find Person, PersistenceService, AuthenticationService objects here).
Just don't rush things; the goal is not to make your code compliant with The Only Right Way Of Doing Things(TM), the goal is improving your code to achieve more robustness and maintainability.
BTW, I'm speaking from experience, as Sr developer/analyst/architect in a small team and as coach/instructor. I wish you success ;-)
Working on a project right now.
I consist of the entire QA team.
and that is my testing method.
play with the features until I go insane.
-Grumpy old tester.
Is it true that more people vote for the winner of American Idol, than vote for the president? -Ali G.
...Such as this
My opinion? See above.
Like a previous post, I'd suggest "evolving" to testability - every time you fix a bug or add a feature, do it test first.
You will have to spend some time setting up the testing framework - a structure for your unit tests, the "non-code" stuff, and a way of finding out asap that you've broken a test.
Depending on your environment, you could use something like AntHill or CruiseControl to automatically run all your unit tests as part of a (timed) build process, and email the results. CruiseControl also allows you to specify regular intervals at which your entire code-base is checked out of source code control, and unit-tested - you get an email if something breaks.
A key problem for most systems is getting all the "non-code" stuff (in my case mostly databases etc.) into a known-good state so you can rely on a unit test reporting accurate errors when trying to insert a duplicate value or delete a non-existing record - again, automate using (something like) Ant - Ant will do a lot of this stuff for you on non-java projects too.
Once you have refactored sufficiently, you can hopefully start testing independently from the "non-code" items.
I'd suggest buying Kent Beck's book "Test Driven Development" for more ideas on how to code in this paradigm - it's very very good.
Another book worth reading is The Pragmatic Programmer (they have a website too). Especially the "no broken windows" section is very worthwhile...
It's all very well in practice, but it will never work in theory.