Slashdot Mirror


Strategies for Test Databases?

youngcfan asks: "I've been tasked with finding strategies for a test database that can be used effectively by both software developers and the QA team. We're a J2EE shop with most of the interesting pieces of the application interacting heavily with the database -- so we need to test it. We're ramping up on JUnit, but are looking for ways to test the database-driven pieces of code. Since QA needs the same database for functional testing separate from developers' unit tests, DBUnit doesn't seem to suffice. We also have the challenge of working on multiple releases at the same time, which only complicates how and when to add new data to the test database in a way that's useful and valid for everyone. We're looking for strategies for using a test database in a way that meets both the QA's and the developers' needs, works for multiple releases, and isn't a heavy burden to maintain given that the schema and code can change anytime before any of the multiple upcoming releases. Any suggestions?"

5 of 66 comments (clear)

  1. Re:Use real data, not test data by Anonymous Coward · · Score: 1, Interesting

    A problem we had in a shop I used to work for was the the production dataset was huge. There were some plans to try and take subsets of data... but the schema was quite large and complex - making it a pain to keep integrity (which is crucial for performing tests against). In the end, we ended up doing a big refresh of test every few months. This was for user acceptance testing. The developers box got updated even less often - and as you can imagine - this caused huge problems (developers were expected to update the data and stored procedures needed for their portion of testing). Some of us just snuck our tests onto the acceptance box to get work done.

  2. Re:Perfect world enviroment by LarsWestergren · · Score: 2, Interesting

    You'd have 3 servers as close in configuration as possible. One houses your production enviroment DB and the other houses your test and one for the QA enviroment. You can get away with QA and TEST in the same server but you REALLY don't want a devoloper to crash the test box or bog it down with a bad query when they're doing QA.

    Seconded. I'm on a project right now where we (the programmers) have finally gotten management to allocate time for us so we can get going on doing more unit testing, integration testing and generally cleaning up the code.

    We have had a few incedents where a bug caused bad data to be inserted into the database. The bug was solved, but the data remained and caused strange behaviour. I am currently (reluctantly :-) learning Ruby so I can write a script that empties database tables and inserts fresh test data. This script is started by cron every night and the JUnit tests and integration tests are then run automatically. The QA team can then do their manual test during the day on that server.

    We also have one server that all developers are running their daily code against, and one intermediate server where we do test deploys before stuff is delivered for QA testing.

    I can also recommend you to take a look at Apache Derby which is included in JDK6. It is small, fast, and you don't have to do a lot of setting up. A single line of code to open a jdbc connection and you are ready to go, perfect for testing.

    --

    Being bitter is drinking poison and hoping someone else will die

  3. Like a Forest Fire by Flwyd · · Score: 2, Interesting

    We define our schema in an XML format. We have a class that builds a DB from that format, subclassed by database type, making skeletal DB install an automated process. This also means it's the same process to install a client site using Oracle as it is to install a test database on a developer machine using Postgres.

    When our master build runs test cases, it drops all tables and creates them all fresh using the XML definitions. Each JUnit test case is responsible for ensuring it has the data it needs. In some cases, this is done by setting up a facade on the regular service so that the test can worry about semantics and not data storage. In other cases, the test (or a utility) creates test data. You could presumably also copy part of your live data, though that makes it much more difficult to know what the correct answer is in advance.

    If you follow this structure, multiple releases with different schemas is trivial. Just have a parameter for the DB URL in your test suite and let it build the correct database version for you when it checks your schema out of your source repository.

    (Incidentally, keeping your database schema in your source repository also allows easy comparison of database structure between code versions, making it easier to figure out what must happen when you upgrade.)

    --
    Ceci n'est pas une signature.
  4. DBUnit can be very useful by mikeburke · · Score: 3, Interesting

    I work with a large, legacy codebase - about 2 million lines of code, 600 tables. Some bits are nicely written, some aren't. Concepts such as dependency injection, seperation via interfaces etc are not prevasive, so traditional unit testing approaches of mocks or HSQL are not useful (in fact I find they do not scale for 'meaningful' tests anyway).

    So you have this legacy code base - you want to make changes, but how can you validate the result? One approach is to compare database states - one from a known good codebase, one from a modified codebase. DBUnit can be tremendously useful here - this is what I've done (perhaps too complex for explaining on Slashdot):

    Create a common Unit Test base class that extends DBUnit's DatabaseTestCase. It will:

        a) receive a list of modified table names from the concrete test class
        b) if a system property is set, export a pristine copy of these tables prior to running the test - 'reference data'.
        c) execute the use case (register a user, perform a transaction, whatever) - this just makes a 'blind' call into the
              code proper.
        d) if a system property is set, export the modified table data ('known good results')

    The idea is you run this test twice:

    1) With the original codebase, with result exporting enabled to generate known good results.

    2) With the codebase under test - the results generated will be compared against known good results and DBUnit will flag any differences. You can get it to ignore stuff like sequnces,dates that will differ between runs.

    The reference data generated in (b) is reloaded prior to running the test second test, so you start from the same point. Each concrete test class just has to:

    * figure out what tables change within the test
    * provide the test code itself

    Everything else is managed by DBUnit - exporting/importing datasets, comparing datasets, etc.

  5. Re:Why use only one DB? by angel'o'sphere · · Score: 2, Interesting

    Oh my god ....

    You are nearly as wrong as your parrent!

    1st: the QA system very likely won't be the production system, but the production system running in future.
    2nd: DEFINETELY the development system is the same like the QA system. And no: it is not in flux!!! It is reset after each developer test, or developer access to it, either by erasing it and using a back up or by "roll back" of all transactions (that likely is not possible).

    How the hell should a developer figure if his actual "attempt of a new working piece of code" failes because "his DB is in flux" or if he has a programming error? What is if it does not fail because his work is in flux, but the QA system later says: you deleiverd defect code?

    If developers have the feeling "they need to set up" somethign before they can use the "rolled back" DB then the QA system needs very likely the same "set up". Note: the QA system wants to be as close as possibel on the future deployment system, so wants teh development system.

    Rule of thumb:
    a) production system - the system your old code runs on and the system your new code wil be deployed on
    b) QA system - the system that includes a "good" stand of the production system and the anticipated meaningfull defaults for a new system in future
    c) the development system - the system that is similiar to the QA sysem but has even more, the test environments for developers, probably via a mandator/client approach, this one should very easy and fast be able to fall back on an old revision.

    However, to answer the core question of the poster: your test cases should abstract away the data base and use mock ups for scenario tests.
    The goal should be to write for every QA test a script/test like you would for unit tests. That test can run completely without a DB if your system architecture is sound, that means if you have a class/object responsible for accessing every external resource. So you can e.g. use a flat file (property file or XML or what ever) to mimic a DB, by simply having a production version of that class and a test version.

    angel'o'sphere

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.