Slashdot Mirror


Community Test Data Repository?

BlizzyMadden inputs this query: "Currently I am working on a small utility to convert HTML to plain text. As I test it, I create more and more different types of HTML files to regression test it. I wonder whether these test files would be beneficial to other developers who may be doing similar work. To expand on this thought, I wonder if there is a community-based repository of test data anywhere that developers can use and contribute to. Just curious if anyone knows of any project website out there that offers this." "Such a repository would be useful for files like the following:
Complex HTML files.
RTF and Word files with lots of formatting.
Large text files.
Excel files with complex equations and macros.
It would be great if developers shared files like these so that others could use them to debug their own applications."

4 of 50 comments

  1. Sourceforge? by LardBrattish · · Score: 5, Interesting

    If there isn't a test data project maybe you could start one. If people agree that it's a good idea then it'll grow... if not...

    I believe the idea has merit and should be done. This would be useful for the developers of many FOSS applications. A "torture test" of nasty Excel files or Word files would help Open Office etc. HTML files would be good for the Mozilla team. Maybe they would be interested in providing the first few sets of data.

    I'd also recommend tying the automated regression tests to this open source test data so every developer could download the source & the test data and make sure the new feature doesn't break anything...

    Any new troublesome files could be added to the test data and new tests could be built to ensure that the software deals with them.
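
    A minimal sketch of how regression tests could be tied to a shared test-data directory. Everything named here is an assumption for illustration: the layout (each `name.html` paired with a `name.expected.txt`), the `run_corpus` helper, and the `html_to_text` stand-in, which is only a crude tag stripper and not any real project's converter.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text nodes; a stand-in for a real HTML-to-text converter."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Hypothetical converter under test: drops tags, collapses whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def run_corpus(corpus_dir, convert=html_to_text):
    """Return the names of corpus files whose output differs from the expected text."""
    failures = []
    for source in sorted(Path(corpus_dir).glob("*.html")):
        # Assumed layout: page.html sits next to page.expected.txt
        expected_file = source.with_name(source.stem + ".expected.txt")
        if not expected_file.exists():
            continue  # no recorded expectation yet, so skip rather than guess
        expected = expected_file.read_text(encoding="utf-8").strip()
        actual = convert(source.read_text(encoding="utf-8"))
        if actual != expected:
            failures.append(source.name)
    return failures
```

    Under this layout, adding a newly discovered troublesome file to the shared data means dropping in the input and its expected output; `run_corpus` picks it up on the next run without any change to the test code.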

    --
    What are you listening to? (http://megamanic.blogetery.com/)
  2. Re:Great idea. by seanyboy · · Score: 4, Interesting

    Why the hell is that a troll? In the past I've wanted 100,000 or so mailing addresses to test an indexing routine on, and have ended up spending time writing a random address generator. If I'd been able to go to a site (like lorem ipsum), ask for 100,000 addresses in CSV format, and download them as a zipped file, it would have saved time. I'm sure I'm not the only developer this has happened to. Jeez.
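
    For illustration, a rough sketch of the kind of random address generator described above, writing its output as a zipped CSV. The word lists, column names, and function names are all made up for this example, and the addresses are not meant to be valid for any real postal system.

```python
import csv
import io
import random
import zipfile

# Placeholder vocabularies (assumed, not drawn from any real data set)
STREET_NAMES = ["Oak", "Maple", "Station", "Church", "Mill", "Park"]
STREET_TYPES = ["Street", "Road", "Avenue", "Lane"]
CITIES = ["Springfield", "Riverton", "Lakeside", "Fairview"]
HEADER = ("number", "street", "city", "postcode")


def generate_addresses(count, seed=0):
    """Yield `count` fake address rows; the same seed gives the same rows."""
    rng = random.Random(seed)
    for _ in range(count):
        street = f"{rng.choice(STREET_NAMES)} {rng.choice(STREET_TYPES)}"
        postcode = f"{rng.randint(0, 99999):05d}"
        yield (rng.randint(1, 9999), street, rng.choice(CITIES), postcode)


def write_zipped_csv(path, count, seed=0):
    """Write the generated rows as addresses.csv inside a zip archive at `path`."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    writer.writerows(generate_addresses(count, seed))
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("addresses.csv", buffer.getvalue())
```

    Calling `write_zipped_csv("addresses.zip", 100_000)` would produce the 100,000-row file; a fixed seed keeps the data reproducible, so an indexing test can be rerun against identical input.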

    --
    Training monkeys for world domination since 1439
  3. IAWTP by Clover_Kicker · · Score: 2, Interesting
    I once needed a few thousand names for test data. The only big list I could find was the list of men killed in Vietnam.

    Anyone have a less disturbing list of real or fake names? I suppose someone could grab some data from a genealogy site, strip out just the names, and use that.

    If anyone knows of (or starts) a project like this I'd probably contribute.

  4. Re:or you could... by seanyboy · · Score: 2, Interesting
    --
    Training monkeys for world domination since 1439