Too Darned Big to Test?

Posted by timothy on Monday March 7, 2005 @10:58PM from the ship-it dept.

gManZboy writes "In part 2 of its special report on Quality Assurance (part 1) Queue magazine is running an article from Keith Stobie, a test architect in Microsoft's XML Web Services group, about the challenges one faces in trying to test against large codebases."

Article summary... by TuringTest · 2005-03-08 00:07 · Score: 3, Informative

... automatically performed by OTS:

Finally, testers can use models to generate test coverage and good stochastic tests, and to act as test oracles. A fundamental flaw made by many organizations (especially by management, which measures by numbers) is to presume that because low code-coverage measures indicate poor testing, or that because good sets of tests have high coverage, high coverage therefore implies good testing (see Logical Fallacies sidebar). One of the big debates in testing is partitioned (typically handcrafted) test design versus operational, profile-based stochastic testing (a method of random testing). Current evidence indicates that unless you have reliable knowledge about areas of increased fault likelihood, then random testing can do as well as handcrafted tests.[4,5]

For example, a recent academic study with fault seeding showed that under some circumstances the all-pairs testing technique (see Choose configuration interactions with all-pairs later in this article) applied to function parameters was no better than random testing at detecting faults.[6]
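For anyone who hasn't met all-pairs before, here is a minimal sketch of the idea: keep picking full parameter combinations until every pair of values has appeared together in at least one test. This is a simple greedy version written for illustration, not the method from the article or the cited study; the function name `all_pairs` and the parameter dictionary are made up for the example.

```python
from itertools import combinations, product

def all_pairs(params):
    """Greedy all-pairs: pick full combinations until every value pair is covered.

    params maps a parameter name to its list of values (hypothetical input shape).
    """
    names = list(params)
    # Every pair of (parameter, value) settings that must appear together in some test.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in params[a]
        for vb in params[b]
    }
    # Candidate test cases: the full cartesian product (fine for small examples only).
    candidates = [dict(zip(names, values)) for values in product(*params.values())]

    def pairs_of(case):
        return {((a, case[a]), (b, case[b])) for a, b in combinations(names, 2)}

    suite = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite
```

The result covers every two-way interaction with fewer cases than the full product, but as the study above suggests, that does not by itself mean it finds more faults than the same number of random cases.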

The real difficulty in doing random testing (like the problem with coverage) is verifying the result. A test design implication of this is to create relatively small test cases to reduce extraneous testing or factor big tests into little ones.[9]
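A rough sketch of what "verifying the result" looks like in practice: random inputs are cheap, but you need something trusted (an oracle) to compare against. Here the oracle is Python's built-in `sorted`, and `insertion_sort` is just a hypothetical stand-in for the code under test. The inputs are kept deliberately small, in the spirit of the small-test-case advice above.

```python
import random

def insertion_sort(items):
    """Stand-in for the code under test (hypothetical)."""
    result = []
    for x in items:
        i = len(result)
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result

def random_test(sut, oracle, runs=200, seed=0):
    """Feed small random inputs to sut; return the inputs where it disagrees with oracle."""
    rng = random.Random(seed)  # seeded so any failure can be replayed
    failures = []
    for _ in range(runs):
        data = [rng.randint(-5, 5) for _ in range(rng.randint(0, 6))]
        if sut(list(data)) != oracle(list(data)):
            failures.append(data)
    return failures
```

Calling `random_test(insertion_sort, sorted)` returns the list of failing inputs; an empty list means no disagreement was found in that run, which is not the same as proof of correctness.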

Good static checking (including model property checking). If you know the coverage of each test case, you can prioritize the tests such that you run tests in the least amount of time to get the highest coverage. First run the minimal set of tests providing the same coverage as all of the tests, and then run the remaining tests to see how many additional defects are revealed. Models can be used to generate all relevant variations for limited sizes of data structures.[13,14] You can also use a stochastic model that defines the structure of how the target system is stimulated by its environment.[15] This stochastic testing takes a different approach to sampling than partition testing and simple random testing. Code coverage should be used to make testing more efficient in selecting and prioritizing tests, but not necessarily in judging the tests. Test groups must require and product developers must embrace thorough unit testing and preferably tests before code (test-driven development).
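The coverage-based prioritization described above can be sketched roughly like this, assuming you already have per-test coverage data and run times. The function name `prioritize` and the input shape are invented for the example. Note that this is a greedy approximation: finding a truly minimal covering set is a set-cover problem, so the "core" list here is small but not guaranteed to be the smallest possible.

```python
def prioritize(coverage):
    """Split tests into a core set reaching full coverage and the rest.

    coverage: {test_name: (seconds, set_of_covered_items)}, with seconds > 0
    (hypothetical input shape). Returns (core, rest).
    """
    # Everything that any test covers at all.
    remaining = set().union(*(items for _, items in coverage.values()))
    core, pending = [], dict(coverage)
    while remaining:
        # Pick the test adding the most new coverage per second of run time.
        name = max(pending, key=lambda t: len(pending[t][1] & remaining) / pending[t][0])
        core.append(name)
        remaining -= pending.pop(name)[1]
    # Tests left over add no new coverage; run them afterwards to see what else they reveal.
    return core, sorted(pending)
```

Run the `core` list first, then the `rest` list, and compare how many additional defects the second batch turns up.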

--
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
Re:Code can't be too big, just badly designed by Welsh+Dwarf · 2005-03-08 00:34 · Score: 2, Informative

And for those who didn't RTFA to the end:

The author is suggesting pseudo-random testing rather than exhaustive testing for a large code base, which may be a valid point when you inherit a large piece of monolithic code, but it should never be used for a fresh project, where complete, staged testing is the only way to avoid a complete kludge.

David

--
Ask 8 slackers a question, get 10 answers (a quote, but I can't remember from whom)
Re:Shouldn't that be too bloated to test? by Anonymous Coward · 2005-03-08 00:38 · Score: 1, Informative

Yes, you're dead right. The biggest problem is that customers have come to accept buggy, late or incomplete software deliveries as the status quo. In turn, development houses feel that it is acceptable to deliver buggy, late or incomplete software to the customer.

Customers are unrealistic; we all know that. The problem is that most customers surpassed the normal unreality a long time ago and now expect nothing short of miracles, which most development houses are only too happy to try and deliver. A customer attitude readjustment would help; if customers were realistic about their requirements and the timescales needed to deliver them, and likewise if development houses stopped lying to customers about what they can deliver in a given time period, maybe we'd see better software as a result.

Well, it's a nice dream.
Re:I get it by MrMickS · 2005-03-08 01:10 · Score: 2, Informative

Apple released a public beta of OS X to take advantage of the nerd factor. This did cost, but only enough to cover shipping costs. One key thing was that they provided an easy mechanism to provide feedback on bugs encountered. That there were bugs sort of proves the point of the article, that the OS as a whole was too big to be tested, at least in economic terms.

--
You may think me a tired, old, cynic. I'd have to disagree about the tired bit.
Re:Got an idea by Skye16 · 2005-03-08 01:22 · Score: 2, Informative

Wait wait wait, I have one!

Flamebait: Linux sucks!
Flamebait: Apple sucks!
Flamebait: Windows is the best!
Flamebait: The United States is the best!
Flamebait: The United States sucks!

THAT is flamebait.

At worst, waaaaaay yonder (what, great grandparent?), he was trolling, but I thought he was just being facetious.
Re:Shouldn't that be too bloated to test? by gosand · 2005-03-08 02:56 · Score: 4, Informative

But this is exactly what happens in big software houses. The pressure to release ahead of your competition and stay ahead (or catch up with) the perceived feature curve is huge. Delays are bad -- delays equal lost sales. And once the product is done, unlike a bridge or a plane or a shuttle which will last 20 - 30 years or more as is, that software immediately starts getting new features and major modifications for "the next version".
This is not always the case. I just left a very large company for a smaller one, and I have been doing software testing for 11 years. I have worked for two very large companies in my career, and two small ones. In the large ones, I learned most of what good testing was about. I also learned most of what I know about the development process, and how it should be done. Unfortunately, at both of those companies, they talked a good game but didn't deliver very well.
When it comes to software projects, you have 4 factors:
Schedule
Cost
Quality
Features
The rule is, you get to optimize one of these, are constrained by one, and you have to accept the other two. Everyone always thinks that they can get around this somehow, but it never works out. Oh, and you have to make these choices when you start the project - if you change them mid-stream it changes the game.
NASA was used as an example. They are constrained by features and want to optimize quality. Therefore, it costs what it costs and you get it when you get it. Most big software houses are constrained by schedule and want to optimize features. That means they throw money at it and take whatever quality they get. Until they bitch about the quality. If only they really understood this. I presented this to my manager, and he said "But cost is free, because everyone is salaried and can just work overtime." He was serious. Do you wonder why I left?
We always thought we were constrained by schedule because every single release, some manager would say "This is the release date, and it is not moving!" It would move EVERY SINGLE RELEASE. For 4 years, we never hit a release date. Of course, we thought we did because we kept moving it during the cycle. Once, we delivered the release 1 year late - but it was on time according to our re-evaluation. Phbbbt. We did software for hospitals, and it wasn't that big of a deal if we missed our release date. These were huge inventory systems, and it took months for them to deploy. They had to be signed off by Beta sites before it could even be made available to everyone, and even then nobody just bought it off the shelf. We had to go in, install it in their test environments, train them on it, and set up transition dates. And we had to schedule it all within their budget constraints. So time to market wasn't nearly as big of an issue as it is in small companies, where if you don't deliver in a week or two, you can really hurt the company.
I guess my point to all of this is that there are good QA and testing practices, but they might not apply to all situations. The key is knowing when to apply what. If I tried to apply Quality Assurance to where I am now, it would be a total waste of effort. The same goes for testing methodology. (They are NOT even remotely the same thing, you know.) Our build schedules at the big company were every 2 weeks. Where I am now, we do at least 4 releases of software in that time. But it is hosted software, so it is a totally different animal. I value my time at large companies; I learned how things work and don't work in the QA and software testing arenas. The good part is, there is still more out there to learn.

--

My beliefs do not require that you agree with them.
Re:Shouldn't that be too bloated to test? by EddieBurris · 2005-03-08 03:50 · Score: 2, Informative

Besides which, once the flight control system version x.y is finished, the development team doesn't then immediately start working on flight control system version x.y+1 (or worse, version x+1.0). It isn't as if NASA finishes a shuttle, and then immediately starts building a new, improved shuttle.

Every flight requires a new version of the primary flight control software and, because of the long lead time to prepare a version, they often have 2 or more in the works at the same time. At one time in 1983 there were 5 versions being worked on simultaneously.

Reliability in the flight control software for the space shuttle comes at a price. Their cost per line of code is $350*. That buys more quality than most commercial vendors can afford.

Eddie Burris

*http://www.stsc.hill.af.mil/crosstalk/1998/11/krasner.asp/