Programming with GNU Software


Maintain the Test Suite

A test suite is a project in its own right.

... and therefore demands to be maintained. If you don't maintain it, it will become useless or unmaintainable, just like any other kind of program. Spend some time to:

generalize specific tests;
improve the maintainability of the test suite.

Try to generalize your specific tests when you implement or improve tests. For instance, if you are testing a feature which has a fixed set of possible values, test them all. If you exercise the interaction between two such features, do not hesitate to test the Cartesian product of their values, i.e., the set of all the valid and invalid pairs.
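As a minimal sketch of this idea, assuming two hypothetical features that each take the values true and false, the Cartesian product can be enumerated by the test code rather than written out by hand. The names FEATURE_A, FEATURE_B, check_interaction and run_all are illustrative only and do not come from any real test framework; the property being checked is a toy stand-in for a real test.

```python
import itertools

# Hypothetical feature values: stand-ins for whatever the tested
# program accepts (valid and invalid values alike).
FEATURE_A = [True, False]
FEATURE_B = [True, False]


def check_interaction(a, b):
    """Placeholder for a real test of two interacting features.

    The toy property checked here is De Morgan's law, which holds
    for every pair of booleans.
    """
    return (a and b) == (not (not a or not b))


def run_all():
    """Run the check on every pair; return all pairs and the failing ones."""
    pairs = list(itertools.product(FEATURE_A, FEATURE_B))
    failures = [(a, b) for a, b in pairs if not check_interaction(a, b)]
    return pairs, failures


pairs, failures = run_all()
print(len(pairs), "combinations tested,", len(failures), "failures")
```

Adding a value to either list extends the coverage automatically, with no new test to write by hand.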

(FIXME: I should first ask Tom if he agrees with the following paragraph.)

The Automake test suite is a good example of what should not happen. Automake supports some form of conditionals, which is a typical feature with a small set of possible values: true and false. Conditionals can interact with each other, since they can influence the same set of variables and/or targets. Because conditionals turned out to be much more delicate to implement than one might first think, the implementation was often changed. Virtually all the modifications were bug fixes, but they often introduced new bugs. Gradually the test suite covered more and more cases of conditional uses, and today it covers almost the full range of possible values, the very Cartesian product mentioned above. But this coverage is performed via several handwritten tests, which are modified copies of previous tests: merely checking that the coverage is complete is a delicate task because of the lack of homogeneity across these tests.

If the first test author had devoted some more time to his test, not only would the improvement of conditionals have been sped up, but the testing framework would also have been improved, because it would have been developed with generalization in mind. This parallels novice programmers who prefer to copy, paste, and modify a routine n times for n slightly different tasks rather than generalize existing routines to cover these n cases. Which brings us to our next point...

Expertise is gained during the lifetime of the test suite, and rethinking the suite is often beneficial. Just as in a regular project, common patterns arise, and factoring can be done. Your test framework should support some form of programming so that this very factorization is possible.

Conventional Bourne shell based tests are again an excellent example of what should not be done. Automake, again, suffered from this: because there is no function support in Bourne shell, there is a lot of code duplication, which sometimes means repeating the same modification in many different files (there are more than 300 test files). Several times I personally had to change more than a hundred tests to cope with Automake performing better sanity checks: these tests, which bypassed the official interface, were no longer "correct" Automake users (see Look for Realism: the test suite must be viewed as a user).

It is common for these factorizations, these new test functions or macros, to reveal holes in the testing. Reading seven invocations of a general routine that tests a feature makes it easy to notice that an eighth case is missing. Seven test cases written differently, at different places in the test suite, make it nearly impossible for the maintainer to check that the coverage is complete.
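To illustrate, here is a small sketch that assumes a feature controlled by three hypothetical true/false conditionals. Once every test is an invocation of the same general routine, the covered cases form a uniform list, and the missing one can even be computed. The names check_conditionals, CASES and missing_cases are invented for this example, and the routine itself is a placeholder that always succeeds.

```python
import itertools


def check_conditionals(c1, c2, c3):
    """Hypothetical general routine.

    A real one would run the program under test with the three
    conditionals set as given; this placeholder always passes.
    """
    return True


# Seven invocations of the same routine, written uniformly.
CASES = [
    (True, True, True),
    (True, True, False),
    (True, False, True),
    (True, False, False),
    (False, True, True),
    (False, True, False),
    (False, False, True),
]


def missing_cases(cases):
    """Return the combinations of three booleans absent from cases."""
    every = itertools.product([True, False], repeat=3)
    return [combo for combo in every if combo not in cases]


for case in CASES:
    assert check_conditionals(*case)

print(missing_cases(CASES))  # prints [(False, False, False)]
```

The same check is not practical when the seven cases live in seven differently written files: nothing can list them, let alone compare them with the full set.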