Concepts: Test-Ideas Catalog

Topics

Introduction
How Catalogs Find Faults
Good Catalogs
An Example of Using a Catalog
Creating and Maintaining Your Own Catalogs

Introduction

Much of programming involves taking things you've used over and over before and using them yet again in a different context. Those things can be data structures (such as linked lists, hash tables, or relational databases) or operations (such as searching, sorting, creating temporary files, or popping up a browser window).

The interesting thing about these clichés is that they have clichéd faults. People do not invent imaginative new ways of incorrectly inserting something into a doubly-linked list; they tend to make the same mistakes that they and others have made before. A programmer who pops up a browser window might make one of these clichéd mistakes:

creating a new window when one that's already open should be reused
failing to make visible a browser window that's obscured or minimized
using Internet Explorer when the user has chosen a different default browser
failing to check whether JavaScript is enabled

Since faults are clichéd, test ideas that can find them are too. They can be put in a catalog.

How Catalogs Find Faults

One of the virtues of a catalog is that a single test idea can be useful for finding more than one underlying fault. Here's an example of one idea finding two faults.

The first fault was in a C compiler. This compiler took command-line options like "-table" or "-trace" or "-nolink". The options could be abbreviated to their smallest unique form. For example, "-ta" was as good as "-table". However, "-t" was not allowed, because it was ambiguous: it could mean either "-table" or "-trace".

Internally, the command-line options were stored in a table like this:

-table

-trace

-nolink

When an option was encountered on the command line, it was looked up in the table. It matched if it was the prefix of any table entry. (That is, "-t" matched "-table".) After one match was found, the rest of the table was searched for another match. Another match would be an error because it would indicate ambiguity.

The code that did the searching looked like this:

for (first=0; first < size; first++) {
    if (matches(entry[first], thing_sought)) {
        /* at least one match */
        for(dup=first+1; dup < size; dup++)
            /* search for another */
            if (matches(entry[dup], thing_sought))
                /* extra match */
                break;  /* error out */
        return first;
    }
}
return -1;	/* Not found or ambiguity */

Do you see the problem? (It's fairly subtle.)

The problem is the break statement. It's intended to break out of the outermost enclosing loop when a duplicate match is found, but it really breaks out of the inner one. That has the same effect as not finding a second match: the index of the first match is returned.

Notice that this fault can only be found if the option being sought matches two entries in the table, as "-t" would.
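
For comparison, here is a minimal sketch of one way the search could be corrected. It is not the compiler's actual fix: the function name lookup_option and the prefix-matching helper matches are assumptions made for illustration. The ambiguous case returns from the function instead of breaking out of the inner loop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `sought` is a prefix of `entry`. */
static int matches(const char *entry, const char *sought)
{
    return strncmp(entry, sought, strlen(sought)) == 0;
}

/* Returns the index of the single entry that `sought` abbreviates,
   or -1 if nothing matches or the abbreviation is ambiguous. */
int lookup_option(const char *const entry[], size_t size, const char *sought)
{
    for (size_t first = 0; first < size; first++) {
        if (matches(entry[first], sought)) {
            /* at least one match; search for another */
            for (size_t dup = first + 1; dup < size; dup++) {
                if (matches(entry[dup], sought))
                    return -1;  /* ambiguous: leave the function, not just the inner loop */
            }
            return (int)first;
        }
    }
    return -1;  /* not found */
}
```

With the table above, a test that looks up "-t" is the one that distinguishes this version from the faulty one; lookups of "-ta" or "-n" behave the same in both.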

Now let's look at a second, completely different fault.

The code takes a string. It is to replace the last '=' in the string with a '+'. If there is no '=', nothing is to be done. The code uses the standard C library routine strchr to find the location of '='. Here's the code:

ptr = strchr(string, '=');	/* Find last = */
if (ptr != NULL)
    *ptr = '+';

This problem is also a bit subtle.

The function strchr returns the first match in the string, not the last. The correct function is strrchr. The problem was most likely a typographical error. (Actually, the deep underlying problem is that it's desperately unwise to put two functions that differ only by a typo into a standard library.)

This fault can only be found when there are two or more equal signs in the input. That is:

"a=b" would return the correct result, "a+b".
"noequals" would return the correct result, "noequals".
"a=b=c" would incorrectly return "a+b=c", not the correct "a=b+c".
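
A minimal sketch of the corrected code, wrapped in a function so the three inputs above can be tried against it. The function name replace_last_equals is an assumption; the only substantive change is calling strrchr instead of strchr.

```c
#include <string.h>

/* Replaces the last '=' in `string` with '+'.
   The string is left unchanged if it contains no '='. */
void replace_last_equals(char *string)
{
    char *ptr = strrchr(string, '=');  /* strrchr finds the LAST occurrence */
    if (ptr != NULL)
        *ptr = '+';
}
```

The string must be writable (an array, not a string literal), since it is modified in place.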

What's interesting and useful here is that we have two faults with completely different root causes (typographical error, misunderstanding of a C construct) and different manifestations in the code (wrong function called, misuse of break statement) that can be found by the same test idea (search for something that occurs twice).

Good Catalogs

What makes a good catalog?

It contains a small set of test ideas that can find a much larger set of underlying faults.
It is easily skimmable. You should be able to skip test ideas that are not relevant to your situation.
It should not contain test ideas that you won't ever use. For example, someone who doesn't ever deal with web browsers shouldn't have to keep skipping over test ideas for programs that use web browsers. Someone working on game software will want a shorter catalog than someone working on safety-critical software. The game person can afford to concentrate only on the test ideas with the highest chance of finding faults.

Given these rules, it seems best to have more than one catalog. Some data and operations are common to all programming, so their test ideas can be put into a catalog all programmers can use. Others are specific to a particular domain, so test ideas for them can be put into a catalog of domain-specific test ideas.

The simple sample catalog is a good one to begin with. Another example is the catalog Test Ideas for Mixtures of ANDs and ORs.

An Example of Using a Catalog

Suppose you're implementing this method:

void applyToCommonFiles(Directory d1,
                        Directory d2,
                        Operation op);

applyToCommonFiles takes two directories as arguments. When a file in the first directory has the same name as a file in the second, applyToCommonFiles performs some operation on that pair of files. It descends sub-directories.

Here's how you might use the sample catalog. The method is to scan the catalog, looking for major headings that match, then consider the test ideas under the heading to see if they seem relevant, then write those that do into a Test-Ideas List.

(Note: this step-by-step description will perhaps make using the catalog seem laborious. It takes longer to read about creating the checklist than it does to actually create it.)

The first entry is for Any Object. Could any of the arguments be null pointers? This is a matter of the contract between applyToCommonFiles and its callers. The contract could be that the callers will not pass in a null pointer. If they do, all bets are off: applyToCommonFiles can perform any action. In such a case, no test is appropriate, since nothing applyToCommonFiles can do can be wrong. If, however, applyToCommonFiles is required to check for null pointers, the test idea would be useful. Let's assume the latter, which gives us this starting Test-Ideas List:

d1 is null (error case)
d2 is null (error case)
op is null (error case)

The next catalog entry is Strings. The names of the files are strings, and they're compared to see if they match. The idea of testing with the empty string ("") doesn't seem useful: presumably some standard string comparison routines will be used, and they will handle empty strings correctly.

But wait... If there are strings being compared, what about case? Suppose d1 contains a file named "File" and d2 contains a file named "file". Should those files match? On Unix, clearly not. On Windows, they almost certainly should. That's another test idea:

Files match in the two directories, but the case of the names is different

Notice that this test idea didn't come directly from the catalog. But the catalog drew our attention to a particular aspect of the program (file names as strings), and our creativity gave us an additional idea. It's important not to use the catalog too narrowly; use it as a brainstorming technique, a way of inspiring new ideas.
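
To make the case question concrete, here is a small sketch of a name comparison with the case rule made explicit. The helper name names_match and its ignore_case flag are assumptions, not part of applyToCommonFiles as described, and the comparison handles only ASCII case folding.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: compares two file names, optionally
   ignoring ASCII case. Returns true if the names match. */
bool names_match(const char *a, const char *b, bool ignore_case)
{
    while (*a != '\0' && *b != '\0') {
        unsigned char ca = (unsigned char)*a;
        unsigned char cb = (unsigned char)*b;
        if (ignore_case) {
            ca = (unsigned char)tolower(ca);
            cb = (unsigned char)tolower(cb);
        }
        if (ca != cb)
            return false;
        a++;
        b++;
    }
    return *a == *b;  /* both names must end at the same point */
}
```

A test built from the idea above would compare "File" with "file" and check that the result agrees with the platform's rule, whichever rule the contract specifies.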

The next entry is Collections. A directory is a collection of files. Many programs that handle collections fail on the empty collection. A few that handle the empty collection, or collections with many elements, fail on collections with exactly one element. So these ideas are useful:

d1 is empty
d2 is empty
d1 has exactly one file
d2 has exactly one file

The next idea is to use a collection of the maximum possible size. That's useful because programs like applyToCommonFiles are often tested with trivial little directories. Then some user comes along and applies them to two huge directory trees with thousands of files in them, only to discover that the program is grotesquely memory inefficient and can't handle that realistic case.

Now, testing the absolute maximum size for a directory is not important; it need only be as large as a user might try. But, at the very least, there should be some test with more than three files in a directory:

d1 contains very many files
d2 contains very many files

The final test idea (duplicate elements) doesn't apply to directories of files. That is, if you have a directory with two files with the same name in it, you have a problem independent of applyToCommonFiles: your file system is corrupt.

The next catalog entry is Searching. Those ideas can be translated into applyToCommonFiles terms like this:

d1 and d2 have no files in common (all the names are different).
d1 and d2 have exactly one file in common (it's alphabetically the last element in the directory).
d1 and d2 have more than one file in common.

The final test idea checks whether applyToCommonFiles terminates too soon: does it return as soon as it finds the first match? The parenthetical remark in the test idea before that assumes that the program will fetch the list of files in a directory using some library routine that returns them sorted alphabetically. If not, it might be better to find out what the last one really is (the most recently created?) and make that the match. Before devoting a lot of time to finding out how files are ordered, though, you should ask yourself how likely it is that putting the matching element last will find defects. Putting an element last in a collection is more useful if the code explicitly steps through the collection using an index. If it's using an iterator, it's extremely unlikely that the order matters.
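
The Searching ideas can be tried against a simplified stand-in. The sketch below assumes each directory is just a flat array of file names and counts the common names rather than applying an operation; the name count_common is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Counts the names that appear in both d1 and d2.
   Flat directories only; sub-directories are not modeled here. */
size_t count_common(const char *const d1[], size_t n1,
                    const char *const d2[], size_t n2)
{
    size_t common = 0;
    for (size_t i = 0; i < n1; i++) {
        for (size_t j = 0; j < n2; j++) {
            if (strcmp(d1[i], d2[j]) == 0) {
                common++;
                break;  /* names are unique within a directory */
            }
        }
    }
    return common;
}
```

The three test ideas then become three calls: one where the directories share no names, one where the only shared name is the last element, and one where several names are shared. A version that returned after the first match would pass the first two and fail the third.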

Let's look at one more catalog entry. The Linked structures entry reminds us that we're comparing directory trees, not just flat collections of files. It would be sad if applyToCommonFiles worked only in the top-level directories, not in the lower-level ones. But deciding how to test whether applyToCommonFiles works in lower-level directories forces us to confront incompleteness in its description.

First, when does applyToCommonFiles descend into sub-directories? If the directory structure looks like this:

Fig1: A directory structure

does applyToCommonFiles descend into Cdir? That doesn't seem to make sense. There can be no match with anything in the other directory tree. In fact, it seems as if files in sub-directories can only match if the sub-directory names match. That is, suppose we have this directory structure:

Fig2: A second directory structure

The files named "File" don't match because they're in different sub-directories. The sub-directories should be descended only if they have the same name in both d1 and d2. That leads to these test ideas:

Some sub-directory in d1 is not found in d2 (no descent).
Some sub-directory in d2 is not found in d1 (no descent).
Some sub-directory appears in both d1 and d2 (descend).
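
The descent rule can be sketched with an in-memory stand-in for a directory tree. Everything here is an assumption made for illustration: the struct dir layout, the name count_common_tree, and the choice to count only matching files (not matching sub-directories), which the method's description leaves open.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a directory tree. */
struct dir {
    const char *name;
    const char *const *files;          /* file names in this directory */
    size_t nfiles;
    const struct dir *const *subdirs;  /* immediate sub-directories */
    size_t nsubdirs;
};

/* Counts files common to d1 and d2, descending only into
   sub-directories whose names appear in both trees. */
size_t count_common_tree(const struct dir *d1, const struct dir *d2)
{
    size_t common = 0;

    /* Files match only when they sit at the same level. */
    for (size_t i = 0; i < d1->nfiles; i++)
        for (size_t j = 0; j < d2->nfiles; j++)
            if (strcmp(d1->files[i], d2->files[j]) == 0) {
                common++;
                break;
            }

    /* Descend only when the sub-directory exists on both sides. */
    for (size_t i = 0; i < d1->nsubdirs; i++)
        for (size_t j = 0; j < d2->nsubdirs; j++)
            if (strcmp(d1->subdirs[i]->name, d2->subdirs[j]->name) == 0) {
                common += count_common_tree(d1->subdirs[i], d2->subdirs[j]);
                break;
            }

    return common;
}
```

Two trees that each hold "File" under differently named sub-directories should yield no matches, while the same file under identically named sub-directories should yield one.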

But that raises other questions. Should the operation (op) be applied to matching sub-directories or just to matching files? If it's applied to the sub-directories, should it be applied before the descent or afterwards? That makes a difference if, for example, the operation deletes the matching file or directory. For that matter, should the operation be allowed to modify the directory structure? More specifically: what's the correct behavior of applyToCommonFiles if it does? (This is the same issue as comes up with iterators.)

These sorts of questions are typical of the ones that arise when you do a "close reading" of a method's description in the process of creating test ideas. But let's leave them aside now. Whatever the answers, there will have to be test ideas for them, test ideas that check whether the code implements the answers correctly.

Instead, let's return to the catalog. We still haven't considered all its test ideas. The first one—"empty (nothing in structure)"—asks for an empty directory. We already got that from the Collections entry. We also got the "minimal non-empty structure", which is a directory with a single element. This sort of redundancy is not uncommon, but it's easy to ignore.

What about "a circular structure"? Directory structures can't be circular—a directory can't be within one of its descendants or within itself... or can it? What about shortcuts (on Windows) or symbolic links (on UNIX)? If there's a shortcut in d1's directory tree that points back to d1, should applyToCommonFiles keep descending forever? The answer may lead to one or more new test ideas:

d1 is circular because of shortcuts or symbolic links
d2 is circular because of shortcuts or symbolic links

(Depending on the correct behavior, there may be more test ideas than that.)

Finally, what about "depth greater than one"? Earlier test ideas will ensure that we test descending into one level of sub-directory, but we should check that applyToCommonFiles keeps descending:

descends through several levels (>1) of d1's sub-directories
descends through several levels (>1) of d2's sub-directories

Creating and Maintaining Your Own Catalogs

As was mentioned earlier, the generic catalog won't contain all the test ideas you need. But domain-specific catalogs haven't been published outside the companies that have created them. If you want them, you'll need to build them. Here's some advice.

Avoid the temptation to fill a catalog with your speculations about what ideas would be good at finding faults. Remember that each test idea you put in the catalog costs time and money: your time to maintain the catalog, other programmers' time to think about the test idea, possibly other programmers' time to implement a test. Add only ideas that have a demonstrated track record. You should be able to point to at least one actual fault that the test idea would have caught. Ideally, the fault should be one that was missed by other testing (that is, one that was reported from the field). One good way to build catalogs is to browse through your company's bug database and ask how each fault could have been detected earlier.

If creating and maintaining a Test-Ideas Catalog is something you do in your spare time, it's unlikely to work. You'll need time specifically allocated to this task, just like for any other important one. We recommend you create and maintain your Test-Ideas Catalog during Workflow Detail: Improve Test Assets.

Rational Unified Process