13.2 STOCHASTIC CLAIMS: FAULT DENSITY
It appears from the previous section that testing does not yield much in terms of logical claims: Concrete testing yields very weak logical claims (in terms of partial correctness with respect to very partial specifications), while symbolic testing may yield stronger claims of correctness with respect to arbitrary specifications, provided we have extracted the function of candidate programs in all its minute detail (a tedious, complex, error-prone, and potentially wasteful task). In this and the following sections, we consider stochastic claims, which focus on likely properties rather than logically provable properties.
The first stochastic property we consider is fault density. A technique known as fault seeding consists in injecting faults into the source code of the candidate program and then submitting the program to a test data set T and counting:
• The number of seeded faults that have been uncovered, and
• The number of unseeded faults that have been uncovered.
If we assume that the test data we have used detects the same proportion of seeded faults as unseeded faults, we can use this information to estimate the number of unseeded faults in the program. Specifically, if we let:
• D be the number of faults seeded into the program,
• D′ be the number of seeded faults that were discovered through test data T,
• N′ be the number of native faults that were discovered through test data T, and
• N be the total number of native faults in the program.

Then the assumption that T detects the same proportion of seeded and native faults is captured by the following formula:

D′ / D = N′ / N

Whence we can infer an estimate of the total number of native faults N:

N = (N′ × D) / D′
This formula assumes that test T is as effective at finding seeded faults as it is at finding native faults (see Fig. 13.1), and the estimate is only as good as the assumption.

Figure 13.1 Fault distribution (native vs. seeded).

Hence, for example, if we seed 100 faults (D = 100) and we find that our test detects 70 faults, of which 20 are seeded faults (D′ = 20) and hence 50 are native faults (N′ = 50), we estimate the number of native faults as follows:

N = (N′ × D) / D′ = (50 × 100) / 20 = 250
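To make the computation concrete, here is a minimal Python sketch of this estimator; the function name estimate_native_faults and the guard against division by zero are our own additions, since the text prescribes only the formula.

def estimate_native_faults(seeded_total, seeded_found, native_found):
    """Estimate the total number of native faults N.

    Relies on the fault-seeding assumption that the test detects the
    same proportion of seeded and native faults:
        D'/D = N'/N,  hence  N = N' * D / D'.
    """
    if seeded_found == 0:
        # No seeded fault was detected: the proportion is undefined.
        raise ValueError("test detected no seeded faults; cannot estimate")
    return native_found * seeded_total / seeded_found

# Worked example from the text: D = 100, D' = 20, N' = 50.
print(estimate_native_faults(100, 20, 50))  # prints 250.0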
This approach is based on the assumption that test T is as effective at exposing seeded faults as it is at exposing native faults. If we do not know enough about the type of native faults that the program has, or about the effectiveness of the test data set T at exposing seeded and native faults, then we can use an alternative technique.
The alternative technique, which we call cross testing, consists in generating two test data sets of equivalent size, each aiming to expose as many faults as possible, then analyzing how many faults each one actually exposes, and how many of these faults are common to both. We denote by:

• T1 and T2 the two test data sets,
• F1 and F2 the sets of faults exposed by T1 and T2, respectively (by abuse of notation, we may use F1 and F2 to designate the cardinalities of these sets, in addition to the sets themselves),
• Q the number of faults that are exposed by both T1 and T2, and
• N the total number of faults that we estimate to be in the software product.
Figure 13.2 Estimating native faults.

If we consider the set of faults exposed by T2, we can assume (in the absence of other sources of knowledge) that test data set T1 catches the same proportion of the faults exposed by T2 as it catches of all the faults in the program (Fig. 13.2). Hence we write:
Q / F2 = F1 / N

From which we infer:

N = (F1 × F2) / Q
If test data T1 exposes 70 faults and test data T2 exposes 55 faults, of which 30 are already exposed by T1, then we estimate the number of faults in the program to be:

N = (70 × 55) / 30 ≈ 128
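Here is the analogous Python sketch for the cross-testing estimate; again, the function name estimate_total_faults and the guard against Q = 0 are our own additions.

def estimate_total_faults(f1, f2, common):
    """Estimate the total number of faults N from two test data sets.

    Relies on the cross-testing assumption that T1 catches the same
    proportion of the faults exposed by T2 as it catches of all faults:
        Q/F2 = F1/N,  hence  N = F1 * F2 / Q.
    """
    if common == 0:
        # The two test sets expose no common fault: the ratio is undefined.
        raise ValueError("no common faults; cannot estimate")
    return f1 * f2 / common

# Worked example from the text: F1 = 70, F2 = 55, Q = 30.
print(estimate_total_faults(70, 55, 30))  # prints about 128.3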
So far we have discussed fault density as though faults were independent attributes of the software product that lend themselves to precise enumeration; in reality, faults are strongly dependent on each other, so that removing one fault may well affect the existence, number, location, and nature of the other faults; this issue is addressed in the exercises at the end of the chapter. It is best to view fault density as an approximate measure of product quality rather than a precise census of precisely identified faults. Not only is the number of faults itself vaguely defined (see the discussions in Chapter 6), but their impact on failure rate varies widely in practice (from 1 to 50, according to some studies); hence a program with 50 low-impact faults may be more reliable than a program with one high-impact fault. This leads us to focus on failure probability as the next stochastic claim we study.