14.3 ERROR DETECTABILITY
What makes it possible to detect errors in a program? We argue that redundancy does; more specifically, what makes it possible to detect errors is the redundancy in the way program states are represented. When we declare variables in a program to represent data, we have in mind a relation between the data we want to manipulate and the representation of this data by means of the program variables. We refer to this relation as the representation relation; ideally, we may want a representation relation to have the following properties:
• Totality: each datum has at least one representation.
• Determinacy: each datum has at most one representation.
• Injectivity: different data have different representations.
• Surjectivity: all representations represent valid data.
It is very common for representation relations to fall short of these properties: in fact it is common to have none of them.
• When a representation relation is not total, we observe an incomplete representation: for example, not all integers can be represented in computer arithmetic.
• When a representation is not deterministic, we observe ambivalence: for example, in a sign-magnitude representation of integers, zero has two distinct representations, −0 and +0.
• When a representation is not injective, we observe a loss of precision: for example, real numbers in the neighborhood of a representable floating point value are all mapped to that value.
• When a representation is not surjective, we observe redundancy: for example, in a parity-bit representation scheme of bytes, not all 8-bit patterns represent legitimate bytes.
More generally, redundancy in the representation of data in a program stems from the non-surjectivity of the representation relation of program data, which maps a small data set to a vast state space of the program. If the representation relation were surjective, then all representations would be legitimate; hence if by mistake one representation were altered to produce another representation, we would have no way to detect the error; by contrast, if the representation relation were not surjective, and one representation were altered to produce another representation that is outside the range of the representation relation, then we would know for sure that we have an error. Hence the essence of state redundancy is the non-surjectivity of the representation relation. Whence the definition:
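The parity-bit scheme mentioned above gives a small, concrete instance of this argument. The following is a minimal sketch, assuming a code word made of 7 data bits plus one even-parity bit; the function names `encodeEvenParity`, `isLegitimate`, and `countLegitimatePatterns` are ours and purely illustrative.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Assumption: a code word is 7 data bits (bits 0..6) plus an even-parity
// bit (bit 7). Only some of the 256 possible 8-bit patterns are therefore
// in the range of the representation relation.

// Encode a 7-bit datum so that the total number of 1 bits is even.
std::uint8_t encodeEvenParity(std::uint8_t datum) {
    assert(datum < 128);  // precondition: the datum fits in 7 bits
    std::uint8_t data = datum & 0x7F;
    bool oddOnes = std::bitset<8>(data).count() % 2 == 1;
    return oddOnes ? static_cast<std::uint8_t>(data | 0x80) : data;
}

// A pattern is legitimate exactly when it has an even number of 1 bits.
bool isLegitimate(std::uint8_t word) {
    return std::bitset<8>(word).count() % 2 == 0;
}

// Count the legitimate patterns among all 256 possible 8-bit words.
int countLegitimatePatterns() {
    int count = 0;
    for (int w = 0; w < 256; ++w) {
        if (isLegitimate(static_cast<std::uint8_t>(w))) ++count;
    }
    return count;
}
```

Under these assumptions, flipping any single bit of an encoded word produces a pattern outside the range of the representation, so `isLegitimate` returns false and the error is detected. Flipping two bits produces another legitimate pattern, which this one bit of redundancy cannot detect.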
Definition: State Redundancy
Let g be a program on space S, and let σ be the random variable that represents actual values that the program state may take at a particular stage in its execution. The state redundancy of program g at a stage in its computation is defined as the difference between the entropy of its state space S and the entropy of σ at that stage; the state redundancy of program g is defined as the interval formed by its state redundancy at its initial state and its state redundancy at its final state.
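The definition can be sketched in code as follows. This is only an illustration under stated assumptions: the entropy of the state space S is taken to be the number of declared bits (all bit patterns equally likely), σ is supplied as an explicit list of probabilities, and the names `entropyBits` and `stateRedundancyBits` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Shannon entropy, in bits, of a discrete distribution given as a list of
// probabilities (assumed to sum to 1).
double entropyBits(const std::vector<double>& probabilities) {
    double h = 0.0;
    for (double p : probabilities) {
        assert(p >= 0.0 && p <= 1.0);
        if (p > 0.0) h -= p * std::log2(p);  // terms with p == 0 add nothing
    }
    return h;
}

// State redundancy at one stage of execution: entropy of the state space
// (assumed here to equal the number of declared bits) minus the entropy of
// sigma, the actual state at that stage.
double stateRedundancyBits(int declaredBits, const std::vector<double>& sigma) {
    assert(declaredBits >= 0);
    return declaredBits - entropyBits(sigma);
}
```

For instance, an 8-bit state that only ever takes four equally likely values would be described by `stateRedundancyBits(8, std::vector<double>(4, 0.25))`.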
As an illustration, let us consider three program variables that we use to represent: the year of birth of a person, the age of the person, and the current year. The variable declarations of the program would look like:
int yob, age, thisyear;
If we assume that integer variables are coded in 32 bits, then the entropy of the program state is 3 × 32 bits = 96 bits. As for the entropy of the actual set of values that we want to represent, we assume that ages range between 0 and 150, years of birth range between 1990 and 2090 (101 different values), and the current year ranges between 2014 and 2140 (127 different values). Because we have the equation yob + age = thisyear, age can be inferred from the other two variables, and we treat the condition that age is between 0 and 150 as redundant. Hence the entropy of the set of actual values, taking logarithms in base 2 and all values as equally likely, is log(101) + log(127) = 13.65 bits. Hence the redundancy (excess bits) is 96 − 13.65 = 82.35 bits.
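The arithmetic of this example can be recomputed with a few lines of code. This is a sketch under the same assumptions as the text (32-bit integers, base-2 logarithms, yob and thisyear varying independently with all values equally likely); the helper names are ours.

```cpp
#include <cassert>
#include <cmath>

// Entropy, in bits, of a variable that takes `count` equally likely values.
double uniformEntropyBits(int count) {
    assert(count > 0);
    return std::log2(static_cast<double>(count));
}

// Number of integers in the closed range [lo, hi].
int rangeSize(int lo, int hi) {
    assert(lo <= hi);
    return hi - lo + 1;
}

// Entropy of the actual values: yob and thisyear are assumed to vary
// independently; age is determined by them and adds no entropy.
double actualEntropyBits() {
    return uniformEntropyBits(rangeSize(1990, 2090))    // yob
         + uniformEntropyBits(rangeSize(2014, 2140));   // thisyear
}

// Redundancy: three 32-bit variables minus the entropy of the actual values.
double redundancyBits() {
    const int declaredBits = 3 * 32;
    return declaredBits - actualEntropyBits();
}
```

Changing the assumed ranges or the word size only requires editing the constants in `actualEntropyBits` and `redundancyBits`.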
The redundancy of a state reflects the strength of an assertion that we can check about the state. For example, a redundancy of 32 bits means that we can check an assertion in the form of an equality between two 32-bit integer expressions.
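For the three variables above, such an assertion is the equality yob + age = thisyear. The sketch below shows how it could be checked; the `PersonState` struct and the functions `isConsistent` and `flipAgeBit` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical grouping of the three program variables from the example.
struct PersonState {
    int yob;
    int age;
    int thisyear;
};

// The redundant variable lets us check an equality between two integer
// expressions: yob + age on one side, thisyear on the other.
bool isConsistent(const PersonState& s) {
    return s.yob + s.age == s.thisyear;
}

// Simulate a fault: flip one bit (index 0..30) of the age field.
PersonState flipAgeBit(PersonState s, int bit) {
    assert(bit >= 0 && bit < 31);
    s.age ^= (1 << bit);
    return s;
}
```

A fault that changes only one of the three variables always violates the equality and is therefore detected; faults that change two variables in a compensating way are not.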
Now that we know how to compute the redundancy of a state, we use it to define the redundancy of a program. To this effect, we observe that while the set of program variables remains unchanged through the execution of the program, the range of values that program states may take shrinks as the program establishes new relations between program variables; for example, a sorting routine starts with an array whose cells are in arbitrary order and rearranges them in increasing order. Given that the entropy of a random variable cannot increase when we apply a function to it, we can infer that the entropy of the final state of a program is no greater than the entropy of the initial state (prior to applying the function of the program), and that the redundancy of the final state is no smaller than the redundancy of the initial state (assuming the set of program variables remains unchanged, that is, no variables have been declared or returned through the execution of the program). We can define the redundancy of a program by:
• The state redundancy of its initial state, or
• The (larger) state redundancy of its final state, or
• The pair of values representing the initial and final state redundancies.
As an illustration, we consider the following program that reads two integers between 1 and 1024 and computes their greatest common divisor.
{int x, y;
 cin >> x >> y;   // initial state
 while (x != y) {if (x > y) {x = x - y;} else {y = y - x;}}
 // final state
}
The state space is defined by the declared variables, two integers that we assume to be 32 bits wide; hence the entropy of the declared state is 2 × 32 bits = 64 bits. Because the variables range from 1 to 1024, the entropy of the set of values that these variables take is actually 2 × log(1024) = 20 bits. Hence the redundancy of the initial state is 44 bits. In the final state the two variables are identical, and hence the entropy of the final state is merely log(1024), which is 10 bits. Hence the redundancy of the final state is 54 bits. The state redundancy of this program can be represented by the pair (44 bits, 54 bits).
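The counting behind these figures can be reproduced by brute force. The sketch below runs the subtraction loop on every input pair up to a bound and counts the distinct final states. Like the text, it treats all reachable states as equally likely, which is a simplification, since the values of the greatest common divisor are not uniformly distributed. The function names are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <set>
#include <utility>

// Run the subtraction-based GCD loop and return the final state (x, y).
std::pair<int, int> gcdFinalState(int x, int y) {
    assert(x >= 1 && y >= 1);  // the loop only terminates on positive inputs
    while (x != y) {
        if (x > y) { x = x - y; } else { y = y - x; }
    }
    return {x, y};
}

// Count the distinct final states reached from all inputs in [1, maxValue]^2.
int countFinalStates(int maxValue) {
    std::set<std::pair<int, int>> finals;
    for (int x = 1; x <= maxValue; ++x)
        for (int y = 1; y <= maxValue; ++y)
            finals.insert(gcdFinalState(x, y));
    return static_cast<int>(finals.size());
}

// Redundancy in bits, assuming two 32-bit variables and `distinctStates`
// equally likely states.
double redundancyBits(int distinctStates) {
    assert(distinctStates > 0);
    return 2 * 32 - std::log2(static_cast<double>(distinctStates));
}
```

Calling `redundancyBits(1024 * 1024)` gives the initial-state figure and `redundancyBits(countFinalStates(1024))` the final-state figure; a smaller bound such as `countFinalStates(16)` is enough to see that the number of final states equals the bound.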