12.3.1 Random Test Generation
We consider the specification of a state-based software product in the form of axioms and rules, and we consider a candidate implementation in the form of a class (an encapsulated module that maintains an internal state and gives access to a number of externally accessible methods). In order to verify the correctness of the proposed implementation with respect to the specification, we resolve to proceed as follows:
• Verify the Implementation Against the Axioms. Each axiom of the specification can be mapped onto a Hoare formula whose precondition is True and whose postcondition is a statement about the behavior specified by the axiom. We consider the following axiom in the specification of the stack ADT:

stack(init.push(a).top) = a,

and we let g be a candidate implementation that has a method of the same name as each input symbol of the specification. Then, to verify the correctness of the implementation, we generate the following formula:
258 TEST DRIVER DESIGN
{true} g.init(); g.push(a); y=g.top(); {y = a}

where y is a variable of type itemType and a is a constant of the same type. Such formulas typically deal with trivial cases (by definition) and hence involve none of the issues that usually make correctness proofs complicated; in particular, they typically give rise to simple intermediate assertions and do not involve complex invariant assertion generation.
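To make the mapping concrete, the Hoare formula above can be turned directly into an executable check. The sketch below is illustrative only: IntStack and checkPushTopAxiom are hypothetical names, and the fixed-size array merely stands in for a real candidate implementation.

```cpp
#include <cassert>

// Hypothetical candidate implementation; any class exposing
// init/push/top with these signatures could be substituted.
class IntStack {
    int buf[100];
    int n;
public:
    void init()      { n = 0; }
    void push(int a) { buf[n++] = a; }
    int  top() const { return buf[n - 1]; }
};

// Executable form of {true} g.init(); g.push(a); y=g.top(); {y = a}
bool checkPushTopAxiom(int a) {
    IntStack g;
    g.init();
    g.push(a);
    int y = g.top();
    return y == a;
}
```

Because the precondition is True, the check needs no setup beyond init(); a test driver can simply invoke it for a handful of constants a.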
• Test the Implementation Against the Rules. Rules are typically used to build an inductive argument linking the behavior of the specified product on simple input histories to its behavior on more complex input histories. The vast majority of rules fall into two broad classes: a class that provides that two input histories are equivalent, and a class that provides an equation relating the value of a VX operation at the end of a complex input history to the value of the same VX operation at the end of a simpler input history. Representative examples of these categories of rules for the stack specification include the following:

stack(init.h.push(a).pop.h+) = stack(init.h.h+),
stack(init.h.push(a).size) = 1 + stack(init.h.size).

We resolve to test candidate implementations against specification rules by generating random sequences for h and h+ (the latter being necessarily nonempty) and checking these equalities for each random instance. For rules of the first category, which require that we check the equivalence of two states, we resolve to consider (as an approximation) that two states are equivalent if and only if they deliver identical values for all the VX operations.
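The state-equivalence approximation can be phrased as a small helper: two stacks are deemed equivalent if and only if every VX operation returns the same value on both. DemoStack and equivalentStates are hypothetical names used only for this sketch; top() is guarded here so that comparing empty stacks is well defined.

```cpp
#include <cassert>

// Hypothetical stack exposing the three VX operations of the
// specification: size, empty, and top.
class DemoStack {
    int buf[100];
    int n;
public:
    void init()        { n = 0; }
    void push(int a)   { buf[n++] = a; }
    void pop()         { if (n > 0) n--; }
    int  size() const  { return n; }
    bool empty() const { return n == 0; }
    int  top() const   { return n > 0 ? buf[n - 1] : 0; }
};

// Approximate state equivalence: agreement on all VX operations.
bool equivalentStates(const DemoStack& s1, const DemoStack& s2) {
    return s1.size()  == s2.size()
        && s1.empty() == s2.empty()
        && s1.top()   == s2.top();
}
```

Note that agreement on size, empty, and top does not inspect elements below the top of the stack, which is precisely why the text presents this criterion as an approximation rather than true state equality.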
In light of these decisions, we write the outermost structure of our test driver as follows:
#include <iostream>
#include "stack.cpp"
#include "rand.cpp"

using namespace std;

typedef int boolean;
typedef int itemtype;

const int testsize = 10000;
const int hlength = 9;
const int Xsize = 5;
const itemtype paramrange = 7; // drawing parameter to push()

// random number generators
int randnat(int rangemax);
int gt0randnat(int rangemax);
// rule testers
void initrule();
void initpoprule();
void pushpoprule();
void sizerule();
void emptyrulea();
void emptyruleb();
void vopruletop();
void voprulesize();
void vopruleempty();

// history generator
void historygenerator(int hl, int hop[hlength], itemtype hparam[hlength]);

/* State Variables */
stack s;  // test object
int nbf;  // number of failures

int main ()
{
    /* initialization */
    nbf = 0;       // counting the number of failures
    SetSeed(825);  // random number generator

    for (int i = 0; i < testsize; i++)
    {
        switch (i % 9)
        {
            case 0: initrule(); break;
            case 1: initpoprule(); break;
            case 2: pushpoprule(); break;
            case 3: sizerule(); break;
            case 4: emptyrulea(); break;
            case 5: emptyruleb(); break;
            case 6: vopruletop(); break;
            case 7: voprulesize(); break;
            case 8: vopruleempty(); break;
        }
    }
    cout << "failure rate: " << nbf << " out of " << testsize << endl;
    return 0;
}
This loop cycles through the rules, testing them one by one in succession. The constant testsize determines the overall size of the test data; because test data is generated automatically, this constant can be arbitrarily large, affording us an arbitrarily thorough test. The variable nbf represents the number of failing tests; it is incremented by the routines invoked in the switch statement whenever a test fails. For the sake of illustration, we consider the function pushpoprule(), which we detail below:
void pushpoprule()
{
    // stack(init.h.push(a).pop.h+) = stack(init.h.h+)
    int hl, hop[hlength]; itemtype hparam[hlength];              // storing h
    int hplusl, hplusop[hlength]; itemtype hplusparam[hlength];  // storing h+
    int storesize;          // size in LHS
    boolean storeempty;     // empty in LHS
    itemtype storetop;      // top in LHS
    boolean successfultest; // successful test

    // drawing h and h+ at random, storing them in hop, hplusop
    hl = randnat(hlength);
    for (int k = 0; k < hl-1; k++)
    {
        hop[k] = gt0randnat(Xsize);
        if (hop[k] == 1) {hparam[k] = randnat(paramrange);}
    }
    hplusl = gt0randnat(hlength);
    for (int k = 0; k < hplusl-1; k++)
    {
        hplusop[k] = gt0randnat(Xsize);
        if (hplusop[k] == 1) {hplusparam[k] = randnat(paramrange);}
    }

    // left hand side of rule
    s.sinit();
    historygenerator(hl, hop, hparam);
    itemtype a = randnat(paramrange);
    s.push(a);
    s.pop();
    historygenerator(hplusl, hplusop, hplusparam);

    // store resulting state
    storesize = s.size();
    storeempty = s.empty();
    storetop = s.top();

    // right hand side of rule
    s.sinit();
    historygenerator(hl, hop, hparam);
    historygenerator(hplusl, hplusop, hplusparam);

    // compare current state with stored state
    successfultest = (storesize == s.size()) && (storeempty == s.empty())
                  && (storetop == s.top());
    if (!successfultest) {nbf++;}
}
where the function historygenerator transforms an array of integers (representing sequence h or h+) into a sequence of method calls, as shown below:
void historygenerator (int hl, int hop[hlength], itemtype hparam[hlength])
{
    int dumsize; boolean dumempty; itemtype dumtop;
    for (int k = 0; k < hl-1; k++)
    {
        switch (hop[k])
        {
            case 1: {itemtype a = hparam[k]; s.push(a);} break;
            case 2: s.pop(); break;
            case 3: dumsize = s.size(); break;
            case 4: dumempty = s.empty(); break;
            case 5: dumtop = s.top(); break;
        }
    }
}
and the random number generators are defined as follows:
int randnat(int rangemax)
{   // returns a random value between 0 and rangemax
    return (int) ((rangemax+1)*NextRand());
}

int gt0randnat(int rangemax)
{   // returns a random value between 1 and rangemax
    return 1 + randnat(rangemax-1);
}
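The file rand.cpp, which supplies SetSeed and NextRand, is not shown in the text. The sketch below is a plausible stand-in, assuming NextRand() returns a double in [0, 1); the linear congruential constants and the name rngstate are illustrative only, not those of any particular library.

```cpp
#include <cassert>

// Stand-in for the unshown rand.cpp: a simple linear congruential
// generator whose NextRand() returns a double in [0, 1).
static unsigned long rngstate = 1;

void SetSeed(unsigned long s) { rngstate = s; }

double NextRand() {
    rngstate = (rngstate * 1103515245UL + 12345UL) % 2147483648UL;
    return (double) rngstate / 2147483648.0;
}

// returns a random value between 0 and rangemax inclusive
int randnat(int rangemax) {
    return (int) ((rangemax + 1) * NextRand());
}

// returns a random value between 1 and rangemax inclusive
int gt0randnat(int rangemax) {
    return 1 + randnat(rangemax - 1);
}
```

Note the parentheses around (rangemax + 1) * NextRand(): applying the (int) cast to rangemax + 1 alone would leave the product a double instead of truncating it to the desired range.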
We have included comments in the code to explain it. Basically, this function proceeds as follows: first, it generates histories h and h+; then it executes the sequence init.h.push(a).pop.h+ for some arbitrary item a; then it takes a snapshot of the current state by calling all the VX operations and storing the values they return. It then reinitializes the stack and executes the sequence init.h.h+; finally, it verifies that the current state of the stack (as defined by the values returned by the VX operations) is identical to the previously stored state produced by the left-hand-side sequence. If the values are identical, we declare a successful test; if not, we increment nbf.
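The trial performed by pushpoprule() can be condensed into a self-contained function for experimentation. VecStack, replay, and pushPopTrial are hypothetical names introduced for this sketch; a std::vector-backed stack stands in for the candidate implementation, and std::rand replaces the text's generators.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Stand-in candidate implementation, backed by std::vector.
class VecStack {
    std::vector<int> v;
public:
    void sinit()       { v.clear(); }
    void push(int a)   { v.push_back(a); }
    void pop()         { if (!v.empty()) v.pop_back(); }
    int  size() const  { return (int) v.size(); }
    bool empty() const { return v.empty(); }
    int  top() const   { return v.empty() ? 0 : v.back(); }
};

// Replay a history: op 1 = push(param), op 2 = pop,
// ops 3-5 = VX operations, which leave the state unchanged.
void replay(VecStack& s, const std::vector<int>& op,
            const std::vector<int>& param) {
    for (std::size_t k = 0; k < op.size(); k++) {
        if (op[k] == 1)      s.push(param[k]);
        else if (op[k] == 2) s.pop();
    }
}

// One random trial of stack(init.h.push(a).pop.h+) = stack(init.h.h+).
bool pushPopTrial() {
    std::vector<int> hop, hparam, hplusop, hplusparam;
    int hl = std::rand() % 9;          // h may be empty
    int hplusl = 1 + std::rand() % 8;  // h+ must be nonempty
    for (int k = 0; k < hl; k++) {
        hop.push_back(1 + std::rand() % 5);
        hparam.push_back(std::rand() % 8);
    }
    for (int k = 0; k < hplusl; k++) {
        hplusop.push_back(1 + std::rand() % 5);
        hplusparam.push_back(std::rand() % 8);
    }
    VecStack s;
    // left hand side: init.h.push(a).pop.h+
    s.sinit();
    replay(s, hop, hparam);
    s.push(std::rand() % 8);
    s.pop();
    replay(s, hplusop, hplusparam);
    int sz = s.size(); bool em = s.empty(); int tp = s.top();
    // right hand side: init.h.h+
    s.sinit();
    replay(s, hop, hparam);
    replay(s, hplusop, hplusparam);
    return sz == s.size() && em == s.empty() && tp == s.top();
}
```

For a correct stack, every trial should succeed, since pushing a and immediately popping it restores the state reached after h.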
As an example of the second class of rules, those that end with a VX operation, we consider the size rule, for which we write the following code:
void sizerule()
{
    // stack(init.h.push(a).size) = 1+stack(init.h.size)
    int hl, hop[hlength]; itemtype hparam[hlength]; // storing h
    int storesize;          // size in LHS
    boolean successfultest; // successful test

    // drawing h at random, storing it in hop
    hl = randnat(hlength);
    for (int k = 0; k < hl-1; k++)
    {
        hop[k] = gt0randnat(Xsize);
        if (hop[k] == 1) {hparam[k] = randnat(paramrange);}
    }

    // left hand side of rule
    s.sinit();
    historygenerator(hl, hop, hparam);
    itemtype a = randnat(paramrange);
    s.push(a);

    // store resulting state
    storesize = s.size(); // size value on the left hand side

    // right hand side of rule
    s.sinit();
    historygenerator(hl, hop, hparam);

    // compare current state with stored state
    successfultest = (storesize == 1+s.size());
    if (!successfultest) {nbf++;}
}
Once we generate a function for each of the nine rules, we can run the test driver with an arbitrary value of constant testsize (to make the test arbitrarily thorough), an arbitrary value of constant hlength (to make the h sequences arbitrarily long), and an arbitrary value of constant paramrange (to let the items stored on the stack take their values from a wide range).
Execution of the test driver on our stack with the following parameter values
• testsize = 10000
• hlength = 9
• paramrange = 7
yields the following outcome:
failure rate: 0 out of 10000
which means that all 10,000 executions of the stack were consistent with the rules of the stack specification. Of course, when we are dealing with a large and complex module, a more likely outcome is to observe a number of failures. Notice that because test data generation and oracle design are both based on an analysis of the specification, we have written the test driver without looking at the candidate implementation. This means, in particular, that the test driver can be deployed on any implementation of the stack that purports to satisfy the specification we are using; we will uncover this implementation in Section 12.3.3.