13.3.3 Modeling Software Reliability
A software reliability model is a statistical model that represents the reliability of a software product as a function of relevant parameters; each model can be characterized by the assumptions it makes about relevant parameters, their properties, and their impact on the likelihood of product failure. It is customary to classify reliability models according to the following criteria:
• Time Domain. Some models are based on wall clock time whereas others measure time in terms of the number of executions.
• Type. The type of a model is defined by the probability distribution function of the number of failures experienced by the software product as a function of time.
• Category. Models are divided into two categories depending on the number of failures they can experience over an infinite amount of operation time: models for which this number is finite and models for which it is infinite.
  ○ For Finite Failures Category. Functional form of the failure intensity in terms of time.
  ○ For Infinite Failures Category. Functional form of the failure intensity in terms of the number of observed failures.

We consider the reliability of a system under the assumption that the number of failures it can experience is finite, and we let M(t) be the random variable that represents the number of failures experienced by a software product from its first execution (or from the start of its test phase) to time t. We denote by μ(t) the expected value of M(t) at time t, and we assume that μ(t) is a non-decreasing, continuous, and differentiable function of t. We let λ(t) be the derivative of μ(t) with respect to time:
$$\lambda(t) = \frac{d\mu(t)}{dt}$$
292 TEST OUTCOME ANALYSIS
This function represents the rate of increase of function μ(t) with time; we refer to it as the failure intensity (or the failure rate) of the software product. If this failure rate is constant (independent of time), which is a reasonable assumption so long as the software product is not modified (no fault removal) and its operating conditions (usage pattern) are maintained, then we find:
$$\mu(t) = \lambda t + c$$
for some constant c. Given that time t = 0 corresponds to the first execution of the product under observation, no failures are observed at t = 0, so μ(0) = 0 and c = 0; hence we find μ(t) = λt.
A common model of software reliability for constant failure intensity provides the following equation relating the failure intensity λ, the time t, and the probability R(t) of failure-free operation up to time t:
$$R(t) = e^{-\lambda t}$$
The probability F(t) that the system has failed at least once by time t is the complement of R(t), that is,
$$F(t) = 1 - R(t) = 1 - e^{-\lambda t}$$
The probability density function f(t) of probability F(t) is the derivative of F(t) with respect to time, which is:
$$f(t) = \lambda e^{-\lambda t}$$
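The three functions of the exponential model can be sketched in a few lines of Python. This is only an illustration: the function names and the sample value of λ are assumptions made for the example, not part of the model's definition. The last lines check numerically that f(t) behaves as the derivative of F(t).

```python
import math

def reliability(lam: float, t: float) -> float:
    """R(t) = exp(-lam * t): probability of failure-free operation up to time t."""
    return math.exp(-lam * t)

def failure_probability(lam: float, t: float) -> float:
    """F(t) = 1 - R(t): probability of at least one failure by time t.

    -expm1(-x) equals 1 - exp(-x) but keeps precision when x is tiny.
    """
    return -math.expm1(-lam * t)

def failure_density(lam: float, t: float) -> float:
    """f(t) = lam * exp(-lam * t): derivative of F(t) with respect to t."""
    return lam * math.exp(-lam * t)

if __name__ == "__main__":
    lam = 1e-4   # assumed failure intensity (failures per hour), for illustration
    t = 500.0    # assumed operation time in hours
    h = 1e-3     # step of the central difference used to approximate dF/dt

    slope = (failure_probability(lam, t + h) - failure_probability(lam, t - h)) / (2 * h)
    print("R(t) =", reliability(lam, t))
    print("F(t) =", failure_probability(lam, t))
    print("f(t) =", failure_density(lam, t), "numerical dF/dt =", slope)
```

The use of `math.expm1` is a design choice of this sketch: for small values of λt, computing 1 − e^(−λt) directly loses significant digits.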
The probability that a failure occurs between time t₀ and time t₁ is given by the following integral:
$$\int_{t_0}^{t_1} f(t)\,dt = \int_{t_0}^{t_1} \lambda e^{-\lambda t}\,dt = \left[-e^{-\lambda t}\right]_{t_0}^{t_1} = e^{-\lambda t_0} - e^{-\lambda t_1}$$
The probability that a failure occurs before time t is a special case of this formula, for t₀ = 0 and t₁ = t, which yields:
$$F(t) = 1 - e^{-\lambda t},$$
which is what we had found earlier using the definition of R(t). The mean time to failure of the system can be estimated by integrating, for t from 0 to infinity, the product of t and the probability density f(t) of the failure occurring at time t. We write this as:
$$\mathrm{MTTF} = \int_0^{\infty} t \lambda e^{-\lambda t}\,dt$$
13.3 STOCHASTIC CLAIMS: FAILURE PROBABILITY 293
We compute this integral using integration by parts:
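$$\int_0^{\infty} t \lambda e^{-\lambda t}\,dt$$
= {integration by parts}
$$\int_0^{\infty} e^{-\lambda t}\,dt - \left[t e^{-\lambda t}\right]_0^{\infty}$$
= {evaluating the second term, which is zero at both ends}
$$\int_0^{\infty} e^{-\lambda t}\,dt$$
= {simple integral}
$$-\frac{1}{\lambda}\left[e^{-\lambda t}\right]_0^{\infty}$$
= {value at zero}
$$\frac{1}{\lambda}$$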
We highlight this result:
Under the exponential reliability model, the mean time to failure is the inverse of the failure intensity.
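As a sanity check, the integral that defines the mean time to failure can be approximated numerically and compared with 1/λ. The sketch below uses a plain trapezoidal rule; the function name, the truncation of the infinite upper bound at 50/λ, and the number of steps are all assumptions chosen for illustration.

```python
import math

def mttf_numeric(lam: float, upper_multiple: float = 50.0, steps: int = 200_000) -> float:
    """Approximate the integral of t * lam * exp(-lam * t) over [0, upper_multiple / lam].

    The infinite upper bound is truncated (an assumption of this sketch); the
    integrand is negligible beyond a few dozen multiples of 1 / lam.
    """
    upper = upper_multiple / lam
    dt = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * dt
        # Trapezoidal rule: the two end points carry half weight.
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * t * lam * math.exp(-lam * t)
    return total * dt

if __name__ == "__main__":
    lam = 1e-4  # assumed failure intensity, in failures per hour
    print("numerical MTTF :", mttf_numeric(lam))
    print("1 / lambda     :", 1.0 / lam)
```

The two printed values should agree closely, which is consistent with the closed-form result obtained by integration by parts.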
This equation enables us to correlate the mean time to failure with all the relevant probabilities of system failure. For example, the following table shows the probability that the system operates failure-free for a length of time t, for various values of t, assuming that the system’s mean time to failure is 10,000 hours:
Operation time, t (in hours)    Probability of failure-free operation
0                               1.0
1                               0.9999
10                              0.999
Probabilities of failure free operation for MTTF = 10,000 hours
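Entries of this kind follow from R(t) = e^(−λt) with λ = 1/MTTF, so they can be recomputed for any operation time. The sketch below is illustrative only: the names are hypothetical, and the operation times beyond 10 hours are assumed sample points whose values are computed from the formula rather than quoted from the table.

```python
import math

MTTF_HOURS = 10_000.0      # mean time to failure assumed in the text
LAM = 1.0 / MTTF_HOURS     # failure intensity, since MTTF = 1 / lambda

def reliability(t_hours: float, lam: float = LAM) -> float:
    """Probability of failure-free operation for t_hours hours."""
    return math.exp(-lam * t_hours)

def reliability_table(times):
    """Return (operation time, probability of failure-free operation) pairs."""
    return [(t, reliability(t)) for t in times]

if __name__ == "__main__":
    # Sample operation times in hours; those beyond 10 are assumptions of this sketch.
    for t, r in reliability_table([0, 1, 10, 100, 1_000, 10_000, 100_000]):
        print(f"{t:>9,d}  {r:.7f}")
```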
From this table, we can estimate the probability that a failure occurs in each of the intervals indicated in the table below, by virtue of the formula
$$P(t_0 \le t \le t_1) = e^{-\lambda t_0} - e^{-\lambda t_1}$$
where P(t₀ ≤ t ≤ t₁) designates the probability that the failure occurs between time t₀ and time t₁.
Interval of operation time                           Probability of failure
Within the first hour
After the first hour, within 10 hours
After the first 10 hours, within 100 hours           0.00895
After the first 100 hours, within 1,000 hours        0.085213
After the first 1,000 hours, within 10,000 hours     0.536958
After the first 10,000 hours, within 100,000 hours   0.367834
After the first 100,000 hours                        0.0000454
Total
Probabilities of failure per interval of operation time, for MTTF = 10,000 hours
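The per-interval probabilities can be recomputed from the formula P(t₀ ≤ t ≤ t₁) = e^(−λt₀) − e^(−λt₁). The sketch below is a minimal illustration with hypothetical names; it assumes λ = 1/10,000 per hour and treats the last interval as extending to infinity. Because the intervals partition the whole time axis, the probabilities should sum to 1, and the computed values should agree with the tabulated ones up to rounding.

```python
import math

LAM = 1.0 / 10_000.0   # failure intensity for MTTF = 10,000 hours

# Interval boundaries in hours; the final interval is open-ended.
BOUNDARIES = [0, 1, 10, 100, 1_000, 10_000, 100_000, math.inf]

def interval_failure_probability(t0: float, t1: float, lam: float = LAM) -> float:
    """Probability that the failure occurs between t0 and t1 (t1 may be math.inf)."""
    return math.exp(-lam * t0) - math.exp(-lam * t1)

def interval_table(boundaries=BOUNDARIES):
    """Return (t0, t1, probability) triples for consecutive boundaries."""
    return [
        (t0, t1, interval_failure_probability(t0, t1))
        for t0, t1 in zip(boundaries, boundaries[1:])
    ]

if __name__ == "__main__":
    rows = interval_table()
    for t0, t1, p in rows:
        print(f"({t0}, {t1}] hours: {p:.7f}")
    print("Total:", sum(p for _, _, p in rows))
```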