The Weibull Model

9.4.2 The Weibull Model

The Weibull distribution offers a more general model for describing survival data than the exponential model does. Instead of a constant hazard function, it uses the following parametric form of the hazard function, with positive parameters λ and γ:

$$h(t) = \lambda\gamma\, t^{\gamma-1}.$$

The exponential model corresponds to the particular case γ = 1. For γ > 1, the hazard increases monotonically with time, whereas for γ < 1, the hazard function decreases monotonically. Taking into account 9.23, one obtains:

$$S(t) = e^{-\lambda t^{\gamma}}. \tag{9.31}$$

The probability density function of the survival time is given by the derivative of F(t) = 1 – S(t). Thus:

$$f(t) = \lambda\gamma\, t^{\gamma-1} e^{-\lambda t^{\gamma}}. \tag{9.32}$$

This is the Weibull density function with shape parameter γ and scale parameter $(1/\lambda)^{1/\gamma}$ (see B.2.4):

$$f(t) = w_{\gamma,\,(1/\lambda)^{1/\gamma}}(t). \tag{9.33}$$

Figure B.11 illustrates the influence of the shape and scale parameters of the Weibull distribution. Note that in all cases the distribution is positively skewed, i.e., the probability of survival in a given time interval always decreases with increasing time.
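As an illustration, the Weibull hazard, survivor and density functions can be coded directly from their definitions. The following Python sketch is a minimal example, not part of the original text: the function names are hypothetical, and the parameter values in the loop are chosen only to show the decreasing, constant and increasing hazard regimes.

```python
import math

def weibull_hazard(t, lam, gam):
    """Hazard h(t) = lam * gam * t**(gam - 1), for t > 0."""
    return lam * gam * t ** (gam - 1)

def weibull_survivor(t, lam, gam):
    """Survivor function S(t) = exp(-lam * t**gam)."""
    return math.exp(-lam * t ** gam)

def weibull_density(t, lam, gam):
    """Density f(t) = h(t) * S(t), which is formula 9.32."""
    return weibull_hazard(t, lam, gam) * weibull_survivor(t, lam, gam)

# Illustrative values only (assumed, not taken from any dataset):
# gam < 1 gives a decreasing hazard, gam = 1 a constant one, gam > 1 an increasing one.
times = [0.5, 1.0, 2.0, 4.0]
for gam in (0.5, 1.0, 2.0):
    hazards = [round(weibull_hazard(t, 1.0, gam), 3) for t in times]
    print("gamma =", gam, "hazards:", hazards)
```

Writing the density as the product of hazard and survivor function keeps the three definitions consistent with one another by construction.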

The parameters of the distribution can be estimated from the data using a log-likelihood approach, as described in the previous section, resulting in a system of two equations, which can only be solved by an iterative numerical procedure. An alternative method of fitting the distribution uses a weighted least squares approach, similar to the method described in section 7.1.2. From the estimates λ̂ and γ̂, the following statistics are then derived, namely the estimated median and the estimated 100p-th percentile of the survival time:

$$\hat{t}_{0.5} = \left(\ln 2 / \hat{\lambda}\right)^{1/\hat{\gamma}}. \tag{9.34}$$

$$\hat{t}_{p} = \left(\ln\bigl(1/(1-p)\bigr) / \hat{\lambda}\right)^{1/\hat{\gamma}}.$$
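To make the iterative estimation concrete, here is a minimal Python sketch under stated assumptions; it is not the procedure used by any particular package. It assumes right-censored data given as positive times with 0/1 event indicators. Setting the derivative of the log-likelihood with respect to λ to zero gives λ = d / Σ tᵢ^γ, where d is the number of observed events; substituting this into the derivative with respect to γ leaves a single equation, which the sketch solves by bisection. The function names and the sample values are hypothetical.

```python
import math

def weibull_mle(times, events, tol=1e-10):
    """Maximum-likelihood estimates (lam, gam) for h(t) = lam*gam*t**(gam-1).

    times  : positive observed times (rescale large values to avoid overflow)
    events : 1 if the event was observed, 0 if the time is right-censored
    Assumes at least one event and at least two distinct times.
    """
    d = sum(events)
    logs = [math.log(t) for t in times]
    sum_event_logs = sum(e * lt for e, lt in zip(events, logs))

    def score(gam):
        # Derivative of the log-likelihood w.r.t. gam, with lam = d / sum(t**gam).
        w = [t ** gam for t in times]
        weighted = sum(wi * lt for wi, lt in zip(w, logs)) / sum(w)
        return d / gam + sum_event_logs - d * weighted

    lo, hi = 1e-6, 1.0
    while score(hi) > 0:       # the score decreases with gam: widen the bracket
        hi *= 2.0
    while hi - lo > tol:       # bisection on the bracketed root
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    gam = 0.5 * (lo + hi)
    lam = d / sum(t ** gam for t in times)
    return lam, gam

def weibull_percentile(p, lam, gam):
    """Time t_p with S(t_p) = 1 - p; p = 0.5 gives the median (formula 9.34)."""
    return (math.log(1.0 / (1.0 - p)) / lam) ** (1.0 / gam)

# Hypothetical sample (not the Fatigue dataset); 0 marks a censored time.
sample_times = [1.2, 2.5, 3.1, 4.8, 6.0, 7.4]
sample_events = [1, 1, 0, 1, 1, 0]
lam_hat, gam_hat = weibull_mle(sample_times, sample_events)
print("lambda:", lam_hat, "gamma:", gam_hat)
print("median:", weibull_percentile(0.5, lam_hat, gam_hat))
```

Bisection is used here only because it is simple and the equation in γ has a single root; Newton-type iterations converge faster. Dividing large time values by a constant beforehand, as the text does with the fatigue data, avoids overflow in t**gam.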

The standard error of these estimates has a complex expression (see e.g. Collet D, 1994 or Kleinbaum DG, Klein M, 2005). To assess the suitability of a particular distribution for modelling the data, one can compare the survivor function obtained from the data, using the Kaplan-Meier estimate, Ŝ(t), with the survivor function prescribed by the model, S(t). From 9.31 we have:

$$\ln(-\ln S(t)) = \ln\lambda + \gamma \ln t. \tag{9.35}$$

If S(t) is close to Ŝ(t), the log-cumulative hazard plot of ln(−ln Ŝ(t)) against ln t will be almost a straight line, with slope γ and intercept ln λ.
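The straight-line check can be sketched in Python as an ordinary least-squares fit of ln(−ln Ŝ(t)) against ln t. This is a minimal illustration under stated assumptions: the function name is hypothetical, the fit is unweighted, and the survival values below are synthetic, generated from an exact Weibull curve rather than from a Kaplan-Meier estimate of real data.

```python
import math

def fit_log_cumulative_hazard(times, survival):
    """Least-squares line through the points (ln t, ln(-ln S)).

    Returns (lam, gam): the slope estimates gam and the intercept
    estimates ln(lam), following formula 9.35.
    Requires t > 0 and 0 < S < 1 for every point.
    """
    x = [math.log(t) for t in times]
    y = [math.log(-math.log(s)) for s in survival]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gam = sxy / sxx                           # slope
    lam = math.exp(mean_y - gam * mean_x)     # exp(intercept)
    return lam, gam

# Synthetic survivor values from an exact Weibull curve (assumed parameters).
true_lam, true_gam = 0.2, 0.7
ts = [1.0, 2.0, 5.0, 10.0, 20.0]
surv = [math.exp(-true_lam * t ** true_gam) for t in ts]
print(fit_log_cumulative_hazard(ts, surv))
```

With real data the points scatter around the line, and the closeness of the fit, rather than the recovered parameters alone, indicates whether the Weibull model is adequate.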

An alternative way of assessing the suitability of the model uses the χ² goodness-of-fit test described in section 5.1.3.

Example 9.9

Q: Consider the amount of time until breaking of aluminium specimens submitted to high amplitude sinusoidal loads in fatigue tests, a sample of which is given in the Fatigue dataset. Determine the Weibull estimate of the survivor function and assess the validity of the model. What is the point estimate of the median time until breaking?

A: Figure 9.8 shows the Weibull estimate of the survivor function, determined with STATISTICA (Life tables & Distributions, Number of intervals = 12), using a weighted least squares approach similar to the one mentioned in Example 9.8 (Weight 3). Note that the t values are divided, as in Example 9.3, by 10⁴. The observed probability of the chi-square goodness of fit test is very high: p = 0.96. The model parameters computed by STATISTICA are:

λ̂ = 0.187; γ̂ = 0.703.

Figure 9.8 also shows the log-cumulative hazard plot obtained with EXCEL and computed from the values of the Kaplan-Meier estimate. From the straight-line fit of this plot, one can compute another estimate of the parameter γ, namely γ̂ = 0.639. Inspection of this plot and the previous chi-square test result are indicative of a good fit to the Weibull distribution. The point estimate of the median time until breaking is computed with formula 9.34:

$$\hat{t}_{0.5} = \left(\ln 2 / \hat{\lambda}\right)^{1/\hat{\gamma}}$$

Thus, taking into account the 10⁴ scale factor used for the t axis, a median number of 1970020 cycles is estimated for the time until breaking of the aluminium specimens.

[Figure 9.8: panel a, horizontal axis "Interval Start"; panel b, ln(−ln S(t)) plotted against ln(t).]

Figure 9.8. Fitting the Weibull model to the time until breaking of aluminium specimens submitted to high amplitude sinusoidal loads in fatigue tests: a) Life-table estimate of the survivor function with Weibull estimate (solid line); b) Log-cumulative hazard plot (solid line) with fitted regression line (dotted line).