Comparing Two Groups of Survival Data

9.3 Comparing Two Groups of Survival Data

Let $h_1(t)$ and $h_2(t)$ denote the hazard functions of two independent groups of survival data, often called the exposed and unexposed groups. Comparison of the two groups of survival data can be performed as a hypothesis test formalised in terms of the hazard ratio $\psi = h_1(t)/h_2(t)$, as follows:

$H_0$: $\psi = 1$ (survival curves are the same);

$H_1$: $\psi \neq 1$ (one of the groups will consistently be at a greater risk).

The following two non-parametric tests are of widespread use:

1. The Log-Rank Test.

Suppose that there are $r$ distinct death times, $t_1, t_2, \ldots, t_r$, across the two groups, and that at each time $t_j$ there are $d_{1j}$ and $d_{2j}$ individuals of groups 1 and 2, respectively, that die. Suppose further that just before time $t_j$ there are $n_{1j}$ and $n_{2j}$ individuals of groups 1 and 2, respectively, at risk of dying. Thus, at time $t_j$ there are $d_j = d_{1j} + d_{2j}$ deaths in a total of $n_j = n_{1j} + n_{2j}$ individuals at risk, as shown in Table 9.6.

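Table 9.6. Number of deaths and survivals at time $t_j$ in a two-group comparison.

Group    Deaths at $t_j$    Survivals beyond $t_j$    Individuals at risk before $t_j - \delta$
1        $d_{1j}$           $n_{1j} - d_{1j}$         $n_{1j}$
2        $d_{2j}$           $n_{2j} - d_{2j}$         $n_{2j}$
Total    $d_j$              $n_j - d_j$               $n_j$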

If the marginal totals along the rows and columns in Table 9.6 are considered fixed, and the null hypothesis is true (survival time is independent of group), the four inner cells of Table 9.6 depend on only one of the group deaths, say $d_{1j}$. As described in section B.1.4, the probability of the associated random variable, $D_{1j}$, taking a value $d_{1j}$ in $[0, \min(n_{1j}, d_j)]$ is given by the hypergeometric law:

$$P(D_{1j} = d_{1j}) = \frac{\binom{d_j}{d_{1j}} \binom{n_j - d_j}{n_{1j} - d_{1j}}}{\binom{n_j}{n_{1j}}}. \tag{9.18}$$

The mean of $D_{1j}$ is the expected number of group 1 individuals who die at time $t_j$ (see B.1.4):

$$e_{1j} = n_{1j}\,(d_j / n_j). \tag{9.19}$$
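The relation between the hypergeometric law and the expected number of deaths can be checked numerically. The following is a minimal Python sketch (Python is not one of the tools used in the text and is chosen here only for illustration). The function name `hypergeom_pmf` and the counts are hypothetical, and the variance formula used for comparison is the standard hypergeometric variance.

```python
from math import comb

def hypergeom_pmf(d1, n1, n2, d):
    """P(D1 = d1) when d deaths occur among n1 + n2 individuals at risk
    and all marginal totals of the 2x2 table are fixed."""
    n = n1 + n2
    return comb(d, d1) * comb(n - d, n1 - d1) / comb(n, n1)

# Hypothetical table at one death time (illustrative numbers only):
# 10 and 15 individuals at risk in groups 1 and 2, 3 deaths in total.
n1, n2, d = 10, 15, 3
n = n1 + n2

support = range(0, min(n1, d) + 1)
pmf = [hypergeom_pmf(k, n1, n2, d) for k in support]

# Mean and variance computed directly from the probabilities.
mean_from_pmf = sum(k * p for k, p in zip(support, pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * p for k, p in zip(support, pmf))

# Closed forms: the mean is formula 9.19; the variance is the
# standard hypergeometric variance.
e1 = n1 * d / n
var_formula = n1 * n2 * d * (n - d) / (n**2 * (n - 1))

print(f"mean: {mean_from_pmf:.4f} (pmf) vs {e1:.4f} (formula)")
print(f"variance: {var_from_pmf:.4f} (pmf) vs {var_formula:.4f} (formula)")
```

Both pairs of values should agree up to floating-point error, which illustrates why only $n_{1j}$, $n_{2j}$ and $d_j$ are needed at each death time.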

The Log-Rank test combines the information of all 2 × 2 contingency tables, similar to Table 9.6, that one can establish for all $t_j$, using a test based on the $\chi^2$ test (see 5.1.3). The method of combining the information of all 2 × 2 contingency tables is known as the Mantel-Haenszel procedure. The test statistic is:

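$$\chi^{*2} = \frac{\left( \left| \sum_{j=1}^{r} d_{1j} - \sum_{j=1}^{r} e_{1j} \right| - 0.5 \right)^2}{\sum_{j=1}^{r} \dfrac{n_{1j}\, n_{2j}\, d_j\, (n_j - d_j)}{n_j^2\, (n_j - 1)}} \sim \chi_1^2 \quad (\text{under } H_0). \tag{9.20}$$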

Note that the numerator, apart from the 0.5 continuity correction, is the squared absolute difference between the observed and expected frequencies of deaths in group 1. The denominator is the sum of the variances of $D_{1j}$, according to the hypergeometric law.
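The construction of the statistic from the individual 2 × 2 tables can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used by any of the packages mentioned in the text: the function name `log_rank_statistic` and the four tables are hypothetical, and each variance term is the standard hypergeometric variance $n_{1j} n_{2j} d_j (n_j - d_j) / (n_j^2 (n_j - 1))$.

```python
def log_rank_statistic(tables, correction=0.5):
    """Log-rank chi-square statistic from per-death-time 2x2 tables.

    tables: iterable of (d1, d2, n1, n2), the deaths and the numbers at
    risk in groups 1 and 2 at each distinct death time.
    correction: continuity correction subtracted from |observed - expected|.
    """
    observed = expected = variance = 0.0
    for d1, d2, n1, n2 in tables:
        d, n = d1 + d2, n1 + n2
        observed += d1                 # deaths seen in group 1
        expected += n1 * d / n         # e_1j under the null hypothesis
        if n > 1:                      # one individual at risk adds no variance
            variance += n1 * n2 * d * (n - d) / (n**2 * (n - 1))
    return (abs(observed - expected) - correction) ** 2 / variance

# Hypothetical data (illustrative only): two groups of 5 individuals,
# no censoring, first four distinct death times.
tables = [(1, 0, 5, 5), (1, 0, 4, 5), (0, 1, 3, 5), (1, 1, 3, 4)]

stat = log_rank_statistic(tables)
stat_uncorrected = log_rank_statistic(tables, correction=0.0)
print(f"with continuity correction:    {stat:.4f}")
print(f"without continuity correction: {stat_uncorrected:.4f}")
```

With so few deaths the continuity correction changes the statistic considerably. Its influence shrinks as the difference between observed and expected deaths grows.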

2. The Peto-Wilcoxon test.

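The Peto-Wilcoxon test uses the following test statistic:

$$W = \frac{\left( \sum_{j=1}^{r} n_j \left( d_{1j} - e_{1j} \right) \right)^2}{\sum_{j=1}^{r} \dfrac{n_{1j}\, n_{2j}\, d_j\, (n_j - d_j)}{n_j - 1}} \sim \chi_1^2 \quad (\text{under } H_0). \tag{9.21}$$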

This statistic differs from 9.20 in the factor $n_j$, which weights the differences between observed and expected group 1 deaths.
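The effect of weighting by $n_j$ can be illustrated with a short Python sketch. It assumes that each difference $d_{1j} - e_{1j}$ is multiplied by $n_j$, that each hypergeometric variance term is correspondingly multiplied by $n_j^2$, and that no continuity correction is applied. The function name `weighted_statistic` and the data are hypothetical.

```python
def weighted_statistic(tables):
    """Chi-square statistic with the differences d1 - e1 weighted by n.

    tables: iterable of (d1, d2, n1, n2), the deaths and the numbers at
    risk in groups 1 and 2 at each distinct death time.
    Assumption: the variance of each weighted term is n**2 times the
    hypergeometric variance, i.e. n1 * n2 * d * (n - d) / (n - 1).
    """
    numerator = denominator = 0.0
    for d1, d2, n1, n2 in tables:
        d, n = d1 + d2, n1 + n2
        numerator += n * (d1 - n1 * d / n)      # n_j * (d_1j - e_1j)
        if n > 1:                               # avoid division by zero
            denominator += n1 * n2 * d * (n - d) / (n - 1)
    return numerator**2 / denominator

# Hypothetical data (illustrative only): two groups of 5 individuals,
# no censoring, first four distinct death times.
tables = [(1, 0, 5, 5), (1, 0, 4, 5), (0, 1, 3, 5), (1, 1, 3, 4)]

w = weighted_statistic(tables)
print(f"weighted statistic: {w:.4f}")
```

Because $n_j$ decreases over time, early death times receive more weight than late ones. This makes the statistic less sensitive to differences in the tails of the survivor functions.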

The Log-Rank test is more appropriate than the Peto-Wilcoxon test when the alternative hypothesis is that the hazard of death for an individual in one group is proportional to the hazard at that time for a similar individual in the other group. The validity of this proportional hazard assumption can be assessed by looking at the survivor functions of both groups. If they clearly do not cross each other, the proportional hazard assumption is quite probably true and the Log-Rank test should be used. Otherwise, the Peto-Wilcoxon test is used instead.

Example 9.7

Q: Consider the fatigue test results for iron and aluminium specimens, subject to low amplitude sinusoidal load (Group 1), given in the Fatigue dataset. Compare the survival times of the iron and aluminium specimens using the Log-Rank and the Peto-Wilcoxon tests.

A: With SPSS or STATISTICA one must fill in a datasheet with columns for the “time”, censored and group data. In SPSS one must run the test within the Kaplan-Meier option and select the appropriate test in the Compare Factor window. Note that SPSS calls the Peto-Wilcoxon test the Breslow test.

In R the survdiff function of the survival package performs the log-rank test with the default value rho = 0. It is applied as follows:

> library(survival)   # provides Surv and survdiff
> survdiff(Surv(cycles, cens == 1) ~ group)
Call:
survdiff(formula = Surv(cycles, cens == 1) ~ group)

         N Observed Expected (O-E)^2/E (O-E)^2/V
group=1 39       23     24.6    0.1046     0.190
group=2 48       32     30.4    0.0847     0.190

 Chisq= 0.2  on 1 degrees of freedom, p= 0.663

The Peto-Wilcoxon test is performed by setting rho = 1.

SPSS, STATISTICA and R report observed significances of 0.66 and 0.89 for the Log-Rank and Peto-Wilcoxon tests, respectively. Looking at the survivor functions shown in Figure 9.6, drawn with values computed with STATISTICA, we observe that they practically do not cross. Therefore, the proportional hazard assumption is probably true and the Log-Rank test is more appropriate than the Peto-Wilcoxon test. With p = 0.66, the null hypothesis of equal hazard functions is not rejected.
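The observed significance reported for the Log-Rank test can be recovered from the test statistic. The Python sketch below relies on the fact that a $\chi^2$ variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability is $\mathrm{erfc}(\sqrt{x/2})$. The function name `chi2_sf_1df` is hypothetical, and the value 0.190 is the rounded (O-E)^2/V figure from the R output.

```python
from math import erfc, sqrt

def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable
    with 1 degree of freedom."""
    return erfc(sqrt(x / 2.0))

# Rounded log-rank statistic taken from the survdiff output.
p_log_rank = chi2_sf_1df(0.190)
print(round(p_log_rank, 3))  # prints 0.663
```

The result agrees with the value p = 0.663 printed by survdiff, up to the rounding of the statistic.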

[Plot of the survivor functions S(t), with one curve for Iron and one for Aluminium.]

Figure 9.6. Life-table estimates of the survivor functions for the iron and aluminium specimens (Group 1). (Plot obtained with EXCEL using SPSS results.)