Example 13.22 The bond shear strength data introduced in Example 13.12 contains values of four
different independent variables $x_1$–$x_4$. We found that the model with only these four
variables as predictors was useful, and there is no compelling reason to consider the inclusion of second-order predictors. Figure 13.19 is the Minitab output that results from a request to identify the two best models of each given size.
The best two-predictor model, with predictors power and temperature, seems
to be a very good choice on all counts: $R^2$ is substantially higher than for models with fewer predictors yet almost as large as for any larger model, adjusted $R^2$ is almost
at its maximum for these data, and $C_2$ is small and close to $2 + 1 = 3$.
Figure 13.19 Output from Minitab’s Best Subsets option (response is strength)
■
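The criteria used in a best subsets search can also be computed directly with general-purpose software. The sketch below is a minimal illustration, not Minitab’s algorithm: it assumes the predictor values form the columns of a NumPy array X, the responses are in a vector y, and the statsmodels package is available; the function name best_subsets is ours. It enumerates every subset of the $m$ candidate predictors and reports $R^2$, adjusted $R^2$, and $C_k = \mathrm{SSE}_k/s^2 + 2(k+1) - n$, where $s^2$ is the mean squared error from the model containing all candidates.

```python
from itertools import combinations

import numpy as np
import statsmodels.api as sm

def best_subsets(X, y):
    """Enumerate all predictor subsets and report R^2, adjusted R^2, and C_k.

    X : (n, m) array of predictor values (hypothetical data)
    y : (n,) response vector
    Exhaustive search, so practical only for modest m.
    """
    n, m = X.shape
    full = sm.OLS(y, sm.add_constant(X)).fit()
    s2 = full.mse_resid                          # s^2 from the full model
    results = []
    for k in range(1, m + 1):
        for subset in combinations(range(m), k):
            fit = sm.OLS(y, sm.add_constant(X[:, list(subset)])).fit()
            c_k = fit.ssr / s2 + 2 * (k + 1) - n  # Mallows' C_k
            results.append((subset, fit.rsquared, fit.rsquared_adj, c_k))
    return results
```

Subsets with small $C_k$ (near $k + 1$) and adjusted $R^2$ near its maximum then play the role of the best rows in Figure 13.19.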
Stepwise Regression When the number of predictors is too large to allow for explicit or implicit examination of all possible subsets, several alternative selection procedures will generally identify good models. The simplest such procedure is the backward elimination (BE) method. This method starts with the model in which all predictors under consideration are used. Let the set of all such predictors be
$x_1, \ldots, x_m$. Then each t ratio $\hat{\beta}_i / s_{\hat{\beta}_i}$ $(i = 1, \ldots, m)$ appropriate for testing $H_0\colon \beta_i = 0$ versus $H_a\colon \beta_i \neq 0$ is examined. If the t ratio with the smallest absolute value is less than a prespecified constant $t_{\text{out}}$, that is, if

$$\min_{i=1,\ldots,m} \left| \frac{\hat{\beta}_i}{s_{\hat{\beta}_i}} \right| < t_{\text{out}}$$
then the predictor corresponding to the smallest ratio is eliminated from the model. The reduced model is now fit, the $m - 1$ t ratios are again examined, and another predictor is eliminated if it corresponds to the smallest absolute t ratio smaller than $t_{\text{out}}$. In this way, the algorithm continues until, at some stage, all absolute t ratios are at least $t_{\text{out}}$. The model used is the one containing all predictors that were not eliminated. The value $t_{\text{out}} = 2$ is often recommended since most $t_{.05}$ values are near 2.
Some computer packages focus on P-values rather than t ratios.
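To make the algorithm concrete, here is a brief Python sketch of BE (hedged as before: statsmodels and a NumPy predictor matrix are assumed, and the function name is illustrative). A package working with P-values would replace the comparison of t ratios by a comparison of the largest P-value against a cutoff.

```python
import numpy as np
import statsmodels.api as sm

def backward_elimination(X, y, t_out=2.0):
    """Backward elimination based on absolute t ratios.

    Starts with all m predictors; repeatedly drops the predictor with
    the smallest |t ratio| as long as that ratio is below t_out.
    """
    active = list(range(X.shape[1]))      # predictors still in the model
    while active:
        fit = sm.OLS(y, sm.add_constant(X[:, active])).fit()
        t = np.abs(fit.tvalues[1:])       # skip the intercept's t ratio
        worst = int(np.argmin(t))
        if t[worst] >= t_out:             # every |t ratio| >= t_out: stop
            break
        del active[worst]                 # eliminate the weakest predictor, refit
    return active
```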
Example 13.23 (Example 13.20 continued) For the coded full quadratic model in which $y = \text{tar content}$, the five potential predictors are $x'_1$, $x'_2$, $x'_3 = x_1'^2$, $x'_4 = x_2'^2$, and $x'_5 = x'_1 x'_2$ (so $m = 5$). Without specifying $t_{\text{out}}$, the predictor with the smallest absolute t ratio (asterisked) was eliminated at each stage, resulting in the sequence of models shown in Table 13.11.
Table 13.11 Backward Elimination Results for the Data of Example 13.20
[Table 13.11 body not reproduced: for each step it lists the predictors retained and their $|t$ ratio$|$ values, with the smallest asterisked.]
Using $t_{\text{out}} = 2$, the resulting model would be based on $x'_1$, $x'_2$, and $x'_3$, since at Step 3 no predictor could be eliminated. It can be verified that each subset is actually the best subset of its size, though this is by no means always the case.
■
An alternative to the BE procedure is forward selection (FS). FS starts with
no predictors in the model and considers fitting in turn the model with only $x_1$, only $x_2, \ldots,$ and finally only $x_m$. The variable that, when fit, yields the largest absolute t ratio enters the model provided that the ratio exceeds the specified constant $t_{\text{in}}$.
Suppose $x_1$ enters the model. Then models with $(x_1, x_2), (x_1, x_3), \ldots, (x_1, x_m)$ are considered in turn. The largest $|\hat{\beta}_j / s_{\hat{\beta}_j}|$ $(j = 2, \ldots, m)$ then specifies the entering predictor provided that this maximum also exceeds $t_{\text{in}}$. This continues until at some step no absolute t ratios exceed $t_{\text{in}}$. The entered predictors then specify the model. The value $t_{\text{in}} = 2$ is often used for the same reason that $t_{\text{out}} = 2$ is used in BE. For the tar-content data, FS resulted in the sequence of models given in Steps $5, 4, \ldots, 1$ in
Table 13.11 and thus is in agreement with BE. This will not always be the case.
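FS admits an equally short sketch (same assumptions and hedges as the BE sketch above): at each step every remaining candidate is tried, and the one yielding the largest absolute t ratio enters if that ratio exceeds $t_{\text{in}}$.

```python
import statsmodels.api as sm

def forward_selection(X, y, t_in=2.0):
    """Forward selection based on absolute t ratios.

    Starts with no predictors; at each step tries adding each remaining
    candidate and enters the one with the largest |t ratio|, provided
    that ratio exceeds t_in.
    """
    active, candidates = [], list(range(X.shape[1]))
    while candidates:
        best_j, best_t = None, t_in
        for j in candidates:
            fit = sm.OLS(y, sm.add_constant(X[:, active + [j]])).fit()
            t_j = abs(fit.tvalues[-1])    # t ratio of the newly tried predictor
            if t_j > best_t:
                best_j, best_t = j, t_j
        if best_j is None:                # no candidate exceeds t_in: stop
            break
        active.append(best_j)
        candidates.remove(best_j)
    return active
```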
The stepwise procedure most widely used is a combination of FS and BE, denoted by FB. This procedure starts as does forward selection, by adding variables to the model, but after each addition it examines those variables previously entered to see whether any is a candidate for elimination. For example, if there are eight pre-
dictors under consideration and the current set consists of $x_2$, $x_3$, $x_5$, and $x_6$, with $x_5$ having just been added, the t ratios $\hat{\beta}_2/s_{\hat{\beta}_2}$, $\hat{\beta}_3/s_{\hat{\beta}_3}$, and $\hat{\beta}_6/s_{\hat{\beta}_6}$ are examined. If the smallest absolute ratio is less than $t_{\text{out}}$, then the corresponding variable is eliminated from the model (some software packages base decisions on $f = t^2$). The idea behind
FB is that, with forward selection, a single variable may be more strongly related to y than either of two or more other variables individually, but the combination of those variables may make the single variable subsequently redundant. This actually
happened with the gas-mileage data discussed in Example 13.21, with $x_2$ entering
and subsequently leaving the model.
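Combining the two earlier sketches gives an illustrative version of FB (again under the same assumptions; real packages differ in details, and a step cap is included here to guard against a variable cycling in and out of the model indefinitely):

```python
import numpy as np
import statsmodels.api as sm

def stepwise_fb(X, y, t_in=2.0, t_out=2.0, max_steps=100):
    """Stepwise FB: forward steps, each followed by a backward check."""
    active, candidates = [], list(range(X.shape[1]))
    for _ in range(max_steps):
        # Forward step: enter the candidate with the largest |t ratio| > t_in.
        best_j, best_t = None, t_in
        for j in candidates:
            fit = sm.OLS(y, sm.add_constant(X[:, active + [j]])).fit()
            if abs(fit.tvalues[-1]) > best_t:
                best_j, best_t = j, abs(fit.tvalues[-1])
        if best_j is None:                   # no candidate qualifies: done
            break
        active.append(best_j)
        candidates.remove(best_j)
        # Backward check: re-examine only the previously entered predictors.
        while len(active) > 1:
            fit = sm.OLS(y, sm.add_constant(X[:, active])).fit()
            t = np.abs(fit.tvalues[1:-1])    # exclude intercept and newest entrant
            worst = int(np.argmin(t))
            if t[worst] >= t_out:
                break
            candidates.append(active.pop(worst))
    return active
```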
Although in most situations these automatic selection procedures will identify
a good model, there is no guarantee that the best or even a nearly best model will result. Close scrutiny should be given to data sets for which there appear to be strong relationships among some of the potential predictors; we will say more about this shortly.