EXAMPLE 12-3 Wire Bond Strength ANOVA

We will test for significance of regression (with α = 0.05) using the wire bond pull strength data from Example 12-1. The total sum of squares is

$$SS_T = \mathbf{y}'\mathbf{y} - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} = \mathbf{y}'\mathbf{y} - \frac{(725.82)^2}{25} = 6105.9447$$

The regression or model sum of squares is computed from Equation 12-20 as follows:

$$SS_R = \hat{\boldsymbol{\beta}}'\mathbf{X}'\mathbf{y} - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} = 5990.7712$$

and by subtraction

$$SS_E = SS_T - SS_R = \mathbf{y}'\mathbf{y} - \hat{\boldsymbol{\beta}}'\mathbf{X}'\mathbf{y} = 115.1716$$

The analysis of variance is shown in Table 12-10. To test H_0: β_1 = β_2 = 0, we calculate the statistic

$$f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/k}{SS_E/(n-p)} = \frac{2995.3856}{5.2351} = 572.17$$

Since f_0 > f_{0.05,2,22} = 3.44 (or since the P-value is considerably smaller than α = 0.05), we reject the null hypothesis and conclude that pull strength is linearly related to either wire length or die height, or both.

Practical Interpretation: Rejection of H_0 does not necessarily imply that the relationship found is an appropriate model for predicting pull strength as a function of wire length and die height. Further tests of model adequacy are required before we can be comfortable using this model in practice.

  Most multiple regression computer programs provide the test for significance of regression in their output display. The middle portion of Table 12-4 is the Minitab output for this example. Compare Tables 12-4 and 12-10 and note their equivalence apart from rounding. The P-value is rounded to zero in the computer output.

Table 12-10 Test for Significance of Regression for Example 12-3

Source of Variation     Sum of Squares    Degrees of Freedom    Mean Square    f_0       P-value
Regression              5990.7712         2                     2995.3856      572.17    ≈ 0
Error or residual       115.1716          22                    5.2351
Total                   6105.9447         24
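As a quick check of the arithmetic behind Table 12-10, the short Python sketch below (our illustration, not part of the original example; it assumes SciPy is available and uses variable names of our own choosing) recomputes the mean squares, the f_0 statistic, and the P-value from the sums of squares quoted in the example.

```python
from scipy import stats

# Sums of squares quoted in Example 12-3 (wire bond pull strength data)
SS_R = 5990.7712   # regression sum of squares
SS_E = 115.1716    # error (residual) sum of squares
n, k = 25, 2       # observations and regressors; p = k + 1 model parameters
p = k + 1

MS_R = SS_R / k          # regression mean square
MS_E = SS_E / (n - p)    # error mean square
f0 = MS_R / MS_E         # test statistic for H0: beta_1 = beta_2 = 0

# P-value is the upper tail of the F distribution with k and n - p degrees of freedom
p_value = stats.f.sf(f0, k, n - p)
f_crit = stats.f.ppf(0.95, k, n - p)   # f_{0.05, 2, 22} ≈ 3.44

print(f"f0 = {f0:.2f}, critical value = {f_crit:.2f}, P-value = {p_value:.3g}")
# f0 is far above 3.44 and the P-value is essentially zero, so H0 is rejected.
```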


R² and Adjusted R²

We may also use the coefficient of multiple determination R² as a global statistic to assess the fit of the model. Computationally,

$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_E}{SS_T} \qquad (12\text{-}22)$$

For the wire bond pull strength data, we find that R² = SS_R/SS_T = 5990.7712/6105.9447 = 0.9811. Thus the model accounts for about 98% of the variability in the pull strength response (refer to the Minitab output in Table 12-4). The R² statistic is somewhat problematic as a measure of the quality of the fit for a multiple regression model because it never decreases when a variable is added to a model.

To illustrate, consider the model fit to the wire bond pull strength data in Example 11-8. This was a simple linear regression model with x_1 = wire length as the regressor. The value of R² for this model is R² = 0.9640. Therefore, adding x_2 = die height to the model increases R² by 0.9811 − 0.9640 = 0.0171, a very small amount. Since R² can never decrease when a regressor is added, it can be difficult to judge whether the increase is telling us anything useful about the new regressor. It is particularly hard to interpret a small increase, such as observed in the pull strength data.
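The claim that R² cannot decrease when a regressor is added is easy to verify numerically. The sketch below (a self-contained illustration on synthetic data of our own, not the wire bond measurements) fits a model by least squares with and without an extra, purely random column and compares the two R² values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 25
x1 = rng.uniform(0, 20, n)
y = 2.0 + 2.7 * x1 + rng.normal(0, 2, n)   # response depends on x1 only
x2 = rng.normal(size=n)                     # irrelevant extra regressor

def r_squared(columns, y):
    """R^2 = 1 - SS_E/SS_T for an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_e = resid @ resid
    ss_t = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_e / ss_t

print(f"R^2 with x1 only   : {r_squared([x1], y):.4f}")
print(f"R^2 with x1 and x2 : {r_squared([x1, x2], y):.4f}")  # never smaller than the first value
```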

Many regression users prefer to use an adjusted R² statistic:

$$R^2_{\text{adj}} = 1 - \frac{SS_E/(n - p)}{SS_T/(n - 1)} \qquad (12\text{-}23)$$

Because SS_E/(n − p) is the error or residual mean square and SS_T/(n − 1) is a constant, R²_adj will only increase when a variable is added to the model if the new variable reduces the error mean square. Note that for the multiple regression model for the pull strength data R²_adj = 0.979 (see the Minitab output in Table 12-4), whereas in Example 11-8 the adjusted R² for the one-variable model is R²_adj = 0.962. Therefore, we would conclude that adding x_2 = die height to the model does result in a meaningful reduction in unexplained variability in the response.

The adjusted R² statistic essentially penalizes the analyst for adding terms to the model. It is an easy way to guard against overfitting, that is, including regressors that are not really useful. Consequently, it is very useful in comparing and evaluating competing regression models. We will use R²_adj for this when we discuss variable selection in regression in Section 12-6.3.

12-2.2 Tests on Individual Regression Coefficients and Subsets of Coefficients

  We are frequently interested in testing hypotheses on the individual regression coefficients. Such tests would be useful in determining the potential value of each of the regressor variables in the regression model. For example, the model might be more effective with the inclusion of additional variables or perhaps with the deletion of one or more of the regressors presently in the model.


The hypothesis to test if an individual regression coefficient, say β_j, equals a value β_{j0} is

$$H_0: \beta_j = \beta_{j0}$$
$$H_1: \beta_j \neq \beta_{j0} \qquad (12\text{-}24)$$
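In practice, the tests of this form with β_{j0} = 0 are usually read straight from regression software output. As an illustration only (synthetic stand-in data, not the text's Minitab output), a library such as statsmodels reports a t statistic and a P-value for each coefficient when a model is fit by ordinary least squares:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic placeholder data; in the example these would be the wire bond
# pull strength observations with regressors wire length and die height.
rng = np.random.default_rng(0)
n = 25
X = np.column_stack([rng.uniform(0, 20, n), rng.uniform(0, 600, n)])
y = 2.0 + 2.7 * X[:, 0] + 0.013 * X[:, 1] + rng.normal(0, 2, n)

model = sm.OLS(y, sm.add_constant(X)).fit()
# model.tvalues and model.pvalues give, for each beta_j, the test statistic and
# P-value for H0: beta_j = 0 versus H1: beta_j != 0.
print(model.tvalues)
print(model.pvalues)
```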