Example 13.7 The article “Residual Stresses and Adhesion of Thermal Spray Coatings” (Surface
Engineering, 2005: 35–40) considered the relationship between the thickness (mm) of NiCrAl coatings deposited on stainless steel substrate and corresponding bond strength (MPa). The following data was read from a plot in the paper:
We will see in Section 13.4 that polynomial regression is a special case of multiple regression, so a command appropriate for this latter task is generally used.
13.3 Polynomial Regression
The scatter plot in Figure 13.11(a) supports the choice of the quadratic regression model. Figure 13.11(b) contains Minitab output from a fit of this model. The estimated regression coefficients are
$$\hat{\beta}_0 = 14.521 \qquad \hat{\beta}_1 = .04323 \qquad \hat{\beta}_2 = -.00006001$$
from which the estimated regression function is
$$y = 14.521 + .04323x - .00006001x^2$$
Substitution of the successive x values 220, 220, . . . , 860, and 860 into this function gives the predicted values $\hat{y}_1 = 21.128, \ldots, \hat{y}_{20} = 7.321$, and the residuals $y_1 - \hat{y}_1 = 2.872, \ldots, y_{20} - \hat{y}_{20} = -4.521$ result from subtraction. Figure 13.12 shows a plot of the standardized residuals versus $\hat{y}$ and also a normal probability plot of the standardized residuals, both of which validate the quadratic model.
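The substitution step can be sketched in a few lines of Python. This is only an illustration: the coefficients are the rounded values quoted in the text, so results may differ from the Minitab figures in the third decimal place, and the function names are hypothetical. The observation 24.0 used below is not taken from the data table; it is the value implied by the quoted $\hat{y}_1 = 21.128$ and residual 2.872.

```python
# Rounded coefficient estimates quoted in the text for Example 13.7.
B0, B1, B2 = 14.521, 0.04323, -0.00006001


def predicted_strength(thickness):
    """Estimated bond strength (MPa) from the fitted quadratic at a given thickness."""
    return B0 + B1 * thickness + B2 * thickness ** 2


def residual(observed, thickness):
    """Observed value minus the value predicted by the fitted quadratic."""
    return observed - predicted_strength(thickness)


if __name__ == "__main__":
    # Smallest and largest x values mentioned in the text.
    for x in (220, 860):
        print(x, round(predicted_strength(x), 3))
    # 24.0 is the first observation implied by the quoted fit and residual.
    print(round(residual(24.0, 220), 3))
```

Because the printed coefficients are rounded, the computed predictions agree with the quoted 21.128 and 7.321 only to about two decimal places.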
The regression equation is
strength = 14.5 + 0.0432 thickness - 0.000060 thicksqd
R-Sq = 78.0%   R-Sq(adj) = 75.4%
[Minitab output also includes a coefficient table (Predictor, Coef, SE Coef), an Analysis of Variance table, and Predicted Values for New Observations.]
Figure 13.11 Scatter plot of data from Example 13.7 and Minitab output from fit of quadratic model
CHAPTER 13 Nonlinear and Multiple Regression
[Figure panels: Normal Probability Plot of the Residuals (Percent versus Standardized Residual); Residuals Versus the Fitted Values (Standardized Residual versus Fitted Value)]
Figure 13.12 Diagnostic plots for quadratic model fit to data of Example 13.7
■
$\hat{\sigma}^2$ and $R^2$
To make further inferences, the error variance $\sigma^2$ must be estimated. With $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \cdots + \hat{\beta}_k x_i^k$, the ith residual is $y_i - \hat{y}_i$, and the sum of squared residuals (error sum of squares) is $SSE = \sum (y_i - \hat{y}_i)^2$. The estimate of $\sigma^2$ is then
$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k + 1)}$$
where the denominator $n - (k + 1)$ is used because $k + 1$ df are lost in estimating $\beta_0, \beta_1, \ldots, \beta_k$.
If we again let $SST = \sum (y_i - \bar{y})^2$, then $SSE/SST$ is the proportion of the total variation in the observed $y_i$'s that is not explained by the polynomial model. The quantity $1 - SSE/SST$, the proportion of variation explained by the model, is called the coefficient of multiple determination and is denoted by $R^2$.
Consider fitting a cubic model to the data in Example 13.7. Because this model includes the quadratic as a special case, the fit will be at least as good as the fit to a quadratic. More generally, with $SSE_k$ = the error sum of squares from a kth-degree polynomial, $SSE_{k'} \le SSE_k$ and $R^2_{k'} \ge R^2_k$ whenever $k' > k$. Because the objective of regression analysis is to find a model that is both simple (relatively few parameters) and provides a good fit to the data, a higher-degree polynomial may not specify a better model than a lower-degree model despite its higher $R^2$ value. To balance the cost of using more parameters against the gain in $R^2$, many statisticians use the adjusted coefficient of multiple determination
$$\text{adjusted } R^2 = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{SSE}{SST} = \frac{(n - 1)R^2 - k}{n - 1 - k}$$
Adjusted $R^2$ adjusts the proportion of unexplained variation upward [since the ratio $(n - 1)/(n - k - 1)$ exceeds 1], which results in adjusted $R^2 < R^2$. For example, if $R^2_2 = .66$, $R^2_3 = .70$, and $n = 10$, then
$$\text{adjusted } R^2_2 = \frac{9(.66) - 2}{10 - 3} = .563 \qquad \text{adjusted } R^2_3 = \frac{9(.70) - 3}{10 - 4} = .550$$
Thus the small gain in $R^2$ in going from a quadratic to a cubic model is not enough to offset the cost of adding an extra parameter to the model.
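The comparison of the quadratic and cubic models can be reproduced with a short Python sketch. The function name `adjusted_r_squared` is a hypothetical helper, not something from the text or from any statistics package; it simply evaluates $[(n - 1)R^2 - k]/(n - 1 - k)$ for the $R^2$ values and sample size given in the example.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for a kth-degree polynomial fit to n observations.

    Uses the form ((n - 1) * R^2 - k) / (n - 1 - k).
    """
    if n - (k + 1) <= 0:
        raise ValueError("need more observations than parameters (n > k + 1)")
    return ((n - 1) * r_squared - k) / (n - 1 - k)


if __name__ == "__main__":
    # Values from the illustration in the text: R^2 = .66 (quadratic),
    # R^2 = .70 (cubic), n = 10.
    quadratic = adjusted_r_squared(0.66, n=10, k=2)
    cubic = adjusted_r_squared(0.70, n=10, k=3)
    print(round(quadratic, 3), round(cubic, 3))
    # The cubic has the larger R^2 but the smaller adjusted R^2.
    print(cubic < quadratic)
```

Under these inputs the cubic model's adjusted value falls below the quadratic's, which is the sense in which the extra parameter is not worth its cost.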
Example 13.8 (Example 13.7 continued)
SSE and SST are typically found on computer output in an ANOVA table. Figure 13.11(b) gives SSE = 181.71 and SST = 825.00 for the bond strength data, from which $R^2 = 1 - 181.71/825.00 = .780$ (alternatively, $R^2 = SSR/SST = 643.29/825.00 = .780$). Thus 78.0% of the observed variation in bond strength can be attributed to the model relationship. Adjusted $R^2 = .754$, only a small downward change in $R^2$. The estimates of $\sigma^2$ and $\sigma$ are
$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k + 1)} = \frac{181.71}{20 - (2 + 1)} = 10.69 \qquad \hat{\sigma} = s = 3.27$$
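As a check on the arithmetic in this example, the following Python sketch recomputes the summary quantities from SSE and SST. It assumes n = 20 (the text indexes predictions up to $\hat{y}_{20}$) and k = 2 for the quadratic model; `summarize_fit` is a hypothetical helper name introduced only for illustration.

```python
import math


def summarize_fit(sse, sst, n, k):
    """Summary quantities for a kth-degree polynomial fit to n observations."""
    df_error = n - (k + 1)            # k + 1 df are lost estimating the betas
    s_squared = sse / df_error        # estimate of the error variance
    return {
        "ssr": sst - sse,             # regression sum of squares
        "r_squared": 1 - sse / sst,
        "adj_r_squared": 1 - (n - 1) / df_error * sse / sst,
        "s_squared": s_squared,
        "s": math.sqrt(s_squared),
    }


if __name__ == "__main__":
    # SSE and SST as quoted from Figure 13.11(b); n = 20 and k = 2 are
    # assumptions read off the surrounding text.
    for name, value in summarize_fit(sse=181.71, sst=825.00, n=20, k=2).items():
        print(name, round(value, 3))
```

The rounded results agree with the values reported in the example (.780, .754, 10.69, and 3.27).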
■
Besides computing $R^2$ and adjusted $R^2$, one should examine the usual diagnostic plots to determine whether model assumptions are valid or whether modification may be appropriate (see Figure 13.12). There is also a formal test of model utility, an F test based on the ANOVA sums of squares. Since polynomial regression is a special case of multiple regression, we defer discussion of this test to the next section.