Example 13.7  The article “Residual Stresses and Adhesion of Thermal Spray Coatings” (Surface Engineering, 2005: 35–40) considered the relationship between the thickness (μm) of NiCrAl coatings deposited on stainless steel substrate and corresponding bond strength (MPa). The following data was read from a plot in the paper:
Thickness:  220   220   220   220   370   370   370   370   440   440
Strength:  24.0  22.0  19.1  15.5  26.3  24.6  23.1  21.2  25.2  24.0

Thickness:  440   440   680   680   680   680   860   860   860   860
Strength:  21.7  19.2  17.0  14.9  13.0  11.8  12.2  11.2   6.6   2.8

We will see in Section 13.4 that polynomial regression is a special case of multiple regression, so a command appropriate for this latter task is generally used.
The scatterplot in Figure 13.11(a) supports the choice of the quadratic regression model. Figure 13.11(b) contains Minitab output from a fit of this model. The estimated regression coefficients are

$\hat{\beta}_0 = 14.521 \qquad \hat{\beta}_1 = .04323 \qquad \hat{\beta}_2 = -.00006001$

from which the estimated regression function is

$\hat{y} = 14.521 + .04323x - .00006001x^2$

Substitution of the successive x values 220, 220, …, 860, and 860 into this function gives the predicted values $\hat{y}_1 = 21.128, \ldots, \hat{y}_{20} = 7.321$, and the residuals $y_1 - \hat{y}_1 = 2.872, \ldots, y_{20} - \hat{y}_{20} = -4.521$ result from subtraction. Figure 13.12 shows a plot of the standardized residuals versus $\hat{y}$ and also a normal probability plot of the standardized residuals, both of which validate the quadratic model.
The regression equation is
strength = 14.5 + 0.0432 thickness - 0.000060 thicksqd

Predictor       Coef          SE Coef        T        P
Constant        14.521        …              …        …
thickness       0.04323       …              …        …
thicksqd       -0.00006001    0.00001786    -3.36     0.004

S = 3.27    R-Sq = 78.0%    R-Sq(adj) = 75.4%

Analysis of Variance
Source           DF       SS       MS       F       P
Regression        2   643.29   321.65   30.09   0.000
Residual Error   17   181.71    10.69
Total            19   825.00

Predicted Values for New Observations
New Obs      Fit    SE Fit        95.0% CI          95.0% PI
      1   21.136     1.167    (18.67, 23.60)    (13.81, 28.46)
      2        …         …                 …                 …

Values of Predictors for New Observations
New Obs   thickness   thicksqd
      1         500     250000
      2         800     640000
Figure 13.11 Scatterplot of data from Example 13.7 and Minitab output from fit of quadratic model
[Figure 13.12 shows two panels: a normal probability plot of the standardized residuals (percent versus standardized residual) and a plot of the standardized residuals versus the fitted values.]
Figure 13.12 Diagnostic plots for quadratic model fit to data of Example 13.7 ■
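Although the text displays Minitab output, the same fit is easy to reproduce in other software. The following is a minimal Python sketch (ours, not part of the original example) using NumPy's polyfit; the variable names are our own.

```python
import numpy as np

# Bond-strength data of Example 13.7 (as read from the plot in the paper)
thickness = np.array([220, 220, 220, 220, 370, 370, 370, 370, 440, 440,
                      440, 440, 680, 680, 680, 680, 860, 860, 860, 860],
                     dtype=float)
strength = np.array([24.0, 22.0, 19.1, 15.5, 26.3, 24.6, 23.1, 21.2, 25.2, 24.0,
                     21.7, 19.2, 17.0, 14.9, 13.0, 11.8, 12.2, 11.2, 6.6, 2.8])

# Least-squares fit of y = b0 + b1*x + b2*x^2; polyfit returns the
# coefficients from the highest power down.
b2, b1, b0 = np.polyfit(thickness, strength, deg=2)
print(b0, b1, b2)              # approx. 14.521, .04323, -.00006001

# Predicted values and residuals
y_hat = b0 + b1 * thickness + b2 * thickness**2
print(y_hat[0], strength[0] - y_hat[0])    # approx. 21.128 and 2.872
```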
$\hat{\sigma}^2$ and $R^2$
To make further inferences, the error variance $\sigma^2$ must be estimated. With $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \cdots + \hat{\beta}_k x_i^k$, the ith residual is $y_i - \hat{y}_i$, and the sum of squared residuals (error sum of squares) is $SSE = \sum (y_i - \hat{y}_i)^2$. The estimate of $\sigma^2$ is then

$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k+1)}$$

where the denominator $n - (k+1)$ is used because $k+1$ df are lost in estimating $\beta_0, \beta_1, \ldots, \beta_k$. If we again let $SST = \sum (y_i - \bar{y})^2$, then $SSE/SST$ is the proportion of the total variation in the observed $y_i$'s that is not explained by the polynomial model. The quantity $1 - SSE/SST$, the proportion of variation explained by the model, is called the coefficient of multiple determination and is denoted by $R^2$.
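Continuing the Python sketch above, these quantities take one line each (here $n = 20$ and $k = 2$ for the bond-strength data):

```python
# Error and total sums of squares, and the resulting estimates
n, k = len(strength), 2
sse = np.sum((strength - y_hat) ** 2)            # approx. 181.71
sst = np.sum((strength - strength.mean()) ** 2)  # approx. 825.00
r_sq = 1 - sse / sst                             # approx. .780
sigma_sq_hat = sse / (n - (k + 1))               # approx. 10.69
```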
Consider fitting a cubic model to the data in Example 13.7. Because this model includes the quadratic as a special case, the fit will be at least as good as the fit to a quadratic. More generally, with $SSE_k$ = the error sum of squares from a kth-degree polynomial, $SSE_{k'} \le SSE_k$ and $R^2_{k'} \ge R^2_k$ whenever $k' > k$. Because the objective of regression analysis is to find a model that is both simple (relatively few parameters) and provides a good fit to the data, a higher-degree polynomial may not specify a better model than a lower-degree model despite its higher $R^2$ value. To balance the cost of using more parameters against the gain in $R^2$, many statisticians use the adjusted coefficient of multiple determination

$$\text{adjusted } R^2 = 1 - \frac{n-1}{n-(k+1)} \cdot \frac{SSE}{SST} = \frac{(n-1)R^2 - k}{n-1-k}$$
Adjusted $R^2$ adjusts the proportion of unexplained variation upward [since the ratio $(n-1)/(n-k-1)$ exceeds 1], which results in adjusted $R^2 < R^2$. For example, if $R^2_2 = .66$, $R^2_3 = .70$, and $n = 10$, then

$$\text{adjusted } R^2_2 = \frac{9(.66) - 2}{7} = .563 \qquad \text{adjusted } R^2_3 = \frac{9(.70) - 3}{6} = .550$$

Thus the small gain in $R^2$ in going from a quadratic to a cubic model is not enough to offset the cost of adding an extra parameter to the model.
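A small helper function (ours) makes the comparison explicit; the two calls below reproduce the arithmetic just shown.

```python
def adjusted_r_sq(r_sq: float, n: int, k: int) -> float:
    """Adjusted R^2 = ((n - 1) * R^2 - k) / (n - 1 - k)."""
    return ((n - 1) * r_sq - k) / (n - 1 - k)

print(adjusted_r_sq(.66, 10, 2))   # approx. .563 (quadratic)
print(adjusted_r_sq(.70, 10, 3))   # approx. .550 (cubic)
```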
Example 13.8 (Example 13.7 continued)  SSE and SST are typically found on computer output in an ANOVA table. Figure 13.11(b) gives $SSE = 181.71$ and $SST = 825.00$ for the bond strength data, from which $R^2 = 1 - 181.71/825.00 = .780$ (alternatively, $R^2 = SSR/SST = 643.29/825.00 = .780$). Thus 78.0% of the observed variation in bond strength can
be attributed to the model relationship. Adjusted $R^2 = .754$, only a small downward change in $R^2$. The estimates of $\sigma^2$ and $\sigma$ are

$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k+1)} = \frac{181.71}{17} = 10.69 \qquad \hat{\sigma} = s = 3.27 \qquad \blacksquare$$
Besides computing $R^2$ and adjusted $R^2$, one should examine the usual diagnostic plots to determine whether model assumptions are valid or whether modification may be appropriate (see Figure 13.12). There is also a formal test of model utility, an F test based on the ANOVA sums of squares. Since polynomial regression is a special case of multiple regression, we defer discussion of this test to the next section.
Statistical Intervals and Test Procedures
Because the $y_i$'s appear in the normal equations (13.10) only on the right-hand side and in a linear fashion, the resulting estimates $\hat{\beta}_0, \ldots, \hat{\beta}_k$ are themselves linear functions of the $y_i$'s. Thus the estimators are linear functions of the $Y_i$'s, so each $\hat{\beta}_i$ has a normal distribution. It can also be shown that each $\hat{\beta}_i$ is an unbiased estimator of $\beta_i$.
Let $\sigma_{\hat{\beta}_i}$ denote the standard deviation of the estimator $\hat{\beta}_i$. This standard deviation has the form

$$\sigma_{\hat{\beta}_i} = \sigma \cdot \left\{\text{a complicated expression involving all the } x_j\text{'s}, x_j^2\text{'s}, \ldots, \text{and } x_j^k\text{'s}\right\}$$

Fortunately, the expression in braces has been programmed into all of the most frequently used statistical software packages. The estimated standard deviation of $\hat{\beta}_i$ results from substituting $s$ in place of $\sigma$ in the expression for $\sigma_{\hat{\beta}_i}$. These estimated standard deviations $s_{\hat{\beta}_0}, s_{\hat{\beta}_1}, \ldots,$ and $s_{\hat{\beta}_k}$ appear in output from all the aforementioned statistical packages. Let $S_{\hat{\beta}_i}$ denote the estimator of $\sigma_{\hat{\beta}_i}$, that is, the random variable whose observed value is $s_{\hat{\beta}_i}$. Then it can be shown that the standardized variable

$$T = \frac{\hat{\beta}_i - \beta_i}{S_{\hat{\beta}_i}}$$

has a t distribution based on $n - (k+1)$ df. This leads to the following inferential procedures.
A $100(1-\alpha)\%$ CI for $\beta_i$, the coefficient of $x^i$ in the polynomial regression function, is

$$\hat{\beta}_i \pm t_{\alpha/2,\, n-(k+1)} \cdot s_{\hat{\beta}_i}$$

A test of $H_0\!: \beta_i = \beta_{i0}$ is based on the t statistic value

$$t = \frac{\hat{\beta}_i - \beta_{i0}}{s_{\hat{\beta}_i}}$$

The test is based on $n - (k+1)$ df and is upper-, lower-, or two-tailed according to whether the inequality in $H_a$ is $>$, $<$, or $\ne$.
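A short sketch (ours) of both procedures, using SciPy's t distribution; beta_hat and se_beta would come from software output such as the Coef and SE Coef columns in Figure 13.11(b).

```python
from scipy import stats

def coef_ci_and_test(beta_hat, se_beta, n, k, beta0=0.0, alpha=0.05):
    """CI for beta_i and t test of H0: beta_i = beta0; df = n - (k + 1)."""
    df = n - (k + 1)
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    ci = (beta_hat - t_crit * se_beta, beta_hat + t_crit * se_beta)
    t_stat = (beta_hat - beta0) / se_beta
    p_two_tailed = 2 * stats.t.cdf(-abs(t_stat), df)
    return ci, t_stat, p_two_tailed

# The quadratic coefficient from Example 13.7:
ci, t_stat, p = coef_ci_and_test(-.00006001, .00001786, n=20, k=2)
print(t_stat, p)                   # approx. -3.36 and .004
```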
A point estimate of $\mu_{Y \cdot x}$, that is, of $\beta_0 + \beta_1 x + \cdots + \beta_k x^k$, is $\hat{\mu}_{Y \cdot x} = \hat{\beta}_0 + \hat{\beta}_1 x + \cdots + \hat{\beta}_k x^k$. The estimated standard deviation of the corresponding estimator is rather complicated. Many computer packages will give this estimated standard
deviation for any x value upon request. This, along with an appropriate standardized t variable, can be used to justify the following procedures.
Let $x^*$ denote a specified value of x. A $100(1-\alpha)\%$ CI for $\mu_{Y \cdot x^*}$ is

$$\hat{\mu}_{Y \cdot x^*} \pm t_{\alpha/2,\, n-(k+1)} \cdot \left\{\text{estimated SD of } \hat{\mu}_{Y \cdot x^*}\right\}$$

With $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x^* + \cdots + \hat{\beta}_k (x^*)^k$, $\hat{y}$ denoting the calculated value of $\hat{Y}$ for the given data, and $s_{\hat{Y}}$ denoting the estimated standard deviation of the statistic $\hat{Y}$, the formula for the CI is much like the one in the case of simple linear regression:

$$\hat{y} \pm t_{\alpha/2,\, n-(k+1)} \cdot s_{\hat{Y}}$$

A $100(1-\alpha)\%$ PI for a future y value to be observed when $x = x^*$ is

$$\hat{\mu}_{Y \cdot x^*} \pm t_{\alpha/2,\, n-(k+1)} \cdot \left\{s^2 + \left(\text{estimated SD of } \hat{\mu}_{Y \cdot x^*}\right)^2\right\}^{1/2} = \hat{y} \pm t_{\alpha/2,\, n-(k+1)} \cdot \sqrt{s^2 + s_{\hat{Y}}^2}$$
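Both intervals depend only on the fitted value, its estimated standard deviation, and s, so they are easy to compute once software reports those quantities. A sketch (ours):

```python
import numpy as np
from scipy import stats

def mean_ci_and_pi(y_hat_star, se_fit, s, n, k, alpha=0.05):
    """CI for the mean response and PI for a single future y at x*.

    y_hat_star: fitted value at x*; se_fit: its estimated SD ("SE Fit");
    s: residual standard deviation; df = n - (k + 1).
    """
    t_crit = stats.t.ppf(1 - alpha / 2, n - (k + 1))
    ci = (y_hat_star - t_crit * se_fit, y_hat_star + t_crit * se_fit)
    half = t_crit * np.sqrt(s ** 2 + se_fit ** 2)
    pi = (y_hat_star - half, y_hat_star + half)
    return ci, pi
```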
Example 13.9 (Example 13.8 continued)  Figure 13.11(b) shows that $\hat{\beta}_2 = -.00006001$ and $s_{\hat{\beta}_2} = .00001786$ (from the SE Coef column at the top of the output). The null hypothesis $H_0\!: \beta_2 = 0$ says that as long as the linear predictor $x$ is retained in the model, the quadratic predictor $x^2$ provides no additional useful information. The relevant alternative is $H_a\!: \beta_2 \ne 0$, and the test statistic is $T = \hat{\beta}_2/S_{\hat{\beta}_2}$, with computed value $-3.36$. The test is based on $n - (k+1) = 17$ df. At significance level .05, the null hypothesis is rejected because the reported P-value is .004 (double the area under the $t_{17}$ curve to the left of $-3.36$). Thus inclusion of the quadratic predictor in the model equation is justified.
The output in Figure 13.11(b) also contains estimation and prediction information both for $x = 500$ and for $x = 800$. In particular, for $x = 500$,

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1(500) + \hat{\beta}_2(500)^2 = \text{Fit} = 21.136$$
$$s_{\hat{Y}} = \text{estimated SD of } \hat{Y} = \text{SE Fit} = 1.167$$

from which a 95% CI for mean strength when thickness = 500 is $21.136 \pm (2.110)(1.167) = (18.67, 23.60)$. A 95% PI for the strength resulting from a single bond when thickness = 500 is $21.136 \pm (2.110)[(3.27)^2 + (1.167)^2]^{1/2} = (13.81, 28.46)$. As before, the PI is substantially wider than the CI because $s$ is large compared to SE Fit. ■
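Using the mean_ci_and_pi helper sketched earlier with the values reported in Figure 13.11(b) reproduces these intervals:

```python
ci, pi = mean_ci_and_pi(y_hat_star=21.136, se_fit=1.167, s=3.27, n=20, k=2)
print(ci)   # approx. (18.67, 23.60)
print(pi)   # approx. (13.81, 28.46)
```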
Centering x Values
For the quadratic model with regression function $\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2$, the parameters $\beta_0$, $\beta_1$, and $\beta_2$ characterize the behavior of the function near $x = 0$. For example, $\beta_0$ is the height at which the regression function crosses the vertical axis $x = 0$, whereas $\beta_1$ is the first derivative of the function at $x = 0$ (instantaneous rate of change of $\mu_{Y \cdot x}$ at $x = 0$). If the $x_i$'s all lie far from 0, we may not have precise information about the values of these parameters. Let $\bar{x} =$ the average of the $x_i$'s for which observations are to be taken, and consider the model

$$Y = \beta_0 + \beta_1(x - \bar{x}) + \beta_2(x - \bar{x})^2 + \epsilon \tag{13.14}$$
In the model (13.14), $\mu_{Y \cdot x} = \beta_0 + \beta_1(x - \bar{x}) + \beta_2(x - \bar{x})^2$, and the parameters now describe the behavior of the regression function near the center $\bar{x}$ of the data.

To estimate the parameters of (13.14), we simply subtract $\bar{x}$ from each $x_i$ to obtain $x_i' = x_i - \bar{x}$ and then use the $x_i'$'s in place of the $x_i$'s. An important benefit of this is that the coefficients of $\beta_0, \ldots, \beta_k$ in the normal equations (13.10) will be of much smaller magnitude than would be the case were the original $x_i$'s used. When the system is solved by computer, this centering protects against any round-off error that may result.
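As a brief illustration (ours, continuing the earlier sketch and assuming the data as tabulated in Example 13.7), centering the thicknesses before fitting leaves the fitted curve unchanged but yields coefficients that describe the regression function near the mean thickness:

```python
# Center the predictor and refit the quadratic
x_bar = thickness.mean()
c2, c1, c0 = np.polyfit(thickness - x_bar, strength, deg=2)

# c0 is the estimated mean strength at x = x_bar and c1 the estimated
# slope there; the quadratic coefficient is unchanged by the shift.
print(c0, c1, c2)   # approx. 20.89, -.0185, -.00006001
```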