8. Use stepwise regression and other model building techniques to select the appropriate set of variables for a regression model
12-1 MULTIPLE LINEAR REGRESSION MODEL
12-1.1 Introduction
Many applications of regression analysis involve situations in which there is more than one regressor or predictor variable. A regression model that contains more than one regressor variable is called a multiple regression model.
As an example, suppose that the effective life of a cutting tool depends on the cutting speed and the tool angle. A multiple regression model that might describe this relationship is
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon \tag{12-1}$$
where $Y$ represents the tool life, $x_1$ represents the cutting speed, $x_2$ represents the tool angle, and $\epsilon$ is a random error term. This is a multiple linear regression model with two regressors. The term linear is used because Equation 12-1 is a linear function of the unknown parameters $\beta_0$, $\beta_1$, and $\beta_2$.
Figure 12-1 (a) The regression plane for the model $E(Y) = 50 + 10x_1 + 7x_2$. (b) The contour plot.
The regression model in Equation 12-1 describes a plane in the three-dimensional space of $Y$, $x_1$, and $x_2$. Figure 12-1(a) shows this plane for the regression model
$$E(Y) = 50 + 10x_1 + 7x_2$$
where we have assumed that the expected value of the error term is zero; that is, $E(\epsilon) = 0$. The parameter $\beta_0$ is the intercept of the plane. We sometimes call $\beta_1$ and $\beta_2$ partial regression coefficients, because $\beta_1$ measures the expected change in $Y$ per unit change in $x_1$ when $x_2$ is held constant, and $\beta_2$ measures the expected change in $Y$ per unit change in $x_2$ when $x_1$ is held constant. Figure 12-1(b) shows a contour plot of the regression model, that is, lines of constant $E(Y)$ as a function of $x_1$ and $x_2$. Notice that the contour lines in this plot are straight lines.
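The interpretation of the partial regression coefficients can be checked numerically. The following is a minimal sketch in Python with NumPy, using the example plane $E(Y) = 50 + 10x_1 + 7x_2$; the function name `expected_life`, the evaluation points, and the contour level of 120 are illustrative choices, not part of the text.

```python
import numpy as np

def expected_life(x1, x2):
    """Mean response E(Y) = 50 + 10*x1 + 7*x2 from the example model."""
    return 50 + 10 * x1 + 7 * x2

# A one-unit change in x1 with x2 held fixed shifts E(Y) by beta_1 = 10,
# whichever value of x2 is held constant.
effect_x1 = [expected_life(3, x2) - expected_life(2, x2) for x2 in (0, 5, 10)]

# A one-unit change in x2 with x1 held fixed shifts E(Y) by beta_2 = 7.
effect_x2 = [expected_life(x1, 4) - expected_life(x1, 3) for x1 in (0, 5, 10)]

# A contour E(Y) = c solves to x2 = (c - 50 - 10*x1) / 7, which is a
# straight line in x1 with constant slope -10/7.
x1_grid = np.linspace(0, 10, 5)
contour_120 = (120 - 50 - 10 * x1_grid) / 7
slopes = np.diff(contour_120) / np.diff(x1_grid)

print(effect_x1, effect_x2, slopes)
```

Because the model has no cross-product term, the two effects do not depend on where the other regressor is held, which is why the contours are parallel straight lines.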
In general, the dependent variable or response Y may be related to k independent or regressor variables. The model
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon \tag{12-2}$$
is called a multiple linear regression model with $k$ regressor variables. The parameters $\beta_j$, $j = 0, 1, \ldots, k$, are called the regression coefficients. This model describes a hyperplane in the $k$-dimensional space of the regressor variables $\{x_j\}$. The parameter $\beta_j$ represents the expected change in response $Y$ per unit change in $x_j$ when all the remaining regressors $x_i$ ($i \neq j$) are held constant.
Multiple linear regression models are often used as approximating functions. That is, the
true functional relationship between $Y$ and $x_1, x_2, \ldots, x_k$ is unknown, but over certain ranges
of the independent variables the linear regression model is an adequate approximation.
Models that are more complex in structure than Equation 12-2 may often still be analyzed by multiple linear regression techniques. For example, consider the cubic polynomial model in one regressor variable.
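$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \epsilon \tag{12-3}$$
If we let $x_1 = x$, $x_2 = x^2$, and $x_3 = x^3$, Equation 12-3 can be written as
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon \tag{12-4}$$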
which is a multiple linear regression model with three regressor variables.
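The substitution above can be sketched in Python with NumPy. The example builds the three regressor columns $x$, $x^2$, and $x^3$ from a single variable and then estimates the coefficients by ordinary least squares via `numpy.linalg.lstsq`. The coefficient values, noise level, and sample size are hypothetical, chosen only for illustration, and least squares is assumed here as the fitting method, since this passage has not yet introduced one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" coefficients (beta_0, beta_1, beta_2, beta_3).
beta_true = np.array([1.0, -2.0, 0.5, 0.3])

# One original regressor x, expanded into x1 = x, x2 = x^2, x3 = x^3.
x = np.linspace(-2, 2, 50)
X = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Simulated responses from the cubic model with a small random error term.
y = X @ beta_true + rng.normal(scale=0.1, size=x.size)

# The model is linear in the betas, so a linear least squares solve applies
# even though the relationship between Y and x is cubic.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```

The point of the sketch is that nothing in the fitting step knows the columns came from powers of the same variable; the model is treated as an ordinary three-regressor problem.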
Models that include interaction effects may also be analyzed by multiple linear regression methods. An interaction between two variables can be represented by a cross-product term in the model, such as
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \epsilon \tag{12-5}$$
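To see what the cross-product term does, the following Python sketch evaluates the mean response of a model of this form. The coefficient values (50, 10, 7, and 5) and the function name `mean_response` are assumptions made for illustration only. With an interaction present, the expected change in $Y$ per unit change in $x_1$ is $\beta_1 + \beta_{12} x_2$, so it depends on the level at which $x_2$ is held.

```python
def mean_response(x1, x2, b0=50.0, b1=10.0, b2=7.0, b12=5.0):
    """E(Y) for a two-regressor model with a cross-product (interaction) term.

    The default coefficient values are hypothetical.
    """
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2

# Effect of a one-unit change in x1, evaluated at several fixed x2 levels.
# Each entry equals b1 + b12 * x2, so the effect grows with x2.
effects = {x2: mean_response(1, x2) - mean_response(0, x2) for x2 in (0, 1, 2)}
print(effects)
```

Setting `b12=0.0` makes every entry equal to `b1`, which recovers the no-interaction behavior of a plane.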