Factorial Experiments in a Regression Setting

15.4 Factorial Experiments in a Regression Setting

Thus far in this chapter, we have mostly confined our discussion of the analysis of data from a $2^k$ factorial to the method of analysis of variance. The only reference to an alternative analysis resides in Exercise 15.9. Indeed, this exercise introduces much of what motivates the present section. There are situations in which model fitting is important and the factors under study can be controlled. For example, a biologist may wish to study the growth of a certain type of algae in water, and so a model that expresses units of algae as a function of the amount of a pollutant and, say, time would be very helpful. Thus, the study involves a factorial experiment in a laboratory setting in which concentration of the pollutant and time are the factors. As we shall discuss later in this section, a more precise model can be fitted if the factors are controlled in a factorial array, with the $2^k$ factorial often being a useful choice. In many biological and chemical processes, the levels of the regressor variables can and should be controlled.

Recall that the regression model employed in Chapter 12 can be written in matrix notation as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}.$$

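The X matrix is referred to as the model matrix. Suppose, for example, that a $2^3$ factorial experiment is employed with the variables temperature, humidity, and pressure, with pressure (psi) run at 1000 and 1500.

The familiar +1, −1 levels can be generated through the following centering and scaling to design units, in which each factor is centered at the midpoint of its two levels and divided by half the distance between them:

$$x_1 = \frac{\text{temperature} - 175}{\text{half-range of temperature}}, \qquad x_2 = \frac{\text{humidity} - 17.5}{\text{half-range of humidity}}, \qquad x_3 = \frac{\text{pressure} - 1250}{250}.$$

As a result, the X matrix becomes

$$
\begin{array}{c|rrrr}
\text{Design identification} & 1 & x_1 & x_2 & x_3 \\ \hline
(1) & 1 & -1 & -1 & -1 \\
a   & 1 &  1 & -1 & -1 \\
b   & 1 & -1 &  1 & -1 \\
ab  & 1 &  1 &  1 & -1 \\
c   & 1 & -1 & -1 &  1 \\
ac  & 1 &  1 & -1 &  1 \\
bc  & 1 & -1 &  1 &  1 \\
abc & 1 &  1 &  1 &  1
\end{array}
$$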

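It is now seen that the contrasts illustrated and discussed in Section 15.2 are directly related to regression coefficients. Notice that all the columns of the X matrix in our $2^3$ example are orthogonal. As a result, the computation of regression coefficients as described in Section 12.3 becomes

$$
\mathbf{b} = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}
= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
= \frac{1}{8}\,\mathbf{X}'\mathbf{y}
= \frac{1}{8}
\begin{bmatrix}
(1) + a + b + ab + c + ac + bc + abc \\
a + ab + ac + abc - (1) - b - c - bc \\
b + ab + bc + abc - (1) - a - c - ac \\
c + ac + bc + abc - (1) - a - b - ab
\end{bmatrix},
$$

where a, ab, and so on, are response measures.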
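The simplification from $(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$ to contrasts divided by 8 can be checked numerically. The following is a minimal sketch, assuming NumPy is available and one run per design point; the response values are hypothetical and serve only to compare the two computations.

```python
import itertools
import numpy as np

# Build the 2^3 model matrix with runs ordered (1), a, b, ab, c, ac, bc, abc.
# itertools.product varies its last element fastest, so each tuple is
# reversed to make x1 the fastest-changing factor.
runs = [combo[::-1] for combo in itertools.product([-1, 1], repeat=3)]
X = np.column_stack([np.ones(8), np.array(runs)])

# Orthogonal +1/-1 columns give X'X = 8 I.
XtX = X.T @ X

# Hypothetical responses (illustration only, not data from the text).
y = np.array([10.0, 14.0, 11.0, 17.0, 9.0, 15.0, 12.0, 20.0])

# Coefficients as contrasts divided by 8 ...
b_contrast = X.T @ y / 8

# ... and as a general least squares solution, for comparison.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b_contrast)
print(b_lstsq)
```

Because the columns are orthogonal, no matrix inversion is needed: each coefficient depends only on its own column of X.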

One can now see that the notion of calculated main effects, which has been emphasized throughout this chapter with $2^k$ factorials, is related to coefficients in a fitted regression model when factors are quantitative. In fact, for a $2^k$ with, say, n experimental runs per design point, the relationships between effects and regression coefficients are as follows:

$$\text{Effect} = \frac{\text{contrast}}{2^{k-1}(n)}, \qquad \text{Regression coefficient} = \frac{\text{contrast}}{2^{k}(n)} = \frac{\text{effect}}{2}.$$

This relationship should make sense to the reader, since a regression coefficient $b_j$ is an average rate of change in response per unit change in $x_j$. Of course, as one goes from −1 to +1 in $x_j$ (low to high), the design variable changes by 2 units.
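The factor of 2 can also be traced through the least squares formula. The following is a brief sketch, assuming ±1 coding, n runs at each of the $2^k$ design points, and contrasts computed from treatment totals; the subscript j on the contrast and effect is notation introduced here for illustration.

```latex
\begin{aligned}
\mathbf{X}'\mathbf{X} &= 2^{k} n \,\mathbf{I}, \\
b_j &= \frac{(\mathbf{X}'\mathbf{y})_j}{2^{k} n}
     = \frac{\text{contrast}_j}{2^{k} n}
     = \frac{1}{2}\cdot\frac{\text{contrast}_j}{2^{k-1} n}
     = \frac{\text{effect}_j}{2}.
\end{aligned}
```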

Example 15.2: Consider an experiment where an engineer desires to fit a linear regression of yield y against holding time $x_1$ and flexing time $x_2$ in a certain chemical system. All other factors are held fixed. The data in the natural units are given in Table 15.8. Estimate the multiple linear regression model.

Solution: The fitted regression model is

$$\hat{y} = b_0 + b_1x_1 + b_2x_2.$$


Table 15.8: Data for Example 15.2

Holding Time (hr)   Flexing Time (hr)   Yield (%)
0.5                 0.10                28
0.8                 0.10                39
0.5                 0.20                32
0.8                 0.20                46

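The design units are

$$x_1 = \frac{\text{holding time} - 0.65}{0.15}, \qquad x_2 = \frac{\text{flexing time} - 0.15}{0.05},$$

and the X matrix is

$$
\begin{array}{c|rrr}
\text{Design identification} & 1 & x_1 & x_2 \\ \hline
(1) & 1 & -1 & -1 \\
a   & 1 &  1 & -1 \\
b   & 1 & -1 &  1 \\
ab  & 1 &  1 &  1
\end{array}
$$

with the regression coefficients

$$
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}
= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
= \begin{bmatrix}
\dfrac{(1) + a + b + ab}{4} \\[2ex]
\dfrac{a + ab - (1) - b}{4} \\[2ex]
\dfrac{b + ab - (1) - a}{4}
\end{bmatrix}
= \begin{bmatrix} 36.25 \\ 6.25 \\ 2.75 \end{bmatrix}.
$$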

Thus, the least squares regression equation is

$$\hat{y} = 36.25 + 6.25x_1 + 2.75x_2.$$

This example provides an illustration of the use of the two-level factorial experiment in a regression setting. The four experimental runs in the $2^2$ design were used to calculate a regression equation, with the obvious interpretation of the regression coefficients. The value $b_1 = 6.25$ represents the estimated increase in response (percent yield) per design unit change (0.15 hour) in holding time. The value $b_2 = 2.75$ represents the corresponding increase per design unit change (0.05 hour) in flexing time.
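The fit in Example 15.2 can be reproduced in a few lines. This is a minimal sketch, assuming NumPy is available; the two levels of each factor follow from the centering and scaling constants (0.65 ± 0.15 and 0.15 ± 0.05), and the four yields (28, 39, 32, 46 for the runs (1), a, b, ab) are the values implied by the coefficients reported in this section. Variable names are illustrative.

```python
import numpy as np

# Natural-unit settings for the runs (1), a, b, ab.
holding = np.array([0.5, 0.8, 0.5, 0.8])      # hours
flexing = np.array([0.10, 0.10, 0.20, 0.20])  # hours

# Yields implied by the reported coefficients (an assumption, see lead-in).
y = np.array([28.0, 39.0, 32.0, 46.0])

# Center and scale to design units (-1, +1).
x1 = (holding - 0.65) / 0.15
x2 = (flexing - 0.15) / 0.05

# Least squares via the normal equations.
X = np.column_stack([np.ones(4), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)

# Slopes re-expressed per hour rather than per design unit.
slope_per_hour = b[1:] / np.array([0.15, 0.05])

print(b)
print(slope_per_hour)
```

Dividing each coefficient by the half-range of its factor converts it from a rate per design unit to a rate per natural unit, which may be easier to communicate to an experimenter.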

Interaction in the Regression Model

The interaction contrasts discussed in Section 15.2 have definite interpretations in the regression context. In fact, interactions are accounted for in regression models by product terms. For example, in Example 15.2, the model with interaction is

$$\hat{y} = b_0 + b_1x_1 + b_2x_2 + b_{12}x_1x_2$$

with $b_0$, $b_1$, $b_2$ as before and

$$b_{12} = \frac{ab + (1) - a - b}{4} = 0.75.$$

Thus, the regression equation expressing two linear main effects and interaction is

$$\hat{y} = 36.25 + 6.25x_1 + 2.75x_2 + 0.75x_1x_2.$$
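The product term enters the model matrix as one more orthogonal column. The sketch below assumes NumPy and uses the four yields implied by the reported coefficients (28, 39, 32, 46 for (1), a, b, ab); it is an illustration, not the text's own computation.

```python
import numpy as np

# Design-unit settings for the runs (1), a, b, ab.
x1 = np.array([-1.0, 1.0, -1.0, 1.0])
x2 = np.array([-1.0, -1.0, 1.0, 1.0])

# Yields implied by the reported coefficients (an assumption, see lead-in).
y = np.array([28.0, 39.0, 32.0, 46.0])

# Model matrix with intercept, two main effects, and the interaction column.
X = np.column_stack([np.ones(4), x1, x2, x1 * x2])

# The columns remain orthogonal, so each coefficient is a contrast over 4.
b = X.T @ y / 4

# Fitted values at the four design points.
y_hat = X @ b

print(b)
print(y_hat)
```

With four runs and four coefficients the model is saturated, so the fitted values reproduce the observations exactly and no degrees of freedom remain for estimating error.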

The regression context provides a framework in which the reader should better understand the advantage of orthogonality that is enjoyed by the $2^k$ factorial. In Section 15.2, the merits of orthogonality were discussed from the point of view of analysis of variance of the data in a $2^k$ factorial experiment. It was pointed out that orthogonality among effects leads to independence among the sums of squares. The presence of regression variables certainly does not rule out the use of analysis of variance. In fact, f-tests are conducted just as they were described in Section 15.2. A distinction must be made, however. In the case of ANOVA, the hypotheses evolve from population means, while in the regression case, the hypotheses involve regression coefficients.
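One consequence of orthogonality is that the single-degree-of-freedom sums of squares add up to the corrected total sum of squares when every effect is included. The sketch below illustrates this for the $2^2$ data of Example 15.2, assuming NumPy, the usual formula contrast² / ($2^k n$) for a contrast sum of squares, and the four yields implied by the reported coefficients.

```python
import numpy as np

# Yields for the runs (1), a, b, ab (implied by the reported coefficients).
y = np.array([28.0, 39.0, 32.0, 46.0])

# Sign patterns of the three orthogonal contrasts in a 2^2 design.
signs = {
    "A": [-1, 1, -1, 1],
    "B": [-1, -1, 1, 1],
    "AB": [1, -1, -1, 1],
}

k, n = 2, 1  # two factors, one run per design point

# Sum of squares for each contrast: contrast^2 / (2^k * n).
ss = {name: float(np.dot(s, y)) ** 2 / (2**k * n) for name, s in signs.items()}

# Corrected total sum of squares.
ss_total = float(np.sum((y - y.mean()) ** 2))

print(ss)
print(ss_total)
```

Each effect's sum of squares is unaffected by whether the other terms are in the model, which is what allows the contributions to be tested separately.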

For instance, consider the experimental design in Exercise 15.2 on page 609. Each factor is continuous. Suppose that the levels are

A ($x_1$): 20% 40%

B ($x_2$): 5 lb/sec 10 lb/sec

C ($x_3$): 5 5.5

and we have, for design levels,

$$x_1 = \frac{\%\ \text{solids} - 30}{10}, \qquad x_2 = \frac{\text{flow rate} - 7.5}{2.5}, \qquad x_3 = \frac{\text{pH} - 5.25}{0.25}.$$

Suppose that it is of interest to fit a multiple regression model in which all linear coefficients and available interactions are to be considered. In addition, the engineer wants to obtain some insight into what levels of the factors will maximize cleansing (i.e., maximize the response). This problem will be the subject of Case Study 15.2.
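The coding from natural units to design levels is the same for every factor: subtract the midpoint of the two levels and divide by half their distance. A minimal sketch follows; the function name `to_design_units` and the dictionary keys are hypothetical, and the level pairs are the ones implied by the centers (30, 7.5, 5.25) and half-ranges (10, 2.5, 0.25) above.

```python
def to_design_units(value, low, high):
    """Map a natural-unit setting onto the -1..+1 design scale."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range


# Low and high levels for the three coal cleansing factors.
levels = {
    "percent_solids": (20.0, 40.0),  # A, x1
    "flow_rate": (5.0, 10.0),        # B, x2, lb/sec
    "pH": (5.0, 5.5),                # C, x3
}

# Coded values at the low and high setting of each factor.
coded = {
    name: (to_design_units(lo, lo, hi), to_design_units(hi, lo, hi))
    for name, (lo, hi) in levels.items()
}

print(coded)
```

The same function also places intermediate settings on the design scale; for example, a flow rate of 7.5 lb/sec maps to 0, the center of the design.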

Case Study 15.2: Coal Cleansing Experiment¹: Figure 15.9 represents annotated computer printout for the regression analysis for the fitted model

$$\hat{y} = b_0 + b_1x_1 + b_2x_2 + b_3x_3 + b_{12}x_1x_2 + b_{13}x_1x_3 + b_{23}x_2x_3 + b_{123}x_1x_2x_3,$$

where $x_1$, $x_2$, and $x_3$ are percent solids, flow rate, and pH of the system, respectively. The computer system used is SAS PROC REG. Note the parameter estimates, standard errors, and P-values in the printout. The parameter estimates represent coefficients in the model. All model coefficients are significant except the $x_2x_3$ term (BC interaction). Note also that residuals, confidence intervals, and prediction intervals appear as discussed in the regression material in Chapters 11 and 12.

The reader can use the values of the model coefficients and predicted values from the printout to ascertain what combination of the factors results in maximum cleansing efficiency. Factor A (percent solids circulated) has a large positive coefficient, suggesting a high value for percent solids. In addition, a low value for factor C (pH of the tank) is suggested. Though the B main effect (flow rate of the polymer) coefficient is positive, the rather large positive coefficient of $x_1x_2x_3$ (ABC) suggests that flow rate should be at the low level to enhance efficiency. Indeed, the regression model generated in the SAS printout suggests that the combination of factors that may produce optimum results, or perhaps suggest direction for further experimentation, is given by

A: high level

B: low level

C: low level

Dependent Variable: Y

Analysis of Variance
  Source             DF   Sum of Squares
  Corrected Total    15   492.43704

  Root MSE          0.52465
  Dependent Mean   12.75188
  Coeff Var         4.11429

Parameter Estimates
  Variable   DF   Parameter Estimate
  C           1   -1.41563
  AB          1   -0.59938
  AC          1   -0.52813

Output Statistics
  Obs   Dependent Variable
  1     4.6500

[Remaining rows and columns of the printout (mean squares, F and t values, P-values, standard errors, predicted values, confidence and prediction limits, residuals) are not reproduced.]

Figure 15.9: SAS printout for data of Case Study 15.2.

¹ See Exercise 15.2.
