Chapter 15  2^k Factorial Experiments and Fractions

15.5 The Orthogonal Design

  In experimental situations where it is appropriate to fit models that are linear in the design variables and possibly involve interaction or product terms, there are advantages to be gained from the two-level orthogonal design, or orthogonal array. By an orthogonal design we mean one in which the columns of the X matrix are mutually orthogonal. For example, consider the X matrix for the 2² factorial of Example 15.2: all three of its columns are mutually orthogonal. The X matrix for the 2³ factorial also contains orthogonal columns. The 2³ factorial with interactions would yield an X matrix of the type used in Example 15.3 below.

  The outline of degrees of freedom is

  Regression (x₁, x₂, x₃)                    3
  Lack of fit (x₁x₂, x₁x₃, x₂x₃, x₁x₂x₃)     4
  Error (pure)                               8
  Total                                     15

  The 8 degrees of freedom for pure error are obtained from the duplicate runs at each design point. Lack-of-fit degrees of freedom may be viewed as the difference between the number of distinct design points and the number of total model terms; in this case, there are 8 points and 4 model terms.
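This bookkeeping is easy to verify with a short script. The sketch below (a hypothetical check, assuming the duplicated 2³ design with a main-effects-only model described above) tallies the same degrees of freedom:

```python
# Degrees-of-freedom bookkeeping for a duplicated 2^3 factorial
# fitted with main effects only (intercept + x1 + x2 + x3).
k, n = 3, 2                       # number of factors, replicates per point
distinct_points = 2 ** k          # 8 distinct design points
total_runs = distinct_points * n  # 16 observations in all
model_terms = 1 + k               # intercept plus three main effects

pure_error_df = distinct_points * (n - 1)       # 8, from duplicate runs
lack_of_fit_df = distinct_points - model_terms  # 8 - 4 = 4
total_df = total_runs - 1                       # 15

print(pure_error_df, lack_of_fit_df, total_df)  # 8 4 15
```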

  Standard Errors of Coefficients and t-Tests

  In previous sections, we showed how the designer of an experiment may exploit the notion of orthogonality to design a regression experiment whose coefficient estimates attain minimum variance on a per-cost basis. We should be able to make use of our exposure to regression in Section 12.4 to compute estimates of the variances of the coefficients and hence their standard errors. It is also of interest to note the relationship between the t-statistic on a coefficient and the F-statistic described and illustrated in previous chapters.

  Recall from Section 12.4 that the variances and covariances of the coefficients appear in A⁻¹ or, in terms of present notation, the variance-covariance matrix of the coefficients is

σ²A⁻¹ = σ²(X′X)⁻¹.

  In the case of the 2^k factorial experiment, the columns of X are mutually orthogonal, imposing a very special structure. In general, for the 2^k we can write

X = [1   ±1   ±1   ···   ±1],

where each column contains 2^k (or 2^k·n) entries, n being the number of replicate runs at each design point. Thus, formation of X′X yields

X′X = 2^k n I_p,

where I_p is the identity matrix of dimension p, the number of model parameters.
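A quick numerical check of this identity, sketched in Python with NumPy (assuming a 2³ design with n = 2 replicates and all two-factor interaction columns, so p = 7):

```python
import itertools

import numpy as np

# Model matrix X for a duplicated 2^3 factorial with columns
# 1, x1, x2, x3, x1x2, x1x3, x2x3  (p = 7 parameters).
k, n = 3, 2
rows = []
for x1, x2, x3 in itertools.product([-1, 1], repeat=k):
    row = [1, x1, x2, x3, x1 * x2, x1 * x3, x2 * x3]
    rows.extend([row] * n)  # each design point is run n times
X = np.array(rows)

XtX = X.T @ X
# Orthogonality forces X'X = 2^k * n * I_p = 16 * I_7.
assert np.array_equal(XtX, 2 ** k * n * np.eye(7, dtype=int))
```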

Example 15.3: Consider a 2³ factorial design with duplicated runs fitted to the model

E(Y) = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₁₂x₁x₂ + β₁₃x₁x₃ + β₂₃x₂x₃.

Give expressions for the standard errors of the least squares estimates b₀, b₁, b₂, b₃, b₁₂, b₁₃, and b₂₃.

Solution: The X matrix consists of a column of 1s followed by the ±1 columns for x₁, x₂, x₃, x₁x₂, x₁x₃, and x₂x₃, with each row viewed as being repeated (i.e., each observation is duplicated). As a result,

X′X = 2³(2)I₇ = 16I₇.

  From the foregoing it should be clear that the variances of all coefficients for a 2^k factorial with n runs at each design point are

Var(b_j) = σ²/(2^k n),

and, of course, all covariances are zero. As a result, standard errors of coefficients are calculated as

s_{b_j} = s·√(1/(2^k n)),

where s is the square root of the mean square error (hopefully obtained from adequate replication). Thus, in our case with the 2³,

s_{b_j} = s·√(1/16) = s/4.


Example 15.4: Consider the metallurgy experiment in Exercise 15.3 on page 609. Suppose that the fitted model is

E(Y) = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ + β₁₂x₁x₂ + β₁₃x₁x₃ + β₁₄x₁x₄ + β₂₃x₂x₃ + β₂₄x₂x₄ + β₃₄x₃x₄.

What are the standard errors of the least squares regression coefficients?

Solution: Standard errors of all coefficients for the 2^k factorial are equal and are

s_{b_j} = s·√(1/(2^k n)),

which in this illustration (k = 4, n = 2) is

s_{b_j} = s·√(1/32).

In this case, the pure mean square error is given by s² = 2.46 (16 degrees of freedom). Thus,

s_{b_j} = √(2.46/32) = 0.28.
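The arithmetic can be replayed in a couple of lines (assuming, as in this example, a duplicated 2⁴ design, so 2^k·n = 32):

```python
import math

s_squared = 2.46  # pure mean square error (16 degrees of freedom)
k, n = 4, 2       # 2^4 factorial, each design point duplicated
s_bj = math.sqrt(s_squared / (2 ** k * n))  # s * sqrt(1/(2^k n))
print(round(s_bj, 2))  # 0.28
```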

  The standard errors of coefficients can be used to construct t-statistics on all coefficients. These t-values are related to the F-statistics in the analysis of variance. We have already demonstrated that an F-statistic on a coefficient, using the 2^k factorial, is

f = (contrast)²/(2^k n s²).

This is the form of the F-statistics on page 610 for the metallurgy experiment (Exercise 15.3). It is easy to verify that if we write

t = b_j / s_{b_j},

then

t² = (contrast)²/(2^k n s²) = f.

  As a result, the usual relationship holds between t-statistics on coefficients and the F-values. As we might expect, the only difference between the use of t and F in assessing significance lies in the fact that the t-statistic indicates the sign, or direction, of the effect of the coefficient.
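The identity t² = f can be confirmed numerically; the sketch below uses a made-up contrast value and error estimate (hypothetical numbers, not taken from the text):

```python
import math

# Hypothetical quantities for a duplicated 2^3 factorial.
k, n = 3, 2
contrast = 12.4  # signed sum of responses defining some effect
s = 1.7          # square root of the mean square error

b_j = contrast / (2 ** k * n)     # coefficient estimate from the contrast
s_bj = s / math.sqrt(2 ** k * n)  # its standard error
t = b_j / s_bj                    # t-statistic on the coefficient
f = contrast ** 2 / (2 ** k * n * s ** 2)  # F-statistic on the effect

assert math.isclose(t ** 2, f)  # the two tests are equivalent
```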

  It would appear that the 2^k factorial plan would handle many practical situations in which regression models are fitted. It can accommodate linear and interaction terms, providing optimal estimates of all coefficients (from a variance point of view). However, when k is large, the number of design points required is very large. Often, portions of the total design can be used and still allow orthogonality with all its advantages. These designs are discussed in Section 15.6.


A More Thorough Look at the Orthogonality Property in the 2^k Factorial

  We have learned that for the 2^k factorial all the information that is delivered to the analyst about the main effects and interactions is in the form of contrasts. These "2^k − 1 pieces of information" carry a single degree of freedom apiece, and they are independent of each other. In an analysis of variance, they manifest themselves as effects, whereas if a regression model is being constructed, the effects turn out to be regression coefficients, apart from a factor of 2. With either form of analysis, significance tests can be carried out, and the t-test for a given effect is numerically the same as that for the corresponding regression coefficient. In the case of ANOVA, variable screening and scientific interpretation of interactions are important, whereas in the case of a regression analysis, the model may be used to predict response and/or determine which factor-level combinations are optimum (e.g., maximize yield or maximize cleaning efficiency, as in Case Study 15.2).

  It turns out that the orthogonality property is important whether the analysis is to be ANOVA or regression. The orthogonality among the columns of X, the model matrix in, say, Example 15.3, provides special conditions that have an important impact on the variance of effects or regression coefficients. In fact, it has already become apparent that the orthogonal design results in equality of variance for all effects or coefficients. Thus, in this way, the precision, for purposes of estimation or testing, is the same for all coefficients, main effects, or interactions. In addition, if the regression model contains only linear terms and thus only main effects are of interest, the following conditions result in the minimization of the variances of all effects (or, correspondingly, first-order regression coefficients).

Conditions for Minimum Variances of Coefficients: If the regression model contains terms no higher than first order, and if the ranges on the variables are given by x_j ∈ [−1, +1] for j = 1, 2, . . . , k, then Var(b_j)/σ², for j = 1, 2, . . . , k, is minimized if the design is orthogonal and all x_i levels in the design are at ±1 for i = 1, 2, . . . , k.

  Thus, in terms of coefficients of model terms or main effects, orthogonality in the 2^k is a very desirable property.

  Another approach to a better understanding of the "balance" provided by the 2³ is to look at the situation graphically. All of the contrasts that are orthogonal, and thus mutually independent, are shown graphically in Figure 15.10. In the graphs, the planes of the squares whose vertices contain the responses labeled "+" are compared to those containing the responses labeled "−." Those given in (a) show contrasts for main effects and should be obvious to the reader. Those in (b) show the planes representing "+" vertices and "−" vertices for the three two-factor interaction contrasts. In (c), we see the geometric representation of the contrast for the three-factor (ABC) interaction.

[Figure omitted: cube plots of the 2³ design showing the "+" and "−" runs for (a) the main effects A, B, C; (b) the two-factor interactions AB, AC, BC; and (c) the three-factor interaction ABC.]

  Figure 15.10: Geometric presentation of contrasts for the 2³ factorial design.

Center Runs with 2^k Designs

  In the situation in which the 2^k design is implemented with continuous design variables and one is seeking to fit a linear regression model, the use of replicated runs at the design center can be extremely useful. In fact, quite apart from the advantages discussed in what follows, a majority of scientists and engineers would consider center runs (i.e., runs at x_i = 0 for i = 1, 2, . . . , k) not only a reasonable practice but something that is intuitively appealing. In many areas of application of the 2^k design, the scientist desires to determine whether he or she might benefit from moving to a different region of interest in the factors. In many cases, the center (i.e., the point (0, 0, . . . , 0) in the coded factors) is either the current operating condition of the process or at least the condition considered "currently optimum." So it is often the case that the scientist will require data on the response at the center.

  Center Runs and Lack of Fit

  In addition to the intuitive appeal of augmenting the 2^k with center runs, a second advantage is enjoyed that relates to the kind of model that is fitted to the data. Consider, for example, the case with k = 2, illustrated in Figure 15.11.

[Figure omitted: the four factorial points at (±1, ±1) in A(x₁) and B(x₂), with replicated center runs at (0, 0).]

  Figure 15.11: A 2² design with center runs.

  It is clear that without the center runs the model terms are the intercept, x₁, x₂, and x₁x₂. These account for the four model degrees of freedom delivered by the four design points, apart from any replication. Since each factor has response information available only at the two locations {−1, +1}, no "pure" second-order curvature terms (i.e., x₁² or x₂²) can be accommodated in the model. But the information at (0, 0) produces an additional model degree of freedom. While this important degree of freedom does not allow both x₁² and x₂² to be used in the model, it does allow for testing the significance of a linear combination of x₁² and x₂². For n_c center runs, there are then n_c − 1 degrees of freedom available for replication, or "pure" error. This allows an estimate of σ² for testing both the model terms and the significance of the 1 d.f. for quadratic lack of fit. The concept here is very much like that discussed in the lack-of-fit material in Chapter 11.

  In order to gain a complete understanding of how the lack-of-fit test works, assume that for k = 2 the true model contains the full second-order complement of terms, including x₁² and x₂². In other words,

E(Y) = β₀ + β₁x₁ + β₂x₂ + β₁₂x₁x₂ + β₁₁x₁² + β₂₂x₂².


  Now, consider the contrast

ȳ_f − ȳ₀,

where ȳ_f is the average response at the factorial locations and ȳ₀ is the average response at the center point. It can be shown easily (see Review Exercise 15.46) that

E(ȳ_f − ȳ₀) = β₁₁ + β₂₂,

and, in fact, for the general case with k factors,

E(ȳ_f − ȳ₀) = β₁₁ + β₂₂ + ··· + β_kk.

As a result, the lack-of-fit test is a simple t-test (or F = t²), with

t_{n_c−1} = (ȳ_f − ȳ₀)/s_{ȳ_f−ȳ₀} = (ȳ_f − ȳ₀)/√(MSE(1/n_f + 1/n_c)),

where n_f is the number of factorial points and MSE is simply the sample variance of the response values at (0, 0, . . . , 0).
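A minimal sketch of this test as a function (the helper name and the data are hypothetical; only the formula comes from the discussion above):

```python
import math

def curvature_t(factorial_y, center_y):
    """t-statistic (n_c - 1 df) for quadratic lack of fit:
    (ybar_f - ybar_0) / sqrt(MSE * (1/n_f + 1/n_c)),
    with MSE the sample variance of the center responses."""
    n_f, n_c = len(factorial_y), len(center_y)
    ybar_f = sum(factorial_y) / n_f
    ybar_0 = sum(center_y) / n_c
    mse = sum((y - ybar_0) ** 2 for y in center_y) / (n_c - 1)
    return (ybar_f - ybar_0) / math.sqrt(mse * (1 / n_f + 1 / n_c))

# Made-up 2^2 experiment with five center runs:
t = curvature_t([39.0, 41.0, 40.2, 41.6], [40.3, 40.5, 40.6, 40.2, 40.4])
```

A |t| that is small relative to the t distribution with n_c − 1 degrees of freedom supports a first-order model.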

Example 15.5: This example is taken from Myers, Montgomery, and Anderson-Cook (2009). A chemical engineer is attempting to model the percent conversion in a process. There are two variables of interest, reaction time and reaction temperature. In an attempt to arrive at the appropriate model, a preliminary experiment was conducted in a 2² factorial using the current region of interest in reaction time and temperature. Single runs were made at each of the four factorial points, and five runs were made at the design center in order that a lack-of-fit test for curvature could be conducted. Figure 15.12 shows the design region and the experimental runs on yield.

  The time and temperature readings at the center are, of course, 35 minutes and 145°C. The estimates of the main effects and the single interaction coefficient are computed through contrasts, just as before. The center runs play no role in the computation of b₁, b₂, and b₁₂. This should be intuitively reasonable to the reader. The intercept is merely ȳ for the entire experiment; this value is ȳ = 40.4444. The standard errors are found through the use of the diagonal elements of (X′X)⁻¹, as discussed earlier. For this case,

            x₁   x₂  x₁x₂
      ⎡ 1   −1   −1    1 ⎤
      ⎢ 1    1   −1   −1 ⎥
      ⎢ 1   −1    1   −1 ⎥
X =   ⎢ 1    1    1    1 ⎥
      ⎢ 1    0    0    0 ⎥
      ⎢ 1    0    0    0 ⎥
      ⎢ 1    0    0    0 ⎥
      ⎢ 1    0    0    0 ⎥
      ⎣ 1    0    0    0 ⎦


[Figure omitted: design region in reaction time and temperature.]

  Figure 15.12: 2² factorial with 5 center runs.

  After the computations, we have

b₀ = 40.4444,  b₁₂ = −0.0250,
s_{b₀} = 0.06231,  s_{b₁} = s_{b₂} = s_{b₁₂} = 0.09347,
t_{b₀} = 649.07,  t_{b₁} = 8.29,  t_{b₂} = 3.48,  t_{b₁₂} = −0.27 (P = 0.800).

  The contrast ȳ_f − ȳ₀ = 40.425 − 40.46 = −0.035, and the t-statistic that tests for curvature is given by

t = (ȳ_f − ȳ₀)/√(MSE(1/n_f + 1/n_c)) = −0.035/√(MSE(1/4 + 1/5)),

which is clearly insignificant. As a result, it appears that the appropriate model should contain only first-order terms (apart from the intercept).

  An Intuitive Look at the Test on Curvature

  If one considers the simple case of a single design variable with runs at −1 and +1, it should seem clear that the average response at −1 and +1 should be close to the response at 0, the center, if the model is first order in nature. Any deviation would certainly suggest curvature. This is simple to extend to two variables. Consider Figure 15.13.

  The figure shows the plane on y that passes through the y values at the factorial points. This is the plane that would represent a perfect fit for the model containing x₁, x₂, and x₁x₂. If the model contains no quadratic curvature (i.e., β₁₁ = β₂₂ = 0), we would expect the response at (0, 0) to be at or near the plane. If the response is far from the plane, as in the case of Figure 15.13, then it can be seen graphically that quadratic curvature is present.

[Figure omitted: the plane through the factorial responses, with the center-run responses at (0, 0) lying off the plane.]

  Figure 15.13: 2² factorial with runs at (0, 0).

Exercises

15.13 Consider a 2⁵ experiment where the experimental runs are made on 4 different machines. Use the machines as blocks, and assume that all main effects and two-factor interactions may be important.
(a) Which runs would be made on each of the 4 machines?
(b) Which effects are confounded with blocks?

15.14 An experiment is described in Myers, Montgomery, and Anderson-Cook (2009) in which optimum conditions are sought for storing bovine semen to obtain maximum survival. The variables are percent sodium citrate (x₁), percent glycerol (x₂), and equilibration time in hours (x₃). The response is percent survival of the motile spermatozoa. The natural levels are found in the above reference. The data, with coded levels for the factorial portion of the design and the center runs, are given.
(a) Fit a linear regression model to the data and determine which linear and interaction terms are significant. Assume that the x₁x₂x₃ interaction is negligible.
(b) Test for quadratic lack of fit and comment.

15.15 Oil producers are interested in nickel alloys that are strong and corrosion resistant. An experiment was conducted in which yield strengths were compared for nickel alloy tensile specimens charged in solutions of sulfuric acid saturated with carbon disulfide. Two alloys were compared: a 75% nickel composition and a 30% nickel composition. The alloys were tested under two different charging times, 25 and 50 days. A 2³ factorial was conducted with the following factors:

sulfuric acid: 4%, 6% (x₁)
charging time: 25 days, 50 days (x₂)
nickel composition: 30%, 75% (x₃)

A specimen was prepared for each of the 8 conditions. Since the engineers were not certain of the nature of the model (i.e., whether or not quadratic terms would be needed), a third (middle) level was incorporated, and 4 center runs were employed using 4 specimens at 5% sulfuric acid, 37.5 days, and 52.5% nickel composition. The following are the yield strengths in kilograms per square inch.

                   Charging Time
              25 Days          50 Days
Nickel      Sulfuric Acid    Sulfuric Acid
Comp.        4%      6%       4%      6%
75          52.5    56.5     47.9    47.2
30          50.2    50.8     47.4    41.7

The center runs gave the following strengths:

(a) Test to determine which main effects and interactions should be involved in the fitted model.
(b) Test for quadratic curvature.
(c) If quadratic curvature is significant, how many additional design points are needed to determine which quadratic terms should be included in the model?

15.16 Suppose a second replicate of the experiment in Exercise 15.13 could be performed.
(a) Would a second replication of the blocking scheme of Exercise 15.13 be the best choice?
(b) If the answer to part (a) is no, give the layout for a better choice for the second replicate.
(c) What concept did you use in your design selection?

15.17 Consider Figure 15.14, which represents a 2² factorial with 3 center runs. If quadratic curvature is significant, what additional design points would you select that might allow the estimation of the terms x₁², x₂²? Explain.

  Figure 15.14: Graph for Exercise 15.17.