Two-way ANOVA

7.4 Two-way ANOVA

The independent-groups t-test (Section 6.3.1, p. 130) compares the group means for two different groups as indicated by two different values of the grouping variable, called a treatment variable in the context of an experiment. The one-way ANOVA (Section 7.2, p. 150) generalizes this analysis to a design also with a single grouping or treatment variable, but possibly with more than two groups. The design introduced here generalizes the analysis to two different treatment variables. This more comprehensive design, a factorial design, provides for the simultaneous study of the effects of two different variables on the response variable. Other names for these variables are independent variables and factors. The values of each grouping variable are its categories or levels.

factorial design: A design with two or more treatment variables.

7.4.1 An Example with Data

Scenario: Examine the effects of Dosage on task completion Time

To what extent does the level of arousal impact the ability to complete a task? To study this question, laboratory rats were randomly and equally divided into groups, and then given one of three dosages of an arousal-inducing drug: 0, 5, or 10 milligrams. Following the dosage, each rat completed either an easy or a hard maze to obtain a food reward. The response (dependent) variable is the Time in seconds to complete the maze.

To evaluate the influence of drug Dosage and task Difficulty, 48 laboratory rats were randomly assigned to the 6 combinations of Dosage and Difficulty, resulting in 8 rats for each cell.

cell: A specific combination of the values of the grouping variables in a study.

The first task is to read the data stored as a csv file.

> mydata <- Read("http://lessRstats.com/data/anova_2way.csv")

Read function, Section 2.2.1, p. 32

The first two and last two rows of this data table appear in Listing 7.17, for the variables Difficulty, Dosage, and Time.

Difficulty has levels of Easy and Hard. Dosage has values of mg00, mg05, and mg10. The data values for the Easy task Difficulty are the same as for the one-way ANOVA previously illustrated (Section 7.2, p. 150). The data for this two-way ANOVA add a Hard task Difficulty level, and so double the number of data values from 24 to 48.

Compare Multiple Samples 167

> mydata
   Difficulty Dosage Time
1        Easy   mg00 25.6
2        Easy   mg00 25.8
...
47       Hard   mg10 39.3
48       Hard   mg10 43.0

Listing 7.17 Beginning and last rows of data for the two-way ANOVA.

By coincidence, the levels of Difficulty, Easy and Hard, are in the correct order for the R output according to their alphabetical ordering. The levels of Dosage are in their correct order because the labels include the numerical amount of the dosage. In many situations, however, this order will need to be explicitly specified. To re-order the values of a categorical grouping variable to the desired order that they should appear on the R output, refer to the R factor function (order the categories of a variable, Section 3.3.2, p. 60).
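As a minimal sketch of explicit ordering with the base R factor function, consider a case in which alphabetical order is not the desired order. The variable arousal and its labels Low, Medium, and High are hypothetical, chosen only for illustration, and are not part of the maze data.

```r
# Hypothetical category labels (illustration only, not from the maze data)
arousal <- c("Medium", "High", "Low", "High", "Low", "Medium")

# Default behavior: levels are sorted alphabetically
levels(factor(arousal))        # "High" "Low" "Medium"

# The levels argument specifies the order used on subsequent R output
arousal_ordered <- factor(arousal, levels = c("Low", "Medium", "High"))
levels(arousal_ordered)        # "Low" "Medium" "High"

# Tables and plots now list the categories in the specified order
table(arousal_ordered)
```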

Each rat in the study has only one data value, the measured Time to complete the maze. The long form and wide form of this data table (long form data table, Section 7.2.2, p. 151) are the same. All the data for each participant appear in one row because there is only one data value per participant, so the analysis is ready to proceed.

7.4.2 Main Effects and Interaction Effects

There are several questions of interest in this study, each formally stated as a null hypothesis. The first such hypothesis is the statement of the equality of the mean completion time for each Dosage.

main effect for Dosage: $H_0: \mu_{\text{mg00}} = \mu_{\text{mg05}} = \mu_{\text{mg10}}$

Another null hypothesis is the equality of the mean response Time for each task Difficulty.

main effect for task Difficulty: $H_0: \mu_{\text{Easy}} = \mu_{\text{Hard}}$

The effect of each of the treatment variables or factors in the context of a factorial design is a main effect. A main effect occurs when the value of the response variable varies according to the levels of a treatment variable. For example, if there is a main effect of Dosage on task completion Time such that more arousal leads to improved completion Time, then average completion Time would decrease as Dosage increased.

main effect: Effect of a treatment variable on the response variable.

A factorial design provides analysis for a third potential effect, the interaction. For an intuitive illustration, consider two treatment variables or factors, peanut butter and jelly, with two levels each, present and absent. Four treatment combinations result: plain bread, a peanut butter sandwich, a jelly sandwich, and a peanut butter and jelly sandwich. The response variable is the perceived taste of the resulting food. Plain bread is dull, perhaps with a taste rating of 1. Peanut butter makes plain bread taste better by a certain amount, say by 10 units. Jelly makes plain bread taste better by a certain amount, say by 8 units.

interaction effect: The effect of one factor is different at different levels of another factor.

The combined taste of peanut butter and jelly, however, produces much better taste than the combined effects of peanut butter and jelly by themselves, which would be 10 + 8 = 18 units.


Peanut butter and jelly interact. Instead of 18 units, the taste of peanut butter and jelly is, say, 29 units. The interaction is the extra contribution of the two factors simultaneously beyond their individual, additive contributions. In the presence of an interaction, the main effect of either of the relevant factors is problematic because a general effect for each factor does not exist. In the peanut butter sandwich example, whatever effect peanut butter has on taste differs depending on whether jelly is present.
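The arithmetic of this illustration can be sketched in a few lines of R. The sketch assumes that 29 is the taste rating of the combined sandwich on the same scale on which plain bread rates 1. That reading is an assumption, and under it the extra contribution works out to 10 units. The object names are hypothetical.

```r
# Taste ratings for the four treatment combinations.
# Assumption: 29 is a rating on the same scale as plain bread's rating of 1.
taste <- matrix(c(1, 11,    # jelly absent:  plain bread, peanut butter only
                  9, 29),   # jelly present: jelly only, peanut butter and jelly
                nrow = 2,
                dimnames = list(PeanutButter = c("absent", "present"),
                                Jelly = c("absent", "present")))

# Rating expected if the two effects simply added: 1 + 10 + 8
additive_prediction <- taste["absent", "absent"] + 10 + 8

# Effect of peanut butter at each level of jelly
pb_effect_without_jelly <- taste["present", "absent"]  - taste["absent", "absent"]
pb_effect_with_jelly    <- taste["present", "present"] - taste["absent", "present"]

# The interaction: how much the peanut butter effect changes when jelly is present
interaction_contrast <- pb_effect_with_jelly - pb_effect_without_jelly
interaction_contrast
```

With no interaction the two peanut butter effects would be equal and the contrast would be zero.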

The null hypothesis for the interaction of Dosage and Difficulty is that it does not exist.

interaction effect $H_0$: Effect of each treatment variable is the same at each level of the other treatment variable

With no interaction, to the extent that Dosage raises or lowers completion Time, it would have the same effect regardless of the assigned Difficulty level. Dosage and Difficulty interact, however, if the effect of Difficulty differs depending on the Dosage. With an interaction, no one main effect for either Difficulty or Dosage describes all situations.

7.4.3 The Model and Descriptive Statistics

The lessR function ANOVA, abbreviated av, provides the analysis. Both the two-way factorial design and the previously introduced randomized blocks design (Section 7.3, p. 158) have two independent variables or factors. Both of the independent variables in the two-way factorial design, however, are treatment variables. This contrasts with the blocking factor in the randomized blocks design, which is generally not of substantive interest but rather is included to help minimize the error variability in the analysis of the one treatment variable.

The distinction in syntax in the function call to ANOVA for a factorial design is to replace the + in the model specification of a randomized blocks design with an asterisk, *. The asterisk instructs R to also include an interaction term in the analysis.

ANOVA function: Includes analysis for the two-way factorial ANOVA.

lessR Input: Analysis of a two-way factorial design

> ANOVA(Time ~ Dosage * Difficulty)
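The meaning of the asterisk can be inspected directly with base R, without any data. The following sketch uses the standard terms function to list the terms that each model formula generates. The object names are hypothetical.

```r
# Factorial specification: the asterisk requests main effects plus interaction
factorial_model <- Time ~ Dosage * Difficulty

# Specification with + only, as in the randomized blocks analysis: no interaction
additive_model <- Time ~ Dosage + Difficulty

# Terms generated by each formula
attr(terms(factorial_model), "term.labels")  # "Dosage" "Difficulty" "Dosage:Difficulty"
attr(terms(additive_model), "term.labels")   # "Dosage" "Difficulty"

# The asterisk is shorthand for writing out all three terms
explicit_model <- Time ~ Dosage + Difficulty + Dosage:Difficulty
identical(attr(terms(factorial_model), "term.labels"),
          attr(terms(explicit_model), "term.labels"))
```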

The first part of the output from ANOVA, not listed here, describes the variables and data in the analysis. Also assessed is whether the design is balanced (Section 7.3.2, p. 161). A balanced design has the same number of participants in each cell, that is, in each combination of treatment levels. If the design is not balanced, then a more sophisticated analysis is required, and so the present analysis terminates.

The analysis of variance compares the means of the cells in the design. The means for the individual cells in the design are called the cell means, shown in Listing 7.18.

cell mean: The mean of the response variable for one cell in the design.

Cell Means
----------
               Dosage
Difficulty  mg00  mg05  mg10
      Easy 24.26 23.41 17.82
      Hard 34.73 31.71 39.35

Listing 7.18 Cell means, one for each of the six treatment combinations in the experimental design.
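The balance condition that ANOVA checks before reporting the cell means can be sketched with base R. The sketch below rebuilds only the layout of the design described in the text (8 rats in each of the 6 cells), not the measured Times. The data frame design and the within-cell index Rat are hypothetical names.

```r
# Layout of the design only: 8 rats x 3 Dosage levels x 2 Difficulty levels
design <- expand.grid(Rat        = 1:8,   # hypothetical within-cell index
                      Dosage     = c("mg00", "mg05", "mg10"),
                      Difficulty = c("Easy", "Hard"))

nrow(design)   # 48 rats in total

# Number of rats in each cell
cell_n <- table(design$Difficulty, design$Dosage)
cell_n

# Balanced if every cell has the same count
is_balanced <- length(unique(as.vector(cell_n))) == 1
is_balanced
```

With the data already read into mydata, the same table call applied to mydata$Difficulty and mydata$Dosage would show the cell counts for the actual data.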


The ANOVA function provides both the cell means in Listing 7.18 and their plot, Figure 7.5.

Figure 7.5 Plot of the six cell means of task completion Time for Dosage and Difficulty.
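A plot of this kind can be approximated from the six cell means alone with the base R function interaction.plot. This is a sketch that stands in for the lessR graphic and will not match its appearance. The data frame name and the axis labels are assumptions, and the values are the cell means from Listing 7.18.

```r
# Cell means from Listing 7.18, one row per cell
cell_means <- data.frame(
  Dosage     = factor(rep(c("mg00", "mg05", "mg10"), times = 2)),
  Difficulty = factor(rep(c("Easy", "Hard"), each = 3)),
  Time       = c(24.26, 23.41, 17.82,    # Easy
                 34.73, 31.71, 39.35)    # Hard
)

# One line per Difficulty level, Dosage on the horizontal axis
with(cell_means,
     interaction.plot(x.factor = Dosage, trace.factor = Difficulty,
                      response = Time, type = "b",
                      xlab = "Dosage", ylab = "Mean Time (sec)",
                      trace.label = "Difficulty"))
```

Lines that are far from parallel suggest an interaction, which the inferential analysis then evaluates.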

The means for each treatment level averaged across the treatment levels of the other treatment variable are the marginal means. ANOVA lists the marginal means separately, as in Listing 7.19. The mean of all the data is the grand mean, 28.548, which is also reported in the output though not included in Listing 7.19.

marginal mean: Mean of the response variable calculated for a level of one treatment variable.

Marginal Means
--------------
Dosage
 mg00  mg05  mg10
29.49 27.56 28.59

Difficulty
 Easy  Hard
21.83 35.26

Listing 7.19 The marginal means and grand mean of completion Time.
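Because the design is balanced, each marginal mean is the simple average of the corresponding cell means. The following sketch reproduces the marginal means from the rounded cell means of Listing 7.18, so the results can differ from the listed values in the last digit. The object names are hypothetical.

```r
# Cell means from Listing 7.18 (rows: Difficulty, columns: Dosage)
cell_means <- matrix(c(24.26, 23.41, 17.82,
                       34.73, 31.71, 39.35),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Difficulty = c("Easy", "Hard"),
                                     Dosage = c("mg00", "mg05", "mg10")))

# Marginal means: average across the levels of the other treatment variable
difficulty_means <- rowMeans(cell_means)   # one mean per Difficulty level
dosage_means     <- colMeans(cell_means)   # one mean per Dosage level

# Grand mean: average of all cell means (valid here because cell sizes are equal)
grand_mean <- mean(cell_means)

round(difficulty_means, 2)
round(dosage_means, 2)
round(grand_mean, 2)
```

With unequal cell sizes the cell means would need to be weighted by the number of observations in each cell, which is one reason an unbalanced design calls for a different analysis.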

The marginal means in Listing 7.19 indicate, for these data, a substantial difference in task completion Time depending on the Difficulty level. The mean completion Time for the Hard Difficulty level is 35.26 sec, which drops to 21.83 sec for the Easy Difficulty level. In contrast, the Dosage means exhibit little difference, varying only from a high of 29.49 sec to a low of 27.56 sec. The pattern of the cell means in Listing 7.18 and Figure 7.5, however, explains this lack of difference. Rats that ran the Easy maze improved their completion Times as Dosage of the arousal drug increased. The pattern is much different for rats that encountered the Hard maze. The highest Dosage, and so the highest arousal, slowed completion, yielding by far the largest cell mean of 39.35 sec.


This pattern indicates an interaction between assigned Difficulty and Dosage of the arousal drug. The effect of one treatment variable on the response variable depends on the level of the other treatment variable. If corroborated by the inferential analysis of the corresponding population values, the effect of the Dosage of the arousal drug on task completion Time depends on the Difficulty of the task. An interaction here implies that there is no general effect of Dosage in isolation from task Difficulty. Without an interaction the two lines in Figure 7.5 would be parallel. Instead, the line segments that connect the 5 mg cell mean to the 10 mg cell mean move in opposite directions for the two levels of Difficulty.
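The opposite directions can be made concrete by computing the change in mean Time between Dosage levels separately for each Difficulty level. This is a descriptive sketch based on the rounded cell means of Listing 7.18, not an inferential test. The object names are hypothetical.

```r
# Cell means from Listing 7.18 (rows: Difficulty, columns: Dosage)
cell_means <- matrix(c(24.26, 23.41, 17.82,
                       34.73, 31.71, 39.35),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Difficulty = c("Easy", "Hard"),
                                     Dosage = c("mg00", "mg05", "mg10")))

# Change in mean Time from 5 mg to 10 mg, within each Difficulty level
mid_to_high <- cell_means[, "mg10"] - cell_means[, "mg05"]
mid_to_high    # negative for Easy (faster), positive for Hard (slower)

# Change in mean Time from 0 mg to 10 mg, within each Difficulty level
none_to_high <- cell_means[, "mg10"] - cell_means[, "mg00"]
none_to_high

# With no interaction the two entries of each vector would be about equal;
# their difference is a descriptive measure of the interaction
unname(mid_to_high["Hard"] - mid_to_high["Easy"])
```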

Also provided, in Listing 7.20, is the table of standard deviations of the response variable for all of the cells. The analysis of variance, like the classic t-test, assumes that the population variances of all the cells are equal. Even so, the sample variances, and their corresponding standard deviations, will not be exactly equal. The analysis is reasonably robust against violations of this assumption, provided that the standard deviations do not diverge too widely from each other. These standard deviations, which range from 2.69 to 4.93, meet that condition.

Cell Standard Deviations
------------------------
              Dosage
Difficulty mg00 mg05 mg10
      Easy 2.69 2.85 3.96
      Hard 4.93 4.81 4.39

Listing 7.20 Sample standard deviations for each cell in the design.
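One informal way to summarize how far the cell standard deviations diverge is the ratio of the largest to the smallest. The sketch below computes that ratio from Listing 7.20. The cutoff mentioned in the comment is a common rule of thumb from outside this text, stated here as an assumption rather than as part of the lessR output.

```r
# Cell standard deviations from Listing 7.20 (rows: Difficulty, columns: Dosage)
cell_sd <- matrix(c(2.69, 2.85, 3.96,
                    4.93, 4.81, 4.39),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(Difficulty = c("Easy", "Hard"),
                                  Dosage = c("mg00", "mg05", "mg10")))

# Ratio of the largest to the smallest standard deviation
sd_ratio <- max(cell_sd) / min(cell_sd)
round(sd_ratio, 2)   # about 1.83

# Assumption (rule of thumb, not from this text): with equal cell sizes,
# a ratio below about 2 is often regarded as acceptable
sd_ratio < 2
```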

7.4.4 Inferential Analysis

Following the descriptive statistics are the inferential tests, which ANOVA presents in the form of the traditional analysis of variance summary table in Listing 7.21. The purpose is to evaluate if the patterns observed in the descriptive statistics for the sample generalize to the population as a whole. There are three tests, each based on the p-value of the corresponding test statistic, the F-value. The first effect to examine is the interaction of the two factors, indicated by Dosage:Difficulty, which can subsume the main effects. The two main effects are for Dosage and Difficulty.

Analysis of Variance
--------------------
                  df Sum Sq Mean Sq
Dosage:Difficulty  2 402.61

Listing 7.21 ANOVA summary table.
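The degrees of freedom in the summary table follow from the design itself, and each mean square is the sum of squares divided by its degrees of freedom. The sketch below works these out from the design described in the text. It also defines a small helper, fit_two_way, that fits the same model with the base R function aov. This helper is a hypothetical stand-in for the lessR ANOVA function, assuming a data frame with the columns Time, Dosage, and Difficulty, and its printed output is formatted differently.

```r
# Design quantities taken from the text
n_total      <- 48   # rats
n_dosage     <- 3    # levels of Dosage
n_difficulty <- 2    # levels of Difficulty

# Degrees of freedom for each row of the summary table
df_dosage      <- n_dosage - 1                          # 2
df_difficulty  <- n_difficulty - 1                      # 1
df_interaction <- (n_dosage - 1) * (n_difficulty - 1)   # 2, as in Listing 7.21
df_residual    <- n_total - n_dosage * n_difficulty     # 42

# Mean square for the interaction: Sum Sq / df, using the listed Sum Sq
ms_interaction <- 402.61 / df_interaction

# Hypothetical helper: base R fit of the two-way factorial model
# (an assumption for illustration, not the lessR implementation)
fit_two_way <- function(data) {
  aov(Time ~ Dosage * Difficulty, data = data)
}
```

With the data read into mydata, summary(fit_two_way(mydata)) would print the base R version of the summary table.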

The interaction of Difficulty and Dosage on completion Time is significant.

Dosage x Difficulty Effect: $p\text{-value} = 0.0001 < \alpha = 0.05$, so reject $H_0$


The conclusion is that Difficulty and Dosage, jointly, affect task completion Time. Because the interaction exists, the main effects are of less importance. Neither Difficulty nor Dosage has a general effect on the task completion Time. Instead, the effect of one of the treatment variables depends on the level of the other treatment variable.

The interaction entirely obscures the Dosage main effect, which is not significant.

Dosage Effect: $p\text{-value} = 0.4076 > \alpha = 0.05$, so do not reject $H_0$

The interaction shows that, contrary to the outcome of the test of the marginal Dosage means, Dosage does have an effect, but that the effect differs depending on the level of Difficulty. In the one-way ANOVA, which examined the effect of Dosage on completion Time but only considered the Easy task, the higher the Dosage the quicker the average completion Time. The two-way ANOVA adds the Hard task, in which case the largest Dosage of the arousal drug diminishes performance, resulting in the slowest completion Time. The high Dosage cell mean for the Hard task is 39.35 sec, the largest of the six cell means.

The main effect for task Difficulty is shown to be significant.

Difficulty Effect: $p\text{-value} = 0.0000 < \alpha = 0.05$, so reject $H_0$

The Hard Difficulty task requires much more time to complete, on average, than the Easy Difficulty task. Even so, this effect is modulated by the significant interaction effect. A high Dosage of the arousal-inducing drug has the opposite effect depending on the task Difficulty. High arousal facilitates performance on the Easy task and diminishes performance on the Hard task.

The significance of the interaction effect is the primary finding of this analysis, which implies that the interpretation of the main effect of each treatment variable is either obscured, as with Dosage, or at least modulated, as with Difficulty. Yes, the more Difficult task takes longer to complete, but a fuller understanding of this effect requires an understanding of the resulting interaction.

7.4.5 Effect Size

A follow-up to the detection of a significant effect is the estimation of the size of the effect (effect size, Section 7.2.4, p. 155), shown in Listing 7.22.

Effect Size
-----------
Partial Omega Squared for Dosage: -0.00
Partial Omega Squared for Difficulty: 0.73
Partial Omega Squared for Dosage_&_Difficulty: 0.32

Cohen's f for Difficulty: 1.66
Cohen's f for Dosage_&_Difficulty: 0.69

Listing 7.22 Magnitude of the two main effects and interaction effect.

Consistent with the lack of significance, the estimated association of Dosage with completion Time is zero to within two decimal digits. The population value of omega squared is constrained between 0 and 1, but the estimate of a small population value close to 0 may be negative, as is the case with the omega squared estimate for Dosage. The estimated association of task Difficulty with completion Time is large, 0.73. The corresponding association for the interaction effect is moderate, 0.32.

There is no effect size statistic, Cohen's f, reported for Dosage because the corresponding omega squared value is less than 0. The value of f for Difficulty is substantial, 1.66, well beyond a generally large effect size of f = 0.40 suggested by Cohen (1988). The corresponding value for the interaction of f = 0.69 is also large, though much smaller than the Difficulty effect.
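The link between the two effect size statistics can be sketched with the usual conversion $f = \sqrt{\omega^2 / (1 - \omega^2)}$. That lessR uses exactly this conversion is an assumption, though the listed values are consistent with it. The function cohens_f is a hypothetical helper, and the inputs are the rounded values from Listing 7.22, so the results match the listing only approximately.

```r
# Hypothetical helper: convert (partial) omega squared to Cohen's f.
# Returns NA when omega squared is negative, where f is not defined.
cohens_f <- function(omega_sq) {
  f  <- rep(NA_real_, length(omega_sq))
  ok <- omega_sq >= 0 & omega_sq < 1
  f[ok] <- sqrt(omega_sq[ok] / (1 - omega_sq[ok]))
  names(f) <- names(omega_sq)
  f
}

# Rounded partial omega squared values from Listing 7.22
omega_sq <- c(Difficulty = 0.73, Interaction = 0.32)
f_est <- cohens_f(omega_sq)
round(f_est, 2)   # close to the listed 1.66 and 0.69; inputs are rounded

# A slightly negative estimate (hypothetical stand-in for the unrounded
# Dosage value) yields no f, matching its absence from the listing
cohens_f(-0.001)
```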

7.4.6 Post-hoc Multiple Comparisons

The large separation between the cell means of the Easy and Hard conditions, and the strong interaction of Dosage and Difficulty, imply that most of the pairwise comparisons of the cell means are significant. The lack of significance of a main effect for Dosage, however, would imply that the marginal means for Dosage are not distinguishable from each other. These results are verified by the Tukey post-hoc comparisons at the family-wise significance level of $\alpha = 0.05$ (multiple comparisons of main effects, Listing 7.6, p. 156; Listing 7.14, p. 163).

The Tukey multiple comparisons are presented for the two main effects and for the interaction. None of the three pairwise comparisons of the marginal means of Dosage is significant, and the Hard-Easy marginal mean comparison is significant. The pairwise comparisons for the interaction effect, reported in Listing 7.23, are in terms of the cell means. Only four cell mean comparisons are not significant. There is no observed distinction between the two lowest Dosage levels at either the Easy or the Hard Difficulty level. There is also no distinction between the lowest and highest Dosage levels at the Hard Difficulty level, nor between the medium and highest Dosage levels at the Easy Difficulty level, for which the p-value is 0.08.

Cell Means
----------
mg05:Easy-mg00:Easy  -0.85  -6.87   5.17  1.00
mg10:Easy-mg00:Easy  -6.44 -12.46  -0.41  0.03
mg00:Hard-mg00:Easy  10.46   4.44  16.49  0.00
mg05:Hard-mg00:Easy   7.45   1.43  13.47  0.01
mg10:Hard-mg00:Easy  15.09
mg10:Easy-mg05:Easy  -5.59 -11.61   0.44  0.08
mg00:Hard-mg05:Easy  11.31   5.29  17.34  0.00
mg05:Hard-mg05:Easy   8.30   2.28  14.32  0.00
mg10:Hard-mg05:Easy  15.94
mg00:Hard-mg10:Easy  16.90
mg05:Hard-mg10:Easy  13.89
mg10:Hard-mg10:Easy  21.52
mg05:Hard-mg00:Hard  -3.01  -9.04   3.01  0.67
mg10:Hard-mg00:Hard   4.62  -1.40  10.65  0.22
mg10:Hard-mg05:Hard   7.64   1.61  13.66  0.01

Listing 7.23 Multiple comparison of cell means.
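The number of rows in Listing 7.23 follows from the number of cells: six cell means yield 15 distinct pairs. The sketch below enumerates those pairs with base R and builds labels in the same style as the listing. The object names are hypothetical.

```r
# The six cells, labeled Dosage:Difficulty as in Listing 7.23
cells <- as.vector(outer(c("mg00", "mg05", "mg10"), c("Easy", "Hard"),
                         paste, sep = ":"))
cells

# Number of distinct pairwise comparisons among six cell means
choose(length(cells), 2)   # 15

# Enumerate the pairs; each column of cell_pairs holds one pair
cell_pairs <- combn(cells, 2)

# Label each comparison as "second cell - first cell"
comparison_labels <- paste(cell_pairs[2, ], cell_pairs[1, ], sep = "-")
comparison_labels
```

Because 15 comparisons are made at once, the Tukey procedure controls the family-wise significance level across the whole set rather than testing each pair separately at 0.05.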

These results emphasize the importance of understanding and interpreting an interaction effect in a two-way ANOVA design. Collapsed across Difficulty levels, there is no Dosage effect, yet Dosage has a substantial relationship to task completion Time. This effect, however, is only evident from the corresponding interaction.
