Chapter11 - Analysis of Variance
Statistics for Managers Using Microsoft® Excel 5th Edition
Learning Objectives
In this chapter, you learn:
The basic concepts of experimental design
How to use the one-way analysis of variance
to test for differences among the means of several groups
How to use the two-way analysis of variance
and interpret the interactionChapter Overview
Analysis of Variance (ANOVA) F-test
Tukey- Kramer test
One-Way ANOVA
Two-Way ANOVA
Interaction Effects General ANOVA Setting
Investigator controls one or more factors of interest
Each factor contains two or more levels
Levels can be numerical or categorical
Different levels produce different groups
Think of the groups as populations
Observe effects on the dependent variable
Are the groups the same?
Experimental design: the plan used to collect the data
Completely Randomized Design
Experimental units (subjects) are assigned randomly to the different levels (groups)
Subjects are assumed homogeneous
Only one factor or independent variable
With two or more levels (groups)
Analyzed by one-factor analysis of variance
(one-way ANOVA)One-Way Analysis of Variance
Evaluate the difference among the means of three or
more groups
Examples: Accident rates for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires
Assumptions Populations are normally distributed
Populations have equal variances
Samples are randomly and independently drawn
Hypotheses: One-Way ANOVA H : μ μ μ μ
1
2 3 c All population means are equal
i.e., no treatment effect (no variation in means among groups)
H : Not all of the population means are the same
1
At least one population mean is different i.e., there is a treatment (groups) effect
Does not mean that all population means are different (at
least one of the means is different from the others)
Hypotheses: One-Way ANOVA H : μ μ μ μ
1
2 3 c H : Not all μ are the same
1 j
All Means are the same: The Null Hypothesis is True
(No Group Effect)
μ μ μ
1
2
3 Hypotheses: One-Way ANOVA
At least one mean is different:
H : μ μ μ μ
1
2 3 c
The Null Hypothesis is NOT true
j H : Not all μ are the same
1
(Treatment Effect is present) or
μ μ μ μ μ μ
1
2
3
1
2
3 Partitioning the Variation
Total variation can be split into two parts:
SST = Total Sum of Squares (Total variation)
SSA = Sum of Squares Among Groups (Among-group variation)
SSW = Sum of Squares Within Groups (Within-group variation)
SST = SSA + SSW Partitioning the Variation
SST = SSA + SSW
Total Variation = the aggregate dispersion of the individual data values around the overall (grand) mean of all factor levels (SST) Among-Group Variation = dispersion between the factor sample means (SSA) Within-Group Variation = dispersion that exists among the data values within the particular factor levels (SSW)
=
Partitioning the Variation
Among Group Variation (SSA) Within Group Variation (SSW) Total Variation (SST)
2 SST (
X X ) ij
j 1 i
1
Where: SST = Total sum of squares c = number of groups n = number of values in group j j th X = i value from group j ij
X = grand mean (mean of all data values)
The Total Sum of Squares
2
2
12
2
11 ) ( ... ) ( ) (
X X
X X
X X SST nc
c j n i ijj
X X SST
1
1
2 ) ( Among-Group Variation SST = SSA + SSW c
2 SSA n (
X j X ) j
j
1
Where: SSA = Sum of squares among groups c = number of groups n = sample size from group j j
X = sample mean from group j j X = grand mean (mean of all data values) Among-Group Variation
2 j c 1 j j )
X X ( n SSA 2 c c 2
2
2 2 1 1)
X X ( n ... )
X X ( n )
X X ( n SSA
µ 1 µ 2 µ c Within-Group Variation SST = SSA + SSW n j c
2 SSW (
X X ) j ij
Where:
j 1 i
1
SSW = Sum of squares within groups c = number of groups n = sample size from group j j
X = sample mean from group j j th X = i value in group j ij
Within-Group Variation
2 2 21 2 11
) ( ... ) ( ) (
1 1 cX X
X X
X X SSW nc
2
1
1 ) ( j ij n i c j
X X SSW j
j
Obtaining the Mean Squares SSA
Mean Squares Among
MSA c
1 SSW MSW
Mean Squares Within
n c SST
MST
Mean Squares Within
n 1 One-Way ANOVA Table Source of df SS MS F-Ratio Variation
(Variance) Among c-1 SSA MSA MSA F
Groups MSW
Within n-c SSW MSW Groups Total n-1 SST = SSA+SSW c = number of groups n = sum of the sample sizes from all groups df = degrees of freedom
One-Way ANOVA Test Statistic
Test statistic
MSA is mean squares among variances
MSW is mean squares within variances
Degrees of freedom
df 1 = c – 1 (c = number of groups)
df 2 = n – c (n = sum of all sample sizes) MSW MSA
H : μ 1 = μ 2 = … = μ c H 1 : At least two population means are different
F
One-Way ANOVA Test Statistic
The F statistic is the ratio of the among variance to the
within variance The ratio must always be positive
df 1 = c -1 will typically be small
df 2 = n - c will typically be large
Decision Rule: Reject H if F > F U , otherwise do not reject H
= .05
Reject H Do not reject H F U One-Way ANOVA Example You want to see if three different golf clubs yield
Club 1 Club 2 Club 3 different distances. You
254 234 200
randomly select five 263 218 222 measurements from trials on an
241 235 197
automated driving machine for
237 227 206
each club. At the .05
251 216 204
significance level, is there a difference in mean distance?
One-Way ANOVA Example Distance Club 1 Club 2 Club 3
270 254 234 200
- • 260
263 218 222
X
1
- •• 250
241 235 197
- • 240
237 227 206
- • ••
251 216 204 230
X
- • 2
X
- • 220
- •• X
3
210
- •• x 249.2 x 226.0 x 205.8 1 2 3
- •• x 227.0
200
1 2 3 190
Club
One-Way ANOVA Example
X = 249.2 1 n = 5 1 X = 226.0 2 n = 5 2 X = 205.8 3 n = 5 3 n = 15 X = 227.0 2 c = 3 2 2 SSA = 5 (249.2 – 227) + 5 (226 – 227) + 5 (205.8 – 227) = 4716.4 2 2 2 SSW = (254 – 249.2) + (263 – 249.2) +…+ (204 – 205.8) = 1119.6 One-Way ANOVA Example
MSA = 4716.4 / (3-1) = 2358.2
2358.2 F 25.275
MSW = 1119.6 / (15-3) = 93.3
93.3 Critical Value:
F = 3.89
U = .05
Do not Reject H reject H F = 25.275 F = 3.89 U
One-Way ANOVA Example
H : μ 1 = μ 2
= μ
3 H 1 : μ j
not all equal
= .05
df 1 = 2 df
2
= 12 Decision: Reject H at α = 0.05Conclusion: There is evidence that at least one
μ j
differs from the rest
One-Way ANOVA in Excel
SUMMARYEXCEL:
Groups Count Sum Average Variance
Select
Club 1 5 1246 249.2 108.2
Tools ____
Club 2 5 1130 226
77.5 Data Club 3 5 1029 205.8
94.2 Analysis ANOVA
_____
Source of SS df MS F P-value F crit
ANOVA:
Variation
single
Between 4716.4 2 2358.2 25.275
4.99E-05
3.89
factor
Groups Within 1119.6
12
93.3 Groups Total 5836.0
14
The Tukey-Kramer Procedure
Tells which population means are significantly different
1 2 3
e.g.: μ = μ ≠ μ
Done after rejection of equal means in ANOVA
Allows pair-wise comparisons
Compare absolute mean differences with critical
range
x
μ μ = μ
1
2
3
Tukey-Kramer Critical Range
MSW1
1 Critical Range Q
U
2 n n j j'
where: Q = Value from Studentized Range Distribution with c U and n - c degrees of freedom for the desired level of (see appendix E.9 table) MSW = Mean Square Within n and n = Sample sizes from groups j and j’ j j’
The Tukey-Kramer Procedure
1. Compute absolute mean differences: Club 1 Club 2 Club 3
254 234 200 263 218 222 241 235 197 237 227 206 251 216 204
20.2 205.8 226.0 x x 43.4 205.8 249.2 x x 23.2 226.0 249.2 x x
3 2 3 1 2 1
2. Find the Q U value from the table in appendix E.9 with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom for the desired level of ( = .05 used here):
3.77 Q U The Tukey-Kramer Procedure
5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance.
20.2 x x 43.4 x x 23.2 x x 3 2 3 1 2 1
3. Compute Critical Range:
2 MSW Q Range Critical U j' j
16.285
1
3.77 n 1 n
93.3
2
1
5
1
5
4. Compare:
ANOVA Assumptions
Randomness and Independence Select random samples from the c groups (or
randomly assign the levels) Normality
The sample values from each group are from a
normal population Homogeneity of Variance
Can be tested with Levene’s Test
ANOVA Assumptions Levene’s Test
Tests the assumption that the variances of each group are equal.
First, define the null and alternative hypotheses:
H : σ
21 = σ
22 = …=σ
2c
H 1 : Not all σ 2j are equal
Second, compute the absolute value of the difference between each value and the median of each group.
Third, perform a one-way ANOVA on these absolute differences.
Two-Way ANOVA
Examines the effect of
Two factors of interest on the dependent variable
e.g., Percent carbonation and line speed on soft drink bottling process
Interaction between the different levels of these two factors
e.g., Does the effect of one particular carbonation level depend on which level the line speed is set?
Two-Way ANOVA
Assumptions
Populations are normally distributed
Populations have equal variances
Independent random samples are selected Two-Way ANOVA Sources of Variation
Two Factors of interest: A and B
r = number of levels of factor A c = number of levels of factor B / n = number of replications for each cell n = total number of observations in all cells
/ (n = rcn )
X = value of the kth observation of level i ijk of factor A and level j of factor B Two-Way ANOVA Sources of Variation
Degrees of
SST = SSA + SSB + SSAB + SSE
Freedom: SSA r – 1
Factor A Variation SST
SSB c – 1
Total Variation Factor B Variation
SSAB Variation due to interaction
(r – 1)(c – 1) between A and B n - 1
SSE / rc(n – 1)
Random variation (Error) Two-Way ANOVA Equations
r 1 i c 1 j n 1 k
2 ijk )
X X ( SST 2 r
1 i .. i )
X X ( n c SSA
2 c 1 j .j .
)
X X ( n r SSB
Total Variation: Factor A Variation:
Factor B Variation:
Two-Way ANOVA Equations
2
1
1 . . .. .
) (
X X
X X n SSAB r i c j j i ij
r 1 i
c
1 j
n 1 k2 . ij ijk
)
Interaction Variation: Sum of Squares Error:
X X ( SSE
Two-Way ANOVA Equations Mean Grand n rc
X X r 1 i c 1 j n 1 k ijk
..., r) 2, 1, (i A factor of level i of Mean n c
X X .. i th c 1 j n 1 k ijk
r = number of levels of factor A c = number of levels of factor B n / = number of replications in each cell Two-Way ANOVA Equations r n
X ijk i 1 k 1 th
X .j. Mean of j level of factor B (j 1, 2, ...,
c) n r n ij.
X ijk
X Mean of cell ij n k 1 r = number of levels of factor A c = number of levels of factor B / n = number of replications in each cell
Two-Way ANOVA Equations 1 r SSA
A factor square Mean MSA
1 c SSB B factor square Mean MSB
) )( 1 c
( 1 r SSAB Mean n interactio square MSAB
) ( 1 ' n rc
SSE Mean error square MSE Two-Way ANOVA: The F Test Statistic F Test for Factor B Effect F Test for Interaction Effect H : μ 1.. = μ 2..
= • • • = μ r.. H 1 : Not all μ i.. are equal
H : the interaction of A and B is equal to zero H 1 : interaction of A and B isn’t zero F Test for Factor A Effect H : μ .1. = μ .2.
= • • • = μ .c. H 1 : Not all μ .j. are equal
Reject H if F > F U
MSE MSA F
MSE MSB F
MSE MSAB
Reject H if F > F U Reject H if F > F U
F
Two-Way ANOVA: Summary Table Source of Variation Degrees of Freedom Sum of Squares Mean Squares F Statistic Factor A r – 1 SSA MSA
= SSA /(r – 1) MSA/ MSE Factor B c – 1 SSB MSB
= SSB /(c – 1) MSB/ MSE AB (Interaction) (r – 1)(c – 1) SSAB MSAB
= SSAB / (r – 1)(c – 1) MSAB/ MSE Error rc(n’ – 1) SSE MSE =
SSE/rc(n’ – 1) Total n – 1 SST Two-Way ANOVA: Features Degrees of freedom always add up
/
Total = error + factor A + factor B + interaction
n-1 = rc(n -1) + (r-1) + (c-1) + (r-1)(c-1)
The denominator of the F Test is always the same
but the numerator is different The sums of squares always add up
SST = SSE + SSA + SSB + SSAB
Total = error + factor A + factor B + interaction
Two-Way ANOVA: Interaction
No Significant Interaction: Factor B Level 1 Factor B Level 3 Factor B Level 2 Factor A Levels
Factor B Level 1 Factor B Level 3 Factor B Level 2
Factor A Levels M ea n R es po ns e
M ea n R es po ns e
Interaction is present:
Chapter Summary
Described one-way analysis of variance
The logic of ANOVA
ANOVA assumptions
F test for difference in c means
The Tukey-Kramer procedure for multiple comparisons
Described two-way analysis of variance
Examined effects of multiple factors
Examined interaction between factors In this chapter, we have