9.5 Inferences Concerning Two Population Variances
Methods for comparing two population variances (or standard deviations) are occasionally needed, though such problems arise much less frequently than those involving means or proportions. For the case in which the populations under investigation are normal, the procedures are based on a new family of probability distributions.
The F Distribution
The F probability distribution has two parameters, denoted by $\nu_1$ and $\nu_2$. The parameter $\nu_1$ is called the number of numerator degrees of freedom, and $\nu_2$ is the number of denominator degrees of freedom; here $\nu_1$ and $\nu_2$ are positive integers. A random variable that has an F distribution cannot assume a negative value. Since the density function is complicated and will not be used explicitly, we omit the formula. There is an important connection between an F variable and chi-squared variables. If $X_1$ and $X_2$ are independent chi-squared rv's with $\nu_1$ and $\nu_2$ df, respectively, then the rv

$$F = \frac{X_1/\nu_1}{X_2/\nu_2} \qquad (9.8)$$

(the ratio of the two chi-squared variables divided by their respective degrees of freedom) can be shown to have an F distribution.
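As a quick numerical illustration of this connection, here is a minimal simulation sketch (not part of the text, and assuming the NumPy and SciPy libraries): the ratio of two independent chi-squared variables, each divided by its df, has upper-tail quantiles matching those of the corresponding F distribution. The sample size and seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import f

nu1, nu2 = 4, 6
rng = np.random.default_rng(seed=42)

# Independent chi-squared draws with nu1 and nu2 df
x1 = rng.chisquare(nu1, size=100_000)
x2 = rng.chisquare(nu2, size=100_000)

# The rv in (9.8): each chi-squared variable divided by its df, then the ratio
ratio = (x1 / nu1) / (x2 / nu2)

# The simulated upper-tail quantile should be close to the F critical value
print(np.quantile(ratio, 0.95))       # empirical 95th percentile of the ratio
print(f.ppf(0.95, dfn=nu1, dfd=nu2))  # F_{.05,4,6}, about 4.53
```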
Figure 9.7 illustrates the graph of a typical F density function. Analogous to the notation $t_{\alpha,\nu}$ and $\chi^2_{\alpha,\nu}$, we use $F_{\alpha,\nu_1,\nu_2}$ for the value on the horizontal axis that captures $\alpha$ of the area under the F density curve with $\nu_1$ and $\nu_2$ df in the upper tail. The density curve is not symmetric, so it would seem that both upper- and lower-tail critical values must be tabulated. This is not necessary, though, because of the fact that

$$F_{1-\alpha,\nu_1,\nu_2} = 1/F_{\alpha,\nu_2,\nu_1}$$
Figure 9.7 An F density curve and critical value (shaded area $\alpha$ to the right of $F_{\alpha,\nu_1,\nu_2}$ under the F curve with $\nu_1$ and $\nu_2$ df)
Appendix Table A.9 gives $F_{\alpha,\nu_1,\nu_2}$ for $\alpha = .10$, .05, .01, and .001, and various values of $\nu_1$ (in different columns of the table) and $\nu_2$ (in different groups of rows of the table). For example, $F_{.05,6,10} = 3.22$ and $F_{.05,10,6} = 4.06$. The critical value $F_{.95,6,10}$, which captures .95 of the area to its right (and thus .05 to the left) under the F curve with $\nu_1 = 6$ and $\nu_2 = 10$, is $F_{.95,6,10} = 1/F_{.05,10,6} = 1/4.06 = .246$.
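These tabulated values can be checked numerically. The following brief sketch assumes the SciPy library (not used in the text) and reproduces the critical values quoted above along with the reciprocal relation.

```python
from scipy.stats import f

# Upper-tail critical values quoted above from Appendix Table A.9
print(f.ppf(0.95, dfn=6, dfd=10))      # F_{.05,6,10}, about 3.22
print(f.ppf(0.95, dfn=10, dfd=6))      # F_{.05,10,6}, about 4.06

# Lower-tail critical value via the reciprocal relation
print(f.ppf(0.05, dfn=6, dfd=10))      # F_{.95,6,10}, about .246
print(1 / f.ppf(0.95, dfn=10, dfd=6))  # 1/F_{.05,10,6} = 1/4.06, the same value
```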
The F Test for Equality of Variances
A test procedure for hypotheses concerning the ratio $\sigma_1^2/\sigma_2^2$ is based on the following result.
THEOREM

Let $X_1, \ldots, X_m$ be a random sample from a normal distribution with variance $\sigma_1^2$, let $Y_1, \ldots, Y_n$ be another random sample (independent of the $X_i$'s) from a normal distribution with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ denote the two sample variances. Then the rv

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \qquad (9.9)$$

has an F distribution with $\nu_1 = m - 1$ and $\nu_2 = n - 1$.
This theorem results from combining (9.8) with the fact that the variables $(m-1)S_1^2/\sigma_1^2$ and $(n-1)S_2^2/\sigma_2^2$ each have a chi-squared distribution, with $m - 1$ and $n - 1$ df, respectively (see Section 7.4). Because F involves a ratio rather than a difference, the test statistic is the ratio of sample variances. The claim that $\sigma_1^2 = \sigma_2^2$ is implausible if the ratio differs by too much from 1.
Recall that the P-value for an upper-tailed t test is the area under an appropriate t curve to the right of the calculated t, whereas for a lower-tailed test the P-value is the area under the curve to the left of t. Analogously, the P-value for an upper-tailed F test is the area under an appropriate F curve (the one with specified numerator and denominator dfs) to the right of f, and the P-value for a lower-tailed test is the area under an F curve to the left of f. Because t curves are symmetric, the P-value for a two-tailed test is double the captured lower tail area if t is negative and double the captured upper tail area if t is positive. Although F curves are not symmetric, by analogy the P-value for a two-tailed F test is twice the captured lower tail area if f is below the median and twice the captured upper tail area if it is above the median.
Figure 9.8 illustrates this for an upper-tailed test based on $\nu_1 = 4$ and $\nu_2 = 6$.
Figure 9.8 A P-value for an upper-tailed F test (F density curve for $\nu_1 = 4$, $\nu_2 = 6$; the shaded area to the right of $f = 6.23$ is the P-value)
Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$
Test statistic value: $f = s_1^2/s_2^2$

Alternative Hypothesis | P-Value Determination
$H_a: \sigma_1^2 > \sigma_2^2$ | $A_R$ = area under the $F_{m-1,\,n-1}$ curve to the right of $f$
$H_a: \sigma_1^2 < \sigma_2^2$ | $A_L$ = area under the $F_{m-1,\,n-1}$ curve to the left of $f$
$H_a: \sigma_1^2 \neq \sigma_2^2$ | $2 \cdot \min(A_R, A_L)$

Assumption: The population distributions are both normal, and the two random samples are independent of one another.
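As an illustration of the boxed procedure, here is a minimal sketch assuming the SciPy library; the function name and the sample statistics in the usage line are hypothetical, not values from the text.

```python
from scipy.stats import f

def f_test(s1_sq, s2_sq, m, n, alternative="two-sided"):
    """Return (f, P-value) for H0: sigma1^2 = sigma2^2 based on f = s1^2/s2^2."""
    f_stat = s1_sq / s2_sq
    dist = f(dfn=m - 1, dfd=n - 1)
    a_right = dist.sf(f_stat)   # A_R: area to the right of f
    a_left = dist.cdf(f_stat)   # A_L: area to the left of f
    if alternative == "greater":
        return f_stat, a_right
    if alternative == "less":
        return f_stat, a_left
    return f_stat, 2 * min(a_right, a_left)   # two-tailed: 2 * min(A_R, A_L)

# Hypothetical usage with made-up sample variances and m = n = 10 observations
print(f_test(s1_sq=4.2, s2_sq=1.9, m=10, n=10))
```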
Tabulation of F-curve upper-tail areas is much more cumbersome than for t curves because two df's are involved. For each combination of $\nu_1$ and $\nu_2$, our F table gives only the four critical values that capture areas .10, .05, .01, and .001. Because of this, the table will generally provide only an upper or lower bound (or both) on the P-value. For example, suppose the test is upper-tailed and based on 4 numerator df and 6 denominator df. If $f = 5.82$, then the P-value is the area under the $F_{4,6}$ curve to the right of 5.82. Because $F_{.05,4,6} = 4.53$, the area to the right of 4.53 is by definition .05. Similarly, $F_{.01,4,6} = 9.15$ implies that the area under the curve to the right of this value is .01. Since 5.82 lies between 4.53 and 9.15, the area to the right of 5.82 must be between .01 and .05. That is, $.01 < \text{P-value} < .05$. Figure 9.9 shows what can be said about the P-value depending on where f falls relative to the four relevant tabulated critical values.
Figure 9.9 Obtaining P-value information from the F table for an upper-tailed F test (the regions along the f axis, separated by the four tabulated critical values, correspond to P-value > .10, .05 < P-value < .10, .01 < P-value < .05, .001 < P-value < .01, and P-value < .001)
Again considering a test with $\nu_1 = 4$ and $\nu_2 = 6$,

$f = 5.82 \Rightarrow .01 < \text{P-value} < .05$
$f = 2.16 \Rightarrow \text{P-value} > .10$
$f = 25.03 \Rightarrow \text{P-value} < .001$

Only if f equals a tabulated value do we obtain an exact P-value (e.g., if $f = 4.53$, then P-value = .05). Once we know that $.01 < \text{P-value} < .05$, $H_0$ would be rejected at a significance level of .05 but not at a level of .01. When P-value < .001, $H_0$ should be rejected at any reasonable significance level.
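The table-lookup reasoning above can be written as a small bracketing routine. The following sketch assumes SciPy to recompute the four tabulated critical values rather than copying them from Appendix Table A.9; the function name is hypothetical.

```python
from scipy.stats import f

def bracket_p_value(f_stat, nu1, nu2, tail_areas=(0.10, 0.05, 0.01, 0.001)):
    """Bracket the upper-tail P-value using only the four tabulated critical values."""
    crit = {a: f.ppf(1 - a, dfn=nu1, dfd=nu2) for a in tail_areas}
    upper = 1.0
    for a in sorted(tail_areas, reverse=True):   # .10, .05, .01, .001 in turn
        if f_stat < crit[a]:
            return a, upper          # f did not exceed this critical value
        upper = a
    return 0.0, min(tail_areas)      # f exceeds even the .001 critical value

print(bracket_p_value(5.82, 4, 6))    # (0.01, 0.05):  .01 < P-value < .05
print(bracket_p_value(2.16, 4, 6))    # (0.1, 1.0):    P-value > .10
print(bracket_p_value(25.03, 4, 6))   # (0.0, 0.001):  P-value < .001
```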
The F tests discussed in succeeding chapters will all be upper-tailed. If, however, a lower-tailed F test is appropriate, then lower-tailed critical values should be obtained as described earlier so that a bound or bounds on the P-value can be established. In the case of a two-tailed test, the bound or bounds from a one-tailed test should be multiplied by 2. For example, if $f = 5.82$ when $\nu_1 = 4$ and $\nu_2 = 6$, then since 5.82 falls between the .05 and .01 critical values, $2(.01) < \text{P-value} < 2(.05)$, giving $.02 < \text{P-value} < .10$. $H_0$ would then be rejected if $\alpha = .10$ but not if $\alpha = .01$. In this case, we cannot say from our table what conclusion is appropriate when $\alpha = .05$ (since we don't know whether the P-value is smaller or larger than this). However, statistical software shows that the area to the right of 5.82 under this F curve is .029, so the P-value is .058 and the null hypothesis should therefore not be rejected at level .05. Various statistical software packages will, of course, provide an exact P-value for any F test.
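The software result quoted above can be reproduced as follows; this brief sketch assumes the SciPy library (the text does not name a particular package).

```python
from scipy.stats import f

a_right = f.sf(5.82, dfn=4, dfd=6)    # area to the right of 5.82, about .029
a_left = f.cdf(5.82, dfn=4, dfd=6)
p_value = 2 * min(a_right, a_left)    # two-tailed P-value, about .058
print(a_right, p_value)
```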
Example 9.14
A random sample of 200 vehicles traveling on gravel roads in a county with a posted speed limit of 35 mph on such roads resulted in a sample mean speed of 37.5 mph and a sample standard deviation of 8.6 mph, whereas another random sample of 200 vehicles in a county with a posted speed limit of 55 mph resulted in a sample mean and sample standard deviation of 35.8 mph and 9.2 mph, respectively (these means and standard deviations were reported in the article "Evaluation of Criteria for Setting Speed Limits on Gravel Roads" (J. of Transp. Engr., 2011: 57-63); the actual sample sizes result in dfs that exceed the largest of those in our F table). Let's carry out a test at significance level .10 to decide whether the two population distribution variances are identical.
1. $\sigma_1^2$ is the variance of the speed distribution on the 35 mph roads, and $\sigma_2^2$ is the variance of the speed distribution on the 55 mph roads.
2. $H_0: \sigma_1^2 = \sigma_2^2$
3. $H_a: \sigma_1^2 \neq \sigma_2^2$
4. Test statistic value: $f = s_1^2/s_2^2$
5. Calculation: $f = (8.6)^2/(9.2)^2 = .87$
6. P-value determination: .87 lies in the lower tail of the F curve with 199 numerator df and 199 denominator df. A glance at the F table shows that $F_{.10,199,199} \approx 1.20$ (consult the $\nu_1 = 120$ and $\nu_1 = 1000$ columns), implying $F_{.90,199,199} \approx 1/1.20 = .83$ (these values are confirmed by software). That is, the area under the relevant F curve to the left of .83 is .10. Thus the area under the curve to the left of .87 exceeds .10, and so P-value $> 2(.10) = .2$ (software gives .342).
7. The P-value clearly exceeds the mandated significance level. The null hypothesis therefore cannot be rejected; it is plausible that the two speed distribution variances are identical.

The sample sizes in the cited article were 2665 and 1868, respectively, and the P-value reported there was .0008. So for the actual data, the hypothesis of equal variances would be rejected not only at significance level .10 (in contrast to our conclusion) but also at levels .05, .01, and even .001. This illustrates again how quite large sample sizes can magnify a small difference in estimated values. Note also that the sample mean speed for the county with the lower posted speed limit was higher than for the county with the higher limit, a counterintuitive result that surprised the investigators; because of the very large sample sizes, this difference in means is highly statistically significant.
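As a check on the arithmetic in steps 5 and 6, here is a brief sketch, again assuming SciPy, that reproduces the test statistic and the software P-value quoted in the example.

```python
from scipy.stats import f

s1, s2 = 8.6, 9.2       # sample standard deviations (35 mph and 55 mph roads)
m = n = 200             # sample sizes used in the example

f_stat = s1**2 / s2**2  # about .87
dist = f(dfn=m - 1, dfd=n - 1)
p_value = 2 * min(dist.sf(f_stat), dist.cdf(f_stat))  # two-tailed, about .34
print(f_stat, p_value)  # P-value exceeds .10, so H0 is not rejected
```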