Estimation and Tests for a Population Variance

Upper-tail values of the chi-square distribution can be found in Table 7 in the Appendix. Entries in the table are values of χ² that have an area α to the right under the curve. The degrees of freedom are specified in the left column of the table, and values of α are listed across the top of the table. Thus, for df = 14, the value of chi-square with an area α = .025 to its right under the curve is 26.12 (see Figure 7.3). To determine the value of chi-square with an area .025 to its left under the curve, we look up the entry for area 1 − .025 = .975 to the right and obtain 5.629 from Table 7 in the Appendix. Combining these two values, we have that the area under the curve between 5.629 and 26.12 is 1 − .025 − .025 = .95. See Figure 7.3. We can use this information to form a confidence interval for σ². Because the chi-square distribution is not symmetrical, the confidence intervals based on this distribution do not have the usual form, estimate ± error, as we saw for μ and μ₁ − μ₂. The 100(1 − α)% confidence interval for σ² is obtained by dividing the estimator of σ², s², by the lower and upper α/2 percentiles, χ²_L and χ²_U, as described following.

[Figure 7.2: Densities of the chi-square distribution for df = 5, 15, and 30.]
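The table lookups described above can be reproduced in software. The following sketch (an illustration using Python's SciPy library, which is not part of the text) finds the same two critical values for df = 14:

```python
from scipy import stats

# Upper-tail value: area .025 to the right under the chi-square curve, df = 14
upper = stats.chi2.ppf(1 - 0.025, df=14)   # equivalently stats.chi2.isf(0.025, df=14)

# Lower-tail value: area .025 to the left, i.e., area .975 to the right
lower = stats.chi2.ppf(0.025, df=14)

print(round(upper, 2), round(lower, 3))  # 26.12 5.629
```

The percent-point function `chi2.ppf` is the inverse of the cumulative distribution, so supplying .975 and .025 returns the values bracketing the middle 95% of the distribution, matching Table 7.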


[Figure 7.3: Critical values of the chi-square distribution with df = 14; areas of .025 below 5.629 and .025 above 26.12 leave an area of .95 between the two values.]

[Figure 7.4: Upper-tail and lower-tail values, χ²_U and χ²_L, of the chi-square distribution.]

General Confidence Interval for σ² (or σ) with Confidence Coefficient (1 − α)

(n − 1)s² / χ²_U  <  σ²  <  (n − 1)s² / χ²_L

where χ²_U is the upper-tail value of chi-square for df = n − 1 with area α/2 to its right, and χ²_L is the lower-tail value with area α/2 to its left (see Figure 7.4). We can determine χ²_U and χ²_L for a specific value of df by obtaining the critical values in Table 7 of the Appendix corresponding to α/2 and 1 − α/2, respectively. (Note: The confidence interval for σ is found by taking square roots throughout.)

EXAMPLE 7.1
The machine that fills 500-gram coffee containers for a large food processor is monitored by the quality control department. Ideally, the amount of coffee in a container should vary only slightly about the nominal 500-gram value. If the variation were large, then a large proportion of the containers would be either underfilled, thus cheating the customer, or overfilled, thus resulting in economic loss to the company. The machine was designed so that the weights of the 500-gram containers would have a normal distribution with a mean value of 506.6 grams and a standard deviation of 4 grams. This would produce a population of containers in which at most 5% of the containers weighed less than 500 grams. To maintain a population in which at most 5% of the containers are underweight, a random sample of 30 containers is selected every hour. These data are then used to determine whether the mean and standard deviation are maintained at their nominal values.
The weights (in grams) from one of the hourly samples are given here:

501.4 498.0 498.6 499.2 495.2 501.4 509.5 494.9 498.6 497.6
505.5 505.1 499.8 502.4 497.0 504.3 499.7 497.9 496.5 498.9
504.9 503.2 503.0 502.6 496.8 498.2 500.1 497.9 502.2 503.2

Estimate the mean and standard deviation in the weights of coffee containers filled during the hour in which the random sample of 30 containers was selected, using a 99% confidence interval.

Solution  For these data, we find

ȳ = 500.453 and s = 3.433

To use our method for constructing a confidence interval for μ and σ, we must first check whether the weights are a random sample from a normal population. Figure 7.5 is a normal probability plot of the 30 weights. The 30 values fall near the straight line. Thus, the normality condition appears to be satisfied. The confidence coefficient for this example is 1 − α = .99. The upper-tail chi-square value can be obtained from Table 7 in the Appendix, for df = n − 1 = 29 and α/2 = .005. Similarly, the lower-tail chi-square value is obtained from Table 7, with 1 − α/2 = .995. Thus,

χ²_L = 13.12 and χ²_U = 52.34

[Figure 7.5: Normal probability plot of container weights.]

The 99% confidence interval for σ is then

√(29(3.433)² / 52.34)  <  σ  <  √(29(3.433)² / 13.12)

or 2.56 < σ < 5.10. Thus, we are 99% confident that the standard deviation in the weights of the coffee cans lies between 2.56 and 5.10 grams. The designed value for σ, 4 grams, falls within our confidence interval. Using our results from Chapter 5, a 99% confidence interval for μ is

500.453 ± 2.756 (3.433 / √30), that is, 500.453 ± 1.73

or 498.7 < μ < 502.2. Thus, it appears the machine is underfilling the containers, because 506.6 grams does not fall within the confidence limits.

In addition to estimating a population variance, we can construct a statistical test of the null hypothesis that σ² equals a specified value, σ₀². This test procedure is summarized next.
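The arithmetic in Example 7.1 can be verified numerically. The sketch below (using SciPy and NumPy, an illustration rather than part of the text) reproduces the 99% intervals for σ and μ from the 30 sample weights:

```python
import numpy as np
from scipy import stats

weights = [501.4, 498.0, 498.6, 499.2, 495.2, 501.4, 509.5, 494.9, 498.6, 497.6,
           505.5, 505.1, 499.8, 502.4, 497.0, 504.3, 499.7, 497.9, 496.5, 498.9,
           504.9, 503.2, 503.0, 502.6, 496.8, 498.2, 500.1, 497.9, 502.2, 503.2]

n = len(weights)                     # 30
ybar = np.mean(weights)              # about 500.453
s = np.std(weights, ddof=1)          # about 3.433

# 99% CI for sigma: divide (n - 1)s^2 by the upper and lower chi-square percentiles
chi_U = stats.chi2.ppf(0.995, df=n - 1)   # about 52.34
chi_L = stats.chi2.ppf(0.005, df=n - 1)   # about 13.12
ci_sigma = (np.sqrt((n - 1) * s**2 / chi_U), np.sqrt((n - 1) * s**2 / chi_L))

# 99% CI for mu (Chapter 5): ybar +/- t * s / sqrt(n), with t the .995 percentile
t_crit = stats.t.ppf(0.995, df=n - 1)     # about 2.756
ci_mu = (ybar - t_crit * s / np.sqrt(n), ybar + t_crit * s / np.sqrt(n))

print(ci_sigma)  # approximately (2.56, 5.10)
print(ci_mu)     # approximately (498.7, 502.2)
```

Note the asymmetry of the interval for σ about s = 3.433, a consequence of the skewed chi-square distribution, in contrast to the symmetric t interval for μ.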
EXAMPLE 7.2
New guidelines define persons as diabetic if results from their fasting plasma glucose test on two different days are 126 milligrams per deciliter (mg/dL) or higher. People who have a reading between 110 and 125 are considered in danger of becoming diabetic, as their ability to process glucose is impaired. These people should be tested more frequently and counseled about ways to lower their blood sugar level and reduce the risk of heart disease. Amid sweeping changes in U.S. health care, the trend toward cost-effective self-care products used in the home emphasizes prevention and early intervention. The home test kit market is offering faster and easier products that lend themselves to being used in less-sophisticated environments to meet consumers' needs. A home blood sugar (glucose) test measures the level of glucose in your blood at the time of testing. The test can be done at home, or anywhere, using a small portable machine called a blood glucose meter. People who take insulin to control their diabetes may need to check their blood glucose level several times a day. Testing blood sugar at home is often called home blood sugar monitoring or self-testing. Home glucose meters are not usually as accurate as laboratory measurement. Problems arise from the machines not being properly maintained and, more importantly, when the persons conducting the tests are the patients themselves, who may be quite elderly and in poor health. In order to evaluate the variability in readings from such devices, blood samples with a glucose level of 200 mg/dL are given to 20 diabetic patients to perform a self-test for glucose level. Trained technicians using the same self-test equipment obtain readings that have a standard deviation of 5 mg/dL.
The manufacturer of the equipment claims that, with minimal instruction, anyone can obtain the same level of consistency in their measurements. The readings from the 20 diabetic patients are given here:

203.1 184.5 206.8 211.0 218.3 174.2 193.2 201.9 199.9 194.3
199.4 193.6 194.6 187.2 197.8 184.3 196.1 196.4 197.5 187.9

Statistical Test for σ² (or σ)

H₀:  1. σ² ≤ σ₀²   2. σ² ≥ σ₀²   3. σ² = σ₀²

Hₐ:  1. σ² > σ₀²   2. σ² < σ₀²   3. σ² ≠ σ₀²

T.S.:  χ² = (n − 1)s² / σ₀²

R.R.: For a specified value of α,
1. Reject H₀ if χ² is greater than χ²_U, the upper-tail value for α and df = n − 1.
2. Reject H₀ if χ² is less than χ²_L, the lower-tail value for 1 − α and df = n − 1.
3. Reject H₀ if χ² is greater than χ²_U, based on α/2 and df = n − 1, or less than χ²_L, based on 1 − α/2 and df = n − 1.

Check assumptions and draw conclusions.

Use these data to determine whether there is sufficient evidence that the variability in readings from the diabetic patients is higher than the manufacturer's claim. Use α = .05.

Solution  The manufacturer claims that the diabetic patients should have a standard deviation of 5 mg/dL. The appropriate hypotheses are

H₀: σ² ≤ 5² = 25 (manufacturer's claim is correct)
Hₐ: σ² > 25 (manufacturer's claim is false)

In order to apply our test statistic to these hypotheses, it is necessary to check whether the data appear to have been generated from a normally distributed population. From Figure 7.6, we observe that the plotted points fall relatively close to the straight line and that the p-value for testing normality is greater than .10. Thus, the normality condition appears to be satisfied. From the 20 data values, we compute the sample standard deviation s = 9.908. The test statistic and rejection region are as follows:

T.S.:  χ² = (n − 1)s² / σ₀² = (20 − 1)(9.908)² / 5² = 74.61

R.R.: For α = .05, the null hypothesis H₀ is rejected if the value of the T.S. is greater than 30.14, obtained from Table 7 in the Appendix for α = .05 and df = n − 1 = 19.

Conclusion: Since the computed value of the T.S., 74.61, is greater than the critical value 30.14, there is sufficient evidence to reject H₀, the manufacturer's claim, at the .05 level. In fact, the p-value of the T.S. is

p-value = P(χ²₁₉ > 74.61) < P(χ²₁₉ > 43.82) = .001

using Table 7 from the Appendix. Thus, there is very strong evidence that patients using the self-test for glucose may have larger variability in their readings than what the manufacturer claimed.
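As a numerical check on Example 7.2, the test statistic, the critical value, and an exact p-value can be computed from the summary statistics. This is a sketch using SciPy (not part of the text); `chi2.sf`, the survival function, gives the upper-tail area beyond the test statistic:

```python
from scipy import stats

n, s, sigma0 = 20, 9.908, 5.0            # summary values from Example 7.2

ts = (n - 1) * s**2 / sigma0**2          # chi-square test statistic
crit = stats.chi2.ppf(0.95, df=n - 1)    # upper-tail critical value, alpha = .05
p_value = stats.chi2.sf(ts, df=n - 1)    # P(chi-square with 19 df > ts)

print(round(ts, 2), round(crit, 2))      # 74.61 30.14
print(p_value < 0.001)                   # True, consistent with p-value < .001
```

The exact upper-tail area is far below .001, which is why the text can only bound the p-value using the last column of Table 7.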
In fact, to further assess the size of this standard deviation, a 95% confidence interval for σ is given by

(√((20 − 1)(9.908)² / 32.85), √((20 − 1)(9.908)² / 8.907)) = (7.53, 14.47)

Therefore, the standard deviation in glucose measurements for the diabetic patients is potentially considerably higher than the standard deviation for the trained technicians.

[Figure 7.6: Normal probability plot for glucose readings (N = 20, mean = 196.2, StDev = 9.908, RJ = .983, p-value > .100).]

The inference methods about σ are based on the condition that the random sample is selected from a population having a normal distribution, similar to the requirements for using t distribution–based inference procedures. However, when sample sizes are moderate to large (n ≥ 30), the t distribution–based procedures can be used to make inferences about μ even when the normality condition does not hold, because for moderate to large sample sizes the Central Limit Theorem provides that the sampling distribution of the sample mean is approximately normal. Unfortunately, the same type of result does not hold for the chi-square–based procedures for making inferences about σ; that is, if the population distribution is distinctly nonnormal, then these procedures for σ are not appropriate even if the sample size is large. Population nonnormality, in the form of skewness or heavy tails, can have serious effects on the nominal significance and confidence probabilities for σ. If a boxplot or normal probability plot of the sample data shows substantial skewness or a substantial number of outliers, the chi-square–based inference procedures should not be applied. There are some alternative approaches that involve computationally elaborate inference procedures. One such procedure is the bootstrap.
Bootstrapping is a technique that provides a simple and practical way to estimate the uncertainty in sample statistics like the sample variance. We can use bootstrap techniques to estimate the sampling distribution of the sample variance. The estimated sampling distribution is then manipulated to produce confidence intervals for σ and rejection regions for tests of hypotheses about σ. Information about bootstrapping can be found in the books by Efron and Tibshirani (An Introduction to the Bootstrap, Chapman and Hall, New York, 1993) and by Manly (Randomization, Bootstrap and Monte Carlo Methods in Biology, Chapman and Hall, New York, 1998).

EXAMPLE 7.3
A simulation study was conducted to investigate the effect on the level of the chi-square test of sampling from heavy-tailed and skewed distributions rather than the required normal distribution. The five distributions were normal, uniform (short-tailed), t distribution with df = 5 (heavy-tailed), and two gamma distributions, one slightly skewed (shape = 1) and the other heavily skewed (shape = .1). Some summary statistics about the distributions are given in Table 7.1.

TABLE 7.1  Summary statistics for distributions in simulation

Summary Statistic   Normal   Uniform   t (df = 5)   Gamma (shape = 1)   Gamma (shape = .1)
Mean                  0       17.32        0              10                  3.162
Variance             100      100         100            100                 100
Skewness              0        0           0               2                  6.32
Kurtosis              3        1.8         9               9                 63

Note that each of the distributions has the same variance, σ² = 100, but the skewness and kurtosis of the distributions vary. Skewness is a measure of lack of symmetry, and kurtosis is a measure of the peakedness or flatness of a distribution. From each of the distributions, 2,500 random samples of sizes 10, 20, and 50 were selected, and a test of H₀: σ² ≤ 100 versus Hₐ: σ² > 100 and a test of H₀: σ² ≥ 100 versus Hₐ: σ² < 100 were conducted using α = .05 for both sets of hypotheses. A chi-square test of variance was performed for each of the 2,500 samples of the various sample sizes from each of the five distributions. The results are given in Table 7.2.
What do the results indicate about the sensitivity of the test to sampling from a nonnormal population?

Solution  The values in Table 7.2 are estimates of the probability of a Type I error, α, for the chi-square test about variances. When the samples are taken from a normal population, the actual probabilities of a Type I error are very nearly equal to the nominal α = .05 value. When the population distribution is symmetric with shorter tails than a normal distribution, the actual probabilities are smaller than .05, whereas for a symmetric distribution with heavy tails, the Type I error probabilities are much greater than .05. Also, for the two skewed distributions, the actual α values are much larger than the nominal .05 value. Furthermore, as the population distribution becomes more skewed, the deviation from .05 increases. From these results, there is strong evidence that the claimed α value of the chi-square test of a population variance is very sensitive to nonnormality. This strongly reinforces our recommendation to evaluate the normality of the data prior to conducting the chi-square test of a population variance.
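A simulation along the lines of Example 7.3 is straightforward to reproduce. The sketch below is an illustration only: it uses fewer replications than the study's 2,500 and just two of the five distributions, estimating the actual Type I error rate of the upper-tail chi-square test (Hₐ: σ² > 100) for a normal and a right-skewed gamma population, each with variance 100:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable

def reject_rate(draw, n=10, reps=2000, sigma0_sq=100.0, alpha=0.05):
    """Estimate P(Type I error) for the test of Ha: sigma^2 > sigma0_sq."""
    crit = stats.chi2.ppf(1 - alpha, df=n - 1)
    rejections = 0
    for _ in range(reps):
        sample = draw(n)
        ts = (n - 1) * np.var(sample, ddof=1) / sigma0_sq
        if ts > crit:
            rejections += 1
    return rejections / reps

# Normal with mean 0, standard deviation 10: variance 100, as in Table 7.1
normal_rate = reject_rate(lambda n: rng.normal(0.0, 10.0, n))

# Gamma with shape 1 and scale 10: variance = shape * scale^2 = 100, skewness 2
gamma_rate = reject_rate(lambda n: rng.gamma(1.0, 10.0, n))

print(normal_rate)  # close to the nominal .05
print(gamma_rate)   # inflated well above .05, as Table 7.2 reports
```

Even this small experiment shows the pattern in Table 7.2: the normal population holds the nominal level, while the skewed gamma population rejects far too often.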

TABLE 7.2  Proportion of times H₀ was rejected, α = .05

Hₐ: σ² > 100
Sample Size   Normal   Uniform   t (df = 5)   Gamma (shape = 1)   Gamma (shape = .1)
n = 10         .047     .004       .083            .134                .139
n = 20         .052     .006       .103            .139                .175
n = 50         .049     .004       .122            .156                .226

Hₐ: σ² < 100
Sample Size   Normal   Uniform   t (df = 5)   Gamma (shape = 1)   Gamma (shape = .1)
n = 10         .046     .018       .119            .202                .213
n = 20         .050     .011       .140            .213                .578
n = 50         .051     .018       .157            .220                .528

7.3  Estimation and Tests for Comparing Two Population Variances

In the research study about E. coli detection methods, we are concerned about comparing the standard deviations of the two procedures. In many situations in which we are comparing two processes or two suppliers of a product, we need to compare the standard deviations of the populations associated with process measurements. Another major application of a test for the equality of two population variances is for evaluating the validity of the equal variance condition (that is, σ₁² = σ₂²) for a two-sample t test. The test developed in this section requires that the two population distributions both have normal distributions. We are interested in comparing the variance of population 1, σ₁², to the variance of population 2, σ₂².

When random samples of sizes n₁ and n₂ have been independently drawn from two normally distributed populations, the ratio

(s₁² / σ₁²) / (s₂² / σ₂²)

which reduces to s₁²/s₂² when σ₁² = σ₂², possesses a probability distribution in repeated sampling that is referred to as an F distribution. The formula for the probability distribution is omitted here, but we will specify its properties.

Properties of the F Distribution

1. Unlike t or z but like χ², F can assume only positive values.

2. The F distribution, unlike the normal distribution or the t distribution but like the χ² distribution, is nonsymmetrical. See Figure 7.7.

3. There are many F distributions, and each one has a different shape.

We specify a particular one by designating the degrees of freedom associated with s₁² and s₂². We denote these quantities by df₁ and df₂, respectively. See Figure 7.7.

4. Tail values for the F distribution are tabulated and appear in Table 8

in the Appendix.

Table 8 in the Appendix records upper-tail values of F corresponding to areas α = .25, .10, .05, .025, .01, .005, and .001. The degrees of freedom for s₁², designated by df₁, are indicated across the top of the table; df₂, the degrees of freedom for s₂², appear in the first column to the left. Values of α are given in the next column. Thus, for df₁ = 5 and df₂ = 10, the critical values of F corresponding to α = .25, .10, .05, .025, .01, .005, and .001 are, respectively, 1.59, 2.52, 3.33, 4.24, 5.64, 6.78, and 10.48. It follows that only 5% of the measurements from an F distribution with df₁ = 5 and df₂ = 10 would exceed 3.33 in repeated sampling. See Figure 7.8. Similarly, for df₁ = 24 and df₂ = 10, the critical values of F corresponding to tail areas of α = .01 and .001 are, respectively, 4.33 and 7.64.

[Figure 7.7: Densities of two F distributions, (df₁ = 5, df₂ = 10) and (df₁ = 10, df₂ = 20).]
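As with the chi-square table, the F critical values quoted above can be reproduced in software. A sketch using SciPy (illustrative, not part of the text); `f.isf` returns the value with the given area to its right:

```python
from scipy import stats

# Upper-tail critical values of F for df1 = 5, df2 = 10 (compare with Table 8)
for area in [0.25, 0.10, 0.05, 0.025, 0.01, 0.005, 0.001]:
    print(area, round(stats.f.isf(area, dfn=5, dfd=10), 2))

# Only about 5% of measurements from F(df1 = 5, df2 = 10) exceed 3.33
print(stats.f.sf(3.33, dfn=5, dfd=10))
```

The survival function call at the end confirms the statement in the text: the area to the right of 3.33 under the F(5, 10) density is approximately .05.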