Confidence Intervals for the Variance and Standard Deviation of a Normal Population
7.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population
Although inferences concerning a population variance $\sigma^2$ or standard deviation $\sigma$ are
usually of less interest than those about a mean or proportion, there are occasions when such procedures are needed. In the case of a normal population distribution,
inferences are based on the following result concerning the sample variance $S^2$.
Theorem
Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$. Then the rv
$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$$
has a chi-squared ($\chi^2$) probability distribution with $n - 1$ df.
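The theorem can be checked informally by simulation. The sketch below is illustrative only: the function name `scaled_sample_variances` and the values chosen for $n$, $\mu$, $\sigma$, and the number of replications are assumptions, not part of the text. It relies on the standard fact that a chi-squared rv with $k$ df has mean $k$ and variance $2k$.

```python
import random
import statistics

def scaled_sample_variances(n, mu, sigma, reps, seed=0):
    """Simulate (n - 1) * S^2 / sigma^2 for `reps` normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # sample variance, divisor n - 1
        values.append((n - 1) * s2 / sigma**2)
    return values

# Hypothetical settings chosen only for illustration.
n = 10
values = scaled_sample_variances(n=n, mu=50.0, sigma=4.0, reps=10000)

# A chi-squared rv with n - 1 df has mean n - 1 and variance 2(n - 1).
print("simulated mean:    ", round(statistics.mean(values), 2), "target", n - 1)
print("simulated variance:", round(statistics.variance(values), 2), "target", 2 * (n - 1))
```

The simulated mean and variance should land close to $n - 1$ and $2(n - 1)$, whatever values of $\mu$ and $\sigma$ are used, which is the sense in which the distribution does not depend on the parameters.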
As discussed in Sections 4.4 and 7.1, the chi-squared distribution is a continuous probability distribution with a single parameter $\nu$, called the number of degrees
of freedom, with possible values 1, 2, 3, … . The graphs of several $\chi^2$ probability
density functions (pdf's) are illustrated in Figure 7.10. Each pdf $f(x; \nu)$ is positive only for $x > 0$, and each has a positive skew (stretched out upper tail), though the distribution moves rightward and becomes more symmetric as $\nu$ increases. To specify inferential procedures that use the chi-squared distribution, we need notation
analogous to that for a t critical value $t_{\alpha,\nu}$.
Figure 7.10 Graphs of chi-squared density functions
Notation
Let $\chi^2_{\alpha,\nu}$, called a chi-squared critical value, denote the number on the horizontal axis such that $\alpha$ of the area under the chi-squared curve with $\nu$ df lies
to the right of $\chi^2_{\alpha,\nu}$.
Symmetry of t distributions made it necessary to tabulate only upper-tailed
t critical values ($t_{\alpha,\nu}$ for small values of $\alpha$). The chi-squared distribution is not symmetric, so Appendix Table A.7 contains values of $\chi^2_{\alpha,\nu}$ both for $\alpha$ near 0 and near 1, as illustrated in Figure 7.11(b). For example, $\chi^2_{.025,14} = 26.119$, and $\chi^2_{.95,20}$ (the 5th
percentile) $= 10.851$.
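Critical values like these can also be computed rather than read from a table. The following is a minimal sketch using only the Python standard library; the helper names `chi2_cdf` and `chi2_critical` are hypothetical, and the series-plus-bisection approach is one simple choice for illustration, not the method used to build Table A.7.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-squared rv with df degrees of freedom.

    Uses the series for the regularized lower incomplete gamma
    function P(df/2, x/2); adequate for moderate x and df.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))

def chi2_critical(alpha, df):
    """Value with area alpha to its RIGHT under the chi-squared(df) curve."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > alpha:  # widen bracket until tail area <= alpha
        hi *= 2.0
    for _ in range(200):  # bisection on the upper-tail area
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Compare with the two tabled values quoted in the text.
print(round(chi2_critical(0.025, 14), 3))
print(round(chi2_critical(0.95, 20), 3))
```

In practice a library routine such as SciPy's `scipy.stats.chi2.isf(alpha, df)` would normally replace this hand-written version; the sketch is only meant to make the right-tail-area convention concrete.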
Figure 7.11 $\chi^2_{\alpha,\nu}$ notation illustrated: (a) $\chi^2_\nu$ density curve, shaded area $= \alpha$; (b) each shaded area $= .01$
The rv $(n - 1)S^2/\sigma^2$ satisfies the two properties on which the general method for obtaining a CI is based: It is a function of the parameter of interest $\sigma^2$, yet its
probability distribution (chi-squared) does not depend on this parameter. The area
under a chi-squared curve with $\nu$ df to the right of $\chi^2_{\alpha/2,\nu}$ is $\alpha/2$, as is the area to the
left of $\chi^2_{1-\alpha/2,\nu}$. Thus the area captured between these two critical values is $1 - \alpha$. As a consequence of this and the theorem just stated,
$$P\left( \chi^2_{1-\alpha/2,\,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,\,n-1} \right) = 1 - \alpha \qquad (7.17)$$
The inequalities in (7.17) are equivalent to
$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$
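The equivalence rests on one algebraic fact: all three quantities in (7.17) are positive, so taking reciprocals reverses the inequalities. Written out step by step, as a sketch of the omitted algebra:

```latex
\begin{align*}
\chi^2_{1-\alpha/2,\,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,\,n-1}
&\iff \frac{1}{\chi^2_{\alpha/2,\,n-1}} < \frac{\sigma^2}{(n-1)S^2} < \frac{1}{\chi^2_{1-\alpha/2,\,n-1}} \\
&\iff \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}
\end{align*}
```

Note that the larger critical value $\chi^2_{\alpha/2,\,n-1}$ ends up in the lower limit and the smaller one in the upper limit.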
Substituting the computed value $s^2$ into the limits gives a CI for $\sigma^2$, and taking square
roots gives an interval for $\sigma$.
A $100(1 - \alpha)\%$ confidence interval for the variance $\sigma^2$ of a normal population has lower limit
$$(n - 1)s^2 / \chi^2_{\alpha/2,\,n-1}$$
and upper limit
$$(n - 1)s^2 / \chi^2_{1-\alpha/2,\,n-1}$$
A confidence interval for $\sigma$ has lower and upper limits that are the square
roots of the corresponding limits in the interval for $\sigma^2$. An upper or a lower
confidence bound results from replacing $\alpha/2$ with $\alpha$ in the corresponding limit of the CI.
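Spelled out, the one-sided bounds obtained by this replacement are as follows; this is simply the substitution described in the text, each statement holding with confidence level $100(1 - \alpha)\%$:

```latex
\begin{align*}
\text{upper confidence bound:} \quad & \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha,\,n-1}} \\[4pt]
\text{lower confidence bound:} \quad & \sigma^2 > \frac{(n-1)s^2}{\chi^2_{\alpha,\,n-1}}
\end{align*}
```

Taking the square root of either bound gives the corresponding bound for $\sigma$.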
Example 7.15 The accompanying data on breakdown voltage of electrically stressed circuits was
read from a normal probability plot that appeared in the article “Damage of Flexible Printed Wiring Boards Associated with Lightning-Induced Voltage Surges”
(IEEE Transactions on Components, Hybrids, and Manuf. Tech., 1985: 214–220).
The straightness of the plot gave strong support to the assumption that breakdown voltage is approximately normally distributed.
Let $\sigma^2$ denote the variance of the breakdown voltage distribution. The computed
value of the sample variance is $s^2 = 137{,}324.3$, the point estimate of $\sigma^2$. With
df $= n - 1 = 16$, a 95% CI requires $\chi^2_{.975,16} = 6.908$ and $\chi^2_{.025,16} = 28.845$. The
interval is
$$\left( \frac{16(137{,}324.3)}{28.845},\ \frac{16(137{,}324.3)}{6.908} \right) = (76{,}172.3,\ 318{,}064.4)$$
Taking the square root of each endpoint yields (276.0, 564.0) as the 95% CI for $\sigma$. These intervals are quite wide, reflecting substantial variability in breakdown voltage in combination with a small sample size.
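The arithmetic in this example can be reproduced in a few lines. The sketch below takes the sample variance, the sample size implied by df = 16, and the two tabled critical values as given; the function name `variance_ci` and its parameter names are hypothetical, introduced only for illustration.

```python
import math

def variance_ci(s2, n, chi2_upper_tail, chi2_lower_tail):
    """Two-sided CI for sigma^2 of a normal population.

    chi2_upper_tail is the critical value with area alpha/2 to its right,
    chi2_lower_tail the one with area 1 - alpha/2 to its right, both
    with n - 1 df (read here from a table).
    """
    numerator = (n - 1) * s2
    return numerator / chi2_upper_tail, numerator / chi2_lower_tail

# Values quoted in the example: s^2 = 137,324.3 and df = n - 1 = 16.
var_lo, var_hi = variance_ci(
    s2=137_324.3, n=17, chi2_upper_tail=28.845, chi2_lower_tail=6.908
)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"95% CI for the variance:           ({var_lo:,.1f}, {var_hi:,.1f})")
print(f"95% CI for the standard deviation: ({sd_lo:.1f}, {sd_hi:.1f})")
```

The larger critical value divides into the lower limit, so swapping the two arguments would produce an interval with its endpoints reversed.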
CIs for $\sigma^2$ and $\sigma$ when the population distribution is not normal can be difficult
to obtain. For such cases, consult a knowledgeable statistician.