3.5 Estimating a Variance Ratio
In statistical tests of hypotheses concerning more than one distribution, one often needs to compare the respective distribution variances. We now present the topic of estimating a confidence interval for the ratio of two variances, $\sigma_1^2$ and $\sigma_2^2$, based on sample variances, $v_1$ and $v_2$, computed on datasets of size $n_1$ and $n_2$, respectively. We assume normal distributions for the two populations from which the data samples were obtained. We use the sampling distribution of the ratio:
$$\frac{v_1/\sigma_1^2}{v_2/\sigma_2^2}, \tag{3.21}$$
which has the $F_{n_1-1,\,n_2-1}$ distribution, as mentioned in section B.2.9 (Property 6). Thus, the $1-\alpha$ two-sided confidence interval of the variance ratio can be computed as:
$$\frac{1}{F_{1-\alpha/2}}\,\frac{v_1}{v_2} \;\le\; \frac{\sigma_1^2}{\sigma_2^2} \;\le\; \frac{1}{F_{\alpha/2}}\,\frac{v_1}{v_2},$$
where the degrees of freedom are omitted from the $F$ percentiles to simplify notation. Note that, due to the asymmetry of the F distribution, one needs to compute two different percentiles in two-sided interval estimation.
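The interval computation can be sketched in a few lines of Python. This is only an illustrative sketch, not the civar2 function of Appendix F: it assumes the usual construction in which the sample variance ratio $v_1/v_2$ is divided by the $1-\alpha/2$ and $\alpha/2$ percentiles of the $F_{n_1-1,\,n_2-1}$ distribution. The names `variance_ratio_ci`, `f_cdf` and `f_ppf` are hypothetical, and the F percentile is obtained here by bisection on a standard-library implementation of the regularized incomplete beta function rather than by a statistics package.

```python
from math import exp, lgamma, log, log1p


def _beta_cf(a, b, x, max_iter=2000, eps=1e-14):
    """Continued fraction of the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for num in (even, odd):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            if abs(c) < tiny:
                c = tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:  # last correction factor is ~1: converged
            break
    return h


def betainc_reg(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) for faster convergence.
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(x, df1, df2):
    """Cumulative distribution function of the F(df1, df2) distribution."""
    if x <= 0.0:
        return 0.0
    return betainc_reg(df1 / 2.0, df2 / 2.0, df1 * x / (df1 * x + df2))


def f_ppf(p, df1, df2):
    """Percentile (inverse CDF) of F(df1, df2), found by bisection."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, df1, df2) < p:  # expand until the percentile is bracketed
        hi *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f_cdf(mid, df1, df2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def variance_ratio_ci(v1, n1, v2, n2, alpha):
    """Return (ratio, lower, upper) for the 1 - alpha two-sided interval."""
    ratio = v1 / v2
    df1, df2 = n1 - 1, n2 - 1
    lower = ratio / f_ppf(1.0 - alpha / 2.0, df1, df2)
    upper = ratio / f_ppf(alpha / 2.0, df1, df2)
    return ratio, lower, upper


# Data of Example 3.8 (sample standard deviations squared), alpha = 0.10.
ratio, lower, upper = variance_ratio_ci(15.14**2, 384, 13.58**2, 579, 0.10)
print(ratio, lower, upper)
```

Bisection is slow compared with the dedicated percentile routines of statistical software, but it keeps the sketch self-contained. In practice one would call the F percentile function of the package at hand.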
The confidence intervals for the variance ratio are computed by SPSS, STATISTICA, MATLAB and R as part of hypothesis tests presented in the following chapter. We also provide the MATLAB and R function civar2(v1,n1,v2,n2,alpha) for computing confidence intervals of a variance ratio (see Appendix F).
Example 3.8
Q: Consider the distribution of variable ASTV (percentage of abnormal beat-to-beat variability), for the first two classes of the cardiotocographic data (CTG). The respective dataset histograms are shown in Figure 3.6. Class 1 corresponds to “calm sleep” and class 2 to “rapid-eye-movement sleep”. The assumption of normality for both distributions of ASTV is acceptable (to be discussed in Chapter 5). Determine and interpret the 95% one-sided confidence interval, [r, ∞[, of the ASTV standard deviation ratio for the two classes.
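A: There are $n_1 = 384$ cases of class 1 and $n_2 = 579$ cases of class 2, with sample standard deviations $s_1 = 15.14$ and $s_2 = 13.58$, respectively. The 95% F percentile, computed by any of the means explained in section 3.2, is:
$$F_{383,\,578,\,0.95} = 1.164.$$
The lower limit of the one-sided interval for the standard deviation ratio is therefore:
$$r = \sqrt{\frac{1}{F_{383,\,578,\,0.95}}\,\frac{s_1^2}{s_2^2}} = \sqrt{\frac{1.243}{1.164}} = 1.03.$$
Thus, at the 95% confidence level, the standard deviation of class 1 is higher than the standard deviation of class 2 by at least 3%.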
Figure 3.6. Histograms obtained with STATISTICA of the variable ASTV (percentage of abnormal beat-to-beat variability), for the first two classes of the cardiotocographic data, with superimposed normal fit.
When using F percentiles the following results can be useful:
i. $F_{df_2,\,df_1,\,1-\alpha} = 1/F_{df_1,\,df_2,\,\alpha}$. For instance, if in Example 3.8 we wished to compute a 95% one-sided confidence interval, [0, r], for $\sigma_2/\sigma_1$, we would then have to compute $F_{578,\,383,\,0.05} = 1/F_{383,\,578,\,0.95} = 0.859$.
ii. $F_{df,\,\infty,\,\alpha} = \chi^2_{df,\,\alpha}/df$. Note that, in formula 3.21, with $n_2 \to \infty$ the sample variance $v_2$ converges to the true variance, $\sigma_2^2$, yielding, therefore, the single-variance situation described by the chi-square distribution. In this sense the chi-square distribution can be viewed as a limiting case of the F distribution.
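Result i follows directly from the definition of a percentile, and a short derivation may help in seeing why the degrees of freedom swap. The sketch below writes $X$ for a random variable assumed to have the $F_{df_1,\,df_2}$ distribution (a symbol introduced here only for illustration) and uses the fact that $1/X$ then has the $F_{df_2,\,df_1}$ distribution:

```latex
\begin{aligned}
P\!\left(X \le F_{df_1,\,df_2,\,\alpha}\right) = \alpha
  &\;\Longleftrightarrow\;
  P\!\left(\frac{1}{X} \ge \frac{1}{F_{df_1,\,df_2,\,\alpha}}\right) = \alpha \\
  &\;\Longleftrightarrow\;
  P\!\left(\frac{1}{X} \le \frac{1}{F_{df_1,\,df_2,\,\alpha}}\right) = 1 - \alpha ,
\end{aligned}
\qquad\text{hence}\qquad
F_{df_2,\,df_1,\,1-\alpha} = \frac{1}{F_{df_1,\,df_2,\,\alpha}} .
```

The first step is valid because $X$ is positive, and the second because the F distribution is continuous, so the boundary point carries no probability.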
Commands 3.6. MATLAB and R commands for obtaining confidence intervals of a variance ratio.
MATLAB   civar2(v1,n1,v2,n2,alpha)
R        civar2(v1,n1,v2,n2,alpha)
The MATLAB and R function civar2 returns a vector with three elements: the variance ratio, followed by the lower and upper confidence interval limits. As an illustration we show the application of the R function civar2 to Example 3.8:
> civar2(15.14^2,384,13.58^2,579,0.10)
         [,1]
[1,] 1.242946
[2,] 1.067629
[3,] 1.451063
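Note that, since we are computing a one-sided 95% confidence interval with a function that returns two-sided limits, we must specify twice the alpha value (0.10 instead of 0.05). The lower limit is then based on the 0.95 percentile of the F distribution. The obtained lower limit, 1.068, is the square of 1.033, and is therefore in close agreement with the value we found in Example 3.8.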