Example 14.10 The article “Class Start Times, Sleep, and Academic Performance in College: A
Path Analysis” (Chronobiology International, 2012: 318–335) reported on a study in which students were surveyed about various aspects of sleep behavior during a particular two-week period. Here is data on average sleep time per day (h) for 100 of the students:
Is it reasonable to assume that the population distribution of average sleep time is at least approximately normal? The histogram in Figure 14.3 is not persuasive. So let’s carry out a chi-squared test of the null hypothesis that the distribution is normal.
Figure 14.3 Histogram of the sleep time data from Example 14.10 (horizontal axis: average sleep time)
Suppose that prior to sampling, it was believed that plausible values of $\mu$ and $\sigma$ were 8 and 1, respectively. The eight equiprobable class intervals for the standard normal distribution (each with probability .125) are $(-\infty, -1.15)$, $[-1.15, -.67)$, $[-.67, -.32)$, $[-.32, 0)$, $[0, .32)$, $[.32, .67)$, $[.67, 1.15)$, and $[1.15, \infty)$, with each endpoint also giving the distance in standard deviations from the mean for any other normal distribution. For $\mu = 8$ and $\sigma = 1$, these intervals transform to $(-\infty, 6.85)$, $[6.85, 7.33)$, $[7.33, 7.68)$, $[7.68, 8.00)$, $[8.00, 8.32)$, $[8.32, 8.67)$, $[8.67, 9.15)$, and $[9.15, \infty)$.
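The class boundaries above are the 12.5th, 25th, ..., 87.5th percentiles of the assumed normal distribution. A minimal sketch of that calculation follows, using Python's standard-library `NormalDist`; the helper name `equiprobable_cutpoints` is introduced here for illustration only.

```python
from statistics import NormalDist

def equiprobable_cutpoints(k, mu=0.0, sigma=1.0):
    """Return the k - 1 interior cutpoints that split a normal
    distribution into k intervals, each with probability 1/k."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf(j / k) for j in range(1, k)]

# Standard normal cutpoints (distances from the mean in standard deviations)
z_cuts = equiprobable_cutpoints(8)

# The same cutpoints on the sleep-time scale for mu = 8, sigma = 1
x_cuts = equiprobable_cutpoints(8, mu=8.0, sigma=1.0)

print([round(z, 2) for z in z_cuts])
print([round(x, 2) for x in x_cuts])
```

Because each cutpoint is mean + z(standard deviation), the second list is just the first list shifted by 8.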
To obtain the estimated cell probabilities $p_1(\hat\mu, \hat\sigma), \ldots, p_8(\hat\mu, \hat\sigma)$, we first need the mle’s $\hat\mu$ and $\hat\sigma$. In Chapter 6, the mle of $\sigma$ was shown to be $[\sum (x_i - \bar{x})^2 / n]^{1/2}$ (rather than $s$), so with $s = .9481$,
$$\hat\mu = \bar{x} = 7.876 \qquad \hat\sigma = \left[\frac{\sum (x_i - \bar{x})^2}{n}\right]^{1/2} = \left[\frac{(n-1)s^2}{n}\right]^{1/2} = .9433$$
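The conversion from the sample standard deviation $s$ (divisor $n - 1$) to the mle of $\sigma$ (divisor $n$) is a one-line rescaling. The short sketch below checks the arithmetic using only the summary values quoted in the example; the variable names are illustrative.

```python
import math

n = 100        # number of students in the sample
x_bar = 7.876  # sample mean, which is also the mle of mu
s = 0.9481     # sample standard deviation (divisor n - 1)

# mle of sigma uses divisor n: sigma_hat = sqrt((n - 1) * s^2 / n)
sigma_hat = math.sqrt((n - 1) * s**2 / n)

print(round(sigma_hat, 4))
```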
Each $p_i(\hat\mu, \hat\sigma)$ is then the probability that a normal rv $X$ with mean 7.876 and standard deviation .9433 falls in the $i$th class interval. For example,
$$p_2(\hat\mu, \hat\sigma) = P(6.85 < X < 7.33) = P(-1.09 < Z < -.58) = .1431$$
so $np_2(\hat\mu, \hat\sigma) = 100(.1431) = 14.31$. Observed and estimated expected cell counts are shown in Table 14.8.
Table 14.8 Observed and Expected Counts for Example 14.10 (rows: observed, estimated expected)
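The estimated cell probabilities can be obtained by differencing the fitted normal cdf at the class boundaries. The following is a sketch under the values stated in the example ($\hat\mu = 7.876$, $\hat\sigma = .9433$, $n = 100$). Because it does not round the standardized endpoints to two decimals, its results may differ from the tabled-$z$ values such as .1431 in the third or fourth decimal place.

```python
from statistics import NormalDist

n = 100
fitted = NormalDist(mu=7.876, sigma=0.9433)   # mle's from the example

# Interior class boundaries chosen before sampling (mu = 8, sigma = 1)
cuts = [6.85, 7.33, 7.68, 8.00, 8.32, 8.67, 9.15]

# cdf at -infinity is 0 and at +infinity is 1
cdf_vals = [0.0] + [fitted.cdf(c) for c in cuts] + [1.0]

# p_i = F(upper endpoint) - F(lower endpoint) for each of the 8 cells
probs = [hi - lo for lo, hi in zip(cdf_vals, cdf_vals[1:])]
expected = [n * p for p in probs]

for i, (p, e) in enumerate(zip(probs, expected), start=1):
    print(f"cell {i}: p = {p:.4f}, expected count = {e:.2f}")
```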
The computed value of $\chi^2$ is 5.56. With $k = 8$ cells and $m = 2$ parameters estimated, the 7 df ($k - 1$) and 5 df ($k - 1 - m$) columns of Table A.11 show that both pure chi-squared P-values exceed .10. Therefore our P-value certainly exceeds any reasonable $\alpha$, indicating that the null hypothesis of population normality cannot be rejected. The evidence from the entire sample of $n = 253$ students is somewhat less supportive of $H_0$. The P-value from the special test for normality described in the next subsection is .086.
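The two bounding P-values can also be computed directly rather than read from a table. The sketch below is one way to do so with the standard library alone; the function `chi2_sf` is a hypothetical helper written for this illustration, built on a standard recurrence for the chi-squared upper tail with integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-squared with df degrees of freedom > x),
    for integer df >= 1 and x >= 0.

    Uses the recurrence
        sf(x; d + 2) = sf(x; d) + (x/2)**(d/2) * exp(-x/2) / gamma(d/2 + 1),
    started at d = 1 (complementary error function) or d = 2 (exp(-x/2)).
    """
    if df % 2 == 1:
        d, sf = 1, math.erfc(math.sqrt(x / 2))
    else:
        d, sf = 2, math.exp(-x / 2)
    while d < df:
        sf += (x / 2) ** (d / 2) * math.exp(-x / 2) / math.gamma(d / 2 + 1)
        d += 2
    return sf

chi_sq = 5.56
k, m = 8, 2

p_upper = chi2_sf(chi_sq, k - 1)      # 7 df
p_lower = chi2_sf(chi_sq, k - 1 - m)  # 5 df

print(f"P-value lies between {p_lower:.3f} and {p_upper:.3f}")
```

Both bounds are well above .10, in agreement with the table lookup.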
■
Example 14.11
The article “Some Studies on Tuft Weight Distribution in the Opening Room” (Textile Research J., 1976: 567–573) reports the accompanying data on the distribution of output tuft weight $X$ (mg) of cotton fibers for the input weight $x_0 = 70$.
Interval
0–8 8–16 16–24 24–32 32–40 40–48 48–56 56–64 64–70
Observed frequency 20
Expected frequency 18.0
The authors postulated a truncated exponential distribution:
$$f(x) = \frac{\lambda e^{-\lambda x}}{1 - e^{-\lambda x_0}} \qquad 0 \le x \le x_0$$
The mean of this distribution is
$$\mu = \int_0^{x_0} x f(x)\,dx = \frac{1}{\lambda} - \frac{x_0 e^{-\lambda x_0}}{1 - e^{-\lambda x_0}}$$
The parameter $\lambda$ was estimated by replacing $\mu$ by $\bar{x} = 13.086$ and solving the resulting equation to obtain $\hat\lambda = .0742$ (so $\hat\lambda$ is a method-of-moments estimate and not an mle). Then with $\hat\lambda$ replacing $\lambda$ in $f(x)$, the estimated expected cell frequencies as displayed previously are computed as
$$40p_i(\hat\lambda) = 40P(a_{i-1} \le X < a_i) = 40\int_{a_{i-1}}^{a_i} f(x)\,dx = \frac{40(e^{-\hat\lambda a_{i-1}} - e^{-\hat\lambda a_i})}{1 - e^{-\hat\lambda x_0}}$$
where $[a_{i-1}, a_i)$ is the $i$th class interval. To obtain expected cell counts of at least 5, the last six cells are combined to yield observed counts of 20, 8, 7, 5 and expected counts of 18.0, 9.9, 5.5, 6.6. The computed value of chi-squared is then $\chi^2 = 1.34$ with a corresponding P-value that exceeds .10. Therefore $H_0$ cannot be rejected at significance level .05, so the truncated exponential model provides a good fit.
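The steps of this example (solving the moment equation for $\hat\lambda$, computing expected frequencies, combining cells, and forming the statistic) can be sketched numerically as follows. This is an illustration only: it assumes the truncated exponential mean $1/\lambda - x_0 e^{-\lambda x_0}/(1 - e^{-\lambda x_0})$, uses simple bisection rather than whatever method the article's authors used, and all function names are hypothetical. Since it carries unrounded expected counts, the statistic it produces may differ slightly from the reported 1.34.

```python
import math

X0 = 70.0        # input weight, the truncation point x_0
X_BAR = 13.086   # sample mean reported in the example
N = 40           # total frequency (20 + 8 + 7 + 5)

def truncated_exp_mean(lam, x0=X0):
    """Mean of an exponential distribution truncated to [0, x0]."""
    tail = math.exp(-lam * x0)
    return 1 / lam - x0 * tail / (1 - tail)

def solve_lambda(target, lo=1e-4, hi=1.0, tol=1e-10):
    """Bisection on the moment equation; the mean decreases as lam grows."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if truncated_exp_mean(mid) > target:
            lo = mid   # mean still too large, so lam must be larger
        else:
            hi = mid
    return (lo + hi) / 2

def cell_prob(a, b, lam, x0=X0):
    """P(a <= X < b) under the truncated exponential model."""
    return (math.exp(-lam * a) - math.exp(-lam * b)) / (1 - math.exp(-lam * x0))

lam_hat = solve_lambda(X_BAR)

edges = [0, 8, 16, 24, 32, 40, 48, 56, 64, 70]
expected = [N * cell_prob(a, b, lam_hat) for a, b in zip(edges, edges[1:])]

# Combine the last six cells so every expected count is at least 5
expected_combined = expected[:3] + [sum(expected[3:])]
observed_combined = [20, 8, 7, 5]

chi_sq = sum((o - e) ** 2 / e
             for o, e in zip(observed_combined, expected_combined))

print(f"lambda_hat = {lam_hat:.4f}")
print("expected:", [round(e, 1) for e in expected_combined])
print(f"chi-squared = {chi_sq:.2f}")
```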
■