Example 14.8 Table 14.7 presents count data on the number of Larrea divaricata plants found in each
of 48 sampling quadrats, as reported in the article “Some Sampling Characteristics of Plants and Arthropods of the Arizona Desert” (Ecology, 1962: 567–571).
Table 14.7 Observed Counts for Example 14.8

Cell               1    2    3    4    5
Number of plants   0    1    2    3    ≥4
Frequency          9    9    10   14   6
14.2 Goodness-of-Fit Tests for Composite Hypotheses
The article’s author fit a Poisson distribution to the data. Let $\mu$ denote the Poisson parameter and suppose for the moment that the six counts in cell 5 were actually 4, 4, 5, 5, 6, 6. Then denoting sample values by $x_1, \ldots, x_{48}$, nine of the $x_i$’s were 0, nine were 1, and so on. The likelihood of the observed sample is
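$$\frac{e^{-\mu}\mu^{x_1}}{x_1!} \cdots \frac{e^{-\mu}\mu^{x_{48}}}{x_{48}!} = \frac{e^{-48\mu}\mu^{\sum x_i}}{x_1! \cdots x_{48}!} = \frac{e^{-48\mu}\mu^{101}}{x_1! \cdots x_{48}!}$$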
The value of $\mu$ for which this is maximized is $\hat{\mu} = \sum x_i / n = 101/48 = 2.10$ (the value reported in the article).
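The arithmetic behind this estimate can be checked with a short Python sketch. It rebuilds the 48 sample values under the text's temporary supposition about the six counts in cell 5 and computes the Poisson maximum likelihood estimate as the sample mean. The variable names are illustrative only.

```python
# Observed frequencies for counts 0, 1, 2, 3 in Example 14.8
freq = {0: 9, 1: 9, 2: 10, 3: 14}

# Individual values supposed "for the moment" for the six quadrats in cell 5;
# these are the text's hypothetical values, not reported data
cell5_values = [4, 4, 5, 5, 6, 6]

# Expand the frequencies into the full list of sample values
sample = [x for x, f in freq.items() for _ in range(f)] + cell5_values

n = len(sample)          # number of quadrats
total = sum(sample)      # sum of the x_i
mu_hat_full = total / n  # Poisson MLE based on the full sample

print(f"n = {n}, sum = {total}, mu_hat = {mu_hat_full:.2f}")
```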
However, the $\hat{\mu}$ required for $\chi^2$ is obtained by maximizing Expression (14.4) rather than the likelihood of the full sample. The cell probabilities are
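$$\pi_i(\mu) = \frac{e^{-\mu}\mu^{i-1}}{(i-1)!}, \quad i = 1, 2, 3, 4 \qquad\qquad \pi_5(\mu) = 1 - \sum_{i=0}^{3} \frac{e^{-\mu}\mu^{i}}{i!}$$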
so the right-hand side of (14.4) becomes
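$$\left[\frac{e^{-\mu}\mu^{0}}{0!}\right]^{9} \left[\frac{e^{-\mu}\mu^{1}}{1!}\right]^{9} \left[\frac{e^{-\mu}\mu^{2}}{2!}\right]^{10} \left[\frac{e^{-\mu}\mu^{3}}{3!}\right]^{14} \left[1 - \sum_{i=0}^{3} \frac{e^{-\mu}\mu^{i}}{i!}\right]^{6}$$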
There is no nice formula for $\hat{\mu}$, the maximizing value of $\mu$, in this latter expression, so it must be obtained numerically.
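One way to carry out such a numerical maximization is sketched below in Python. It maximizes the logarithm of the grouped-data expression by golden-section search. The choice of search method, the bracket [0.5, 5], and all function names are assumptions made for illustration; the text does not say how the maximization should be done.

```python
import math

# Observed cell frequencies for 0, 1, 2, 3 and ">= 4" plants (Example 14.8)
observed = [9, 9, 10, 14, 6]

def cell_probs(mu):
    """Poisson probabilities for counts 0, 1, 2, 3 plus the pooled cell >= 4."""
    p = [math.exp(-mu) * mu**i / math.factorial(i) for i in range(4)]
    return p + [1.0 - sum(p)]

def grouped_loglik(mu):
    """Log of the product of cell probabilities raised to the observed counts."""
    return sum(n_i * math.log(p_i) for n_i, p_i in zip(observed, cell_probs(mu)))

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximize f on [lo, hi], assuming f has a single peak in that interval."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            # The peak lies in [a, d]; shrink from the right
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # The peak lies in [c, b]; shrink from the left
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

# The bracket [0.5, 5.0] is an assumed search range, not taken from the text
mu_hat_grouped = golden_section_max(grouped_loglik, 0.5, 5.0)
print(f"grouped-data estimate of mu: {mu_hat_grouped:.3f}")
```

Working with the logarithm leaves the maximizer unchanged and avoids multiplying many small probabilities together.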
■
Because the parameter estimates are usually more difficult to compute from the grouped data than from the full sample, they are typically computed from the full sample. When these “full” estimators are used in the chi-squared statistic, the distribution of the statistic is altered and a level $\alpha$ test is no longer specified by the critical value $\chi^2_{\alpha,k-1-m}$.
THEOREM
Let $\hat{\theta}_1, \ldots, \hat{\theta}_m$ be the maximum likelihood estimators of $\theta_1, \ldots, \theta_m$ based on the full sample $X_1, \ldots, X_n$, and let $\chi^2$ denote the statistic based on these estimators. Then the critical value $c_\alpha$ that specifies a level $\alpha$ upper-tailed test satisfies
$$\chi^2_{\alpha,k-1-m} \le c_\alpha \le \chi^2_{\alpha,k-1} \qquad (14.7)$$
The test procedure implied by this theorem is the following:
If $\chi^2 \ge \chi^2_{\alpha,k-1}$, reject $H_0$.
If $\chi^2 \le \chi^2_{\alpha,k-1-m}$, do not reject $H_0$.
If $\chi^2_{\alpha,k-1-m} < \chi^2 < \chi^2_{\alpha,k-1}$, withhold judgment.
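The three-way rule can be written as a small Python function. This is only a sketch: the function name and argument names are hypothetical, and the two critical values are assumed to be supplied by the caller, for instance from a chi-squared table.

```python
def gof_decision(chi2, crit_lower, crit_upper):
    """
    Three-way decision when parameters are estimated from the full sample.

    crit_lower: chi-squared critical value with k - 1 - m degrees of freedom
    crit_upper: chi-squared critical value with k - 1 degrees of freedom
    """
    if chi2 >= crit_upper:
        return "reject H0"
    if chi2 <= crit_lower:
        return "do not reject H0"
    # The true critical value lies somewhere between the two bounds
    return "withhold judgment"

# Illustration with k = 5 cells and m = 1 estimated parameter at level .05:
# the tabled critical values are 7.815 (3 df) and 9.488 (4 df).
# The statistic values passed in below are made up for the illustration.
for stat in (5.0, 8.5, 10.2):
    print(stat, gof_decision(stat, 7.815, 9.488))
```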
CHAPTER 14 Goodness-of-Fit Tests and Categorical Data Analysis
Example 14.9 (Example 14.8 continued) Using $\hat{\mu} = 2.10$, the estimated expected cell counts are computed from $n\pi_i(\hat{\mu})$, where $n = 48$. For example,
$$n\pi_1(\hat{\mu}) = 48 \cdot \frac{e^{-2.1}(2.1)^0}{0!} = (48)(e^{-2.1}) = 5.88$$
Similarly, $n\pi_2(\hat{\mu}) = 12.34$, $n\pi_3(\hat{\mu}) = 12.96$, $n\pi_4(\hat{\mu}) = 9.07$, and $n\pi_5(\hat{\mu}) = 48 - 5.88 - \cdots - 9.07 = 7.75$. Then
$$\chi^2 = \frac{(9 - 5.88)^2}{5.88} + \cdots + \frac{(6 - 7.75)^2}{7.75} = 6.31$$
Since $m = 1$ and $k = 5$, at level .05 we need $\chi^2_{.05,3} = 7.815$ and $\chi^2_{.05,4} = 9.488$. Because $6.31 \le 7.815$, we do not reject $H_0$; at the 5% level, the Poisson distribution provides a reasonable fit to the data. Notice that $\chi^2_{.10,3} = 6.251$ and $\chi^2_{.10,4} = 7.779$, so at level .10 we would have to withhold judgment on whether the Poisson distribution was appropriate.
■
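The calculations in Example 14.9 can be reproduced with the following Python sketch. It uses the rounded estimate 2.10 as in the text and takes the critical values as the tabled numbers quoted in the example; the variable names are illustrative.

```python
import math

observed = [9, 9, 10, 14, 6]   # cells: 0, 1, 2, 3, >= 4 plants
n = sum(observed)              # 48 quadrats
mu_hat = 2.10                  # full-sample estimate, rounded as in the text

# Estimated expected counts for cells 0..3, then the pooled cell by subtraction
expected = [n * math.exp(-mu_hat) * mu_hat**i / math.factorial(i) for i in range(4)]
expected.append(n - sum(expected))

# Chi-squared statistic: sum of (observed - expected)^2 / expected over the cells
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Tabled critical values quoted in the example (m = 1, k = 5)
lower_05, upper_05 = 7.815, 9.488   # 3 df and 4 df at level .05
lower_10, upper_10 = 6.251, 7.779   # 3 df and 4 df at level .10

print("expected:", [round(e, 2) for e in expected])
print("chi-squared:", round(chi2, 2))
print("level .05: do not reject" if chi2 <= lower_05 else "level .05: see bounds")
print("level .10: withhold judgment" if lower_10 < chi2 < upper_10 else "level .10: see bounds")
```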
Sometimes even the maximum likelihood estimates based on the full sample are quite difficult to compute. This is the case, for example, for the two-parameter (generalized) negative binomial distribution. In such situations, method-of-moments
estimates are often used and the resulting $\chi^2$ is compared to $\chi^2_{\alpha,k-1-m}$, though it is not known to what extent the use of moments estimators affects the true critical value.