Two-Way Contingency Tables
14.3 Two-Way Contingency Tables
In the scenarios of Sections 14.1 and 14.2, the observed frequencies were displayed in a single row within a rectangular table. We now study problems in which the data again consists of counts or frequencies, but the data table now has $I$ rows ($I \ge 2$) and $J$ columns, and hence $IJ$ cells. There are two commonly encountered situations in which such data arises:
1. There are $I$ populations of interest, each corresponding to a different row of the table, and each population is divided into the same $J$ categories. A sample is taken from the $i$th population ($i = 1, \ldots, I$), and the counts are entered in the cells in the $i$th row of the table. For example, customers of each of $I = 3$ department-store chains might have available the same $J = 5$ payment categories: cash, check, and credit cards from American Express, Visa, and MasterCard.
2. There is a single population of interest, with each individual in the population categorized with respect to two different factors. There are $I$ categories associated with the first factor and $J$ categories associated with the second factor. A single sample is taken, and the number of individuals belonging in both category $i$ of factor 1 and category $j$ of factor 2 is entered in the cell in row $i$, column $j$ ($i = 1, \ldots, I$; $j = 1, \ldots, J$). As an example, customers making a purchase might be classified according to both the department in which the purchase was made, with $I = 6$ departments, and the method of payment, with $J = 5$ as in (1) above.
Let $n_{ij}$ denote the number of individuals in the sample(s) falling in the $(i, j)$th cell (row $i$, column $j$) of the table; that is, $n_{ij}$ is the $(i, j)$th cell count. The table displaying the $n_{ij}$'s is called a two-way contingency table; a prototype is shown in Table 14.9.
Table 14.9 A Two-Way Contingency Table
In situations of type 1, we want to investigate whether the proportions in the different categories are the same for all populations. The null hypothesis states that the populations are homogeneous with respect to these categories. In type 2 situations, we investigate whether the categories of the two factors occur independently of one another in the population.
Testing for Homogeneity
Suppose each individual in every one of the $I$ populations belongs in exactly one of the same $J$ categories. A sample of $n_i$ individuals is taken from the $i$th population; let $n = \sum n_i$ and
$n_{ij}$ = the number of individuals in the $i$th sample who fall into category $j$
$n_{\cdot j} = \sum_{i=1}^{I} n_{ij}$ = the total number of individuals among the $n$ sampled who fall into category $j$
The $n_{ij}$'s are recorded in a two-way contingency table with $I$ rows and $J$ columns. The sum of the $n_{ij}$'s in the $i$th row is $n_i$, and the sum of entries in the $j$th column will be denoted by $n_{\cdot j}$. Let
$p_{ij}$ = the proportion of the individuals in population $i$ who fall into category $j$
Thus, for population 1, the $J$ proportions are $p_{11}, p_{12}, \ldots, p_{1J}$ (which sum to 1), and similarly for the other populations. The null hypothesis of homogeneity states that the proportion of individuals in category $j$ is the same for each population and that this is true for every category; that is, for every $j$, $p_{1j} = p_{2j} = \cdots = p_{Ij}$.
When $H_0$ is true, we can use $p_1, p_2, \ldots, p_J$ to denote the population proportions in the $J$ different categories; these proportions are common to all $I$ populations. The expected number of individuals in the $i$th sample who fall in the $j$th category when $H_0$ is true is then $E(N_{ij}) = n_i \cdot p_j$. To estimate $E(N_{ij})$, we must first estimate $p_j$, the proportion in category $j$. Among the total sample of $n$ individuals, $N_{\cdot j}$ fall into category $j$, so we use $\hat{p}_j = N_{\cdot j}/n$ as the estimator (this can be shown to be the maximum likelihood estimator of $p_j$). Substitution of the estimate $\hat{p}_j$ for $p_j$ in $n_i p_j$ yields a simple formula for estimated expected counts under $H_0$:
$$\hat{e}_{ij} = \text{estimated expected count in cell } (i, j) = n_i \cdot \frac{n_{\cdot j}}{n} = \frac{(i\text{th row total})(j\text{th column total})}{n}$$
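The formula for estimated expected counts can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` and the 2 x 3 table of counts are hypothetical and do not come from the text.

```python
def expected_counts(table):
    """Estimated expected counts: (ith row total)(jth column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts (two populations, three categories).
observed = [[10, 20, 30],
            [20, 20, 20]]

expected = expected_counts(observed)
for row in expected:
    print(row)
```

Because both hypothetical samples have the same size (60), the two rows of estimated expected counts coincide, which is what homogeneity would predict.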
The test statistic here has the same form as in Sections 14.1 and 14.2. The number of degrees of freedom comes from the general rule of thumb. In each row of Table 14.9 there are $J - 1$ freely determined cell counts (each sample size $n_i$ is fixed), so there are a total of $I(J - 1)$ freely determined cells. Parameters $p_1, \ldots, p_J$ are estimated, but because $\sum_j p_j = 1$, only $J - 1$ of these are independent. Thus df $= I(J - 1) - (J - 1) = (J - 1)(I - 1)$.
Null hypothesis: $H_0$: $p_{1j} = p_{2j} = \cdots = p_{Ij}$, $j = 1, 2, \ldots, J$
Alternative hypothesis: $H_a$: $H_0$ is not true
Test statistic value:
$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}}$$
When $H_0$ is true and $\hat{e}_{ij} \ge 5$ for all $i, j$, the test statistic has approximately a chi-squared distribution with $(I - 1)(J - 1)$ df. The test is again upper-tailed, so the P-value is the area under the $\chi^2_{(I-1)(J-1)}$ curve to the right of the calculated $\chi^2$. Table A.11 can be used to obtain P-value information as described in Section 14.1.
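As a rough sketch of the procedure in the box, the following Python computes the test statistic, its degrees of freedom, and the smallest estimated expected count (to check the "at least 5" condition). The function name and the table of counts are hypothetical, chosen only for illustration.

```python
def chi_squared_test_quantities(table):
    """Return (chi-squared statistic, df, smallest estimated expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    smallest_expected = float("inf")
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n  # estimated expected count
            statistic += (n_ij - e_ij) ** 2 / e_ij
            smallest_expected = min(smallest_expected, e_ij)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, smallest_expected


# Hypothetical observed counts (I = 2 rows, J = 3 columns).
observed = [[10, 20, 30],
            [20, 20, 20]]

statistic, df, smallest_expected = chi_squared_test_quantities(observed)
print(round(statistic, 3), df, smallest_expected)
```

The P-value would then be the upper-tail area of the chi-squared curve with `df` degrees of freedom, obtained from a table such as Table A.11 or from statistical software.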
Example 14.13
A company packages a particular product in cans of three different sizes, each one using a different production line. Most cans conform to specifications, but a quality control engineer has identified the following reasons for nonconformance:
1. Blemish on can
2. Crack in can
3. Improper pull tab location
4. Pull tab missing
5. Other
642 Chapter 14 Goodness-of-Fit tests and Categorical Data analysis
A sample of nonconforming units is selected from each of the three lines, and each unit is categorized according to reason for nonconformity, resulting in the following contingency table data:
[Table: rows are Production Line; columns are Reason for Nonconformity (Blemish, Crack, Location, Missing, Other) and Sample Size]
Does the data suggest that the proportions falling in the various nonconformance categories are not the same for the three lines? The parameters of interest are the various proportions, and the relevant hypotheses are
$H_0$: the production lines are homogeneous with respect to the five nonconformance categories; that is, $p_{1j} = p_{2j} = p_{3j}$ for $j = 1, \ldots, 5$
$H_a$: the production lines are not homogeneous with respect to the categories
The estimated expected frequencies (assuming homogeneity) must now be calculated. Consider the first nonconformance category for the first production line. When the lines are homogeneous,
estimated expected number among the 150 selected units that are blemished $= \dfrac{(\text{first row total})(\text{first column total})}{\text{total of sample sizes}} = 33.20$
The contribution of the cell in the upper-left corner to $\chi^2$ is then
$$\frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \frac{(31 - 33.20)^2}{33.20} \approx .146$$
The other contributions are calculated in a similar manner. Figure 14.5 shows Minitab output for the chi-squared test. The observed count is the top number in each cell, and directly below it is the estimated expected count. The contribution of each cell to $\chi^2$ appears below the counts, and the test statistic value is $\chi^2 = 21.403$.
Figure 14.5 Minitab output for the chi-squared test of Example 14.13 (columns: Blemish, Crack, Location, Missing, Other, Total; expected counts are printed below observed counts, and chi-square contributions below expected counts). Chi-Sq = 21.403, DF = 8, P-Value = 0.006
All estimated expected counts are at least 5, so combining categories is unnecessary. The test is based on $(3 - 1)(5 - 1) = 8$ df. Appendix Table A.11 shows that the area under the 8 df chi-squared curve to the right of 20.09 is .010 and the area to the right of 21.95 is .005. Therefore we can say that .005 < P-value < .01; Minitab gives P-value = .006. Using a significance level of .01, the null hypothesis of homogeneity can be rejected in favor of the alternative that the distribution of reason for nonconformity is somehow different for the three production lines.
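The tabled areas and the Minitab P-value quoted above can be checked numerically. The sketch below relies on a standard closed form for the upper-tail area of a chi-squared distribution with an even number of degrees of freedom, $P(\chi^2_{2k} > x) = e^{-x/2} \sum_{j=0}^{k-1} (x/2)^j / j!$; this formula is not stated in the text and is used here only as an illustration, and the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail area of a chi-squared curve; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Test statistic and df from Example 14.13.
p_value = chi2_upper_tail_even_df(21.403, 8)
print(round(p_value, 3))

# The two tabled critical values that bracket the statistic.
print(round(chi2_upper_tail_even_df(20.09, 8), 3))
print(round(chi2_upper_tail_even_df(21.95, 8), 3))
```

For odd df this closed form does not apply, and a table or statistical software would be used instead.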
At this point it is desirable to seek an explanation for why the hypothesis of homogeneity is implausible. Figure 14.6 shows a stacked comparative bar chart of the data. It appears that the three lines are relatively homogeneous with respect to the Other and Missing categories but not with respect to the Location, Crack, and Blemish categories. Line 1's incidence rate of crack nonconformities is much higher than for the other two lines, whereas location nonconformities appear to be more of a problem for line 2 than for the other two lines, and blemish nonconformities occur much more frequently for line 3 than for the other two lines.
Figure 14.6 Stacked comparative bar chart for the data of Example 14.13 (bars stacked by reason: Other, Missing, Location, Crack, Blemish)
Testing for Independence (Lack of Association)
We focus now on the relationship between two different factors in a single population. Each individual in the population is assumed to belong in exactly one of the $I$ categories associated with the first factor and exactly one of the $J$ categories associated with the second factor. For example, the population of interest might consist of all individuals who regularly watch the national news on television, with the first factor being preferred network (ABC, CBS, NBC, or PBS, so $I = 4$) and the second factor political philosophy (liberal, moderate, or conservative, giving $J = 3$).
For a sample of $n$ individuals taken from the population, let $n_{ij}$ denote the number among the $n$ who fall both in category $i$ of the first factor and category $j$ of the second factor. The $n_{ij}$'s can be displayed in a two-way contingency table with $I$ rows and $J$ columns. In the case of homogeneity for $I$ populations, the row totals were fixed in advance, and only the $J$ column totals were random. Now only the total sample size is fixed, and both the $n_{i\cdot}$'s and $n_{\cdot j}$'s are observed values of random variables. To state the hypotheses of interest, let
$p_{ij}$ = the proportion of individuals in the population who belong in category $i$ of factor 1 and category $j$ of factor 2
= $P$(a randomly selected individual falls in both category $i$ of factor 1 and category $j$ of factor 2)
Then
$p_{i\cdot} = \sum_j p_{ij} = P$(a randomly selected individual falls in category $i$ of factor 1)
$p_{\cdot j} = \sum_i p_{ij} = P$(a randomly selected individual falls in category $j$ of factor 2)
Recall that two events, $A$ and $B$, are independent if $P(A \cap B) = P(A) \cdot P(B)$. The null hypothesis here says that an individual's category with respect to factor 1 is independent of the category with respect to factor 2. In symbols, this becomes $p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$ for every pair $(i, j)$.
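The expected count in cell $(i, j)$ is $n \cdot p_{ij}$, so when the null hypothesis is true, $E(N_{ij}) = n \cdot p_{i\cdot} \cdot p_{\cdot j}$. To obtain a chi-squared statistic, we must therefore estimate the $p_{i\cdot}$'s ($i = 1, \ldots, I$) and $p_{\cdot j}$'s ($j = 1, \ldots, J$). The (maximum likelihood) estimates are
$\hat{p}_{i\cdot} = \dfrac{n_{i\cdot}}{n}$ = sample proportion for category $i$ of factor 1
and
$\hat{p}_{\cdot j} = \dfrac{n_{\cdot j}}{n}$ = sample proportion for category $j$ of factor 2
This gives estimated expected cell counts identical to those in the case of homogeneity:
$$\hat{e}_{ij} = n \cdot \hat{p}_{i\cdot} \cdot \hat{p}_{\cdot j} = n \cdot \frac{n_{i\cdot}}{n} \cdot \frac{n_{\cdot j}}{n} = \frac{n_{i\cdot}\, n_{\cdot j}}{n} = \frac{(i\text{th row total})(j\text{th column total})}{n}$$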
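The claim that the independence estimates reproduce the homogeneity formula can be checked numerically. In this minimal sketch the table of counts and all names are hypothetical; the code estimates the marginal proportions, forms $n \hat{p}_{i\cdot} \hat{p}_{\cdot j}$, and compares the result with (row total)(column total)/$n$.

```python
# Hypothetical observed counts for a single sample of n = 200 individuals.
observed = [[20, 30, 50],
            [30, 30, 40]]

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Sample proportions for the categories of factor 1 and factor 2.
p_row_hat = [r / n for r in row_totals]
p_col_hat = [c / n for c in col_totals]

# Estimated expected counts computed two ways.
via_proportions = [[n * pi * pj for pj in p_col_hat] for pi in p_row_hat]
via_totals = [[r * c / n for c in col_totals] for r in row_totals]

for row_a, row_b in zip(via_proportions, via_totals):
    print([round(a, 6) for a in row_a], row_b)
```

The two computations agree up to floating-point rounding, so in practice only the row and column totals are needed.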
The test statistic is also identical to that used in testing for homogeneity, as is the number of degrees of freedom. This is because the number of freely determined cell counts is $IJ - 1$, since only the total $n$ is fixed in advance. There are $I$ estimated $p_{i\cdot}$'s, but only $I - 1$ are independently estimated since $\sum_i p_{i\cdot} = 1$; similarly, only $J - 1$ of the $p_{\cdot j}$'s are independently estimated, so $I + J - 2$ parameters are independently estimated. The rule of thumb now yields df $= IJ - 1 - (I + J - 2) = IJ - I - J + 1 = (I - 1)(J - 1)$.
Null hypothesis: $H_0$: $p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$, $i = 1, \ldots, I$; $j = 1, \ldots, J$
Alternative hypothesis: $H_a$: $H_0$ is not true
Test statistic value:
$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}}$$
When $H_0$ is true and $\hat{e}_{ij} \ge 5$ for all $i, j$, the test statistic has approximately a chi-squared distribution with $(I - 1)(J - 1)$ df. The test is again upper-tailed, so the P-value is the area under the $\chi^2_{(I-1)(J-1)}$ curve to the right of the calculated $\chi^2$. Table A.11 can be used to obtain P-value information as described in Section 14.1.
Example 14.14
The accompanying two-way table from Minitab (Table 14.10) gives a cross-classification in which the row factor is level of paternal education (completed university, partial university, secondary, partial secondary) and the column factor represents the quartile of neonatal (i.e., newborn) weight gain, NWG (Q1 = lowest 25%, Q2 = next lowest 25%, Q3, Q4); the data appeared in the article “Impact of Neonatal Growth on IQ and Behavior at Early School Age” (Pediatrics, July 2013, e53–60). Does it appear that educational level is independent of NWG in the sampled population?
Table 14.10 Observed and Estimated Expected Counts for Example 14.14 (Minitab output with columns Q1, Q2, Q3, Q4, Total; expected counts are printed below observed counts, and chi-square contributions below expected counts)
The contribution to $\chi^2$ from the cell in the upper-left corner is $(422 - 411.63)^2/411.63 = .261$. The 15 other contributions are calculated in the same way. Then $\chi^2 = .261 + \cdots + .253 = 19.016$. When $H_0$ is true, the test statistic has approximately a chi-squared distribution with $(4 - 1)(4 - 1) = 9$ df. The expected value of a chi-squared rv is just its number of degrees of freedom, so $E(\chi^2) = 9$ under the assumption of independence. Clearly the test statistic value exceeds what would be expected if the two factors were independent, but is it by enough to suggest implausibility of this null hypothesis? Table A.11 shows that .025 is the area to the right of 19.02 under the chi-squared curve with 9 df. Thus the P-value for the test is roughly .025 (which is the value calculated by Minitab; the cited article reported .03). At significance level .05, the null hypothesis of independence would be rejected since P-value $\approx .025 \le .05 = \alpha$. However, this conclusion would not be justified at a significance level of .01. The P-value is such that people might argue over what conclusion is appropriate.
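The arithmetic in this example can be verified with a short Python sketch. The upper-left contribution uses the numbers quoted above. The P-value uses a standard closed form for odd degrees of freedom, $P(\chi^2_\nu > x) = 2P(Z > \sqrt{x}) + 2\phi(\sqrt{x}) \sum_{r=1}^{(\nu-1)/2} x^{r - 1/2} / (1 \cdot 3 \cdots (2r - 1))$, where $Z$ is standard normal and $\phi$ is its density; that formula is not given in the text and is assumed here for illustration, as is the function name.

```python
import math


def chi2_upper_tail_odd_df(x, df):
    """Upper-tail area of a chi-squared curve; valid only for odd df."""
    if df <= 0 or df % 2 != 1:
        raise ValueError("this closed form requires a positive odd df")
    root = math.sqrt(x)
    normal_tail = 0.5 * math.erfc(root / math.sqrt(2.0))       # P(Z > sqrt(x))
    normal_pdf = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # phi(sqrt(x))
    term = root   # x^(r - 1/2) / (1 * 3 * ... * (2r - 1)) for r = 1
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return 2.0 * normal_tail + 2.0 * normal_pdf * total


# Contribution of the upper-left cell, using the counts quoted in the example.
contribution = (422 - 411.63) ** 2 / 411.63
print(round(contribution, 3))

# P-value for the reported statistic with (4 - 1)(4 - 1) = 9 df.
p_value = chi2_upper_tail_odd_df(19.016, 9)
print(round(p_value, 3))
```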
Someone persuaded by our analysis to reject the assertion of independence would want to look more closely at the data to seek an explanation for that conclusion. Perhaps, for example, those in a higher quartile tend to have higher educational levels. Figure 14.7 shows histograms (bar graphs) of the percentages in the various educational level categories for each of the four different quartiles. The four histograms appear to be very similar; the visual impression is that the distribution over the four educational levels does not depend much on the NWG quartile. This seemingly contradicts the finding of statistical significance. Now note that the sample size here is extremely large, and this inflates the value of the chi-squared statistic. With the same percentages as in Figure 14.7 but a much more moderate sample size, the value of $\chi^2$ would be much smaller and the P-value much larger. Our test result achieved statistical significance, but there does not seem to be any practical significance.
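The effect of sample size on the statistic can be illustrated with a small sketch. If every cell count is multiplied by a constant $c$ (so all percentages stay the same), each estimated expected count is also multiplied by $c$, and therefore $\chi^2$ is multiplied by $c$. The counts below are hypothetical and are not the data of Example 14.14.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic with estimated expected counts (row)(column)/n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (n_ij - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, n_ij in enumerate(row)
    )


# Hypothetical 2 x 2 table with nearly equal row percentages (52% vs. 48%).
small_sample = [[52, 48],
                [48, 52]]
# Same percentages, 100 times as many individuals.
large_sample = [[100 * count for count in row] for row in small_sample]

stat_small = chi_squared_statistic(small_sample)
stat_large = chi_squared_statistic(large_sample)
print(round(stat_small, 2), round(stat_large, 2))
```

With 1 df, the smaller statistic is far from significant while the larger one is overwhelming, even though the two tables describe exactly the same percentages.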
Figure 14.7 Histograms based on the data of Example 14.14
Models and methods for analyzing data in which each individual is categorized with respect to three or more factors (multidimensional contingency tables) are discussed in several of the chapter references.
EXERCISES Section 14.3 (24–36)
24. The accompanying two-way table was constructed using data in the article “Television Viewing and Physical Fitness in Adults” (Research Quarterly for Exercise and Sport, 1990: 315–320). The author hoped to determine whether time spent watching television is associated with cardiovascular fitness. Subjects were asked about their television-viewing habits and were classified as physically fit if they scored in the excellent or very good category on a step test. We include Minitab output from a chi-squared analysis. The four TV groups corresponded to different amounts of time per day spent watching TV (0, 1–2, 3–4, or 5 or more hours). The 168 individuals represented in the first column were those judged physically fit. Expected counts appear below observed counts, and Minitab displays the contribution to $\chi^2$ from each cell. State and test the appropriate hypotheses using $\alpha = .05$.
Total 168 1032 1200
ChiSq = 3.557 + 0.579 +
df = 3
25. In an investigation of alcohol use among college students, each male student in a sample was categorized both according to age group and according to the number of heavy drinking episodes during the previous 30 days (“Alcohol Use in Students Seeking Primary Care Treatment at University Health Services,” J. of Amer. College Health, 2012: 217–225).
Age Group
18–20 21–23 24
Does there appear to be an association between extent of binge drinking and age group in the population from which the sample was selected? Carry out a test of hypotheses at significance level .01.
26. Contamination of various food products is an ongoing problem all over the world. The article “Prevalence and Quantitative Detection of Salmonella in Retail Raw Chicken in Shaanxi, China” (J. of Food Production, 2013) reported the following data on the occurrence of salmonella in chicken of three different types: (1) supermarket chilled, (2) supermarket frozen, and (3) wet market fresh slaughtered.
Type   Sample Size   Salmonella-Positive Samples
1. 60 27
2. 60 32
45
Does it appear that the incidence rate of salmonella occurrence depends on the type of chicken? State and test the appropriate hypotheses using a significance level of .05.
27. The article “Human Lateralization from Head to Foot: Sex-Related Factors” (Science, 1978: 1291–1292) reports for both a sample of right-handed men and a sample of right-handed women the number of individuals whose feet were the same size, had a bigger left than right foot (a difference of half a shoe size or more), or had a bigger right than left foot.
Sample
Men 2 10 28 40
Women 55 18 14 87
Does the data indicate that gender has a strong effect on the development of foot asymmetry? State and test the appropriate hypotheses.
28. A random sample of 175 Cal Poly State University students was selected, and both the email service provider and cell phone provider were determined for each one, resulting in the accompanying data. State and test the appropriate hypotheses.
Cell Phone Provider
Email Provider   ATT   Verizon   Other
gmail   28   17   7
Yahoo   31   26   10
Other   26   19   11
29. The accompanying data on degree of spirituality for samples of natural and social scientists at research universities as well as for a sample of non-academics with graduate degrees appeared in the article “Conflict Between Religion and Science Among Academic Scientists” (J. for the Scientific Study of Religion, 2009: 276–292).
Degree of Spirituality
Very Moderate Slightly Not at all
N.S. 56 162
S.S. 56 223
G.D. 109
a. Is there substantial evidence for concluding that the three types of individuals are not homogeneous with respect to their degree of spirituality? State and test the appropriate hypotheses.
b. Considering just the natural scientists and social scientists, is there evidence for non-homogeneity?
30. Three different design configurations are being considered for a particular component. There are four possible failure modes for the component. An engineer obtained the following data on number of failures in each mode for each of the three configurations. Does the configuration appear to have an effect on type of failure?
[Table: rows are Configuration; columns are Failure Mode]
31. A random sample of smokers was obtained, and each individual was classified both with respect to gender and with respect to the age at which he/she first started smoking. The data in the accompanying table is consistent with summary results reported in the article “Cigarette Tar Yields in Relation to Mortality in the Cancer Prevention Study II Prospective Cohort” (British Med. J., 2004: 72–79).
Gender
Male Female
Age 16–17 24 32
a. Calculate the proportion of males in each age category, and then do the same for females. Based on these proportions, does it appear that there might be an association between gender and the age at which an individual first smokes?
b. Carry out a test of hypotheses to decide whether there is an association between the two factors.
32. Eclosion refers to the emergence of an adult insect from an egg. The following data on eclosion rates when nymphs were exposed to heat for various durations was extracted from the article “High Temperature Determines the Ups and Downs of Small Brown Planthopper Laodelphax Striatellus Population” (Insect Science, 2012: 385–392).
Duration (d): 0 1 2 3 5 10 15
Sample size: 120 41 47 44 46 42 10
Emerged: 101 38 44 40 38 35 7
Carry out a chi-squared test to decide whether it is plausible that eclosion rate does not depend on exposure duration (the cited article included summary information from the test).
33. Show that the chi-squared statistic for the test of independence can be written in the form
$$\chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{N_{ij}^2}{\hat{E}_{ij}} - n$$
Why is this formula more efficient computationally than the defining formula for $\chi^2$?
34. Suppose that each student in a sample had been categorized with respect to political views, marijuana usage, and religious preference, with the categories of this latter factor being Protestant, Catholic, and other. The data could be displayed in three different two-way tables, one corresponding to each category of the third factor. With $p_{ijk} = P$(political category $i$, marijuana category $j$, and religious category $k$), the null hypothesis of independence of all three factors states that $p_{ijk} = p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}$. Let $n_{ijk}$ denote the observed frequency in cell $(i, j, k)$. Show how to estimate the expected cell counts assuming that $H_0$ is true ($\hat{e}_{ijk} = n\hat{p}_{ijk}$, so the $\hat{p}_{ijk}$'s must be determined). Then use the general rule of thumb to determine the number of degrees of freedom for the chi-squared statistic.
35. Suppose that in a particular state consisting of four distinct regions, a random sample of $n_k$ voters is obtained from the $k$th region for $k = 1, 2, 3, 4$. Each voter is then classified according to which candidate (1, 2, or 3) he or she prefers and according to voter registration (1 = Dem., 2 = Rep., 3 = Indep.). Let $p_{ijk}$ denote the proportion of voters in region $k$ who belong in candidate category $i$ and registration category $j$. The null hypothesis of homogeneous regions is $H_0$: $p_{ij1} = p_{ij2} = p_{ij3} = p_{ij4}$ for all $i, j$ (i.e., the proportion within each candidate/registration combination is the same for all four regions). Assuming that $H_0$ is true, determine $\hat{p}_{ijk}$ and $\hat{e}_{ijk}$ as functions of the observed $n_{ijk}$'s, and use the general rule of thumb to obtain the number of degrees of freedom for the chi-squared test.
36. Consider the accompanying 2 × 3 table displaying the sample proportions that fell in the various combinations of categories (e.g., 13% of those in the sample were in the first category of both factors).
a. Suppose the sample consisted of $n = 100$ people. Use the chi-squared test for independence with significance level .10.
b. Repeat part (a), assuming that the sample size was $n = 1000$.
c. What is the smallest sample size $n$ for which these observed proportions would result in rejection of the independence hypothesis?
SUPPLEMENTARY EXERCISES (37–49)
37. The article “Birth Order and Political Success” (Psych. Reports, 1971: 1239–1242) reports that among 31 randomly selected candidates for political office who came from families with four children, 12 were firstborn, 11 were middle born, and 8 were last born. Use this data to test the null hypothesis that a political candidate from such a family is equally likely to be in any one of the four ordinal positions.
38. Does the phase of the moon have any bearing on birthrate? Each of 222,784 births that occurred during a period encompassing 24 full lunar cycles was classified according to lunar phase. The following data is consistent with summary quantities that appeared in the article “The Effect of the Lunar Cycle on Frequency of Births and Birth Complications” (Amer. J. of Obstetrics and Gynecology, 2005: 1462–1464).
Lunar Phase   Days in Phase   Births
New moon 24 7680
Waxing crescent 152
First quarter 24 7579
Waxing gibbous 149 47,814
Full moon
Waning gibbous 150 47,595
Last quarter 24 73
Waning crescent 152
State and test the appropriate hypotheses to answer the question posed at the beginning of this exercise.
39. Each individual in a sample of nursing home patients was cross-classified both with respect to cognitive state (normal or mild impairment, moderate impairment, severe impairment) and with respect to drug status (psychotropic drug change, psychotropic user without a
change, no psychotropic medication). The following Minitab output resulted from a request to perform a chi-squared analysis.
Drug change No change No med
90.06 64.11 34.83
84.40
Severe
Chi-Sq = 11.294, DF = 4, P-Value = 0.023
(“Psychotropic Drug Initiation or Increased Dosage and the Acute Risk of Falls,” BMC Geriatrics, 2013:
a. Verify the expected frequency and contribution to $\chi^2$ in the normal–drug change cell of the two-way table.
b. Does there appear to be an association between cognitive state and drug status? State and test the appropriate hypotheses using a significance level of .01. [Note: The cited article reported a P-value.]
40. The authors of the article “Predicting Professional Sports Game Outcomes from Intermediate Game Scores” (Chance, 1992: 18–22) used a chi-squared test to determine whether there was any merit to the idea that basketball games are not settled until the last quarter, whereas baseball games are over by the seventh inning. They also considered football and hockey. Data was collected for 189 basketball games, 92 baseball games, 80 hockey games, and 93 football games. The games analyzed were sampled randomly from all games played during the 1990 season for baseball and football and for the 1990–1991 season for basketball and hockey. For each game, the late-game leader was determined, and then it was noted whether the late-game leader actually ended up winning the game. The resulting data is summarized in the accompanying table.
Sport   Late-Game Leader Wins   Late-Game Leader Loses
Basketball   150   39
Baseball   86   6
Hockey   65   15
Football   72   21
The authors state that “Late-game leader is defined as the team that is ahead after three quarters in basketball and football, two periods in hockey, and seven innings in baseball. The chi-square value on three degrees of freedom is 10.52 (P < .015).”
a. State the relevant hypotheses and reach a conclusion using $\alpha = .05$.
b. Do you think that your conclusion in part (a) can be attributed to a single sport being an anomaly?
41. The accompanying two-way frequency table appears in the article “Marijuana Use in College” (Youth and Society, 1979: 323–334). Each of 445 college students was classified according to both frequency of marijuana use and parental use of alcohol and psychoactive drugs. Does the data suggest that parental usage and student usage are independent in the population from which the sample was drawn?
Standard Level of Marijuana Use
Parental ... and Drugs   Never   Occasional   Regular
Neither   141   54   40
Both   17   11   19
42. Much attention has recently focused on the incidence of concussions among athletes. Separate samples of soccer players, non-soccer athletes, and non-athletes were selected. The accompanying table then resulted from determining the number of concussions each individual reported on a medical history questionnaire (“No Evidence of Impaired Neurocognitive Performance in Collegiate Soccer Players,” Amer. J. of Sports Med., 2002: 157–162).
Number of Concussions
0   1   2   ≥3
Soccer
N-S Athletes
Non-athletes
Does the distribution of number of concussions appear to be different for the three types of individuals? Carry out a test of hypotheses.
43. In a study to investigate the extent to which individuals are aware of industrial odors in a certain region (“Annoyance and Health Reactions to Odor from Refineries and Other Industries in Carson, California,” Environmental Research, 1978: 119–132), a sample of individuals was obtained from each of three different areas near industrial facilities. Each individual was asked whether he or she noticed odors (1) every day, (2) at least once/week, (3) at least once/month, (4) less often than once/month, or (5) not at all, resulting in the data and SPSS output at the bottom of the next page. State and test the appropriate hypotheses.
44. Many shoppers have expressed unhappiness because grocery stores have stopped putting prices on individual
grocery items. The article “The Impact of Item Price Removal on Grocery Shopping Behavior” (J. of Marketing, 1980: 73–93) reports on a study in which each shopper in a sample was classified by age and by whether he or she felt the need for item pricing. Based on the accompanying data, does the need for item pricing appear to be independent of age?
Age   <30   30–39   40–49   50–59   ≥60
Number in Sample   150   141   82   63   49
Number Who Want Item Pricing   127   118   77   61   41
45. Let $p_1$ denote the proportion of successes in a particular population. The test statistic value in Chapter 8 for testing $H_0$: $p_1 = p_{10}$ was $z = (\hat{p}_1 - p_{10})/\sqrt{p_{10} p_{20}/n}$, where $p_{20} = 1 - p_{10}$. Show that for the case $k = 2$, the chi-squared test statistic value of Section 14.1 satisfies $\chi^2 = z^2$. [Hint: First show that $(n_1 - np_{10})^2 = (n_2 - np_{20})^2$.]
46. The NCAA basketball tournament begins with 64 teams that are apportioned into four regional tournaments, each involving 16 teams. The 16 teams in each region are then ranked (seeded) from 1 to 16. During the 12-year period from 1991 to 2002, the top-ranked team won its regional tournament 22 times, the second-ranked team won 10 times, the third-ranked team won 5 times, and the remaining 11 regional tournaments were won by teams ranked lower than 3. Let $P_{ij}$ denote the probability that the team ranked $i$ in its region is victorious in its game against the team ranked $j$. Once the $P_{ij}$'s are available, it is possible to compute the probability that any particular seed wins its regional tournament (a complicated calculation because the number of outcomes in the sample space is quite large). The paper “Probability Models for the NCAA Regional Basketball Tournaments” (American Statistician, 1991: 35–38) proposed several different models for the $P_{ij}$'s.
a. One model postulated $P_{ij} = .5 - \lambda(i - j)$ with $\lambda = 1/32$ (from which $P_{16,1} = \lambda$, $P_{16,2} = 2\lambda$, etc.). Based on this, P(seed 1 wins) = .27477, P(seed 2 wins) = .20834, and P(seed 3 wins) = .15429. Does this model appear to provide a good fit to the data?
b. A more sophisticated model has game probabilities $P_{ij} = .5 + .2813625(z_i - z_j)$, where the $z$'s are measures of relative strengths related to standard normal percentiles [percentiles for successive highly seeded teams are closer together than is the case for teams seeded lower, and .2813625 ensures that the range of probabilities is the same as for the model in part (a)]. The resulting probabilities of seeds 1, 2, or 3 winning their regional tournaments are .45883, .18813, and .11032, respectively. Assess the fit of this model.
SPSS output for Exercise 43
Crosstabulation: AREA BY CATEGORY (each cell lists Count, Exp Val, Row Pct, Col Pct)
CATEGORY: 1.00 2.00 3.00 4.00 5.00 Row Total
Column Total: 38 74 54 48 77 291
Min E.F.
Cells with E.F. < 5: None
47. Have you ever wondered whether soccer players suffer adverse effects from hitting “headers”? The authors of the article “No Evidence of Impaired Neurocognitive Performance in Collegiate Soccer Players” (Amer. J. of Sports Med., 2002: 157–162) investigated this issue from several perspectives.
a. The paper reported that 45 of the 91 soccer players in their sample had suffered at least one concussion, 28 of 96 nonsoccer athletes had suffered at least one concussion, and only 8 of 53 student controls had suffered at least one concussion. Analyze this data and draw appropriate conclusions.
b. For the soccer players, the sample correlation coefficient calculated from the values of $x$ = soccer exposure (total number of competitive seasons played prior to enrollment in the study) and $y$ = score on an immediate memory recall test was $r = -.220$. Interpret this result.
c. Here is summary information on scores on a controlled oral word-association test for the soccer and nonsoccer athletes:
$n_1 = 26$, $\bar{x}_1 = 37.50$, $s_1 = 9.13$
$n_2 = 56$, $\bar{x}_2 = 39.63$, $s_2 = 10.19$
Analyze this data and draw appropriate conclusions.
a. Let $p_0$ denote the long run proportion of digits in the expansion that equal 0, and define $p_1, \ldots, p_9$ analogously. What hypotheses about these proportions should be tested, and what is df for the chi-squared test?
b. $H_0$ of part (a) would not be rejected for the nonrandom sequence 012…901…901…. Consider nonoverlapping groups of two digits, and let $p_{ij}$ denote the long run proportion of groups for which the first digit is $i$ and the second digit is $j$. What hypotheses about these proportions should be tested, and what is df for the chi-squared test?
c. Consider nonoverlapping groups of 5 digits. Could a chi-squared test of appropriate hypotheses about the $p_{ijklm}$'s be based on the first 100,000 digits? Explain.
d. The article “Are the Digits of $\pi$ an Independent and Identically Distributed Sequence?” (The American Statistician, 2000: 12–16) considered the first 1,254,540 digits of $\pi$, and reported the following P-values for group sizes of 1, … , 5: .572, .078, .529, .691, .298. What would you conclude?
49. The Fibonacci sequence of numbers occurs in various scientific contexts. The first two numbers in the sequence are 1, 1. Then every succeeding number is the sum of the
two previous numbers: 1, 1, 1 + 1 = 2, 1 + 2 = 3, 2 +
d. Considering the number of prior nonsoccer concus-