
Example 14.3 To see whether the time of onset of labor among expectant mothers is uniformly distributed throughout a 24-hour day, we can divide a day into $k$ periods, each of length $24/k$. The null hypothesis states that $f(x)$ is the uniform pdf on the interval [0, 24], so that $p_{i0} = 1/k$. The article “The Hour of Birth” (British J. of Preventive and Social Medicine, 1953: 43–59) reports on 1186 onset times, which were categorized into $k = 24$ 1-hour intervals beginning at midnight, resulting in cell counts of 52, 73, 89, 88, 68, 47, 58, 47, 48, 53, 47, 34, 21, 31, 40, 24, 37, 31, 47, 34, 36, 44, 78, and 59.

Each expected cell count is $1186 \cdot \frac{1}{24} = 49.42$, and the resulting value of $\chi^2$ is 162.77. Since $\chi^2_{.01,23} = 41.637$, the computed value is highly significant, and the null hypothesis is resoundingly rejected. Generally speaking, it appears that labor is much more likely to commence very late at night than during normal waking hours. ■
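The arithmetic in Example 14.3 can be checked with a short script. The following is a minimal sketch in Python using only the standard library; the helper name `chi_squared_statistic` is a hypothetical name chosen for illustration, and the critical value is copied from the example rather than computed.

```python
# Observed onset counts for the 24 one-hour intervals beginning at midnight
# (taken from Example 14.3).
observed = [52, 73, 89, 88, 68, 47, 58, 47, 48, 53, 47, 34,
            21, 31, 40, 24, 37, 31, 47, 34, 36, 44, 78, 59]


def chi_squared_statistic(counts, null_probs):
    """Sum over cells of (observed - expected)^2 / expected (hypothetical helper)."""
    n = sum(counts)
    return sum((obs - n * p) ** 2 / (n * p) for obs, p in zip(counts, null_probs))


k = len(observed)            # number of cells, 24
n = sum(observed)            # sample size, 1186
null_probs = [1 / k] * k     # uniform null hypothesis: p_i0 = 1/k
expected = n / k             # common expected cell count

chi2 = chi_squared_statistic(observed, null_probs)

# Critical value for alpha = .01 with k - 1 = 23 df, as quoted in the example.
critical_value = 41.637
reject_h0 = chi2 >= critical_value

print(f"n = {n}, expected count per cell = {expected:.2f}")
print(f"chi-squared = {chi2:.2f}, reject H0 at level .01: {reject_h0}")
```

The statistic is compared with the tabled critical value instead of a computed P-value so that the sketch mirrors the table-based reasoning of the example.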

For testing whether a sample comes from a specific normal distribution, the fundamental parameters are $\theta_1 = \mu$ and $\theta_2 = \sigma$, and each $p_{i0}$ will be a function of these parameters.

Example 14.4 At a certain university, final exams are supposed to last 2 hours. The psychology department constructed a departmental final for an elementary course that was believed to satisfy the following criteria: (1) actual time taken to complete the exam is normally distributed, (2) $\mu = 100$ min, and (3) exactly 90% of all students will

  14.1 Goodness-of-Fit Tests When Category Probabilities Are Completely Specified

finish within the 2-hour period. To see whether this is actually the case, 120 students were randomly selected, and their completion times recorded. It was decided that $k = 8$ intervals should be used. The criteria imply that the 90th percentile of the completion time distribution is $\mu + 1.28\sigma = 120$. Since $\mu = 100$, this implies that $\sigma = 15.63$.

The eight intervals that divide the standard normal scale into eight equally likely segments are $[0, .32)$, $[.32, .675)$, $[.675, 1.15)$, and $[1.15, \infty)$, and their four counterparts are on the other side of 0. For $\mu = 100$ and $\sigma = 15.63$, these intervals become $[100, 105)$, $[105, 110.55)$, $[110.55, 117.97)$, and $[117.97, \infty)$. Thus $p_{i0} = 1/8 = .125$ $(i = 1, \ldots, 8)$, so each expected cell count is $np_{i0} = 120(.125) = 15$. The observed cell counts were 21, 17, 12, 16, 10, 15, 19, and 10, resulting in a $\chi^2$ of 7.73. Since $\chi^2_{.10,7} = 12.017$ and 7.73 is not $\geq 12.017$, there is no evidence for concluding that the criteria have not been met.

  ■
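The steps of Example 14.4 (solving for $\sigma$, locating eight equiprobable intervals, and computing the statistic) can be sketched in Python with the standard library's `statistics.NormalDist`. This is an illustration under stated assumptions, not the text's own procedure: the cut points here come from exact normal quantiles, so they differ slightly from the rounded table values (.32, .675, 1.15) used in the example, and the critical value is copied from the example.

```python
from statistics import NormalDist

mu = 100.0                     # criterion (2): mean completion time in minutes
z_90 = 1.28                    # table value for the 90th percentile, as in the example
sigma = (120.0 - mu) / z_90    # solve mu + 1.28 * sigma = 120

k = 8                          # number of equiprobable class intervals
completion_time = NormalDist(mu=mu, sigma=sigma)

# The seven interior cut points are the i/8 quantiles, i = 1, ..., 7.
cut_points = [completion_time.inv_cdf(i / k) for i in range(1, k)]

# Observed cell counts from the example, in order of increasing completion time.
observed = [21, 17, 12, 16, 10, 15, 19, 10]
n = sum(observed)              # 120 students
expected = n / k               # n * p_i0 = 120 * .125

chi2 = sum((obs - expected) ** 2 / expected for obs in observed)

# Critical value for alpha = .10 with k - 1 = 7 df, as quoted in the example.
critical_value = 12.017
reject_h0 = chi2 >= critical_value

print(f"sigma = {sigma:.3f}")
print("cut points:", [round(c, 2) for c in cut_points])
print(f"chi-squared = {chi2:.2f}, reject H0 at level .10: {reject_h0}")
```

Because all eight cells have the same null probability, the expected count is the same for every cell, and the interval boundaries affect only how the data are tallied, not the expected counts.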

  EXERCISES Section 14.1 (1–11)

1. What conclusion would be appropriate for an upper-tailed chi-squared test in each of the following situations?
a. $\alpha = .05$, df $= 4$, $\chi^2 = 12.25$
b. $\alpha = .01$, df $= 3$, $\chi^2 = 8.54$
c. $\alpha = .10$, df $= 2$, $\chi^2 = 4.36$
d. $\alpha = .01$, $k = 6$, $\chi^2 = 10.20$

2. Say as much as you can about the P-value for an upper-tailed chi-squared test in each of the following situations:
a. $\chi^2 = 7.5$, df $= 2$
b. $\chi^2 = 13.0$, df $= 6$
c. $\chi^2 = 18.0$, df $= 9$
d. $\chi^2 = 21.3$, df $= 5$
e. $\chi^2 = 5.0$, $k = 4$

3. The article “Racial Stereotypes in Children’s Television Commercials” (J. of Adver. Res., 2008: 80–93) reported the following frequencies with which ethnic characters appeared in recorded commercials that aired on Philadelphia television stations.

Ethnicity: African American, Asian, Caucasian, Hispanic
Frequency: …, …, …, 6

The 2000 census proportions for these four ethnic groups are .177, .032, .734, and .057, respectively. Does the data suggest that the proportions in commercials are different from the census proportions? Carry out a test of appropriate hypotheses using a significance level of .01, and also say as much as you can about the P-value.

4. It is hypothesized that when homing pigeons are disoriented in a certain manner, they will exhibit no preference for any direction of flight after takeoff (so that the direction $X$ should be uniformly distributed on the interval from 0° to 360°). To test this, 120 pigeons are disoriented, let loose, and the direction of flight of each is recorded; the resulting data follows. Use the chi-squared test at level .10 to see whether the data supports the hypothesis.

[Table: Direction / Frequency]

5. An information-retrieval system has ten storage locations. Information has been stored with the expectation that the long-run proportion of requests for location $i$ is given by $p_i = (5.5 - |i - 5.5|)/30$. A sample of 200 retrieval requests gave the following frequencies for locations 1–10, respectively: 4, 15, 23, 25, 38, 31, 32, 14, 10, and 8. Use a chi-squared test at significance level .10 to decide whether the data is consistent with the a priori proportions (use the P-value approach).

6. The article “The Gap Between Wine Expert Ratings and Consumer Preferences” (Intl. J. of Wine Business Res., 2008: 335–351) studied differences between expert and consumer ratings by considering medal ratings for wines, which could be gold (G), silver (S), or bronze (B). Three categories were then established: 1. Rating is the same [(G,G), (B,B), (S,S)]; 2. Rating differs by one medal [(G,S), (S,G), (S,B), (B,S)]; and 3. Rating differs by two medals [(G,B), (B,G)]. The observed frequencies for these three categories were 69, 102, and 45, respectively. On the hypothesis of equally likely expert ratings and consumer ratings being assigned completely by chance, each of the nine medal pairs has probability $1/9$. Carry out an appropriate chi-squared test using a significance level of .10 by first obtaining P-value information.

7. Criminologists have long debated whether there is a relationship between weather conditions and the incidence of violent

  CHAPTER 14 Goodness-of-Fit Tests and Categorical Data Analysis

crime. The author of the article “Is There a Season for Homicide?” (Criminology, 1988: 287–296) classified 1361 homicides according to season, resulting in the accompanying data. Test the null hypothesis of equal proportions using $\alpha = .01$ by using the chi-squared table to say as much as possible about the P-value.

8. The article “Psychiatric and Alcoholic Admissions Do Not Occur Disproportionately Close to Patients’ Birthdays” (Psychological Reports, 1992: 944–946) focuses on the existence of any relationship between the date of patient admission for treatment of alcoholism and the patient’s birthday. Assuming a 365-day year (i.e., excluding leap year), in the absence of any relation, a patient’s admission date is equally likely to be any one of the 365 possible days. The investigators established four different admission categories: (1) within 7 days of birthday; (2) between 8 and 30 days, inclusive, from the birthday; (3) between 31 and 90 days, inclusive, from the birthday; and (4) more than 90 days from the birthday. A sample of 200 patients gave observed frequencies of 11, 24, 69, and 96 for categories 1, 2, 3, and 4, respectively. State and test the relevant hypotheses using a significance level of .01.

9. The response time of a computer system to a request for a certain type of information is hypothesized to have an exponential distribution with parameter $\lambda = 1$ sec (so if $X$ = response time, the pdf of $X$ under $H_0$ is $f_0(x) = e^{-x}$ for $x \geq 0$).
a. If you had observed $X_1, X_2, \ldots, X_n$ and wanted to use the chi-squared test with five class intervals having equal probability under $H_0$, what would be the resulting class intervals?
b. Carry out the chi-squared test using the following data resulting from a random sample of 40 response times:

10. a. Show that another expression for the chi-squared statistic is

$$\chi^2 = \sum_{i=1}^{k} \frac{N_i^2}{np_{i0}} - n$$

Why is it more efficient to compute $\chi^2$ using this formula?
b. When the null hypothesis is $H_0\colon p_1 = p_2 = \cdots = p_k = 1/k$ (i.e., $p_{i0} = 1/k$ for all $i$), how does the formula of part (a) simplify? Use the simplified expression to calculate $\chi^2$ for the pigeon-direction data in Exercise 4.

11. a. Having obtained a random sample from a population, you wish to use a chi-squared test to decide whether the population distribution is standard normal. If you base the test on six class intervals having equal probability under $H_0$, what should be the class intervals?
b. If you wish to use a chi-squared test to test $H_0$: the population distribution is normal with $\mu = .5$, $\sigma = .002$ and the test is to be based on six equiprobable (under $H_0$) class intervals, what should be these intervals?
c. Use the chi-squared test with the intervals of part (b) to decide, based on the following 45 bolt diameters, whether bolt diameter is a normally distributed variable with $\mu = .5$ in., $\sigma = .002$ in.