Headability Jay L. Devore Probability and Statistics

deviation of 5805 and 3990, respectively. Is there strong evi- dence for concluding that true average number of cycles to break for the polyisoprene condom exceeds that for the nat- ural latex condom by more than 1000 cycles? Carry out a test using a significance level of .01. [Note: The cited paper reported P-values of t tests for comparing means of the var- ious types considered.] 77. Information about hand posture and forces generated by the fingers during manipulation of various daily objects is needed for designing high-tech hand prosthetic devices. The article “Grip Posture and Forces During Holding Cylindrical Objects with Circular Grips” Ergonomics, 1996: 1163– 1176 reported that for a sample of 11 females, the sample mean four-finger pinch strength N was 98.1 and the sample standard deviation was 14.2. For a sample of 15 males, the sample mean and sample standard deviation were 129.2 and 39.1, respectively. a. A test carried out to see whether true average strengths for the two genders were different resulted in and . Does the appropriate test procedure described in this chapter yield this value of t and the stated P-value? b. Is there substantial evidence for concluding that true average strength for males exceeds that for females by more than 25 N? State and test the relevant hypotheses. 78. The article “Pine Needles as Sensors of Atmospheric Pollution” Environ. Monitoring, 1982: 273–286 reported on the use of neutron-activity analysis to determine pollutant con- centration in pine needles. According to the article’s authors, “These observations strongly indicated that for those elements which are determined well by the analytical procedures, the distribution of concentration is lognormal. Accordingly, in tests of significance the logarithms of concentrations will be used.” The given data refers to bromine concentration in needles taken from a site near an oil-fired steam plant and from a relatively clean site. The summary values are means and standard deviations of the log-transformed observations. Sample Mean Log SD of Log Site Size Concentration Concentration Steam plant 8 18.0 4.9 Clean 9 11.0 4.6 Let be the true average log concentration at the first site, and define analogously for the second site. a. Use the pooled t test based on assuming normality and equal standard deviations to decide at significance level .05 whether the two concentration distribution means are equal. b. If and the standard deviations of the two log con- centration distributions are not equal, would m 1 and m 2 the means of the concentration distributions be the same if ? Explain your reasoning. 79. The article “The Accuracy of Stated Energy Contents of Reduced-Energy, Commercially Prepared Foods” J. of the m 1 5 m 2 s 2 s 1 m 2 m 1 P -value 5 .019 t 5 2.51 Amer. Dietetic Assoc., 2010: 116–123 presented the accom- panying data on vendor-stated gross energy and measured value both in kcal for 10 different supermarket conven- ience meals: Meal: 1 2 3 4 5 6 7 8 9 10 Stated: 180 220 190 230 200 370 250 240 80 180 Measured: 212 319 231 306 211 431 288 265 145 228 Carry out a test of hypotheses based on a P-value to decide whether the true average difference from that stated dif- fers from zero. [Note: The article stated “Although formal statistical methods do not apply to convenience samples, standard statistical tests were employed to summarize the data for exploratory purposes and to suggest directions for future studies.”] 80. Arsenic is a known carcinogen and poison. The standard laboratory procedures for measuring arsenic concentration mgL in water are expensive. Consider the accompanying summary data and Minitab output for comparing a labora- tory method to a new relatively quick and inexpensive field method from the article “Evaluation of a New Field Measurement Method for Arsenic in Drinking Water Samples,” J. of Envir. Engr., 2008: 382–388. Two-Sample T-Test and CI Sample N Mean StDev SE Mean 1 3 19.70 1.10 0.64 2 3 10.90 0.60 0.35 Estimate for difference: 8.800 95 CI for difference: 6.498, 11.102 What conclusion do you draw about the two methods, and why? Interpret the given confidence interval. [Note: One of the article’s authors indicated in private communication that they were unsure why the two methods disagreed.] 81. The accompanying data on response time appeared in the article “The Extinguishment of Fires Using Low-Flow Water Hose Streams—Part II” Fire Technology, 1991: 291–320. Good visibility .43 1.17 .37 .47 .68 .58 .50 2.75 Poor visibility 1.47 .80 1.58 1.53 4.33 4.23 3.25 3.22 The authors analyzed the data with the pooled t test. Does the use of this test appear justified? [Hint: Check for normality. The z percentiles for are , and 1.53.] 82. Acrylic bone cement is commonly used in total joint arthro- plasty as a grout that allows for the smooth transfer of loads from a metal prosthesis to bone structure. The paper “Validation of the Small-Punch Test as a Technique for Characterizing the Mechanical Properties of Acrylic Bone Cement” J. of Engr. in Med., 2006: 11–21 gave the fol- lowing data on breaking force N: .15, .49, .89 2 1.53, 2.89, 2.49, 2.15, n 5 8 T -Value 5 12.16 P-Value 5 0.001 DF 5 3 T -Test of difference 5 0 vs not 5: Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook andor eChapters. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. Temp Medium n s 22° Dry 6 170.60 39.08 37° Dry 6 325.73 34.97 22° Wet 6 366.36 34.82 37° Wet 6 306.09 41.97 Assume that all population distributions are normal. a. Estimate true average breaking force in a dry medium at 37° in a way that conveys information about reliability and precision, and interpret your estimate. b. Estimate the difference between true average breaking force in a dry medium at 37° and true average force at the same temperature in a wet medium, and do so in a way that conveys information about precision and reliability. Then interpret your estimate. c. Is there strong evidence for concluding that true average force in a dry medium at the higher temperature exceeds that at the lower temperature by more than 100 N? 83. In an experiment to compare bearing strengths of pegs inserted in two different types of mounts, a sample of 14 observations on stress limit for red oak mounts resulted in a sample mean and sample standard deviation of 8.48 MPa and .79 MPa, respec- tively, whereas a sample of 12 observations when Douglas fir mounts were used gave a mean of 9.36 and a standard deviation of 1.52 “Bearing Strength of White Oak Pegs in Red Oak and Douglas Fir Timbers,” J. of Testing and Evaluation, 1998, 109–114. Consider testing whether or not true average stress limits are identical for the two types of mounts. Compare df’s and P-values for the unpooled and pooled t tests. 84. How does energy intake compare to energy expenditure? One aspect of this issue was considered in the article “Measurement of Total Energy Expenditure by the Doubly Labelled Water Method in Professional Soccer Players” J. of Sports Sciences, 2002: 391–397, which contained the accompanying data MJday. Player 1 2 3 4 5 6 7 Expenditure 14.4 12.1 14.3 14.2 15.2 15.5 17.8 Intake 14.6 9.2 11.8 11.6 12.7 15.0 16.3 Test to see whether there is a significant difference between intake and expenditure. Does the conclusion depend on whether a significance level of .05, .01, or .001 is used? 85. An experimenter wishes to obtain a CI for the difference between true average breaking strength for cables manufac- tured by company I and by company II. Suppose breaking strength is normally distributed for both types of cable with psi and psi. a. If costs dictate that the sample size for the type I cable should be three times the sample size for the type II cable, how many observations are required if the 99 CI is to be no wider than 20 psi? s 2 5 20 s 1 5 30 x b. Suppose a total of 400 observations is to be made. How many of the observations should be made on type I cable samples if the width of the resulting interval is to be a minimum? 86. An experiment to determine the effects of temperature on the survival of insect eggs was described in the article “Development Rates and a Temperature-Dependent Model of Pales Weevil” Environ. Entomology, 1987: 956–962. At 11°C, 73 of 91 eggs survived to the next stage of develop- ment. At 30°C, 102 of 110 eggs survived. Do the results of this experiment suggest that the survival rate proportion surviving differs for the two temperatures? Calculate the P -value and use it to test the appropriate hypotheses. 87. Wait staff at restaurants have employed various strategies to increase tips. An article in the Sept. 5, 2005, New Yorker reported that “In one study a waitress received 50 more in tips when she introduced herself by name than when she didn’t.” Consider the following fictitious data on tip amount as a percentage of the bill: Introduction: No introduction: Does this data suggest that an introduction increases tips on average by more than 50? State and test the relevant hypotheses. [Hint: Consider the parameter .] 88. The paper “Quantitative Assessment of Glenohumeral Translation in Baseball Players” The Amer. J. of Sports Med., 2004: 1711–1715 considered various aspects of shoulder motion for a sample of pitchers and another sample of position players [glenohumeral refers to the articulation between the humerus ball and the glenoid socket]. The authors kindly supplied the following data on anteroposterior translation mm, a measure of the extent of anterior and posterior motion, both for the dominant arm and the nondominant arm. Pos Dom Tr Pos ND Tr Pit Dom Tr Pit ND Tr 1 30.31 32.54 27.63 24.33 2 44.86 40.95 30.57 26.36 3 22.09 23.48 32.62 30.62 4 31.26 31.11 39.79 33.74 5 28.07 28.75 28.50 29.84 6 31.93 29.32 26.70 26.71 7 34.68 34.79 30.34 26.45 8 29.10 28.87 28.69 21.49 9 25.51 27.59 31.19 20.82 10 22.49 21.01 36.00 21.75 11 28.74 30.31 31.58 28.32 12 27.89 27.92 32.55 27.22 13 28.48 27.85 29.56 28.86 14 25.60 24.95 28.64 28.58 15 20.21 21.59 28.58 27.15 16 33.77 32.48 31.99 29.46 17 32.59 32.48 27.16 21.26 18 32.60 31.61 19 29.30 27.46 mean 29.4463 29.2137 30.7112 26.6447 sd 5.4655 4.7013 3.3310 3.6679 a. Estimate the true average difference in translation between dominant and nondominant arms for pitchers in u 5 m 1 2 1.5m 2 s 2 5 6.10 y 5 14.15 n 5 50 s 1 5 7.82 x 5 22.63 m 5 50 Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook andor eChapters. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. a way that conveys information about reliability and pre- cision, and interpret the resulting estimate. b. Repeat a for position players. c. The authors asserted that “pitchers have greater difference in side-to-side anteroposterior translation of their shoulders compared with position players.” Do you agree? Explain. 89. Suppose a level .05 test of versus is to be performed, assuming and normality of both distributions, using equal sample sizes . Evaluate the probability of a type II error when and , and 10,000. Can you think of real problems in which the difference has little practical significance? Would sam- ple sizes of be desirable in such problems? 90. The following data refers to airborne bacteria count num- ber of coloniesft 3 both for carpeted hospital rooms and for uncarpeted rooms “Microbial Air Sampling in a Carpeted Hospital,” J. of Environmental Health , 1968: 405. Does there appear to be a difference in true average bacteria count between carpeted and uncarpeted rooms? Carpeted 11.8 8.2 7.1 13.0 10.8 10.1 14.6 14.0 Uncarpeted 12.1 8.3 3.8 7.2 12.0 11.1 10.1 13.7 Suppose you later learned that all carpeted rooms were in a veterans’ hospital, whereas all uncarpeted rooms were in a children’s hospital. Would you be able to assess the effect of carpeting? Comment. 91. Researchers sent 5000 resumes in response to job ads that appeared in the Boston Globe and Chicago Tribune. The resumes were identical except that 2500 of them had “white sounding” first names, such as Brett and Emily, whereas the other 2500 had “black sounding” names such as Tamika and Rasheed. The resumes of the first type elicited 250 responses and the resumes of the second type only 167 responses these numbers are very consistent with informa- tion that appeared in a Jan. 15, 2003, report by the Associated Press. Does this data strongly suggest that a resume with a “black” name is less likely to result in a response than is a resume with a “white” name? 92. McNemar’s test, developed in Exercise 54, can also be used when individuals are paired matched to yield n pairs and then one member of each pair is given treatment 1 and the other is given treatment 2. Then X 1 is the num- ber of pairs in which both treatments were successful, and similarly for X 2 , X 3 , and X 4 . The test statistic for testing equal efficacy of the two treatments is given by , which has approximately a stan- dard normal distribution when H is true. Use this to test X 2 2 X 3 1X 2 1 X 3 n 5 8 m 5 8 n 5 10,000 m 1 2 m 2 5 1 n 5 25, 100, 2500 m 1 2 m 2 5 1 m 5 n 10 s 1 5 s 2 5 H a : m 1 2 m 2 . H : m 1 2 m 2 5 whether the drug ergotamine is effective in the treatment of migraine headaches. Ergotamine S F Placebo S 44 34 F 46 30 The data is fictitious, but the conclusion agrees with that in the article “Controlled Clinical Trial of Ergotamine Tar- trate” British Med. J., 1970: 325–327. 93. The article “Evaluating Variability in Filling Operations” Food Tech., 1984: 51–55 describes two different filling operations used in a ground-beef packing plant. Both filling operations were set to fill packages with 1400 g of ground beef. In a random sample of size 30 taken from each filling operation, the resulting means and standard deviations were 1402.24 g and 10.97 g for operation 1 and 1419.63 g and 9.96 g for operation 2. a. Using a .05 significance level, is there sufficient evi- dence to indicate that the true mean weight of the pack- ages differs for the two operations? b. Does the data from operation 1 suggest that the true mean weight of packages produced by operation 1 is higher than 1400 g? Use a .05 significance level. 94. Let be a random sample from a Poisson distribu- tion with parameter m 1 , and let be a random sam- ple from another Poisson distribution with parameter m 2 . We wish to test against one of the three standard alternatives. When m and n are large, the large-sample z test of Section 9.1 can be used. However, the fact that suggests that a different denominator should be used in stan- dardizing . Develop a large-sample test procedure appropriate to this problem, and then apply it to the following data to test whether the plant densities for a particular species are equal in two different regions where each observation is the number of plants found in a randomly located square sam- pling quadrate having area 1 m 2 , so for region 1 there were 40 quadrates in which one plant was observed, etc.: Frequency 1 2 3 4 5 6 7 Region 1 28 40 28 17 8 2 1 1 Region 2 14 25 30 18 49 2 1 1 95. Referring to Exercise 94, develop a large-sample confidence interval formula for . Calculate the interval for the data given there using a confidence level of 95. m 1 2 m 2 n 5 140 m 5 125 X 2 Y V X 5 mn H : m 1 2 m 2 5 Y 1 , c , Y n X 1 , c , X m Bibliography See the bibliography at the end of Chapter 7. Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook andor eChapters. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 391 10 INTRODUCTION In studying methods for the analysis of quantitative data, we first focused on problems involving a single sample of numbers and then turned to a comparative analysis of two such different samples. In one-sample problems, the data consisted of observations on or responses from individuals or experimental objects randomly selected from a single population. In two-sample problems, either the two sam- ples were drawn from two different populations and the parameters of interest were the population means, or else two different treatments were applied to experimental units individuals or objects selected from a single population; in this latter case, the parameters of interest are referred to as true treatment means. The analysis of variance, or more briefly ANOVA, refers broadly to a col- lection of experimental situations and statistical procedures for the analysis of quantitative responses from experimental units. The simplest ANOVA problem is referred to variously as a single-factor, single-classification, or one-way ANOVA. It involves the analysis either of data sampled from more than two numerical populations distributions or of data from experiments in which more than two treatments have been used. The characteristic that differentiates the treatments or populations from one another is called the factor under study, and the different treatments or populations are referred to as the levels of the factor. Examples of such situations include the following:

1. An experiment to study the effects of five different brands of gasoline on

automobile engine operating efficiency mpg

2. An experiment to study the effects of the presence of four different sugar

solutions glucose, sucrose, fructose, and a mixture of the three on bacterial growth The Analysis of Variance Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook andor eChapters. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. Example 10.1

3. An experiment to investigate whether hardwood concentration in pulp

has an effect on tensile strength of bags made from the pulp

4. An experiment to decide whether the color density of fabric specimens

depends on the amount of dye used In 1 the factor of interest is gasoline brand, and there are five different levels of the factor. In 2 the factor is sugar, with four levels or five, if a con- trol solution containing no sugar is used. In both 1 and 2, the factor is qual- itative in nature, and the levels correspond to possible categories of the factor. In 3 and 4, the factors are concentration of hardwood and amount of dye, respectively; both these factors are quantitative in nature, so the levels identify different settings of the factor. When the factor of interest is quantitative, sta- tistical techniques from regression analysis discussed in Chapters 12 and 13 can also be used to analyze the data. This chapter focuses on single-factor ANOVA. Section 10.1 presents the F test for testing the null hypothesis that the population or treatment means are identical. Section 10.2 considers further analysis of the data when H has been rejected. Section 10.3 covers some other aspects of single-factor ANOVA. Chapter 11 introduces ANOVA experiments involving more than a single factor. 10.1 Single-Factor ANOVA Single-factor ANOVA focuses on a comparison of more than two population or treat- ment means. Let the number of populations or treatments being compared the mean of population 1 or the true average response when treatment 1 is applied the mean of population I or the true average response when treatment I is applied The relevant hypotheses are versus If , H is true only if all four m i ’s are identical. H a would be true, for example, if , if , or if all four m i ’s differ from one another. A test of these hypotheses requires that we have available a random sample from each population or treatment. The article “Compression of Single-Wall Corrugated Shipping Containers Using Fixed and Floating Test Platens” J. Testing and Evaluation, 1992: 318–320 describes an experiment in which several different types of boxes were compared m 1 5 m 3 5 m 4 2 m 2 m 1 5 m 2 2 m 3 5 m 4 I 5 4 H a : at least two the of the m i ’s are different H : m 1 5 m 2 5 c5 m I m I 5 m 1 5 I 5 Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook andor eChapters. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.