Evaluate the conditions necessary to validly test hypotheses and construct confidence intervals. Does the independence condition appear to be violated?

6.12 Refer to Exercise 6.11. There appears to be a large variation in the mean PCB content across the 13 sites. How could we reduce the effect of variation in PCB content due to site differences on the evaluation of the difference in the mean PCB content between the two years?

H.R. 6.13 A firm has a generous but rather complicated policy concerning end-of-year bonuses for its lower-level managerial personnel. The policy’s key factor is a subjective judgment of “contribution to corporate goals.” A personnel officer took samples of 24 female and 36 male managers to see whether there was any difference in bonuses, expressed as a percentage of yearly salary. The data are listed here:

Gender  Bonus Percentage
F        9.2   7.7  11.9   6.2   9.0   8.4   6.9   7.6   7.4   8.0   9.9   6.7
         8.4   9.3   9.1   8.7   9.2   9.1   8.4   9.6   7.7   9.0   9.0   8.4
M       10.4   8.9  11.7  12.0   8.7   9.4   9.8   9.0   9.2   9.7   9.1   8.8
         7.9   9.9  10.0  10.1   9.0  11.4   8.7   9.6   9.2   9.7   8.9   9.2
         9.4   9.7   8.9   9.3  10.4  11.9   9.0  12.0   9.6   9.2   9.9   9.0
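The pooled-variance t statistic and the separate-variance t′ statistic for the bonus data differ only in how the standard error of the difference in sample means is formed. The following is a minimal sketch in Python using only the standard library; the function names `pooled_t` and `separate_variance_t` are illustrative, and the degrees of freedom for t′ are assumed to come from the usual Satterthwaite approximation.

```python
from math import sqrt
from statistics import mean, variance  # variance() is the sample variance (n - 1 divisor)

female = [9.2, 7.7, 11.9, 6.2, 9.0, 8.4, 6.9, 7.6, 7.4, 8.0, 9.9, 6.7,
          8.4, 9.3, 9.1, 8.7, 9.2, 9.1, 8.4, 9.6, 7.7, 9.0, 9.0, 8.4]
male = [10.4, 8.9, 11.7, 12.0, 8.7, 9.4, 9.8, 9.0, 9.2, 9.7, 9.1, 8.8,
        7.9, 9.9, 10.0, 10.1, 9.0, 11.4, 8.7, 9.6, 9.2, 9.7, 8.9, 9.2,
        9.4, 9.7, 8.9, 9.3, 10.4, 11.9, 9.0, 12.0, 9.6, 9.2, 9.9, 9.0]


def pooled_t(x, y):
    """Pooled-variance t statistic and its df (equal-variance assumption)."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def separate_variance_t(x, y):
    """Separate-variance t' statistic with Satterthwaite's approximate df."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_pooled, df_pooled = pooled_t(female, male)
t_prime, df_prime = separate_variance_t(female, male)
print(f"pooled t = {t_pooled:.2f} on {df_pooled} df")
print(f"t' = {t_prime:.2f} on about {df_prime:.1f} df")
```

The printed values can be compared with the T and DF entries in the computer output supplied with this exercise; software typically truncates the approximate df for t′ to an integer.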

a. Identify the value of the pooled-variance t statistic (the usual t test based on the equal variance assumption).
b. Identify the value of the t′ statistic.
c. Use both statistics to test the research hypothesis of unequal means at α = .05 and at α = .01. Does the conclusion depend on which statistic is used?

[Figure: Boxplots of females’ and males’ bonuses (means are indicated by solid circles). Horizontal axis: bonus percentage, 6 to 12.]

Two-Sample T-Test and Confidence Interval

Two-sample T for Female vs Male

          N    Mean   StDev   SE Mean
Female   24    8.53    1.19      0.24
Male     36    9.68    1.00      0.17

95% CI for mu Female - mu Male: (-1.74, -0.56)
T-Test mu Female = mu Male (vs not =): T = -3.90  P = 0.0002  DF = 43

95% CI for mu Female - mu Male: (-1.72, -0.58)
T-Test mu Female = mu Male (vs not =): T = -4.04  P = 0.0001  DF = 58
Both use Pooled StDev = 1.08

6.3 A Nonparametric Alternative: The Wilcoxon Rank Sum Test

6.14 Set up the rejection regions for testing:

a. H₀: Δ = 0 versus Hₐ: Δ ≠ 0, with n₁ = 10, n₂ = 8, and α = .10
b. H₀: Δ = 0 versus Hₐ: Δ < 0, with n₁ = n₂ = 7, and α = .05
c. H₀: Δ = 0 versus Hₐ: Δ > 0, with n₁ = 8, n₂ = 9, and α = .025
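Rejection regions of this kind are normally read from a table, but for small samples they can also be derived from the exact null distribution of the rank sum. The sketch below is one way to do that in Python, under stated assumptions: the test statistic T is taken to be the sum of the ranks assigned to sample 1 in the combined sample, there are no ties, and Δ < 0 is taken to mean that sample 1 tends to have the smaller values. The function names are illustrative, and tabled values may differ slightly if the table uses a different convention at the boundary.

```python
from math import comb


def rank_sum_counts(n1, n2):
    """counts[s] = number of ways to choose n1 ranks from 1..n1+n2 that sum to s."""
    n = n1 + n2
    max_sum = sum(range(n2 + 1, n + 1))  # sum of the n1 largest ranks
    counts = [[0] * (max_sum + 1) for _ in range(n1 + 1)]
    counts[0][0] = 1
    for rank in range(1, n + 1):
        # k runs downward so each rank is used at most once per subset
        for k in range(min(rank, n1), 0, -1):
            for s in range(rank, max_sum + 1):
                counts[k][s] += counts[k - 1][s - rank]
    return counts[n1]


def lower_critical_value(n1, n2, alpha):
    """Largest attainable c with P(T <= c) <= alpha under H0, or None if none exists."""
    counts = rank_sum_counts(n1, n2)
    total = comb(n1 + n2, n1)  # all rank assignments are equally likely under H0
    cumulative, critical = 0, None
    for s, count in enumerate(counts):
        cumulative += count
        if cumulative / total > alpha:
            break
        if cumulative > 0:
            critical = s
    return critical


def upper_critical_value(n1, n2, alpha):
    """Smallest c with P(T >= c) <= alpha, using symmetry of T about n1(n1+n2+1)/2."""
    lower = lower_critical_value(n1, n2, alpha)
    return None if lower is None else n1 * (n1 + n2 + 1) - lower


# Part a is two-sided, so alpha = .10 is split as .05 in each tail.
print("a: reject if T <=", lower_critical_value(10, 8, 0.05),
      "or T >=", upper_critical_value(10, 8, 0.05))
print("b: reject if T <=", lower_critical_value(7, 7, 0.05))
print("c: reject if T >=", upper_critical_value(8, 9, 0.025))
```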

6.15 Conduct a test of H₀: Δ = 0 versus Hₐ: Δ > 0 for the sample data given here. Use α = .05.

Treatment 1   4.3  4.6  4.7  5.1  5.3  5.3  5.8
Treatment 2   3.5  3.8  3.7  3.9  4.4  4.7  5.2

6.16 Refer to the data of Exercise 6.15. Place a 95% confidence interval on the median difference between the two treatments, Δ.

Bus. 6.17 A cable TV company was interested in making its operation more efficient by cutting down on the distance between service calls while still maintaining at least the same level of service quality. A treatment group of 18 repairpersons was assigned to a dispatcher who monitored all the incoming requests for cable repairs and then provided a service strategy for that day’s work orders. A control group of 18 repairpersons was to perform their work in a normal fashion, by providing service in roughly a sequential order as requests for repairs were received. The average daily mileages for the 36 repairpersons are recorded here:

Treatment Group   62.2   79.3   83.2   82.2   84.1   89.3   95.8   97.9   91.5
                  96.6   90.1   98.6   85.2   87.9   86.7   99.7  101.1   88.6
Control Group     97.1   70.2   94.6  182.9   85.6   89.5  109.5  101.7   99.7
                 193.2  105.3   92.9   63.9   88.2   99.1   95.1   92.4   87.3

a. Various plots of the data are given here. Based on these plots, which of the test procedures presented in this chapter appears more appropriate for assessing if the treatment group has smaller daily mileages in comparison to the control group?
b. Computer output is provided for two versions of the t test and a Wilcoxon rank sum test (the Mann-Whitney and Wilcoxon rank sum tests are equivalent). Compare the results for these three tests and draw a conclusion about the effectiveness of the new dispatcher program. Use α = .05.
c. Place a 95% confidence interval on the differences between the daily mileages for the two groups of repairpersons.
d. Does it matter which of the three tests is used in this study? Might it be reasonable to run all three tests in certain situations? Why or why not?

[Figure: Boxplot of treatment and control groups. Axis: average daily mileage, 50 to 200.]
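With 18 observations per group, the rank sum for the mileage data is usually referred to a normal approximation rather than an exact table. The following Python sketch shows one way to compute it, under these assumptions: T is the sum of the ranks of the first sample in the combined data, tied observations receive the average of the ranks they occupy (midranks), and the variance uses the standard correction for ties. The names `midranks` and `rank_sum_z` are illustrative and do not come from the computer output referred to in part b.

```python
from collections import Counter
from math import sqrt

treatment = [62.2, 79.3, 83.2, 82.2, 84.1, 89.3, 95.8, 97.9, 91.5,
             96.6, 90.1, 98.6, 85.2, 87.9, 86.7, 99.7, 101.1, 88.6]
control = [97.1, 70.2, 94.6, 182.9, 85.6, 89.5, 109.5, 101.7, 99.7,
           193.2, 105.3, 92.9, 63.9, 88.2, 99.1, 95.1, 92.4, 87.3]


def midranks(values):
    """Ranks from smallest to largest; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_z(x, y):
    """Rank sum T for sample x and its normal-approximation z score."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    t_stat = sum(midranks(combined)[:n1])
    mu = n1 * (n + 1) / 2
    # Tie correction: sum of t(t^2 - 1) over groups of t tied values
    tie_term = sum(t * (t * t - 1) for t in Counter(combined).values()) / (n * (n - 1))
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    return t_stat, (t_stat - mu) / sigma


t_treatment, z = rank_sum_z(treatment, control)
print(f"T (treatment) = {t_treatment}, z = {z:.2f}")
```

A clearly negative z would support the claim that the treatment group tends to have smaller daily mileages; the value can be compared with the standard normal critical value for the chosen α.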