Obtain the bootstrap p-value for testing whether the mean percentage of SiO Why is there such a good agreement between the t-based and bootstrap values in

CHAPTER 6
Inferences Comparing Two Population Central Values

6.1 Introduction and Abstract of Research Study
6.2 Inferences about $\mu_1 - \mu_2$: Independent Samples
6.3 A Nonparametric Alternative: The Wilcoxon Rank Sum Test
6.4 Inferences about $\mu_1 - \mu_2$: Paired Data
6.5 A Nonparametric Alternative: The Wilcoxon Signed-Rank Test
6.6 Choosing Sample Sizes for Inferences about $\mu_1 - \mu_2$
6.7 Research Study: Effects of Oil Spill on Plant Growth
6.8 Summary and Key Formulas
6.9 Exercises

6.1 Introduction and Abstract of Research Study

The inferences we have made so far have concerned a parameter from a single population. Quite often we are faced with an inference involving a comparison of parameters from different populations. We might wish to compare the mean corn crop yield for two different varieties of corn, the mean annual income for two ethnic groups, the mean nitrogen content of two different lakes, or the mean length of time between administration and eventual relief for two different antivertigo drugs.

In many sampling situations, we will select independent random samples from two populations to compare the populations’ parameters. The statistics used to make these inferences will, in many cases, be the difference between the corresponding sample statistics. Suppose we select independent random samples of $n_1$ observations from one population and $n_2$ observations from a second population. We will use the difference between the sample means, $\bar{y}_1 - \bar{y}_2$, to make an inference about the difference between the population means, $\mu_1 - \mu_2$.

The following theorem will help in finding the sampling distribution for the difference between sample statistics computed from independent random samples.

THEOREM 6.1
If two independent random variables $y_1$ and $y_2$ are normally distributed with means and variances $(\mu_1, \sigma_1^2)$ and $(\mu_2, \sigma_2^2)$, respectively, the difference between the random variables is normally distributed with mean equal to $\mu_1 - \mu_2$ and variance equal to $\sigma_1^2 + \sigma_2^2$. Similarly, the sum of the random variables, $y_1 + y_2$, is normally distributed with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.

Theorem 6.1 can be applied directly to find the sampling distribution of the difference between two independent sample means or two independent sample proportions. The Central Limit Theorem discussed in Chapter 4 implies that if two random samples of sizes $n_1$ and $n_2$ are independently selected from two populations 1 and 2, then, where $n_1$ and $n_2$ are large, the sampling distributions of $\bar{y}_1$ and $\bar{y}_2$ will be approximately normal, with means and variances $(\mu_1, \sigma_1^2/n_1)$ and $(\mu_2, \sigma_2^2/n_2)$, respectively. Consequently, because $\bar{y}_1$ and $\bar{y}_2$ are independent, normally distributed random variables, it follows from Theorem 6.1 that the sampling distribution for the difference in the sample means, $\bar{y}_1 - \bar{y}_2$, is approximately normal, with a mean

$$\mu_{\bar{y}_1 - \bar{y}_2} = \mu_1 - \mu_2,$$

variance

$$\sigma^2_{\bar{y}_1 - \bar{y}_2} = \sigma^2_{\bar{y}_1} + \sigma^2_{\bar{y}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2},$$

and a standard error

$$\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

The sampling distribution of the difference between two independent, normally distributed sample means is shown in Figure 6.1.

Properties of the Sampling Distribution for the Difference between Two Sample Means, $\bar{y}_1 - \bar{y}_2$

1. The sampling distribution of $\bar{y}_1 - \bar{y}_2$ is approximately normal for large samples.
2. The mean of the sampling distribution, $\mu_{\bar{y}_1 - \bar{y}_2}$, is equal to the difference between the population means, $\mu_1 - \mu_2$.
3. The standard error of the sampling distribution is

$$\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

The sampling distribution for the difference between two sample means, $\bar{y}_1 - \bar{y}_2$, can be used to answer the same types of questions as we asked about the sampling distribution for $\bar{y}$ in Chapter 4. Because sample statistics are used to make inferences about corresponding population parameters, we can use the sampling distribution of a statistic to calculate the probability that the statistic will be within a specified distance of the population parameter. For example, we could use the sampling distribution of the difference in sample means to calculate the probability that $\bar{y}_1 - \bar{y}_2$ will be within a specified distance of the unknown difference in population means $\mu_1 - \mu_2$. Inferences (estimations or tests) about $\mu_1 - \mu_2$ will be discussed in succeeding sections of this chapter.
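The properties of the sampling distribution of the difference between two sample means can be checked numerically. The following Python sketch is one possible illustration, not a procedure taken from the text. It computes the standard error and the normal-approximation probability that the difference in sample means falls within a chosen distance of the difference in population means, and it compares both with a small simulation. The function names (`se_diff`, `prob_within`, `simulate_diffs`) are hypothetical, the population values are assumed purely for illustration, and normal populations are simulated for simplicity.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def se_diff(sigma1, n1, sigma2, n2):
    """Standard error of ybar1 - ybar2 for independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)


def prob_within(d, sigma1, n1, sigma2, n2):
    """Normal-approximation probability that ybar1 - ybar2 falls
    within d units of mu1 - mu2."""
    z = d / se_diff(sigma1, n1, sigma2, n2)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z) - std_normal.cdf(-z)


def simulate_diffs(mu1, sigma1, n1, mu2, sigma2, n2, reps=5000, seed=1):
    """Draw `reps` pairs of independent normal samples and return the
    observed differences ybar1 - ybar2."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        ybar1 = fmean([rng.gauss(mu1, sigma1) for _ in range(n1)])
        ybar2 = fmean([rng.gauss(mu2, sigma2) for _ in range(n2)])
        diffs.append(ybar1 - ybar2)
    return diffs


if __name__ == "__main__":
    # Illustrative (assumed) population values, not taken from the text.
    mu1, sigma1, n1 = 50.0, 10.0, 25
    mu2, sigma2, n2 = 45.0, 12.0, 36

    diffs = simulate_diffs(mu1, sigma1, n1, mu2, sigma2, n2)
    # Fraction of simulated differences within 3 units of mu1 - mu2.
    inside = sum(abs(x - (mu1 - mu2)) <= 3.0 for x in diffs) / len(diffs)

    print("mean: theory", mu1 - mu2, "simulated", fmean(diffs))
    print("SE:   theory", se_diff(sigma1, n1, sigma2, n2),
          "simulated", pstdev(diffs))
    print("P(within 3): theory", prob_within(3.0, sigma1, n1, sigma2, n2),
          "simulated", inside)
```

With these assumed values the standard error is $\sqrt{100/25 + 144/36} = \sqrt{8} \approx 2.83$, and the normal approximation gives a probability of roughly 0.71 that the difference in sample means lies within 3 units of $\mu_1 - \mu_2$. The simulated mean, standard deviation, and proportion should be close to these values, although they vary with the seed and the number of replications.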