Evaluating Whether or Not a Population

the same fashion as the population percentiles, which were defined in Section 4.10. Thus, the sample quantile $Q(u)$ has at least 100u% of the data values less than $Q(u)$ and at least 100(1 - u)% of the data values greater than $Q(u)$. For example, $Q(.1)$ has at least 10% of the data values less than $Q(.1)$ and at least 90% of the data values greater than $Q(.1)$. $Q(.5)$ has at least 50% of the data values less than $Q(.5)$ and at least 50% of the data values greater than $Q(.5)$. Finally, $Q(.75)$ has at least 75% of the data values less than $Q(.75)$ and at least 25% of the data values greater than $Q(.75)$. This motivates the following definition for the sample quantiles:

DEFINITION 4.14 Let $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ be the ordered values from a data set. The $[(i - .5)/n]$th sample quantile, $Q((i - .5)/n)$, is $y_{(i)}$. That is, $y_{(1)} = Q(.5/n)$ is the $[.5/n]$th sample quantile, $y_{(2)} = Q(1.5/n)$ is the $[1.5/n]$th sample quantile, . . . , and lastly, $y_{(n)} = Q((n - .5)/n)$ is the $[(n - .5)/n]$th sample quantile.

Suppose we had a sample of n = 20 observations: $y_{(1)}, y_{(2)}, \ldots, y_{(20)}$. Then, $y_{(1)} = Q(.5/20) = Q(.025)$ is the .025th sample quantile, $y_{(2)} = Q(1.5/20) = Q(.075)$ is the .075th sample quantile, $y_{(3)} = Q(2.5/20) = Q(.125)$ is the .125th sample quantile, . . . , and $y_{(20)} = Q(19.5/20) = Q(.975)$ is the .975th sample quantile.

In order to evaluate whether a population distribution is normal, a random sample of n observations is obtained, the sample quantiles are computed, and these n quantiles are compared to the corresponding quantiles computed using the conjectured population distribution. If the conjectured distribution is the normal distribution, then we would use the normal tables to obtain the quantiles $z_{(i - .5)/n}$ for i = 1, 2, . . . , n. The normal quantiles are obtained from the standard normal tables, Table 1, for the n values .5/n, 1.5/n, . . . , (n - .5)/n. For example, if we had n = 20 data values, then we would obtain the normal quantiles for .5/20 = .025, 1.5/20 = .075, 2.5/20 = .125, . . . , (20 - .5)/20 = .975.
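The quantile levels in Definition 4.14 and the matching normal quantiles can also be computed directly rather than read from Table 1. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for the normal table; the function names `quantile_levels` and `normal_quantiles` are illustrative and do not come from the text.

```python
from statistics import NormalDist


def quantile_levels(n):
    """Return the levels (i - .5)/n for i = 1, ..., n."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def normal_quantiles(n):
    """Return the standard normal quantiles at the levels (i - .5)/n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(u) for u in quantile_levels(n)]


levels = quantile_levels(20)
z = normal_quantiles(20)

# Show the first three level/quantile pairs for n = 20.
for u, q in list(zip(levels, z))[:3]:
    print(f"level {u:.3f}  normal quantile {q:.3f}")
```

Because the values are computed rather than read from a table, they may differ from the tabled quantiles in the third decimal place.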
From Table 1, we find that these quantiles are given by $z_{.025} = -1.960$, $z_{.075} = -1.440$, $z_{.125} = -1.150$, . . . , $z_{.975} = 1.960$. The normal quantile plot is obtained by plotting the n pairs of points

$$(z_{.5/n},\, y_{(1)});\ (z_{1.5/n},\, y_{(2)});\ (z_{2.5/n},\, y_{(3)});\ \ldots;\ (z_{(n - .5)/n},\, y_{(n)}).$$

If the population from which the sample of n values was randomly selected has a normal distribution, then the plotted points should fall close to a straight line. The following example will illustrate these ideas.

EXAMPLE 4.27 It is generally assumed that cholesterol readings in large populations have a normal distribution. In order to evaluate this conjecture, the cholesterol readings of n = 20 patients were obtained. These are given in Table 4.12, along with the corresponding normal quantile values. It is important to note that the cholesterol readings are given in an ordered fashion from smallest to largest. The smallest cholesterol reading is matched with the smallest normal quantile, the second-smallest cholesterol reading with the second-smallest quantile, and so on. Obtain the normal quantile plot for the cholesterol data and assess whether the data were selected from a population having a normal distribution.

TABLE 4.12 Sample and normal quantiles for cholesterol readings

Patient   Cholesterol Reading   (i - .5)/20   Normal Quantile
   1            133                .025          -1.960
   2            137                .075          -1.440
   3            148                .125          -1.150
   4            149                .175           -.935
   5            152                .225           -.755
   6            167                .275           -.598
   7            174                .325           -.454
   8            179                .375           -.319
   9            189                .425           -.189
  10            192                .475           -.063
  11            201                .525            .063
  12            209                .575            .189
  13            210                .625            .319
  14            211                .675            .454
  15            218                .725            .598
  16            238                .775            .755
  17            245                .825            .935
  18            248                .875           1.150
  19            253                .925           1.440
  20            257                .975           1.960

Solution A plot of the sample quantiles versus the corresponding normal quantiles is displayed in Figure 4.27. The plotted points generally follow a straight line pattern.

FIGURE 4.27 Normal quantile plot (vertical axis: cholesterol readings; horizontal axis: normal quantiles)
Using Minitab, we can obtain a plot with a fitted line that assists us in assessing how close the plotted points fall relative to a straight line. This plot is displayed in Figure 4.28. The 20 points appear to be relatively close to the fitted line, and thus the normal quantile plot suggests that normality of the population distribution is plausible.

With a graphical procedure, there is a high degree of subjectivity in assessing how well the plotted points fit a straight line. The scales of the axes on the plot can be increased or decreased, resulting in a change in our assessment of fit. Therefore, a quantitative assessment of the degree to which the plotted points fall near a straight line will be introduced.

In Chapter 3, we introduced the sample correlation coefficient r to measure the degree to which two variables satisfied a linear relationship. We will now discuss how this coefficient can be used to assess our certainty that the sample data were selected from a population having a normal distribution. First, we must alter which normal quantiles are associated with the ordered data values. In the above discussion, we used the normal quantiles corresponding to (i - .5)/n. In calculating the correlation between the ordered data values and the normal quantiles, a more precise measure is obtained if we associate the (i - .375)/(n + .25) normal quantiles for i = 1, . . . , n with the n data values $y_{(1)}, \ldots, y_{(n)}$. We then calculate the value of the correlation coefficient, r, from the n pairs of values. To provide a more definitive assessment of our level of certainty that the data were sampled from a normal distribution, we then obtain a value from Table 16 in the Appendix. This value, called a p-value, can then be used along with the criteria in Table 4.13 to rate the degree of fit of the data to a normal distribution.
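The adjusted levels (i - .375)/(n + .25) used for the correlation calculation can be generated in the same way as the plotting levels (i - .5)/n. The sketch below assumes Python's standard-library `statistics.NormalDist` in place of the normal table; the name `correlation_positions` is a hypothetical helper, not terminology from the text.

```python
from statistics import NormalDist


def correlation_positions(n):
    """Return the levels (i - .375)/(n + .25) for i = 1, ..., n."""
    return [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]


n = 20
positions = correlation_positions(n)

# Normal quantiles paired with the ordered data values y(1), ..., y(n).
x = [NormalDist().inv_cdf(p) for p in positions]

# Compare with the plotting levels (i - .5)/n at the two extremes.
plotting_levels = [(i - 0.5) / n for i in range(1, n + 1)]
print(f"smallest level: {plotting_levels[0]:.3f} versus {positions[0]:.3f}")
print(f"largest level:  {plotting_levels[-1]:.3f} versus {positions[-1]:.3f}")
```

The adjusted levels remain symmetric about .5, so the resulting normal quantiles still average to zero.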
FIGURE 4.28 Normal quantile plot with fitted line (vertical axis: cholesterol readings; horizontal axis: normal quantiles). Fitted line: Cholesterol = 195.5 + 39.4884 Normal Quantiles; S = 8.30179, R-Sq = 95.9%, R-Sq(adj) = 95.7%

TABLE 4.13 Criteria for assessing fit of normal distribution

p-value             Assessment of Normality
p < .01             Very poor fit
.01 <= p < .05      Poor fit
.05 <= p < .10      Acceptable fit
.10 <= p < .50      Good fit
p >= .50            Excellent fit

It is very important that the normal quantile plot accompany the calculation of the correlation because large sample sizes may result in an assessment of a poor fit when the graph would indicate otherwise. The following example will illustrate the calculations involved in obtaining the correlation.

EXAMPLE 4.28 Consider the cholesterol data in Example 4.27. Calculate the correlation coefficient and make a determination of the degree of fit of the data to a normal distribution.

Solution The data are summarized in Table 4.14 along with their corresponding normal quantiles:

TABLE 4.14 Normal quantiles data

Patient   Cholesterol Reading   (i - .375)/(20 + .25)   Normal Quantile
   i            y_i                                          x_i
   1            133                   .031                  -1.868
   2            137                   .080                  -1.403
   3            148                   .130                  -1.128
   4            149                   .179                   -.919
   5            152                   .228                   -.744
   6            167                   .278                   -.589
   7            174                   .327                   -.448
   8            179                   .377                   -.315
   9            189                   .426                   -.187
  10            192                   .475                   -.062
  11            201                   .525                    .062
  12            209                   .574                    .187
  13            210                   .623                    .315
  14            211                   .673                    .448
  15            218                   .722                    .589
  16            238                   .772                    .744
  17            245                   .821                    .919
  18            248                   .870                   1.128
  19            253                   .920                   1.403
  20            257                   .969                   1.868

The calculation of the correlation between cholesterol reading y and normal quantile x will be done in Table 4.15. First, we compute $\bar{x} = 0$ and $\bar{y} = 195.5$. Then the calculation of the correlation will proceed as in our calculations from Chapter 3.

TABLE 4.15 Calculation of correlation coefficient

 x_i - 0   y_i - 195.5   (x_i - 0)(y_i - 195.5)   (y_i - 195.5)^2   (x_i - 0)^2
 -1.868      -62.5            116.765                3906.25          3.49033
 -1.403      -58.5             82.100                3422.25          1.96957
 -1.128      -47.5             53.587                2256.25          1.27271
  -.919      -46.5             42.740                2162.25           .84481
  -.744      -43.5             32.370                1892.25           .55375
  -.589      -28.5             16.799                 812.25           .34746
  -.448      -21.5              9.627                 462.25           .20050
  -.315      -16.5              5.190                 272.25           .09896
  -.187       -6.5              1.214                  42.25           .03488
  -.062       -3.5               .217                  12.25           .00384
   .062        5.5               .341                  30.25           .00384
   .187       13.5              2.521                 182.25           .03488
   .315       14.5              4.561                 210.25           .09896
   .448       15.5              6.940                 240.25           .20050
   .589       22.5             13.263                 506.25           .34746
   .744       42.5             31.626                1806.25           .55375
   .919       49.5             45.497                2450.25           .84481
  1.128       52.5             59.228                2756.25          1.27271
  1.403       57.5             80.696                3306.25          1.96957
  1.868       61.5            114.897                3782.25          3.49033
 Total                        720.18                30511            17.634

The correlation is then computed as

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} = \frac{720.18}{\sqrt{(17.634)(30511)}} = .982$$

From Table 16 in the Appendix with n = 20 and r = .982, we obtain p-value ≈ .50. This value is obtained by locating the number in the row for n = 20 that is closest to r = .982. The $\alpha$-value heading this column is the p-value. Thus, we would appear to have an excellent fit between the sample data and the normal distribution. This is consistent with the fit that is displayed in Figure 4.28, where the 20 plotted points are very near to the straight line.
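The hand calculation in Table 4.15 can be checked numerically. The following sketch recomputes r for the 20 cholesterol readings, assuming Python's standard-library `statistics.NormalDist` in place of the normal table, and encodes the rating criteria of Table 4.13 in a hypothetical helper named `assess_fit`. The p-value itself still has to be read from Table 16, which is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist, mean

# Ordered cholesterol readings from Table 4.14.
cholesterol = [133, 137, 148, 149, 152, 167, 174, 179, 189, 192,
               201, 209, 210, 211, 218, 238, 245, 248, 253, 257]
n = len(cholesterol)

# Normal quantiles at the levels (i - .375)/(n + .25).
x = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

x_bar = mean(x)
y_bar = mean(cholesterol)

# Sums of cross products and squared deviations, as in Table 4.15.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, cholesterol))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in cholesterol)

r = sxy / sqrt(sxx * syy)
print(f"y-bar = {y_bar}, r = {r:.3f}")


def assess_fit(p_value):
    """Rate the fit to a normal distribution using the Table 4.13 cutoffs."""
    if p_value < 0.01:
        return "Very poor fit"
    if p_value < 0.05:
        return "Poor fit"
    if p_value < 0.10:
        return "Acceptable fit"
    if p_value < 0.50:
        return "Good fit"
    return "Excellent fit"


# p-value of about .50 taken from the text (Table 16 lookup).
print(assess_fit(0.50))
```

Small differences from the table totals are expected because the quantiles here are not rounded to three decimals.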

4.15 Research Study: Inferences about Performance-Enhancing Drugs among Athletes

As was discussed in the abstract to the research study given at the beginning of this chapter, the use of performance-enhancing substances has two major consequences: the artificial enhancement of performance (known as doping) and the potentially significant health effects of harmful substances on the athlete. However, failing a drug test can devastate an athlete's career. The controversy over performance-enhancing drugs has brought into serious question the reliability of the tests for these drugs. The article in Chance discussed at the beginning of this chapter examines the case of Olympic runner Mary Decker Slaney. Ms. Slaney was a world-class distance runner during the 1970s and 1980s. After a series of illnesses and injuries, she was forced to stop competitive running. However, at the age of 37, Slaney made a comeback in long-distance running. Slaney submitted to a mandatory test of her urine at the 1996 U.S. Olympic Trials. The results indicated that she had elevated levels of testosterone and hence may have used a banned performance-enhancing drug. Her attempt at a comeback was halted by her subsequent suspension by USA Track and Field (USATF). Slaney maintained her innocence throughout a series of hearings before USATF and was exonerated in September 1997 by a Doping Hearing Board of the USATF. However, the U.S. Olympic Committee (USOC) overruled the USATF decision and stated that Slaney was guilty of a doping offense. Although Slaney continued to maintain that she had never used the drug, her career as a competitive runner was terminated.

Anti-doping officials regard a positive test result as irrefutable evidence that an illegal drug was used, to the exclusion of any other explanation. We will now address how Bayes' Formula, the sensitivity and specificity of a test, and the prior probability of drug use can be used to explain to anti-doping officials that drug tests can be wrong.
We will use tests for detecting artificial increases in testosterone concentrations to illustrate the various concepts involved in determining the reliability of a testing procedure. The article states, "Scientists have attempted to detect artificial increases in testosterone concentrations through the establishment of a 'normal urinary range' for the T/E ratio." Despite the many limitations in setting this limit, scientists set the threshold for positive testosterone doping at a T/E ratio greater than 6:1. The problem is to determine the probabilities associated with various tests for the T/E ratio. In particular, what is the probability that an athlete is a banned-drug user given that she tests positive for the drug (the positive predictive value, or PPV)?

We will use the example given in the article. Suppose in a population of 1,000 athletes there are 20 users. That is, prior to testing a randomly selected athlete for the drug, there is a 20/1,000 = 2% chance that the athlete is a user (the prior probability of randomly selecting a user is .02 = 2%). Suppose the testing procedure has a sensitivity of 80% and a specificity of 99%. Thus, 16 of the 20 users would test positive, 20(.8) = 16, and about 10 of the nonusers would test positive, 980(1 - .99) = 9.8. If an athlete tests positive, what is the probability she is a user? We now have to make use of Bayes' Formula to compute PPV:

$$\text{PPV} = \frac{\text{sens} \times \text{prior}}{\text{sens} \times \text{prior} + (1 - \text{spec})(1 - \text{prior})},$$

where "sens" is the sensitivity of the test, "spec" is the specificity of the test, and "prior" is the prior probability that an athlete is a banned-drug user. For our example with a population of 1,000 athletes,

$$\text{PPV} = \frac{(.8)(20/1{,}000)}{(.8)(20/1{,}000) + (1 - .99)(1 - 20/1{,}000)} = .62$$

Therefore, if an athlete tests positive, there is only a 62% chance that she has used the drug. Even if the sensitivity of the test is increased to 100%, the PPV is still relatively small:

$$\text{PPV} = \frac{(1)(20/1{,}000)}{(1)(20/1{,}000) + (1 - .99)(1 - 20/1{,}000)} = .67$$

There is about a 33% chance that the athlete is a nonuser even though the test result was positive.
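The counting argument for the population of 1,000 athletes can be reproduced in a few lines. This is a minimal sketch of the expected-count reasoning, with a hypothetical helper `positive_test_counts`; it treats the sensitivity and specificity as exact proportions, as the example in the text does.

```python
def positive_test_counts(population, users, sens, spec):
    """Return the expected (true positives, false positives) in a population."""
    nonusers = population - users
    true_positives = sens * users             # users who test positive
    false_positives = (1 - spec) * nonusers   # nonusers who test positive
    return true_positives, false_positives


# Example from the text: 1,000 athletes, 20 users, sens = .80, spec = .99.
tp, fp = positive_test_counts(1000, 20, 0.80, 0.99)
ppv = tp / (tp + fp)
print(f"true positives {tp:.1f}, false positives {fp:.1f}, PPV {ppv:.2f}")

# Same population with a perfectly sensitive test (sens = 1).
tp_full, fp_full = positive_test_counts(1000, 20, 1.00, 0.99)
ppv_full = tp_full / (tp_full + fp_full)
print(f"PPV with sensitivity 1: {ppv_full:.2f}")
```

Dividing the expected true positives by all expected positives gives the same value as substituting the rates into Bayes' Formula, since the population size cancels.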
Thus, if the prior probability is small, there will always be a high degree of uncertainty with the test result even when the test has values of sensitivity and specificity near 1. However, if the prior probability is fairly large, then the PPV will be much closer to 1. For example, if the population consists of 900 users and only 100 nonusers, and the testing procedure has sensitivity = .9 and specificity = .99, then the PPV would be .9988:

$$\text{PPV} = \frac{(.9)(900/1{,}000)}{(.9)(900/1{,}000) + (1 - .99)(1 - 900/1{,}000)} = .9988$$

That is, the chance that the tested athlete is a user given that she produced a positive test would be 99.88%, leaving a very small chance of a false positive.

From this we conclude that an essential factor in Bayes' Formula is the prior probability of an athlete being a banned-drug user. Making matters even worse in this situation is the fact that the prevalence (prior probability) of substance abuse is very difficult to determine. Hence, there will inevitably be a subjective aspect to assigning a prior probability. The authors of the article comment on the selection of the prior probability, suggesting that a hearing board consisting of athletes participating in the same sport as the athlete being tested would be especially appropriate for making decisions about prior probabilities. For example, assuming the board knows nothing about the athlete beyond what is presented at the hearing, they might regard drug abuse to be rare, and hence the PPV would be at most moderately large. On the other hand, if the board knew that drug abuse is widespread, then the probability of abuse would be larger, based on a positive test result.

To investigate further the relationship between PPV, prior probability, and sensitivity, for a fixed specificity of 99%, consider Figure 4.29. The calculations of PPV are obtained by using Bayes' Formula for a selection of values of prior and sensitivity, with specificity = .99.
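The dependence of PPV on the prior and the sensitivity can be tabulated directly from Bayes' Formula. The sketch below fixes specificity at .99 as in the text; the particular grid of prior and sensitivity values is an assumption chosen for illustration and is not necessarily the set of values plotted in Figure 4.29.

```python
def ppv(sens, spec, prior):
    """Positive predictive value from Bayes' Formula."""
    true_positive_rate = sens * prior
    false_positive_rate = (1 - spec) * (1 - prior)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


SPEC = 0.99
priors = [0.01, 0.02, 0.05, 0.10, 0.20, 0.50, 0.90]   # illustrative values
sensitivities = [0.5, 0.8, 0.9, 1.0]                  # illustrative values

# One row per sensitivity, one column per prior probability.
print("sens  " + "  ".join(f"{p:6.2f}" for p in priors))
for s in sensitivities:
    row = "  ".join(f"{ppv(s, SPEC, p):6.3f}" for p in priors)
    print(f"{s:4.2f}  {row}")
```

Reading across a row shows how quickly PPV rises with the prior, while reading down a column shows the smaller effect of improving sensitivity when specificity is held fixed.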
We can thus observe that if the sensitivity of the test is relatively low (say, less than 50%), then unless the prior is above 20% we will not be able to achieve a PPV