15.3 Distribution-Free Confidence Intervals

The method we have used so far to construct a confidence interval (CI) can be described as follows: Start with a random variable ($Z$, $T$, $\chi^2$, $F$, or the like) that depends on the parameter of interest and a probability statement involving the variable, manipulate the inequalities of the statement to isolate the parameter between random endpoints, and, finally, substitute computed values for random variables. Another general method for obtaining CIs takes advantage of the relationship between test procedures and CIs discussed in Section 8.5. A $100(1-\alpha)\%$ CI for a parameter $\theta$ can be obtained from a level $\alpha$ test for $H_0: \theta = \theta_0$ versus $H_a: \theta \ne \theta_0$. This method will be used to derive intervals associated with the Wilcoxon signed-rank test and the Wilcoxon rank-sum test.

PROPOSITION Suppose we have a level $\alpha$ test procedure for testing $H_0: \theta = \theta_0$ versus $H_a: \theta \ne \theta_0$. For fixed sample values, let $A$ denote the set of all values $\theta_0$ for which $H_0$ is not rejected. Then $A$ is a $100(1-\alpha)\%$ CI for $\theta$.

This makes intuitive sense because the CI consists of all values of the parameter that are plausible at the selected confidence level, and we do not want to reject $H_0$ in favor of $H_a$ if $\theta_0$ is a plausible value.

There are actually pathological examples in which the set $A$ defined in the proposition is not an interval of $\theta$ values, but instead the complement of an interval or something even stranger. To be more precise, we should really replace the notion of a CI with that of a confidence set. In the cases of interest here, the set $A$ does turn out to be an interval.
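The coverage claim in the proposition can be sketched in one line of reasoning. The notation $A(X_1,\dots,X_n)$ for the random set before data are substituted, and $P_\theta$ for probability computed when $\theta$ is the true value, are assumptions introduced here for illustration:

```latex
\begin{align*}
P_{\theta}\bigl(\theta \in A(X_1,\dots,X_n)\bigr)
  &= P_{\theta}\bigl(H_0\colon \theta_0 = \theta \text{ is not rejected}\bigr) \\
  &= 1 - P_{\theta}\bigl(\text{type I error in testing } \theta_0 = \theta\bigr) \\
  &\ge 1 - \alpha .
\end{align*}
```

The inequality is an equality when the test has type I error probability exactly $\alpha$; for a discrete test statistic the achieved level is typically a little below the nominal $\alpha$, so the confidence level is a little above $100(1-\alpha)\%$.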

The Wilcoxon Signed-Rank Interval

To test $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$ using the Wilcoxon signed-rank test, where $\mu$ is the mean of a continuous symmetric distribution, the absolute values $|x_1 - \mu_0|, \dots, |x_n - \mu_0|$ are ordered from smallest to largest, with the smallest receiving rank 1 and the largest rank $n$. Each rank is then given the sign of its associated $x_i - \mu_0$, and the test statistic is the sum of the positively signed ranks. The two-tailed test rejects $H_0$ if $s_+$ is either $\ge c$ or $\le n(n+1)/2 - c$, where $c$ is obtained from Appendix Table A.13 once the desired level of significance $\alpha$ is specified. For fixed $x_1, \dots, x_n$, the $100(1-\alpha)\%$ signed-rank interval will consist of all $\mu_0$ for which $H_0: \mu = \mu_0$ is not rejected at level $\alpha$. To identify this interval, it is convenient to express the test statistic $S_+$ in another form.

$$S_+ = \text{the number of pairwise averages } (X_i + X_j)/2 \text{ with } i \le j \text{ that are} \ge \mu_0$$

That is, if we average each $x_j$ in the list with each $x_i$ to its left, including $(x_j + x_j)/2$ (which is just $x_j$), and count the number of these averages that are $\ge \mu_0$, $s_+$ results. In moving from left to right in the list of sample values, we are simply averaging every pair of observations in the sample [again including $(x_j + x_j)/2$] exactly once, so the order in which the observations are listed before averaging is not important. The equivalence of the two methods for computing $s_+$ is not difficult to verify. The number of pairwise averages is $\binom{n}{2} + n$ (the first term due to averaging of different observations and the second due to averaging each $x_i$ with itself), which equals $n(n+1)/2$. It can be shown that P-value $\le \alpha$ if and only if either too many or too few of these pairwise averages are $\ge \mu_0$, in which case $H_0$ is rejected.
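The equivalence of the two forms of the statistic can be checked numerically. The Python sketch below computes $s_+$ both ways; the function names are illustrative, and the code assumes no ties among the absolute deviations and no pairwise average exactly equal to $\mu_0$ (true with probability 1 for continuous data). The sample values are the seven observations of Example 15.6.

```python
from itertools import combinations_with_replacement


def s_plus_signed_ranks(x, mu0):
    """Sum of the ranks attached to positive deviations x_i - mu0.

    Assumes no ties among the |x_i - mu0| and no deviation equal to zero.
    """
    devs = [xi - mu0 for xi in x]
    # Indices ordered by absolute deviation; position in this order + 1 is the rank.
    order = sorted(range(len(devs)), key=lambda i: abs(devs[i]))
    return sum(rank for rank, i in enumerate(order, start=1) if devs[i] > 0)


def s_plus_pairwise(x, mu0):
    """Number of pairwise averages (x_i + x_j)/2 with i <= j that are >= mu0."""
    return sum(
        1 for a, b in combinations_with_replacement(x, 2) if (a + b) / 2 >= mu0
    )


sample = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]
for mu0 in (4.0, 4.6, 5.0, 5.5, 7.0):
    # The two columns printed after mu0 should agree on every row.
    print(mu0, s_plus_signed_ranks(sample, mu0), s_plus_pairwise(sample, mu0))
```

For $\mu_0 = 5.0$, for instance, the positive deviations carry ranks 2, 6, and 7, so $s_+ = 15$, and exactly 15 of the 28 pairwise averages are at least 5.0.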

EXAMPLE 15.6 The following observations are values of cerebral metabolic rate for rhesus monkeys: $x_1 = 4.51$, $x_2 = 4.59$, $x_3 = 4.90$, $x_4 = 4.93$, $x_5 = 6.80$, $x_6 = 5.08$, $x_7 = 5.67$.

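The 28 pairwise averages are, in increasing order,

4.51 4.55 4.59 4.705 4.72 4.745 4.76 4.795 4.835 4.90 4.915 4.93 4.99 5.005
5.08 5.09 5.13 5.285 5.30 5.375 5.655 5.67 5.695 5.85 5.865 5.94 6.235 6.80

The first few and the last few of these are pictured in Figure 15.2.

[Figure 15.2 Plot of the data for Example 15.6; the figure marks the range of $\mu_0$ values for which $H_0$ is accepted at level .046.]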

Because $S_+$ is a discrete rv, $\alpha = .05$ cannot be obtained exactly. Appendix Table A.13 shows that the P-value for a two-tailed test is $2(.023) = .046$ if either $s_+ = 26$ or 2. Thus $H_0$ will not be rejected at significance level .046 if $3 \le s_+ \le 25$. That is, if the number of pairwise averages $\ge \mu_0$ is between 3 and 25, inclusive, $H_0$ is not rejected. From Figure 15.2 the CI for $\mu$ with confidence level 95.4% (approximately 95%) is (4.59, 5.94). ■
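The test-inversion idea can also be carried out by brute force. The sketch below (an illustration, not the text's procedure) scans a grid of candidate values $\mu_0$ from 4.00 to 7.00 in steps of .01 and keeps those with $3 \le s_+ \le 25$. The grid, the step size, and the use of exact `Fraction` arithmetic to avoid floating-point ambiguity at the endpoints are all choices made here.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Example 15.6 data, stored exactly so boundary comparisons are unambiguous.
x = [Fraction(s) for s in ("4.51", "4.59", "4.90", "4.93", "6.80", "5.08", "5.67")]


def s_plus(x, mu0):
    """Number of pairwise averages (x_i + x_j)/2, i <= j, that are >= mu0."""
    return sum(1 for a, b in combinations_with_replacement(x, 2) if (a + b) / 2 >= mu0)


grid = [Fraction(k, 100) for k in range(400, 701)]  # 4.00, 4.01, ..., 7.00
# Keep every grid value of mu0 for which H0 is NOT rejected at level .046.
accepted = [mu0 for mu0 in grid if 3 <= s_plus(x, mu0) <= 25]

lo, hi = min(accepted), max(accepted)
# The accepted values form one unbroken run of the grid, i.e. an interval.
is_interval = accepted == [g for g in grid if lo <= g <= hi]
print(float(lo), float(hi), is_interval)
```

On this grid the accepted values run from 4.60 to 5.94. The grid point 4.59 itself is excluded only because an average exactly equal to $\mu_0$ is counted as $\ge \mu_0$; for continuous data such coincidences have probability zero, so whether the endpoints are included is immaterial and the result agrees with the interval (4.59, 5.94).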

In general, once the pairwise averages are ordered from smallest to largest, the endpoints of the Wilcoxon interval are two of the "extreme" averages. To express this precisely, let the smallest pairwise average be denoted by $\bar{x}_{(1)}$, the next smallest by $\bar{x}_{(2)}, \dots$, and the largest by $\bar{x}_{(n(n+1)/2)}$.

PROPOSITION If the level $\alpha$ Wilcoxon signed-rank test for $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$ is to reject $H_0$ if either $s_+ \ge c$ or $s_+ \le n(n+1)/2 - c$, then a $100(1-\alpha)\%$ CI for $\mu$ is

$$\bigl(\bar{x}_{(n(n+1)/2 - c + 1)},\ \bar{x}_{(c)}\bigr) \tag{15.7}$$

In words, the interval extends from the $d$th smallest pairwise average to the $d$th largest average, where $d = n(n+1)/2 - c + 1$. Appendix Table A.15 gives the values of $c$ that correspond to approximately the usual confidence levels for $n = 5, 6, \dots, 25$.
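Interval (15.7) translates directly into a few lines of code. The following Python sketch is one possible implementation; the function name `signed_rank_interval` is hypothetical, and the critical value `c` must still be supplied by the user from the appropriate table.

```python
from itertools import combinations_with_replacement


def signed_rank_interval(x, c):
    """Return the interval (15.7) for a sample x and critical value c.

    The endpoints are the d-th smallest and d-th largest pairwise averages,
    where d = n(n+1)/2 - c + 1.
    """
    n = len(x)
    total = n * (n + 1) // 2  # number of pairwise averages, including i == j
    d = total - c + 1
    if not 1 <= d <= c <= total:
        raise ValueError("c is not a usable critical value for this sample size")
    averages = sorted((a + b) / 2 for a, b in combinations_with_replacement(x, 2))
    # Sorted position c counted from the bottom is position d counted from the top.
    return averages[d - 1], averages[c - 1]


sample = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]
lower, upper = signed_rank_interval(sample, c=26)
print(round(lower, 3), round(upper, 3))
```

With the data of Example 15.6 and $c = 26$, this gives $d = 3$ and reproduces the interval (4.59, 5.94). All $n(n+1)/2$ averages are computed and sorted here for simplicity, which is adequate for the small samples to which the tables apply.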


EXAMPLE 15.7 (Example 15.6 continued)

For $n = 7$, the P-value for a two-tailed test is $2(.055) = .11$ if $s_+ = 24$ or $s_+ = 4$. Therefore the null hypothesis will be rejected at significance level .11 if $s_+ = 0, 1, 2, 3, 4, 24, 25, 26, 27,$ or 28. Thus an 89.0% interval (approximately 90%) is obtained by using $c = 24$. The interval is $(\bar{x}_{(28-24+1)}, \bar{x}_{(24)}) = (\bar{x}_{(5)}, \bar{x}_{(24)}) = (4.72, 5.85)$, which extends from the fifth smallest to the fifth largest pairwise average. ■

The derivation of the interval depended on having a single sample from a continuous symmetric distribution with mean (median) $\mu$. When the data are paired, the interval constructed from the differences $d_1, d_2, \dots, d_n$ is a CI for the mean (median) difference $\mu_D$. In this case, the symmetry of the $X$ and $Y$ distributions need not be assumed; as long as the $X$ and $Y$ distributions have the same shape, the $X - Y$ distribution will be symmetric, so only continuity is required.

For $n > 20$, the large-sample approximation to the Wilcoxon test based on standardizing $S_+$ gives an approximation to $c$ in (15.7). The result [for a $100(1-\alpha)\%$ interval] is

$$c \approx \frac{n(n+1)}{4} + z_{\alpha/2}\sqrt{\frac{n(n+1)(2n+1)}{24}}$$
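As a rough illustration of the large-sample approximation for the signed-rank interval, the Python sketch below evaluates $c \approx n(n+1)/4 + z_{\alpha/2}\sqrt{n(n+1)(2n+1)/24}$ using the standard library's normal quantile function. The function name is hypothetical, and because the text does not state how to round $c$ to an integer, the unrounded value is returned.

```python
from math import sqrt
from statistics import NormalDist


def approx_c_signed_rank(n, alpha):
    """Large-sample approximation to c in (15.7); intended for n > 20.

    Returns the unrounded value n(n+1)/4 + z_{alpha/2} * sqrt(n(n+1)(2n+1)/24).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    return n * (n + 1) / 4 + z * sqrt(n * (n + 1) * (2 * n + 1) / 24)


print(round(approx_c_signed_rank(25, 0.05), 2))
```

For $n = 25$ and $\alpha = .05$ the two terms are 162.5 and about 72.84, so $c$ is approximately 235.3 out of $25(26)/2 = 325$ pairwise averages.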

The efficiency of the Wilcoxon interval relative to the $t$ interval is roughly the same as that for the Wilcoxon test relative to the $t$ test. In particular, for large samples when the underlying population is normal, the Wilcoxon interval will tend to be slightly wider than the $t$ interval, but if the population is quite nonnormal (symmetric but with heavy tails), then the Wilcoxon interval will tend to be much narrower than the $t$ interval.

The Wilcoxon Rank-Sum Interval

The Wilcoxon rank-sum test for testing $H_0: \mu_1 - \mu_2 = \Delta_0$ is carried out by first combining the $(X_i - \Delta_0)$'s and $Y_j$'s into one sample of size $m + n$ and ranking them from smallest (rank 1) to largest (rank $m + n$). The test statistic $W$ is then the sum of the ranks of the $(X_i - \Delta_0)$'s. For the two-sided alternative, $H_0$ is rejected if $w$ is either too small or too large.

To obtain the associated CI for fixed $x_i$'s and $y_j$'s, we must determine the set of all $\Delta_0$ values for which $H_0$ is not rejected. This is easiest to do if the test statistic is expressed in a slightly different form. The smallest possible value of $W$ is $m(m+1)/2$, corresponding to every $(X_i - \Delta_0)$ less than every $Y_j$, and there are $mn$ differences of the form $(X_i - \Delta_0) - Y_j$. A bit of manipulation gives

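$$\begin{aligned}
W &= [\text{number of } (X_i - Y_j - \Delta_0)\text{'s} \ge 0] + \frac{m(m+1)}{2} \\
  &= [\text{number of } (X_i - Y_j)\text{'s} \ge \Delta_0] + \frac{m(m+1)}{2}
\end{aligned} \tag{15.8}$$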

The P-value will be at most $\alpha$, leading to rejection of the null hypothesis, if $w$ is relatively small (close to its minimum, $m(m+1)/2$) or large (close to its maximum, $m(m+2n+1)/2$). This is equivalent to rejecting $H_0$ if the number of $(x_i - y_j)$'s $\ge \Delta_0$ is either too small or too large. Expression (15.8) suggests that we compute $x_i - y_j$ for each $i$ and $j$ and order these $mn$ differences from smallest to largest. Then if the null value $\Delta_0$ is neither smaller than most of the differences nor larger than most, $H_0: \mu_1 - \mu_2 = \Delta_0$ is not rejected. Varying $\Delta_0$ now shows that a CI for $\mu_1 - \mu_2$ will have as its lower endpoint one of the ordered $(x_i - y_j)$'s, and similarly for the upper endpoint.
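Identity (15.8) is easy to confirm numerically. The Python sketch below computes $w$ once by ranking the combined sample and once by counting differences; the function names are illustrative, the code assumes no ties in the combined sample, and the data are the crushing-strength values of Example 15.8.

```python
def rank_sum_w(x, y, delta0):
    """Sum of the ranks of the (x_i - delta0)'s in the combined sample.

    Assumes no ties, so each value has a unique position in the sorted list.
    """
    shifted = [xi - delta0 for xi in x]
    combined = sorted(shifted + list(y))
    return sum(combined.index(v) + 1 for v in shifted)


def w_from_differences(x, y, delta0):
    """Right-hand side of (15.8): count of (x_i - y_j) >= delta0, plus m(m+1)/2."""
    m = len(x)
    count = sum(1 for xi in x for yj in y if xi - yj >= delta0)
    return count + m * (m + 1) // 2


epoxy = [10860, 11120, 11340, 12130, 14380, 13070]
other = [4590, 4850, 6510, 5640, 6390]
for delta0 in (0, 5000, 7000, 10000):
    # The two values printed after delta0 should agree on every row.
    print(delta0, rank_sum_w(epoxy, other, delta0), w_from_differences(epoxy, other, delta0))
```

With $m = 6$ and $n = 5$, the statistic ranges from $m(m+1)/2 = 21$ (at $\Delta_0 = 10000$, where every shifted $x$ is below every $y$) to $m(m+2n+1)/2 = 51$ (at $\Delta_0 = 0$).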


PROPOSITION Let $x_1, \dots, x_m$ and $y_1, \dots, y_n$ be the observed values in two independent samples from continuous distributions that differ only in location (and not in shape). With $d_{ij} = x_i - y_j$ and the ordered differences denoted by $d_{ij(1)}, d_{ij(2)}, \dots, d_{ij(mn)}$, the general form of a $100(1-\alpha)\%$ CI for $\mu_1 - \mu_2$ is

$$\bigl(d_{ij(mn - c + 1)},\ d_{ij(c)}\bigr) \tag{15.9}$$

where $c$ is the critical constant for the two-tailed level $\alpha$ Wilcoxon rank-sum test.

Notice that the form of the Wilcoxon rank-sum interval (15.9) is very similar to the Wilcoxon signed-rank interval (15.7); (15.7) uses pairwise averages from a single sample, whereas (15.9) uses pairwise differences from two samples. Appendix Table A.16 gives values of $c$ for selected values of $m$ and $n$.

EXAMPLE 15.8 The article "Some Mechanical Properties of Impregnated Bark Board" (Forest Products J., 1977: 31–38) reports the following data on maximum crushing strength (psi) for a sample of epoxy-impregnated bark board and for a sample of bark board impregnated with another polymer:

Epoxy (x's)   10,860   11,120   11,340   12,130   14,380   13,070
Other (y's)     4590     4850     6510     5640     6390

Let's obtain a 95% CI for the true average difference in crushing strength between the epoxy-impregnated board and the other type of board.

From Appendix Table A.16, since the smaller sample size is 5 and the larger sample size is 6, $c = 26$ for a confidence level of approximately 95%. The $d_{ij}$'s appear in Table 15.5. The five smallest $d_{ij}$'s $[d_{ij(1)}, \dots, d_{ij(5)}]$ are 4350, 4470, 4610, 4730, and 4830; and the five largest $d_{ij}$'s are (in descending order) 9790, 9530, 8740, 8480, and 8220. Thus the CI is $(d_{ij(5)}, d_{ij(26)}) = (4830, 8220)$.

Table 15.5 Differences for the Rank-Sum Interval in Example 15.8

                          y_j
d_ij          4590   4850   5640   6390   6510

x_i  10,860   6270   6010   5220   4470   4350
     11,120   6530   6270   5480   4730   4610
     11,340   6750   6490   5700   4950   4830
     12,130   7540   7280   6490   5740   5620
     13,070   8480   8220   7430   6680   6560
     14,380   9790   9530   8740   7990   7870

■
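The computation in Example 15.8 can be reproduced with a short Python sketch of interval (15.9). The function name `rank_sum_interval` is hypothetical, and the critical value `c` is taken as an input rather than looked up.

```python
def rank_sum_interval(x, y, c):
    """Return the interval (15.9): (d_(mn-c+1), d_(c)) of the ordered x_i - y_j."""
    diffs = sorted(xi - yj for xi in x for yj in y)  # all mn pairwise differences
    mn = len(diffs)
    d = mn - c + 1
    if not 1 <= d <= c <= mn:
        raise ValueError("c is not a usable critical value for these sample sizes")
    # d-th smallest difference and c-th smallest (equivalently d-th largest).
    return diffs[d - 1], diffs[c - 1]


epoxy = [10860, 11120, 11340, 12130, 14380, 13070]
other = [4590, 4850, 6510, 5640, 6390]
print(rank_sum_interval(epoxy, other, c=26))
```

With $mn = 30$ and $c = 26$, the lower index is $30 - 26 + 1 = 5$, so the function returns the fifth smallest and fifth largest differences, (4830, 8220), in agreement with the example.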

When $m$ and $n$ are both large, the Wilcoxon test statistic has approximately a normal distribution. This can be used to derive a large-sample approximation for the value $c$ in interval (15.9). The result is

$$c \approx \frac{mn}{2} + z_{\alpha/2}\sqrt{\frac{mn(m+n+1)}{12}}$$
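The rank-sum approximation $c \approx mn/2 + z_{\alpha/2}\sqrt{mn(m+n+1)/12}$ can be evaluated in the same spirit. In this Python sketch the function name is hypothetical and the value is returned unrounded, since the text does not give a rounding convention.

```python
from math import sqrt
from statistics import NormalDist


def approx_c_rank_sum(m, n, alpha):
    """Large-sample approximation to c in (15.9); intended for large m and n.

    Returns the unrounded value mn/2 + z_{alpha/2} * sqrt(mn(m+n+1)/12).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    return m * n / 2 + z * sqrt(m * n * (m + n + 1) / 12)


print(round(approx_c_rank_sum(6, 5, 0.05), 2))
```

Purely as a sanity check, applying the formula to the sample sizes of Example 15.8 ($m = 6$, $n = 5$) gives about 25.7, close to the tabled value $c = 26$, although samples that small are well outside the range for which the approximation is intended.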


  As with the signed-rank interval, the rank-sum interval (15.9) is quite efficient with respect to the t interval; in large samples, it will tend to be only a bit wider than the t interval when the underlying populations are normal and may be considerably narrower than the t interval if the underlying populations have heavier tails than do normal populations.

EXERCISES Section 15.3 (17–22)

17. The article "The Lead Content and Acidity of Christchurch Precipitation" (N. Zeal. J. of Science, 1980: 311–312) reports the accompanying data on lead concentration ($\mu$g/L) in samples gathered during eight different summer rainfalls: 17.0, 21.4, 30.6, 5.0, 12.2, 11.8, 17.3, and 18.8. Assuming that the lead-content distribution is symmetric, use the Wilcoxon signed-rank interval to obtain a 95% CI for $\mu$.

18. Compute the 99% signed-rank interval for true average pH $\mu$ (assuming symmetry) using the data in Exercise 15.3. [Hint: Try to compute only those pairwise averages having relatively small or large values (rather than all 105 averages).]

19. An experiment was carried out to compare the abilities of two different solvents to extract creosote impregnated in test logs. Each of eight logs was divided into two segments, and then one segment was randomly selected for application of the first solvent, with the other segment receiving the second solvent.

Solvent 1   3.92   3.79   3.70   4.08   3.87   3.95   3.55   3.76
Solvent 2   4.25   4.20   4.41   3.89   4.39   3.75   4.20   3.90

Calculate a CI using a confidence level of roughly 95% for the difference between the true average amount extracted using the first solvent and the true average amount extracted using the second solvent.

20. The following observations are amounts of hydrocarbon emissions resulting from road wear of bias-belted tires under a 522 kg load inflated at 228 kPa and driven at 64 km/hr for 6 hours ("Characterization of Tire Emissions Using an Indoor Test Facility," Rubber Chemistry and Technology, 1978: 7–25): .045, .117, .062, and .072. What confidence levels are achievable for this sample size using the signed-rank interval? Select an appropriate confidence level and compute the interval.

21. Compute the 90% rank-sum CI for $\mu_1 - \mu_2$ using the data in Exercise 11.

22. Compute a 99% CI for $\mu_1 - \mu_2$ using the data in …
