Rank Correlation Coefficient

16.7 Rank Correlation Coefficient

In Chapter 11, we used the sample correlation coefficient r to estimate the population correlation coefficient ρ, which measures the linear relationship between two continuous variables X and Y. If ranks 1, 2, . . . , n are assigned to the x observations in order of magnitude and similarly to the y observations, and if these ranks are then substituted for the actual numerical values in the formula for the correlation coefficient in Chapter 11, we obtain the nonparametric counterpart of the conventional correlation coefficient. A correlation coefficient calculated in this manner is known as the Spearman rank correlation coefficient and is denoted by r_s. When there are no ties among either set of measurements, the formula for r_s reduces to a much simpler expression involving the differences d_i between the ranks assigned to the n pairs of x's and y's, which we now state.


Rank Correlation Coefficient

A nonparametric measure of association between two variables X and Y is given by the rank correlation coefficient

$$ r_s = 1 - \frac{6}{n(n^2 - 1)} \sum_{i=1}^{n} d_i^2, $$

where d_i is the difference between the ranks assigned to x_i and y_i and n is the number of pairs of data.

In practice, the preceding formula is also used when there are ties among either the x or y observations. The ranks for tied observations are assigned as in the signed-rank test by averaging the ranks that would have been assigned if the observations were distinguishable.
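The ranking and substitution steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `average_ranks` and `spearman_rs` are hypothetical, and ties are handled by the averaging rule just described.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end carry ranks start+1..end+1; take their mean.
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), with d_i the rank differences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired samples with n >= 2")
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))
```

For instance, `average_ranks([10, 20, 20, 30])` assigns the two tied values the rank 2.5, the average of ranks 2 and 3.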

The value of r_s will usually be close to the value obtained by finding r based on numerical measurements and is interpreted in much the same way. As before, the value of r_s will range from −1 to +1. A value of +1 or −1 indicates perfect association between X and Y, the plus sign occurring for identical rankings and the minus sign occurring for reverse rankings. When r_s is close to zero, we conclude that the variables are uncorrelated.
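As a quick check on the two extreme values, assume there are no ties and that the pairs are labeled so that x_i has rank i. With identical rankings every d_i is zero, so r_s = 1. With reverse rankings y_i has rank n + 1 − i, and the rank-difference formula gives −1:

```latex
d_i = i - (n + 1 - i) = 2i - (n + 1), \qquad
\sum_{i=1}^{n} d_i^2 = \frac{n(n^2 - 1)}{3}, \qquad
r_s = 1 - \frac{6}{n(n^2 - 1)} \cdot \frac{n(n^2 - 1)}{3} = 1 - 2 = -1 .
```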

Example 16.8: The figures listed in Table 16.7, released by the Federal Trade Commission, show the milligrams of tar and nicotine found in 10 brands of cigarettes. Calculate the rank correlation coefficient to measure the degree of relationship between tar and nicotine content in cigarettes.

Table 16.7: Tar and Nicotine Contents

Cigarette Brand    Tar Content (mg)    Nicotine Content (mg)

[Table body not recoverable. Surviving entries: the brand names Chesterfield, Old Gold, Philip Morris, and Players, and one data row with tar content 31 and nicotine content 2.0.]

Solution: Let X and Y represent the tar and nicotine contents, respectively. First we assign ranks to each set of measurements, with the rank of 1 assigned to the lowest number in each set, the rank of 2 to the second lowest number in each set, and so forth, until the rank of 10 is assigned to the largest number. Table 16.8 shows the individual rankings of the measurements and the differences in ranks for the 10 pairs of observations.


Table 16.8: Rankings for Tar and Nicotine Content

[Table body not recoverable. Only the column heading Cigarette Brand and the brand names Chesterfield, Old Gold, and Philip Morris survive.]

Substituting into the formula for r_s, we find that

$$ r_s = 1 - \frac{(6)(5.50)}{(10)(100 - 1)} = 0.967, $$

indicating a high positive correlation between the amounts of tar and nicotine found in cigarettes.
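The arithmetic of this example can be checked directly. The sketch below uses only the two quantities quoted in the example, n = 10 brands and a sum of squared rank differences of 5.50; the variable names are illustrative.

```python
n = 10         # number of cigarette brands (pairs of observations)
sum_d2 = 5.50  # sum of squared rank differences quoted in the example

# Rank correlation from the rank-difference formula.
r_s = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(round(r_s, 3))  # 0.967
```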

Some advantages to using r_s rather than r do exist. For instance, we no longer assume the underlying relationship between X and Y to be linear and therefore, when the data possess a distinct curvilinear relationship, the rank correlation coefficient will likely be more reliable than the conventional measure. A second advantage to using the rank correlation coefficient is the fact that no assumptions of normality are made concerning the distributions of X and Y. Perhaps the greatest advantage occurs when we are unable to make meaningful numerical measurements but nevertheless can establish rankings. Such is the case, for example, when different judges rank a group of individuals according to some attribute. The rank correlation coefficient can be used in this situation as a measure of the consistency of the two judges.
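The first advantage can be illustrated with a small sketch on made-up data (the values below are not from the text). The relationship y = 2^x is perfectly monotone but far from linear, so the conventional r falls short of 1 while the coefficient computed on ranks equals 1. The helper names are hypothetical, and the ranking helper assumes no ties.

```python
from math import sqrt


def pearson_r(x, y):
    """Conventional sample correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks_no_ties(values):
    """Ranks 1..n in order of magnitude; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2 ** v for v in x]  # illustrative curvilinear data, not from the text

r = pearson_r(x, y)                                   # noticeably below 1
r_s = pearson_r(ranks_no_ties(x), ranks_no_ties(y))   # equals 1 for identical rankings
```

Here r_s is obtained by applying the conventional formula to the ranks, which is how the rank correlation coefficient was introduced at the start of this section.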

To test the hypothesis that ρ = 0 by using a rank correlation coefficient, one needs to consider the sampling distribution of the r_s-values under the assumption of no correlation. Critical values for α = 0.05, 0.025, 0.01, and 0.005 have been calculated and appear in Table A.21. The setup of this table is similar to that of the table of critical values for the t-distribution except for the left column, which now gives the number of pairs of observations rather than the degrees of freedom. Since the distribution of the r_s-values is symmetric about zero when ρ = 0, the r_s-value that leaves an area of α to the left is equal to the negative of the r_s-value that leaves an area of α to the right. For a two-sided alternative hypothesis, the critical region of size α falls equally in the two tails of the distribution. For a test in which the alternative hypothesis is negative, the critical region is entirely in the left tail of the distribution, and when the alternative is positive, the critical region is placed entirely in the right tail.
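The placement of the critical region can be written as a small decision rule. This is a sketch under stated assumptions: the function name `reject_zero_correlation` and its `alternative` labels are hypothetical, and the critical value must be looked up by the reader in Table A.21 (for a two-sided test of size α, the entry for α/2), since the table itself is not reproduced here.

```python
def reject_zero_correlation(r_s, critical_value, alternative="two-sided"):
    """Return True when H0: rho = 0 is rejected.

    critical_value is the positive tabulated r_s value. Because the null
    distribution is symmetric about zero, the left-tail cutoff is its negative.
    """
    if critical_value <= 0:
        raise ValueError("critical_value must be the positive tabulated value")
    if alternative == "greater":    # H1: rho > 0, region in the right tail
        return r_s > critical_value
    if alternative == "less":       # H1: rho < 0, region in the left tail
        return r_s < -critical_value
    if alternative == "two-sided":  # H1: rho != 0, region split between tails
        return abs(r_s) > critical_value
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
```

With the figures quoted in Example 16.9 (r_s = 0.967 and a critical value of 0.745), `reject_zero_correlation(0.967, 0.745, "greater")` returns `True`.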


Example 16.9: Refer to Example 16.8 and test the hypothesis that the correlation between the amounts of tar and nicotine found in cigarettes is zero against the alternative that it is greater than zero. Use a 0.01 level of significance.

Solution:

1. H0: ρ = 0.

2. H1: ρ > 0.

3. α = 0.01.

4. Critical region: r_s > 0.745 from Table A.21.

5. Computations: From Example 16.8, r_s = 0.967.

6. Decision: Reject H0 and conclude that there is a significant correlation between the amounts of tar and nicotine found in cigarettes.

Under the assumption of no correlation, it can be shown that the distribution of the r_s-values approaches a normal distribution with a mean of 0 and a standard deviation of 1/√(n − 1) as n increases. Consequently, when n exceeds the values given in Table A.21, one can test for a significant correlation by computing

$$ z = \frac{r_s - 0}{1/\sqrt{n - 1}} = r_s \sqrt{n - 1} $$

and comparing with critical values of the standard normal distribution shown in Table A.3.
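The large-sample procedure amounts to standardizing r_s with the stated mean 0 and standard deviation 1/√(n − 1), which gives z = r_s √(n − 1). A minimal sketch follows; the function names are hypothetical, and the standard library's `NormalDist` stands in for a lookup in Table A.3.

```python
from math import sqrt
from statistics import NormalDist


def rs_large_sample_z(r_s, n):
    """Standardize r_s using mean 0 and standard deviation 1 / sqrt(n - 1)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return r_s * sqrt(n - 1)


def upper_tail_p_value(z):
    """P(Z > z) for a standard normal Z, for the alternative rho > 0."""
    return 1 - NormalDist().cdf(z)


# Illustrative numbers only (not from the text): r_s = 0.40 from n = 50 pairs.
z = rs_large_sample_z(0.40, 50)
p = upper_tail_p_value(z)
```

In this made-up case z = 0.40 × 7 = 2.8, which exceeds the usual one-sided normal critical values, so the hypothesis of no correlation would be rejected in favor of ρ > 0.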

Exercises

16.23 A random sample of 15 adults living in a small town were selected to estimate the proportion of voters favoring a certain candidate for mayor. Each individual was also asked if he or she was a college graduate. By letting Y and N designate the responses of "yes" and "no" to the education question, the following sequence was obtained:

NNNNYYNYYNYNNNN

Use the runs test at the 0.1 level of significance to determine if the sequence supports the contention that the sample was selected at random.

16.24 A silver-plating process is used to coat a certain type of serving tray. When the process is in control, the thickness of the silver on the trays will vary randomly following a normal distribution with a mean of 0.02 millimeter and a standard deviation of 0.005 millimeter. Suppose that the next 12 trays examined show the following thicknesses of silver: 0.019, 0.021, 0.020, 0.019, 0.020, 0.018, 0.023, 0.021, 0.024, 0.022, 0.023, 0.022. Use the runs test to determine if the fluctuations in thickness from one tray to another are random. Let α = 0.05.

16.25 Use the runs test to test, at level 0.01, whether there is a difference in the average operating time for the two calculators of Exercise 16.17 on page 670.

16.26 In an industrial production line, items are inspected periodically for defectives. The following is a sequence of defective items, D, and nondefective items, N, produced by this production line:

DDNNNDNNDDNNNN
NDDDNNDNNNNDND

Use the large-sample theory for the runs test, with a significance level of 0.05, to determine whether the defectives are occurring at random.

16.27 Assuming that the measurements of Exercise 1.14 on page 30 were recorded successively from left to right as they were collected, use the runs test, with α = 0.05, to test the hypothesis that the data represent a random sequence.

16.28 How large a sample is required to be 95% confident that at least 85% of the distribution of measurements is included between the sample extremes?

16.29 What is the probability that the range of a random sample of size 24 includes at least 90% of the population?

16.30 How large a sample is required to be 99% confident that at least 80% of the population will be less than the largest observation in the sample?

16.31 What is the probability that at least 95% of a population will exceed the smallest value in a random sample of size n = 135?

16.32 The following table gives the recorded grades for 10 students on a midterm test and the final examination in a calculus course:

Student    Midterm Test    Final Examination
L.S.A.     84              73
W.P.B.     98              63
R.W.K.     [...]           [...]
J.R.L.     72              66
J.K.L.     86              78
D.L.P.     93              78
B.L.P.     [...]           [...]
D.W.M.     0               0
M.N.M.     92              88
R.H.S.     87              77

(a) Calculate the rank correlation coefficient.

(b) Test the null hypothesis that ρ = 0 against the alternative that ρ > 0. Use α = 0.025.

16.33 With reference to the data of Exercise 11.1 on page 398,

(a) calculate the rank correlation coefficient;

(b) test the null hypothesis, at the 0.05 level of significance, that ρ = 0 against the alternative that [...] in Exercise 11.44 on page 435.

16.34 Calculate the rank correlation coefficient for the daily rainfall and amount of particulate removed in Exercise 11.13 on page 400.

16.35 With reference to the weights and chest sizes of infants in Exercise 11.47 on page 436,

(a) calculate the rank correlation coefficient;

(b) test the hypothesis, at the 0.025 level of significance, that ρ = 0 against the alternative that [...]

16.36 A consumer panel tests nine brands of microwave ovens for overall quality. The ranks assigned by the panel and the suggested retail prices are as follows:

Manufacturer    Panel Rating    Suggested Price
A               6               $480
B               9               395
C               2               575
D               8               550
E               5               510
F               1               545
G               7               400
H               4               465
I               3               420

Is there a significant relationship between the quality and the price of a microwave oven? Use a 0.05 level of significance.

16.37 Two judges at a college homecoming parade rank eight floats in the following order:

Float      1  2  3  4  5  6  7  8
Judge A    5  8  4  3  6  2  7  1
Judge B    7  5  4  2  8  1  6  3

(a) Calculate the rank correlation coefficient.

(b) Test the null hypothesis that ρ = 0 against the alternative that ρ > 0. Use α = 0.05.

16.38 In the article called "Risky Assumptions" by Paul Slovic, Baruch Fischoff, and Sarah Lichtenstein, published in Psychology Today (June 1980), the risk of dying in the United States from 30 activities and technologies is ranked by members of the League of Women Voters and also by experts who are professionally involved in assessing risks. The rankings are as shown in Table 16.9.

(a) Calculate the rank correlation coefficient.

(b) Test the null hypothesis of zero correlation between the rankings of the League of Women Voters and the experts against the alternative that the correlation is not zero. Use a 0.05 level of significance.


Table 16.9: The Ranking Data for Exercise 16.38

Activity or Technology Risk    Voters    Experts
Nuclear power                  1         20
Motor vehicles                 2         1
Handguns                       3         4
Smoking                        4         2
Motorcycles                    5         6
Alcoholic beverages            6         3
Private aviation               7         12
Police work                    8         17
Pesticides                     9         8
Surgery                        10        5
Fire fighting                  11        18
Large construction             12        13
Hunting                        13        23
Spray cans                     14        26
Mountain climbing              15        29
Bicycles                       16        15
Commercial aviation            17        16
Electric power                 18        9
Swimming                       [...]     [...]
[...]                          23        27
Railroads                      24        19
Food preservatives             25        14
Food coloring                  26        21
Power mowers                   [...]     [...]
[...]                          28        24
Home appliances                [...]     [...]

Review Exercises

16.39 A study by a chemical company compared the drainage properties of two different polymers. Ten different sludges were used, and both polymers were allowed to drain in each sludge. The free drainage was measured in mL/min.

Sludge Type    Polymer A    Polymer B
1              12.7         12.0
2              14.6         15.0
3              18.6         19.2
4              17.5         17.3
5              [...]        [...]
6              16.9         16.6
7              19.9         20.1
8              17.6         17.6
9              [...]        [...]
10             [...]        [...]

(a) Use the sign test at the 0.05 level to test the null hypothesis that polymer A has the same median drainage as polymer B.

(b) Use the signed-rank test to test the hypotheses of part (a).

16.40 In Review Exercise 13.45 on page 555, use the Kruskal-Wallis test, at the 0.05 level of significance, to determine if the chemical analyses performed by the four laboratories give, on average, the same results.

16.41 Use the data from Exercise 13.14 on page 530 to see if the median amount of nitrogen lost in perspiration is different for the three levels of dietary protein.


Chapter 17

Statistical Quality Control
