How could Congress use the information in these plots for writing tax laws that allow

3.4 Describing Data on a Single Variable: Measures of Central Tendency

Basic 3.15 Compute the mean, median, and mode for the following data: 55 85 90 50 110 115 75 85 8 23 70 65 50 60 90 90 55 70 5 31 Basic 3.16 Refer to the data in Exercise 3.15 with the measurements 110 and 115 replaced by 345 and 467. Recompute the mean, median, and mode. Discuss the impact of these extreme measure- ments on the three measures of central tendency. Basic 3.17 Refer to the data in Exercise 3.15 and 3.16. Compute a 10 trimmed mean for both data sets—that is, the original and the one with the two extreme values. Do the extreme values affect the 10 trimmed mean? Would a 5 trimmed mean be affected by the two extreme values? Basic 3.18 Determine the mean, median, and mode for the data presented in the following frequency table. Class Interval Frequency 2.0 – 4.9 5 5.0 –7.9 13 8.0 –10.9 16 11.0 –13.9 9 14.0 –16.9 4 17.0 –19.9 2 20.0 –22.9 2 Engin. 3.19 A study of the reliability of buses [“Large sample simultaneous confidence intervals for the multinominal probabilities on transformations of the cell frequencies,” Technometrics 1980 22:588] examined the reliability of 191 buses. The distance traveled in 1,000s of miles prior to the first major motor failure was classified into intervals. A modified form of the table follows. Distance Traveled 1,000 miles Frequency 0 –20.0 6 20.1– 40.0 11 40.1– 60.0 16 60.1–100.0 59 100.1–120.0 46 120.1–140.0 33 140.1–160.0 16 160.1–200.0 4 a. Sketch the relative frequency histogram for the distance data and describe its shape. b. Estimate the mode, median, and mean for the distance traveled by the 191 buses.

c. What does the relationship among the three measures of center indicate about the

shape of the histogram for these data?

d. Which of the three measures would you recommend as the most appropriate repre-

sentative of the distance traveled by one of the 191 buses? Explain your answer. Med. 3.20 In a study of 1,329 American men reported in American Statistician [1974 28:115 –122] the men were classified by serum cholesterol and blood pressure. The group of 408 men who had blood pressure readings less than 127 mm Hg were then classified according to their serum cholesterol level. Serum Cholesterol mg100cc Frequency 0.0 –199.9 119 200.0 –219.9 88 220.0 –259.9 127 greater than 259 74 a. Estimate the mode, median, and mean for the serum cholesterol readings if possible. b. Which of the three summary statistics is more informative concerning a typical serum cholesterol level for the group of men? Explain your answer. Env. 3.21 The ratio of DDE related to DDT to PCB concentrations in bird eggs has been shown to have had a number of biological implications. The ratio is used as an indication of the move- ment of contamination through the food chain. The paper “The ratio of DDE to PCB concentra- tions in Great Lakes herring gull eggs and its use in interpreting contaminants data’’ [Journal of Great Lakes Research 1998 241:12 –31] reports the following ratios for eggs collected at 13 study sites from the five Great Lakes. The eggs were collected from both terrestrial- and aquatic-feeding birds. DDE to PCB Ratio Terrestrial Feeders 76.50 6.03 3.51 9.96 4.24 7.74 9.54 41.70 1.84 2.50 1.54 Aquatic Feeders 0.27 0.61 0.54 0.14 0.63 0.23 0.56 0.48 0.16 0.18 a. Compute the mean and median for the 21 ratios, ignoring the type of feeder. b. Compute the mean and median separately for each type of feeder.

c. Using your results from parts a and b, comment on the relative sensitivity of the

mean and median to extreme values in a data set.

d. Which measure, mean or median, would you recommend as the most appropriate

measure of the DDE to PCB level for both types of feeders? Explain your answer. Med. 3.22 A study of the survival times, in days, of skin grafts on burn patients was examined in Woolson and Lachenbruch [Biometrika 1980 67:597– 606]. Two of the patients left the study prior to the failure of their grafts. The survival time for these individuals is some number greater than the reported value. Survival time days: 37, 19, 57, 93, 16, 22, 20, 18, 63, 29, 60 The “’’ indicates that the patient left the study prior to failure of the graft; values given are for the day the patient left the study. a. Calculate the measures of center if possible for the 11 patients. b. If the survival times of the two patients who left the study were obtained, how would these new values change the values of the summary statistics calculated in a? Engin. 3.23 A study of the reliability of diesel engines was conducted on 14 engines. The engines were run in a test laboratory. The time in days until the engine failed is given here. The study was terminated after 300 days. For those engines that did not fail during the study period, an asterisk is placed by the number 300. Thus, for these engines, the time to failure is some value greater than 300. Failure time days: 130, 67, 300, 234, 90, 256, 87, 120, 201, 178, 300, 106, 289, 74 a. Calculate the measures of center for the 14 engines. b. What are the implications of computing the measures of center when some of the exact failure times are not known?